# Introduction to Python

## [UCSAS 2021](https://statds.org/events/ucsas2021/)  Conference

## Surya Teja Eada

## October 09, 2021

## Welcome

- Welcome to today's Introduction to Python workshop, part of the [UCSAS 2021](https://statds.org/events/ucsas2021/) Conference.
- This workshop gives a quick tour of Python.
- The workshop leans slightly towards statistics and data visualization.

## About Me
 
<!-- <img src = 'figs/profile.JPG' alt="Anaconda Learning Library" style="width:200px; height:200px; align:right;"/> -->

<div class="wrapingimage">   
<img src=    
"figs/profile.JPG" height="100px" width="100px"  
alt="image">
</div>  

Third-year Ph.D. student in the Department of Statistics at UConn

**Research Interests**: 
   - Stochastic Processes, Diffusion Processes.
   - Financial Risk Modeling, Model Risk Assessment.
   
**Aspirations**:
   - Collaborate in various domains to contribute with Statistics.
   - Research and Academics
   
**Hobbies**:
   - Sports, Painting, Travel, Learning new things

## Prerequisites

- Prerequisites: a laptop with Anaconda installed. Anaconda can be downloaded for Windows users [here](https://repo.anaconda.com/archive/Anaconda3-2020.07-Windows-x86_64.exe) and for Mac users [here](https://repo.anaconda.com/archive/Anaconda3-2020.07-MacOSX-x86_64.pkg).
- The slides and practice workbook for today's workshop can be found at: [Github UCSAS 2021](https://github.com/suryaeada9/ucsas_intro_python)

### Anaconda
- When you install Anaconda, which includes a GUI package manager, you get `Python 3.8` (not the latest version), `pip 20`, `conda` (with access to channels such as `conda-forge`), and most of the important modules such as `Numpy`, `Scipy`, `Matplotlib`, etc.
- Anaconda uses environments to keep track of the modules, and the versions of them, that each project requires.
- `pip` and `conda` provide the same package management from the command line, without the GUI.
- [More](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) documentation on how conda creates environments is available here.
- Anaconda also provides Jupyter Notebook, a very useful data science tool, similar to R Markdown, that combines markdown text with runnable code.
- Furthermore, a library full of documentation on the important modules is available, as you can see below:

<img src = 'figs/anaconda_library.png' alt="Anaconda Learning Library" align="center" style="width:600px; height:600px;"/>

## Motivation

Let us look around at what motivates learning a new language.

- Data is growing richer, bigger, and broader, which means more applications of data science.
- Python is a practical language that is easy to read and write.
- Most deep learning models are implemented in Python first, and there is a lot of demand for Python skills.
- It is open source, so it is continually improved by its community.
- Adapting to new tools is also key to surviving in the job market.

<!-- ![Data Per Day](figs/data_size_domo.png) -->
<img src="figs/data_size_domo.png" alt="Data Per Day" style="width:600px; height:600px;"/>

<!-- !["Rankings of Computer Languages"](figs/language_rankings.png) -->
<img src="figs/language_rankings.png" alt="Rankings of Computer Languages" style="width:600px; height:600px;"/>

<img src="figs/likely_python.png" alt="Anaconda Most used language" style="width:600px; height:600px;"/>

## Python Intro

- Python is one of the most promising programming languages; it was first released in 1991 by `Guido van Rossum`.
- It is used for web applications, for software development, and increasingly for data science, and it is constantly evolving.
- It has an easy `syntax`, similar to English, that is simple to read and write.
- The most recent major version of Python is `Python 3`, although Python 2, which reached end of life in January 2020, still ships with some macOS systems.
- We can use Python via the `command line`, IDEs like `PyCharm`, or applications such as `Jupyter Notebook`.
- Python works on all major platforms, such as `Windows, Mac, Linux`.

## Aims of the Workshop

- **Python Syntax**
    + Variable
    + Indentation
    + Comments
    + Python Help
- **Data Types and Methods**: Mutable and Immutable
    + **Text Type**: `str`
    + **Numeric Types**: `int, float, complex`
    + **Sequence Types**: `list, tuple, range`
    + **Mapping Type**:	`dict`
    + **Set Types**: `set, frozenset`
    + **Boolean Type**:	`bool`
- **Conditions, Loops, Functions**
- **Basic Modules** and respective Functions
    + **`Numpy`**: Arrays, Universal Functions (Vectorized), Random numbers (Simulations)
    + **`Pandas`**: DataFrames, Data Manipulation
    + **`Scipy`**: Stats functions, optimize functions
    + **`Matplotlib`**: Visualization
- Exercises 

## Let's get Started

- We will work in Jupyter Notebook with the `Practice.ipynb` file located in the [Github UCSAS 2021](https://github.com/suryaeada9/ucsas_intro_python) repository.
- Download the file `Practice.ipynb`
- Open a Jupyter notebook from Anaconda GUI or from command line using command `jupyter notebook`
- Open the `Practice.ipynb` file

## Jupyter Notebook

### Some Useful comments for Jupyter
- To work with a Jupyter notebook, which allows both `markdown` and `code chunks`, it is most important to note the difference between `command` mode and `edit` mode.
    + `Command` mode is active when a `blue border` surrounds the highlighted chunk; press `ESC` to enter it.
    + In `Command` mode you can add a new chunk above or below using `A` or `B`, change the current chunk into a markdown chunk using `M` or into a code chunk using `Y`, and open the list of all shortcuts using `H`.
    + In `Command` mode anything you type is treated as a command for the notebook and is not typed into the chunk.
    + `Edit` mode is active when a `green border` surrounds the current chunk; press `Enter` to enter it.
    + In `Edit` mode you edit the current chunk by writing code or text, depending on the type of chunk it is.
    + A code chunk is identified by an `In []` to its left; a markdown chunk has none. Running a markdown chunk with `CMD + ENTER` renders it as formatted text.

### Python Installation Check and our first Python run

In [1]:
## Check Python Installation & Version (`!`)
!python --version

## print Hello World
print("Hello World!")

Python 3.8.8
Hello World!
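
The same check can be made from inside Python rather than through the shell. The following is a minimal sketch using the standard-library `sys` and `platform` modules; the variable names are chosen here for illustration only.

```python
import sys
import platform

# sys.version_info is a named tuple: (major, minor, micro, releaselevel, serial)
major, minor = sys.version_info.major, sys.version_info.minor
print(f"Running Python {major}.{minor}")

# platform.python_version() returns the full version as a string, e.g. "3.8.8"
version_string = platform.python_version()
print(version_string)

# A simple guard: f-strings (used later in this workshop) need Python 3.6 or newer
supports_f_strings = sys.version_info >= (3, 6)
print(supports_f_strings)
```

Comparing `sys.version_info` against a tuple is a common way to guard code that relies on newer language features.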


## Python "f-string" and "input" function

- A Python "f-string" embeds the values of variables and expressions directly inside a string literal.

In [2]:
## taking user input
name = input("Enter your name: ")
age  = input("Enter your age: ")
print(f"My name is {name} and I am {age} years old")

Enter your name: Surya
Enter your age: 29
My name is Surya and I am 29 years old
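
As a further sketch of what f-strings can do, the example below uses fixed, hypothetical values in place of `input()` so that it runs without user interaction. It shows that `input()` returns text that must be converted before doing arithmetic, and that any expression or format specification may appear inside the braces.

```python
# Hypothetical values standing in for what input() would return.
# Note that input() always returns a string, even when the user types digits.
name = "Surya"
age_text = "29"

# Convert the text to an integer before doing arithmetic with it
age = int(age_text)

# Any expression can appear inside the braces of an f-string
message = f"{name} will be {age + 1} next year"
print(message)  # Surya will be 30 next year

# A format specification after ':' controls how a value is displayed
ratio = 2 / 3
formatted = f"{ratio:.2f}"
print(formatted)  # 0.67
```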


## Save this as a source file to run multiple times

`%save` is a magic command that writes code you have already run in Jupyter to a source file, using the chunk number to select what to save.

In [3]:
## this magic command saves the second chunk that was run to a `.py` file,
## and -a appends to that file rather than overwriting it.

## I run this once and then comment it out, because it depends on the
## chunk numbers of the session and I don't want to create multiple files.

# %save firstscript.py -n 2 -a 

The following commands were written to file `firstscript.py`:
## taking user input
name = input("Enter your name: ")
age  = input("Enter your age: ")
print(f"My name is {name} and I am {age} years old")


- `!` runs a shell (`bash`) command from within a Jupyter notebook.
- `%save` is a magic command that saves a code chunk to a `.py` file, which you can then run using python in bash.

## Python Syntax

- Python uses a `new line` to mark the `end` of a statement.
- This is unlike many other languages, which use braces or semicolons.

### Comments
- Comments start with `#`; in Jupyter, the shortcut `CMD + /` comments or uncomments the selected lines.

In [4]:
# Demo: Print, End of command, Commenting
print("Hello UCSAS")
print("Workshop is going good??")

Hello UCSAS
Workshop is going good??
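
The newline rule has two practical exceptions worth seeing once. The sketch below, with names made up for illustration, shows a statement continued across lines inside parentheses and two statements separated by a semicolon on one line.

```python
# One statement per line: the newline ends each statement
a = 1
b = 2

# Inside parentheses, brackets, or braces a statement may span several lines
total = (a
         + b
         + 3)
print(total)  # 6

# A semicolon can separate two statements on one line, though this is rarely used
c = 4; d = 5
print(c + d)  # 9
```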


## Python Syntax

### Variable Assignment
- A variable is a name that points to an object (such as an `int` or a `list`) in Python; it stores a reference to where the object is saved rather than the object itself.
- We use `=` to assign an object to a name.
- We can assign integers, floats, strings, Booleans, lists, functions, or any class object to a variable.
- Python also allows multiple assignment in a single line, either from several objects or by unpacking a single list.
- `is` checks whether two names refer to the same object at the same address, while `==` checks whether the values are equal.

In [8]:
# Integer Assignment to a variable
x = 2

# String
z = "UCSAS"

## Boolean
w = True

## list
a = [1, 2, 3, 4]

## function
def function_that_adds_10(x):
    return(x + 10)

f = function_that_adds_10

print(x)
print(z)
print(w)
print(a)
print(f(2))

2
UCSAS
True
[1, 2, 3, 4]
12


In [9]:
## Multiple assignment at once
a1,  a2, a3 = 2, "UCSAS", True

print(a1, ',', a2, ',', a3)

2 , UCSAS , True
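
The difference between `is` and `==`, and multiple assignment from a single list, can be sketched as follows. The variable names are illustrative only.

```python
# Two names bound to the same list object
first = [1, 2, 3]
second = first

# A separate list that happens to hold equal values
third = [1, 2, 3]

print(first is second)  # True: same object
print(first is third)   # False: different objects
print(first == third)   # True: equal values

# Unpacking a single list into several names
p, q, r = first
print(p, q, r)  # 1 2 3

# Swapping two names without a temporary variable
p, q = q, p
print(p, q)  # 2 1
```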


### Indentation
- Indentation refers to the spaces at the beginning of a line of code; it defines the scope of a conditional, a loop, or a function.
    + The amount of white space is up to the programmer but must be `at least one space`; four spaces is the common convention (PEP 8).
    + Use the `same number` of spaces throughout the same block of code.
    + An indented line tells Python that the block opened on the earlier line continues.

In [30]:
# Indentation using one space to continue `if`
# Also see the use of `:`
x = 0

if x>2: 
 print("Greater than 2")
else: 
 print("Not greater than 2")

Not greater than 2
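
Indentation levels nest: each new block adds one more level. The sketch below uses the conventional four spaces per level; the function `describe` and its labels are hypothetical and serve only to show three nested blocks.

```python
def describe(numbers):
    """Return a label for each number; names here are hypothetical."""
    labels = []
    for n in numbers:              # body of the function: 4 spaces
        if n > 2:                  # body of the loop: 8 spaces
            labels.append("big")   # body of the if: 12 spaces
        else:
            labels.append("small")
    return labels                  # back at function level, after the loop


result = describe([0, 3, 2])
print(result)  # ['small', 'big', 'small']
```

Moving the `return` line one level deeper would place it inside the loop and end the function after the first number, which is why consistent indentation matters.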


### Help in Python

- In Python, help is typically needed in four situations:
    + you don't know which function to use: Google and tutorials can be helpful
    + you know a function but don't know its usage: use `help(function)`
    + you know the module but forgot the name of the function: press `tab` after the module name and a dot
    + you want to use methods for your data type but don't know them: use `dir(type)`
- `dir()` lists all the attributes and methods available on a given object.
- Methods are applied by putting a `.` after the variable name, as in `x.sort()` or `x.append(4)`.
- To see only the commonly used ones, an `if` condition that filters out the names starting with `__` may be helpful.

In [36]:
## Using dir to find the methods for list
MyClass = list
method_list = [method for method in dir(MyClass) if not method.startswith('__')]
print(method_list)

## Help on function `sum`
help(sum)

x = [1, 2, 3]
sum(x, 2)

['append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']
Help on built-in function sum in module builtins:

sum(iterable, /, start=0)
    Return the sum of a 'start' value (default: 0) plus an iterable of numbers
    
    When the iterable is empty, return the start value.
    This function is intended specifically for use with numeric values and may
    reject non-numeric types.



8
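
The filtering idea can be wrapped in a small reusable function. This is a sketch, and `public_names` is a hypothetical helper name; it filters every name that starts with an underscore, which covers both the `_` and the `__` cases.

```python
def public_names(obj):
    """Hypothetical helper: names in dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


list_methods = public_names(list)
print(list_methods)

# The same helper works on an instance as well as on a type
dict_methods = public_names({"a": 1})
print("keys" in dict_methods)  # True

# help() prints documentation; the same text is stored in the __doc__ attribute
print(list.append.__doc__)
```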

## Quiz on Help with variables: 

Use `dir`, as above, to find the methods available for class `list`, and apply the most appropriate ones to the following task:

**Task:** Starting from `x = [3, 10, 7]`, use `list` methods to

- append 4 to the list
- sort in descending order
- remove the item at index 2 from the original variable and store the removed value in a new variable
- clear the entire list
- also find, using help, the difference between `pop` and `remove`.

Print `x` after each step to see the effect of each method.

**Note:** If you are unsure of a particular method, use `help(type.method)` or `help(object.method)`

In [49]:
## Initiate a list
x = [3, 10, 7]

## Append 4
x.append(4)
print(x)

# Sort in ascending
x.sort()
print(x)

# Reverse order
x.reverse()
print(x)

# Pop and remove application
y = x.pop(2)  ## uses index, defaults last
print(x)
print(y)
x.remove(3)  ## uses value, removes its first occurrence, no default
print(x)

## Clear
x.clear()
print(x)

## help() prints the documentation itself and returns None, so no print() is needed
help(x.pop)
help(x.remove)

[3, 10, 7, 4]
[3, 4, 7, 10]
[10, 7, 4, 3]
[10, 7, 3]
4
[10, 7]
[]
Help on built-in function pop:

pop(index=-1, /) method of builtins.list instance
    Remove and return item at index (default last).
    
    Raises IndexError if list is empty or index is out of range.

Help on built-in function remove:

remove(value, /) method of builtins.list instance
    Remove first occurrence of value.
    
    Raises ValueError if the value is not present.


## Data Types, Methods, Casting.


Every object has a data type, which can be found by calling `type()` on the variable pointing to the object.

- **Data Types**:
    + **Text**: `str`
    + **Numeric**: `int, float, complex`
    + **Sequence**: `list, tuple, range`
    + **Mapping**:	`dict`
    + **Set**: `set, frozenset`
    + **Boolean**:	`bool`

- The above is not an exhaustive set of all possibilities

### Guess/Try:
Do you know what happens if you try `type(type(str))`?

### Illustration:

The following is an illustration of obtaining objects of different data types.

- A string is an immutable sequence of Unicode characters, so individual elements can be accessed by index.
- Multiline strings need triple quotes and keep line breaks intact when printed.
- The operator `+` concatenates strings, and `in` tests for membership.
- Commonly used methods:
    + `upper()`: upper case
    + `lower()`: lower case
    + `strip()`: removes white space from beginning and end
    + `replace('a', 'b')`: replaces a with b in the string
    + `split(',')`: splits on the given separator
    + `format(a)`: inserts variable a at the `{}` placeholder in the string
    + `capitalize()`: converts the first character to upper case and the rest to lower case.
    + `casefold()`: converts the string to lower case, more aggressively than `lower()`, for caseless comparison
    + `count('a')`: counts the number of times a specified value occurs in the string.
    + `endswith()`, `startswith()`, `find()`, `index()`, `isalpha()`, `isdigit()`

In [152]:
## Primitive type

# String
x = 'abcd'
print(f"`{x}` is of type: {type(x)}")

# multiline string
x = '''abcd
let me show what line breaks do'''
print(x)

# string access and slicing
x = 'I am at UCSAS'
print(f"`{x[0]}` is the first character in string x")
print(f"`{x[1:10]}` are the next nine characters including spaces")  # from index 1 to index 9
print(f"`{x[:4]}` are the first four characters in x excluding index 4")
print(f"`{x[4:]}` are the characters from location 4 including till end")
print(f"`{x[-1]}` is the last character")

# membership test with `in` returns a Boolean
print("at" in x)

# format
x = "I am {} and I teach {}"
x.format('Surya', 'Statistics')

`abcd` is of type: <class 'str'>
abcd
let me show what line breaks do
`I` is the first character in string x
` am at UC` are the next nine characters including spaces
`I am` are the first four characters in x excluding index 4
` at UCSAS` are the characters from location 4 including till end
`S` is the last character
True


'I am Surya and I teach Statistics'
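
Several of the string methods listed earlier can be tried on one sample sentence. The sentence below is made up for illustration. Because strings are immutable, every method returns a new string and leaves the original unchanged.

```python
text = "  Welcome to UCSAS, a Python workshop  "

stripped = text.strip()                     # remove leading and trailing spaces
print(stripped.upper())                     # WELCOME TO UCSAS, A PYTHON WORKSHOP
print(stripped.replace("Python", "Stats"))  # Welcome to UCSAS, a Stats workshop

parts = stripped.split(",")                 # split on the comma separator
print(parts)                                # ['Welcome to UCSAS', ' a Python workshop']

print(stripped.count("o"))                  # 5
print(stripped.startswith("Welcome"))       # True
print(stripped.find("UCSAS"))               # 11
print("ucsas".isalpha(), "2021".isdigit())  # True True
```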

In [58]:
# integer 
y = 12
print(f"{y} is of type: {type(y)}")

12 is of type: <class 'int'>


In [59]:
# float
z = 12.0
print(f"{z} is of type: {type(z)}")

12.0 is of type: <class 'float'>


In [60]:
# complex
a = 2 + 3j  # can use J as well, or casting like complex(2, 3)
print(f"{a} is of type: {type(a)}")

(2+3j) is of type: <class 'complex'>


In [62]:
# Boolean
x = True
print(f"{x} is object of type: {type(x)}")

True is object of type: <class 'bool'>


### Data types containing multiple objects:

- In base Python, `list`, `tuple`, and `range` are sequence-like objects that allow repeated values, while `set` and `dict` hold multiple items but do not allow duplicates (duplicate elements in a `set`, duplicate keys in a `dict`).
- `Indexing` of sequence-like objects starts at `0` in Python, so `item[index]` returns the element at a particular index and `x[0]` is the first element.
- `List` uses `[]`, `tuple` uses `()`, `set` uses `{}`, and a dictionary uses `{key: value}`.

- `List` is a mutable object containing a sequence of other objects, which can be of any type, such as `int`, `list`, or functions.
- `Set` is a collection of objects without duplicates. All elements of a set must be hashable (in practice, immutable), so a set cannot have a list as an element.

#### Mapping object
- `dict` maps immutable (hashable) keys to objects of any kind. Unlike the sequence types, its values are accessed by key rather than by position.

In [114]:
# list
x = [1, 2, "abc", [1, 2]]
print(f"{x} is of type: {type(x)}")

print(f"element in x at index 0 is {x[0]}")
print(f"sliced elements between locations 0 and 3(until location 2) is {x[0:3]}")

[1, 2, 'abc', [1, 2]] is of type: <class 'list'>
element in x at index 0 is 1
sliced elements between locations 0 and 3(until location 2) is [1, 2, 'abc']


In [115]:
## list of functions
def a(x):
    return(x[0])

def b(x):
    return(x[1])

z = [a, b]
[x('123') for x in z]

['1', '2']

In [116]:
# tuple `immutable equivalent of list`
y = (1, 2, [1, 2], "abc")
print(f"{y} is element of type: {type(y)}")

(1, 2, [1, 2], 'abc') is element of type: <class 'tuple'>


In [117]:
# range(start, stop, step): all three must be integers. With step 1 this gives start, start+1, ..., stop-1
z = range(1, 5, 1)   
print(f"{z} is element of type: {type(z)}")

range(1, 5) is element of type: <class 'range'>


In [118]:
# set
x = {"ab", "cd", "ed", "ab"}
x

{'ab', 'cd', 'ed'}

In [120]:
# dict: the key 'a' appears twice, so only its last value is kept
x = {'a': [1, 2, 3], 'b': (2, 3), 'a': [2, 4]}

# dict can update
x.update({'c':[1, 5]})
print(x)

{'a': [2, 4], 'b': (2, 3), 'c': [1, 5]}
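
The container rules described in this section can be checked with a short sketch. The names `tags` and `scores` are illustrative, and the exact wording of the error message may differ between Python versions.

```python
# Sets drop duplicates and support membership tests
tags = {"ab", "cd", "ab"}
print(len(tags))      # 2
print("cd" in tags)   # True

# A list is unhashable, so it cannot be an element of a set
try:
    bad = {[1, 2], 3}
except TypeError as err:
    set_error = str(err)
    print(f"TypeError: {set_error}")

# Dictionary values are looked up by key, not by position
scores = {"a": [1, 2, 3], "b": (2, 3)}
print(scores["a"])                     # [1, 2, 3]
print(scores.get("z", "no such key"))  # no such key

# range is lazy; convert it to a list to see its values
print(list(range(1, 5)))               # [1, 2, 3, 4]
```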


## Methods for each data type

- Every data type has built-in methods that reflect its characteristics, such as mutability, whether it allows repetitions, and the kind of elements it can contain.
- A method is applied with a dot after the name of the variable, as in `x.sort()`, which helps make Python readable like English.

- Int: 

In [125]:
def methods(x):
    y = {method for method in dir(x) if not method.startswith("__")}
    print(f"{x}:{y}")

# primitive object methods
methods(int)
methods(float)
methods(str)
methods(complex)
methods(bool)

# methods for objects containing multiple objects
methods(list)
methods(tuple)
methods(set)
methods(dict)

<class 'int'>:{'conjugate', 'bit_length', 'as_integer_ratio', 'to_bytes', 'denominator', 'imag', 'real', 'from_bytes', 'numerator'}
<class 'float'>:{'conjugate', 'as_integer_ratio', 'hex', 'imag', 'fromhex', 'real', 'is_integer'}
<class 'str'>:{'rsplit', 'islower', 'maketrans', 'replace', 'encode', 'isspace', 'capitalize', 'endswith', 'isdecimal', 'swapcase', 'rstrip', 'splitlines', 'isdigit', 'rjust', 'format_map', 'istitle', 'lower', 'translate', 'format', 'split', 'upper', 'isidentifier', 'rfind', 'expandtabs', 'partition', 'isascii', 'join', 'startswith', 'rpartition', 'isalnum', 'rindex', 'title', 'ljust', 'isprintable', 'isalpha', 'index', 'strip', 'center', 'isnumeric', 'lstrip', 'count', 'casefold', 'find', 'isupper', 'zfill'}
<class 'complex'>:{'conjugate', 'imag', 'real'}
<class 'bool'>:{'conjugate', 'bit_length', 'as_integer_ratio', 'to_bytes', 'denominator', 'imag', 'real', 'from_bytes', 'numerator'}
<class 'list'>:{'append', 'insert', 'remove', 'index', 'count', 'copy', 'e

In [131]:
x = 1.2
print(complex(x))
print(x.conjugate())

(1.2+0j)
1.2

## Mutability and Immutability

- Some data types are mutable and others are immutable, and Python handles them differently.
- Mutable objects: can be changed in place, which is cheap because no copy is created.
- Immutable objects: cannot be changed in place; any "change" creates a new object, which is more expensive.
- As a rule of thumb:
    + primitive-like types such as `int`, `float`, `complex`, `str`, `bool` are immutable
    + container-like types such as `list`, `set` are mutable
    + `dict` contains a mapping of `keys` to `values` and is itself mutable.
    + While `dict` values can be of any type, mutable or immutable, `dict` keys must be immutable (hashable)
- Others: 
    + `tuple` is the immutable equivalent of `list`
    + `frozenset` is the immutable equivalent of `set`

- **Exception:** immutable objects can contain mutable objects, and those inner objects can still be mutated

- Most methods of mutable objects modify the original object at the same address
- Methods of immutable objects create a new object
- Even for mutable data types, some methods or operations (such as slicing) return a new object at a different address.
- A common way to recognize an immutable object is to try to change its value and observe whether its `id` changes.

#### Check address change
- `id` returns the identity of an object (its memory address in CPython).
- `is` checks whether two variables refer to the same object.

In [93]:
## Primitive objects are immutable
## (equal short string literals may share one object, so x and y can show the same id)
x = 'abcd'
y = 'abcd'
print(id(x))
print(id(y))

x = x.capitalize()
print(id(x))
print(id(y))


## tuples are immutable
x = (1, 2, 3)

try:
    x[1] = 1
except TypeError:  # tuples do not support item assignment
    print("Error obtained")

4582376688
4582376688
4582558512
4582376688
Error obtained


In [112]:
## Exception on Mutations: list within tuples
x = ([1,2], "str")
print(x)
print(id(x))
print(id(x[0]))  ## the list inside is mutable and stays in the same location
x[0].append(3)
print(x)
print(id(x))
print(id(x[0]))

([1, 2], 'str')
4572223872
4582660224
([1, 2, 3], 'str')
4572223872
4582660224
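
One practical consequence of immutability is which objects can serve as dictionary keys. The sketch below uses illustrative names and shows that tuples of immutable items and `frozenset` objects work as keys, while a list, or a tuple containing a list, is rejected.

```python
# A tuple of immutable items is hashable, so it can be a dictionary key
coordinates = {(0, 0): "origin", (1, 2): "point"}
print(coordinates[(1, 2)])  # point

# frozenset is the immutable counterpart of set, so it can be a key as well
groups = {frozenset({"a", "b"}): 1}
print(groups[frozenset({"b", "a"})])  # 1

# A list, or a tuple that contains a list, is not hashable
rejected = []
for candidate in ([1, 2], ([1, 2], "str")):
    try:
        {candidate: "value"}
    except TypeError:
        rejected.append(candidate)

print(len(rejected))  # 2
```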


In [91]:
## Mutable Data type  - Example: List
x = [1, 2, [1,2]]
y = x
z = x[:]   ## slicing with [:] creates a new (shallow) copy at a different location

print(x)
print(y)
print(z)

## is vs == 
print(x is y)
print(x is z)
print(x == z)

## append is a method of class list; it modifies x in place and returns None, so a is None
a = x.append(4)
x[0] = 2
print(x is y)
print(y)
print(z)

[1, 2, [1, 2]]
[1, 2, [1, 2]]
[1, 2, [1, 2]]
True
False
True
True
[2, 2, [1, 2], 4]
[1, 2, [1, 2]]


## Operators:

- Arithmetic Operators: 
    + `add: +, subtract: -, multiply: *, division: /, modulus: %, exponentiation: **, floor division: //`
    
- Assignment Operators: 
    + `equals: =, add and equal: +=, subtract and equal: -=, multiply and equal: *=, divide and equal: /=, remainder and equal: %=, floor division and equal: //=, power and equal: **=`
    + `bitwise and and equal: &=, bitwise or and equal: |=, bitwise xor and equal: ^=`
    
- Comparison Operators: 
    + `value equality: ==, value not equal: !=, value greater than: >, value less than: <, value greater than equal: >=, value less than equal: <=`

- Logical Operators: 
    + `and, or, not` (`^` is the bitwise XOR operator, which acts as a logical xor on Booleans)

- Identity Operators: 
    + `is, is not` (checks for object being same)
    
- Membership Operators for being part of a list, tuple, or other container: 
    + `in, not in`
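
A few of the operators listed above behave in ways that are easy to misremember, so here is a minimal sketch of them in action.

```python
print(7 / 2)    # 3.5  true division always returns a float
print(7 // 2)   # 3    floor division
print(-7 // 2)  # -4   floor division rounds toward negative infinity
print(7 % 2)    # 1    remainder
print(2 ** 10)  # 1024 exponentiation

count = 10
count += 5      # same as count = count + 5
count //= 4     # same as count = count // 4
print(count)    # 3

# `and`, `or`, `not` are the logical operators; `^` is bitwise XOR
print(True and not False)  # True
print(6 ^ 3)               # 5, since 0b110 ^ 0b011 == 0b101
print(True ^ True)         # False, XOR on bools means "exactly one is true"

print(3 in [1, 2, 3], "z" not in "UCSAS")  # True True
```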

## Data Casting to types

- Some types of data can be cast to a different type, as long as the conversion is reasonable.
- Casting is done by using the `type name` as a `function`.
- For example, an `integer` can be cast into a `float` or a `complex` number. A `float` can become an `int` by dropping the decimals (truncating toward zero). However, a `complex` number cannot be cast into an `int`.
- A `list` and a `tuple` can always be cast to each other.
- A `list` can be cast into a `set` even if there are repetitions, but not if it contains unhashable elements such as other lists.
- A `set` is similar to the keys of a dictionary, but remember that a set is unordered and unindexed.
- Use the `zip` function to pair keys with values and so set up a dictionary.

In [134]:
## Integer to Float
x = 1
print(float(x))
print(complex(x))

## Boolean to Integer
x = True
print(int(x))

## Unpack a list into separate integer variables
x = [1, 2, 3]
a, b, c = x

1.0
(1+0j)
1


In [135]:
# list to a set
x = [1, 2, 1+2j, "a"]
y = set(x)
print(y)

# set and list to a dictionary: use y as set of keys and x as values
# (a set is unordered, so which key is paired with which value can vary)
dict(zip(y, x))

{1, 2, 'a', (1+2j)}


{1: 1, 2: 2, 'a': (1+2j), (1+2j): 'a'}
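
The casting rules above can be sketched in one place. The names `keys`, `values`, and `mapping` are illustrative. The text of the `TypeError` message depends on the Python version, so it is printed rather than assumed.

```python
# int() drops the fractional part (truncates toward zero)
print(int(2.9))    # 2
print(int(-2.9))   # -2

# A complex number cannot be cast to int or float directly
try:
    float(1 + 2j)
except TypeError as err:
    cast_error = str(err)
    print(f"TypeError: {cast_error}")

# The real and imaginary parts are ordinary floats
print((1 + 2j).real, (1 + 2j).imag)  # 1.0 2.0

# list <-> tuple, and list -> set (duplicates are dropped)
print(tuple([1, 2, 2]))        # (1, 2, 2)
print(sorted(set([1, 2, 2])))  # [1, 2]

# zip pairs keys with values in order when both are lists
keys = ["a", "b", "c"]
values = [1, 2, 3]
mapping = dict(zip(keys, values))
print(mapping)  # {'a': 1, 'b': 2, 'c': 3}

# Strings that look like numbers can be cast too
print(int("42") + 1)  # 43
```

Using two lists with `zip`, rather than a set and a list, keeps the pairing of keys and values predictable.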

## Conditionals

- 

## Loops

## Functions, Scope

## Modules in Python

- Install
- Import
- Use functions

## `numpy`

- Numpy is known for arrays and random number generators
- 

## `pandas`

- Pandas is known for data frames with row names and column names, similar to R data frames manipulated with `dplyr`


## `scipy`

- The `stats` package in scipy provides a wide range of statistical functions
- 

## `matplotlib`

## Miscellaneous 

- scikit 
- 

## Quiz

## Quiz Soln.

## Quiz Soln. Contd.



## Important References

- [Practical Data Science by Prof.Eubanks](https://www.practicaldatascience.org/html/index.html)
- Anaconda and respective documentations
- [W3Schools](https://www.w3schools.com/python/default.asp)



## Acknowledgements:

- I thank the audience for their patience and enthusiasm.
- I thank Professor Jun Yan, whose class has taught me a lot.
- I also acknowledge the IMSI bootcamp and the vast amount of open-source material available on the internet.
- I wish you all the best of luck working with Python.
- I will be available at [surya.eada@uconn.edu](mailto:surya.eada@uconn.edu)

## Questions?