# Intro to Python - Part 1

- **Why Python?**
    - Geospatial analysis (Geopandas, Shapely, ...): Thursday 
    - Web scraping (BeautifulSoup, Selenium, ...): Friday
    - Machine Learning (scikit-learn, PyTorch)
- **Goal of this session**
    - Part 1: Basic syntax (data structure, control flow, functions)
    - Part 2: NumPy, Object-oriented programming, Parallel computing
    - Part 3: Pandas

---

## Outline

- Types: Numbers, Strings, Booleans
- Data Structures: List, Tuple, Dictionary
- Control Flow Statements: If-else, for loop, while loop
- Functions

## Managing environment using Conda

Reference: [Managing environments](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html)

## Programming Basics

### Basic Types

- Numbers: integers and real (floating-point) numbers
- Strings: text
- Booleans: _True_ or _False_
- None: indication of absence of a value

In [1]:
a = 1
print(type(a))

b = "hello world!"
print(type(b))

c = True # Note that Python is case sensitive, so the capital letter is required
print(type(c))

d = None
print(type(d))

<class 'int'>
<class 'str'>
<class 'bool'>
<class 'NoneType'>


### Operations

- Arithmetic operations on numbers: `+, -, *, **, /, //, %` 
- Relational operations on numbers: `>, <, >=, <=, ==, !=`

In [2]:
print(3 + 2)
print(3 - 2)
print(3 * 2)
print(3 ** 2)
print(3 / 2)
print(3 // 2) # integer division
print(3 % 2) # mod
print(3 > 2)
print(3 == 2)
print(3 != 2)

5
1
6
9
1.5
1
1
True
False
True
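
The link between `//` and `%` can be checked directly. The following is a minimal sketch (the variable names are illustrative only); note that `//` rounds toward negative infinity, so results with a negative operand may be surprising:

```python
# Floor division and modulo satisfy: a == (a // b) * b + (a % b)
a, b = 7, 2
quotient, remainder = a // b, a % b
print(quotient, remainder)            # 3 1
print(a == quotient * b + remainder)  # True

# With a negative operand, // rounds down (toward negative infinity)
neg_quotient, neg_remainder = -7 // 2, -7 % 2
print(neg_quotient, neg_remainder)    # -4 1
```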


- Concatenate strings: `+`
    - Warning: you cannot add a number to a string! Instead, convert the number to a string first.
- Comparing strings: `==, !=`
    - Warning: do not compare strings with numbers; `==` between a string and a number is always `False`, even when they look alike

In [6]:
print("abc" + "def")
print("123" + str(123))
number = 123
print(f"123{number}")
print(int("123") + 123)

string = "abc"
print("abc" == string)

print("123" == 123)

abcdef
123123
123123
246
True
False
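
To see what the first warning means in practice, here is a small sketch of what happens when we try to add a number to a string, and two ways to convert explicitly. The names `text`, `as_text`, and `as_number` are made up for this illustration:

```python
text = "123"
number = 123

try:
    result = text + number  # str + int is not defined
except TypeError as err:
    result = None
    print(f"TypeError: {err}")

# Convert explicitly, depending on which result is intended
as_text = text + str(number)    # string concatenation
as_number = int(text) + number  # integer addition
print(as_text, as_number)       # 123123 246
```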


- Boolean expression: `and, or, not`
    - Note: If the value of the left operand determines the result of the operation, then the right operand is not evaluated (this is called short-circuit evaluation); that is
        - (True) or (_)
        - (False) and (_)

In [10]:
x = 1
y = 0

print((x > y) and (x < y+1))
print((x > y) or (x < y+1))
print(not (x > y))

print((y != 0) and (x / y == 0))
# print((x / y == 0) and (y != 0))  # ZeroDivisionError: x / y is evaluated first

False
True
False
False
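
One way to observe that the right operand is skipped is to record when it gets evaluated. This is only a sketch; `noisy` and `calls` are hypothetical helpers introduced for the illustration:

```python
calls = []

def noisy(value):
    """Hypothetical helper: record that it was called, then return value."""
    calls.append(value)
    return value

result_or = True or noisy(False)    # left operand decides; noisy is skipped
result_and = False and noisy(True)  # left operand decides; noisy is skipped
print(calls)                        # []

result_both = True and noisy(True)  # left operand does not decide; noisy runs
print(calls)                        # [True]
```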


### List

In [11]:
empty_lst = []
num_list = [1, 2, 3]
str_list = ["a", "b", "c"]
mixed_lst = [1, "a", True] # We can do this but this is not the best practice

Concatenate lists

In [12]:
list1 = [1, 2, 3, 4, 5]
list2 = [6, 7, 8]
list1 + list2

[1, 2, 3, 4, 5, 6, 7, 8]

Access elements in a list

In [14]:
print(list1[0])
print(list1[-1])

i = 0
print(list1[i+1])

print(list1[:2])
print(list1[1:4])
print(list1[3:])
print(list1[:])
print(list1[::2])
print(list1[::-1])
print(list1[::-2])

1
5
2
[1, 2]
[2, 3, 4]
[4, 5]
[1, 2, 3, 4, 5]
[1, 3, 5]
[5, 4, 3, 2, 1]
[5, 3, 1]
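
The slices above all follow the pattern `lst[start:stop:step]`. As a rough sketch (using a fresh list `values` and a list comprehension, which is not covered above), a slice picks the same positions that `range(start, stop, step)` would generate:

```python
values = [1, 2, 3, 4, 5]

# values[start:stop:step] picks indices start, start+step, ... and excludes stop
picked = values[1:5:2]
by_hand = [values[i] for i in range(1, 5, 2)]
print(picked, by_hand)  # [2, 4] [2, 4]

# Out-of-range slice bounds are clipped rather than raising an error
clipped = values[3:100]
print(clipped)          # [4, 5]
```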


Lists are **mutable**: we can change both the contents and the size of a list

In [15]:
list1[0] = 100
print(list1)

list1[1:3] = [200, 300]
print(list1)

[100, 2, 3, 4, 5]
[100, 200, 300, 4, 5]


In [16]:
print(len(list1))
list1.append(600)
print(list1)
print(len(list1))

5
[100, 200, 300, 4, 5, 600]
6


Comparing lists

In [17]:
list2 = list1[:]
print(list1 == list2) # have the same elements
print(list1 is list2) # but are two different objects in memory

True
False
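
The distinction between `==` and `is` matters because lists are mutable. A short sketch with made-up names: plain assignment creates a second name for the same list, whereas slicing with `[:]` creates a new list.

```python
original = [1, 2, 3]
alias = original            # a second name for the same list object
shallow_copy = original[:]  # a new list with the same elements

alias.append(4)             # the change is also visible through `original`
print(original)             # [1, 2, 3, 4]
print(shallow_copy)         # [1, 2, 3]
print(alias is original, shallow_copy is original)  # True False
```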


List of lists

In [18]:
list_lst = [[1, 2, 3], [4, 5, 6]]
print(list_lst)

print(list_lst[0])
print(list_lst[0][1])

[[1, 2, 3], [4, 5, 6]]
[1, 2, 3]
2


One may notice that a list of lists is very similar to a matrix. In practice, we more often represent a matrix with the n-dimensional array object from the NumPy package, because it provides better support for linear algebra.
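
To get a feeling for why plain lists are inconvenient for linear algebra, here is a pure-Python sketch of two basic matrix operations on a list of lists. The names `matrix` and `vector` are illustrative; with NumPy arrays, the same operations are available as the `@` operator and the `.T` attribute.

```python
matrix = [[1, 2, 3], [4, 5, 6]]
vector = [1, 0, -1]

# Matrix-vector product: one dot product per row
product = [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]
print(product)     # [-2, -2]

# Transpose: zip(*matrix) groups the i-th entries of every row
transposed = [list(column) for column in zip(*matrix)]
print(transposed)  # [[1, 4], [2, 5], [3, 6]]
```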

### Tuple

Tuples are very similar to lists. The most obvious difference is that we use parentheses instead of square brackets.

In [19]:
a = (1, 3)
b = ((1, 2), (3, 4))

The most distinctive feature of a tuple is that it is **immutable**, i.e. once a tuple is created, its contents cannot be modified.

In [22]:
a[0]

# a[0] = 2
# TypeError: 'tuple' object does not support item assignment

1

This feature is nice when the data object we want to store should not be easily changed, for instance the `(latitude, longitude)` of a place.
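
A small sketch of how this plays out, using a hypothetical coordinate pair (the numbers and the name `place_names` are invented for the example). Besides being protected from modification, a tuple of numbers can be unpacked into separate variables and used as a dictionary key:

```python
location = (41.79, -87.60)  # hypothetical (latitude, longitude) pair

try:
    location[0] = 0.0       # not allowed: tuples are immutable
    modified = True
except TypeError:
    modified = False
print(modified)             # False

latitude, longitude = location  # tuple unpacking

# A tuple of numbers is hashable, so it can be a dictionary key (a list cannot)
place_names = {location: "example place"}
print(place_names[(41.79, -87.60)])  # example place
```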

### Dictionary

Like lists, dictionaries can be used to store values, but do so by associating each `value` with a unique `key` rather than with a position in a sequence.

In [23]:
empty_dict = {}

estimate_dict = {
    "estimate": 0.5,
    "se": 0.23,
    "p_value": 0.05
}

print(estimate_dict.keys())
print(estimate_dict.items())

dict_keys(['estimate', 'se', 'p_value'])
dict_items([('estimate', 0.5), ('se', 0.23), ('p_value', 0.05)])


Instead of accessing values by their positions in the data structure, we access them by their keys.

- `dict[key]`
- `dict.get(key)`

The difference between the two is that, when `key` is not in the dictionary, the first raises a `KeyError` while the second returns `None`.

In [26]:
print(estimate_dict["estimate"])
print(estimate_dict.get("se"))

# print(estimate_dict["abc"])
# KeyError: 'abc'
print(estimate_dict.get("abc"))

0.5
0.23
None
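
Two related tools are worth knowing, sketched below on a separate dictionary named `results` (an illustrative name): `get` accepts an optional second argument that is returned instead of `None` for a missing key, and the `in` operator tests whether a key exists.

```python
results = {"estimate": 0.5, "se": 0.23, "p_value": 0.05}

# The second argument of get() is the fallback for a missing key
missing_default = results.get("abc", 0.0)
print(missing_default)  # 0.0

# `in` checks for the presence of a key (not a value)
has_se = "se" in results
has_abc = "abc" in results
print(has_se, has_abc)  # True False
```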


Add new (key, value) pair to the dictionary:

In [28]:
estimate_dict["upper_ci"] = estimate_dict["estimate"] + 1.96 * estimate_dict["se"]
estimate_dict

{'estimate': 0.5, 'se': 0.23, 'p_value': 0.05, 'upper_ci': 0.9508000000000001}

We can also create nested dictionaries:

In [29]:
estimate_dict = {
    "OLS": {"estimate": 0.5, "se": 0.23, "p_value": 0.05},
    "IV" : {"estimate": 0.1, "se": 0.5, "p_value": None}
}

print(estimate_dict["OLS"]["estimate"])

0.5


In [32]:
estimates = {
    "Gender" : {"beta1":{"pe":1.2, "se":0.2, "t":6}, "VCV": [[1,2],[3,4]]}
}

In [33]:
estimates["Gender"]["beta1"]["pe"]

1.2

In [35]:
estimates["Gender"]["VCV"]

[[1, 2], [3, 4]]

### Conditional statement

if-else statement

```
if <boolean expression>:
    <statement>
elif <boolean expression>:
    <statement>
...
else:
    <statement>
```

In [36]:
n = -1
if (n > 0):
    print(f"{n} is positive")
elif (n == 0):
    print(f"{n} is 0")
else:
    print(f"{n} is negative")
    n = -n
print(n)

-1 is negative
1


### For loop

Loop over a range object

In [39]:
list(range(3))

[0, 1, 2]

In [37]:
for i in range(3):
    print(i)

0
1
2


In [41]:
list(range(1,5))

[1, 2, 3, 4]

In [40]:
for i in range(1, 5, 2):
    print(i)

1
3


Loop over list:

In [5]:
lst = ["a", "b", "c"]

for element in lst:
    print(element)

a
b
c


Get both the element and the position index:

In [43]:
for i, element in enumerate(lst):
    print(i, element)

0 a
1 b
2 c


Loop over dictionary:

In [2]:
my_dict = {"a": 1, "b": 2, "c": 3}

for key in my_dict:
    print(key)

a
b
c


In [3]:
for key, value in my_dict.items():
    print(key, value)

a 1
b 2
c 3


Break a loop:

In [6]:
for i, element in enumerate(lst):
    if i > 1:
        break
    print(i, element)

0 a
1 b


Skip iteration:

In [7]:
for i, element in enumerate(lst):
    if i == 0:
        continue
    print(i, element)

1 b
2 c


### While loop

In [9]:
N = 10
i = 1
total = 0

while i <= N:
    total +=  i
    i += 1

total

55

A while loop is generally less robust than a for loop. For instance, if the condition never becomes false (say, we forget to update `i`), the loop never terminates; this is called an _infinite loop_.
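
One common safeguard, shown here only as a sketch, is to cap the number of iterations in addition to the actual stopping condition. The tolerance and the cap `max_iterations` are arbitrary values chosen for the illustration:

```python
value = 1.0
tolerance = 1e-3
max_iterations = 100  # safety cap (an arbitrary choice for this example)
iterations = 0

# Halve `value` until it drops to the tolerance or the cap is reached
while value > tolerance and iterations < max_iterations:
    value /= 2
    iterations += 1

print(iterations, value)  # 10 0.0009765625
```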

### Functions

Functions help us organize and abstract code. We can define a function as follows:

In [10]:
def multiply(a, b):

    ''' 
    Compute the product of two values.

    Inputs: a, b: the values to be multiplied.
    Returns: the product of the inputs
    '''
    
    n = a * b
    return n

multiply(3, 5)

15

Alternatively, we can define anonymous functions:

```
lambda <parameters>: <expression>
```

In [11]:
f = lambda a, b: a * b 

# Binding a lambda to a name is bad practice, though: anonymous functions should stay anonymous.
# We will see why sometimes we want to use anonymous functions later.

f(3, 5)

15
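
A typical place where an anonymous function is used without ever being named is as an argument to another function. As a sketch with made-up data, the built-in `sorted` accepts a `key` function that decides what to sort by:

```python
pairs = [("b", 3), ("a", 1), ("c", 2)]

# Sort by the second entry of each tuple; the lambda is never given a name
sorted_by_number = sorted(pairs, key=lambda pair: pair[1])
print(sorted_by_number)  # [('a', 1), ('c', 2), ('b', 3)]
```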

### Functional programming

One major paradigm of programming languages is the functional paradigm, in which functions are primarily used to compute values. A _functional_, by definition, is a function of functions. For instance, consider the following map
$$h(f, x) = f(x)^2,$$

where the function $h$ takes a function $f$ and a value $x$ as inputs. 
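
The map $h$ can be written directly in Python, since functions can be passed as arguments like any other value. This is a minimal sketch; `add_one` is a hypothetical example function:

```python
def h(f, x):
    """Apply f to x, then square the result."""
    return f(x) ** 2

def add_one(x):  # hypothetical example function
    return x + 1

squared_sum = h(add_one, 2)              # (2 + 1)^2
squared_double = h(lambda x: 2 * x, 3)   # (2 * 3)^2
print(squared_sum, squared_double)       # 9 36
```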

We will introduce two (powerful) functions that reflect the spirit of functional programming, `map` and `reduce`. We want to discuss them because they are closely related to the design of the `multiprocessing` package. 

Let's say we have a list of values, and we would like to apply a function $f: x\mapsto x^2$ to each element in the list. The most intuitive way might be using a for loop:

In [13]:
def f(x):
    return x**2

x_lst = [1, 2, 3, 4]
fx_lst = []
for x in x_lst:
    fx_lst.append(f(x))
print(fx_lst)

[1, 4, 9, 16]


Alternatively, we can **map** $x$ to $f(x)$ using `map(f, x_lst)`

In [14]:
list(map(lambda x: x ** 2, x_lst))

[1, 4, 9, 16]

Note that we are using an anonymous function because we know that it is a _temporary_ function; we will probably not reuse it in the future. In scenarios like this where we want to apply a temporary function, we can use anonymous functions.

We can of course use a function that is already defined as well.

In [15]:
list(map(f, x_lst))

[1, 4, 9, 16]

Sometimes, instead of applying the function to each element, we want to repeatedly apply a function to **reduce** the list to a single value. For instance,
- $\sum$: $\ldots (((x_1 + x_2) + x_3) + x_4) \ldots $
- $\prod$: $\ldots (((x_1 \times x_2) \times x_3) \times x_4) \ldots $
- And more generally: $ \ldots f(f(f(x_1, x_2), x_3), x_4) \ldots $

In [19]:
from functools import reduce

reduce(multiply, x_lst) # take the product of the x_lst

24
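
To make the nesting concrete, here is a rough re-implementation of what `reduce` does for a non-empty list. The function `reduce_by_hand` is a hypothetical illustration, not how `functools` implements it; `multiply` and `x_lst` are redefined so that the block runs on its own:

```python
from functools import reduce

def reduce_by_hand(func, values):
    """Hypothetical loop version of reduce for a non-empty list."""
    accumulator = values[0]
    for value in values[1:]:
        accumulator = func(accumulator, value)  # f(f(...), next value)
    return accumulator

def multiply(a, b):
    return a * b

x_lst = [1, 2, 3, 4]
by_hand = reduce_by_hand(multiply, x_lst)
built_in = reduce(multiply, x_lst)
print(by_hand, built_in)  # 24 24
```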

It is common to use `reduce` after `map`. For example, if we want to compute $\sum_i x_i^2$, we can do

In [20]:
reduce(lambda x, y: (x + y), list(map(lambda x: x ** 2, x_lst)))

30
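
The same quantity can be computed in a few other ways. The sketch below (with illustrative variable names) folds the squaring into the reducing function and passes `0` as the optional third argument of `reduce`, the initial value, which also makes the call safe on an empty list. For plain sums, the built-in `sum` is usually the simplest choice.

```python
from functools import reduce

x_lst = [1, 2, 3, 4]

# Start from 0 and add the square of each element in turn
with_reduce = reduce(lambda acc, x: acc + x ** 2, x_lst, 0)

# The built-in sum over a generator expression gives the same result
with_sum = sum(x ** 2 for x in x_lst)
print(with_reduce, with_sum)  # 30 30

# With an initial value, reducing an empty list returns that value
empty_case = reduce(lambda acc, x: acc + x ** 2, [], 0)
print(empty_case)             # 0
```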

Another major paradigm of programming languages is the object-oriented paradigm, which we will cover in the second part.

## Reference

- CMSC 12100 by Borja Sotomayor