# i. Introduction to Python 1

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gem-epidemics/practical-epidemics/blob/master/site/source/iddinf/i-intro-to-python-1.ipynb)

**Date**: Monday Sept 9, 2024

In this session we cover the Python syntax and commands that you will need throughout the course. Since a large proportion of people will have previously coded in R, we make an effort to relate and contrast Python and R.

## LEARNING OUTCOMES
- Understand and work with key data structures and functions
- Perform basic mathematical operations using scientific computing libraries

## Imports

Just as packages must be loaded in R, Python packages must be imported into the environment we are working in. This is done with the `import` statement. Packages are typically given short aliases because functions within a package are called by prefixing the function name with the package name (or its alias).

The imports must be repeated in every script that uses them.

In [None]:
# imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import string
import math
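
As a small sketch of how prefixes and aliases work, the cell below imports standard-library modules in three common ways. The alias `st` is chosen here purely for illustration; unlike `np` or `pd`, it is not a community convention. For R users, the prefixed style is closest to writing `pkg::fn()`.

```python
import math                      # no alias: call functions as math.<name>
import string as st              # alias: access attributes as st.<name>
from math import sqrt            # import a single name directly

root_a = math.sqrt(16)           # prefixed with the module name
root_b = sqrt(16)                # no prefix needed
has_a = "a" in st.ascii_lowercase

print(root_a, root_b, has_a)     # 4.0 4.0 True
```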

In [None]:
# set colour scheme for plots
sns.set(palette = "Dark2")

We can use a Python cell as a calculator.

In [None]:
3+4

7

In [None]:
3*4

12

In [None]:
3**4

81

In [None]:
3/4

0.75
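
A few more operators are worth knowing, especially for readers coming from R. The sketch below lists the R equivalents in comments; note in particular that `^` is *not* exponentiation in Python.

```python
quotient = 7 // 2     # floor division (R: 7 %/% 2)
remainder = 7 % 2     # modulo         (R: 7 %% 2)
power = 2 ** 10       # exponentiation (R: 2 ^ 10)
caret = 2 ^ 10        # bitwise XOR in Python, not a power

print(quotient, remainder, power, caret)   # 3 1 1024 8
```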

## Python data structures

This section introduces the base data structures available to us in Python. We begin with scalars and lists before moving on to more complex structures such as nested lists and dictionaries, finishing off with (named) tuples.



### Atoms

In [None]:
# scalar values stored in memory
x = 1                           # a number
y = 'one'                       # a string

In [None]:
# use built-in function print
print(x)

1


In [None]:
type(x)

int

In [None]:
type(y)

str

A few rules:

1. names can include letters, numbers, and underscores
2. *cannot* start with a digit
3. Python *is* case sensitive
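
The rules above can be checked directly. This is a minimal sketch; the variable names are made up for illustration, and `str.isidentifier()` only tests whether a string is syntactically valid as a name.

```python
case = 10
Case = 20                                # a different variable: names are case sensitive
print(case, Case)                        # 10 20

valid_name = "state_1".isidentifier()    # letters, digits and underscores
invalid_name = "1_state".isidentifier()  # starts with a digit
print(valid_name, invalid_name)          # True False
```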

### Exercise 1

Check your understanding of the following lines of code by tracing through them and writing what you expect the output to be in a comment on the same line.

In [None]:
x = 12
y = 6

# Trace me :)
x / 2               # Out:
x / y               # Out:
x*2 - y             # Out:
sum_x_y = x + y     # Out:
sum_x_y             # Out:

18

### Solution

In [None]:
# Trace me :) - Solution
print(x / 2)
print(x / y)
print(x*2 - y)

print(sum_x_y)

6.0
2.0
18
18


### Tuples

Tuples are collections of atoms that are immutable (i.e. elements cannot be added, removed or changed once the tuple is created). Items within a tuple can be accessed using the index of the item. An important thing to note is that Python is a zero-indexed language, so the first entry is at index 0, whereas R starts counting at 1.

In [None]:
z = (x, y)                      # a tuple - round brackets

In [None]:
z[0]

12

In [None]:
z[1]

6

In [None]:
# negative indexing
z[-1]

6

In [None]:
z = (x,y,x+y)

In [None]:
# tuple slicing - left close right open
z[0:2]

(12, 6)

In [None]:
z[1:3]

(6, 18)

In [None]:
z[0:-2], z[0:-1], z[0:],

((12,), (12, 6), (12, 6, 18))
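
To see immutability in action, the sketch below tries to overwrite an element of a tuple and catches the resulting error. The names `state` and `new_state` are illustrative only.

```python
state = (9, 1, 0)

try:
    state[0] = 8                 # tuples do not support item assignment
except TypeError as err:
    message = str(err)
    print(f"TypeError: {message}")

# To "change" a tuple we build a new one instead
new_state = (8,) + state[1:]
print(new_state)                 # (8, 1, 0)
```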

### Exercise 2

Given the tuple below, answer the following:

1. What is the 13th letter in the alphabet?
2. What are the 7th to 11th letters of the alphabet?
3. What is the middle letter of the alphabet?
4. Spell dog using elements of the tuple


**NOTE** Python is 0-indexed

In [None]:
# tuple of all lowercase letters in the English alphabet
alphabet = tuple(string.ascii_lowercase) # tuple constructor

print(alphabet)

('a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z')


In [None]:
# 1. 13th letter


In [None]:
# 2. 7th to 11th


In [None]:
# 3. middle letter


In [None]:
# 4. dog


### Solution

In [None]:
# Solution
print(alphabet[12])

m


In [None]:
# Solution
print(alphabet[6:11])  # indices 6 to 10; the stop index is excluded

('g', 'h', 'i', 'j', 'k')


In [None]:
# Solution
num_letters = len(alphabet)
print(num_letters)
print(alphabet[int(num_letters/2)])

26
n


In [None]:
# Solution
print(alphabet[3], alphabet[14], alphabet[6])

d o g



### Lists
Lists are flexible structures since items can be of any type (e.g. int, float, string, list, etc.). Unlike tuples, lists *are* mutable, so we can add, remove or change items.

In [None]:
z = [x,y]                       # a list - square brackets

In [None]:
z

[12, 6]

In [None]:
# indexing works the same way!
z[0]

12

In [None]:
# negative indexing
z[-1]

6

In [None]:
# since they are mutable we can change values
z[-1] = 'a'
z

[12, 'a']
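
Besides assigning to an index, lists come with methods for adding and removing items. The sketch below uses a made-up list `counts` to walk through the most common ones.

```python
counts = [9, 7]
counts.append(5)          # add to the end        -> [9, 7, 5]
counts.insert(0, 10)      # insert at index 0     -> [10, 9, 7, 5]
counts.remove(7)          # remove first value 7  -> [10, 9, 5]
last = counts.pop()       # remove and return the last item

print(counts, last)       # [10, 9] 5
```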

### Exercise 3

Someone made a typo when spelling 'dog'. Fix the following lists to the correct spelling.

In [None]:
dog_1 = ['d', 'o', 'h']

In [None]:
# correct spelling


In [None]:
dog_2 = ['d', 'g', 'o']

In [None]:
# correct spelling

### Solution

In [None]:
# correct spelling - Solution
dog_1[-1] = 'g'
print(dog_1)

['d', 'o', 'g']


In [None]:
# correct spelling - Solution
dog_2 = [dog_2[0], dog_2[2], dog_2[1]]
print(dog_2)

['d', 'o', 'g']


### Dictionaries
Dictionaries are a data structure used to hold key-value pairs. This is helpful because we can give the keys names that describe the values they are associated with. Dictionaries are defined within a pair of curly braces `{ }`, as shown below.

In [None]:
# define a dictionary
epi_case = {'name': 'John Smith',
            'infected_date': '2024-01-01',
            'recovery_date': '2024-01-03',
            'age': 42}

print(epi_case)

{'name': 'John Smith', 'infected_date': '2024-01-01', 'recovery_date': '2024-01-03', 'age': 42}


In [None]:
# accessing elements of a dictionary

# option 1: using the key
epi_case['infected_date']

'2024-01-01'

In Python, methods are functions associated with an object that allow users to interact with and/or modify the data within the object. Dictionaries have several methods available, which are listed in this [W3Schools tutorial on Python dictionaries](https://www.w3schools.com/python/python_dictionaries.asp).

In [None]:
# option 2: using the get method
epi_case.get('infected_date')

'2024-01-01'
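
The two access options differ when the key is missing. A short sketch, using a hypothetical dictionary `case_record` so that `epi_case` is left untouched:

```python
case_record = {'name': 'John Smith', 'age': 42}

try:
    case_record['height']                       # a missing key raises KeyError
except KeyError:
    print("'height' is not a key")

height = case_record.get('height')              # get returns None instead
height_default = case_record.get('height', -1)  # or a default value we choose
print(height, height_default)                   # None -1
```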

In [None]:
epi_case.keys()

dict_keys(['name', 'infected_date', 'recovery_date', 'age'])

In [None]:
epi_case.values()

dict_values(['John Smith', '2024-01-01', '2024-01-03', 42])

### Combining data structures

Epidemics are commonly represented as a vector of the number of units in each model state. Suppose we are working with an SIR model; for a given time $t$, the state of the population can be represented by a tuple of counts in each of the S, I and R states, (S, I, R).

In [None]:
# state counts for an SIR model
state0 = (9,1,0)

In [None]:
state1 = (7,2,1)

In [None]:
state2 = (5,3,2)

We can combine these states in a few ways. The easiest is to create a list of these tuples.

In [None]:
state_seq = [state0]

In [None]:
state_seq.append(state1)            # list method

In [None]:
state_seq

[(9, 1, 0), (7, 2, 1)]

In [None]:
state_seq = [state_seq,state2]      # overwrite the variable in memory

In [None]:
state_seq

[[(9, 1, 0), (7, 2, 1)], (5, 3, 2)]

In [None]:
# unpack the list
state_seq = [state0]
state_seq.append(state1)
state_seq = [*state_seq,state2]
state_seq

[(9, 1, 0), (7, 2, 1), (5, 3, 2)]

In [None]:
# can add lists together
state_seq + state_seq

[(9, 1, 0), (7, 2, 1), (5, 3, 2), (9, 1, 0), (7, 2, 1), (5, 3, 2)]

We can also combine data stored in dictionaries.

In [None]:
# define three dictionaries
epi_case1 = {'name': 'John Smith',
            'infected_date': '2024-01-01',
            'recovery_date': '2024-01-03',
            'age': 42}

extra_data1 = {'hospitalize': True,
               'vacinated': False,
               'height': 175}

epi_case2 = {'name': 'Jane Smith',
            'infected_date': '2024-01-02',
            'recovery_date': '2024-01-05',
            'age': 42}

In [None]:
# merge by unpacking; if keys overlap, the value from the later dictionary wins
john_details = {**epi_case1, **extra_data1}
john_details

{'name': 'John Smith',
 'infected_date': '2024-01-01',
 'recovery_date': '2024-01-03',
 'age': 42,
 'hospitalize': True,
 'vacinated': False,
 'height': 175}
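
When two dictionaries do share a key, unpacking still runs, but only the value from the right-most dictionary is kept. The dictionaries `first` and `update` below are made up to illustrate this.

```python
first = {'name': 'John Smith', 'age': 42}
update = {'age': 43, 'height': 175}

merged = {**first, **update}     # 'age' appears in both: 43 overwrites 42
print(merged)                    # {'name': 'John Smith', 'age': 43, 'height': 175}
```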

If we want to add the entries for each key together, we can write our own function... See Exercise 5

## Functions

As in other languages, users can define their own functions in Python. Functions are first-class objects: you can assign them to variables, store them in data structures, pass them as arguments to other functions, and even return them as values from other functions.

In [None]:
def add_7(number):
    new_number = number + 7
    return new_number

In [None]:
add_7(3)

10

In [None]:
def add_7(number):
    return number + 7

In [None]:
add_7(x)

19

In [None]:
# inline functions known as lambda expressions
inline_add_7 = lambda x: x + 7

In [None]:
inline_add_7(x)

19

To illustrate how we can pass functions as arguments, let's try to implement a two-step process.

### Exercise 4
Write a function which takes a number as an argument, adds 7 to it, and multiplies the result by 2. Note that a function can take more than one argument and each argument can be of a different type.

In other words, code the following math formula:

$$ f(x) = 2(x+7)$$

### Solution

In [None]:
# Solution 1
def two_step_process(number):
    step1 = number + 7
    return 2 * step1

In [None]:
two_step_process(x)

38

In [None]:
# Solution 2: allow step 1 to be an input to the process
def two_step_process(step1_fn, number):
    return 2 * step1_fn(number)

In [None]:
two_step_process(step1_fn = add_7, number = x)

38

This is equivalent to

\begin{align*}
g(x) &= x + 7 \\
h(x) &= 2x \\
f(x) &= h(g(x))
\end{align*}

In [None]:
# generalizing the add_7 function
def add_n(num_to_add):
    def fn(number):
        return number + num_to_add
    return fn

In [None]:
add_7 = add_n(7)
type(add_7)

function

In [None]:
2*add_7(x)

38
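
The composition $f(x) = h(g(x))$ can also be written as a generic helper that takes two functions and returns a new one. This is only a sketch: `compose`, `plus_7` and `double` are hypothetical names introduced here, not part of Python or the course code.

```python
def compose(outer_fn, inner_fn):
    """Return a function computing outer_fn(inner_fn(value))."""
    def composed(value):
        return outer_fn(inner_fn(value))
    return composed


def plus_7(number):
    return number + 7


def double(number):
    return 2 * number


f = compose(double, plus_7)      # f(x) = 2 * (x + 7)
print(f(12))                     # 38
```

Swapping the arguments gives a different function, `compose(plus_7, double)`, which computes $2x + 7$ instead.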

## Printing with f-strings

In [None]:
# block on printing with f-strings
print(f'z values are {z}')
print(f'First z value rounded to 3 decimal places is {z[0]:.3f}')

z values are [12, 'a']
First z value rounded to 3 decimal places is 12.000
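
The part after the colon inside the braces is a format specification. A few other common ones are sketched below; the variable names and values are made up for illustration.

```python
attack_rate = 0.1234
population = 1234567
label = "SIR"

as_percent = f"{attack_rate:.1%}"    # percentage with 1 decimal place
with_commas = f"{population:,}"      # thousands separator
right_aligned = f"{label:>6}|"       # right-align in a field of width 6

print(as_percent)                    # 12.3%
print(with_commas)                   # 1,234,567
print(right_aligned)                 #    SIR|
```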


## Conditional statements

In [None]:
# block on conditional statements
x == 2, x < 1, x >= 1

(False, False, True)

In [None]:
# inline if statement (conditional expression)
print(f'z values are {z}') if x != 1/24 else print(f"z values are not hourly values")

z values are [12, 'a']
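
For more than two branches, the block form with `if`, `elif` and `else` is usually easier to read than an inline expression. The function `age_group` and its age bands below are hypothetical and chosen only to illustrate the syntax.

```python
def age_group(age):
    """Return a coarse age band for an age in years (bands are illustrative)."""
    if age < 18:
        return "child"
    elif age < 65:
        return "adult"
    else:
        return "older adult"


print(age_group(42))     # adult
```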


## Iterating

This section covers `for` and `while` loops in Python, including iterating over the key-value pairs of a dictionary.

In [None]:
for ii in range(5):
    print(f'Iteration #{ii}')

Iteration #0
Iteration #1
Iteration #2
Iteration #3
Iteration #4


In [None]:
ii = 0
num_steps = 5

while ii < num_steps:
    print(f'Iteration #{ii}')
    ii+=1

Iteration #0
Iteration #1
Iteration #2
Iteration #3
Iteration #4


In [None]:
for key,values in epi_case.items():
    print(f'Key: {key} | Value: {values}')

Key: name | Value: John Smith
Key: infected_date | Value: 2024-01-01
Key: recovery_date | Value: 2024-01-03
Key: age | Value: 42
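
Two further patterns come up often when looping over a sequence of epidemic states: `enumerate`, which yields the index alongside each item, and list comprehensions, which build a new list in one line. The sketch below uses a fresh list called `states` with the same SIR counts as earlier.

```python
states = [(9, 1, 0), (7, 2, 1), (5, 3, 2)]

# enumerate gives the time index; the tuple is unpacked into s, i, r
for t, (s, i, r) in enumerate(states):
    print(f"t={t}: S={s}, I={i}, R={r}")

# list comprehensions: number infected and total population at each time
infected = [i for (_, i, _) in states]
totals = [sum(state) for state in states]
print(infected, totals)          # [1, 2, 3] [10, 10, 10]
```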


### Exercise 5

Write a function that combines the values of two dictionaries with the same keys, so that each key maps to the pair of values from both dictionaries.

In [None]:
def combine_dictionary_values(dict1, dict2):
    # your code here
    return None

### Solution

In [None]:
def combine_dictionary_values(dict1, dict2):
    merged_case = {}

    for key in dict1.keys():
        merged_case[key] = (dict1[key], dict2[key])
    return merged_case

combine_dictionary_values(epi_case1, epi_case2)

{'name': ('John Smith', 'Jane Smith'),
 'infected_date': ('2024-01-01', '2024-01-02'),
 'recovery_date': ('2024-01-03', '2024-01-05'),
 'age': (42, 42)}
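
The solution above assumes that every key of `dict1` is also in `dict2`; otherwise it raises a `KeyError`. One possible variant, sketched here with a dictionary comprehension and the hypothetical name `combine_shared_keys`, keeps only the keys found in both dictionaries.

```python
def combine_shared_keys(dict1, dict2):
    """Pair up values for the keys that appear in both dictionaries."""
    return {key: (dict1[key], dict2[key]) for key in dict1 if key in dict2}


case_a = {'name': 'John Smith', 'age': 42, 'height': 175}
case_b = {'name': 'Jane Smith', 'age': 42}

shared = combine_shared_keys(case_a, case_b)
print(shared)    # {'name': ('John Smith', 'Jane Smith'), 'age': (42, 42)}
```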

# PEP 8 conventions

PEP 8 is the style guide for Python code. It provides conventions to help maintain readability and consistency across Python codebases. Here are a few highlights...

#### Indentation and max line lengths

- Python code is indented with 4 spaces per indentation level
- Each line should be at most 79 characters long for readability.


In [None]:
# Recall this line: it runs past the recommended line-length limit ---->
print(f'z values are {z}') if x != 1/24 else print(f"z values are not hourly values")

z values are [12, 'a']
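
One way to bring such a line back under the limit is to split the logic into an `if`/`else` block and print once at the end. This is a sketch of one option, not the only fix; `z_values` and `time_step` are stand-ins for `z` and `x`.

```python
z_values = [12, 'a']
time_step = 12

if time_step != 1/24:
    message = f'z values are {z_values}'
else:
    message = 'z values are not hourly values'

print(message)       # z values are [12, 'a']
```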


#### Use of blank lines

- Surround top-level function definitions with two blank lines.

In [None]:
def combine_dictionary_values(dict1, dict2):
    merged_case = {}

    for key in dict1.keys():
        merged_case[key] = (dict1[key], dict2[key])
    return merged_case


def my_other_function():
    return 42

#### Imports

Imports should usually be on separate lines and placed at the top of the file.

Group imports in the following order:
- Standard library imports.
- Related third-party imports.
- Local application/library-specific imports.

In [None]:
# standard library imports
from collections import namedtuple
from typing import NamedTuple

# third-party imports
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.optimize import minimize

# local imports
# import my_library as Alins_library


#### Naming conventions
- Function and variable names should be written in lowercase_with_underscores.
- Class names and user defined types should use CapitalizedWords (CamelCase).
- Constants should be written in UPPERCASE_WITH_UNDERSCORES.

In [None]:
EpiCase = namedtuple('EpiCase', ['name', 'infected', 'vaccinated'])

In [None]:
EpiCase('John Doe', True, True)

EpiCase(name='John Doe', infected=True, vaccinated=True)
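
The three naming conventions can be seen side by side in the sketch below, which uses the class-based `typing.NamedTuple` as an alternative to `namedtuple`. All names here (`MAX_AGE`, `CaseRecord`, `has_valid_age`) and the age limit are hypothetical.

```python
from typing import NamedTuple

MAX_AGE = 120                        # constant: UPPERCASE_WITH_UNDERSCORES


class CaseRecord(NamedTuple):        # class: CapWords
    name: str
    age: int
    vaccinated: bool


def has_valid_age(case_record):      # function and argument: lowercase_with_underscores
    return 0 <= case_record.age <= MAX_AGE


record = CaseRecord('John Doe', 42, True)
print(record)                        # CaseRecord(name='John Doe', age=42, vaccinated=True)
print(has_valid_age(record))         # True
```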

#### Comments
- Use comments to explain the why, not the what
- Block comments should generally be complete sentences.
- Inline comments should be sparing and separated by at least two spaces from the statement.

#### Docstrings
- Use docstrings to document all functions and methods.
- A docstring should describe the function’s effect as a command, e.g., "Return the square of n."

In [None]:
# combine two dictionaries that share the same keys
def combine_dictionary_values(dict1, dict2):
    """
    Combine the values of two dictionaries with the same keys.

    Args:
        dict1: first dictionary
        dict2: second dictionary

    Returns:
        Instance of dict() mapping each key to a tuple of both values
    """
    merged_case = {}        # empty dictionary

    for key in dict1.keys():
        # pair the values for each key in a tuple
        merged_case[key] = (dict1[key], dict2[key])
    return merged_case
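
A docstring is stored on the function itself, so it can be read back at any time. The sketch below uses a small illustrative function, `square`, with a one-line docstring.

```python
def square(n):
    """Return the square of n."""
    return n * n


print(square.__doc__)    # the raw docstring text
help(square)             # formatted help, including the signature
```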