# Python Bootcamp 2022

This is the main notebook with code for the [Python bootcamp](https://github.com/acoache/python-bootcamp-MFI). We use a Jupyter notebook because it lets us illustrate the code with text and visualizations.

## 1. Install Python

##### Windows
- Use [Anaconda: https://www.anaconda.com/products/distribution](https://www.anaconda.com/products/distribution)
- Install packages, IDE, etc. using Anaconda

##### macOS
- Use Anaconda, or install Homebrew via the command line: <code>/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"</code>
- If you use Homebrew, install Python (which comes with pip) with the following: <code>brew install python</code>
- Then use pip/pip3 to install packages and libraries
- For more information, see [The Hitchhiker's Guide to Python: https://docs.python-guide.org/starting/install3/osx/](https://docs.python-guide.org/starting/install3/osx/)

##### Linux
- If you use Linux, you probably already have everything set up.

##### Editors
- Any text editor for code, e.g. [Sublime Text](https://www.sublimetext.com/), [Atom](https://atom.io/), [Notepad++](https://notepad-plus-plus.org/), etc., or any IDE for Python, e.g. [Spyder](https://www.spyder-ide.org/), [Jupyter](https://jupyter.org/), etc.

## 2. Set up directories

The default directory is the first folder Jupyter loads into when it starts up. We can view its full path using the <code>os</code> (operating system) module.

In [None]:
import os
print(os.path.abspath(os.curdir)) # print the current directory
print(os.listdir()) # print all files and folders in the current directory

We can import (pre-installed) Python libraries, or import our own Python files and functions.

In [None]:
from utils import func1  # import the function func1 from our helpers file

# which functions are visible?

In [None]:
from utils import *  # import all functions from our helpers file

# which functions are visible?

In [None]:
import utils # import our helpers file as a package

# which functions are visible?
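The three import styles differ in which names end up in the notebook's namespace. Since <code>utils.py</code> is specific to the bootcamp repository, the sketch below uses the standard-library <code>math</code> module as a stand-in (an assumption made only so the example runs anywhere); the same reasoning applies to our own helpers file.

```python
import math            # style "import module": only the name "math" is added
from math import sqrt  # style "from module import name": only "sqrt" is added

print(sqrt(16))          # sqrt is directly visible
print(math.floor(2.7))   # floor is reachable only through the "math." prefix

# names that "from math import *" would bring in: math defines no __all__,
# so a star-import takes every name that does not start with an underscore
public_names = [name for name in dir(math) if not name.startswith("_")]
print(len(public_names), public_names[:5])
```

Star-imports are convenient in a notebook, but they can silently overwrite existing names, so the prefixed style is usually easier to trace.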

## 3. Basic data types
Here is an overview of the most used data types in Python, both for elements and sequences. For more information, see [built-in types documentation](https://docs.python.org/3/library/stdtypes.html).

### 3.1 Built-in types

In [None]:
a = 5 # int
b = 5.0 # float
c = 3e-7 # float (scientific notation)
d = "5.0" # string
e = 1 + 1j * 5 # complex number
f = 2 # int

print(type(a), type(b), type(c), type(d), type(e))

In [None]:
# adding or multiplying two ints gives an int; mixing an int with a float gives a float; adding two strings concatenates them
print(type(a + a))
print(type(a + b))
print(type(a * b))
print(type(d + d))

In [None]:
print(a / f) # division
print(a // f) # floor division
print(a % f) # modulo
print(a ** f) # power 

In [None]:
# verify data types of variables
print(isinstance(a, int))
print(isinstance(a, float))

In [None]:
bool1 = True; bool2 = False # boolean

print(type(bool1))
print(bool1 and bool2) # True only if both values are True
print(bool1 or bool2) # True if at least one of the values is True
print(not bool1) # opposite of value

In [None]:
print(5 > 5) # strictly greater
print(5 >= 5) # greater or equal
print(5 < 5) # strictly less
print(5 <= 5) # less or equal
print(5 == 5) # equal
print(5 != 5) # not equal

### 3.2 Data structures

In [None]:
list1 = [1, 2, 3, 4]; list2 = [5, 6, 7, 8] # list

print(type(list1))
print(list1)
print(list1 + list2) # concatenate lists
print(len(list1 + list2)) # size of a list

In [None]:
print(list1[0]) # first element
print(list1[-2]) # second to last element

In [None]:
# slicing with a:b returns the elements from index a (included) up to index b (excluded)
print(list1[1:2])
print(list1[:2])
print(list1[2:])
print(list1[2:-1])
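As a small complement to the slices above, the following sketch shows two properties that are easy to miss: a slice accepts an optional step (<code>a:b:step</code>), and slicing a list returns a new list rather than a view of the original.

```python
list1 = [1, 2, 3, 4]

print(list1[1:3])    # [2, 3] -- index 1 included, index 3 excluded
print(list1[::2])    # [1, 3] -- every second element
print(list1[::-1])   # [4, 3, 2, 1] -- negative step reverses the list

sub = list1[:2]      # slicing a list creates a new list
sub[0] = 99          # modifying the slice...
print(sub)           # [99, 2]
print(list1)         # [1, 2, 3, 4] -- ...leaves the original untouched
```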

In [None]:
# verify presence of an element in a list
print(2 in list1)
print(25 in list1)

In [None]:
# dictionary
dict1 = {
    "a" : 1,
    "b" : 2,
}

print(type(dict1))

In [None]:
dict1["a"] = 4 # change the value associated with the key "a" to 4
dict1["d"] = 5 # the key "d" does not exist yet, so this adds a new key-value pair
print(dict1)

In [None]:
print(list(dict1.keys())) # all keys as a list
print(list(dict1.values())) # all values as a list
print(dict1.items()) # all key-value pairs
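Accessing a key that does not exist with square brackets raises a <code>KeyError</code>. The sketch below shows a few standard ways to test for a key or to fall back on a default; the dictionary is re-created here so the cell is self-contained.

```python
dict1 = {"a": 4, "b": 2, "d": 5}

print("a" in dict1)        # True  -- membership is tested on the keys
print("z" in dict1)        # False
print(dict1.get("z"))      # None  -- get() returns None instead of raising
print(dict1.get("z", 0))   # 0     -- or a default value of our choice

removed = dict1.pop("d")   # remove a key and return its value
print(removed, dict1)      # 5 {'a': 4, 'b': 2}
```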

## 4. Loops and conditionals
We show some basic loop and conditional statements in Python. For more information, see the [compound statements documentation](https://docs.python.org/3/reference/compound_stmts.html#).

In [None]:
# if/elif/else statement (test multiple conditions)
criterion1 = True
criterion2 = False
if criterion1:
    print("Criterion 1 is true.\n")
elif criterion2:
    print("Criterion 2 is true.\n")
else:
    print("Criteria 1 and 2 are false.\n")

In [None]:
# for loop (stop after a number of iterations)
for idx in range(5, 25, 5):
    print(idx)

In [None]:
for key, val in dict1.items():
    print("Key: " + str(key) + " Value: " + str(val))

In [None]:
for idx_value, value in enumerate(['My', 'name', 'is', 'Anthony', 'Coache']):
    print("Value: " + value + " with index " + str(idx_value))

In [None]:
number_en = ['one', 'two', 'three', 'four']
number_fr = ['un', 'deux', 'trois', 'quatre']
number = [1, 2, 3, 4]

for num_en, num_fr, num in zip(number_en, number_fr, number):
    print(f"{num}: '{num_en}' in English, '{num_fr}' in French")

In [None]:
# while loop (repeat as long as a condition holds)
import numpy as np
total = 0
while total < 3:
    total = total + np.random.rand() # add a Uniform(0,1) draw to the running total
    print("The running total is now " + str(np.round(total,3)))

## 5. NumPy library

NumPy is a fundamental package for scientific computing in Python. It provides
- a multidimensional array object;
- various derived objects, such as matrices;
- an assortment of routines for fast operations on arrays;
- basic statistical operations;
- random simulation engines;
- and much more.

For more information on this library, see the [NumPy documentation](https://numpy.org/).

In [None]:
import numpy as np # we refer to numpy with the prefix "np."

### 5.1 Array manipulation

In [None]:
a1 = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]]) # create an array manually
print(type(a1), a1.shape)
print(a1)

In [None]:
a2 = np.ones((2,2,2,2)) # n-dimensional array of ones
a3 = np.zeros((2,2,2,2)) # n-dimensional array of zeros
print(a2.shape)
print(a2)

In [None]:
# slicing with a:b returns the elements from index a (included) up to index b (excluded), as with lists
print(a1[0,:])
print(a1[:,0])
print(a1[:2,:])

In [None]:
# reshape an array (row by row) to a compatible size
print(a1) # (3,4)
print(a1.reshape(-1,1)) # (12,1)
print(a1.reshape(1,-1)) # (1,12)
print(a1.reshape(3,4)) # (3,4)
print(a1.reshape(2,6)) # (2,6)

In [None]:
b1 = np.array([[1, 2], [3, 4]])
b2 = np.array([[5, 6], [7, 8]])

b3 = np.concatenate((b1, b2), axis=0) # concatenate on an existing dimension
print(b3) # (4,2)
b3 = np.concatenate((b1, b2), axis=1)
print(b3) # (2,4)

In [None]:
b3 = np.stack((b1, b2), axis=1) # stack based on a new dimension
print(b3) # (2,2,2) -- b3[:,0,:] == b1, b3[:,1,:] == b2

In [None]:
print(a1) # all values of a1
print(a1 > 5) # which values satisfy the condition
print(a1[a1 > 5]) # all values satisfying the condition

### 5.2 NumPy operations

In [None]:
a1 = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
a2 = np.array([[-1,5,9,2],[1,3,7,4],[9,8,12,11]])

print(a1 + a2) # element-wise addition
print(a1 - a2) # element-wise subtraction
print(a1 ** 2) # element-wise power
print(a1 / a2) # element-wise division
print(np.exp(a1)) # element-wise exponential
print(np.log(a1)) # element-wise logarithm
print(np.sqrt(a1)) # element-wise square root -- equivalent to a1 ** 0.5

In [None]:
print(a1 * a2) # element-wise multiplication
print(np.matmul(a1,a2.transpose())) # matrix multiplication
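The transpose in the matrix product above is needed because the inner dimensions must agree: a (3,4) array can multiply a (4,3) array, but not another (3,4) array. The sketch below checks the shapes and uses the <code>@</code> operator, which is equivalent to <code>np.matmul</code> for these 2D arrays.

```python
import numpy as np

a1 = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
a2 = np.array([[-1,5,9,2],[1,3,7,4],[9,8,12,11]])

print(a1.shape, a2.T.shape)  # (3, 4) (4, 3) -- inner dimensions match
prod = a1 @ a2.T             # same result as np.matmul(a1, a2.transpose())
print(prod.shape)            # (3, 3)

# entry (0,0) is the dot product of the first row of a1 and the first row of a2
print(prod[0, 0], np.sum(a1[0] * a2[0]))  # 44 44

try:
    a1 @ a2                  # (3,4) times (3,4): inner dimensions differ
except ValueError as err:
    print("Shape mismatch:", err)
```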

### 5.3 Random numbers

For more information, see the [random sampling documentation from NumPy](https://numpy.org/doc/1.16/reference/routines.random.html).

In [None]:
import numpy.random as random

random.seed(1234) # fix the random number generator, for testing purposes
print(random.rand(2,4)) # (2,4) array of Uniform(0,1)
print(random.normal(3, 0.5, size=(2,4))) # (2,4) array of Normal with mean 3 and standard deviation 0.5
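More recent NumPy versions (1.17 and later) also provide a <code>Generator</code> interface, created with <code>np.random.default_rng</code>, which keeps the random state in an object instead of a global seed. The sketch below is an alternative to the cell above, not a replacement for it; note that the two interfaces produce different numbers for the same seed.

```python
import numpy as np

rng = np.random.default_rng(1234)  # generator with its own seeded state
u = rng.random((2, 4))             # (2,4) array of Uniform(0,1)
z = rng.normal(loc=3, scale=0.5, size=(2, 4))  # mean 3, standard deviation 0.5
print(u)
print(z)

# a second generator with the same seed reproduces the same draws
rng_copy = np.random.default_rng(1234)
print(np.array_equal(u, rng_copy.random((2, 4))))  # True
```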

## 6. pickle library

The pickle module implements binary protocols for serializing and de-serializing a Python object structure. The user can then store them locally and load them in another session.

For more information on this library, see the [pickle documentation](https://docs.python.org/3/library/pickle.html#module-pickle).

In [None]:
import pickle

a = {'hello': 'world'}
filename = "filename"

with open(filename + ".pickle", 'wb') as handle: # writing mode
    pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)

In [None]:
with open(filename + ".pickle", 'rb') as handle: # reading mode
    b = pickle.load(handle)
    
b
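To see what pickling does without touching the disk, the sketch below serializes an object to a <code>bytes</code> string with <code>pickle.dumps</code> and rebuilds it with <code>pickle.loads</code>. The dictionary used here is only illustrative.

```python
import pickle

a = {'hello': 'world', 'values': [1, 2, 3]}

payload = pickle.dumps(a, protocol=pickle.HIGHEST_PROTOCOL)  # object -> bytes
print(type(payload))   # <class 'bytes'>

b = pickle.loads(payload)  # bytes -> new object
print(b == a)   # True  -- same content
print(b is a)   # False -- but a distinct object in memory
```

As the pickle documentation warns, unpickling can execute arbitrary code, so only load pickle files that come from a source you trust.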

## 7. pandas library

pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool. It provides
- an efficient DataFrame object;
- functions to read and write data;
- optimized performance for reshaping, slicing, grouping, merging, etc.;
- and much more.

For more information on this library, see the [pandas documentation](https://pandas.pydata.org/). We use random numbers to generate our dataframe, but we could instead read a .csv/.xlsx file, see <code> pandas.read_csv/pandas.read_excel("filename", ...) </code>.

In [None]:
import pandas as pd

df1 = pd.DataFrame(np.random.randn(8, 4), columns=list('ABCD')) # create a dataframe
df1.head() # show the first 5 rows

In [None]:
df1.sort_index(axis=1, ascending=False).head() # sort column names by descending order

In [None]:
df1.sort_index(axis=0, ascending=False).head() # sort index names by descending order

In [None]:
df1.sort_values(by='B').head() # sort by values of column "B"

In [None]:
df1.loc[1:4, ['A','C']] # label-based selection; unlike NumPy slicing, 1:4 includes both endpoints

In [None]:
df1[df1['A'] > 0] # indexing with booleans, similar to NumPy

In [None]:
df1['E'] = ['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight'] # add a new column
df1.loc[9] = [1,2,1,2,'nine'] # add a new observation (row) with index label 9
df1

In [None]:
df2 = pd.DataFrame(np.random.randn(8,4), columns=list('ABCD')) # create another dataframe
df3 = pd.concat([df1, df2]) # concatenate both dataframes (reset index names with ignore_index=True)
df3

In [None]:
df3.loc[0] # selection by index label (returns every row labelled 0, one from each dataframe)

In [None]:
df3.iloc[0] # selection by integer position (returns the first row only)
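The difference between <code>loc</code> and <code>iloc</code> becomes visible once index labels are duplicated, which is what happens after concatenating without <code>ignore_index=True</code>. The two small dataframes below are hypothetical and only serve to make the output easy to read.

```python
import pandas as pd

left = pd.DataFrame({"A": [1, 2]})     # index labels 0, 1
right = pd.DataFrame({"A": [10, 20]})  # index labels 0, 1

both = pd.concat([left, right])
print(both.index.tolist())  # [0, 1, 0, 1] -- labels are duplicated
print(both.loc[0])          # label 0 matches two rows -> DataFrame with 2 rows
print(both.iloc[0])         # position 0 is a single row -> Series

reset = pd.concat([left, right], ignore_index=True)
print(reset.index.tolist())  # [0, 1, 2, 3] -- labels are unique again
```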

In [None]:
# import the file "grades.xlsx" (see https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html?highlight=read_excel#pandas.read_excel)

# compute the average of each course (see https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.GroupBy.mean.html)
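One possible way to approach this exercise is sketched below. The layout of <code>grades.xlsx</code> is not shown in this notebook, so the dataframe and its column names (<code>student</code>, <code>course</code>, <code>grade</code>) are assumptions made up for illustration; with the real file, the first step would be something like <code>pd.read_excel("grades.xlsx")</code>.

```python
import pandas as pd

# hypothetical stand-in for the content of grades.xlsx (column names are assumed)
grades = pd.DataFrame({
    "student": ["s1", "s2", "s3", "s1", "s2", "s3"],
    "course":  ["math", "math", "math", "stats", "stats", "stats"],
    "grade":   [80, 90, 70, 60, 75, 90],
})

# group the rows by course, then average the grades within each group
course_means = grades.groupby("course")["grade"].mean()
print(course_means)
```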


## 8. Matplotlib library
For more information on the Matplotlib library, see the [Matplotlib documentation](https://matplotlib.org/). Other options exist for producing plots, such as [seaborn](https://seaborn.pydata.org/) and [Plotly](https://plotly.com/python/).

In [None]:
import matplotlib.pyplot as plt

xs = np.linspace(0, 10, 20, endpoint = False)
y1s = 2 * xs
y2s = 3 * xs
x_grid, y_grid = np.meshgrid(xs, xs)

In [None]:
# plot y1s vs xs AND y2s vs xs on the same plot

In [None]:
# create subplots with for loops

In [None]:
# histogram and contours on different plots
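The three plotting cells above are left as exercises. A minimal sketch of one way to fill them in is given below; it is only a suggestion, and the histogram samples and the contour surface are hypothetical data chosen for illustration. In a notebook the figures are displayed when the cell finishes running.

```python
import numpy as np
import matplotlib.pyplot as plt

xs = np.linspace(0, 10, 20, endpoint=False)
y1s, y2s = 2 * xs, 3 * xs
x_grid, y_grid = np.meshgrid(xs, xs)

# two curves on the same axes
fig1, ax1 = plt.subplots()
ax1.plot(xs, y1s, label="y = 2x")
ax1.plot(xs, y2s, linestyle="--", label="y = 3x")
ax1.set_xlabel("x")
ax1.set_ylabel("y")
ax1.legend()

# one subplot per curve, filled in with a for loop
fig2, axes2 = plt.subplots(1, 2, figsize=(8, 3))
for ax, ys, title in zip(axes2, [y1s, y2s], ["y = 2x", "y = 3x"]):
    ax.plot(xs, ys)
    ax.set_title(title)
fig2.tight_layout()

# histogram and filled contours on different plots
fig3, (ax_hist, ax_cont) = plt.subplots(1, 2, figsize=(8, 3))
samples = np.random.default_rng(0).normal(size=500)  # hypothetical data
ax_hist.hist(samples, bins=30)
surface = x_grid * y_grid                            # hypothetical surface
contours = ax_cont.contourf(x_grid, y_grid, surface, levels=15)
fig3.colorbar(contours, ax=ax_cont)
fig3.tight_layout()
```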

## 9. Create functions

Suppose we want to simulate trajectories from a geometric Brownian motion (GBM) satisfying the SDE
$$
d S_t = \mu S_t d t + \sigma S_t d W_t,
$$
where $W_t$ is a Brownian motion. The solution is
$$
\frac{S_t}{S_0} = \exp\Bigg\{ \bigg(\mu - \frac{\sigma^2}{2} \bigg) t + \sigma W_t \Bigg\}.
$$

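Assuming we discretize the process with time step $\Delta t$, and since the Brownian increment $W_{t+\Delta t} - W_t$ is distributed as $N(0, \Delta t)$, we obtain
$$
\frac{S_{t+\Delta t}}{S_{t}} = \exp\Bigg\{ \bigg(\mu - \frac{\sigma^2}{2} \bigg) \Delta t + \sigma \sqrt{\Delta t} \, Z \Bigg\}, \qquad Z \sim N(0,1).
$$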
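The bootcamp's own <code>GBM.py</code> is not shown in this notebook, so the following is only a sketch of what <code>simulate_gbm</code> might look like. The argument order, the default of 10 simulations and the error on invalid input are inferred from how the function is called below; the <code>seed</code> argument is an addition of this sketch. Each Brownian increment over a step $\Delta t$ has standard deviation $\sqrt{\Delta t}$, so the log-increments are $(\mu - \sigma^2/2)\Delta t + \sigma \sqrt{\Delta t}\, Z$.

```python
import numpy as np

def simulate_gbm(initial_price, mu, vol, terminal, Ndt, Nsims=10, seed=None):
    """Simulate GBM paths on a regular grid (sketch, not the bootcamp's GBM.py).

    Returns the time grid of shape (Ndt+1,) and prices of shape (Ndt+1, Nsims).
    """
    if Ndt <= 0 or Nsims <= 0:
        raise ValueError("Ndt and Nsims must be positive integers.")

    rng = np.random.default_rng(seed)  # seed is an assumption of this sketch
    dt = terminal / Ndt
    t = np.linspace(0, terminal, Ndt + 1)

    # log-increments: (mu - vol^2/2) dt + vol sqrt(dt) Z, with Z ~ N(0,1)
    Z = rng.standard_normal((Ndt, Nsims))
    log_increments = (mu - 0.5 * vol**2) * dt + vol * np.sqrt(dt) * Z

    # cumulate the increments, starting every path at log(S_t / S_0) = 0
    log_paths = np.concatenate(
        (np.zeros((1, Nsims)), np.cumsum(log_increments, axis=0)), axis=0
    )
    return t, initial_price * np.exp(log_paths)

t_demo, price_demo = simulate_gbm(10, 0.1, 0.2, 1, 2**10, Nsims=500, seed=0)
print(t_demo.shape, price_demo.shape)  # (1025,) (1025, 500)
```

Working with cumulative sums of log-increments avoids an explicit loop over time steps and keeps every simulated price strictly positive.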

In [None]:
# create a separate function called simulate_gbm in the file GBM.py

from GBM import simulate_gbm

In [None]:
# initialize different parameters
initial_price = 10
mu = 0.1
vol = 0.2
terminal = 1
Ndt = 2**10
Nsims = 500

# use the function with the right syntax
t1, price1 = simulate_gbm(initial_price, mu, vol, terminal, Ndt, Nsims) # 500 sims
t2, price2 = simulate_gbm(initial_price, mu, vol, terminal, Ndt) # 10 sims
# t3, price3 = simulate_gbm(initial_price, mu, vol, terminal, Ndt, -50); # error

# plot the GBM paths
fig, axes = plt.subplots()

plt.show()

## 10. SciPy library
SciPy provides many functions for scientific computing. For more information on this library, see the [SciPy documentation](https://scipy.org/).

One module of interest for optimization is scipy.optimize. Suppose you want to minimize the following function (i.e. the 2D Rosenbrock function):
$$
f(x,y) = 100 (y - x^2)^2 + (1 - x)^2
$$
subject to:
$$
\begin{aligned}
1 - x - y &\geq 0,\\
x, y &\geq 0.
\end{aligned}
$$

In [None]:
from scipy.optimize import minimize # import the minimize function

def rosen(x): # define the function
    return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)

# define inequalities
ineq_cons = {'type': 'ineq',
             'fun' : lambda x: np.array([x[0],
                                         x[1],
                                         1 - x[0] - x[1]]),}

res = minimize(rosen, # function to minimize
               np.array([0.5, 0]), # starting values
               method='SLSQP', # optimization method
               constraints=[ineq_cons], # constraints
               options={'ftol': 1e-9, 'disp': True})
print(res.x)

In [None]:
def rosen2D(X, Y):
    vals = []
    for i in range(len(X)):
        vals.append(rosen(np.array([X[i], Y[i]])))
    return np.array(vals)
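The row-by-row loop in <code>rosen2D</code> can also be written as a single vectorized expression, since NumPy applies arithmetic element-wise to whole arrays. The function name <code>rosen2D_vectorized</code> and the small demo grid below are introduced only for this sketch.

```python
import numpy as np

def rosen2D_vectorized(X, Y):
    # f(x, y) = 100 (y - x^2)^2 + (1 - x)^2, applied element-wise
    return 100.0 * (Y - X**2)**2 + (1 - X)**2

x_demo = np.linspace(0, 1, 50)
X_demo, Y_demo = np.meshgrid(x_demo, x_demo)
Z_demo = rosen2D_vectorized(X_demo, Y_demo)

print(Z_demo.shape)                   # (50, 50)
print(rosen2D_vectorized(1.0, 1.0))   # 0.0 -- unconstrained minimum at (1, 1)
print(rosen2D_vectorized(0.0, 0.0))   # 1.0
```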

In [None]:
x = np.linspace(0, 1, 50)
y = np.linspace(0, 1, 50)
X, Y = np.meshgrid(x, y)
Z = rosen2D(X, Y)

fig, axes = plt.subplots()
temp = axes.contourf(X, Y, Z, cmap='RdGy', levels=30)
plt.colorbar(temp)
axes.scatter(res.x[0], res.x[1], marker='*', color='gold', s=100)

## 11. Custom classes

A class allows you to create a blueprint for any object with its own properties and variables.
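The file <code>classes.py</code> is not shown in this notebook, so the sketch below is only one possible version of the two classes. The constructor arguments and the method names <code>add_employee</code> and <code>show_employees</code> follow the calls made in the cells of this section; the attribute names, and the reading of the second <code>Company</code> argument as a founding year, are assumptions.

```python
class Employee:
    def __init__(self, name, start_date, salary):
        # attribute names are assumptions of this sketch
        self.name = name
        self.start_date = start_date
        self.salary = salary


class Company:
    def __init__(self, name, year_founded):
        self.name = name
        self.year_founded = year_founded  # assumed meaning of the second argument
        self.employees = []               # each company keeps its own list

    def add_employee(self, employee):
        self.employees.append(employee)

    def show_employees(self):
        print(f"{self.name} (founded {self.year_founded}):")
        for employee in self.employees:
            print(f"  - {employee.name}, since {employee.start_date}")


demo_company = Company("Example Inc.", 2000)  # hypothetical company
demo_company.add_employee(Employee("Jane Doe", "2000-07-01", 90000))
demo_company.show_employees()
```

Creating the list inside <code>__init__</code> matters: every <code>Company</code> instance then owns a separate list, so adding an employee to one company does not affect another.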

In [None]:
# create two separate classes called Company and Employee in the file classes.py

from classes import Company, Employee

In [None]:
company1 = Company("Deloitte", 1845)
company2 = Company("Manulife", 1887)

worker1 = Employee("Anthony Coache", "2019-01-01", 75000)
worker2 = Employee("Jane Doe", "2000-07-01", 90000)
worker3 = Employee("SpongeBob SquarePants", "1999-05-01", 100000)

In [None]:
company1.add_employee(worker1)
company1.add_employee(worker2)
company2.add_employee(worker3)

company1.show_employees()
company2.show_employees()

## 12. Debugging

The Python debugger allows you to put breakpoints in your code (e.g. functions, longer scripts), and inspect variables in an interactive way. Here are some commands, with their shortcuts, from the pdb library:

- <code> continue </code> or <code> c </code>: continue execution until the next breakpoint is encountered.
- <code> where </code> or <code> w </code>: print the stack trace, which shows the context of the line currently being executed.
- <code> args </code> or <code> a </code>: print the argument list of the current function.
- <code> step </code> or <code> s </code>: execute the current line and stop at the first possible occasion (either in a function that is called or on the next line).
- <code> next </code> or <code> n </code>: continue execution until the next line in the current function is reached or it returns.

For more information, see the [pdb documentation](https://docs.python.org/3/library/pdb.html).

In [None]:
import pdb

def func(a, b):
    counter = 0
    pdb.set_trace() # breakpoint
    for i in range(b):
        counter += a
    return counter
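Since Python 3.7, the built-in <code>breakpoint()</code> offers the same behaviour as <code>pdb.set_trace()</code> without an import. With the default hook, it can be switched off through the <code>PYTHONBREAKPOINT</code> environment variable, which the sketch below sets to <code>0</code> so that the cell runs without stopping. The function name <code>func_bp</code> is introduced only for this example.

```python
import os

# assumption of this sketch: disable the debugger so the cell runs non-interactively
os.environ["PYTHONBREAKPOINT"] = "0"

def func_bp(a, b):
    counter = 0
    breakpoint()  # would open pdb here if PYTHONBREAKPOINT were not set to "0"
    for i in range(b):
        counter += a
    return counter

print(func_bp(2, 3))  # 6
```

To actually enter the debugger, remove the environment variable (e.g. <code>del os.environ["PYTHONBREAKPOINT"]</code>) and call the function again; the commands listed above then become available at the prompt.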