# Introduction to Python #

by Christoph Knochenhauer and Alexander Schütt

# 1. Motivation #

* Most popular programming language (according to https://pypl.github.io/PYPL.html)

* Beginner-friendly (easy syntax)

* Easily extendable (by importing packages)

* Open source (true of most packages as well)

* Top preferred language for data science and research

* Extremely useful for financial analysis, as it allows huge amounts of data to be imported and analysed quickly

# 2. Python Basics #

### Print statement

Python can output information using the print function

In [None]:
print("Hello world")   #with Shift+Enter code-cells can be executed

In [None]:
print('Hello','everyone')  #using commas, one print statement can output multiple values

print("Hello"+"you")   #using the "+" symbol, strings are concatenated

"Hi"   #the value of the last expression in a code cell is displayed automatically

### Integers and floats
Use + and - for addition and subtraction, and * and / for multiplication and division

In [None]:
5+5, 2-4, 1*1, 1*1.0, 1+2.5, 3*1.5, 3/10

Strings can be multiplied with integers

In [None]:
3*"abc", 0*"aa"

Other useful basic math operations:

In [None]:
print(5//2) #floor division

print(19%7) #modulo operator

print(3**2) #powers

print(9 ** 0.5) #square root

In [None]:
max(1,0,5),min(-3,4)   #maximum and minimum functions
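
Floor division and modulo belong together: for integers a and b with b not zero, a equals (a // b) * b + a % b. The following is a small sketch of that relation; the values 19 and 7 are example values chosen for illustration, and divmod is a built-in that returns both results at once.

```python
a, b = 19, 7   # example values, chosen for illustration

quotient = a // b     # floor division
remainder = a % b     # modulo
print(quotient, remainder)
print(divmod(a, b))                    # built-in, returns (a // b, a % b)
print(quotient * b + remainder == a)   # the identity holds

# with a negative operand, // rounds towards minus infinity
negative_result = (-19 // 7, -19 % 7)
print(negative_result)   # (-3, 2)
```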

### Booleans
Booleans are a data type with the values True and False and the operations not, and, or

In [None]:
True, not False, True and True, True and False, True or False, False or False, True*False, True+True

In [None]:
bool(0), bool(1), bool(2), bool(3)

Comparisons can be made using the following operators

In [None]:
1 == 1, 1 == 1.0, 1 != 2   # == is used to check if 2 things are equal, != to check if something isn't equal

In [None]:
5 <= 2, 3 > 2, 4 < 4, 1 <= 1    # < and > test for smaller and greater, and can be combined with an "=" sign

### Type checking
With the built-in function type, the type of an object can be checked

In [None]:
type(1), type(1.0), type(3/2), type(not 1 == 2)

## Variables

In [None]:
x = 4   #declaration of a variable

print(x)  #variables are printable

x = x + 1  #and can be changed

print(x)

x += 1  #another way to add 1; the "+" symbol can be replaced by any other basic math operator

print(x)

x = x * "a"   #type changes of variables are possible

print(x)

Type-checking of a variable can be done in the following way:

In [None]:
type(x), type(x) == int, type(x) == float, type(x) == str, type(x) == bool

## Lists
Lists are very important, as multiple items can be stored in a single variable

In [None]:
list_example = [13, "abcde", True]   #lists are of arbitrary finite length. Any item can be stored within a list
list_example

In [None]:
list_example = list_example + [1,2]   #lists can be added to lists
list_example

In [None]:
list_example = 3 * [1,2]   #lists can be multiplied by integers
list_example

#### Length of lists

In [None]:
len(list_example)

#### Indexing in Python

In [None]:
list_example = [13, "abcde", True]
print(list_example)

print(list_example[0])   #lists begin with index 0

print(list_example[-1])   #-1 represents the last index

print(list_example[1],list_example[-2])

In [None]:
print(list_example[1:])  #a slice start:stop includes the start index and excludes the stop index; an omitted value means the beginning/end
print(list_example[1:2])
print(list_example[:-1])

The same holds for strings

In [None]:
abc = "abc"
abc[0], abc[1:], abc[:-1], abc[-1:]
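
Slices also accept an optional third value, the step size. A minimal sketch, using an example string chosen here for illustration; the same syntax works for lists.

```python
letters = "abcdefg"   # example string, chosen for illustration

first_three = letters[0:3]         # start index included, stop index excluded
every_second = letters[::2]        # the third value is the step size
reversed_letters = letters[::-1]   # a negative step walks backwards

print(first_three, every_second, reversed_letters)
```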

#### Change items by index

In [None]:
print(list_example)

list_example[0] = 31   #list elements can be changed using the index

list_example

#### Appending items to a list
Items can be appended to the end of a list using append

In [None]:
print(list_example)

list_example.append(42)

print(list_example)

#### Deleting items in a list
Items can also be removed by value or popped by index

In [None]:
print(list_example)

list_example.remove(42)   #remove deletes the first item of the list that equals its argument

print(list_example)

first_item = list_example.pop(0)    #pop removes and returns the item at the given index

print(first_item, list_example)

#### List sorting
A list can be sorted in place using the sort method

In [None]:
list_example = [12,356,0,-1,99]
list_example.sort()   #sort command
list_example
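
The sort method changes the list itself. When the original order should be kept, the built-in function sorted returns a new list instead. A short sketch; the variable names are chosen here for illustration.

```python
numbers = [12, 356, 0, -1, 99]

ascending = sorted(numbers)                      # returns a new sorted list
descending = sorted(numbers, reverse=True)       # largest value first
by_distance_to_zero = sorted(numbers, key=abs)   # key function decides the order

print(numbers)   # the original list is unchanged
print(ascending, descending, by_distance_to_zero)
```

The keyword arguments reverse and key are accepted by list.sort() as well.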

### Tuples
Tuples are similar to lists, but immutable. They are used in cases where the elements shouldn't be changed

In [None]:
example_tuple = (1,2,3,4)   #Tuples are defined using ()
example_tuple[2:]    #indexing works in the same way as for lists

#### It is not possible to change, append, remove or pop items in a tuple!
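
Trying to assign to a tuple element raises a TypeError. The sketch below catches that error and shows one possible workaround, building a changed copy via a list; the variable names are chosen for illustration.

```python
example_tuple = (1, 2, 3, 4)

try:
    example_tuple[0] = 10   # item assignment is not supported for tuples
except TypeError as error:
    print("TypeError:", error)

# a changed copy can be built by converting to a list and back
as_list = list(example_tuple)
as_list[0] = 10
changed_tuple = tuple(as_list)

print(example_tuple, changed_tuple)
```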


### Dictionaries
Dictionaries store key-value pairs; values are accessed by key instead of by position

In [None]:
python_dict = {            #definition of a dictionary, using curly braces { }
    "name": "Python",      
    "year_created": 1989,
    "founder": "Guido van Rossum",
    "current_version": "3.12"
}

In [None]:
python_dict["name"], python_dict["current_version"]

In [None]:
print(python_dict.keys())   #how to get all keys of a dictionary
print(python_dict.values())  #collection of all values
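
Dictionaries can be changed after creation. The following sketch shows the most common operations on a small dictionary with hypothetical example data (the name prices and its entries are made up for illustration).

```python
prices = {"apple": 0.5, "bread": 2.0}   # hypothetical example data

prices["milk"] = 1.2     # a new key is added by assignment
prices["apple"] = 0.6    # an existing key is overwritten

print("milk" in prices)            # membership test on the keys
print(prices.get("cheese", 0.0))   # get returns a default instead of raising a KeyError

for item, price in prices.items():   # iterate over key-value pairs
    print(item, price)

del prices["bread"]   # remove a key together with its value
print(prices)
```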

## Functions
Functions are used to reduce code duplication, organise code, and abstract away details

They are defined using def function_name, and the return statement is used to return values

In [None]:
def subtraction(a,b):   #here we define a function that takes a and b as inputs
    # here we could do some calculations
    
    return a-b  #the function returns the value a-b    

In [None]:
value = subtraction(2,3)  #returning the value (instead of printing it) allows it to be assigned to a variable
value

### Optional inputs

In [None]:
def f(x, C = 0):
    return x**3 + C

In [None]:
f(2)

In [None]:
f(2, C = -3)
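
Arguments can be passed by position or by name. The sketch below repeats the definition of f so that it runs on its own and shows three equivalent calls.

```python
def f(x, C=0):
    return x**3 + C

positional = f(2, -3)         # arguments matched by position
by_keyword = f(x=2, C=-3)     # arguments matched by name
reordered = f(C=-3, x=2)      # keyword arguments may be given in any order

print(positional, by_keyword, reordered)
```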

### Global variables
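Variables that have been defined outside of functions can be read inside a function without being passed as an input. Note that assigning to such a name inside the function creates a new local variable and leaves the global one unchanged, unless the name is declared with the global keyword!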

In [None]:
b = 3
def g(a):
    return a + b
g(2)
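
The difference between reading a global variable, assigning to a local variable of the same name, and rebinding the global can be sketched as follows. The function names are hypothetical and chosen for illustration.

```python
counter = 0   # global variable

def read_counter():
    return counter + 1    # reading a global works without extra syntax

def shadow_counter():
    counter = 100         # creates a new local variable, the global is untouched
    return counter

def increment_counter():
    global counter        # the global keyword allows rebinding the global name
    counter += 1

shadow_counter()
print(counter)   # still 0
increment_counter()
print(counter)   # now 1
```

Passing values in as arguments and returning results is usually easier to follow than changing global variables.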

### If statements
The command if checks a condition, which is evaluated as a boolean value. It can be combined with elif and else

In [None]:
value = 3

if value > 5:
    print("The value is greater than 5")
elif value > 2:
    print("The value is greater than 2 and less than or equal to 5")
else:
    print("The value is not bigger than 2")
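
The condition is evaluated for its truth value, so objects such as lists or numbers can be used directly: empty containers and zero count as False. A minimal sketch with example variables chosen for illustration:

```python
items = []   # example list, empty on purpose

if items:    # an empty list counts as False
    message = "the list has items"
else:
    message = "the list is empty"
print(message)

number = 7
if number % 2:   # a non-zero remainder counts as True
    parity = "odd"
else:
    parity = "even"
print(parity)
```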

#### Recursive functions

In [None]:
def floor(n):    #function rounds a non-negative number down to the nearest integer
    if n < 1: 
        return 0
    return 1+floor(n-1)   #note that no else is needed, as the function has already returned if the condition was true

floor(6.8)

In [None]:
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1: 
        return 1
    return fibonacci(n-1)+fibonacci(n-2)
fibonacci(9)

## For-statements

In [None]:
for i in range(5):
    print(i)

In [None]:
for i in range(1,3):
    print(i)

In [None]:
for i in range(0,10,3):
    print(i)

In [None]:
for i in range(5,0,-1):
    print(i)

#### One can use for to go through each item in a list

In [None]:
example_list = [3,1,"test",{"name": "dictionary"}]
for list_item in example_list:
    print(list_item)

Or alternatively using the index:

In [None]:
for i in range(len(example_list)):
    print(example_list[i])

Assigning to the loop variable does not change the list. To change the values of a list inside a for statement, the index has to be used!

In [None]:
for i in range(len(example_list)):
    example_list[i]=0
example_list

In [None]:
for list_item in example_list:
    list_item = 23
example_list
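
When both the index and the item are needed, the built-in enumerate provides them together. A short sketch with an example list chosen for illustration:

```python
values = [3, 1, 4]   # example list

for index, value in enumerate(values):
    values[index] = value * 2   # write back through the index

print(values)   # [6, 2, 8]
```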

## While loops
A while loop repeats its body until the condition becomes False

In [None]:
n = 10.5   #set variable
while n > 1:   #as long as n is greater than 1
    n = n - 1   #subtract 1 from n
n   #after the while loop, n is 0.5

In [None]:
print(example_list)
while len(example_list) > 0:
    example_list.pop(0)
example_list

### Python List Comprehension
One can integrate for and if statements directly inside the creation of a list

In [None]:
A = [i for i in range(10)]
A

In [None]:
B = [i if i%2==0 else 0 for i in range(10)]
B

In [None]:
C = [[i+j if i%2==0 else j for i in range(10)] for j in range(3)]
C
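
An if placed at the end of a comprehension filters items out instead of replacing them. The sketch below compares such a comprehension with the equivalent explicit loop; the variable names are chosen for illustration.

```python
even_squares = [i**2 for i in range(10) if i % 2 == 0]   # keeps only even i

# equivalent explicit loop
even_squares_loop = []
for i in range(10):
    if i % 2 == 0:
        even_squares_loop.append(i**2)

print(even_squares)   # [0, 4, 16, 36, 64]
print(even_squares == even_squares_loop)
```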

# 3. Additional Packages #

In [None]:
import math, tqdm, time, scipy

### math
The package math provides mathematical constants and functions, such as:

In [None]:
math.e, math.pi, math.log(math.e)

### time
The package is useful to measure the time needed for certain operations or to limit the used time

In [None]:
i = 0
start_time = time.time()   #save current time, given in seconds
for j in range(10000000):   
    i += 1
time.time() - start_time  

In [None]:
i = 0
start_time = time.time()   #save current time, given in seconds
while time.time()-start_time < 5:   #within 5 seconds
    i += 1    #calculate how often 1 can be added to i
i
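
For measuring short durations, the standard library also offers time.perf_counter, a high-resolution clock intended for timing. A sketch of the same measurement pattern; the loop size is an arbitrary example value.

```python
import time

start = time.perf_counter()   # high-resolution clock, only differences are meaningful
total = 0
for j in range(100000):       # arbitrary example workload
    total += j
elapsed = time.perf_counter() - start

print(f"{elapsed:.4f} seconds")
```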

### tqdm
tqdm displays a progress bar that shows in real time how far a for statement has progressed

In [None]:
for j in tqdm.tqdm(range(100)):   
    time.sleep(.1)   #pauses execution for 0.1 seconds

### SciPy
Collection of mathematical algorithms and functions. Includes special functions, integration, optimization, FFT, statistics

In [None]:
from scipy.stats import norm
norm.cdf(1)  #CDF function of N(0,1)

# 4. NumPy #
NumPy is a Python package that allows fast computations on N-dimensional arrays. 

In [None]:
import numpy as np    #typically numpy is imported as np

In [None]:
A = np.array([[1,2,3],[4,5,6]])
A

In [None]:
b = np.array([1,1,1])
A@b   #matrix multiplication

There are a lot of useful functions for matrices/arrays

In [None]:
print("A transposed:")
print(A.T)
print("The sum of A:")
print(np.sum(A))
print("Sum over one axis:")
print(np.sum(A,axis = 0))
print("mean of A:")
print(np.mean(A))
print("standard deviation of A:")
print(np.std(A))
print("shape of A:")
print(A.shape)
print("cumulated sum of A, over axis 1")
print(np.cumsum(A,axis = 1))
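
The axis argument names the dimension that is collapsed by the operation. A minimal sketch using the same matrix A as above; keepdims is an optional argument of np.sum that keeps the reduced axis with length 1.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

column_sums = np.sum(A, axis=0)   # collapses the rows, one value per column
row_sums = np.sum(A, axis=1)      # collapses the columns, one value per row
kept = np.sum(A, axis=1, keepdims=True)   # result keeps two dimensions

print(column_sums, column_sums.shape)
print(row_sums, row_sums.shape)
print(kept.shape)
```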

### 1D array creation functions 

##### List to array

In [None]:
example_list = [3,2,4]
np.array(example_list)   

##### numpy arange

In [None]:
np.arange(10)

##### numpy linspace

In [None]:
np.linspace(-1,1,11)

### 2D matrices creation functions

In [None]:
np.eye(3)

In [None]:
np.zeros((4,3))

In [None]:
np.ones((2,5))

In [None]:
np.diag([3,2,1])

In [None]:
np.arange(9).reshape((3,3))

### Numpy stacking
Multiple arrays can be stacked using the functions np.hstack and np.vstack

In [None]:
a = np.arange(4)
b = 2 * np.arange(4)
print(np.hstack([a,b]))   #horizontal stacking
print(np.vstack([a,b]))   #vertical stacking

In [None]:
A = np.arange(9).reshape((3,3))
B = np.arange(6).reshape((2,3))
np.vstack([A,B])

In [None]:
A = np.arange(9).reshape((3,3))
B = np.arange(6).reshape((3,2))
np.hstack([A,B])

#### numpy.tile
Constructs an array by repeating the input array the given number of times along each axis

In [None]:
A = np.arange(5)
np.tile(A, (3,2))

### Numpy indexing

In [None]:
A = np.arange(12).reshape((4,3))
A

In [None]:
print(A[0])
print(A[:,0])
print(A[2:4,1:3])
A[2:4,1:3] = np.arange(4).reshape((2,2))
A

#### Boolean indexing

In [None]:
A < 3   #this outputs a boolean matrix of the same size as A. For each entry in A the condition is checked

In [None]:
A[A < 3] = 10   #we can index using this matrix
A

In [None]:
A[A <= 9] = np.arange(np.sum(A <= 9))  #it is possible to insert arrays
A

### np.all and np.any
Useful tools that reduce a boolean array to a single boolean, which allows array conditions to be used in if and while statements

In [None]:
np.all(A>0), np.any(A<4)   #return single boolean, all (any) checks if all (any) entries satisfy the condition

### Numpy math operations

In [None]:
A = np.arange(4).reshape((2,2)) + 1
A

In [None]:
np.log(A), np.exp(A), np.pi*A

In [None]:
np.nan, np.inf, float("inf")

In [None]:
np.maximum(np.arange(4),np.array([-1,3,2,1]))

### Numpy Broadcasting
Let $a\in\mathbb{R}^n$ and $b\in\mathbb{R}^m$ be two vectors. What do you expect $a+b^T$ to look like?

In [None]:
a = (10*np.arange(4))[:,None]
b = np.arange(3)[:,None]
a+b.T

Why is [:,None] needed? There is one small problem with numpy vectors:

In [None]:
a = np.arange(3)
a.shape, a.T.shape  #a and a transposed have the same shape.

This problem is solved by extending the dimension of the vector $a$. The following methods are possible:

In [None]:
a = np.arange(3)
print((a[:,None]).shape, (a[:,np.newaxis]).shape)

In [None]:
a = a[:,None]
a.T.shape

This procedure is useful for a lot of tasks
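
One such task is building a table of all pairwise products of two vectors. The sketch below uses example vectors chosen for illustration and compares the result with np.outer.

```python
import numpy as np

a = np.arange(1, 4)    # shape (3,)
b = np.arange(1, 5)    # shape (4,)

# shapes (3,1) and (1,4) are broadcast to (3,4)
table = a[:, None] * b[None, :]

print(table.shape)
print(table)
print(np.array_equal(table, np.outer(a, b)))   # same result as the outer product
```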

### NumPy random
The submodule numpy.random provides the ability to sample from a variety of probability distributions

##### random integers

In [None]:
np.random.randint(1,high=10,size=4)   #4 random integers from 1 to 9 (the upper bound 10 is excluded)

##### random floats in [0,1)

In [None]:
np.random.random(3)  #3 random numbers in [0,1)

##### random normal distributed data

In [None]:
np.random.normal(loc = 2, scale = 5, size = (5,5))   #random 5x5 matrix, each entry is normally distributed with mean 2 and standard deviation 5

In [None]:
np.random.normal(loc = 0, scale = np.arange(10), size = 10)   #it is possible to insert arrays for the loc and scale argument

##### shuffling

In [None]:
a = np.arange(10)
np.random.shuffle(a)
a
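
For reproducible results, a random generator can be created with a fixed seed. The sketch below uses np.random.default_rng, the generator interface of newer NumPy versions; the seed value 42 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seed value chosen arbitrarily
samples = rng.normal(loc=2, scale=5, size=3)

# a second generator with the same seed produces the same numbers
rng_again = np.random.default_rng(seed=42)
samples_again = rng_again.normal(loc=2, scale=5, size=3)
print(np.array_equal(samples, samples_again))

integers = rng.integers(1, 10, size=4)   # upper bound excluded, as in randint
print(integers)
```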

# 5. Matplotlib #
Visualization tool in Python

In [None]:
import matplotlib.pyplot as plt

## Plot function

In [None]:
x = np.linspace(-1,1,100)
plt.figure(figsize=(6,5))
for i in range(5):
    plt.plot(x,x**(i+1),label= f"$x^{{ {i+1} }}$")  
plt.xlabel("x-axis")   #label on x axis
plt.ylabel("y-axis")   #label on y axis
plt.axvline(0, c="r",ls="--")   #vertical line, "c" is used for the color, "ls" for the linestyle
plt.axhline(0, c="r",ls="-.")   #horizontal line

plt.legend()  #creates a legend, uses the labels in the plot function
plt.show()

## Scatter function

In [None]:
x = np.random.normal(loc = 1, scale = 1, size = 1000)
y = np.random.normal(loc = 1, scale = 1, size = 1000)
plt.figure(figsize = (7,7))
plt.scatter(x,y,alpha = 0.3)   #the alpha value makes the points semi-transparent
plt.xlim([-5,6])   
plt.ylim([-5,6])
plt.title("1000 samples from a normal distribution")
plt.show()

## Histogram Plot

In [None]:
data = np.random.normal(loc = 2, scale = 3, size=10000)
plt.hist(data, bins = 75)
plt.title(f"Normal distributed samples with empirical mean {np.mean(data):.2f} and std {np.std(data):.2f}")
plt.show()