<a href="https://colab.research.google.com/github/marshka/ml-20-21/blob/main/00_intro_to_python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Machine Learning SP 2020/2021

Prof. Cesare Alippi   
Andrea Cini ([`andrea.cini@usi.ch`](mailto:andrea.cini@usi.ch))   
Ivan Marisca ([`ivan.marisca@usi.ch`](mailto:ivan.marisca@usi.ch))   
Nelson Brochado ([`nelson.brochado@usi.ch`](mailto:nelson.brochado@usi.ch))

---

# Lab 00: Intro to Python

![alt text](https://www.python.org/static/community_logos/python-logo-master-v3-TM.png)

Some of the most important things to learn about Python from the beginning are that:

- Python is interpreted,
- Python is dynamically typed,
- indentation matters.

The reference Python interpreter (CPython) is written in C, and the language is built for fast prototyping and readable code.
Because Python is interpreted, it is usually slower than compiled languages. However, programmers can build extensions in C and move compute-heavy functions to external, compiled modules. 

In this course, we will use Python for all assignments, so make sure to familiarize yourself with its syntax as soon as possible.
Since Python is so simple and flexible, it shouldn't take you more than a week to get the basics down. 

Let's see some examples.


In [None]:
# This is an inline comment. Anything after # is ignored by the interpreter

a = 1          # Variable assignment
b = 'string'   # This is a string
c = 3.2        # This is a float
d = True       # This is a bool
e = [1, 2, 3]  # This is a list of integers
g = (1, 2, 3)  # Tuples are immutable lists (can't be modified)
f = {'a': 1}   # Hashmaps in Python are called `dictionaries`
h = None       # This is a special object, equivalent to null in Java

print('This is how you can print a string to stdout')

### More on data structures

In [None]:
i = [1, 2.0, ['a', 'b', 5], 6]  # This list contains mixed-type elements
print(i[2])                     # Access an element by index (indices start at 0)
print(i[2:4])                   # Lists and tuples support slicing (start inclusive, end exclusive)

f[123] = 'onetwothree'          # Any hashable object can be a dictionary key
print(f)
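
Dictionary keys must be hashable, which in practice means immutable built-in objects such as numbers, strings, and tuples of hashable elements. The following minimal sketch (the names `d` and `list_key_ok` are chosen only for illustration) shows a tuple key being accepted and a list key being rejected.

```python
# Hashable objects (ints, strings, tuples of hashables) can be dictionary keys
d = {'a': 1}
d[123] = 'onetwothree'
d[(1, 2)] = 'a tuple key'

# A list is mutable and therefore unhashable: using it as a key raises TypeError
try:
    d[[1, 2]] = 'a list key'
    list_key_ok = True
except TypeError as err:
    list_key_ok = False
    print('TypeError:', err)

print(d)
```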

### Dynamic typing

In [None]:
a = 10.        # Now a contains a float...
print(a, type(a))
a = [1, 2, 3]  # ...and now a list of integers.
print(a, type(a))
a = (1, 2, 3)  # Variables are just pointers to objects in memory
print(a, type(a))

### Control statements

In [None]:
# For loop behaves as a foreach
for i in [1, 4, 5]:
    # After a ':', code must be indented
    print(i)
print("Out of the loop!")  # Not indented, so it runs once after the loop ends

# While loop
a = 10
while a > 0:  # Other operators: >=, <=, <, >, ==, !=
    a = a - 1

# If - else if - else
if a == 0:
    print('Zero')
elif a < 0:
    print('Negative')
else:
    print('Positive')
    
if not a == 0:  # 'not' is the keyword for negation
    print('a is not 0')

# Inline if - else
a = 5
print('spam' if a < 0 else 'ham')

### Functions

In [None]:
def foo(x):
    print(x)
    
foo('foo')

# This is a function with optional parameters
def bar(x, optional=-1):
    print(x, optional)

bar(1)               # Prints 1 -1
bar(1, optional=2)   # Prints 1 2
bar(1, 3)            # Prints 1 3
bar(optional=3, x=1) # Prints 1 3
#bar(optional=3)     # Does not work: the required argument x is missing (TypeError)
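
To see what happens when a required argument is omitted, the call can be wrapped in a `try`/`except` block. This is only a sketch: `bar` is redefined so the cell is self-contained, and `call_failed` is a name introduced for illustration.

```python
def bar(x, optional=-1):
    print(x, optional)

try:
    bar(optional=3)  # x has no default value, so it must always be provided
    call_failed = False
except TypeError as err:
    call_failed = True
    print('TypeError:', err)
```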

__Remark__: In Python everything is an object and variables are references to objects.

In [None]:
def pippo(l):
    l[1] = None 

s = [1, 2, 3]
print(s)
pippo(s)
print(s)

In [None]:
def pluto(l):
    l = [4, 5, 6]

s = [1, 2, 3]
print(s)
pluto(s)
print(s)

- `pippo` modifies the object itself, not the variable (a side effect that is visible outside the function),
- `pluto` makes the variable `l` point to a different object (but `l` is local, so `s` is unchanged).

__Take-away__: be careful of side effects.
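
One possible way to avoid the side effect, sketched below, is to work on a copy of the argument and return the result. The function name `pippo_no_side_effect` is hypothetical and not part of the lab.

```python
def pippo_no_side_effect(l):
    l = list(l)   # Shallow copy: l now refers to a new list object
    l[1] = None   # Only the copy is modified
    return l

s = [1, 2, 3]
t = pippo_no_side_effect(s)
print(s)       # The caller's list is untouched
print(t)       # The modified copy
print(s is t)  # False: two distinct objects
```

Note that `list(l)` copies only the outer list; nested mutable elements would still be shared.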

### Syntactic sugar

In [None]:
h = None
if h is None:  # Check identity (same object) with "is"
    print('variable is None')
    
if 1 in [1, 2, 3]:  # Check membership with "in"
    print('Element found')
    
if 1 not in [1, 2, 3]:
    print('Element not found')

e = [1, 2, 3]    
e.append(4)  # Append to list
print(e)
e = [1, 2, 3] + [4, 5, 6]  # Concatenate two lists
print(e)
e = [1 for _ in range(1, 7)]  # List comprehension: builds a list of six 1s
print(e)
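
Two points from the cell above are worth a closer look: `is` compares object identity while `==` compares values, and a list comprehension can compute each element and filter them. A small sketch, with variable names chosen only for illustration:

```python
x = [1, 2, 3]
y = [1, 2, 3]
alias = x

same_value = (x == y)      # True: the two lists have equal contents
same_object = (x is y)     # False: they are two distinct objects
alias_is_x = (alias is x)  # True: both names refer to the same object
print(same_value, same_object, alias_is_x)

# List comprehensions can transform and filter elements
squares = [n ** 2 for n in range(1, 7)]
evens = [n for n in range(1, 7) if n % 2 == 0]
print(squares)
print(evens)
```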

### Built-in functions
Python has many built-in functions for common tasks.

In [None]:
a = list()     # List constructor
for i in range(10):  # Count from 0 to 9
    a.append(5-i)
sorted(a)      # Return a sorted list
max(a)         # Find the max
min(a)         # Find the min
with open('test_file.txt', 'w') as f:  # Open a file; it is closed automatically at the end of the block
    f.write("I don't know what to say...")
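
As a short illustration of two of these built-ins, the sketch below shows that `sorted` returns a new list and leaves the original untouched, and that a file can be written and read back with `with` blocks. The temporary directory is an assumption made here to avoid cluttering the working directory.

```python
import os
import tempfile

values = [5 - i for i in range(10)]   # 5, 4, ..., -4
ordered = sorted(values)              # sorted() returns a new list
print(values[0], ordered[0])          # The original list keeps its order

# Hypothetical scratch location, not part of the original lab
path = os.path.join(tempfile.mkdtemp(), 'test_file.txt')
with open(path, 'w') as out_file:
    out_file.write("I don't know what to say...")
with open(path) as in_file:           # Default mode is 'r' (read text)
    content = in_file.read()
print(content)
```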

### Importing external libraries
The true power of Python lies in the vast number of libraries that are available to developers.

In [None]:
import math  # Import the library (or "module") called "math"
print(math.cos(1))

from math import cos  # Import a single function from a module
print(cos(1))

import math as m  # Import a module and rename it
m.cos(1)

## Intro to Numpy
The main library that we are going to use in the course is called Numpy, which is the most popular Python library for scientific computing and array manipulation.   
This notebook contains a primer on how to use some basic functions of Numpy.

In [None]:
import numpy as np

## Numpy arrays

The building block of Numpy is the `ndarray`, short for "n-dimensional array". Arrays in Numpy are objects with three main properties: 
1. the actual data
2. shape (data dimensions) 
3. data type

To create an array, we use the `np.array` constructor. 

In [None]:
a = np.array([1, 2, 3])

print('Data:  ', a)
print('Shape: ', a.shape)
print('Data type:  ', a.dtype)
print('Type of a:', type(a))

We can also create arrays with a higher number of dimensions. An array with shape `(n, m)` corresponds, in classical notation, to a matrix in $\mathbb{R}^{n \times m}$.

In [None]:
b = np.array([[1., 2., 3.], 
              [4., 5., 6.]])  # a list of lists

print('Data:  ')
print(b)
print('Shape: ', b.shape)
print('Type:  ', b.dtype)
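
The data type of an array is inferred from the values passed to `np.array`, unless it is given explicitly with the `dtype` argument. A minimal sketch (the variable names are illustrative; the exact integer type depends on the platform):

```python
import numpy as np

ints = np.array([1, 2, 3])                       # All integers: an integer dtype
floats = np.array([1., 2., 3.])                  # All floats: float64
mixed = np.array([1, 2.5, 3])                    # Mixed values are promoted to float64
forced = np.array([1, 2, 3], dtype=np.float64)   # Explicit dtype

for arr in (ints, floats, mixed, forced):
    print(arr, arr.dtype, arr.ndim)
```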

In [None]:
t = 2 * np.ones(shape=3)
print('Shape: ', t.shape)
print('Data:  ')
print(t)

print()

z = np.zeros(shape=(3, 2, 5)) 
print('Shape: ', z.shape)
print('Data:  ')
print(z)

Data, shape, and type of an array can be manipulated (obviously, we are mostly interested in manipulating the data, but the other two can be very important).

In [None]:
a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
print('Original:    ', a)

# Edit data
a[2] = 9
print('Change data: ', a)

# Edit shape
print('Change shape to (3, 3):')
print(a.reshape((3, 3)))
print(a.reshape((3, -1)))

# Change type
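print('Change type: ', a.astype(np.float64))  # The np.float alias was removed in NumPy 1.24
print(a.dtype)  # astype returns a new array; a itself is unchanged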
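
Note that `reshape` and `astype` return new array objects and do not modify the array they are called on. The sketch below illustrates this; the names `original`, `reshaped`, and `as_float` are introduced only for the example.

```python
import numpy as np

original = np.arange(9)
reshaped = original.reshape((3, 3))     # New array object with shape (3, 3)
as_float = original.astype(np.float64)  # New array with a different dtype

print(original.shape, reshaped.shape)   # (9,) (3, 3)
print(original.dtype, as_float.dtype)

# To keep the change, bind the result to a name
original = original.reshape((3, 3))
print(original.shape)                   # (3, 3)
```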

Arrays can be accessed like lists, but support an advanced slicing operator that allows for complex behaviours.

In [None]:
a = np.arange(9).reshape((3, 3))
print(a)

# Access element 0, 0
print("a[0, 0]:", a[0, 0])

In [None]:
# Access first row
print(a[0])

In [None]:
print(a[0, :])

In [None]:
# Access first column
print(a[:, 0])

In [None]:
print(a[:, 0:1])
print("shape:", a[:, 0:1].shape)

In [None]:
# Access 2 x 2 submatrix
print(a[0:2, 0:2])

In [None]:
print(a[:2, :2])
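
Two details of slicing are easy to miss: indexing with an integer drops that dimension while slicing with a range keeps it, and basic slices are views on the original data rather than copies. A short sketch, with illustrative variable names:

```python
import numpy as np

a = np.arange(9).reshape((3, 3))

col_1d = a[:, 0]     # Integer index: the column dimension is dropped
col_2d = a[:, 0:1]   # Range of length 1: the column dimension is kept
print(col_1d.shape, col_2d.shape)  # (3,) (3, 1)

sub = a[:2, :2]      # A view: it shares memory with a
sub[0, 0] = 100
print(a[0, 0])       # 100, the original array changed too

independent = a[:2, :2].copy()  # An explicit copy does not share memory
independent[0, 0] = -1
print(a[0, 0])       # Still 100
```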

## Numpy operations
Numpy implements hundreds of useful mathematical operations on and between arrays. 

### Operations on arrays

Unary operations on arrays are applied element-wise, i.e., evaluated for each number saved in the array. Shape is usually not affected by unary operations, but the data type might be. 

In [None]:
import matplotlib.pyplot as plt

# Create a sequence of 1000 floats equally spaced between 0 and 2pi
x = np.linspace(0, 2 * np.pi, 1000)

__Trick__: in a code cell hold "ctrl" and point to a function

This is how we compute the sine function at all points defined in `x`.

In [None]:
y = np.sin(x)
plt.plot(x, y);  # Plot a line graph built using (x[i], y[i]) pairs

And similarly, $e^x$.

In [None]:
plt.plot(x, np.exp(x));

Numpy offers a very large collection of functions.

In [None]:
y = np.square(x)   # Square
y = np.sqrt(x)     # Square root
y = np.floor(x)    # Flooring
y = np.power(x, 3) # Exponentiation
y = np.tan(x)      # Tangent
y = np.arctan(x)   # Arctangent
y = np.tanh(x)     # Hyperbolic tangent
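
As an example of a unary operation changing the data type but not the shape, the sketch below applies `np.sqrt` and `np.square` to a small integer array (the names `k`, `root`, and `sq` are illustrative).

```python
import numpy as np

k = np.array([0, 1, 4, 9])   # Integer array
root = np.sqrt(k)            # The square root returns floats
sq = np.square(k)            # Squaring integers keeps the integer dtype

print(k.dtype, root.dtype, sq.dtype)
print(k.shape, root.shape, sq.shape)  # The shape is the same in all three cases
print(root)
```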

Arrays also support advanced manipulation via a process called **broadcasting**.  
The following is a valid expression in Numpy, and gets evaluated at every point in the array x. This concept can also be applied to higher-rank arrays.

In [None]:
y = x + 2  # 2 gets summed to every point in x
plt.plot(x, y);
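
For higher-rank arrays, broadcasting stretches dimensions of size 1 (or missing leading dimensions) to match the other operand. A minimal sketch with a matrix, a row vector, and a column vector, all named here only for illustration:

```python
import numpy as np

m = np.arange(6).reshape((2, 3))   # Shape (2, 3)
row = np.array([10, 20, 30])       # Shape (3,)
col = np.array([[100], [200]])     # Shape (2, 1)

print(m + row)             # row is added to every row of m
print(m + col)             # col is added to every column of m
print((row + col).shape)   # (2, 3): both operands are stretched
```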

This can be extended to arbitrarily complex functions!

In [None]:
y1 = np.sin(x ** 2 + 3) + 3 * np.cos(x)
plt.plot(x, y1);

In [None]:
# we can also draw multiple plots
y2 = np.sin(x) * np.cos(10*x)

plt.plot(x, y1, label="fun_1");
plt.plot(x, y2, label="fun_2");
plt.legend()

plt.title("nothing to declare")
plt.xlabel("my x")
plt.ylabel("my y")
plt.grid()

### Operations between arrays
Operations can also be computed between two or more arrays.  
There are two equivalent notations to compute the dot product between arrays in Numpy.

In [None]:
# 1D arrays
x = np.arange(9)
y = np.arange(9)

# Dot product
xy = np.dot(x, y)
print(xy)
xy = x.dot(y)
print(xy)
xy = 0
for i in range(9):
    xy += x[i] * y[i]
print(xy)

In [None]:
from time import time

# 1D arrays
x = np.arange(9999)
y = np.arange(9999)

start = time()
xy = np.dot(x, y)
np_time = time() - start
print("xy = {} - numpy dot time: {:6.3f} ms".format(xy, np_time * 1000))

start = time()
xy = 0
for i in range(9999):
    xy += x[i] * y[i]
for_time = time() - start
print("xy = {} - for cycle time: {:6.3f} ms".format(xy, for_time * 1000))


The same can be done for matrices.

In [None]:
# 2D arrays
v = np.arange(9).reshape((3, 3))
w = np.arange(9).reshape((3, 3))

# Matrix multiplication between square matrices (row-column multiplication)
vw = v.dot(w)
print(vw)
# The @ operator can be used as well
print(v @ w)
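
To make the row-column rule explicit, the following sketch rebuilds the matrix product entry by entry: element `(i, j)` of the result is the dot product between row `i` of `v` and column `j` of `w`. The array `manual` is introduced only for this comparison.

```python
import numpy as np

v = np.arange(9).reshape((3, 3))
w = np.arange(9).reshape((3, 3))
vw = v @ w

# Fill each entry with the dot product of a row of v and a column of w
manual = np.zeros((3, 3), dtype=v.dtype)
for i in range(3):
    for j in range(3):
        manual[i, j] = np.dot(v[i, :], w[:, j])

print(np.array_equal(vw, manual))  # True
```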

When multiplying non-square matrices, we have to be sure that their dimensions are aligned properly.

In [None]:
# Define two non-square matrices
v = np.arange(6).reshape((3, 2))
w = np.arange(6).reshape((3, 2))

In [None]:
# This raises a ValueError, because the inner dimensions (2 and 3) do not match
try:
    vw = v.dot(w)
except ValueError as e:
    print('ValueError:', e)

In [None]:
# Transpose the second matrix with w.T to compute the correct product
vw = v.dot(w.T)
print(vw)

In [None]:
# which is very different from
vw = v.T.dot(w)
print(vw)
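
The difference between the two products is easiest to see from the shapes: multiplying an `(n, k)` array by a `(k, m)` array gives an `(n, m)` array. A short sketch (the names `v_wT` and `vT_w` are illustrative):

```python
import numpy as np

v = np.arange(6).reshape((3, 2))
w = np.arange(6).reshape((3, 2))

v_wT = v.dot(w.T)   # (3, 2) times (2, 3) gives (3, 3)
vT_w = v.T.dot(w)   # (2, 3) times (3, 2) gives (2, 2)

print(v_wT.shape, vT_w.shape)
print(vT_w)
```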

Note that the usual multiplication operator does not work as a dot product, but as an element-wise operator. The same holds for `+`, `-`, and `/`.

In [None]:
# Element-wise addition and multiplication
v = np.arange(9)
w = np.arange(9)
print(v + w)
print(v * w)
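
On 2D arrays the difference between `*` and `@` becomes visible in the values. The two small matrices below are made up for the example.

```python
import numpy as np

p = np.array([[1, 2],
              [3, 4]])
q = np.array([[0, 1],
              [1, 0]])

elementwise = p * q   # Multiplies entries in the same position
matrix = p @ q        # Row-column matrix product

print(elementwise)    # [[0 2] [3 0]]
print(matrix)         # [[2 1] [4 3]]
```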

Arrays can be stacked or concatenated together in several ways, along different **axes**.

In [None]:
import numpy as np

a = np.arange(9)
b = np.arange(9)

# Concatenate together
print('Concatenated')
ab = np.concatenate((a, b))
print(ab)

# Stack as a matrix
print('Stacked rows')
ab = np.vstack((a, b))  # Short for "vertical stack"
print(ab)

# Stack two column vectors
print('Stacked columns')
a = a.reshape((9, 1))   # Column vector
b = b.reshape((9, 1))   # Column vector
ab = np.hstack((a, b))  # Short for "horizontal stack"
print(ab)
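
For 2D arrays, the axis can be selected explicitly with the `axis` argument of `np.concatenate`, while `np.stack` joins arrays along a new axis. A minimal sketch with illustrative names:

```python
import numpy as np

a = np.arange(6).reshape((2, 3))
b = np.arange(6).reshape((2, 3))

rows = np.concatenate((a, b), axis=0)  # Along rows, like np.vstack
cols = np.concatenate((a, b), axis=1)  # Along columns, like np.hstack
stacked = np.stack((a, b))             # Along a new leading axis

print(rows.shape, cols.shape, stacked.shape)  # (4, 3) (2, 6) (2, 2, 3)
```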

## Resources

- Python docs: [https://docs.python.org/](https://docs.python.org/)
- [Python cheat sheet](https://perso.limsi.fr/pointal/_media/python:cours:mementopython3-english.pdf)
- Numpy docs: [https://docs.scipy.org/doc/numpy/](https://docs.scipy.org/doc/numpy/)
- [Cheat sheet for several scientific libraries](https://github.com/kailashahirwar/cheatsheets-ai/) (AI-oriented)
