<!--
SPDX-FileCopyrightText: Copyright (c) 2019-2024 Idiap Research Institute <contact@idiap.ch>
SPDX-FileContributor: Olivier Canévet <olivier.canevet@idiap.ch>
-->

# Introduction to Python

This notebook presents the minimum Python you need to know to do the labs.

If you want more examples, you may want to visit:

- <https://www.learnpython.org/>
- <https://docs.python.org/3/tutorial/>
- <https://www.w3schools.com/python/>


This notebook was developed at the [Idiap Research Institute](https://www.idiap.ch) by [Olivier Canévet](mailto:olivier.canevet@idiap.ch).

## Basic types

In this section, you will manipulate basic types, such as integers, floats, and strings, as well as the `print()` function and comments which start with `#`.

You can create variables like the following:

```python
nb_samples = 1000 # integer
accuracy = 0.8645 # float
name = "svm-model" # string
```

and you can print a variable (here `nb_samples`) with:

```python
print(nb_samples)
```

and check its type with:

```python
print(type(nb_samples))
```

You can compose strings from several variables:

```python
print("Model " + name + " achieves " + str(100*accuracy) + " % on " + str(nb_samples) + " samples")
```

or by using f-strings, which start with `f"..."`, and integrate existing variables by encapsulating them between `{}`:

```python
print(f"Model {name} achieves {100*accuracy} % on {nb_samples} samples")
```

which both print

```
Model svm-model achieves 86.45 % on 1000 samples
```
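
An f-string placeholder can also carry a format specifier after a colon, which helps when a float has more digits than you want to show. The sketch below reuses the variables from above; the specifiers `.2f`, `.2%` and `,` are picked only for illustration.

```python
nb_samples = 1000
accuracy = 0.8645
name = "svm-model"

# `:.2f` keeps two digits after the decimal point
message = f"Model {name} achieves {100*accuracy:.2f} % on {nb_samples} samples"
print(message)

# `:.2%` multiplies by 100 and appends the percent sign
print(f"Accuracy: {accuracy:.2%}")

# `:,` inserts a thousands separator
print(f"Samples: {nb_samples:,}")
```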

Run the next cell to see the actual output.

In [None]:
nb_samples = 1000 
accuracy = 0.8645 
name = "svm-model"

print(nb_samples, type(nb_samples))
print(accuracy, type(accuracy))
print(name, type(name))
print(f"Model {name} achieves {100*accuracy} % on {nb_samples} samples")

In the next cell, create
- a (string) variable named `database` with value `MNIST`,
- an (integer) variable named `nb_training_images` with value `60000`, and
- a variable `message_to_print` using variables `database` and `nb_training_images` equal to:
   ```
   Training with database MNIST containing 60000 images
   ```

In [None]:
# database = ...
# nb_training_images = ...
# message_to_print = ...
#
# (3 lines of code)
# YOUR CODE HERE
raise NotImplementedError()

print(message_to_print)

Run the next cell to see if your implementation is correct:

In [None]:
assert database == "MNIST", "Variable `database` should be `MNIST`"
assert nb_training_images == 60000, "Variable `nb_training_images` should be `60000`"
assert message_to_print == "Training with database MNIST containing 60000 images", "Wrong message"

## Functions

A Python function is declared with `def`, may take arguments as input, and may return an output.

```python
def affine_function(x, a, b=0):
    """Computes ax + b. If `b` is not specified, its value is 0."""
    y = a*x + b
    return y
```

The function is called like this:

```python
x = 1.0
a = 2.0
b = 3.0
y = affine_function(x, a, b)
print(y) # Prints 5.0 because 1*2 + 3 = 5
```

`b` is an optional parameter and can be omitted, in which case its value is `0` inside the function:

```python
print(affine_function(3.0, 2.0)) # Prints 6.0 because 3*2 + 0 = 6
```
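
Arguments can also be passed by name, which makes calls easier to read and lets you change their order. A minimal sketch, repeating the definition of `affine_function` from above so that the block runs on its own:

```python
def affine_function(x, a, b=0):
    """Computes ax + b. If `b` is not specified, its value is 0."""
    return a*x + b

# Positional call: arguments are matched by position
y_positional = affine_function(1.0, 2.0, 3.0)

# Keyword call: arguments are matched by name, so the order is free
y_keyword = affine_function(b=3.0, a=2.0, x=1.0)

# Mixed call: positional arguments first, then keyword arguments
y_default = affine_function(3.0, a=2.0)

print(y_positional, y_keyword, y_default)  # 5.0 5.0 6.0
```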

In the cell below, implement the function `parabola`, which should compute the following formula:

$$f(x) = a x^2 + bx + c $$

Note that you can raise a number to a power with the `**` operator, for example:

```python
x_cube = x**3
```
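
One detail worth knowing when combining `**` with negative numbers: the power is evaluated before a leading minus sign. The values below are chosen only for illustration.

```python
x = -2

print(x**2)     # 4: the variable holds -2, and that value is squared
print(-2**2)    # -4: `**` binds tighter than the minus sign, so this is -(2**2)
print((-2)**2)  # 4: parentheses make the minus apply first
print(9**0.5)   # 3.0: a fractional exponent computes a root
```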

In [None]:
def parabola(x, a, b, c):
    """Computes ax^2 + bx + c
    
    Args:
      x: input
      a: coefficient of x^2
      b: coefficient of x
      c: intercept
      
    Returns:
      The value of ax^2 + bx + c
      
    """
    output = 0
    
    # Implement the formula
    #
    # output = ...
    #
    # (1 line of code)
    # YOUR CODE HERE
    raise NotImplementedError()
    
    return output   

Run the next cell to test whether your implementation is correct. Feel free to add more tests of your own.

In [None]:
# x^2 + 2x + 1
assert parabola(0, 1, 2, 1) == 1
assert parabola(1, 1, 2, 1) == 4
# x^2 - 3x + 2
assert parabola(0, 1, -3, 2) == 2
assert parabola(1, 1, -3, 2) == 0
assert parabola(2, 1, -3, 2) == 0
# x^2
assert parabola(0, 1, 0, 0) == 0
assert parabola(0.5, 1, 0, 0) == 0.25
assert parabola(-0.5, 1, 0, 0) == 0.25
assert parabola(3, 1, 0, 0) == 9

You will now implement the formula of the Gaussian density function with mean $\mu$ and standard deviation $\sigma$:

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left(-\frac{(x-\mu)^2}{2\sigma^2} \right)$$
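
As a sanity check on the formula, consider the standard case $\mu = 0$, $\sigma = 1$ evaluated at $x = 0$, where the exponential term equals 1:

$$p(0) = \frac{1}{\sqrt{2\pi \cdot 1^2}} \exp \left(-\frac{(0-0)^2}{2 \cdot 1^2} \right) = \frac{1}{\sqrt{2\pi}} \approx 0.3989$$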

You will need functions and constants which are available in the `math` module:

```python
import math
print("Pi", math.pi)
print("e", math.exp(1))

```

which prints

```
Pi 3.141592653589793
e 2.718281828459045
```

You could also do

```python
from math import sqrt
```

After this, you no longer need the `math.` prefix in the code, but you have to import each function individually:

```python
print("sqrt2", sqrt(2))
```

which prints

```
sqrt2 1.4142135623730951
```
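
The `math` module also provides `math.isclose()`, which the tests later in this notebook use to compare floating-point results. A short sketch of why an exact `==` comparison can be too strict for floats (the values `0.1`, `0.2` and `0.3` are chosen only for illustration):

```python
import math

total = 0.1 + 0.2

# Floats are stored in binary, so the sum is not exactly 0.3
print(total == 0.3)                            # False
print(math.isclose(total, 0.3, rel_tol=1e-6))  # True: equal up to a relative tolerance
```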

In the cell below, we import module `math`. This is required only once, and it will now be available in the entire notebook. Usually, Python files and notebooks start with the packages you need:

```python
import os
import sys
import time
```

In [None]:
import math

In the cell below, implement the formula of the Gaussian density function (see formula above).

In [None]:
def gaussian(x, mu, sigma):
    """Computes the Gaussian density with mean `mu` and standard deviation `sigma`"""
    output = 0
    
    # Implement the formula
    #
    # output = ...
    #
    # (1 to 5 lines of code)
    # YOUR CODE HERE
    raise NotImplementedError()
    
    return output

Run the next cell to test whether your implementation is correct. Feel free to add more tests of your own.

In [None]:
# N(0, 1)
assert math.isclose(gaussian(0, 0, 1), 0.398942280, rel_tol=1e-6)
assert math.isclose(gaussian(0.5, 0, 1), 0.352065326, rel_tol=1e-6)
# N(1.2, 2.3)
assert math.isclose(gaussian(0, 1.2, 2.3), 0.1513812914, rel_tol=1e-6)
assert math.isclose(gaussian(2.9, 1.2, 2.3), 0.1319932418, rel_tol=1e-6)
assert math.isclose(gaussian(-0.3, 1.2, 2.3), 0.14022415038, rel_tol=1e-6)
assert math.isclose(gaussian(3.2, 1.2, 2.3), 0.11884686185, rel_tol=1e-6)

## Basic containers

In this section, you will manipulate basic containers, namely tuples, lists, and dictionaries.

### Python lists

A list is created with brackets:

```python
days = [ "Monday", "Tuesday", "Wednesday" ]
list_with_different_types = [ "John", "Doe", 37, "blue", 1.77 ]
```

and a new element can be added with `append()`:

```python
days.append("Thursday")
```

After adding this element, we print the list:

```python
print(days)
```

which prints

```
['Monday', 'Tuesday', 'Wednesday', 'Thursday']
```

An element of the list is accessed with the `[]` operator. Indices start at `0`, and negative indices count from the end of the list:

```python
print("First element:", days[0])
print("Second element:", days[1])
print("Last element:", days[-1])
```

The number of elements in the list (and more generally in most Python containers) is given by `len()`:

```python
print("Number of elements in the list:", len(days))
```
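
The indexing rules above can be checked directly. A minimal sketch, assuming the four-element `days` list built earlier:

```python
days = ["Monday", "Tuesday", "Wednesday", "Thursday"]

# A negative index -k refers to the same element as index len(days) - k
print(days[-1] == days[len(days) - 1])  # True: both are the last element
print(days[-2])                         # Wednesday: second element from the end

# An index outside the list raises an IndexError
try:
    days[4]
except IndexError as error:
    print("IndexError:", error)
```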

We can iterate over the elements of the list with a `for` loop and the function `range()`. Calling `range(n)` produces the integers `0, 1, ..., n-1`; see [the documentation](https://docs.python.org/3/library/functions.html#func-range) of `range`.

```python
for i in range(len(days)):
    print(f"At index {i}, element is `{days[i]}`")
```

**Note that within the `for` loop, the indentation is (usually) 4 spaces**. We can also directly iterate over the values of the list:

```python
for day in days:
    print(day)
```

The function `enumerate` provides an index alongside each value being iterated over; see [the documentation](https://docs.python.org/3/library/functions.html#enumerate) of `enumerate`:

```python
for i, day in enumerate(days, start=1):
    print(f"Day {i} of the week is {day}")
```
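
To see exactly which values `range` and `enumerate` produce, you can convert them to lists. This is only an inspection trick for small examples; the `start`, `stop` and `step` values below are arbitrary.

```python
days = ["Monday", "Tuesday", "Wednesday", "Thursday"]

# range(stop) counts from 0 up to stop - 1
print(list(range(4)))         # [0, 1, 2, 3]

# range(start, stop, step) lets you choose the first value and the increment
print(list(range(1, 10, 3)))  # [1, 4, 7]

# Without `start`, enumerate numbers the elements from 0
print(list(enumerate(days)))
# [(0, 'Monday'), (1, 'Tuesday'), (2, 'Wednesday'), (3, 'Thursday')]
```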

Run the next cell to see the output of these operations.

In [None]:
days = [ "Monday", "Tuesday", "Wednesday" ]
days.append("Thursday")

print(days)
print("First element:", days[0])
print("Second element:", days[1])
print("Last element:", days[-1])
print("Number of elements in the list:", len(days))

for i in range(len(days)):
    print(f"At index {i}, element is `{days[i]}`")
    
for day in days:
    print(day)
    
for i, day in enumerate(days, start=1):
    print(f"Day {i} of the week is {day}")    

You will now use lists and `for` loops in some short programming exercises.

In the cell below, write the function `polynom`, which should compute the value of a polynomial given an input value and a list of coefficients:

$$P(x) = \sum_{i=0}^D a_i x^i = a_0 + a_1 x + a_2 x^2 + \dots + a_D x^D$$
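
For example, with $P(x) = x^2 - 3x + 2$, the coefficients are $a_0 = 2$, $a_1 = -3$, $a_2 = 1$, and evaluating at $x = 0.5$ gives:

$$P(0.5) = 2 + (-3)(0.5) + 1 \cdot (0.5)^2 = 2 - 1.5 + 0.25 = 0.75$$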

The pseudo code for this would be:

```

Input: x and the list of coefficients a = [ a_0, ..., a_D ]
y = 0
for i = 0, D do
    y = y + a[i] * x^i
return y
```

In [None]:
def polynom(x, coefficients):
    """Compute the value of the polynomial
    
    Args:  
      x (float): input value at which to evaluate the polynomial
      coefficients (list): list of coefficients from a_0 to a_D
      
    Returns:  
      The value of P(x)
      
    Example: 
      For P(x) = x^2 - 3x + 2, we have coefficients = [2, -3, 1]
    
    """
    output = 0
    
    # Compute the value of the polynomial at x. You may
    # want to loop over the coefficients with a for loop.
    #
    # output = ...
    #
    # (<5 lines of code)
    # YOUR CODE HERE
    raise NotImplementedError()
    
    return output

Run the next cell to test whether your implementation is correct. Feel free to add more tests of your own.

In [None]:
# P(x) = 2 - 3x + x^2
assert polynom(0, [2, -3, 1]) == 2
assert polynom(1, [2, -3, 1]) == 0
assert polynom(0.5, [2,-3,1]) == 0.75
# P(x) = 3 + 4x + 3*x^3
assert polynom(0, [3, 4, 0, 3]) == 3
assert polynom(1, [3, 4, 0, 3]) == 10
assert polynom(-2, [3, 4, 0, 3]) == -29
assert polynom(0.5, [3, 4, 0, 3]) == 5.375
assert polynom(-0.5, [3, 4, 0, 3]) == 0.625

Very often in machine learning, we need to compute the accuracy of a model given the predictions of the model and the ground truth labels.

Pseudocode for this operation could be:

```
Inputs: 
1. the number of samples, N
2. the predictions y_1, ..., y_N
3. the ground truth labels l_1, ..., l_N

nb_correct = 0
for n = 1, N do
    if y_n = l_n then
        nb_correct = nb_correct + 1
    
return nb_correct / N
```

In Python, the syntax for an `if/elif/else` statement is the following:

```python
if score > 90:
    grade = "A"
elif score > 80:
    grade = "B"
elif score > 70:
    grade = "C"
elif score > 60:
    grade = "D"
else:
    grade = "E"
```

Both `elif` and `else` are optional. More on conditions [here](https://www.w3schools.com/python/python_conditions.asp). **Note again an indentation of 4 spaces**.
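
The snippet above assumes that `score` already exists. A minimal runnable sketch wraps the same thresholds in a function; the name `grade_for` is hypothetical and used only for illustration.

```python
def grade_for(score):
    """Return a letter grade for `score` (hypothetical helper, same thresholds as above)."""
    if score > 90:
        grade = "A"
    elif score > 80:
        grade = "B"
    elif score > 70:
        grade = "C"
    elif score > 60:
        grade = "D"
    else:
        grade = "E"
    return grade

# Conditions are tested from top to bottom and only the first true branch runs
print(grade_for(95))  # A
print(grade_for(85))  # B: 85 > 90 is false, 85 > 80 is true
print(grade_for(90))  # B: the comparison is strict, so 90 > 90 is false
print(grade_for(42))  # E: no condition holds, so the `else` branch runs
```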

Implement function `compute_accuracy` in the cell below.

In [None]:
def compute_accuracy(predictions, labels):
    """Compute the accuracy
    
    Args:
      predictions (list): the predictions output by a model  
      labels (list): the ground truth labels
    
    Return:
      The accuracy (float)
      
    Example:   
      compute_accuracy([0, 4, 2, 3, 1], [0, 2, 6, 3, 1])    
      should return 0.6 because the first element `0` and
      the last two `3` and `1` are the same, so accuracy = 3/5
    
    """
    accuracy = 0
    
    # Compute the accuracy. You need to compare pairwise
    # the elements of predictions and labels.
    #
    # accuracy = ...
    #
    # (<8 lines of code)
    # YOUR CODE HERE
    raise NotImplementedError()
    
    return accuracy

Run the next cell to test whether your implementation is correct. Feel free to add more tests of your own.

In [None]:
pred = [3, 2, 4, 5, 6, 6, 1, 2, 3, 4]
lbl = [3, 2, 4, 5, 6, 6, 1, 2, 3, 4]
assert compute_accuracy(pred, lbl) == 1.0, "Labels and predictions are all the same"
pred = [3, 2, 4, 5, 6, 6, 1, 2, 3, 4]
lbl = [2, 4, 5, 7, 7, 1, 4, 3, 2, 1]
assert compute_accuracy(pred, lbl) == 0.0, "Labels and predictions are all different"
pred = [3, 2, 4, 5, 6, 6, 1, 2, 3, 4]
lbl = [3, 2, 4, 7, 6, 8, 1, 2, 3, 4]
assert compute_accuracy(pred, lbl) == 0.8, "Expect 0.8"
pred = [3, 2, 4, 5, 6, 6, 1, 2, 1, 4]
lbl = [3, 2, 4, 7, 6, 8, 1, 2, 3, 9]
assert compute_accuracy(pred, lbl) == 0.6 , "Expect 0.6"

### Python tuples

Tuples are like lists except they are initialized with parentheses instead of brackets, and they are immutable: they cannot be changed once created. More on tuples [here](https://www.w3schools.com/python/python_tuples.asp).

One of the main differences with lists is that tuples are hashable (since they are immutable) and can therefore be used as keys of a dictionary (see next part).

```python
person = ("John", "Doe", 47)
print(person)
print(type(person))
```

displays

```
('John', 'Doe', 47)
<class 'tuple'>
```

Attempting to modify an element:

```python
person[0] = "Jane"
```

yields

```
TypeError: 'tuple' object does not support item assignment
```
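
The hashability mentioned earlier can be illustrated with a small sketch. It uses a dictionary, which is only introduced in the next part, and the names `point`, `point_as_list` and `labels` are made up for this example.

```python
point = (2, 3)          # a tuple: immutable, hence hashable
point_as_list = [2, 3]  # a list: mutable, hence not hashable

# A tuple can serve as a dictionary key
labels = {point: "corner"}
print(labels[(2, 3)])   # corner

# A list cannot: Python raises a TypeError
try:
    labels[point_as_list] = "corner"
except TypeError as error:
    print("TypeError:", error)
```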

Another small detail: creating a one-element tuple requires an explicit comma:

```python
one_element_list = [ 3.14 ]
one_element_tuple = ( 3.14, )
not_a_tuple = ( 3.14 )
```

Run the next cell to see what is happening.

In [None]:
one_element_list = [ 3.14 ]
one_element_tuple = ( 3.14, )
not_a_tuple = ( 3.14 )

print(one_element_list, type(one_element_list))
print(one_element_tuple, type(one_element_tuple))
print(not_a_tuple, type(not_a_tuple))

### Python dictionaries

A dictionary stores data values in "key-value" pairs and is created with braces:

```python
person = { "firstname": "John", "familyname": "Doe", "age": 47 }
```

The values are accessed via the corresponding key:

```python
print("The person is called", person["firstname"], "and is", person["age"], "years old.")
```

The `[]` operator can be used to create a new element, or to update an existing one:

```python
person["firstname"] = "Jane" # Update firstname
person["height"] = 1.77 # Create key 'height' with value 1.77
```

We can iterate over the elements of the dictionary either by looping over the keys:

```python
for key in person:
    print(key, person[key])
```

or directly by looping over the key-value pairs using `items()`:

```python
for key, value in person.items():
    print(key, value)
```

And as for lists and tuples, `len` returns the number of pairs in the dictionary.
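
A few of these operations are sketched below on the initial `person` dictionary. The methods `keys()` and `values()` are standard dictionary methods that this notebook does not use elsewhere; they are shown here only to inspect the content.

```python
person = {"firstname": "John", "familyname": "Doe", "age": 47}

print(len(person))            # 3 key-value pairs
print(list(person.keys()))    # ['firstname', 'familyname', 'age']
print(list(person.values()))  # ['John', 'Doe', 47]

# `in` tests whether a key is present; accessing a missing key raises a KeyError
print("age" in person)        # True
print("height" in person)     # False
try:
    person["height"]
except KeyError as error:
    print("KeyError:", error)
```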

Run the following cell to see what is going on.

In [None]:
person = { "firstname": "John", "familyname": "Doe", "age": 47 }
print("The person is called", person["firstname"], "and is", person["age"], "years old.")

person["firstname"] = "Jane"
person["height"] = 1.77

for key in person:
    print(key, person[key])
    
for key, value in person.items():
    print(key, value)
    
print("Number of elements:", len(person))    

In the cell below, implement the function `count_occurrences`, which should count how many times each letter of the input string occurs and return a dictionary of (letter, number of occurrences) pairs.

For instance, if the input string is `exercice`, then the output dictionary should be

```
{'e': 3, 'x': 1, 'r': 1, 'c': 2, 'i': 1}
```

You can iterate over the letters of a string with:

```python
string = "test"
for letter in string:
    print(letter)
```

And you can check whether a key is already present in the dictionary with:

```python
if key in dictionary:
    print("The key exists and is:", dictionary[key])
else:
    print("The key is not present. Let's create it with value 0.")
    dictionary[key] = 0
```

In [None]:
def count_occurrences(string):
    """Count the number of occurrences of each letter in the input `string`."""
    occurrences = {}
    
    # Count how many times each letter occurs in the input
    #
    # (<5 lines of code)
    # YOUR CODE HERE
    raise NotImplementedError()
    
    return occurrences

Run the next cell to see if your implementation is correct:

In [None]:
assert count_occurrences("ooooh") == {'o': 4, 'h': 1}
assert count_occurrences("exercice") == {'e': 3, 'x': 1, 'r': 1, 'c': 2, 'i': 1}
assert count_occurrences("Machine Learning") == {'M': 1, 'a': 2, 'c': 1, 'h': 1, 'i': 2, 'n': 3, 'e': 2, ' ': 1, 'L': 1, 'r': 1, 'g': 1}

# Feedback
You have now reached the end of this practical session. Please give us your feedback in the block below:
1. What have you learned?
2. What should be improved?
3. What was unclear?
4. Any comment is welcome!

**Do not forget to save your notebook before submitting it, otherwise you will submit the last autosave checkpoint.**

YOUR ANSWER HERE