# Some advanced features of Python

This notebook will be dedicated to some _advanced_ features of Python. Keep in mind that Python is a mind-bogglingly rich language. Throughout these notebooks, we are only scratching the surface of what it can do :o

## Table of content
- [typing annotations](#Typing-annotations---toc)
- [lambda functions](#Lambda-functions---toc)
- [nested functions](#Nested-functions---toc)
- [recursion](#Recursion---toc)
- [dataclasses](#Dataclasses---toc)
- [time the execution of a piece of code](#Time-the-execution-of-a-piece-of-code---toc)
- [isolating executable code from library code](#Isolating-executable-code-from-library-code---toc)

## Typing annotations - [toc](#Table-of-content)

Let's say you want to write a function that computes the mean of a list of integers as a real number. You might write something like the following

In [None]:
def mean(list_of_ints):
    n = len(list_of_ints)
    if n == 0:
        raise Exception("the input list is empty")

    return sum(list_of_ints) / n


assert mean([1, 2, 3, 4, 5, 6]) == 3.5

However, having to write `list_of_ints` for the first argument of `mean` does not look very clean, does it?
We have to either
- not give any information to the caller of `mean`
- write a docstring with the type information
- tweak the argument names

And also, we don't know the type of the output of `mean`...

There is a way to add all this information without any extra comments or weird names! We will use _type annotations_ for that!

Type annotations can be added to variables, function arguments and function return values.
In general, they take the following form
```python
a: int = 5  # the value stored in the `a` variable is an `int` and has value `5`

# the function `f` takes a `float` called `x` and returns a `str`
def f(x: float) -> str:
    return f"x = {x}"
```

> **Note**
>
> these types are valid Python syntax but won't be checked at _runtime_ (execution time).
> Passing a `str` to the function `f` above won't raise any error. External tools such as mypy or Pyright can report such mismatches, but they are out of the scope of this notebook and won't be mentioned further.
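
To illustrate the note above, the short sketch below calls `f` with a `str` even though the annotation asks for a `float`. Python runs it without complaint and only stores the annotations as metadata on the function. The variable name `result` is just an illustration.

```python
def f(x: float) -> str:
    return f"x = {x}"


# the annotation says `float`, but nothing stops us from passing a `str`
result = f("not a float")
print(result)

# the annotations are only stored as metadata on the function object
print(f.__annotations__)
```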

Another very useful thing to know about is the `typing` module from the standard library, which provides many different types to be used in annotations, e.g. `List`, `Tuple` or `Dict`.
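
As a small sketch of how these types can be combined, here are two hypothetical helper functions, `min_max` and `count_occurrences`. They are not used elsewhere in this notebook and only illustrate `Tuple` and `Dict` annotations.

```python
from typing import Dict, List, Tuple


def min_max(values: List[int]) -> Tuple[int, int]:
    """ returns the smallest and the largest value of a non-empty list """
    return min(values), max(values)


def count_occurrences(words: List[str]) -> Dict[str, int]:
    """ maps each word to the number of times it appears """
    counts: Dict[str, int] = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts


assert min_max([3, 1, 2]) == (1, 3)
assert count_occurrences(["a", "b", "a"]) == {"a": 2, "b": 1}
```

Since Python 3.9, the builtin `list`, `tuple` and `dict` can also be subscripted directly, e.g. `list[int]`, so importing from `typing` is not strictly required for these three.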

Let's rewrite the `mean` function above with proper annotations and argument names!

In [None]:
from typing import List


def mean(values: List[int]) -> float:
    n = len(values)
    if n == 0:
        raise Exception("the input list is empty")

    return sum(values) / n


assert mean([1, 2, 3, 4, 5, 6]) == 3.5

Much better, isn't it? By only looking at the signature of the `mean` function, i.e.
```python
(values: List[int]) -> float
```
we know that it expects a list of integers and will return a floating-point number!!

Here is a bonus type, to show that type annotations can also describe more complicated functions.

Right now, we raise an error if the input list of values is empty...
But what if we wanted to return `None` instead to indicate that there is no _mean_ for an empty list?

We can use the `Optional` type for that and early return `None` if the list is empty :)

In [None]:
from typing import List, Optional


def mean(values: List[int]) -> Optional[float]:
    """ returns `None` if the input is empty """
    n = len(values)
    if n == 0:
        return None

    return sum(values) / n


assert mean([]) is None
assert mean([1, 2, 3, 4, 5, 6]) == 3.5

> **Note**
>
> it's considered good practice to mention the cases where the returned value is `None` in some comment or docstring, so that the caller knows what to expect when calling the function on edge cases!
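
From the caller's side, an `Optional` return value means the `None` case has to be handled before the result is used as a `float`. The sketch below shows one way to do it. The `describe` function is a hypothetical caller introduced only for this example.

```python
from typing import List, Optional


def mean(values: List[int]) -> Optional[float]:
    """ returns `None` if the input is empty """
    if len(values) == 0:
        return None
    return sum(values) / len(values)


def describe(values: List[int]) -> str:
    m = mean(values)
    # the `None` case is handled first, then `m` can be used as a `float`
    if m is None:
        return "no mean for an empty list"
    return f"mean = {m:.2f}"


assert describe([]) == "no mean for an empty list"
assert describe([1, 2, 3, 4]) == "mean = 2.50"
```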

## Lambda functions - [toc](#Table-of-content)

Up until now, all the functions that you have been writing have been using the `def` keyword.
However, there is another way to define functions: the `lambda` keyword.

> **Note**
>
> such functions are sometimes called _anonymous_ functions.

A practical use of _lambda_ functions is to write _functional_ Python. (Python is NOT a functional language ;))

In such a paradigm, it's very common to use the three functions below
- `map` will apply a function to each value of an _iterable_
- `filter` will select from an _iterable_ only the values that satisfy a given predicate
- `reduce` from the `functools` module will accumulate values from an _iterable_ and produce a single value

You can see that all these functions will take functions as arguments, i.e. `map` to apply the function to all elements, `filter` to check whether an element should be discarded or not and `reduce` to accumulate the elements.

This is possible because functions are _first-class citizens_ in the Python language and thus can be stored in variables, given to functions and returned from functions!
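
Before moving on to _lambda_ functions, here is a minimal sketch of these ideas with regular `def` functions. The helpers `double`, `negate`, `is_even`, `add` and `pick` are made up for this illustration.

```python
from functools import reduce
from typing import Callable

values = [1, 2, 3, 4]


def double(x: int) -> int:
    return 2 * x


def negate(x: int) -> int:
    return -x


def is_even(x: int) -> bool:
    return x % 2 == 0


def add(x: int, y: int) -> int:
    return x + y


# functions given to other functions
# (`map` and `filter` return lazy iterators, hence the calls to `list`)
assert list(map(double, values)) == [2, 4, 6, 8]
assert list(filter(is_even, values)) == [2, 4]
assert reduce(add, values, 0) == 10

# a function stored in a variable
g = double
assert g(21) == 42


# a function returned from another function
def pick(name: str) -> Callable[[int], int]:
    operations = {"double": double, "negate": negate}
    return operations[name]


assert pick("negate")(5) == -5
```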

### Syntax
The syntax of _lambda_ functions is the following
```python
lambda x1, ..., xn: some_expression
```
where `x1`, ..., `xn` are the arguments of the _lambda_ function and `some_expression` is any expression involving the `x1`, ..., `xn` arguments.
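
As a quick sketch of this syntax, the block below compares a _lambda_ function with its `def` equivalent and then passes a _lambda_ directly as the `key` argument of the builtin `sorted`. The names `add`, `add_with_def` and `words` are only illustrative.

```python
# a `lambda` function stored in a variable...
add = lambda x, y: x + y


# ...and the equivalent function written with `def`
def add_with_def(x, y):
    return x + y


assert add(1, 2) == add_with_def(1, 2) == 3

# a `lambda` has no name of its own, hence "anonymous"
assert add.__name__ == "<lambda>"

# a `lambda` passed directly as an argument, without naming it
words = ["banana", "fig", "cherry"]
assert sorted(words, key=lambda w: len(w)) == ["fig", "banana", "cherry"]
```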

### Example
Let's say we want to compute the product of the squares of all odd elements from a list of integers.
Without the three functions above, you might write something like

In [None]:
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

odd_values = []
for v in values:
    if v % 2 == 1:
        odd_values.append(v)

squared_values = []
for v in odd_values:
    squared_values.append(v ** 2)

product = 1
for v in squared_values:
    product *= v

assert product == 893025

We can rewrite all this in almost a single line with the `map`, `filter` and `reduce` functions and using _lambda_ functions.

In [None]:
from functools import reduce

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

is_odd = lambda x: x % 2 == 1
square = lambda x: x * x
product = lambda x, y: x * y

assert reduce(product, map(square, filter(is_odd, values)), 1) == 893025

Which can be further simplified by passing the _lambda_ functions directly to the functions

In [None]:
from functools import reduce

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

assert reduce(
    lambda x, y: x * y,
    map(
        lambda x: x * x,
        filter(lambda x: x % 2 == 1, values),
    ),
    1,
) == 893025

> **Note**
>
> Python is honestly not the best language to write functional code...
> This is mostly because `map`, `filter` and `reduce` are functions that take _iterables_ as arguments rather than methods on _iterables_, which makes the syntax harder to read.
>
> Some languages allow things like the following, which is arguably much better!
> ```rust
> values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
> values
>     .filter(lambda x: x % 2 == 1)
>     .map(lambda x: x * x)
>     .reduce(lambda x, y: x * y, initial=1)
> ```
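
For comparison, a common alternative in Python is a generator expression, which combines the filtering and the mapping in one readable line. The sketch below assumes Python 3.8 or later, where `math.prod` is available. The name `result` is only illustrative.

```python
from math import prod

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# keep the odd values, square them and multiply everything together
result = prod(v * v for v in values if v % 2 == 1)

assert result == 893025
```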


## Nested functions - [toc](#Table-of-content)

Sometimes, we write functions that only make sense in some restricted scope, or functions that we would like to hide from the outside of our modules.

In order to do that, it is possible to _nest_ functions inside other functions!

### Example
Let's say we want to write a `mean` function again, but where we reimplement the `sum` function ourselves for some reason.
However, we don't want to shadow the `sum` function of the scope outside of `mean`!

We can use _nested_ functions to achieve that.

In [None]:
from typing import List, Optional


def mean(values: List[int]) -> Optional[float]:
    """ returns `None` if the input is empty """
    n = len(values)
    if n == 0:
        return None

    def sum(values: List[int]) -> int:
        res = 0
        for v in values:
            res += v
        return res

    # this line will use the `sum` defined just above
    return sum(values) / n


assert mean([]) is None
assert mean([1, 2, 3, 4, 5, 6]) == 3.5
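
To check that a _nested_ function really is hidden from the outside, here is a minimal sketch with two made-up functions, `shout` and its nested helper `add_exclamation`. Calling the helper from outside `shout` is expected to raise a `NameError`.

```python
def shout(text: str) -> str:
    # `add_exclamation` only exists inside `shout`
    def add_exclamation(s: str) -> str:
        return s + "!"

    return add_exclamation(text.upper())


assert shout("hello") == "HELLO!"

helper_is_hidden = False
try:
    add_exclamation("hello")  # not defined in the outer scope
except NameError:
    helper_is_hidden = True

assert helper_is_hidden
```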

## Recursion - [toc](#Table-of-content)

_Recursion_ is a programming technique where a function calls itself on a slightly different set of arguments until it reaches a base case, at which point the pending calls unwind and a value is finally returned.

Some important notes about _recursion_:
- the number of times a function can call itself is limited (some languages work around this with _tail-call optimization_, but Python does not)
- you should always make sure each _recursive_ call of the function brings its arguments closer and closer to the base case, otherwise, the _recursive_ stack of calls might never end...
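
The first point can be observed directly. The sketch below uses a made-up `countdown` function and asks Python for its current limit with `sys.getrecursionlimit()`. Going past that limit raises a `RecursionError`. The exact value of the limit depends on the environment, so it is printed rather than assumed.

```python
import sys


def countdown(n: int) -> int:
    # our base case
    if n == 0:
        return 0

    # each call brings `n` closer to the base case
    return countdown(n - 1)


assert countdown(100) == 0

limit = sys.getrecursionlimit()
print(f"recursion limit: {limit}")

hit_limit = False
try:
    countdown(limit + 10)  # needs more nested calls than allowed
except RecursionError:
    hit_limit = True

assert hit_limit
```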

### Example
Let's try to implement the factorial function.

The factorial of a number $n$ is the product of all the integers from $1$ to $n$.

Following this iterative definition, we might write

In [None]:
from typing import Callable


# test if the function given as input implements the factorial function
def test_factorial(f: Callable[[int], int]):
    for i, o in [
        (0, 1),
        (1, 1),
        (2, 2),
        (3, 6),
        (4, 24),
    ]:
        assert f(i) == o

In [None]:
def fact_iter(n: int) -> int:
    res = 1
    for i in range(1, n + 1):
        res *= i
    return res

test_factorial(fact_iter)

Another definition of the factorial is _recursive_ and states that the factorial of any positive integer $n$ is equal to $n$ times the factorial of $n - 1$, and the factorial of $0$ is equal to $1$.
Let's rewrite the factorial function!

In [None]:
def fact_rec(n: int) -> int:
    # our base case
    if n == 0:
        return 1

    return n * fact_rec(n - 1)
    
test_factorial(fact_rec)

Finally, if we want to raise an error in case the input number is negative, without checking that at each _recursive_ call, we can use a nested _auxiliary_ function :)

In [None]:
def fact_rec(n: int) -> int:
    if n < 0:
        raise Exception(f"n should be non-negative, found {n}")

    def aux(n: int) -> int:
        # our base case
        if n == 0:
            return 1

        return n * aux(n - 1)

    return aux(n)
    
test_factorial(fact_rec)

try:
    fact_rec(-42)
except Exception as e:
    print(e)

## Dataclasses - [toc](#Table-of-content)

When writing classes, one almost always has to write an `__init__` and a `__repr__` method.
When these methods are simply listing all the fields of the class, then it's quite tedious to write them...
And we don't like tedious!

_Dataclasses_ are a powerful tool to simplify the writing of classes :)

### Example
Let's write a very simple 3D vector class, which will have `x`, `y` and `z` as its coordinates and will be represented as `(x, y, z)`.

In [None]:
class Vec:
    def __init__(self, x: float, y: float, z: float):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self) -> str:
        return f"({self.x}, {self.y}, {self.z})"


v = Vec(1, 2, 3)
print(f"v = {v}")

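And now, with _dataclasses_, we can greatly simplify this boilerplate code: the `__init__` and `__repr__` methods are generated from the annotated fields. Note that the generated representation is `Vec(x=1, y=2, z=3)` rather than `(1, 2, 3)`.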

In [None]:
from dataclasses import dataclass


@dataclass
class Vec:
    x: float
    y: float
    z: float


v = Vec(1, 2, 3)
print(f"v = {v}")
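
If the `(x, y, z)` representation of the handwritten class matters, one possible sketch is to define `__repr__` explicitly: `@dataclass` keeps a `__repr__` that the class already defines and still generates the other methods, such as `__init__` and `__eq__`.

```python
from dataclasses import dataclass


@dataclass
class Vec:
    x: float
    y: float
    z: float

    # an explicitly defined `__repr__` is kept as is by `@dataclass`
    def __repr__(self) -> str:
        return f"({self.x}, {self.y}, {self.z})"


v = Vec(1, 2, 3)
assert repr(v) == "(1, 2, 3)"

# `__init__` and `__eq__` are still generated for us
assert v == Vec(x=1, y=2, z=3)
```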

## Time the execution of a piece of code - [toc](#Table-of-content) 

Sometimes, we would like to measure the time it takes certain functions to run. This is quite simple with the `time` function from the `time` module of Python's standard library.

Let's say we have defined a function `f` earlier that takes an integer argument `n`, e.g. the function that computes the sum of the squares of the first $n$ positive integers, $n \mapsto \sum\limits_{k=1}^n k^2$.

Let's measure the time it takes to run this function on a few inputs:

In [None]:
def f(n: int) -> int:
    return sum(i * i for i in range(1, n + 1))

In [None]:
from time import time

for n in range(25):
    start_time = time()
    f(2**n)
    elapsed = time() - start_time
    print(f"f({2**n}) took {elapsed}s")
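
For short durations, the standard library also offers `time.perf_counter`, a clock meant for measuring elapsed time, and the `timeit` module, which repeats a call many times to smooth out noise. The sketch below is one possible way to use them on the same `f`. The input sizes and the number of runs are arbitrary choices.

```python
from time import perf_counter
from timeit import timeit


def f(n: int) -> int:
    return sum(i * i for i in range(1, n + 1))


# same pattern as with `time`, but with a clock meant for measuring durations
start_time = perf_counter()
f(10_000)
elapsed = perf_counter() - start_time
print(f"f(10000) took {elapsed:.6f}s")

# `timeit` runs the call `number` times and returns the total time in seconds
total = timeit(lambda: f(1_000), number=100)
print(f"f(1000) took {total / 100:.6f}s on average over 100 runs")
```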

## Isolating executable code from library code - [toc](#Table-of-content)

An important distinction can be made between _executable code_ and _library code_:
- the former is code that will be executed as soon as you interpret a Python module
- the latter is code that one wants to expose to the outside world but that should not run anything when imported: only the _importer_ of the library decides when to run it.

In this section we will see what you should do and what you should NOT do when you want to run code next to library code that needs to be exported and used elsewhere.
A concrete example of this in the following classes will be writing functions that (1) you want to test and debug in the same module as where they are defined and (2) will be exported and run separately in tests.

As an example, let's say we write a function that computes the sum of the squares of the first $n$ integers. We want to define a function that does that and also measure the performance of our function on large values of $n$, in the same Python module.

The naive way would be to do as in [naive-module.py](../src/naive-module.py), i.e. put the code just next to the function.

Let's run it.

In [None]:
# don't worry about the next two lines
import sys
sys.path.append('../src/')

In [None]:
import naive_module
print(naive_module.sum_of_ints(10))

What just happened? Is this desirable if we just want to use the `sum_of_ints` function directly?

Another thing with this _naive_ module is that, now, we can access the internal import of the `time` function from the `time` library!

In [None]:
print("current time:", naive_module.time())

In Python, we can use a special block of code that we will call _main block_ in the rest of these classes.

It is considered good practice to use these and the syntax is the following:

In [None]:
if __name__ == "__main__":
    print("i am main!")
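
To make this more concrete, here is a sketch of how a whole module could be organized around a _main block_. Everything in it is hypothetical: the file name `my_module.py` and the names `sum_of_squares` and `main` are assumptions made for this illustration, not the contents of the modules used in this notebook.

```python
# contents of a hypothetical module, e.g. `my_module.py`


def sum_of_squares(n: int) -> int:
    """ library code: safe to import, does nothing until called """
    return sum(i * i for i in range(1, n + 1))


def main() -> None:
    # imported here so that `time` does not become an attribute of the module
    from time import time

    start_time = time()
    result = sum_of_squares(10_000)
    elapsed = time() - start_time
    print(f"sum_of_squares(10000) = {result}, took {elapsed}s")


# `__name__` is "__main__" only when the file is run directly,
# e.g. with `python my_module.py`; when imported, it is the module name
if __name__ == "__main__":
    main()
```

With this layout, importing the module only defines `sum_of_squares` and `main`, while running the file directly also executes the timing code.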

Now, let's run the same code as before, but with the [module_with_main.py](../src/module_with_main.py) module

In [None]:
import module_with_main

print(module_with_main.sum_of_ints(10))

Now, the code runs directly without the internal performance measurements!

And we also can't access the internal `time` function anymore: the cell below is expected to fail with an `AttributeError`!

In [None]:
print("current time:", module_with_main.time())