# Session 01: 2024-09-26
## Matteo Poggi, PhD

## What is Python and why it is important in Data Science

* Python is a high-level, general-purpose programming language.
* It is an interpreted language: when you call `python your_program.py`:
  * the source is compiled to bytecode;
  * the bytecode is executed on a virtual machine.
* Wide ecosystem of tools.
* Wide community.
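
The compile-then-execute steps listed above can be observed with the standard-library `dis` module. This is only a minimal sketch: the function `add` is a hypothetical example, and the exact instructions printed differ between Python versions.

```python
import dis


def add(a, b):
    # a tiny hypothetical function whose bytecode we inspect
    return a + b


# every function carries its compiled bytecode as a bytes object
print(type(add.__code__.co_code))

# dis.dis prints a human-readable listing of that bytecode
dis.dis(add)
```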

## Resources

Books:
  * *Fluent Python* by Luciano Ramalho
  * *Python 101* by Michael Driscoll

Online:
  * [Official doc](https://docs.python.org/3/tutorial/index.html): you should use this resource as 
  * [Real Python](https://realpython.com/)

## Survival kit

As a general rule, Python does not delimit blocks with keywords, as Fortran and Pascal do (with (`begin`/)`end`), or with curly braces, as C does.

To introduce a block (i.e. the portion of code that is under a condition, a loop, etc.) Python uses a colon `:`. The code of the block is then written with increased indentation (usually 4 spaces), and at the end of the block the previous indentation is resumed.
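
The indentation rule can be sketched as follows. The variable `temperature` and its value are hypothetical, chosen only to show two nested blocks and the return to the previous indentation level.

```python
temperature = 25  # hypothetical value, only for illustration

if temperature > 20:              # the colon introduces a block
    print("warm")                 # inside the block: indented by 4 spaces
    if temperature > 30:          # a nested block: one more indentation level
        print("very warm")
    print("still inside the outer block")
print("outside of every block")   # the previous indentation is resumed
```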

### Comments

In [None]:
# I'm a comment. I will not be evaluated.

Comments do not affect the program. They can, however, affect the readability of your code. Good comments are very precious.

### Importing packages...

You will use many libraries in your programs (*do not reinvent the wheel*). To make a library available to your program you have to `import` it.

In [None]:
import math

### ...and modules

A package can be subdivided into modules.

```Python
from matplotlib import pyplot # selectively import a module
```

This makes only the `pyplot` module of the `matplotlib` package available in your program (under the name `pyplot`), not every module of the package.
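
A few common variants of the import statement are sketched below. The modules used (`math`, `datetime`) come from the standard library and are chosen here only for illustration; the alias `dt` is an arbitrary name.

```python
import math               # import a whole module; use it as math.<name>
from math import sqrt     # import a single name from a module
import datetime as dt     # import a module under a shorter alias

print(math.pi)
print(sqrt(16))                    # 4.0
print(dt.date(2024, 9, 26).year)   # 2024
```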

### print

In [None]:
print("I am printing a string!")
print("Hello world.")

### conditionals

Guess. What will be printed out?

In [None]:
a = 3
if a == 3:
    print("a is 3")                # block of code => indentation
elif a == 5:
    print("a is 5")                # block of code => indentation
else: 
    print("a is neither 3 nor 5")  # block of code => indentation

*notice the indentation*

### `for` loop

In general, explicit loops should be avoided when possible. In Python there are other, more expressive mechanisms to achieve the same goal.

In [None]:
lecture_list = ["first", "second", "third"]

for l in lecture_list:         # here the variable l is introduced
    print(l)                   # here the variable l is used

Here the variable `l` is introduced. At the first iteration `l = "first"`, at the second iteration `l = "second"`, at the third (and last) `l = "third"`.
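
As an illustration of the remark that Python offers more expressive alternatives to explicit loops, the sketch below builds the same list twice: once with a `for` loop and once with a list comprehension. The names `upper_loop` and `upper_comprehension` are hypothetical.

```python
lecture_list = ["first", "second", "third"]

# explicit loop: build a new list step by step
upper_loop = []
for lecture in lecture_list:
    upper_loop.append(lecture.upper())

# list comprehension: the same result in a single expression
upper_comprehension = [lecture.upper() for lecture in lecture_list]

print(upper_loop == upper_comprehension)   # True

# built-in functions often replace a loop entirely
print(sum(range(10)))                      # 45
```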

#### iterating on a range

* This iterates over the numbers from 0 to 9. In computer science we often start counting from 0, not from 1.

In [None]:
for i in range(10):
    print(i)

* This iterates over the numbers from 13 to 19.

In [None]:
for i in range(13, 20):
    print(i)

* This iterates from -5 to 7 in steps of 2.

In [None]:
for i in range(-5, 8, 2):
    print(i)
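
To check which values a `range` produces without writing a loop, it can be converted to a list. The short sketch below repeats the three ranges used above; note that the stop value is always excluded.

```python
print(list(range(10)))         # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(range(13, 20)))     # [13, 14, 15, 16, 17, 18, 19]
print(list(range(-5, 8, 2)))   # [-5, -3, -1, 1, 3, 5, 7]
```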

#### altering the loop

* a `break` statement immediately exits the loop

Guess. What will be printed out?

In [None]:
lecture_list = ["first", "second", "third"]

for lecture in lecture_list:
    print(lecture)
    if lecture == "second":
        break

* a `continue` statement skips to the next iteration

Guess. What will be printed out?

In [None]:
lecture_list = ["first", "second", "third"]

for lecture in lecture_list:
    if lecture == "second":
        continue
    print(lecture)

### `while` loop

This repeats something as long as the condition after `while` is satisfied.

Guess. What will be printed out?

In [None]:
a = 5

while a > 0:
    print(a)
    a -= 1   # this means a = a - 1

### Defining a function

A program is (almost) a collection of statements. Some of these statements can be collected together in functions.

A function is defined with the `def` keyword. Use `return` to return a value from a function.

#### function taking no value and returning no value

In [None]:
from datetime import datetime

def print_now():               # this is a function definition
    print(datetime.now())


print_now()                    # this is a function call

#### function taking a value and returning no value

In [None]:
def print_hello(number_of_times):
    for _ in range(number_of_times):       # notice the _
        print("Hello!")

print_hello(5)                             # a way to call
print("----")
print_hello(number_of_times=5)             # another way to call

NOTE: when you have to introduce a variable but you do not use it, it is a common convention to call it `_`.

Of course the name of the variable you pass to a function does *not* have to match the name of the argument of the function:

In [None]:
n = 5

print_hello(n)
print("----")
print_hello(number_of_times=n)

A function can have one (or more) default arguments: if that argument is not passed the default is taken.

In [None]:
def print_bye(number_of_times=2):
    for _ in range(number_of_times):
        print("Bye!")

print_bye(5)         # prints `Bye!` 5 times
print("----")
print_bye()          # nothing is passed to `number_of_times`, so the default 2 is used: `Bye!` is printed twice.

#### function taking no value and returning something

In [None]:
from datetime import datetime

def return_now():
    return datetime.now()     # keyword return

now = return_now()
print(now)

#### function taking values and returning a value

In [None]:
def remainder(dividend, divisor):
    return dividend % divisor          # in Python `%` means remainder


rem1 = remainder(10, 4)
rem2 = remainder(dividend=10, divisor=4)
rem3 = remainder(divisor=4, dividend=10)
rem4 = remainder(10, divisor=4)

print(rem1, rem2, rem3, rem4)

Notice that there are several ways to call this function. When an argument is specified by its position (like both arguments in the call returning `rem1`, or the first argument in the one returning `rem4`), it is called a *positional argument*; when instead an argument is specified via a keyword, it is called a *keyword argument*.

In a function definition:
* if you place `/` as an argument, all the arguments before (at the left of) `/` must be positional;
* if you place `*` as an argument, all the arguments after (at the right of) `*` must be keyword.

In [None]:
def bar(a, b, /, c, *, d=5):
    pass    # pass means do nothing!

In the `bar` function, `a` and `b` must be positional, `c` can be either positional or keyword, while `d` can be keyword only. Moreover, `d` defaults to 5. Here we list some legal calls of the function:

In [None]:
bar(1, 2, 3, d=7)
bar(1, 2, 3)
bar(1, 2, c=3, d=42)
bar(1, 2, c=3)
bar(1, 2, d=45, c=53)

please: be sure you understand why these calls are legal.
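
For contrast, here are some calls that are *not* legal. This is a sketch under two assumptions: `bar` is redefined here to return its arguments (the version above does nothing), and `try`/`except`, which is not covered above, is used only to catch the `TypeError` so that the example runs to the end.

```python
def bar(a, b, /, c, *, d=5):
    return a, b, c, d   # unlike the version above, return the arguments for inspection


errors = []

try:
    bar(a=1, b=2, c=3)      # a and b are positional-only
except TypeError as error:
    errors.append(error)

try:
    bar(1, 2, 3, 4)         # d is keyword-only
except TypeError as error:
    errors.append(error)

try:
    bar(1, 2)               # c has no default, so it must be passed
except TypeError as error:
    errors.append(error)

for error in errors:
    print("illegal call:", error)
```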

#### docstring

It is a special string literal, written as the first statement of the function body, that Python is able to read. It is used to document the function. Use it!

In [None]:
def my_documented_function(x):
    """
    This function prints the value x
    """
    print(x)
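
That Python can read the docstring can be checked directly: it is stored in the `__doc__` attribute of the function. The sketch below repeats the function so that it is self-contained; `inspect.getdoc` is a standard-library helper that returns the same text with the indentation cleaned up.

```python
import inspect


def my_documented_function(x):
    """
    This function prints the value x
    """
    print(x)


# the raw docstring, exactly as stored on the function object
print(repr(my_documented_function.__doc__))

# the same text with surrounding blank lines and indentation removed
print(inspect.getdoc(my_documented_function))
```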

#### A function can have a function as its input parameters

In [None]:
def evaluate_on_five(func):
    """
    This function evaluates the func you pass
    on the number 5 and returns the result
    """
    return func(5)   # to be a valid call `func` must be a function

You can pass it a function you defined.

Guess. What is the result?

In [None]:
def add_one(x):
    return x + 1


evaluate_on_five(add_one)

This is legal. However, if the function `add_one` is used only to be passed to `evaluate_on_five`, it does not deserve a name...

#### Anonymous functions: `lambda`

The code immediately before can be rewritten as:

In [None]:
evaluate_on_five(lambda x: x + 1)

It is exactly the same thing. `lambda x: x + 1` means: a(n anonymous) function that *takes* x and returns x + 1.
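
A typical place where a `lambda` is handy is the `key` argument of the built-in `sorted`. The sketch below is only an illustration; sorting by the last character is an arbitrary, hypothetical criterion.

```python
lecture_list = ["first", "second", "third"]

# sort by the last character of each string; the key function is needed only here
by_last_char = sorted(lecture_list, key=lambda s: s[-1])

print(by_last_char)   # ['second', 'third', 'first']
```

Elements with an equal key (`"second"` and `"third"` both end in `d`) keep their original relative order.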

#### A function returning a `lambda`

In [None]:
def addition_factory(addendum):
    return lambda x: x + addendum

This function takes a number (`addendum`) and returns a function (a `lambda`) that takes a number and adds it to `addendum`.

In [None]:
add_5 = addition_factory(5)   # remember: add_5 is a function!

add_5(3)                      # and, as a function, it can be called!

This is equivalent to the more verbose:

In [None]:
def addition_factory_bis(addendum):
    def addition_function(x):           # we define a function inside a function
        return x + addendum

    return addition_function            # and return it!


add_42 = addition_factory_bis(42)

add_42(8)
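
Each function returned by the factory remembers the `addendum` it was created with. The sketch below repeats the factory so that it is self-contained; the names `add_1`, `add_10` and `adders` are hypothetical.

```python
def addition_factory(addendum):
    return lambda x: x + addendum


add_1 = addition_factory(1)
add_10 = addition_factory(10)

# the two functions are independent: each keeps its own addendum
print(add_1(5), add_10(5))             # 6 15

# factories make it easy to build a whole family of functions
adders = [addition_factory(n) for n in range(3)]
print([adder(10) for adder in adders])  # [10, 11, 12]
```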

### and... asking for help

In [None]:
help(print)

In [None]:
help(my_documented_function)

## Be pythonic

In [None]:
import this

You should try not only to write correct programs... but also to be expressive in the language you are programming in.

## Setting up a Python development environment

Python has tons of libraries. Which libraries should you have on your PC to use or develop a program?

Two ways:

1. Have every library installed globally: every program can access it:  
   **Pros**: you save space on your disk because you do not install a library more than once
   **Cons**:
     - very difficult to track dependencies: which libraries does your program really need?
     - very difficult to distribute: which libraries should another user install on their machine?
     - sometimes programs rely on libraries that are incompatible with one another (i.e. they cannot coexist on the same system)

2. Isolate every program (or group of programs), having some libraries installed only for it (them).
   **Pros**: reproducible and self-contained. You need a short(er) list of dependencies.
   **Cons**: you may end up with the same library installed more than once. But given the size of the storage devices, is this really a problem?


We will go for way number **2**, in the following.

### Isolating the Python environment with `venv`

*(Native, multiplatform)*

```bash
# create a new environment at the given path
# (run this with the Python executable the environment should be based on)
$ python -m venv /path_to_venv
# activate the environment (bound to the current shell)
$ source /path_to_venv/bin/activate
# install packages within the environment with pip (more on pip later)
(venv)$ pip install jupyterlab
```
NOTE: the Python executable must be installed system-wide;
      then, once you are inside the venv, the path to the Python interpreter may change.

Ref: [Python documentation: Creation of virtual environments](https://docs.python.org/3/library/venv.html#module-venv)
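
To check from within Python whether an environment created with `venv` is active, one can compare `sys.prefix` with `sys.base_prefix`. This is a minimal sketch; the helper name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment():
    # inside a venv, sys.prefix points to the environment directory,
    # while sys.base_prefix still points to the base Python installation
    return sys.prefix != sys.base_prefix


print(sys.executable)             # path of the interpreter that is running
print(in_virtual_environment())
```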

### Isolating the Python environment with Conda

*(third party, also commercial, multiplatform)*

Centrally manages your Python environments

```bash
# create new environment
$ conda create --name my-env-name [python=3.5]
# activate the environment (bound to the current shell)
$ conda activate my-env-name
# search and install packages within the environment
$ conda search <pkg-name>
$ conda install <pkg-name>
```

NOTE: very powerful if multiple Python versions are needed!

Ref: [Getting started with conda](https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html)

### Isolating the Python environment with Poetry

*(third party, multiplatform)*

It also works, to some extent, as a project templating manager.

```bash
# creates new project in new folder `prj`
$ poetry new prj
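# NOTE: `--tree` is not an option of GNU `ls`; it is offered by replacements such as `eza` (the `tree` command gives a similar listing)
$ ls --tree prj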
prj
├── prj
│  └── __init__.py
├── pyproject.toml
├── README.md
└── tests
   └── __init__.py
```

```bash
# install packages
$ poetry add <pkg-name>
# install project and its dependencies in the venv
$ poetry install
```

Ref: [Poetry: Basic usage](https://python-poetry.org/docs/basic-usage/)

Please: take a look at a typical project structure. Note the presence of the `tests` folder.

## Package Management

### Python Package Index (PyPI)

* repository for most third party libraries
* https://pypi.org

Please: take a look!

### Package Installer for Python (PIP)

* `pip` is the lowest-level way to install packages into a Python environment

```bash
# install [upgrade] a package
(venv)$ pip install [-U, --upgrade] <pkg-name>
# uninstall
(venv)$ pip uninstall <pkg-name>
# outputs to the file `requirements.txt` all the packages installed in this environment
(venv)$ pip freeze > requirements.txt
# install all the requirements in the file `requirements.txt`
(venv)$ pip install -r requirements.txt
```

Usually, when you download a project, it comes with a `requirements.txt` file. Then, to install all the dependencies needed by the project, you just need to run `pip install -r requirements.txt`. This makes `requirements.txt` very useful for distributing software.
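
To get a feeling for what a `requirements.txt` contains, the installed packages can also be listed from Python with the standard-library `importlib.metadata` module. This is only a sketch: the function name `freeze` is hypothetical, it is not how `pip freeze` is implemented, and the output of `pip freeze` may differ in some cases.

```python
from importlib import metadata


def freeze():
    """Return 'name==version' lines for the distributions installed in this environment."""
    lines = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:                      # skip distributions with broken metadata
            lines.append(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in freeze():
    print(line)
```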

### Some tips

* Use virtual environments or equivalent rather than system-wide installations: **flexibility**;
* Explicitly set package versions: **portability**;
* Every package is both a **resource** and a **liability**: choose your dependencies wisely.

# Homework

* create a virtual environment with `venv` and install package `jupyterlab` on it.
* use `pip freeze` to generate `requirements.txt`. Take a look:
  - why, if you install just one package, do you find more than one package listed?
  - what are those numbers next to every package listed?
* **choose your favorite editor**: it will be necessary to proceed. **You must be able to open, edit and save a file in it**.  
  BEWARE: if you are using `WSL` this editor should be installed inside `WSL`.
* Using `jupyter lab` write a function that takes a number and returns `True` if the number is prime and `False` otherwise.