# Introduction to Jupyter and Python

## Jupyter Notebook

*This* is a **Jupyter Notebook**. 

The Jupyter Notebook is a **web application** (you access it as a web page in any browser, such as Chrome or Firefox) and is **open-source**, maintained by the people at [Project Jupyter](https://jupyter.org/).

A notebook is an **interactive** document that allows you to both write and execute `python` code (and not only Python) without needing a command line (or shell), a script editor, or an IDE. A notebook contains not only code, but also text, visualizations and equations.

The name Jupyter comes from the core programming languages that it supports: **Ju**lia, **Pyt**hon, and **R**. However, there are currently over 100 other kernels that you can also use.

You probably started Jupyter from a command line (or shell, or terminal, or whatever you call it) and, if you go back there, you will see that something is still running. That is a Jupyter server (just like the ones you find on the internet). You access Jupyter through your browser because the browser connects to that server at the URL http://localhost:8888/lab (or something similar). The term *localhost* means you are hosting the server locally (on your machine).

### Notebook's Cell

A notebook is made of **cells** that can contain either: 

- **raw text**

- formatted text in **markdown** language.

This is a markdown cell. It does not look so different from the text you have read so far, does it? That is because it was all written in markdown cells! 
Just double-click on this text and you will see it turn into a cell with unformatted text!

If you are interested in markdown, check out the page on [Wikipedia](https://en.wikipedia.org/wiki/Markdown) and [John Gruber's blog](https://daringfireball.net/projects/markdown/) (Gruber created markdown together with Aaron Swartz).

- **code**

In [None]:
# this is a python code cell, and this is a comment since it starts with `#`
print('hello world!') # here I say hello to the world.

When you **run** a cell you evaluate what is inside. If it was code you *execute* it and if it was markdown you transform it into formatted text (with raw cells nothing happens).

**How to run a cell?**
- *Boring way*: select the cell and then click the `Run` button on the toolbar in the upper part of the document.
- *Smart way*: when the cell is selected, just press `CTRL` + `ENTER` or `SHIFT` + `ENTER`

**How to change the type of a cell?**
- *Boring way*: on the toolbar you find a drop-down menu where you can select the cell type.
- *Smart way*: there are shortcuts. I will let you discover them.

## Python

Let's do some Python!

In [None]:
1 + 1

### Comments


A comment is a line that will not be executed. You can use comments to make your code more readable by others and/or to take notes. To write a comment, insert the `#` symbol and write your text after it.  

In [None]:
# This is what a comment looks like in python

### Variables

Variables are **containers** for storing data values.

To create a variable, you just assign it a value and then start using it. Assignment is done with a single equal sign `=`.

In [None]:
# you are creating a variable named `first_variable` that contains the value 42
first_variable = 42

# here, you are printing the value contained in `first_variable`
print(first_variable)

In [None]:
a, b = 7.5, 10 # you can do multiple assignment
print(a, b)    # you can print the content of more than one variable


For variable names, NEVER use: numbers at the beginning, blank spaces, the `-` symbol, or special characters like !, @, %, $, &.
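If you are unsure whether a name is allowed, you can ask Python itself. The sketch below uses the string method `isidentifier()` and the standard `keyword` module; the candidate names are made up purely for illustration.

```python
import keyword

# candidate names (hypothetical, chosen only to illustrate the naming rules)
candidates = ['first_variable', 'value2', '2nd_value',
              'my variable', 'my-variable', 'price$', 'for']

results = {}
for name in candidates:
    # a usable name must be a valid identifier and must not be a reserved word
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, '->', results[name])
```

Note that `for` is rejected even though it contains only letters: it is a reserved word of the language.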

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> Define two variables with different values and swap their content
    </span>
</p>

In [None]:
# Code here

### Data-Types

A programmer usually has to manage different data types, and `python` provides some base types.

Here is an incomplete list:
1) `None`: null object (an object with no value)
2) **numeric**: simply numbers.
    - **boolean** (`bool`): only two possible logic values `True` or `False`. 
    - **integer** (`int`): all possible integer values (..., `-1`, `0`, `1`, `2`, ...)
    - **floating point** (`float`): represents real numbers with finite precision (`1.2`, `-7.0`, ...). Since Python (like any other programming language) runs on a finite machine, irrational numbers, and even many decimal fractions, can only be approximated.
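The finite precision of `float` shows up even in very simple sums. The following sketch illustrates it and uses `math.isclose` from the standard library to compare with a tolerance; the variable name `total` is just an example.

```python
import math

total = 0.1 + 0.2
print(total)                     # not exactly 0.3, because of binary rounding
print(total == 0.3)              # an exact comparison fails
print(math.isclose(total, 0.3))  # a comparison with a tolerance succeeds
```

For this reason it is usually safer to compare floating point values with a tolerance than with `==`.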

#### Examples

In [None]:
null_var = None
bool_var = True
int_var = 5
float_var = 333.6

#### Operations with numeric types

Each data type has its own operations. For numeric types they are quite intuitive; for data structures they may not be.

With booleans there are logical operations like `not`, `and` and `or`.

In [None]:
a = True
b = False

c = a and (not b) or False
print(c)

Numeric types support the traditional operations: addition `+`, subtraction `-`, product `*`, division `/`, and power `**`.

In [None]:
a = 6**2
a

In [None]:
a = (18*10 + 6**2 - 9*4)/10
a

We can also compare numeric types:
 - greater than (`>`), greater than or equal to (`>=`)
 - less than (`<`), less than or equal to (`<=`)
 - equal to (`==`), not equal to (`!=`)

And the result of a comparison is a boolean.

In [None]:
print(10 > 5)
print(10 <= 5)
print((10 > 5) and not (10 <= 5))

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> Compute the following expression:
    </span>
</p>

\begin{equation*}
\frac{1}{\Bigl(\sqrt{\pi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}}
\end{equation*}

where $\pi=3.14$, $e = 2.72$, and $\phi$ is your favourite number.

In [None]:
# Code here


<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> What happens if I multiply different types: 1) int and float, 2) int and bool
    </span>
</p>

In [None]:
# Code here


3) **data structure**: structures that organize numeric and non-numeric values according to specific needs.
    - **string** (`str`): sequence of characters (simply text). Example: `'this is a string'`.
    - **list** (`list`): ordered sequence of objects (numbers, strings, whatever) indexed by integer positions starting from 0. Example: `[1, -4.2, 'hello']`
    - **tuple** (`tuple`): same as a list, except that lists are *mutable* (insertion, deletion and substitution of elements are allowed) while tuples are *immutable*. Example: `(1, -4.2, 'hello')`
    - **dictionary** (`dict`): collection of objects indexed by nearly arbitrary key values. From another point of view, it is a collection of key-value pairs. Example: `{'key': 'value', 'a': 1, 6: -3.2}`
    - **set** (`set`): unordered collection of unique items. Example: `{-1, 'string', 7.2}`
    
    

#### Examples

In [None]:
str_example = 'Hello world'

In [None]:
print(str_example[0]) # indexing starts from 0

In [None]:
list_example = ['ab', 2] # a list containing 2 elements
list_example[1] = 3 # assignment
print(list_example)

In [None]:
tuple_example = ('ab', 3) # a tuple containing 2 elements
print(tuple_example[0]) # the indexing is the same as for lists and strings
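Two properties mentioned in the list of data structures are worth seeing in action: tuples cannot be modified, and sets keep only unique items. A minimal sketch (the name `set_example` is introduced here just for illustration):

```python
tuple_example = ('ab', 3)

try:
    tuple_example[1] = 4          # tuples are immutable, so this raises TypeError
except TypeError as error:
    print('cannot modify a tuple:', error)

set_example = {1, 2, 2, 3, 3, 3}  # duplicates are discarded
print(set_example)
print(len(set_example))
```

The same assignment on a list works, as shown with `list_example`.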

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> Create a tuple with the sum/or of the elements of the same type in the following list:
    </span>
</p>

In [None]:
values_list = [2, 1.6, 2.3, True, 22, False, True]

In [None]:
# Code here


Dictionaries provide a **_mapping_** between two related elements: **key** and **value**.  A key-value pair is called a dictionary *item*.

In [None]:
players_empty = {}                                          # this is an empty dictionary
players_team = {"Kane": "Tottenham", "Salah": "Liverpool"}  # this is not

To access a value, use the **`[]`** operator on the dictionary by calling the corresponding key:

In [None]:
print(players_team["Kane"])

Some useful methods to manage dictionaries:

In [None]:
print(players_team.keys()) # returns all the keys
print(players_team.values()) # returns all the values
print(players_team.items()) # returns all the items

In [None]:
players_team['Martinelli'] = 'Arsenal' # Assigning new value 'Arsenal' to new key 'Martinelli'
print(players_team.items())
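What happens when you ask for a key that is not in the dictionary? The `[]` operator raises a `KeyError`, while the `get` method returns `None` or a default value of your choice. A small sketch, using a separate dictionary `example_team` and a key `'Haaland'` that are introduced here only for illustration:

```python
example_team = {"Kane": "Tottenham", "Salah": "Liverpool"}

print("Kane" in example_team)                  # membership test on the keys
print(example_team.get("Haaland"))             # missing key: returns None
print(example_team.get("Haaland", "unknown"))  # missing key with a default value

try:
    example_team["Haaland"]                    # [] on a missing key raises KeyError
except KeyError as error:
    print('missing key:', error)
```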

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> Fill in the following dictionary using assignment operations
    </span>
</p>

In [None]:
info_dict = {
    'name': None,
    'surname': None,
    'age': None,
    'birthday': None, # tuple with (yyyy, mm, dd)
    'favourite movie': None
          }

In [None]:
# Code here

### Control Flow

Control flow allows a programmer to tell the machine the order in which the lines of code must be executed. In `python` (as in most other programming languages) the flow of execution is regulated by conditional statements, loops, and function calls.

#### `if` statement

Often, you need to execute some statements only if some condition holds, or choose statements to execute depending on several mutually exclusive conditions.

```python
if <condition>:
    <statement>
elif <condition>:
    <statement>
else:
    <statement>
```

`if`, `elif`, and `else` are the keywords defining the control flow, `<condition>` is anything that can be interpreted as a boolean, and `<statement>` is a block of code.

Note that the colon mark `:` and the indentation have the role of defining the scope for each statement.

In [None]:
wavelength_nm = 500

if wavelength_nm < 400:
    print('ultraviolet')
elif (wavelength_nm >= 400) and (wavelength_nm <= 700):
    print('visible light')
else:  # wavelength_nm > 700
    print('infrared')
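A condition does not have to be a comparison: Python converts any object to a boolean when it appears in an `if`. As a rough sketch (the list `measurements` is a made-up example), `0`, `None` and empty containers count as `False`, while most other values count as `True`.

```python
print(bool(0), bool(1))
print(bool(''), bool('text'))
print(bool([]), bool([0]))
print(bool(None))

measurements = []   # hypothetical empty list
if measurements:
    print('there are measurements')
else:
    print('the list is empty')
```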

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> Given info_dict defined above, write an if statement that tells you in which season you were born
    </span>
</p>

- Spring: from 21 March to 20 June
- Summer: from 21 June to 23 September
- Autumn: from 24 September to 20 December
- Winter: from 21 December to 20 March

#### `for` loop

Often, you need to repeat the same operation a given number of times or iterate over the elements of some sequence.

```python
for <element> in <sequence>:
    <statement>
```
  
`<sequence>` is anything you can iterate over, such as a tuple, a list, a string, or a `range()` object. `<element>` is the element of the sequence that is returned at each iteration.

In [None]:
# range(4) -> [0, 1, 2, 3]
for i in range(4): # range creates a sequence of int 
    print(i)

In [None]:
for letter in 'Python': # a string is a sequence
    print(letter)    

In [None]:
sounds = {'dog': 'woof', 'cat': 'meow'}
for key, value in sounds.items():
    print(key, value) 

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> Print the items of info_dict and, for the birthday, also print the corresponding season
    </span>
</p>

In [None]:
# Code here


### Functions

Sometimes you need to use the same code in different parts of your script. Rewriting the same code every time is inefficient and error-prone (see the difference between [DRY and WET code](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)). To reuse blocks of code without repetition, you can use **functions**. Basically, you wrap code inside a nice parcel and call it when needed.

```python
def function_name(<argument1>, <argument2>, ...):
    <block of code>
    return <output>
```
    

In [None]:
def mean(number1, number2):
    average = (number1 + number2)/2
    return average
    
print(mean(10, 7))
print(mean(214, 34590))
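A function can wrap any block of code, including control flow. As an illustration, the wavelength classification from the `if` section could be packed into a function and reused for several values; the name `classify_wavelength` is an assumption made for this sketch.

```python
def classify_wavelength(wavelength_nm):
    # same thresholds as the `if` example in the Control Flow section
    if wavelength_nm < 400:
        return 'ultraviolet'
    elif wavelength_nm <= 700:
        return 'visible light'
    else:
        return 'infrared'

print(classify_wavelength(350))
print(classify_wavelength(500))
print(classify_wavelength(900))
```

The thresholds are written once, so changing them later means editing a single place.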

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> Build a function that given an integer number outputs the cumulative sum (ex. if it takes 5, it should output the value that is equal to 0+1+2+3+4)
    </span>
</p>

In [None]:
def cumsum(x):
    # block of code here
    return y

### Importing modules

By default, `python` does not make every useful tool available. Many are contained inside packages of code called **modules** that you need to load explicitly. You can do it with the `import` statement, followed by the name of the module.  

In [None]:
# By running this cell you import the `math` and `os` modules
import math
import os

In Python, modules are treated as **objects** that contain other objects. To access their content, write the name of the module, append a dot `.`, and then the name of the content: `module.content`

In [None]:
# here you are using the value `pi` and the function 'sqrt' both contained in the module `math`
print(math.pi, math.sqrt(121))

In [None]:
# here you are using the function `listdir` contained in the module `os`
os.listdir() # returns the list of the files' names in the current directory

You can also import a specific object from a module. In this case you do not need to specify the module before using it.

In [None]:
# here you are importing a specific function `randint` from a module `random`
from random import randint

In [None]:
randint(1, 6) # returns random integer in range [1, 6]
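Both forms of import also accept the keyword `as`, which lets you choose a shorter or more convenient name. In the sketch below the aliases `m` and `root` are arbitrary choices made for the example.

```python
import math as m               # `m` is an alias chosen for this example
from math import sqrt as root  # an imported object can be renamed too

print(m.sqrt(121))
print(root(121))
```

Both calls use the same function, reached through two different names.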

### NumPy

[`numpy`](https://numpy.org/) is a third-party package (meaning it is neither developed nor maintained by the [Python Software Foundation](https://www.python.org/psf/)) for **numerical computing** (**Num**erical **Py**thon package), and it is almost a standard in the Python community.

It is an open-source package belonging to the *SciPy ecosystem* ([SciPy.org](https://www.scipy.org/index.html)), a collection of open-source software for scientific computing in Python.

`numpy` provides a multidimensional array object (arrays, matrices, etc.), various derived objects, and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

In [33]:
import numpy as np # standard abbreviation for numpy

#### `numpy` `ndarray`

`ndarray` (n-dimensional array) is the main object in `numpy` and it is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.

Example:
```python
np.array([[1., 2., 3.],
          [4., 5., 6.]])
```
This is a 2-dimensional array (a matrix) that has 2 elements on the first dimension (2 rows) and 3 elements on the second dimension (3 columns). In `numpy` each dimension is called an *axis*.
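To get a feeling for what an axis is, you can build the matrix above with `np.array` and sum its elements along each axis. This is only a sketch; the name `matrix` is chosen for the example.

```python
import numpy as np

matrix = np.array([[1., 2., 3.],
                   [4., 5., 6.]])

print(matrix.shape)        # (2, 3): 2 rows, 3 columns
print(matrix.sum(axis=0))  # sum along axis 0 (down the rows): one value per column
print(matrix.sum(axis=1))  # sum along axis 1 (across the columns): one value per row
```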

In [34]:
# generate a random 2D array with 2 rows and 3 columns (i.e., with shape (2,3))
a = np.random.rand(2, 3)
a

array([[0.29153198, 0.89089432, 0.88330406],
       [0.74959376, 0.93998535, 0.107906  ]])

In [35]:
print('number of dimensions:', a.ndim)
print('shape:', a.shape)
print('type:', a.dtype)

number of dimensions: 2
shape: (2, 3)
type: float64


`ndarray`s can have an arbitrary number of dimensions.

In [36]:
# generate 1D array containing integers from 0 to 23
b = np.arange(24)
print(b)
print(f'ndim: {b.ndim}, shape: {b.shape}, type: {b.dtype}')

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
ndim: 1, shape: (24,), type: int32


In [39]:
b = b.reshape(2, 4, 3)
print(b)
print(f'ndim: {b.ndim}, shape: {b.shape}, type: {b.dtype}')

[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]
  [ 9 10 11]]

 [[12 13 14]
  [15 16 17]
  [18 19 20]
  [21 22 23]]]
ndim: 3, shape: (2, 4, 3), type: int32


##### basic operations

The same arithmetic operators we have seen for numeric types apply to arrays **elementwise**.

In [40]:
# create a 1D array
a = np.arange(12)
print('a:', a)

# multiply an array by a scalar
b = 2*a
print('b:', b)

# sum of 2 arrays with the same shape
c = a + b
print('c:', c)

a: [ 0  1  2  3  4  5  6  7  8  9 10 11]
b: [ 0  2  4  6  8 10 12 14 16 18 20 22]
c: [ 0  3  6  9 12 15 18 21 24 27 30 33]


In [41]:
for i in a:
    print(i)
    

0
1
2
3
4
5
6
7
8
9
10
11


Unlike in many matrix languages, the product operator `*` operates elementwise in `numpy`. The matrix product can be performed using the `@` operator.

In [43]:
A = np.arange(6).reshape(2,3)
B = 2*np.ones((2,3), dtype=int)
A, B

(array([[0, 1, 2],
        [3, 4, 5]]),
 array([[2, 2, 2],
        [2, 2, 2]]))

In [44]:
# elementwise multiplication
C = A * B
C

array([[ 0,  2,  4],
       [ 6,  8, 10]])

In [45]:
# matrix multiplication (dot product)
D = A @ B.T    # B transposed to match dimensions
D

array([[ 6,  6],
       [24, 24]])
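The matrix product only works when the inner dimensions agree: an array of shape `(n, k)` can be multiplied by one of shape `(k, m)`, and the result has shape `(n, m)`. A short sketch with the same `A` and `B` as above:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)      # shape (2, 3)
B = 2 * np.ones((2, 3), dtype=int)  # shape (2, 3)

print((A @ B.T).shape)  # (2, 3) @ (3, 2) -> (2, 2)
print((A.T @ B).shape)  # (3, 2) @ (2, 3) -> (3, 3)

try:
    A @ B               # (2, 3) @ (2, 3): inner dimensions 3 and 2 differ
except ValueError as error:
    print('shapes do not match:', error)
```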

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> 1) Build a random 1x3 vector and a random 4x3 matrix 2) compute a dot product between them 3) compute the squared sum of the elements in the result
    </span>
</p>

In [None]:
# Code here


##### Indexing

`numpy` arrays support basic indexing and slicing, plus new fancy ways of doing it.

In [None]:
a = np.arange(10)**3
a

In [None]:
a[9]

In [None]:
a[2:5]  # Note: 2 is included and 5 excluded (standard Python indexing)

In [None]:
a[0:6:2]

In [None]:
a[:6:2] = 1000 # equivalent to `a[0:6:2] = 1000`
a

In [None]:
a[-1]

In [None]:
a[::-1]

In [None]:
A = np.arange(6).reshape(2,3)
A

In [None]:
A[1, 0]

In [None]:
A[:, ::2]
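Among the "fancy" ways of indexing, two are used very often: indexing with a list of positions and indexing with a boolean mask. The sketch below uses a fresh array named `cubes` (a name picked for this example) so that it does not interfere with the cells above.

```python
import numpy as np

cubes = np.arange(10)**3  # [0, 1, 8, 27, 64, 125, 216, 343, 512, 729]

# indexing with a list of positions
print(cubes[[1, 3, 5]])

# indexing with a boolean mask: keeps the elements where the mask is True
mask = cubes > 100
print(mask)
print(cubes[mask])
```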

<p style='font-size: 22px'>
    <span style='background:#FFCE33'>
        <b> Exercise: </b> 1) generate a 12x6 random matrix and print the elements corresponding to the last 3 rows and the first 2 even columns.
    </span>
</p>

In [None]:
# Code Here

### Matplotlib

[`matplotlib`](https://matplotlib.org/) is a comprehensive library for creating static, animated, and interactive **visualizations** in Python.

Like `numpy`, it is part of the *SciPy ecosystem* and plays the role of the standard basic library for visualization in the Python community.

In [None]:
import matplotlib.pyplot as plt  # standard abbreviation for pyplot

Let's create a basic plot

In [None]:
t = np.arange(100)     # 1D array for time (x-axis in the plot)
x = np.sin(np.pi*t/10) # 1D array for amplitude (y-axis in the plot)

In [None]:
plt.plot(t, x, '-o')

A plot is made of two main elements:
- figure: think of it as the frame of the image that is displayed
- axes: think of it as the plotting area that is displayed inside the figure; a figure can contain more than one axes

`pyplot` provides the `subplots` function, which gives you control over both objects.

In [None]:
fig, ax = plt.subplots()

ax.plot(t, x) # plotting directly from the axis
# adding title and labels on axes
ax.set(xlabel='time [s]', ylabel='amplitude', title='first plot')
ax.grid(True) # adding grid to the background

fig.savefig('first_plot')
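Since a figure can hold several plotting areas, `subplots` also accepts the number of rows and columns of a grid. The following is a minimal sketch with two side-by-side plots; the titles and the choice of a sine and a cosine curve are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(100)

# one figure containing a 1x2 grid; `axes` is an array with 2 elements
fig, axes = plt.subplots(1, 2, figsize=(8, 3))

axes[0].plot(t, np.sin(np.pi*t/10))
axes[0].set(xlabel='time [s]', ylabel='amplitude', title='sine')

axes[1].plot(t, np.cos(np.pi*t/10))
axes[1].set(xlabel='time [s]', ylabel='amplitude', title='cosine')

fig.tight_layout()  # adjust spacing so that labels do not overlap
```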

In [None]:
t = np.arange(0, 10, 0.1)
x = np.sqrt(t) * np.sin(np.pi*t)
y = np.exp(-t**2/10) * np.cos(2*np.pi*t)

fig, ax = plt.subplots()
ax.plot(t, x, label='x') # first line plot
ax.plot(t, y, label='y') # second line plot
ax.legend(loc='upper left') # adding legend to the plot
ax.set(xlabel='time [s]', ylabel='amplitude', title='first plot')
ax.grid(True)

#### Histograms

A histogram is a way to visualize the distribution of numerical data.

In [None]:
n = np.random.normal(1, 2, 100000)               # generate data from normal distribution
fig, ax = plt.subplots(1, 1, figsize=(6, 3))

ax.hist(n, bins=50)              
ax.set_title("Normal histogram")
ax.set_xlim((n.min(), n.max()))
ax.set_ylabel('frequency')
ax.set_xlabel('value');


### Files

#### read text

When reading a text file, `encoding` is the name of the encoding used to decode or encode the file; it can be specified only in text mode.

In [None]:
with open(file='text_file.txt', mode='r') as f:
    txt = f.read()

In [None]:
txt

In [None]:
print(txt)

#### write pickle

We can also save and load files that are not text; these files have to be handled in binary format.

To read or write them, you need to indicate binary mode by adding the character `'b'` to the mode.

In [None]:
import pickle

with open(file='players_team.pkl', mode='wb') as f:
    pickle.dump(players_team, f)  # `players_team` is the dictionary defined earlier

The *pickle* module implements binary protocols for serializing and de-serializing a Python object structure. *“Pickling”* is the process whereby a Python object hierarchy is converted into a byte stream, and *“unpickling”* is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. 

#### read pickle

In [None]:
with open(file='players_team.pkl', mode='rb') as f:
    players_team2 = pickle.load(f)

In [None]:
players_team

In [None]:
players_team2
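The byte stream mentioned above can also be inspected without touching the disk: `pickle.dumps` and `pickle.loads` do the same conversion in memory. A small sketch, using a dictionary named `example_team` that is defined here only for illustration:

```python
import pickle

example_team = {"Kane": "Tottenham", "Salah": "Liverpool"}

data = pickle.dumps(example_team)  # object -> bytes ("pickling")
print(type(data))

restored = pickle.loads(data)      # bytes -> object ("unpickling")
print(restored == example_team)    # same content
print(restored is example_team)    # but a distinct object
```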

With text files, it is sometimes necessary to specify an additional argument, `encoding`, e.g., `encoding='utf-8'`.

`encoding` names the character encoding used to decode or encode the file. 

In [None]:
with open(file='text_file.txt', encoding='utf-8') as f:
    txt = f.read()

In [None]:
print(txt)
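To see why the encoding matters, you can write some text with non-ASCII characters and read it back with two different encodings. This is a sketch under a few assumptions: the sample text and the file name `encoding_example.txt` are invented, and the file is created in a temporary directory so that no existing file is touched.

```python
import os
import tempfile

text = 'caffè, naïve, 20 °C'

# write to a temporary directory so the example does not touch existing files
path = os.path.join(tempfile.mkdtemp(), 'encoding_example.txt')
with open(file=path, mode='w', encoding='utf-8') as f:
    f.write(text)

with open(file=path, mode='r', encoding='utf-8') as f:
    same = f.read()
print(same == text)     # same encoding: the text is recovered

with open(file=path, mode='r', encoding='latin-1') as f:
    garbled = f.read()
print(garbled == text)  # different encoding: the accented characters are garbled
```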