# Introduction to Python

## Outline

1. What _is_ a Jupyter notebook?
2. What _is_ Google Colab?
3. What _is_ python?
4. Environments
5. Matrix Algebra and Parallelization

## Terms

As in many disciplines, there is a simple informal language and a more precise formal one.

Terms shown in **bold** in this notebook are listed here.

| Informal EN | Informale IT | Technical | Description |
| --- | --- | --- | --- |
| File/Code | Codice | Script | A file of python commands which can be run directly |
| Run | Girare | Execute | Read and act on each line of python code |
| Program (python) | App (piton) | Interpreter | Program that reads python code and carries out its commands |
| Folder | Cartella | Directory | Location on a computer |
| Notebook | Notebook | Jupyter Notebook | Document containing both executable code and rich text |

# What _is_ a Jupyter notebook?

<img src="jupyter.png" width="200">


Well, it's this.

A Jupyter¹ notebook combines in one file:
- code that you can run directly
- human-friendly text, tables, and images

Notebooks are great for learning and experiments. Professional, "production" code, however, rarely runs in a notebook.

¹So, Gyove in Italian?

# What _is_ Google Colab?

<img src="colab.jpg" width="200">
<br>
<img src="colab.png" width="700">

Well, it's also this.

Colab (Colaboratory) lets you run the code in a Jupyter notebook on Google's cloud computers instead of your own.

Colab is great for learning and small experiments, but computing resources are very limited.

# What _is_ python?

<img src="python.png" width="500">


It's really two different things:
1. A programming **language** (like C)
2. A computer **program** to run `.py` files written in python

Python is, in fact, named after Monty Python.

## Language

Like other languages, python has a "style guide" and even the short but canonical "Zen of Python":

<img src="zen-of-python.png" width="400">


Below is a valid command in python.

To **run** any **code block** click the arrow on the left of the block.


In [1]:
print('hello!')

hello!


If the **syntax** is wrong the computer won't understand.

In [2]:
print('hello!'

SyntaxError: incomplete input (617254288.py, line 1)

Feel free to experiment with the code blocks; you can't hurt anything.

## Program

<img src="interpreter.png" width="600">

<!-- <figure>
<center>
  <img src="interpreter.png" width="600">
  <figcaption>Image: python.org/downloads/</figcaption>
  </center>
</figure> -->



The program interprets code into actions the computer can take.

Let's:
1. Write a short piece of code using the language python
2. Make it do something using the program python

To create a new python file run the code block below:

In [3]:
!echo "print('hello!')">hello.py

Double-click the file `hello.py` to see the code. We also call that file a **script**.

Now run the file with python, by running the code block below:

In [4]:
!python hello.py

hello!


Now edit the file again and make it say `no pineapple on my pizza!`. Save and re-run the code block above.

### Operations

Math operations, basically.

Let's make two variables to store information and modify later if we want. Comments begin with `#` and don't do anything.

This **assignment** always takes what is on the right side of `=` and puts it into the left side.

In [5]:
R = 0.01  # resistance (ohms)
I = 5     # current (amperes)

In a notebook, simply typing a variable _in the last line_ of a code block displays it.

In [6]:
I

5

We can use mathematical operations like $+,-,/,*,$ and $**$ for exponents.

The notebook "remembers" anything we already ran.

In [7]:
power = I**2 * R
power

0.25
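Two more operators come up often: `//` (division rounded down to a whole number) and `%` (the remainder after division). The sketch below is only an illustration; the names `cells` and `per_row` and their values are made up for this example.

```python
# Hypothetical numbers, chosen only for illustration
cells = 7
per_row = 2

full_rows = cells // per_row   # floor division: 7 // 2 is 3
left_over = cells % per_row    # remainder: 7 % 2 is 1
cube = per_row ** 3            # exponent: 2 ** 3 is 8

print(full_rows, left_over, cube)
```

The remainder operator `%` shows up again later in the flow control example.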

We can use logical comparisons like $<,>,<=,>=,==$ (equal to), $!=$ (not equal to). Also `and`, `not`, and `or`.

In [8]:
power == I

False

In [9]:
(R <= 1) or (R >= 1)

True
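The keywords `and` and `not` work the same way as `or`. A small sketch, reusing the values of `R` and `I` from above (the names `both`, `either`, and `flipped` are made up here):

```python
R = 0.01  # resistance, same value as above
I = 5     # current, same value as above

both = (R < 1) and (I > 1)     # True only if both sides are True
either = (R > 1) or (I > 1)    # True if at least one side is True
flipped = not (R < 1)          # not turns True into False

print(both, either, flipped)
```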

### Objects

Each **object** in python has a **type**, like a **float** (64-bit floating point number).

In [10]:
type(R)

float
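Other values have other types. As a rough illustration, the sketch below checks a few of the objects already used above; `isinstance` is a built-in function that asks whether an object has a given type.

```python
R = 0.01
I = 5

print(type(I))               # a whole number is an int
print(type(R))               # a number with a decimal point is a float
print(type(R < 1))           # a comparison gives a bool (True or False)
print(isinstance(R, float))  # ask directly: is R a float?
```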

#### String

A **string** (type "str") is just text. Many list operations (see below), such as reading by index, also work on a string, but a string cannot be edited in place.

In [11]:
coin = 'Bitcoin'
type(coin)

str
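A few things we can do with a string, as a minimal sketch (the variable names other than `coin` are made up for this example):

```python
coin = 'Bitcoin'

first_letter = coin[0]      # read one character by index (0 is the first)
first_three = coin[:3]      # slice: characters 0, 1 and 2
length = len(coin)          # number of characters
shout = coin + '!'          # concatenation makes a new string

print(first_letter, first_three, length, shout)
```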

#### List

A **list** is an ordered series of objects, such as floats or strings:

In [12]:
coins = ['Bitcoin','Dogecoin','Ethereum'] + ['Litecoin'] # concatenation
coins

['Bitcoin', 'Dogecoin', 'Ethereum', 'Litecoin']

Read an item in the list (0 is the first):

In [13]:
coins[0]

'Bitcoin'

Edit an item:

In [14]:
coins[1] = 'Avalanche'
coins

['Bitcoin', 'Avalanche', 'Ethereum', 'Litecoin']

Add to the list:

In [15]:
coins.append('Dogecoin')
coins

['Bitcoin', 'Avalanche', 'Ethereum', 'Litecoin', 'Dogecoin']

Remove from the list:

In [16]:
coins.remove('Ethereum')
coins

['Bitcoin', 'Avalanche', 'Litecoin', 'Dogecoin']
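Some other common list operations, sketched on the same four coins (the names `count`, `last`, `middle`, and `has_eth` are made up for this example):

```python
coins = ['Bitcoin', 'Avalanche', 'Litecoin', 'Dogecoin']

count = len(coins)               # how many items are in the list
last = coins[-1]                 # a negative index counts from the end
middle = coins[1:3]              # slice: items 1 and 2 (3 is excluded)
has_eth = 'Ethereum' in coins    # membership test, gives True or False

print(count, last, middle, has_eth)
```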

#### Dictionary

A **dictionary** (type "dict") is similar to a list, but instead of index numbers we look up each value with a **key**, which is often a string.

This is maybe one of the most important object types in python.

In [17]:
coin_value = {'Bitcoin':48625.24,'Ethereum':2010.17,'Dogecoin':0.08}
coin_value['Ethereum']

2010.17
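Like a list, a dictionary can be changed after it is created. A minimal sketch; the new prices below are made-up numbers, not real data:

```python
coin_value = {'Bitcoin': 48625.24, 'Ethereum': 2010.17, 'Dogecoin': 0.08}

coin_value['Litecoin'] = 70.0        # add a new key (made-up value)
coin_value['Dogecoin'] = 0.09        # overwrite an existing key (made-up value)
del coin_value['Ethereum']           # remove a key and its value

print(list(coin_value.keys()))       # the keys that are left
print('Ethereum' in coin_value)      # "in" checks the keys
```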

### Flow Control

One of the biggest challenges in programming is controlling what code will run given certain inputs or conditions.

#### For/While

Sometimes we want to run some code many times. 

This `for` loop will run 6 times:

In [18]:
for i in range(6):
    print(i,'hello!'[:i])

0 
1 h
2 he
3 hel
4 hell
5 hello


And this `while` loop keeps running as long as its condition (`total < 6`) is true:

In [19]:
total = 0.0                      # named total so we don't hide python's built-in sum()
while total < 6:
    print(total,'hello!'[:int(total)])
    total = total + 0.9

0.0 
0.9 
1.8 h
2.7 he
3.6 hel
4.5 hell
5.4 hello
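A `for` loop can also walk directly through the items of a list, with no index numbers needed. A small sketch, assuming the list of coins from earlier (the names `lengths` and `position` are made up here):

```python
coins = ['Bitcoin', 'Avalanche', 'Litecoin', 'Dogecoin']

lengths = []
for coin in coins:                        # coin takes each item in turn
    lengths.append(len(coin))
print(lengths)

for position, coin in enumerate(coins):   # enumerate also gives the index
    print(position, coin)
```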


#### If.. Else

Other times we want code to run only when a condition is met.

This `if`/`elif`/`else` statement does.. well can you figure it out? Run it several times :)

In [20]:
import time                     # we'll cover "import" later

seconds = time.time()
crystal_ball = seconds % .01    # % gives the remainder after division

if crystal_ball < 0.005:        # A) check this first
    print('Buy bitcoin!')
elif crystal_ball > 0.005:      # B) if not (A), check this
    print('Sell bitcoin!')
else:
    print('Hold bitcoin!')      # C) if neither (A) nor (B), do this

Sell bitcoin!
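To see all three branches without waiting for the clock, here is a sketch that feeds hand-picked values through the same checks. The three test values and the `results` list are made up for this example.

```python
results = []
for crystal_ball in [0.001, 0.005, 0.009]:   # hand-picked test values
    if crystal_ball < 0.005:                 # A) checked first
        results.append('Buy')
    elif crystal_ball > 0.005:               # B) checked only if (A) is False
        results.append('Sell')
    else:                                    # C) only when exactly 0.005
        results.append('Hold')

print(results)
```

The `else` branch is reached only when the value is exactly `0.005`, which is why the time-based version above almost never says to hold.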


### Functions

Code we use often can be saved in a function.

```mermaid
%%{init: {"flowchart": {"curve": "basis"}} }%%

graph LR
    r(resistance) -->|input1| c(calculate_power)
    i(current) -->|input2| c
    c -->|output| p(power)
```

In [21]:
def calculate_power(resistance, current):
    return current**2 * resistance

In [22]:
calculate_power(resistance=0.01, current=5)

0.25
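Inputs can also be passed by position, and an input can be given a default value. In the sketch below, the default `current=5` is an assumption added for illustration; it is not part of the function defined above.

```python
def calculate_power(resistance, current=5):
    """Power lost in a conductor: current squared times resistance."""
    return current**2 * resistance

by_position = calculate_power(0.01, 10)   # inputs matched by their order
with_default = calculate_power(0.01)      # current falls back to 5

print(by_position, with_default)
```

Naming the inputs, as in `calculate_power(resistance=0.01, current=5)`, takes more typing but makes it harder to swap them by accident.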

### Classes

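A class bundles data together with the functions that work on that data. Using classes often makes cleaner code, resulting in fewer mistakes.

Let's **comment** the class with docstrings (text between `"""` marks, which can span multiple lines).

Also let's declare the **types** of the input and output data. Python does not enforce these type hints, but they document the code and help tools catch mistakes.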

In [23]:
class Conductor():
    def __init__(self, resistance:float):
        """ Conductor useful calculations

        Args:
            resistance (float): resistance of the line in ohms
        """
        self.resistance = resistance

    def calc_loss(self,current:float)->float:
        """ Calculate the line loss

        Args:
            current (float): current of the line in amperes

        Returns:
            float: power lost in the line in watts
        """
        return current**2 * self.resistance

First we need to create a new **object**. We say it is a `Conductor` **type**.

In [24]:
line1 = Conductor(resistance=0.01)
type(line1)

__main__.Conductor

Then we can use the object and all its **methods** (the `def` blocks inside the class, not to be confused with ordinary functions).

In [25]:
line1.calc_loss(current=5)

0.25

In [26]:
line1.calc_loss(current=10)

1.0
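Each object keeps its own data, so one class can describe many lines. A sketch with a second, hypothetical conductor (`line2` and its resistance are made up); the class is repeated here in shortened form so the block runs on its own.

```python
class Conductor:
    def __init__(self, resistance: float):
        self.resistance = resistance       # stored separately for each object

    def calc_loss(self, current: float) -> float:
        return current**2 * self.resistance

line1 = Conductor(resistance=0.01)
line2 = Conductor(resistance=0.02)         # hypothetical second line

losses = []
for line in [line1, line2]:                # same method, different stored data
    losses.append(line.calc_loss(current=5))

print(losses)
```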

# Environments

<img src="packages.png" width="400">

## Module

Remember `hello.py`? There we had to write or copy-paste all the code.

What if your colleague gives you some `.py` files? You can use them directly with `import` -- no copy-paste.

Run the code below to create a new file called `hello_module.py`. 

In [27]:
!> hello_module.py
!echo "def say_hello():" >> hello_module.py
!echo "    print('hello from a module!')" >> hello_module.py

Double-click on `hello_module.py` to see inside.

Note that running this file on its own prints nothing: it defines the function, but contains no code to call it. That's what makes it a **module** and not a **script**.

Now let's modify `hello.py` to use this module.

Double-click on `hello.py` and add one new line at the top and one at the bottom, as below:

```python
import hello_module
print('hello!')
hello_module.say_hello()
```

Run the code below.

In [28]:
!python hello.py

hello!
hello from a module!


Now look at each line in `hello.py` and be sure to understand it:
1. Find `hello_module.py`
2. Print "hello!"
3. Call the `say_hello()` function in `hello_module.py`
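The same `import` statement works for modules that come with python. The sketch below uses the built-in `math` module (chosen here only as an example) to show three common ways of writing an import.

```python
import math                    # whole module, used as math.sqrt
from math import sqrt          # just one name taken from the module
import math as m               # whole module under a shorter alias

print(math.sqrt(16), sqrt(16), m.sqrt(16))
```

All three calls reach the same function; which form to use is mostly a matter of readability.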

## Packages

```mermaid
graph LR
subgraph packageA
    m1(<b>module1.py</b><br>code..<br>code..)
    m2(<b>module2.py</b><br>code..<br>code..)
end
```

We want access to large "libraries" of code written by companies, universities, and the open source community. For example there is Google's machine learning tool `tensorflow`. 

These groups develop **packages** which are just folders containing a few to many modules. We always use packages, often without realizing it. Vanilla python would be incredibly boring without them.

`pip` will show us all the installed packages right now on the computer connected to the notebook. You will see `tensorflow` and some entertaining names like `weasel` and `kiwisolver`.

In [29]:
!pip list

Package                   Version
------------------------- ---------------
anyio                     4.3.0
archspec                  0.2.3
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
arrow                     1.3.0
asttokens                 2.4.1
async-lru                 2.0.4
attrs                     23.2.0
Babel                     2.14.0
beautifulsoup4            4.12.3
bibtexparser              1.4.1
bleach                    6.1.0
boltons                   23.1.1
Brotli                    1.1.0
cached-property           1.5.2
certifi                   2021.10.8
cffi                      1.16.0
charset-normalizer        2.0.7
colorama                  0.4.6
comm                      0.2.1
conda                     24.1.2
conda-libmamba-solver     24.1.0
conda-package-handling    2.2.0
conda_package_streaming   0.9.0
contourpy                 1.2.0
cycler                    0.12.1
daiquiri                  3.2.3
dcor                      0.6
debugpy         
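The installed version of a single package can also be looked up from inside python. This is a minimal sketch using the standard `importlib.metadata` module; the helper `installed_version` and the fake package name are made up for this example.

```python
from importlib.metadata import version, PackageNotFoundError

def installed_version(package):
    """Return the installed version as text, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

print(installed_version('pip'))                          # depends on your computer
print(installed_version('surely-not-a-real-package-0'))  # made-up name
```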

## Environment

<img src="conda.png" width="600">

The **environment** is what it sounds like: all the packages and other things around when we run code (for example, a solver, which isn't a package but a separate program that python can access). Often we don't have one fixed environment on a computer, but a few different _virtual_ environments for different projects.

We tend to use many different packages (and each package uses many) and they are all constantly being updated. So very quickly a problem arises where we need to manage which packages we have, and which version of each.

Managing environments is difficult and there are many competing approaches. The best approach may change depending on which packages are used. For instance, `tensorflow` can usually be managed with either of two tools: Anaconda or `pip`.
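When something behaves strangely, a useful first step is to check which python and which environment the notebook is actually using. A small sketch with the standard `sys` module; the output will differ on every computer.

```python
import sys

print(sys.executable)        # path of the python program running this code
print(sys.prefix)            # folder of the environment in use
print(sys.version_info[:3])  # python version as (major, minor, patch)
```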


# Matrix Algebra and Parallelization

![gpu](gpu.gif)

Machine learning and AI, at their cores, involve lots of linear algebra, especially matrix multiplication and addition.

For example, the forward pass of a 6-unit artificial neural network with an input vector of length 24 requires multiplying the input by a weight matrix $A$ with dimension $24\times6$ and adding the result to a bias vector of length $6$.

A normal computer's **CPU** will do each of $24\times6=144$ multiplications in series. For large matrices (large models have billions of parameters) this would be far too slow, especially for training when the multiplication is done many, many times.
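To make the count concrete, here is a rough sketch of that forward pass in plain python, done one multiplication at a time. The values in `x`, `A`, and `b` are made up, and using a bias vector `b` with one entry per unit is an assumption of this example.

```python
inputs, units = 24, 6
x = [1.0] * inputs                           # input vector, length 24 (made-up values)
A = [[0.5] * units for _ in range(inputs)]   # weight matrix, 24 x 6 (made-up values)
b = [0.1] * units                            # one bias per unit (assumption)

multiplications = 0
y = []
for j in range(units):                       # one output per unit
    total = b[j]
    for i in range(inputs):                  # one multiplication per input
        total += x[i] * A[i][j]
        multiplications += 1
    y.append(total)

print(len(y), multiplications)
```

The counter ends at 144, matching $24\times6$. In practice a library does this work, but the number of multiplications stays the same.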

In contrast, a **GPU** is a specialized chip which does much of the matrix multiplication in parallel. The first GPUs were developed for graphics, which also involves algebra with large matrices and needs to be fast. Now the foremost manufacturer, NVIDIA, is known as a supplier of critical technology for AI development.

There are many layers of code which have to work perfectly with the GPU hardware to perform well. GPUs are expensive and power hungry, and CPU multi-threading is improving quickly. However, GPU parallelization can make model training 5-20x faster.

Parallelization means splitting a large computing job into parts that run at the same time, often across many CPUs or GPUs.
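The idea can be sketched in plain python: each unit's output depends only on its own column of weights, so the six outputs can be handed to separate workers. This is only an illustration of splitting a job, using the standard `concurrent.futures` module with made-up values. In standard python, threads mostly take turns on this kind of arithmetic, so do not expect it to be faster; real speedups come from GPU libraries or multiple processes.

```python
from concurrent.futures import ThreadPoolExecutor

inputs, units = 24, 6
x = [1.0] * inputs                           # made-up input values
A = [[0.5] * units for _ in range(inputs)]   # made-up weights, 24 x 6

def unit_output(j):
    """Weighted sum for unit j; it does not depend on any other unit."""
    return sum(x[i] * A[i][j] for i in range(inputs))

# In series: one unit after another
serial = [unit_output(j) for j in range(units)]

# Split up: each unit is handed to a worker
with ThreadPoolExecutor(max_workers=units) as pool:
    parallel = list(pool.map(unit_output, range(units)))

print(serial == parallel)
```

Both versions give the same numbers; only the way the work is scheduled changes.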



# Fun

In [30]:
import antigravity



Opening in existing browser session.
