# Introduction to Python

## Outline

1. What _is_ a Jupyter notebook?
2. What _is_ Google Colab?
3. What _is_ python?
4. Environment
5. Matrix Algebra and Parallelization


## Learning Objectives

- The function of python code and python interpreter
- Basic python syntax for variables, flow control, and using packages
- Find, import, and use a package
- Understand the problem of large matrix math and how parallelization helps

## Terms

Text in **bold** will appear here.

Like in many areas of tech and science, there is a simple informal language and a more precise formal one.

| Informal EN | Informale IT | Technical | Description |
| --- | --- | --- | --- |
| File/Code | Codice | Script, source | A file of python commands which can be run directly |
| Folder | Cartella | Directory | Location on a computer |
| Run | Girare | Execute | Read and act on each line of python code |
| Program (python) | App (pitone) | Interpreter (python) | The program which compiles and runs python code on a computer |
| Notebook | Notebook | Jupyter Notebook | Document containing both executable code and rich text |
| Is | È | Assignment | Saving data in a variable |
| -- | -- | Type | Python objects always have a type, like a list or float |
| -- | -- | Object | A data type which contains both data and code |
| -- | -- | Function | Section of code that can be run simply by calling the function name |
| -- | -- | Class | A user-defined object type which contains both data and code (methods) |
| -- | -- | Method | A function inside a class |
| -- | -- | Initialize | Create a new object from a class |
| -- | -- | Module | A code file which isn't run directly but is imported into your main script |
| -- | -- | Package | A folder containing many related modules, like `tensorflow` |


# What _is_ a Jupyter notebook?

<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/jupyter.png?raw=true" width="200">


Well, it's this!

A Jupyter **notebook** combines together in one file:
- Human-friendly text, tables, and images
- **Code** that you can **run** directly

Notebooks are great for learning and experiments.

"Production" software (e.g. Chrome) is usually not in a notebook.

This is a text, or "Markdown", block.

In [1]:
# this is a code block
# click the arrow on the left to run it
6*7

42

# What _is_ Google Colab?

<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/colab.jpg?raw=true" width="200">
<br>
<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/colab.png?raw=true" width="700">

Well, it's also this!

Colab (Colaboratory) lets you run the code in a Jupyter notebook on Google's cloud computers.

It's great for learning and small experiments.

But you can't run code for many hours, especially if using a GPU.

# What _is_ python? 

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/a/a3/Monty_Python_Live_02-07-14_12_56_41_%2814415567757%29.jpg/1280px-Monty_Python_Live_02-07-14_12_56_41_%2814415567757%29.jpg" width="400">

_From the Ministry of Silly Walks by Monty Python_

<!-- <img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/python.png?raw=true" width="300"> -->

<!-- <img src="https://upload.wikimedia.org/wikipedia/commons/8/8e/Monty_Python_Live_02-07-14_13_04_42_%2814598710791%29.jpg" width="300"> -->


It's really two different things:
1. A programming **language** (like C)
2. An **interpreter** program to run `.py` files written in python

## Language

Python code can read almost like English:

```python
while max(y) < 10 and iteration < 10:
    if x in valid_inputs:
        y.append(x)
    else:
        raise ValueError
    iteration += 1
```

And when used with **objects**, chaining one method after another, complex operations can quickly make sense.

```python
data_upsampled = data_raw.dropna().resample('1h').interpolate(method='linear')
```

Below is a valid command in python, which prints the canonical "Zen of Python."

In [2]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


If the syntax (sintassi) is wrong the computer won't understand.

In [3]:
print('hello!'

SyntaxError: incomplete input (617254288.py, line 1)

Try to fix the above command. What's missing?

Feel free to experiment with the code. You can't hurt anything.

### Operations

Math operations, basically.

An **assignment** always takes what is on the right side of `=` and saves it in the variable on the left side.

`I` _is_ `5`.

In [4]:
I=5     # current
R=0.01  # resistance

Comments begin with `#` and don't do anything.

In a notebook, simply typing a variable _in the last line_ of a code block displays it.

In [5]:
# I is 5
I

5

Arithmetic: $+,-,/,*,$ and $**$ for exponents.

The notebook remembers `I` and `R`, and anything else we already ran above.

In [6]:
power = I**2 * R
power

0.25

Logic: $<,>,<=,>=,==$ (equal to), $!=$ (not equal to).

Also `and`, `not`, and `or`.

In [7]:
power == I

False

In [8]:
(R <= 1) or (R >= 1)

True
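
The blocks above only try `==` and `or`. As a minimal sketch, the same variables can be combined with `and` and `not` too. The names `is_small` and `is_lossy` are made up for this illustration.

```python
I = 5      # current
R = 0.01   # resistance

power = I**2 * R

# "and" is True only when both sides are True
is_small = (R > 0) and (R < 1)

# "not" flips True to False and False to True
is_lossy = not (power < 0.1)

print(is_small, is_lossy)
```

Try changing `R` to `2` and guess the result before running it.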

### Objects

Each **object** in python has a **type**, like a float (64-bit floating point number).

In [9]:
type(R)

float

#### String

A string (type "str") is just text.

In [10]:
coin = 'Dogecoin'
type(coin)

str

Many list operations (see below), such as reading an item by its index, also work on a string.

In [11]:
coin[0]

'D'
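
A few more string operations, as a small sketch. It also shows one thing a string can _not_ do: unlike a list, it cannot be edited in place.

```python
coin = 'Dogecoin'

print(len(coin))       # number of characters
print(coin[0:4])       # slice: characters 0, 1, 2, 3
print('coin' in coin)  # is this text inside the string?

# Unlike a list, a string cannot be edited in place
try:
    coin[0] = 'd'
except TypeError as err:
    print('TypeError:', err)
```

To "change" a string you create a new one, for example `coin.lower()`.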

#### List

A list is an ordered series of objects, usually of the same type:

In [12]:
coins = ['Dogecoin','Bitcoin','Ethereum'] + ['Litecoin'] # concatenate
coins

['Dogecoin', 'Bitcoin', 'Ethereum', 'Litecoin']

Read an item in the list (0 is the first):

In [13]:
coins[0]

'Dogecoin'

Edit an item:

In [14]:
coins[1] = 'Avalanche'
coins

['Dogecoin', 'Avalanche', 'Ethereum', 'Litecoin']

Add to the list:

In [15]:
coins.append('Dogecoin')
coins

['Dogecoin', 'Avalanche', 'Ethereum', 'Litecoin', 'Dogecoin']

Remove from the list:

In [16]:
coins.remove('Ethereum')
coins

['Dogecoin', 'Avalanche', 'Litecoin', 'Dogecoin']

It's easy to do a small operation on every item in a list:

In [17]:
coins_length = [len(coin) for coin in coins]
coins_length

[8, 9, 8, 8]
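
The one-line form above is called a list comprehension. As a sketch, here is the same result written as an explicit loop (loops are covered under Flow Control), plus a comprehension that keeps only some items. The name `long_names` is made up for this example.

```python
coins = ['Dogecoin', 'Avalanche', 'Litecoin', 'Dogecoin']

# The same result as the one-liner, written as an explicit loop
coins_length = []
for coin in coins:
    coins_length.append(len(coin))

# A comprehension can also keep only the items that pass a test
long_names = [coin for coin in coins if len(coin) > 8]

print(coins_length)
print(long_names)
```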

#### Dictionary

This is maybe the most powerful and common standard python object.

It's a kind of list with three main differences:
- The index is usually string *keys* instead of integer *index* values
- In each item of the dictionary you can put whatever you want
- We use `{` instead of `[` (but still `[` to access items)

In [18]:
coin_value = {'Bitcoin':48625.24,'Ethereum':2010.17,'Dogecoin':0.08}
coin_value['Dogecoin']

0.08
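
Dictionaries can be edited much like lists. A short sketch of the most common operations follows; the new prices are made-up values, only for illustration.

```python
coin_value = {'Bitcoin': 48625.24, 'Ethereum': 2010.17, 'Dogecoin': 0.08}

coin_value['Litecoin'] = 70.0    # add a new key (made-up value)
coin_value['Dogecoin'] = 0.09    # overwrite an existing key (made-up value)

print('Bitcoin' in coin_value)                 # check whether a key exists
print(coin_value.get('Avalanche', 'unknown'))  # fallback if the key is missing

# Loop over every key and its value
for name, value in coin_value.items():
    print(name, value)
```

Asking for a missing key with `coin_value['Avalanche']` raises a `KeyError`, which is why `.get()` is handy.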

### Flow Control

A big challenge in programming is controlling what code runs given input X or condition Y.

For example, you might want some code to run only once a day at midnight.

#### For

Sometimes we want to run some code many times. 

`range(N)` creates a sequence of `N` integers starting at `0` and ending at `N-1`.

This `for` loop will run 6 times:

In [19]:
for i in range(6):
    print(i,'hello!')

0 hello!
1 hello!
2 hello!
3 hello!
4 hello!
5 hello!
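
`range` also accepts a start value and a step. A quick sketch, wrapping each range in `list()` only so the numbers can be printed:

```python
print(list(range(6)))         # 0 up to (not including) 6
print(list(range(2, 6)))      # start at 2 instead of 0
print(list(range(0, 10, 3)))  # count in steps of 3
```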


Often we want to access each item `in` something:

In [20]:
for s in 'hello!':
    print(s)

h
e
l
l
o
!


And if you want the index `i`:

In [21]:
for i,s in enumerate('hello!'):
    print(i,s)

0 h
1 e
2 l
3 l
4 o
5 !


#### If.. Else

Other times we want to only run based on a condition.

In [22]:
if 5<10:
    print('its true!')

its true!


Can you make sense of this `if/elif/else` code? Run it several times :)

In [23]:
import time                     # we'll cover "import" later

seconds = time.time()
crystal_ball = seconds % .01    # % gives the remainder after division

if crystal_ball < 0.005:        # A) check this first
    print('Buy bitcoin!')
elif crystal_ball > 0.005:      # B) if not (A), check this
    print('Sell bitcoin!')
else:
    print('Hold bitcoin!')      # C) if neither (A) nor (B), do this

Buy bitcoin!
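
Because the block above depends on the clock, it is hard to see every branch. One way to check the logic, sketched here, is to put the same three branches in a function (covered in the next section) and feed it fixed values. The name `advise` is hypothetical, chosen only for this example.

```python
def advise(crystal_ball):
    """Return the 'advice' for a given crystal ball value (hypothetical helper)."""
    if crystal_ball < 0.005:      # A) checked first
        return 'Buy bitcoin!'
    elif crystal_ball > 0.005:    # B) checked only if (A) is False
        return 'Sell bitcoin!'
    else:                         # C) only when the value is exactly 0.005
        return 'Hold bitcoin!'

for value in [0.001, 0.009, 0.005]:
    print(value, advise(value))
```

Notice how rarely branch (C) would be reached with a real clock value.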


### Functions

Code we use often can be saved in a function and then *called*.

<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/function.png?raw=true" width="600">

In [24]:
def calculate_power(resistance, current):
    return current**2 * resistance

Now call the function with inputs:

In [25]:
calculate_power(resistance=0.01, current=5)

0.25
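
Inputs can be passed by name (as above) or by position, and an input can have a default value. The sketch below is a variation of the function above; the default `current=5` is an assumption added only to show the syntax.

```python
def calculate_power(resistance, current=5):
    """Return the power I^2 * R; `current` defaults to 5 if not given."""
    return current**2 * resistance

p_default = calculate_power(0.01)                          # uses the default current
p_position = calculate_power(0.01, 10)                     # by position: order matters
p_keyword = calculate_power(current=10, resistance=0.01)   # by name: order is free

print(p_default, p_position, p_keyword)
```

Passing inputs by name is longer to type but makes it harder to mix up resistance and current.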

### Classes

Classes are a kind of super function, often making cleaner code with fewer mistakes.

Usually we **initialize** the class with some data or attributes.

Then we can call functions specific to the class, which we call **methods**.

In [26]:
class Conductor():
    def __init__(self, resistance):
        self.resistance = resistance
    def calc_loss(self,current):
        return current**2 * self.resistance

Let's comment the class (with `"""` for multiple lines).

Also let's define the types of the input and output data. This also reduces mistakes.

In [27]:
class Conductor():
    """ Conductor class """
    def __init__(self, resistance:float):
        """ Conductor useful calculations

        Args:
            resistance (float): resistance of the line in ohms
        """
        self.resistance = resistance

    def calc_loss(self,current:float)->float:
        """ Calculate the line loss

        Args:
            current (float): current of the line in amps

        Returns:
            float: power loss in watts
        """
        return current**2 * self.resistance

First we need to initialize a new object from the class. We say it is a `Conductor` type.

In [28]:
line1 = Conductor(resistance=0.01)
type(line1)

__main__.Conductor

Then we can use the object and any of its `def` methods.

In [29]:
line1.calc_loss(current=5)

0.25

In [30]:
line1.calc_loss(current=10)

1.0
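
The benefit of a class shows when there are several objects: each one remembers its own data. A minimal sketch, repeating a shortened `Conductor` so the block runs on its own; `line2` and its resistance are made-up values.

```python
class Conductor():
    """ Conductor class (shortened copy, so this block runs alone) """
    def __init__(self, resistance: float):
        self.resistance = resistance

    def calc_loss(self, current: float) -> float:
        return current**2 * self.resistance

# Two objects from one class, each with its own resistance
line1 = Conductor(resistance=0.01)
line2 = Conductor(resistance=0.02)

# Same method, different stored data, different result
losses = [line.calc_loss(current=10) for line in [line1, line2]]
print(losses)
```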

## Interpreter

<img src="https://www.dunebook.com/wp-content/uploads/2019/10/image1-1.png" width="600">

<!-- <figure>
<center>
  <img src="interpreter.png" width="600">
  <figcaption>Image: python.org/downloads/</figcaption>
  </center>
</figure> -->



The python program interprets code into actions the computer can take.

Notebooks use the interpreter but it's hidden. Magic!

Just once you should write a `.py` file and run the interpreter directly.

#### 1. Write code file

1. Click the folder icon on the far left of your screen. It's an empty rectangle.

2. In the side bar area right click, select `New File`, and name it `hello.py`. 

3. Double click `hello.py` to open it. Write `print('hello!')` and save (Ctrl+S).

#### 2. Run with interpreter

Run the code block below. The `!` sends the rest of the line to the command line of the Google cloud computer.

The command `python` runs the interpreter and `hello.py` is the file we want to run.

In [31]:
!python hello.py

python: can't open file 'c:\\Users\\Admin\\Code\\PV-based-systems-lab\\hello.py': [Errno 2] No such file or directory


The result is the same as typing `print('hello!')` into Colab.

The difference is we only used two things: a text file and the python interpreter.
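
The same two steps can also be done from Python itself. This is only a sketch of the idea, not a replacement for the steps above: it writes the script into a temporary folder (an assumption made here so no existing file is overwritten) and then starts a second interpreter to run it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Step 1: write the one-line script into a temporary folder
folder = Path(tempfile.mkdtemp())
script = folder / 'hello.py'
script.write_text("print('hello!')\n")

# Step 2: run it with the interpreter
# sys.executable is the path of the python program running this code
result = subprocess.run(
    [sys.executable, str(script)],
    capture_output=True,
    text=True,
)
print(result.stdout)
```

If the file path is wrong, `result.stderr` holds the interpreter's "can't open file" message.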

# Environment

<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/packages.png?raw=true" width="400">

## Module

In `hello.py` we had to write (or copy-paste) all the code by hand.

What if your colleague gives you a `.py` file? Or you find something online?

You can use the code directly, or integrate it with your own code using `import`.

Follow the steps below to create a new file called `hello_module.py`.

#### 1. Write a module file

1. In the side bar area right click, select `New File`, and name it `hello_module.py`. 
2. Double click `hello_module.py` to open it.
3. Write the following two lines and save (Ctrl+S).
```python
def say_hello():
    print('hello from a module!')
```

Note that the code will not run: there is the function, but no code to call it.

That's what makes it a **module** and not a script.

#### 2. Run in the notebook

Usually importing a module is unspectacular.

In [32]:
import hello_module

But now you can run our function from the module.

In [33]:
hello_module.say_hello()

hello from a module!
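
How does `import` find the file? Python looks through the folders listed in `sys.path`. The sketch below shows this by writing the same module into a temporary folder (an assumption made here so the example runs anywhere) and adding that folder to the search list.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write the module file into a temporary folder
folder = Path(tempfile.mkdtemp())
(folder / 'hello_module.py').write_text(
    "def say_hello():\n"
    "    print('hello from a module!')\n"
)

# Python searches the folders in sys.path when it sees "import"
sys.path.insert(0, str(folder))
importlib.invalidate_caches()

import hello_module
hello_module.say_hello()
```

In Colab the notebook's working folder is already searched, which is why the plain `import hello_module` above works without this extra step.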


## Packages

<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/package-module.png?raw=true" width="400">

We want access to all the large "libraries" of code written by the open source community.

For example there is Google's machine learning tool `tensorflow`. 

These groups develop **packages** which are just folders containing a few to many modules.

We always use packages, often without realizing it. Otherwise vanilla python would be very boring.

`pip` will show us all the installed packages right now on the computer connected to the notebook.

You will see `tensorflow` and some entertaining names like `weasel` and `kiwisolver`.

In [34]:
!pip list

Access is denied.


Now let's find, install, and use a new package.

[meteostat](https://pypi.org/project/meteostat/) looks interesting!

> NB: Packages almost always have example code to get started

In [35]:
!pip install meteostat

Access is denied.


Let's run the example code, but change the location to Milan and the year to 2024.

In [37]:
# Import Meteostat library and dependencies
from datetime import datetime
import matplotlib.pyplot as plt
from meteostat import Point, Daily

# Set time period
start = datetime(2024, 1, 1)
end = datetime(2024, 12, 31)

# Create Point for Milan, Italy
location = Point(45.503, 9.162)

# Get daily data for 2024
data = Daily(location, start, end)
data = data.fetch()

# Plot line chart including average, minimum and maximum temperature
data.plot(y=['tavg', 'tmin', 'tmax'])
plt.show()

URLError: <urlopen error unknown url type: https>

`pip` will now show the new package, `meteostat`.

In [38]:
!pip list

Access is denied.
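
Installed packages can also be checked from inside Python with the standard `importlib.metadata` module. A small sketch follows; the helper name `installed_version` and the deliberately fake package name are made up for this example.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

for name in ['pip', 'meteostat', 'not-a-real-package-xyz']:
    print(name, installed_version(name))
```

A version number means the package is installed in the current environment. `None` means it is not.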


## Environment

<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/conda.png?raw=true" width="600">

The **environment** is what it sounds like: all the installed packages and other code we might use.

Managing the environment is hard. Why?
- We tend to use many different packages (and each package uses many sub-packages)
- Also the packages are constantly being updated, and they aren't always compatible
- So we need to manage which version of which package we have, and "solve" compatibility problems

There are different approaches to this problem.
- For example with the `tensorflow` package we often use two environment tools:
  - `conda`
  - `pip`

Last, we don't often have one fixed environment on a computer, but different _virtual_ environments.
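
To see which environment a notebook or script is using, you can ask the interpreter. This is a rough sketch: the comparison below detects environments made with Python's standard `venv` tool, and `conda` environments are organized differently and may report `False`.

```python
import sys

# In a venv, sys.prefix points at the environment folder, while
# sys.base_prefix points at the python installation it was made from
in_virtual_env = sys.prefix != sys.base_prefix

print('interpreter :', sys.executable)
print('environment :', sys.prefix)
print('virtual env?:', in_virtual_env)
```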


# Matrix Algebra and Parallelization


Machine learning and AI, at their cores, involve lots of linear algebra, especially matrix multiplication and addition.

For example, a single-layer neural network with 24 units and 96 input values:
- Each forward pass requires a $24\times96$ matrix multiplication
- A single-thread CPU will do each of the $24\times96=2304$ multiplications in series
- However a GPU will do many of the columns in parallel, taking a fraction of the time 
- The neural network training requires _many_ forward passes and calculation of error gradients
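
The arithmetic in the list above can be sketched in plain Python. This assumes the forward pass is a $24\times96$ weight matrix times a vector of 96 inputs; the weights and inputs are random made-up numbers, and `unit_output` is a hypothetical helper name.

```python
import random

n_units, n_inputs = 24, 96

# Made-up weights and inputs, only for illustration
random.seed(0)
weights = [[random.random() for _ in range(n_inputs)] for _ in range(n_units)]
inputs = [random.random() for _ in range(n_inputs)]

def unit_output(row):
    """Weighted sum for one unit: 96 multiplications, independent of other units."""
    return sum(w * x for w, x in zip(row, inputs))

# Serial version: one unit after another, like a single CPU thread
outputs = [unit_output(row) for row in weights]

# The units do not depend on each other, so the order does not matter
outputs_reversed = [unit_output(row) for row in reversed(weights)][::-1]

n_multiplications = n_units * n_inputs
print(len(outputs), n_multiplications, outputs == outputs_reversed)
```

Since no unit needs the result of another, all 24 sums could in principle be computed at the same time. That independence is what parallel hardware exploits.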

The GPU advantage
- Large machine learning models can have millions of parameters and GB of training data
- In many realtime applications the model needs to calculate outputs in milliseconds
- Training time can be days or weeks
- The benefit of GPU parallelization can be 5-20x faster model training
- The cost is the expense of hardware, complexity, and energy

Then vs now
- Original GPUs were developed for graphics processing (also fast, large matrix multiplication)
- Meanwhile CPU "clusters" were common at labs and universities


- Now the foremost manufacturer, NVIDIA, is on the critical path for AI development
- There are many layers of code which have to work perfectly with the GPU hardware to perform well
- However new CPUs have multiple cores for _some_ parallel operation


<img src="https://github.com/woodjmichael/PV-based-systems-lab/blob/main/gpu.gif?raw=true" width="600">

# Code Project

If we have time we'll make a new module with the conductor class and then import it into Colab.