# Python Packages

## Programming Fundamentals (NB25)

### MIEIC/2019-20

#### Ricardo Cruz

INESC TEC

# About me

My name is Ricardo Cruz. I am doing a PhD in Informatics, and I am working with machine learning and computer vision.

If you are interested in neural networks and related topics, I recommend doing your Master's thesis with **Professor Jaime S. Cardoso**.

<img src="images/25/insulator.jpg">

# Python Packages

We will focus today on packages for scientific programming.

<img src="images/25/diagram.svg">

\* the ones to be discussed today

# NumPy

We have been creating matrices using lists of lists.

For example, we would write $$M=\begin{pmatrix}1&2&3\\4&5&6\end{pmatrix}$$

as

In [None]:
M = [
    [1, 2, 3],
    [4, 5, 6]
]
M

In [None]:
M = [
    [1, 2, 3],
    [4, 5, 6]
]

What do I do if I want to select the **first row**?

In [None]:
# demonstration

In [None]:
M = [
    [1, 2, 3],
    [4, 5, 6]
]

What do I do if I want to select the **third column**?

<img width="200" height="200" src="images/25/hard.png">

It isn't as easy, huh?

In [None]:
# demonstration
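One way the two selections could look with plain lists (a minimal sketch, not necessarily what was shown live; the names `first_row` and `third_column` are just for illustration):

```python
M = [
    [1, 2, 3],
    [4, 5, 6]
]

# A row is simply one of the inner lists.
first_row = M[0]

# A column has to be collected element by element, one from each row.
third_column = [row[2] for row in M]

print(first_row)     # [1, 2, 3]
print(third_column)  # [3, 6]
```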

# NumPy to the rescue

In [None]:
M = [
    [1, 2, 3],
    [4, 5, 6]
]

# demonstration
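A sketch of the same selections once the list of lists is converted to a NumPy array (the variable names are illustrative):

```python
import numpy as np

M = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

first_row = M[0]        # same as M[0, :]
third_column = M[:, 2]  # every row, column index 2

print(first_row)     # [1 2 3]
print(third_column)  # [3 6]
```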

# YES !!

<img src="images/25/yes.jpg">

Furthermore, copying is simple: `M.copy()` copies the whole array, so we avoid the aliasing problems of nested lists. (Slices such as `M[0]` are still views of the same data, not copies.)
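A small sketch comparing copies of nested lists with copies of arrays. It also shows one caveat worth keeping in mind: a slice of an array is a view, so writing to it changes the original.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]
shallow = nested.copy()      # copies only the outer list
shallow[0][0] = 99
print(nested[0][0])          # 99: the inner lists are shared

A = np.array([[1, 2, 3], [4, 5, 6]])
C = A.copy()                 # copies all the numbers
C[0, 0] = 99
print(A[0, 0])               # 1: A is untouched

V = A[:, 2]                  # a slice is a view, not a copy
V[0] = -1
print(A[0, 2])               # -1: writing to the view changed A
```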

# NumPy
## How to create arrays

| Type | Example |
|:-|:-|
| From lists | `np.array(M)` |
| Zeros  | `np.zeros((2, 3))` |
| Ones   | `np.ones((2, 3))` |
| Random uniform | `np.random.rand(2, 3)` |
| Random gaussian | `np.random.randn(2, 3)` |

(Why does NumPy call them *arrays*?)

In [None]:
# demonstration
# (show .shape)
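A possible version of this demonstration, creating one array of each kind from the table and printing its shape (a sketch; the single-letter names are arbitrary):

```python
import numpy as np

M = [[1, 2, 3], [4, 5, 6]]

A = np.array(M)            # from a list of lists
Z = np.zeros((2, 3))       # the shape is passed as one tuple
O = np.ones((2, 3))
U = np.random.rand(2, 3)   # uniform in [0, 1); sizes are separate arguments
G = np.random.randn(2, 3)  # standard normal (mean 0, variance 1)

for name, arr in [('A', A), ('Z', Z), ('O', O), ('U', U), ('G', G)]:
    print(name, arr.shape, arr.dtype)
```

Note that `np.zeros` and `np.ones` take the shape as a tuple, while `np.random.rand` and `np.random.randn` take each size as a separate argument.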

# NumPy
## Indexing & slicing

The difference in slicing between a Python list and a NumPy array is that we can index several axes at the same time:

In [None]:
a = np.zeros((5, 10, 2, 7))
a.shape

In [None]:
b = a[:, 3:10, -1, :]
b.shape

Notice that indexing with negative values, such as `a[-2]`, is supported, as in Python lists.
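To see where the resulting shape comes from, here is the same slice annotated axis by axis, followed by a sketch of what the equivalent selection would require with nested lists (`nested` and `b_list` are illustrative names):

```python
import numpy as np

a = np.zeros((5, 10, 2, 7))
b = a[:, 3:10, -1, :]

# axis 0: ':'    keeps all 5 entries
# axis 1: '3:10' keeps indices 3..9, that is, 7 entries
# axis 2: '-1'   picks a single index, so this axis disappears
# axis 3: ':'    keeps all 7 entries
print(b.shape)  # (5, 7, 7)

# With nested lists, the same selection needs explicit loops.
nested = a.tolist()
b_list = [[row[-1] for row in block[3:10]] for block in nested]
print(np.array(b_list).shape)  # (5, 7, 7)
```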

Let's create a matrix of zeros with a square of ones:

$$\begin{pmatrix}0&0&0&0&0\\0&1&1&1&0\\0&1&1&1&0\\0&1&1&1&0\\0&0&0&0&0\\\end{pmatrix}$$

In [None]:
# demonstration
M = ...
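One way to build this matrix is to start from zeros and assign to a slice (a minimal sketch):

```python
import numpy as np

M = np.zeros((5, 5))
M[1:4, 1:4] = 1   # rows 1..3 and columns 1..3
print(M)
```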

In [None]:
import matplotlib.pyplot as plt
plt.imshow(M)
plt.show()

Let's create a circle:

In [None]:
# demonstration
M = ...
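A possible approach for the circle, assuming we mark every pixel whose distance to the center is at most the radius. The image size, center, and radius below are hypothetical values chosen for illustration.

```python
import numpy as np

size = 101                     # hypothetical image size
cy, cx = size // 2, size // 2  # center of the circle
radius = 30                    # hypothetical radius

# yy[i, j] == i and xx[i, j] == j: the coordinates of every pixel.
yy, xx = np.mgrid[:size, :size]

# Boolean mask of the pixels inside the circle, converted to 0.0 / 1.0.
M = ((xx - cx)**2 + (yy - cy)**2 <= radius**2).astype(float)
```

No loops are needed: the comparison is evaluated for all pixels at once.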

In [None]:
import matplotlib.pyplot as plt
plt.imshow(M)
plt.show()

## What is an image?

A color image is simply three matrices, one per color channel: <b><span style="color:red">R</span><span style="color:green">G</span><span style="color:blue">B</span></b>.

```python
R = ...
G = ...
B = ...
```

Let's draw a red circle.

In [None]:
# demonstration (use np.stack)
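A sketch of the red circle: the circle mask goes into the red channel, the other two channels stay at zero, and `np.stack` joins them along a new last axis. The size and radius are hypothetical values.

```python
import numpy as np

size, radius = 101, 30   # hypothetical values
yy, xx = np.mgrid[:size, :size]
circle = (xx - size // 2)**2 + (yy - size // 2)**2 <= radius**2

R = circle.astype(float)   # 1.0 inside the circle, 0.0 outside
G = np.zeros((size, size))
B = np.zeros((size, size))

# Stack the three matrices along a new last axis: (height, width, 3).
image = np.stack([R, G, B], axis=-1)
print(image.shape)  # (101, 101, 3)
```

An array of this shape with float values between 0 and 1 can be passed directly to `plt.imshow`.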

# NumPy
## Aggregate operations

| Type | Example |
|:-|:-|
| Sum | `np.sum(array, axis)` |
| Average  | `np.mean(array, axis)` |
| Standard deviation   | `np.std(array, axis)` |

In [None]:
# demonstration
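A sketch of how the `axis` argument behaves: it names the axis that is collapsed by the aggregation.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

print(np.sum(A))           # 21: no axis, so everything is aggregated
print(np.sum(A, axis=0))   # [5 7 9]: collapses the rows, one value per column
print(np.sum(A, axis=1))   # [ 6 15]: collapses the columns, one value per row
print(np.mean(A, axis=0))  # [2.5 3.5 4.5]
print(np.std(A, axis=1))   # standard deviation of each row
```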

# NumPy
## Arithmetic operations

| Type | Example |
|:-|:-|
| element-wise addition | `A+B` |
| element-wise product | `A*B` |
| matrix multiplication | `np.dot(A, B)` |

In [None]:
# demonstration
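A sketch contrasting the element-wise product with matrix multiplication on two small matrices (the values are arbitrary). The `@` operator is an equivalent spelling of matrix multiplication for 2D arrays.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

print(A + B)         # rows: [11 22] and [33 44]
print(A * B)         # rows: [10 40] and [90 160] (element by element)
print(np.dot(A, B))  # rows: [70 100] and [150 220] (rows times columns)
print(A @ B)         # same result as np.dot(A, B)
```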

## Arithmetic broadcasting

NumPy supports arithmetic broadcasting:

In [None]:
A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[1, 2, 3]])
print('A:', A.shape, 'B:', B.shape)

In [None]:
A*B

## Arithmetic broadcasting

NumPy aligns the shapes of A and B starting from the last axis. Two axes are compatible when they have the same size or when one of them has size 1; the axis of size 1 is then automatically (implicitly) repeated to match the other.
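A sketch of what the implicit repetition amounts to: the product gives the same result as repeating `B` by hand with `np.repeat`. NumPy only stretches axes of size 1, so a pair of sizes such as 3 and 2 is rejected.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
B = np.array([[1, 2, 3]])             # shape (1, 3)

implicit = A * B
explicit = A * np.repeat(B, 2, axis=0)  # B copied into shape (2, 3) by hand
print(implicit)
print(np.array_equal(implicit, explicit))  # True

# Sizes 3 and 2 differ and neither is 1, so this pair is rejected.
try:
    A * np.array([[1, 2]])
except ValueError as err:
    print('ValueError:', err)
```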

This is a time-saver but can also be a source of bugs. "Explicit is better than implicit." Not in NumPy. :P
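An illustrative example of such a bug (the variable names and values are hypothetical): subtracting a column of shape `(3, 1)` from a vector of shape `(3,)` does not fail. It silently produces a 3×3 matrix.

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])        # shape (3,)
y_pred = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1), a column

error = y_true - y_pred
print(error.shape)        # (3, 3), not (3,)
print(np.mean(error**2))  # not zero, although the values are identical

# Flattening the column first gives the intended element-wise difference.
fixed = y_true - y_pred.ravel()
print(np.mean(fixed**2))  # 0.0
```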

*Other languages:* MATLAB R2016b added this feature, which Octave already supported. Julia supports broadcasting but requires explicit annotation (the dot syntax, as in `A .* B`).

# NumPy application

<img src="images/25/nowwhat.jpg">

1. Face recognition?
1. Background replacement? [[wiki](https://en.wikipedia.org/wiki/Foreground_detection)]

[change slide]

The main Python package for computer vision is **OpenCV**:
* It is quite old (first released in 2000)
* The primary languages are C and C++
* Python is the main secondary language
* OpenCV is a bit archaic, but it has many features.

In [None]:
import cv2
import numpy as np

cap = cv2.VideoCapture(0)

while True:
    _, frame = cap.read()

    ...
    ...
    
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == 27:  # Esc
        break

cap.release()
cv2.destroyAllWindows()
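As a rough idea of how background replacement could work on each frame, here is a NumPy-only sketch that uses synthetic frames instead of a webcam. It assumes a simple approach: compare each frame against a reference image of the empty scene and keep only the pixels that changed. The function `replace_background`, the threshold of 30, and the test frames are all hypothetical; this is not necessarily what was done in the live demonstration.

```python
import numpy as np

def replace_background(frame, background, new_background, threshold=30):
    """Keep pixels that differ from `background`; fill the rest from `new_background`."""
    # Cast to int so that subtracting uint8 values cannot wrap around.
    difference = np.abs(frame.astype(int) - background.astype(int))
    # A pixel is foreground if any of its color channels changed enough.
    foreground = difference.max(axis=-1) > threshold
    # The mask gets a trailing axis so that it broadcasts over the 3 channels.
    return np.where(foreground[..., None], frame, new_background)

# Synthetic 4x4 frames standing in for the camera.
background = np.full((4, 4, 3), 100, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200                        # an "object" enters the scene
beach = np.zeros((4, 4, 3), dtype=np.uint8)  # stand-in for the new background

result = replace_background(frame, background, beach)
print(result[:, :, 0])
```

Inside the capture loop, `frame` would come from `cap.read()` and `background` would be a frame saved while nobody is in front of the camera.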

# Pandas

<img src="images/25/pandas.jpg">

<img src="images/25/diagram.svg">

Working with matrices in NumPy is better than using lists, right?

But it could be even better. Pandas builds on NumPy and Matplotlib:

1. You can easily import/export to CSV or Excel
1. You can access columns and rows using names
1. It has functions to merge matrices according to a certain column
1. It has functions to group and aggregate by categories
1. It has functions to work with time series
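A small sketch of three of these features (named columns, merging, and grouping), using made-up data. The student names, grades, and classes are hypothetical.

```python
import pandas as pd

# Hypothetical data, for illustration only.
grades = pd.DataFrame({
    'student': ['ana', 'bruno', 'carla', 'duarte'],
    'grade': [15.0, 12.0, 18.0, 10.0],
})
classes = pd.DataFrame({
    'student': ['ana', 'bruno', 'carla', 'duarte'],
    'class': ['A', 'A', 'B', 'B'],
})

print(grades['grade'])                        # a column, selected by name
merged = grades.merge(classes, on='student')  # join on a shared column
by_class = merged.groupby('class')['grade'].mean()
print(by_class)
```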

Matrices in Pandas are called **data frames**, as in R and other statistical languages.

Why use Python when other languages exist for statistics? Python is a general-purpose programming language, so it is easier to interoperate with other software and to deploy in real systems.

# Pandas
## Analyze student grades

First, let us get your grades: https://moodle.up.pt/grade/report/grader/index.php?id=2126

In [None]:
from pandas_ods_reader import read_ods
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [None]:
df = read_ods('FEUP-EIC0005-20192020-1S Pauta.ods', 1)

In [None]:
df.columns

In [None]:
# Keep only the columns we need
df = df[['Endereço de e-mail', 'Total da unidade curricular (Real)']]

In [None]:
df

In [None]:
df.columns = ['email', 'nota']
df

In [None]:
df['nota']

In [None]:
np.mean(df['nota'])

In [None]:
df['nota'] = df['nota'].replace('-', np.nan)

In [None]:
np.mean(df['nota'])
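The `'-'` entries are text, so they must be turned into missing values before averaging. An alternative sketch, assuming the column mixes numbers with `'-'` placeholders, is `pd.to_numeric` with `errors='coerce'`, which converts anything that is not a number into `NaN`. The grades below are made up.

```python
import pandas as pd

# Hypothetical grades; '-' marks a student without a grade.
nota = pd.Series([12.0, '-', 15.5, '-', 9.0])

numeric = pd.to_numeric(nota, errors='coerce')  # unparseable values become NaN

print(numeric.isna().sum())  # 2
print(numeric.mean())        # NaN entries are skipped by default
```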

In [None]:
df

In [None]:
# LE=10%
# RE=10%
# PE=50%
# TE=30%
df['nota'] = df['nota'] / 0.7

In [None]:
df

In [None]:
df['nota'].plot(kind='hist')

In [None]:
plt.hist(df['nota'])

In [None]:
df.boxplot('nota')

<img src="images/25/whatnow.jpg">

I suggest we either:

1. Break grades by class
1. Break grades by gender
1. Break grades by student year.

# Conclusion

We have learned about:
1. NumPy
1. Pandas
1. Matplotlib
1. OpenCV

### Questions?

<img src="images/25/diagram.svg">

We should also talk about **SymPy**.

## SymPy

We have been working with numbers directly. NumPy is a **numerical** package.

On the other hand, SymPy is an **analytical** or **symbolic** package. It can manipulate mathematical abstractions.

* It can simplify mathematical expressions
* It can solve equations
* It can do differentiation and integration (anti-differentiation)
* It can plot mathematical expressions.
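A minimal sketch of the first three capabilities. The expressions are arbitrary examples; note that `x` has to be declared as a symbol before it can be used.

```python
from sympy import symbols, simplify, solve, diff, integrate, sin, cos

x = symbols('x')

print(simplify(sin(x)**2 + cos(x)**2))  # 1
print(solve(x**2 - 1, x))               # the two roots, -1 and 1
print(diff(x**2, x))                    # 2*x
print(integrate(x**2, x))               # x**3/3
```

SymPy omits the constant of integration in the result of `integrate`.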

## Solve equations

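In [None]:
from sympy import symbols, solve, diff, integrate
x = symbols('x')  # x must be declared as a symbol first
solve(x**2 - 1)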

## Derivatives & antiderivatives

In [None]:
diff(x**2)

In [None]:
integrate(x**2)

# What now?

1. Website for doing derivatives
1. GUI for doing derivatives

<img src="images/25/ui.png">

## How the Internet works, in one image

<img src="images/25/internet.svg">

## User interfaces

Major user interface packages for Python:
1. **Tkinter (tk):** a bit ugly, but comes bundled with Python!
1. **ttk:** themed (prettier) widgets for Tkinter; it also comes bundled with Python, as `tkinter.ttk`
1. **GTK:** one of the most used toolkits on Linux (also works on Windows and Mac)
1. **Qt:** cross-platform toolkit
1. **wxWidgets:** wrapper around each platform's native toolkit (available in Python as wxPython).

# Ticket to leave

## Moodle activity

[LE25: Python frameworks](https://moodle.up.pt/mod/quiz/view.php?id=49618)

$\Rightarrow$ 
[Go back to the Table of Contents](00-contents.ipynb)

$\Rightarrow$ 
[Read the Preface](00-preface.ipynb)