## Reading the 100 page ML book

1. Don't give up ...
2. Realize that it takes time. 
3. Therefore: take the time!
4. Ask questions when you don't understand
  * Suggestion: make a list of questions
5. Don't give up: you will only read a few excerpts of chapters.

# Machine learning wheel

<img src = "images/MLWheel.png" style="width:50%">

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

## Machine learning vs. AI vs. statistics

![](https://www.explainxkcd.com/wiki/images/d/d3/machine_learning.png)

## Intelligence

Definition: **Adaptation in the face of change** (Steven Pinker)

* We want things that adapt to our problems, and that can **predict the future**

## Statistics

* A very handy discipline to predict the future
* **Models** the real world into mathematical objects (lines, curves, etc.)
* Deals in likelihoods (80% chance of living beyond 60 years of age, etc.) and large populations (requires lots of data)
* We will use this extensively

## Machine learning (ML)

Definition: **Systems that continually improve**

* Why not just statistics?
  * Because statistical models are idealised
  
* Machine learning models are **highly** parameterised
  * $10^{6}$ knobs

## Artificial intelligence (AI)

* ... It sounds cool

* Seriously, AI can be defined as **intelligent agents**
  * *Agents that adapt in the face of change*

* Disappointingly, AI is still very stupid
  * But we will work with models that adapt to data, gaining power of prediction and generalization (contrary to statistics)

Google's DeepMind AI, an example of learning: https://www.youtube.com/watch?v=gn4nRCC9TwQ

## Using math

* No AI without ML, no ML without statistics, no statistics without math

* Yep that's right. If you want to be good at AI, you **have** to understand the math!
  * Luckily, AI is fun, so math can be fun too

* Math is many things
  * Basic algebra
  * Linear algebra
  * Probabilities
  * Calculus

# Math foundations

* Understanding "mathy" notation
* Working with simple linear algebra
* Basic probability tools
* Reading the 100 page ML book!

## Understanding mathy notation

* Math is essentially code. You can convert everything in math into Python

Math: $w = 1$

Python: 
```python
w = 1
```

Math: $f(x) = x$

Python:
```python
def f(x):
    return x

# The same function written as a lambda expression
f = lambda x: x
```

Math: $m = \begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}$

Python:
```python
m = np.array([-2, 0, 1])
```

Math: $v = \begin{bmatrix} -2 & 0 & 1 \\ 5 & -4 & 2 \\ 7 & -1 & 3\end{bmatrix}$

Python:
```python
v = np.array([[-2, 0, 1], [5, -4, 2], [7, -1, 3]])
```

So, to be consistent...

## Your turn

* Create a new Jupyter Notebook
* Make sure you have `numpy` installed by typing
  * `import numpy as np`
  
  
* Write $f(x) = x ^ 2$
* Write $g(x) = {1 \over 10^{-x}}$
* Write $h(x) = g(x) + 2$
* Write $v = \begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}$
* Write $y = h(v)$
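Once you have tried the exercises yourself, here is one possible way to write them, as a sketch rather than the only answer. It converts the input to floats inside `g`, because NumPy refuses to raise integers to negative integer powers, which `10 ** -x` would otherwise hit for an integer vector.

```python
import numpy as np

def f(x):
    return x ** 2

def g(x):
    # Work in floats: NumPy raises an error for integer ** negative integer.
    x = np.asarray(x, dtype=float)
    return 1 / 10.0 ** (-x)

def h(x):
    return g(x) + 2

v = np.array([-2, 0, 1])
y = h(v)   # h is applied to every element of v
print(y)
```

Note that calling a function on a NumPy array applies it to each element, so `y` is again a vector with three elements.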

## Creating lists of stuff with Numpy

Math: $v = [0, 1, \dots, 99]$

Python:
```python
v = np.arange(0, 100)
```
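One detail worth checking: `np.arange(start, stop)` leaves out the stop value. The small sketch below shows this, and how to include 100 if that is what you want.

```python
import numpy as np

a = np.arange(0, 100)   # the stop value 100 is excluded
b = np.arange(0, 101)   # raise the stop by one to include 100

print(a[0], a[-1], len(a))   # 0 99 100
print(b[0], b[-1], len(b))   # 0 100 101
```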

In [None]:
np.arange(0, 100)

## Plotting stuff with Matplotlib

In [None]:
plt.plot(np.arange(0, 100))

In [None]:
xs = np.arange(0, 100)
ys = []
for x in xs:
    ys.append(x ** 2)
plt.plot(xs, ys)

In [None]:
xs = np.arange(0, 100)
ys = [x ** 2 for x in xs]
plt.plot(xs, ys)
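A third option, sketched here as an alternative to the loop and the list comprehension, is to let NumPy do the work: arithmetic on an array is applied to every element. The names `ys_loop` and `ys_vec` are only used for this comparison.

```python
import numpy as np

xs = np.arange(0, 100)

ys_loop = [x ** 2 for x in xs]   # element by element, as above
ys_vec = xs ** 2                 # NumPy squares every element at once

# Both approaches give the same values
print(np.array_equal(ys_loop, ys_vec))   # True
```

Either result can be passed to `plt.plot(xs, ...)` in the same way.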

## Your turn!

* Make sure you have matplotlib installed by typing `import matplotlib.pyplot as plt`
* To view the graphs *inline* in your notebook add:
  * `%matplotlib inline` (yes, with a `%`)
  
  
* Plot $f(x) = x^2$
* Plot $g(y) = y \cdot \pi$
* Plot $h(z) = g(z) \cdot \cos(z)$

## A dimension

* A vector, $\textbf{v}$, of length $n$ "lives" in a vector space of $n$ dimensions. You all know this
  * The length $n$, `len(v)`, is the number of elements in that vector
  * Think about a vector in a 2D vector space: $\textbf{v} = [x, y]$, and in 3D: $\textbf{v} = [x, y, z]$
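In NumPy this can be checked directly. The sketch below uses made-up values; note that the `ndim` attribute counts array axes, which is a different thing from the dimension of the vector space.

```python
import numpy as np

v2 = np.array([3, 4])      # a vector in 2D: [x, y]
v3 = np.array([3, 4, 5])   # a vector in 3D: [x, y, z]

print(len(v2), len(v3))    # 2 3
print(v3.shape)            # (3,)
print(v3.ndim)             # 1: the array has one axis, even though it lives in 3D
```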

## A feature vector

"A feature vector is a vector in which each dimension $j = 1, \dots , D$ contains a value that describes the example somehow."

* "A feature vector is a vector in which..." 
  * This means that we have an unnamed vector that we're trying to define
* "each dimension $j = 1, \dots , D$"
  * Now we know we have a variable $j$ and that $j$ contains a number of things
  * The thing starts with the 1 and ends with D -- Oh, a second variable!
  * What is D? ... Could be 100; let's say it's 100
  * So D is the number of dimensions, then D must be the length! And if D is 100 then the unnamed feature vector is 100 elements long
* "contains a value that describes the example"
  * Ok, so each thing inside the unnamed vector (the thing with number $1, \dots, D$) helps us describe one piece of example data
  * Ok, that must mean that element $1$ is describing something different than element $2$
  * ... I don't understand much more, let's move on!

"That value is called a *feature* and is denoted as $x^j$"

* Ok, so now we have a name for our list called $x$
* We also heard that each single element (the ones with numbers $1, \dots, D$) is called a **feature**
  * And each element is uniquely identifiable by $j$: $x^j$
  * Hey, I've seen that before! That's an index!
  * Ahhhh, easy, so this just means that I can look at an element in the vector by its index!
  * ... Oh, and that element is then for some reason called a *feature*

In [None]:
# We have a vector x of 1, ..., D elements
D = 100
x = np.arange(1, D + 1)

# Apparently we can look up a certain element in the list given j:
j = 1
x[j]

In [None]:
# Be careful with indexing!
print(x[0])
print(x[1])

A **feature vector** is then simply a list where we can look an element up by its index.

Math: $x^1$

Python:
```python
x[0]
```

Math: $x^{100}$

Python: 
```python
x[99]
```
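The shift by one between the book's index and Python's index is easy to get wrong. One way to keep it in one place is a small lookup function; `feature` is a hypothetical helper written for this illustration, not something from the book or from NumPy.

```python
import numpy as np

D = 100
x = np.arange(1, D + 1)

def feature(x, j):
    """Return x^j using the book's 1-based index (hypothetical helper)."""
    if not 1 <= j <= len(x):
        raise IndexError(f"j must be between 1 and {len(x)}")
    return x[j - 1]   # Python counts from 0, the book counts from 1

print(feature(x, 1))     # 1
print(feature(x, 100))   # 100
```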


## Feature vectors as descriptions of things

Let's say we have a Person class:

```python

class Person:
    
    def __init__(self, age, height):
        self.age = age
        self.height = height
```

A Person is described by two **features**: `age` and `height`

* So a **feature vector** for a person would have 2 dimensions
  * $x^1 = $ age
  * $x^2 = $ height
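As a rough sketch, turning a `Person` into such a vector could look like this. The function `to_feature_vector` and the example values are made up for illustration.

```python
import numpy as np

class Person:
    def __init__(self, age, height):
        self.age = age
        self.height = height

def to_feature_vector(person):
    # Hypothetical helper: x^1 = age, x^2 = height
    return np.array([person.age, person.height])

p = Person(age=30, height=180)   # made-up example values
x = to_feature_vector(p)

print(x[0])     # x^1, the age
print(x[1])     # x^2, the height
print(len(x))   # 2 dimensions
```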

## Feature vectors in high dimensions

* Imagine an image with the resolution `1024 x 800`
  * How many pixels does that image have?

* That image has `1024 * 800 = 819200` features!
```python
x[0]      # First feature
x[819199] # Last feature
```

* But wait a moment! That count assumes a monochrome image, where each pixel is a single bit (0/1)
  * For an RGB image, with three values per pixel, the number of features is instead `1024 * 800 * 3 = 2457600`!
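These counts can be checked with NumPy. The sketch below assumes the image is stored as an array of rows by columns (plus a channel axis for RGB) and uses all-zero placeholder pixels instead of a real image.

```python
import numpy as np

width, height = 1024, 800

mono = np.zeros((height, width), dtype=np.uint8)     # one value per pixel
rgb = np.zeros((height, width, 3), dtype=np.uint8)   # three values per pixel

x_mono = mono.reshape(-1)   # flatten into a 1D feature vector
x_rgb = rgb.reshape(-1)

print(len(x_mono))   # 819200
print(len(x_rgb))    # 2457600
```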
