In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Machine learning & AI

Jacob Trier Frederiksen and Jens Egholm Pedersen

## Jumping right in!

1. Download and install Python (version 3.5+ !)
  1. [https://python.org/downloads](https://python.org/downloads)
2. Make sure you can access `pip` (Python package manager) from the command line
  1. Open a console
  2. Type `pip --version` or `pip3 --version`
3. Install `jupyter`, `keras`, `tensorflow`, `scikit-learn`, `scikit-image` and `matplotlib` by typing:
  1. `pip install jupyter keras tensorflow scikit-learn scikit-image matplotlib`
4. Open a new (!) console and
  1. Clone the GitHub repository at https://github.com/Jegp/keras_deepdream
  2. Enter the directory of the cloned repository
  3. Open a Jupyter notebook server by typing `jupyter notebook`
  4. This should open your browser on `localhost:8888`. If not, go to that URL

5. Execute the Jupyter Notebook cells one by one by pressing play (▶️) or pressing Control+Enter
  1. Some of the cells might take some time to process. Try to figure out what they're doing while you wait
6. Modifying the behaviour of the program
  1. Find a picture of yourself somewhere and place it as a `.jpg` in the folder
  2. Point the variable `original_image` to your new image
  3. Execute all the cells, starting from the one with the variable you just changed (you don't have to execute the ones above)

## Agenda

* A taste of power: dreaming with AI (already done)
* Introduction to the course
* How to approach this course
* Machine learning vs. AI vs. statistics
* Machine learning wheel
* Math primer

## Introduction to the course

The course provides the student with fundamental concepts in math, machine learning and deep learning. The student will acquire knowledge and skills in how to 
1. Identify appropriate business cases.
2. Prepare and extract features from data. 
3. Construct machine learning models over data. 
4. Train, validate and optimise the models.
5. Test prediction strength
6. Deploy models to production. 

## Knowledge 

The student will possess knowledge of: 

* Python programming language constructs and usage of external Python APIs

* Basic linear algebra, regression, optimisation problems and techniques

* Theoretical concepts underlying supervised and unsupervised learning

* The general principles and common pitfalls of machine learning

* Representative business questions as well as machine learning tools and models to answer them

## Skills


The students will be able to: 

* Write Python scripts and programs using common language constructs in the read-eval-print-loop (REPL), "Jupyter Notebooks", as well as stand-alone programs

* Apply the Python libraries Scikit-learn, Numpy and Pandas to programmatically download, preprocess and analyse data

* Construct, train and validate supervised and unsupervised machine learning / deep learning models, using the Scikit-learn and Keras libraries

* Create various types of plots programmatically to share insight into data

## Competencies 

The students will be able to: 

* Recognise and describe possible applications of machine learning

* Compare, appraise and select machine learning models and methods for specific tasks

* Present and visualise data and findings

* Solve real-world data mining and pattern recognition problems by using machine learning techniques

## Platforms

* Moodle for announcements
* Website for course overview and homework
  * https://datsoftlyngby.github.io/soft2019spring
* GitHub for code, presentations and other materials
* Peergrade for hand-ins
  * You will grade each other!
* Menti for class-based questions
  * menti.com

## About us

## Jacob Trier Frederiksen

* Ph.D., Computational Astrophysics (Stockholm, 2008)
* 5+ years of industry experience
* 5+ years of teaching experience

![](images/jacob_location.png)

## Jens Egholm Pedersen

* MSc. Computer Science (IT & cognition)
* ~5 years of industry experience
  * Worked at CERN
* ~3 years of teaching experience
  * Cphbusiness, ITU

![](images/jens_location.png)

## How to approach this course

* Being human
* Cognitive dissonance
* Dunning-Kruger
* Metacognition


# Being human

* Contrary to computers, humans are genius guessing machines

* ... Except when it comes to our own abilities

* **Cognition**: The process of thinking

* **Dissonance**: Lack of agreement

## Cognitive dissonance

**The mental discomfort experienced by a person who simultaneously holds two or more contradictory beliefs, ideas, or values**

* Examples: 
  * Students who are smart, but lazy
  * Teachers who are stuck with old technology, but think they are up-to-date
  

* Discomfort in confrontation
  * Tendency to ignore the dissonance
    * Short-term rewards: Play Fortnite

## Dunning-Kruger effect

* Unskilled and unaware of it: why people fail to recognise their incompetence


<img src="images/dunning-kruger.png"/>

* This course is technical, and it's ok
  * The best strategy is to acknowledge that you have something to learn

* If you are not failing, you are doing it wrong!
  * "The only one who never makes mistakes is the one who never does anything."

# Meta learning

* Learning how to learn
  * "Being aware of and taking control of one’s own learning" - John Biggs

You may have seen Bloom's taxonomy before.  
The important point is that lower levels are prerequisites for higher levels, but higher levels powerfully reinforce lower levels.

<img src="images/animated.gif"/>

* Most importantly: learning is **not** free

# Metacognition

A [study on metacognition](https://phys.org/news/2017-10-metacognition-boosts-gen-chem-exam.html) shows that teaching students about metacognition gives better grades:

    "The students who are successful will ask themselves—what is this question asking me to do? How does that relate
    to what we're doing in class? Why are they giving me this question? If there's an equation, why does this equation
    work? That's the metacognitive part. 
    If they will kick that in, they will see their grades go straight through the roof." 
    - Charles Atwood

## Hacking your grades

* Be aware of how you think
  * How are you currently going about it? Is that the best way?
* **What** are you learning?
  * And how does that fit into the bigger picture? Why is it relevant?
* **How** are you learning?
  * Visual, auditory etc. (modalities)
  * Practical vs theoretical

# Summary

* We prepared a lot for you, and you will likely not find it easy

* You are human: fail fast

* This takes time. Seriously. Take. The. Time

* Think about how you think and learn
  * Is it smart to attend the lectures but never work on it by yourself? 
  * Take notes!
  * (Hint: we won't be there to help you at the exam)

## Machine learning vs. AI vs. statistics

![](https://www.explainxkcd.com/wiki/images/d/d3/machine_learning.png)

## Intelligence

Definition: **Adaptation in the face of change** (Steven Pinker)

* We want things that adapt to our problems, and that can **predict the future**

## Statistics

* A very handy discipline to predict the future
* **Models** the real world into mathematical objects (lines, curves, etc.)
* Deals in likelihoods (80% chance of dying etc.) and large populations (requires lots of data)
* We will use this extensively

## Machine learning (ML)

Definition: **Systems that continually improve**

* Why not just statistics?
  * Because statistical models are idealised
  
* Machine learning models are **highly** parameterised
  * $10^{6}$ knobs

## Artificial intelligence (AI)

* ... It sounds cool

* Seriously, AI can be defined as **intelligent agents**
  * *Agents that adapt in the face of change*

* Disappointingly, AI is still very stupid
  * But we will work with models that adapt to data, gaining power of prediction and generalization (contrary to statistics)

# Machine learning wheel

@cyberomin: "You don't need AI, you need SQL"

<img src = "images/MLWheel.png" style="width:50%">

## Using math

* No AI without ML, no ML without statistics, no statistics without math

* Yep that's right. If you want to be good at AI, you **have** to understand the math!
  * Luckily, because AI is fun, likewise math can be fun

* Math is many things
  * Basic algebra
  * Linear algebra
  * Probabilities
  * Calculus

# Math foundations

* Understanding "mathy" notation
* Working with simple linear algebra
* Basic probability tools
* Reading the 100 page ML book!

## Understanding mathy notation

* Math is essentially code. You can convert everything in math into Python

Math: $w = 1$

Python: 
```python
w = 1
```

Math: $f(x) = x$

Python:
```python
def f(x):
    return x

# Or, equivalently, as an anonymous function
f = lambda x: x
```

Math: $m = \begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}$

Python:
```python
m = np.array([-2, 0, 1])
```

Math: $v = \begin{bmatrix} -2 & 0 & 1 \\ 5 & -4 & 2 \\ 7 & -1 & 3\end{bmatrix}$

Python:
```python
v = np.array([[-2, 0, 1], [5, -4, 2], [7, -1, 3]])
```

So, to be consistent...

![](images/column_row_numpy.png)
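
Independently of the figure, it can help to ask numpy directly how it stores the two objects. The sketch below reuses `m` and `v` from the examples above; `m_column` is a name made up for illustration. Note that a one-dimensional numpy array is neither a row nor a column until you give it a second axis.

```python
import numpy as np

# The vector m and the matrix v from the examples above
m = np.array([-2, 0, 1])
v = np.array([[-2, 0, 1], [5, -4, 2], [7, -1, 3]])

print(m.shape)  # (3,)   -> one axis with 3 elements
print(v.shape)  # (3, 3) -> 3 rows, 3 columns

# Elements are addressed as v[row, column], both counted from 0
print(v[1, 0])  # 5 (second row, first column)

# An explicit column vector needs a second axis (m_column is a hypothetical name)
m_column = m.reshape(3, 1)
print(m_column.shape)  # (3, 1)
```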

## Your turn

* Create a new Jupyter Notebook
* Make sure you have `numpy` installed by typing
  * `import numpy as np`
  
  
* Write $f(x) = x ^ 2$
* Write $g(x) = {1 \over 10^{-x}}$
* Write $h(x) = g(x) + 2$
* Write $v = \begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}$
* Write $y = h(v)$

## Creating lists of stuff with Numpy

Math: $v = [0, \dots, 99]$

Python:
```python
v = np.arange(0, 100)
```

In [None]:
np.arange(0, 100)
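
A quick check of the end points, since `np.arange(start, stop)` follows the same rule as Python's `range` and leaves out `stop`. The names `a` and `b` are only used for this illustration.

```python
import numpy as np

a = np.arange(0, 100)  # 0, 1, ..., 99  (100 is excluded)
b = np.arange(0, 101)  # 0, 1, ..., 100 (100 is included)

print(a[0], a[-1], len(a))  # 0 99 100
print(b[0], b[-1], len(b))  # 0 100 101
```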

## Plotting stuff with Matplotlib

In [None]:
plt.plot(np.arange(0, 100))

In [None]:
xs = np.arange(0, 100)
ys = []
for x in xs:
    ys.append(x ** 2)
plt.plot(xs, ys)

In [None]:
xs = np.arange(0, 100)
ys = [x ** 2 for x in xs]
plt.plot(xs, ys)
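
A third variant, shown here as a sketch: numpy arrays apply arithmetic to every element at once, so the loop can be dropped entirely. The names `ys_loop` and `ys_vec` are made up for the comparison; either result can be handed to `plt.plot(xs, ...)` exactly as in the cells above.

```python
import numpy as np

xs = np.arange(0, 100)

ys_loop = [x ** 2 for x in xs]  # list comprehension, one element at a time
ys_vec = xs ** 2                # numpy squares every element in one go

# Both approaches give the same numbers
print(np.array_equal(ys_loop, ys_vec))  # True
```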

## Your turn!

* Make sure you have matplotlib installed by typing `import matplotlib.pyplot as plt`
* To view the graphs *inline* in your notebook add:
  * `%matplotlib inline` (yes, with a `%`)
  
  
* Plot $f(x) = x^2$
* Plot $g(y) = y \cdot \pi$
* Plot $h(z) = g(z) \cdot \cos(z)$

## Reading the 100 page ML book

1. Don't give up
2. Realise that it takes a looooong time
3. Ask questions when you don't understand
  * Suggestion: make a list of questions

## A dimension

* A vector $\textbf{v}$ of length $n$ "lives" in a vector space of $n$ dimensions (you all know this)
  * The length $n$, `len(v)`, is the number of elements in that vector
  * Think about a vector in a 2D vector space: $\textbf{v} = [x, y]$, and in 3D: $\textbf{v} = [x, y, z]$
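
A minimal sketch of the point above, with made-up example values. As an aside that is not part of the slide: "length" here means the number of elements, which is a different thing from the geometric length (the norm) of the vector, so both are printed for comparison.

```python
import numpy as np

v2 = np.array([3.0, 4.0])       # a vector in 2D: [x, y]
v3 = np.array([1.0, 2.0, 2.0])  # a vector in 3D: [x, y, z]

# Number of elements = number of dimensions
print(len(v2), len(v3))  # 2 3

# Not to be confused with the geometric length (Euclidean norm)
print(np.linalg.norm(v2))  # 5.0
print(np.linalg.norm(v3))  # 3.0
```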

## A feature vector

"A feature vector is a vector in which each dimension $j = 1, \dots , D$ contains a value that describes the example somehow."

* "A feature vector is a vector in which..." 
  * This means that we have an unnamed vector that we're trying to define
* "each dimension $j = 1, \dots , D$"
  * Now we know we have a variable $j$ and that $j$ contains a number of things
  * The thing starts with the 1 and ends with D -- Oh, a second variable!
  * What is D? ... Could be 100; let's say it's 100
  * So D is the number of dimensions, then D must be the length! And if D is 100 then the unnamed feature vector is 100 elements long
* "contains a value that describes the example"
  * Ok, so each thing inside the unnamed vector (the thing with number $1, \dots, D$) helps us describe one piece of example data
  * Ok, that must mean that element $1$ is describing something different from element $2$
  * ... I don't understand much more, let's move on!

"That value is called a *feature* and is denoted as $x^j$"

* Ok, so now we have a name for our list called $x$
* We also heard that each single element (the ones with numbers $1, \dots, D$) is called a **feature**
  * And each element is uniquely identifiable by $j$: $x^j$
  * Hey, I've seen that before! That's an index!
  * Ahhhh, easy, so this just means that I can look at an element in the vector by its index!
  * ... Oh, and that element is then for some reason called a *feature*

In [None]:
# We have a vector x of 1, ..., D elements
D = 100
x = np.arange(1, D + 1)

# Apparently we can look up a certain element in the list given j:
j = 1
x[j]

In [None]:
# Be careful with indexing!
print(x[0])
print(x[1])

A **feature vector** is then simply just a list where we can look an element up by its index.

Math: $x^1$

Python:
```python
x[0]
```

Math: $x^{100}$

Python: 
```python
x[99]
```
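
To keep the off-by-one shift in one place, the translation from the book's 1-based $x^j$ to Python's 0-based `x[j - 1]` can be wrapped in a small function. This is only a sketch: `feature` is a hypothetical helper, not something from the book or from numpy.

```python
import numpy as np

D = 100
x = np.arange(1, D + 1)  # x^1 = 1, ..., x^D = 100

def feature(x, j):
    """Return x^j using the book's 1-based index j (hypothetical helper)."""
    if not 1 <= j <= len(x):
        raise IndexError("j must be between 1 and {}".format(len(x)))
    # Math counts from 1, Python counts from 0
    return x[j - 1]

print(feature(x, 1))    # 1
print(feature(x, 100))  # 100
```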


## Feature vectors as descriptions of things

Let's say we have a Person class:

```python

class Person:
    
    def __init__(self, age, height):
        self.age = age
        self.height = height
```

A Person is described by two **features**: `age` and `height`

* So a **feature vector** for a person would have 2 dimensions
  * $x^1 = $ age
  * $x^2 = $ height
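
As a sketch, this is how such a feature vector could be built from a `Person` object. The function `to_feature_vector` and the example values (30 and 180) are invented for illustration.

```python
import numpy as np

class Person:

    def __init__(self, age, height):
        self.age = age
        self.height = height

def to_feature_vector(person):
    # Hypothetical helper: x^1 = age, x^2 = height
    return np.array([person.age, person.height])

# Made-up example values
x = to_feature_vector(Person(age=30, height=180))

print(len(x))  # 2 (two features, so two dimensions)
print(x[0])    # 30  (x^1, the age)
print(x[1])    # 180 (x^2, the height)
```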

## Feature vectors in high dimensions

* Imagine an image with the resolution `1024 x 800`
  * How many pixels does that image have?

* That image has `1024 * 800 = 819200` features!
```python
x[0]      # First feature
x[819199] # Last feature
```

* But wait a moment! That count assumes a monochrome image, where each pixel is a single value (e.g. one bit, 0/1)
  * For an RGB image, each pixel has three values, so the number of features is instead `1024 * 800 * 3 = 2457600`!


* The image that you processed in the beginning of the class had `1602 * 896 * 3 = 4306176` features.
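
The counts for the `1024 x 800` example can be verified by flattening an array into one long feature vector. This sketch uses empty (all-zero) arrays as stand-ins for real images, and assumes the usual `(height, width, channels)` layout; the variable names are made up.

```python
import numpy as np

height, width = 800, 1024

# Stand-in images filled with zeros (assumption: no real image is loaded)
mono = np.zeros((height, width), dtype=np.uint8)    # one value per pixel
rgb = np.zeros((height, width, 3), dtype=np.uint8)  # R, G and B per pixel

# Flatten each image into a one-dimensional feature vector
x_mono = mono.reshape(-1)
x_rgb = rgb.reshape(-1)

print(len(x_mono))  # 819200
print(len(x_rgb))   # 2457600
```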