# Exercise: Deep Learning

Version: SoSe 2022

Estimated time needed: 90 minutes

Author: Clara Siepmann

______

# Objectives

After completing this exercise, you will:

 - have a basic understanding of deep learning algorithms
 - be able to implement a basic deep learning algorithm

**Reminder: Please upload this exercise to [Google Colab](https://colab.research.google.com/) so that you can work on the tasks.**

**Reading material:
Mandatory: Goodfellow et al. (2016): Chapter 1 (Introduction)**

# Task 1
One of the main challenges for artificial intelligence is the question of how to capture informal knowledge.

True or false?

<details><summary>Click here for the solution</summary>

## Solution:

True

</details>

# Task 2
Deep learning builds more complicated concepts from simpler concepts.

True or false?

<details><summary>Click here for the solution</summary>

## Solution:    
    
True. For example, corners and contours are built from edges and can be detected using those edges.

</details>

# Task 3
Deep learning is a new topic and has not been discussed before.

True or false?

<details><summary>Click here for the solution</summary>

## Solution:    
    
False, the first concepts were developed in the 1940s. 

</details>

# Task 4

Later in this exercise you will need the image House.png. You can find it in the GitHub repository. Either upload it manually to this notebook or add it to your drive.


In [None]:
img_source = "/House.png"

## Task 4.1

What are neural networks used for, and why?

<details><summary>Click here for the solution</summary>

## Solution:    

* Neural networks are inspired by the structure of the brain
* Typical uses: processing images, processing sequences, generating data
* Detecting patterns in data: they automate the feature development, starting with very basic low-level representations and identifying relevant patterns in the data
</details>

## Task 4.2

What is the basic structure of a neural network?

 <details><summary>Click here for the solution</summary>

## Solution:    

A neuron consists of:
* Bias b (learned by our model)
* Input x (determined by data)
* Parameters: w (learned by our model)
* Activation function: σ (can be varied)
* Output: σ(w • x + b)

A network consists of:
* Layers of neurons (Input, Hidden, Output)
* Connections between the layers
</details>

## Task 4.3

How is the output of this neuron calculated? We use ReLU (Rectified Linear Unit, y = max(0, x)) as the activation function.

![neuron](https://raw.githubusercontent.com/MMesgar/Foundation_of_AI/master/lecture10/img/neuron.png)

<details><summary>Click here for the solution</summary>
    
## Solution:    

ŷ = σ(w_1*x_1 + w_2*x_2 + w_3*x_3 + b)

ŷ = σ(0.8*1 + 0.4*5 + 0.5*3 + 1*0.4) 

ŷ = σ(4.7) 

ŷ = max(0, (4.7)) 

ŷ = 4.7 
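
The calculation above can also be checked in a few lines of plain Python. This is only a sketch: the split into weights and inputs is an assumption read off the order of the factors in the formula (w_i * x_i), and the names `weights`, `inputs`, and `bias` are chosen for this illustration.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)

# Values taken from the solution above (assumed order: weight * input).
weights = [0.8, 0.4, 0.5]
inputs = [1.0, 5.0, 3.0]
bias = 0.4  # written as 1 * 0.4 in the solution

# Weighted sum of the inputs plus the bias, followed by the activation.
z = sum(w * x for w, x in zip(weights, inputs)) + bias
y_hat = relu(z)
print(round(y_hat, 6))
```

With a negative weighted sum, `relu` would return 0 instead of passing the value through.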

</details>

# Task 5

The automatic recognition of handwritten digits is an important application, e.g. for the automatic sorting of letters by postal code. The best-known dataset for this problem is the MNIST dataset (http://yann.lecun.com/exdb/mnist/).
It contains 60,000 training images and 10,000 test images of handwritten digits. Each image is represented as an array of 28 x 28 pixels. Each value is the color of the pixel (0 = white, 255 = black). The aim is to automatically identify the digit shown in each image. There are ten classes for this task: the digits 0 to 9.

In [None]:
import numpy as np

from random import randint
from random import sample

import matplotlib
import matplotlib.pyplot as plt

import tensorflow as tf
import keras
from keras.models import Sequential, load_model
from keras.datasets import mnist
from keras.layers import Dense, Dropout, Activation
from keras.utils import to_categorical

In [None]:
import keras
import tensorflow
print(keras.__version__)
print(tensorflow.__version__)
sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(log_device_placement=True))
print(sess)

In [None]:
# Load the MNIST handwritten digits dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

## Task 5.1
First, take a closer look at the data by displaying individual examples.

In [None]:
examples = sample(list(x_train), 16)

#Feel free to adjust this value to scale the output
plt.rcParams['figure.dpi'] = 100

for i in range(len(examples)):
    ax = plt.subplot(4, 4, i + 1)
    ax.axis('off')
    ax.imshow(examples[i], cmap='Greys')

Why is it a good idea to use Machine Learning for this problem? What might be the difficulties you would encounter using other methods?

<details><summary>Click here for the solution</summary>
    
## Solution:    
* Every person has their own handwriting, so the same digit looks different when written by different people.
* We can't simply specify which pixels must be black for an image to show a "1", because the digit might look different for each person.
* Machine Learning algorithms are able to learn what exactly makes a "3", for example, a "3".
</details>

# Task 6

Neural networks, as discussed in Task 4, are very well able to solve this problem. As part of this task, we want to build a neural network that can recognize handwritten digits.

## Task 6.1

Adjust the class vectors (y_train and y_test) so that we can use them for the neural network.

In [None]:
# Convert class vectors to binary class matrices
# Change num_classes to the number of classes we need
num_classes = 10
y_train_one_hot = keras.utils.to_categorical(y_train, num_classes)
y_test_one_hot = keras.utils.to_categorical(y_test, num_classes)

# Print the first 5 labels as class index and as one-hot vector
# Replace the None placeholders in the print statement
for i in range(5):
    print(None, " -> ", None)

<details><summary>Click here for the solution</summary>
    
## Solution:    
```python
# Convert class vectors to binary class matrices
# Change num_classes to the number of classes we need
num_classes = 10
y_train_one_hot = keras.utils.to_categorical(y_train, num_classes)
y_test_one_hot = keras.utils.to_categorical(y_test, num_classes)

# Print the first 5 labels as class index and as one-hot vector
for i in range(5):
    print(y_train[i], " -> ", y_train_one_hot[i])
```
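
To see what the conversion does, here is a minimal plain-Python sketch of one-hot encoding. It is not the Keras implementation; the helper `one_hot` and the example labels are hypothetical and only illustrate the idea.

```python
def one_hot(label, num_classes):
    """Return a list with 1.0 at index `label` and 0.0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in the range [0, num_classes)")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

# Hypothetical example labels, printed in the same style as the solution.
for label in [5, 0, 4]:
    print(label, " -> ", one_hot(label, 10))
```

Each vector has exactly one entry set to 1, at the position of the class index.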
</details>

## Task 6.2

Adjust the input data (x_train and x_test) so that we can use them for the neural network.

In [None]:
# normalize into [0,1]
def normalize_data(x):
    x = x.astype('float32')
    x /= 255
    return x

img_rows, img_cols = 28, 28
image_size = img_rows * img_cols

# Flatten the images as we are not using CNN here
print("Original shape: ", x_train.shape)
x_train_reshaped = x_train.reshape(x_train.shape[0], image_size)
x_test_reshaped = x_test.reshape(x_test.shape[0], image_size)
print("After flattening: ", x_train_reshaped.shape)

# Normalize the pixel values
#define x_train_reshaped & x_test_reshaped
x_train_reshaped = normalize_data(None)
x_test_reshaped = normalize_data(None)

<details><summary>Click here for the solution</summary>
    
## Solution:    
```python
# normalize into [0,1]
def normalize_data(x):
    x = x.astype('float32')
    x /= 255
    return x

img_rows, img_cols = 28, 28
image_size = img_rows * img_cols

# Flatten the images as we are not using CNN here
print("Original shape: ", x_train.shape)
x_train_reshaped = x_train.reshape(x_train.shape[0], image_size)
x_test_reshaped = x_test.reshape(x_test.shape[0], image_size)
print("After flattening: ", x_train_reshaped.shape)

# Normalize the pixel values
#define x_train_reshaped & x_test_reshaped
x_train_reshaped = normalize_data(x_train_reshaped)
x_test_reshaped = normalize_data(x_test_reshaped)
```
</details>

# Task 7
## Task 7.1

Build a neural network capable of recognizing handwritten digits. Experiment with the parameters as well. What do you notice?
Use the documentation: https://keras.io/layers/core/

The following helper functions are given:

In [None]:
def fit_model(model, xtrain, ytrain):
    history = model.fit(xtrain, ytrain,
                        batch_size=batch_size,
                        epochs=epochs,
                        verbose=True,
                        validation_split=.1)
    return history
    
def evaluate_model(model, history, xtest, ytest):
    score = model.evaluate(xtest, ytest, verbose=False)

    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['training', 'validation'], loc='best')
    plt.show()
 
    print("Test loss: ", score[0])
    print("Test accuracy: ", score[1])

In [None]:
batch_size = 128
epochs = 5

#Your model:


#Here you can modify your model

model.compile(
    optimizer="sgd",
    loss='categorical_crossentropy',
    metrics=['acc'])

#Train the neural network
history = fit_model(model, x_train_reshaped, y_train_one_hot)

#Evaluate the neural network
evaluate_model(model, history, x_test_reshaped, y_test_one_hot)

<details><summary>Click here for the solution</summary>
    
## Solution:    
```python
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=(image_size,)))
model.add(Dropout(0.2))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(num_classes, activation='softmax'))
model.summary()
```
</details>

## Task 7.2

How many input and output nodes are there? Could it be a different number?

<details><summary>Click here for the solution</summary>
    
## Solution:    
* Input: 28 * 28 = 784
* Output: 10
* It could not be a different number, because the numbers of input and output nodes are given by the training data: one input per pixel and one output per class.
</details>

Look at our proposed solution. How many hidden layers are there, and how many nodes does each of them have? Could these numbers be different?

<details><summary>Click here for the solution</summary>
    
## Solution:    
* 2 hidden layers with 128 nodes each
* Both numbers are set in the code and not given by the data, so they could be different
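
The layer sizes of the proposed solution from Task 7.1 (784 inputs, two hidden layers with 128 nodes, 10 outputs) also fix how many parameters the network learns. The following sketch counts them by hand; the helper `dense_parameter_count` is a made-up name for this illustration, and the Dropout layers are left out because they have no trainable parameters.

```python
# Layer sizes of the proposed solution: input, hidden, hidden, output.
layer_sizes = [28 * 28, 128, 128, 10]

def dense_parameter_count(sizes):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per node
    return total

print(dense_parameter_count(layer_sizes))
```

The printed number should agree with the total that `model.summary()` reports for the proposed model. Changing the hidden layer sizes changes this count, while the input and output sizes stay fixed by the data.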
</details>

Which other hyperparameters are there?

<details><summary>Click here for the solution</summary>
    
## Solution:    
* Learning Rate
* Regularization
* The number of nodes and hidden layers
* Activation Function
* Loss Function
</details>

Which activation functions are used in our solution and which other activation functions are there?

<details><summary>Click here for the solution</summary>
    
## Solution:    
* Used here: ReLU & Softmax
* Others: Sigmoid, Tanh

</details>

Why are the activation functions in the code used in this order?

<details><summary>Click here for the solution</summary>
    
## Solution:    
* ReLU in all the hidden layers: another activation function could have been used instead.
* Softmax for the output layer: it turns the outputs into a probability distribution over the ten classes.
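
To illustrate why softmax yields a probability distribution, here is a small sketch in plain Python. The input scores are hypothetical, and subtracting the maximum before exponentiating is a common numerical safeguard, not something taken from the exercise.

```python
import math

def softmax(scores):
    """Turn arbitrary real-valued scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical raw outputs for three classes
probs = softmax(scores)
print(probs, sum(probs))
```

All outputs are positive, they sum to 1, and the largest score receives the largest probability.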

</details>

The code defines 'epochs' and 'batch_size'. What do they mean?

<details><summary>Click here for the solution</summary>
    
## Solution:    
* epochs: hyperparameter that defines how many times the learning algorithm works through the entire training dataset
* batch size: number of training examples in a single batch. The training dataset is divided into batches, and the weights are updated once per batch.
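
As a rough worked example, the settings used in this exercise (60,000 training images, `validation_split=.1`, `batch_size = 128`, `epochs = 5`) determine how many weight updates happen during training. The variable names below are chosen for this sketch.

```python
import math

num_samples = 60000                     # MNIST training images
validation_samples = num_samples // 10  # validation_split=.1 in fit_model
batch_size = 128
epochs = 5

# Only the samples that are not held out for validation are used for training.
train_samples = num_samples - validation_samples

# The last batch of an epoch may be smaller, hence the ceiling.
batches_per_epoch = math.ceil(train_samples / batch_size)
total_updates = batches_per_epoch * epochs

print(train_samples, batches_per_epoch, total_updates)
```

A smaller batch size means more, but noisier, updates per epoch.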

</details>

# Task 8
## Task 8.1

Convolutional Neural Networks contain layers in which activity is calculated using discrete convolution.

Given is the filter b

$$b = \begin{bmatrix} -1 &  1 \\ 1 & -1 \end{bmatrix}$$

and as input the image I

$$I = \begin{bmatrix} 0 & 0.1 & 0.5 \\ 0 & 0.7 & 0.2 \\ 0.9 & 0.2 & 0 \end{bmatrix}$$

Calculate the result of the discrete convolution and apply the activation function f to each entry, where:

$$f(x) = \max(x, 0)$$

<details><summary>Click here for the solution</summary>
    
## Solution:    

Calculation steps:
    
<ol>
    <li> Step: $$f(-1 * 0 + 1 * 0.1 + 1 * 0 -1 * 0.7) = f(-0.6) = 0 $$ </li>
    <li> Step: $$f(-1 * 0.1 + 1 * 0.5 + 1 * 0.7 - 1 * 0.2) = f(-0.1 + 0.5 + 0.7 - 0.2) = f(0.9) = 0.9$$ </li>
    <li> Step: $$f(-1 * 0 + 1 * 0.7 + 1 * 0.9 - 1 * 0.2) = f(0.7 + 0.9 - 0.2) = f(1.4) = 1.4$$ </li>
    <li> Step: $$f(-1 * 0.7 + 1 * 0.2 + 1 * 0.2 - 1 * 0) = f(-0.7 + 0.2 + 0.2) = f(-0.3) = 0$$ </li>
</ol>

Result:

$$\begin{bmatrix} 0 & 0.9 \\ 1.4 & 0 \end{bmatrix}$$
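
The four steps can be reproduced with a short plain-Python sketch. It slides the filter over the image without padding and without flipping the filter; because this particular filter is unchanged by a 180° rotation, that gives the same result as a flipped convolution. The helper names `relu` and `correlate_valid` are made up for this illustration.

```python
def relu(z):
    """Activation function f(x) = max(x, 0)."""
    return max(0.0, z)

def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            total = sum(image[i + di][j + dj] * kernel[di][dj]
                        for di in range(k_rows) for dj in range(k_cols))
            row.append(total)
        output.append(row)
    return output

kernel = [[-1, 1],
          [1, -1]]
image = [[0.0, 0.1, 0.5],
         [0.0, 0.7, 0.2],
         [0.9, 0.2, 0.0]]

# Apply the activation to every entry; rounding hides floating-point noise.
feature_map = [[round(relu(v), 6) for v in row]
               for row in correlate_valid(image, kernel)]
print(feature_map)
```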
    

</details>

For which feature could the filter be responsible?

<details><summary>Click here for the solution</summary>
    
## Solution:    
    
The filter can recognize diagonal lines of high pixel values that run from the bottom left to the top right of a picture.

</details>

## Task 8.2
Develop a filter with a height and width of 3 that is capable of detecting vertical edges.

In [None]:
import cv2
from PIL import Image


def convolve(img, kernel):
    # Image arrays are indexed as (rows, columns), i.e. (height, width).
    img_h, img_w = img.shape
    ker_h, ker_w = kernel.shape

    # Add a padding to make sure that the original shape is maintained.
    pad = (ker_h - 1) // 2
    img_pad = cv2.copyMakeBorder(img, pad, pad, pad, pad, cv2.BORDER_REPLICATE)

    # Create the output matrix
    output = np.zeros((img_h, img_w), dtype="float32")

    # Compute the convolution (the kernel is applied without flipping).
    for i in range(img_h):
        for j in range(img_w):
            output[i][j] = np.sum(img_pad[i:i+ker_h, j:j+ker_w]*kernel)

    return output

img = cv2.imread(img_source, cv2.IMREAD_GRAYSCALE)

# Change the filter here.
kernel = np.asarray([[-1, -1, 1],
                     [-1, -1, 1],
                     [-1, -1, 1]])

# Compute the feature map.
convolved = convolve(img, kernel)

plt.imshow(img, cmap='gray', interpolation='nearest', vmin=0, vmax=255)
plt.show()

plt.imshow(convolved, cmap='gray', interpolation='nearest', vmin=0, vmax=255)
plt.show()
print(convolved)

<details><summary>Click here for the solution</summary>
    
## Solution:    
    
```python
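# Vertical edges (responds where the right side is brighter than the left side)
kernel = np.asarray([[-1, 0, +1],
                     [-1, 0, +1],
                     [-1, 0, +1]])

# Alternative for vertical edges (the starting kernel from the exercise)
kernel = np.asarray([[-1, -1, 1],
                     [-1, -1, 1],
                     [-1, -1, 1]])

# For comparison: horizontal edges
kernel = np.asarray([[+1, +1, +1],
                     [ 0,  0,  0],
                     [-1, -1, -1]])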

```
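
To check what these kernels respond to without loading House.png, they can be applied to a tiny synthetic image. This is a sketch in plain Python under stated assumptions: the 3x5 test image and the helper `correlate_valid` (no padding, no kernel flipping) are made up for this illustration and are not part of the exercise code.

```python
def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for i in range(len(image) - k_rows + 1):
        row = []
        for j in range(len(image[0]) - k_cols + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(k_rows) for dj in range(k_cols)))
        output.append(row)
    return output

# Hypothetical test image: dark on the left, bright on the right.
image = [[0, 0, 0, 10, 10],
         [0, 0, 0, 10, 10],
         [0, 0, 0, 10, 10]]

vertical = [[-1, 0, 1],
            [-1, 0, 1],
            [-1, 0, 1]]
horizontal = [[1, 1, 1],
              [0, 0, 0],
              [-1, -1, -1]]

vertical_response = correlate_valid(image, vertical)
horizontal_response = correlate_valid(image, horizontal)
print(vertical_response)
print(horizontal_response)
```

The vertical kernel gives no response in the flat region and a strong response at the dark-to-bright transition, while the horizontal kernel stays at zero because all rows of the test image are identical.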

</details>

In addition, Convolutional Neural Networks contain so-called pooling layers, which discard unnecessary information.
Calculate the output of a 2x2 max pooling layer for the following input:

$$\begin{bmatrix} 0.8 & 0.2 & 0.4 & 0.2\\ 0.9 & 0.5 & -0.4 & 0.2 \\ 0.1 & 0.1 & -0.6 & -0.4 \\ -0.2 & 0.9 & -0.3 & 0.6 \end{bmatrix}$$

<details><summary>Click here for the solution</summary>
    
## Solution:    
    


$$\begin{bmatrix} 0.9 & 0.4 \\ 0.9 & 0.6 \end{bmatrix}$$
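
The result can be verified with a minimal plain-Python sketch of max pooling. It assumes non-overlapping 2x2 windows (stride 2), which matches the 2x2 output above; the function name `max_pool_2x2` is chosen for this illustration.

```python
def max_pool_2x2(matrix):
    """Non-overlapping 2x2 max pooling; assumes even height and width."""
    pooled = []
    for i in range(0, len(matrix), 2):
        row = []
        for j in range(0, len(matrix[0]), 2):
            window = [matrix[i][j], matrix[i][j + 1],
                      matrix[i + 1][j], matrix[i + 1][j + 1]]
            row.append(max(window))  # keep only the largest value per window
        pooled.append(row)
    return pooled

pool_input = [[0.8, 0.2, 0.4, 0.2],
              [0.9, 0.5, -0.4, 0.2],
              [0.1, 0.1, -0.6, -0.4],
              [-0.2, 0.9, -0.3, 0.6]]

pooled = max_pool_2x2(pool_input)
print(pooled)
```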

</details>

The following shows how to implement a Convolutional Neural Network using Keras.

In [None]:
from keras import backend as K
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D

if K.image_data_format() == 'channels_first':
    x_train_reshaped = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test_reshaped = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train_reshaped = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test_reshaped = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train_reshaped = normalize_data(x_train_reshaped)
x_test_reshaped = normalize_data(x_test_reshaped)

# train with less data (takes too long otherwise)
x_train_small = x_train_reshaped[:6000,:]
y_train_small = y_train_one_hot[:6000,:]
print(x_train_small.shape)

model = Sequential()
model.add(Conv2D(32, kernel_size=(5, 5),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
model.summary()

model.compile(
    optimizer="adadelta",
    loss='categorical_crossentropy',
    metrics=['acc'])

history = fit_model(model, x_train_small, y_train_small)
evaluate_model(model, history, x_test_reshaped, y_test_one_hot)
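
The layer sizes of the network above can be traced by hand. The following is a minimal sketch, assuming the Keras defaults for `Conv2D` (no padding, stride 1) and a pooling stride equal to the pool size; the helper name `conv_output_size` is made up for this illustration.

```python
def conv_output_size(size, kernel):
    """Spatial size after a convolution without padding and with stride 1."""
    return size - kernel + 1

size = 28                          # MNIST images are 28 x 28
after_conv1 = conv_output_size(size, 5)         # first Conv2D, 5x5 kernel
after_conv2 = conv_output_size(after_conv1, 5)  # second Conv2D, 5x5 kernel
after_pool = after_conv2 // 2                   # 2x2 max pooling halves each side

# The second Conv2D layer produces 64 feature maps, which Flatten concatenates.
flattened = after_pool * after_pool * 64

print(after_conv1, after_conv2, after_pool, flattened)
```

These values should match the output shapes listed by `model.summary()`.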

Apply the model

In [None]:
input_form = """
<table>
<td style="border-style: none;">
<div style="border: solid 2px #666; width: 143px; height: 144px;">
<canvas width="140" height="140"></canvas>
</div></td>
<td style="border-style: none;">
<button onclick="clear_value()">Clear</button>
<button onclick="classify_digit()">Classify</button>
</td>
</table>
"""

javascript = '''
<script type="text/Javascript">
    var pixels = [];
    for (var i = 0; i < 28*28; i++) pixels[i] = 0;
    var click = 0;

    var canvas = document.querySelector("canvas");
    canvas.addEventListener("mousemove", function(e){
        if (e.buttons == 1) {
            click = 1;
            canvas.getContext("2d").fillStyle = "rgb(0,0,0)";
            canvas.getContext("2d").fillRect(e.offsetX, e.offsetY, 8, 8);
            x = Math.floor(e.offsetY * 0.2);
            y = Math.floor(e.offsetX * 0.2) + 1;
            for (var dy = 0; dy < 2; dy++){
                for (var dx = 0; dx < 2; dx++){
                    if ((x + dx < 28) && (y + dy < 28)){
                        pixels[(y+dy)+(x+dx)*28] = 1;
                    }
                }
            }
        } else {
            if (click == 1) set_value();
            click = 0;
        }
    });
    
    function set_value(){
        var result = "";
        for (var i = 0; i < 28*28; i++) result += pixels[i] + ",";
        var kernel = IPython.notebook.kernel;
        kernel.execute("image = [" + result + "]");
    }
    
    function clear_value(){
        canvas.getContext("2d").fillStyle = "rgb(255,255,255)";
        canvas.getContext("2d").fillRect(0, 0, 140, 140);
        for (var i = 0; i < 28*28; i++) pixels[i] = 0;
    }
    
    function classify_digit() {
        IPython.notebook.execute_cells([IPython.notebook.get_selected_index()+1])
    }
</script>
'''

from IPython.display import HTML
HTML(input_form + javascript)

The code below does not currently work in Colab. Try it in Jupyter, or tell us your solution :).


In [None]:
img_array = np.array(image)
img_array = img_array.reshape(1, 28, 28, 1)
predictions = model.predict(img_array)

%matplotlib inline 
fig = plt.figure(figsize=(4,2))
subplot = fig.add_subplot(1,1,1)
subplot.set_xticks(range(10))
subplot.set_xlim(-0.5,9.5)
subplot.set_ylim(0,1)
subplot.bar(range(10), predictions[0], align='center')
fig.show()

Note: The code with the Convolutional Neural Network must be executed first so that this cell works.