In [1]:
## RUN THIS CELL TO PROPERLY HIGHLIGHT THE EXERCISES
import requests
from IPython.display import HTML
styles = requests.get("https://raw.githubusercontent.com/Harvard-IACS/2018-CS109A/master/content/styles/cs109.css").text
HTML(styles)

# Introduction to Neural Networks


## Lab 2: ANNs w/ Keras Part 2

**June 2020**<br>
**Instructor:** Pavlos Protopapas<br>
**Lab Instructors:** Chris Gumb<br>
**Contributors:** Eleni Kaxiras

---

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import pandas as pd
%matplotlib inline

from PIL import Image, ImageOps

In [None]:
from __future__ import absolute_import, division, print_function, unicode_literals

# TensorFlow and tf.keras
import tensorflow as tf

tf.keras.backend.clear_session()  # For easy reset of notebook state.

print(tf.__version__)  # You should see a 2.0.0 here!

## Learning Goals
In this lab we will continue with the basics of feedforward neural networks using `tf.keras`, a deep learning library inside the broader framework called [TensorFlow](https://www.tensorflow.org). By the end of this lab, you should:

- Strengthen your understanding of how a simple neural network works and code some of its functionality using `tf.keras`.
- Think of vectors and arrays as tensors. Learn how to do basic image manipulations.
- Implement a simple real-world example using a feedforward neural network on image data.

## Part 1: Data Preparation

### Tensors

We can think of tensors as multidimensional arrays of real numerical values; their job is to generalize matrices to multiple dimensions. 

- **scalar** = just a number = rank 0 tensor ( $a \in F$ )
<BR><BR>
    
- **vector** = 1D array = rank 1 tensor ( $x = (x_1, \ldots, x_n)^\top \in F^n$ )
<BR><BR>
    
- **matrix** = 2D array = rank 2 tensor ( $\textbf{X} = [a_{ij}] \in F^{m \times n}$ )
<BR><BR>
    
- **3D array** = rank 3 tensor ( $\mathscr{X} = [t_{ijk}] \in F^{m \times n \times l}$ )
<BR><BR>
    
- **$\mathscr{N}$D array** = rank $\mathscr{N}$ tensor ( $\mathscr{T} = [t_{i_1 \ldots i_\mathscr{N}}] \in F^{n_1 \times \ldots \times n_\mathscr{N}}$ ) <-- **Things start to get complicated here...**
    
#### Tensor indexing
We can create subarrays by fixing some of the given tensor's indices. Fixing all but one index gives a vector; fixing all but two indices gives a 2D matrix. For example, for a third-order tensor the vectors are
<br><BR>
$\mathscr{X}[:,j,k]$ = $\mathscr{X}[j,k]$ (column), <br>
$\mathscr{X}[i,:,k]$ = $\mathscr{X}[i,k]$ (row), and <BR>
$\mathscr{X}[i,j,:]$ = $\mathscr{X}[i,j]$ (tube) <BR>
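
As a small illustration of fixing indices, here is a sketch in NumPy. The array `X` and the index values are arbitrary choices made for this example; the point is only that each fixed index removes one axis.

```python
import numpy as np

# A rank-3 tensor of shape (2, 3, 4); entry X[i, j, k] equals 12*i + 4*j + k.
X = np.arange(24).reshape(2, 3, 4)

column = X[:, 1, 2]  # fix j and k, let i vary -> shape (2,)
row = X[0, :, 2]     # fix i and k, let j vary -> shape (3,)
tube = X[0, 1, :]    # fix i and j, let k vary -> shape (4,)
matrix = X[:, :, 0]  # fix only k -> a 2D slice of shape (2, 3)

print(column.shape, row.shape, tube.shape, matrix.shape)
```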
 
#### Tensor multiplication
We can multiply one matrix by another as long as their sizes are compatible, (n × m) · (m × p) = (n × p), and we can also multiply an entire matrix by a constant. `numpy.dot` performs matrix multiplication, which is straightforward when we have 1D or 2D arrays. But what about arrays with three or more dimensions? `numpy.dot` then sums over the last axis of the first array and the second-to-last axis of the second. If we want to choose the axes ourselves we should use `numpy.tensordot`, but we **do not need `tensordot`** for this class.
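
The shapes involved are easier to see in code. The sketch below uses arrays of ones with arbitrary sizes chosen for this example; it compares `numpy.dot` on 2D arrays with `numpy.dot` and `numpy.tensordot` on 3D arrays.

```python
import numpy as np

A = np.ones((2, 3))
B = np.ones((3, 4))

# (2 x 3) times (3 x 4) gives (2 x 4); each entry sums three products of ones.
AB = np.dot(A, B)
print(AB.shape, AB[0, 0])

# Multiplying by a constant scales every entry.
print((2 * A)[0])

# With 3D inputs, np.dot contracts the last axis of T with the
# second-to-last axis of U and keeps all the remaining axes.
T = np.ones((5, 2, 3))
U = np.ones((5, 3, 4))
print(np.dot(T, U).shape)

# tensordot makes the contracted axes explicit; this choice matches np.dot.
print(np.tensordot(T, U, axes=([2], [1])).shape)
```

Both 3D calls return an array of shape `(5, 2, 5, 4)`, which is rarely what you want for batches of matrices and is one reason to be deliberate about which axes are contracted.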

### Pavlos as a Rank 3 Tensor

A common kind of data input to a neural network is images. Images are nice to look at, but remember, the computer only sees a series of numbers arranged in `tensors`. In this part we will look at how images are displayed and altered in Python. 

`matplotlib` natively reads only PNG images; for other formats it relies on a library called `Pillow`. If you do not have `Pillow` installed, you can install it with either of the following commands:
```
conda install -c anaconda pillow

# OR

pip install pillow
```

Our image is a 24-bit RGB image stored as an array of shape (height, width, channels), with 8 bits for each of the R, G, and B channels. Explore and print the array.

In [None]:
import matplotlib.image as mpimg

# load and show the image
FILE = '../data/pavlos.jpeg'
img = mpimg.imread(FILE)
imgplot = plt.imshow(img)

print(f'The image is a: {type(img)} of shape {img.shape}')
img[3:5, 3:5, :]  # a 2x2 patch of pixels with all three channel values

#### Slicing tensors: slice along each axis

In [None]:
# we want to show each color channel
fig, axes = plt.subplots(1, 3, figsize=(10,10))
for i, subplot in zip(range(3), axes):
    temp = np.zeros(img.shape, dtype='uint8')
    temp[:,:,i] = img[:,:,i]
    subplot.imshow(temp)
    subplot.set_axis_off()
plt.show()

#### Multiplying Images with a scalar

Just for fun, no real use for this lab!

In [None]:
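# img holds 8-bit unsigned integers, so doubling wraps around modulo 256
# instead of saturating at 255; that is why the colors look strange
temp = img * 2
plt.imshow(temp);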
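
Multiplying an 8-bit image by a scalar can overflow. The sketch below uses a tiny made-up pixel array (not the lab image) to contrast the wraparound with one possible alternative: scale in floating point, clip to the valid range, then convert back.

```python
import numpy as np

# A stand-in "image": 1 x 2 pixels, 3 channels, 8 bits per channel.
pixels = np.array([[[10, 100, 200], [0, 128, 255]]], dtype=np.uint8)

# The result stays uint8, so values above 255 wrap around modulo 256.
wrapped = pixels * 2

# Scale in float32, clip to [0, 255], then cast back to uint8.
brightened = np.clip(pixels.astype(np.float32) * 2, 0, 255).astype(np.uint8)

print(wrapped[0, 0])     # the third channel wrapped: 400 % 256 = 144
print(brightened[0, 0])  # the third channel saturated at 255
```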

For more on image manipulation by `matplotlib` see: [matplotlib-images](https://matplotlib.org/3.1.1/tutorials/introductory/images.html)

## Part 2: Building an Artificial Neural Network

https://www.tensorflow.org/guide/keras

`tf.keras` is TensorFlow's high-level API for building and training deep learning models. It's used for fast prototyping, state-of-the-art research, and production. `Keras` is a library created by François Chollet. After Google released TensorFlow 2.0, the creators of `keras` recommended that "Keras users who use multi-backend Keras with the TensorFlow backend switch to `tf.keras` in TensorFlow 2.0. `tf.keras` is better maintained and has better integration with TensorFlow features".

NOTE:  In `Keras` everything starts with a Tensor of N samples as input and ends with a Tensor of N samples as output.

### First you build it ...

Parts of a NN:

* Part 1: the input layer (our dataset)

* Part 2: the internal architecture or hidden layers (the number of layers, the activation functions, the learnable parameters and other hyperparameters)
* Part 3: the output layer (what we want from the network - classification or regression)

### ... and then you train it!

1. Load and pre-process the data
2. Define the layers of the model.
3. Compile the model.
4. Fit the model to the train set (also using a validation set).
5. Evaluate the model on the test set.
6. We learn a lot by studying History! Plot metrics such as accuracy.
7. Now let's use the Network for what it was meant to do: Predict on the test set!
8. Evaluate predictions
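
The eight steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the lab's solution: the data are random numbers standing in for a real dataset, and the layer sizes, epoch count, and variable names are arbitrary choices made for this example.

```python
import numpy as np
import tensorflow as tf

# 1. "Load" and pre-process: random stand-in data, 200 images of 28 x 28
#    with values already in [0, 1], and integer labels from 10 classes.
rng = np.random.default_rng(0)
x = rng.random((200, 28, 28)).astype("float32")
y = rng.integers(0, 10, size=200)
x_tr, y_tr, x_te, y_te = x[:160], y[:160], x[160:], y[160:]

# 2. Define the layers.
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# 3. Compile: optimizer, loss, and an extra metric to track.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# 4. Fit, holding out 25% of the training data for validation.
history = model.fit(x_tr, y_tr, epochs=2, validation_split=0.25, verbose=0)

# 5. Evaluate on the held-out test split.
test_loss, test_acc = model.evaluate(x_te, y_te, verbose=0)

# 6. Inspect which metrics were recorded per epoch.
print(sorted(history.history.keys()))

# 7. Predict: one row of class probabilities per test sample.
probs = model.predict(x_te, verbose=0)

# 8. Evaluate predictions: compare the most likely class with the label.
predicted_classes = probs.argmax(axis=1)
print(probs.shape, np.mean(predicted_classes == y_te))
```

Because the labels here are random, the accuracy is expected to hover around chance; the sketch only shows how the pieces connect.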

In [None]:
# set the seeds for reproducibility of results
seed = 7
np.random.seed(seed)
tf.random.set_seed(seed)  # Keras weight initialization uses TensorFlow's generator

### Fashion MNIST 

**Fashion-MNIST** is a dataset of clothing article images (created by [Zalando](https://github.com/zalandoresearch/fashion-mnist)), consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a **28 x 28** grayscale image, associated with a label from **10 classes**. The creators intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits. Each pixel is 8 bits so its value ranges from 0 to 255.

Let's load and look at it!

<div class='exercise'> <b>Exercise 1: Load and pre-process the data</b></div>
    
1. After loading the data, normalize the images so each pixel takes on a value between 0 and 1.
2. Use the provided code to display a sample of the data. Then write your own code to display a single image of your choice.
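
As a hint for the normalization step, here is a sketch on a made-up batch of 8-bit images rather than on Fashion-MNIST itself. The array `images` and its shape are assumptions for this example; the idea is that dividing by the largest possible pixel value rescales everything to the range 0 to 1.

```python
import numpy as np

# Hypothetical stand-in for a batch of 8-bit grayscale images (values 0..255).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by 255.0 converts to floating point and rescales to [0, 1].
scaled = images / 255.0

print(images.dtype, images.min(), images.max())
print(scaled.dtype, scaled.min(), scaled.max())
```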

In [None]:
# get the data from keras - how convenient!
fashion_mnist = tf.keras.datasets.fashion_mnist

# load the data, already split into train and test sets! how nice!
(x_train, y_train),(x_test, y_test) = fashion_mnist.load_data()


In [None]:
# hint: each pixel is 8 bits so its value ranges from 0 to 255
## your code here


In [None]:
# classes are named 0-9 so define names for plotting clarity
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(x_train[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[y_train[i]])
plt.show()

In [None]:
# choose one image to look at
# your code here


In [None]:
# take a look at the array shapes
x_train.shape, x_test.shape, y_train.shape

<div class='exercise'> <b>Exercise 2: Define the layers of the model</b></div>
The input images are 2D arrays. How can we feed these into a Keras dense layer? We will need to start our model with a special layer to first transform the input into a flat, 1D vector. Try using autocomplete (`Tab`) to search for layers in `tf.keras.layers` for a potential candidate layer. When you find one, use the `Shift+Tab` trick to see what arguments it takes.
    
After that you should be able to construct your model. You can use as many layers/nodes as you like for the hidden layers, just start conservatively. Also consider:
* What is the (initial) input shape?
* What is the desired output shape?
* What sort of activation functions do I need on each layer?
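
To see what "transforming the input into a flat, 1D vector" means for the data, here is a NumPy sketch of the same reshaping. It illustrates the idea only and does not use a Keras layer; the batch below is made up for this example.

```python
import numpy as np

# Hypothetical batch: 5 grayscale "images" of 28 x 28 pixels, filled with 0..3919.
batch = np.arange(5 * 28 * 28).reshape(5, 28, 28)

# Keep the sample axis and collapse the two image axes into one of length 784.
flat = batch.reshape(len(batch), -1)
print(batch.shape, "->", flat.shape)

# Rows are laid end to end, so pixel (row r, column c) lands at index r * 28 + c.
r, c = 3, 7
print(flat[0, r * 28 + c] == batch[0, r, c])
```

Each image becomes a vector of 28 × 28 = 784 values, which is the input size a dense layer would see after flattening.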

In [None]:
# your code here


<div class='exercise'> <b>Exercise 3: Compile the model</b></div>
Use autocomplete on `tf.keras.losses` to find a candidate loss function. 
    
You can then use the `Shift+Tab` trick to read more about it. If you also want to track an additional metric during training, set the `metrics` parameter when you call `compile`.
    
(hint: this parameter is a list and the elements of the list can be strings if they are the names of metrics recognized by keras)

In [None]:
# your code here


In [None]:
# print a summary of your model
model.summary()

In [None]:
# use this cool `tf.keras` method to visualize the layers of your network
tf.keras.utils.plot_model(
    model,
    #to_file='model.png', # if you want to save the image
    show_shapes=True, # True for more details than you need
    show_layer_names=True,
    rankdir='TB',
    expand_nested=False,
    dpi=96
)

[Everything you wanted to know about a Keras Model and were afraid to ask](https://www.tensorflow.org/api_docs/python/tf/keras/Model)

<div class='exercise'> <b>Exercise 4: Fit the model to the train set (also using a validation set)</b></div>
This is the most time-consuming part, and it is where having GPUs helps. Save the return value of the call to `fit` in a variable `history` so we can inspect the training history later.
    
You may want to start by just doing a few epochs.

In [None]:
# the core of the network training
# your code here


<div class='exercise'> <b>Exercise 5: Save and restore the model</b></div>

You can save the model so you do not have to call `.fit` every time you reset the kernel in the notebook. Network training is expensive!
    
Use the model's `save` method to store the model to disk. The commented code can be used to restore it later without having to fit it all again!

For more details on this see [https://www.tensorflow.org/guide/keras/save_and_serialize](https://www.tensorflow.org/guide/keras/save_and_serialize)

In [None]:
# save the model so you do not have to run the training code every time
# your code here


# Recreate the exact same model purely from the file
#model = tf.keras.models.load_model('../data/fashion_model.h5')

<div class='exercise'> <b>Exercise 6: Evaluate the model on the test set</b></div>
Use autocomplete to find the appropriate model method. Use `Shift+Tab` to see what it returns. Then store the test accuracy in `test_accuracy`.

In [None]:
# your code here


In [None]:
print(f'Test accuracy={test_accuracy:.4f}')
if test_accuracy > 0.8: print('Not bad!')

<div class='exercise'> <b>Exercise 7: Inspect the history</b></div>
We learn a lot by studying History! Plot metrics such as accuracy. 

You can learn a lot about neural networks by observing how they perform while training. `keras` lets you hook into training through `callbacks`. The network's performance is recorded by a `keras` callback aptly named `History`; `fit` returns it, and its `history` attribute holds the per-epoch metrics, which can be plotted.
    
Plot accuracy and loss for the training and validation sets (the ones you passed to `fit`). You may find it best to have multiple axes and multiple lines on each axis.
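
One possible layout is sketched below: a hypothetical helper, `plot_history`, that takes a dictionary shaped like `history.history` and draws one axis per metric. The key names `loss`, `accuracy`, `val_loss`, and `val_accuracy` are an assumption (they depend on the metrics you compiled with and on your Keras version), and the numbers in `demo` are placeholders that only exercise the function.

```python
import matplotlib.pyplot as plt

def plot_history(history_dict):
    """Plot loss and accuracy curves from a dict shaped like `history.history`."""
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, metric in zip(axes, ["loss", "accuracy"]):
        ax.plot(history_dict[metric], label=f"train {metric}")
        val_key = f"val_{metric}"
        if val_key in history_dict:  # present only if validation data was given to fit
            ax.plot(history_dict[val_key], label=f"validation {metric}")
        ax.set_xlabel("epoch")
        ax.set_ylabel(metric)
        ax.legend()
    return fig, axes

# Placeholder numbers purely to exercise the function; not real training results.
demo = {"loss": [1.0, 0.8, 0.7], "val_loss": [1.1, 0.9, 0.85],
        "accuracy": [0.5, 0.6, 0.7], "val_accuracy": [0.45, 0.55, 0.6]}
fig, axes = plot_history(demo)
```

With a trained model you would call `plot_history(history.history)` instead of passing the placeholder dictionary.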

In [None]:
print(history.history.keys())

In [None]:
# plot accuracy and loss for the training & validation sets
# your code here


We notice that the model starts to overfit after ~10 epochs.

<div class='exercise'> <b>Exercise 8: Make predictions</b></div>
Now let's use the Network for what it was meant to do: 

1. Predict on the test set and save in the variable `predictions` (hint: explore the model's methods with autocomplete)
2. Print the class probabilities for the first observation in the test data
3. Print the class the model predicts this observation to be (remember our `class_names` variable from earlier)
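
To make steps 2 and 3 concrete without a trained model, the sketch below turns a made-up vector of raw scores into class probabilities with a softmax and then looks up the most likely class. The scores are hypothetical, and the sketch assumes your output layer produces one probability per class, as a softmax output layer would.

```python
import numpy as np

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())  # subtracting the max avoids overflow in exp
    return e / e.sum()

# Hypothetical raw scores for a single image, one per class.
scores = np.array([0.1, -1.2, 0.3, 0.0, -0.5, 2.0, 0.2, 2.5, 0.4, 4.0])
probs = softmax(scores)

predicted_index = int(np.argmax(probs))  # position of the largest probability
print(probs.round(3))
print(class_names[predicted_index])
```

For a full prediction array with one row per observation, the same lookup applies to each row, for example `np.argmax(predictions[0])` for the first test image.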

In [None]:
# your code here


Let's see if our network predicted right! Does this item really look like what was predicted?

In [None]:
plt.figure()
plt.imshow(x_test[0], cmap=plt.cm.binary)
plt.xlabel(class_names[y_test[0]])
plt.colorbar();

Now let's see how confident our model is by plotting the probability values:

In [None]:
# code source: https://www.tensorflow.org/tutorials/keras/classification
def plot_image(i, predictions_array, true_label, img):
    predictions_array, true_label, img = predictions_array, true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])

    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'

    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
    predictions_array, true_label = predictions_array, true_label[i]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)

    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')

In [None]:
def plot_pred(i):
    plt.figure(figsize=(6,3))
    plt.subplot(1,2,1)
    plot_image(i, predictions[i], y_test, x_test)
    plt.subplot(1,2,2)
    plot_value_array(i, predictions[i],  y_test)
    plt.xticks(np.arange(len(class_names)),class_names, rotation=75)
    plt.show()

In [None]:
plot_pred(0)

The model is very confident! It predicts an ankle boot with 100% probability.
Try a few other observations.

In [None]:
# your code here
