# Exercise 1: Explore MLPs  
## Train the MLP models on Fashion-MNIST dataset.  

In this tutorial, we'll build and train a multilayer perceptron (MLP) neural network to classify images of clothing, like sneakers and shirts. 
This guide uses [tf.keras](https://www.tensorflow.org/beta/guide/keras/overview), a high-level API to build and train models in TensorFlow.


Before running any code, complete the following two steps:
1. Reset the runtime by going to **Runtime -> Reset all runtimes** in the menu above. 
2. Select **GPU** by going to **Runtime -> Change runtime type -> Hardware accelerator** in the menu above. 

**Fashion-MNIST** is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Its authors intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and the same structure of training and testing splits.


## [Install and import dependencies]

Because we need the latest TensorFlow version, TensorFlow 2.0 beta, we have to install it first. The default TensorFlow version in Colab is r1.14.0. 


Then we import the packages used in our code:
- `numpy`: for matrix computation
- `matplotlib.pyplot`: for graph plotting and display
- `tensorflow`: the main deep learning toolbox
- `keras.datasets`: an API providing several datasets for common use

In [0]:
# Install the TensorFlow 2.0.0 beta version (GPU version)
!pip install tensorflow-gpu==2.0.0-beta1

In [0]:
from __future__ import absolute_import, division, print_function, unicode_literals


# Import TensorFlow and Fashion-MNIST Datasets
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Helper libraries
import numpy as np
import matplotlib.pyplot as plt
print(tf.__version__)


In [0]:
# Suppress TensorFlow warning messages (only errors are logged)
import logging
logger = tf.get_logger()
logger.setLevel(logging.ERROR)

## [Data Loading]  

**Fashion MNIST** has the same data structure as the regular **MNIST** dataset, but it's slightly more challenging. Both datasets are relatively small and are used to verify that an algorithm works as expected. They're good starting points to test and debug code. 

We will use 60,000 images to train the network and 10,000 images to evaluate how accurately the network learned to classify images. You can load Fashion MNIST directly from TensorFlow through `keras.datasets`, as done below (it is also available through the [Datasets](https://www.tensorflow.org/datasets) API):

In [0]:
# load dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()

# count the number of training samples per label
unique, counts = np.unique(y_train, return_counts=True)
print("Train labels: ", dict(zip(unique, counts)))

# count the number of test samples per label
unique, counts = np.unique(y_test, return_counts=True)
print("Test labels: ", dict(zip(unique, counts)))

The images are 28 $\times$ 28 arrays, with pixel values in the range `[0, 255]`. The *labels* are an array of integers, in the range `[0, 9]`. These correspond to the *class* of clothing the image represents:  
<table>
  <tr>
    <th>Label</th>
    <th>Class</th> 
  </tr>
  <tr>
    <td>0</td>
    <td>T-shirt/top</td> 
  </tr>
  <tr>
    <td>1</td>
    <td>Trouser</td> 
  </tr>
    <tr>
    <td>2</td>
    <td>Pullover</td> 
  </tr>
    <tr>
    <td>3</td>
    <td>Dress</td> 
  </tr>
    <tr>
    <td>4</td>
    <td>Coat</td> 
  </tr>
    <tr>
    <td>5</td>
    <td>Sandal</td> 
  </tr>
    <tr>
    <td>6</td>
    <td>Shirt</td> 
  </tr>
    <tr>
    <td>7</td>
    <td>Sneaker</td> 
  </tr>
    <tr>
    <td>8</td>
    <td>Bag</td> 
  </tr>
    <tr>
    <td>9</td>
    <td>Ankle boot</td> 
  </tr>
</table>

Each image is mapped to a single label. Since the *class names* are not included with the dataset, store them here to use later when plotting the images:


In [0]:
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 
               'Sandal',      'Shirt',   'Sneaker',  'Bag',   'Ankle boot']
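
As a small illustration of how the integer labels map to these names, here is a minimal sketch. The `example_labels` array is a hypothetical stand-in for a slice of `y_train`, and `class_names` is repeated only so the snippet runs on its own.

```python
import numpy as np

# Same class names as in the cell above, repeated so this sketch is self-contained.
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Hypothetical labels standing in for a few entries of y_train (assumption).
example_labels = np.array([9, 0, 0, 3, 7], dtype=np.uint8)

# Each integer label indexes directly into the list of class names.
example_names = [class_names[label] for label in example_labels]
print(example_names)
```

The same lookup is what `plt.xlabel(class_names[label])` relies on when plotting.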

In [0]:
# sample 25 Fashion-MNIST images from the training set
indexes = np.random.randint(0, x_train.shape[0], size=25)
images = x_train[indexes]
labels = y_train[indexes]

# plot the 25 sampled images in a 5x5 grid
plt.figure(figsize=(10,10))
for i in range(len(indexes)):
  plt.subplot(5, 5, i + 1)
  image = images[i]
  label = labels[i]
  plt.imshow(image, cmap='gray')
  plt.xlabel(class_names[label])
  plt.xticks([])
  plt.yticks([])

# save before show(): after show() the figure may be cleared and saved blank
plt.savefig("fashion_mnist-samples.png")
plt.show()
plt.close('all')

### Data Preparation  

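The value of each pixel in the image data is an integer in the range `[0,255]`. For the model to work properly, these values need to be normalized to the range `[0,1]`. Because the MLP takes a flat vector as input, each 28 $\times$ 28 image is also reshaped into a vector of 784 values.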

In [0]:
# image dimensions (assumed square)
image_size = x_train.shape[1]
input_size = image_size * image_size

# flatten each image into a vector and normalize pixel values to [0, 1]
x_train = np.reshape(x_train, [-1, input_size])
x_train = x_train.astype('float32')/255.0
x_test = np.reshape(x_test, [-1, input_size])
x_test = x_test.astype('float32')/255.0
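
To check what the reshape and rescale do without touching the real data, the sketch below applies the same two operations to a tiny synthetic batch. The names `fake_images`, `side`, `flat_size`, and `flat` are illustrative assumptions, not part of the exercise.

```python
import numpy as np

# Synthetic stand-in for a batch of images: 4 random 28x28 uint8 arrays (assumption).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Image side length (assumed square) and flattened vector size.
side = fake_images.shape[1]
flat_size = side * side

# Flatten each image to a 784-vector, then rescale pixel values to [0, 1].
flat = np.reshape(fake_images, [-1, flat_size]).astype('float32') / 255.0

print(flat.shape, flat.dtype)
print(flat.min() >= 0.0, flat.max() <= 1.0)
```

Casting to `float32` before dividing keeps the result in the dtype Keras uses by default.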

## [Construct the Model]  
Building the neural network requires configuring the layers of the model, then compiling the model.  
We build an MLP model with 2 hidden layers of 256 units each, using the ReLU activation function (`'relu'`) and dropout after each hidden layer.

In [0]:
# import the tf.keras API for building the model architecture
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten
from tensorflow.keras.utils import to_categorical, plot_model

In [0]:
# compute the number of labels
num_labels = len(np.unique(y_train))
# Define network parameters
batch_size = 128
hidden_units = 256
dropout = 0.4

In [0]:
# model is a 3-layer MLP (2 hidden layers + output layer)
# with ReLU and dropout after each hidden layer
model = Sequential()
model.add(Dense(hidden_units, activation='relu', input_shape=(input_size,)))
model.add(Dropout(dropout))
model.add(Dense(hidden_units, activation='relu'))
model.add(Dropout(dropout))
model.add(Dense(num_labels, activation='softmax'))
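
To make the layer stack more concrete, here is a rough NumPy sketch of the computation a 784-256-256-10 MLP performs at inference time, when dropout is inactive. The weights are random rather than trained, and the helpers `relu` and `softmax` and the variable names are assumptions made for illustration. It is not how Keras implements the layers internally.

```python
import numpy as np

def relu(z):
    # Element-wise max(z, 0)
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row-wise max for numerical stability before exponentiating
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
sizes = [784, 256, 256, 10]  # input, two hidden layers, output

# Random (untrained) weights and zero biases, one pair per Dense layer
weights = [rng.normal(0.0, 0.05, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

# A batch of 3 fake flattened images with values in [0, 1)
x = rng.random((3, 784))

# Hidden layers: affine transform followed by ReLU
h = x
for W, b in zip(weights[:-1], biases[:-1]):
    h = relu(h @ W + b)

# Output layer: affine transform followed by softmax over the 10 classes
probs = softmax(h @ weights[-1] + biases[-1])

print(probs.shape)
print(probs.sum(axis=1))
```

Each row of `probs` is a probability distribution over the 10 classes, which is what the final `softmax` layer produces.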


In [0]:
model.summary()
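
The parameter counts reported by `model.summary()` can be cross-checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases, and dropout layers add no parameters. A minimal sketch, assuming the 784-256-256-10 layout used above:

```python
# Layer widths: flattened input, two hidden layers, output layer
layer_sizes = [784, 256, 256, 10]

# Weights (n_in * n_out) plus biases (n_out) for each Dense layer
params_per_layer = [n_in * n_out + n_out
                    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

print(params_per_layer)       # [200960, 65792, 2570]
print(sum(params_per_layer))  # 269322
```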

### Compile the model

Before the model is ready for training, it needs a few more settings. These are added during the model's *compile* step:


* *Loss function*: An algorithm for measuring how far the model's outputs are from the desired output. The goal of training is to minimize this loss.
* *Optimizer*: An algorithm for adjusting the inner parameters of the model in order to minimize loss.
* *Metrics*: Used to monitor the training and testing steps. The following example uses *accuracy*, the fraction of the images that are correctly classified.

In [0]:
model.compile(optimizer='adam', 
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
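
To give a feel for the loss and metric named in the compile step, the sketch below computes a sparse categorical cross-entropy and an accuracy by hand in NumPy. The probabilities and labels are made up for illustration, and the calculation ignores numerical-stability details such as clipping, so treat it as an approximation of the idea rather than the exact Keras implementation.

```python
import numpy as np

# Hypothetical predicted probabilities for 3 samples over 4 classes (made up)
probs = np.array([[0.7, 0.1, 0.1, 0.1],
                  [0.2, 0.5, 0.2, 0.1],
                  [0.1, 0.2, 0.3, 0.4]])

# Integer ("sparse") labels, one per sample, as in y_train
labels = np.array([0, 1, 2])

# Probability the model assigned to the true class of each sample
picked = probs[np.arange(len(labels)), labels]

# Cross-entropy: mean negative log-probability of the true class
loss = -np.log(picked).mean()

# Accuracy: fraction of samples whose most likely class equals the label
accuracy = (probs.argmax(axis=1) == labels).mean()

print(loss, accuracy)
```

The "sparse" variant of the loss is used because the labels are plain integers rather than one-hot vectors.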

## [Train the Model]  


In [0]:
EPOCHS = 50
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=EPOCHS, batch_size=batch_size)

In [0]:
loss, acc = model.evaluate(x_test, y_test, batch_size=batch_size)
print("\nTest accuracy: %.1f%%" % (100.0 * acc))
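
As a back-of-the-envelope check on the training run, the following sketch works out how many weight updates the settings above imply, assuming all 60,000 training images are used and the last, smaller batch of each epoch is kept.

```python
import math

num_train = 60000   # training images in Fashion-MNIST
batch_size = 128    # same value as in the notebook
epochs = 50         # same value as EPOCHS above

# Batches per epoch, counting the final partial batch
steps_per_epoch = math.ceil(num_train / batch_size)

# Size of that final partial batch
last_batch = num_train - (steps_per_epoch - 1) * batch_size

# Total number of optimizer updates over the whole run
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch, total_updates)  # 469 96 23450
```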

### History object for visualization  


The `fit` method returns a history object. We can use this object to plot how the loss of our model goes down after each training epoch. A high loss means that the class probabilities the model predicts are far from the true labels. 

We'll use [Matplotlib](https://matplotlib.org/) to visualize this (you could use another tool). As you can see, our model improves very quickly at first, and then has a steady, slow improvement towards the end.

In [0]:
plt.xlabel('Epoch Number')
plt.ylabel("Loss Magnitude")
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
# plt.savefig('./loss.png')
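
Besides plotting, the per-epoch lists in `history.history` can be inspected directly. The sketch below uses a made-up dictionary with the same keys, so the numbers are purely illustrative and are not results from this model. It finds the epoch with the lowest validation loss and the final gap between validation and training loss.

```python
import numpy as np

# Made-up values mimicking the structure of history.history (NOT real results)
fake_history = {
    'loss':     [0.60, 0.45, 0.40, 0.37, 0.35],
    'val_loss': [0.48, 0.42, 0.40, 0.41, 0.43],
}

# Zero-based index of the epoch with the lowest validation loss
best_epoch = int(np.argmin(fake_history['val_loss']))

# Gap between validation and training loss at the last epoch
gap = fake_history['val_loss'][-1] - fake_history['loss'][-1]

print("Best epoch (1-based):", best_epoch + 1)
print("Final val/train loss gap:", round(gap, 2))
```

A validation loss that starts rising while the training loss keeps falling is a common sign of overfitting.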