# MNIST Tutorial: classify digits using TensorFlow

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1D2-o8EbrqNbwvaKv1ZFkTIectqdnZV2L?pli=1#scrollTo=RSJWPffckbO2)

**Don't forget to turn on the GPU for the training** 

Runtime > Change runtime type > Hardware accelerator > GPU

To save your work, add the Colab notebook to your personal Google Drive (otherwise your changes will not be saved).

This tutorial is an introduction to TensorFlow. It is adapted from the [official TensorFlow tutorial](https://www.tensorflow.org/tutorials/quickstart/beginner).
At the end of the tutorial, given an image of a handwritten digit, you will be able to predict which digit it is. You will also understand the main steps involved in training a neural network model.

# TensorFlow quickstart for beginners

This short introduction uses [Keras](https://www.tensorflow.org/guide/keras/overview) to:

1. Build a neural network that classifies images.
2. Train this neural network.
3. And, finally, evaluate the accuracy of the model.

This is a [Google Colaboratory](https://colab.research.google.com/notebooks/welcome.ipynb) notebook file. Python programs are run directly in the browser—a great way to learn and use TensorFlow. To follow this tutorial, run the notebook in Google Colab by clicking the button at the top of this page.

1. In Colab, connect to a Python runtime: At the top-right of the menu bar, select *CONNECT*.
2. Run all the notebook code cells: Select *Runtime* > *Run all*.

Import the TensorFlow 2 package:

In [0]:
# Select TensorFlow 2.x
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
  pass

import tensorflow as tf

In [0]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

In [0]:
import matplotlib.pyplot as plt
import os
import datetime
import numpy as np

## MNIST DATASET

Load and prepare the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). 

The dataset is already split into a training set (60,000 images) and a test set (10,000 images).

In [0]:
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

Show some example images:

In [0]:
n_samples = 5
for i in range(n_samples):
  plt.subplot(1, n_samples, i+1)
  plt.imshow(x_train[i], cmap='gray')
  plt.axis('off')

Convert the samples from integers to floating-point numbers, rescaling the pixel values from the range [0, 255] to the range [0, 1]:

In [0]:
x_train, x_test = x_train / 255.0, x_test / 255.0
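
As a minimal sketch of what this division does, the snippet below applies it to a small made-up array standing in for an image. The array `fake_image` is hypothetical (it is not MNIST data), and the sketch assumes that pixels are stored as unsigned 8-bit integers.

```python
import numpy as np

# Hypothetical 2 x 2 "image" with 8-bit pixel intensities (not real MNIST data).
fake_image = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by a Python float promotes the integers to floating point
# and rescales the values to the range [0, 1].
scaled = fake_image / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Keeping the inputs in a small range like this generally makes gradient-based training better behaved than feeding raw 0-255 intensities.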

## Build model

Build the `tf.keras.Sequential` model by stacking layers and choosing an optimizer and loss function for training.

Keras is a library running on top of TensorFlow that hides many of TensorFlow's internal details. It has been integrated into TensorFlow's core and is the preferred way to develop deep learning models quickly.

In this example we will stack 2 dense layers, that is, layers in which every neuron is connected to all the outputs of the previous layer.

1. First we flatten each input image, which has shape 28 x 28, into a vector of 784 values.
2. Then a dense layer with 128 output neurons is added, with a [ReLU activation](https://en.wikipedia.org/wiki/Rectifier_(neural_networks)) to introduce non-linearity.
3. A dropout layer randomly sets 20% of its inputs to zero during training, which helps prevent overfitting.
4. The last layer is a dense layer with 10 output neurons, one per class, followed by a softmax activation. The softmax normalizes the outputs into a probability distribution: each output has a value between 0 and 1, and all outputs sum to 1.
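
The softmax normalization described in the last step can be sketched in plain Python. This is only an illustration: the function `softmax` and the example scores are made up here, and Keras applies its own implementation inside the layer.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; it does not
    # change the result because softmax is invariant to a constant shift.
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for a 3-class problem (MNIST would have 10).
probs = softmax([2.0, 1.0, 0.1])
print(probs)       # three values, each between 0 and 1
print(sum(probs))  # 1.0 up to floating-point rounding
```

The largest raw score always receives the largest probability, so the ranking of the classes is preserved.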

In [0]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

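The loss function is the quantity that training minimizes; in this case it is the cross-entropy
\begin{equation}
CE = -\sum_{i=1}^{C} t_i \log s_i
\end{equation}
where $t_i$ is the ground truth (1 for the correct class, 0 otherwise), $s_i$ are the outputs of the model (scores), and $C$ is the number of classes (10 for MNIST). The `sparse_categorical_crossentropy` variant takes the ground truth as an integer class index rather than as a one-hot vector.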
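
To make the cross-entropy concrete, here is a small worked sketch in plain Python, written with the conventional leading minus sign so that the loss is non-negative. The function name `cross_entropy` and the example probabilities are made up for illustration; Keras computes the loss internally and averages it over the batch.

```python
import math

def cross_entropy(target, scores):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Terms with t_i = 0 contribute nothing, so only the true class matters.
    return -sum(t * math.log(s) for t, s in zip(target, scores) if t > 0)

# Hypothetical 3-class example: the true class is index 1.
target = [0, 1, 0]
confident = [0.05, 0.90, 0.05]   # most probability on the true class
unsure = [0.40, 0.30, 0.30]      # little probability on the true class

print(cross_entropy(target, confident))  # about 0.105
print(cross_entropy(target, unsure))     # about 1.204
```

The loss shrinks toward 0 as the probability assigned to the true class approaches 1, which is why minimizing it pushes the model toward correct predictions.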

The loss function will be minimized using the `adam` optimizer, which takes care of updating the weights of your model in order to reduce the loss.

The metric we will use to measure the progress of the training is the accuracy, which is the number of correctly classified images divided by the total number of images.
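
As a rough sketch of how the accuracy is obtained, the snippet below compares predicted labels with true labels. The helper `accuracy` and the label lists are hypothetical and are not part of the Keras API.

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

# Hypothetical labels for five images; one prediction is wrong.
y_true = [7, 2, 1, 0, 4]
y_pred = [7, 2, 1, 0, 9]
print(accuracy(y_true, y_pred))  # 0.8
```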

In [0]:
model.build(input_shape=(None, 28, 28))
model.summary()
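
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below uses a hypothetical helper, `dense_params`, and relies on the fact that a dense layer has one weight per input-unit pair plus one bias per unit, while the flatten and dropout layers have no trainable parameters.

```python
def dense_params(n_inputs, n_units):
    """Weights (one per input-unit pair) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(28 * 28, 128)  # 784 * 128 + 128
output = dense_params(128, 10)       # 128 * 10 + 10
print(hidden, output, hidden + output)  # 100480 1290 101770
```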

## Train model

We create a TensorBoard callback in order to visualize the training and the model.

In [0]:
logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
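
The log directory is named after the current time, so each run is written to its own subdirectory and runs can be told apart in TensorBoard. A small standalone sketch of how that path is built, using only the standard library:

```python
import datetime
import os

# strftime formats the current time as YYYYMMDD-HHMMSS, so every run
# gets its own subdirectory under "logs".
stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
logdir = os.path.join("logs", stamp)
print(logdir)  # e.g. logs/20240131-093015 on Linux (the value depends on the clock)
```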

Train the model for 5 epochs:

In [0]:
model.fit(x=x_train, 
          y=y_train, 
          epochs=5,
          # validation_data=(x_test, y_test), 
          callbacks=[tensorboard_callback])
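
Since `batch_size` is not passed, `model.fit` falls back to the Keras default of 32 samples per batch. Here is a quick sketch of the resulting number of weight updates, assuming the standard 60,000-image MNIST training set:

```python
import math

n_train = 60_000   # size of the MNIST training set
batch_size = 32    # Keras default when batch_size is not given
epochs = 5

steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)           # 1875
print(steps_per_epoch * epochs)  # 9375
```

The first number should match the step count shown in the progress bar for each epoch.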

Evaluate the model:

In [0]:
model.evaluate(x_test,  y_test, verbose=2)

## TensorBoard visualization

In [0]:
%tensorboard --logdir logs

In [0]:
# Control TensorBoard display. If no port is provided,
# the most recently launched TensorBoard is used.
# from tensorboard import notebook
# notebook.display(port=6006, height=1000)

## Predict

In [0]:
img_to_predict = x_test[42]

plt.imshow(img_to_predict, cmap='gray')

The network outputs a probability distribution, so to find the most probable class we take the index of the highest probability (`argmax`).

In [0]:
np.argmax(model.predict(img_to_predict[None]))
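
The two steps in this cell, adding a batch axis and taking the `argmax`, can be sketched without a trained model. The arrays below are made up for illustration; `probs` stands in for what `model.predict` might return for a single image.

```python
import numpy as np

# Hypothetical 28 x 28 image; indexing with None adds a leading batch axis,
# because Keras models expect a batch of inputs.
img = np.zeros((28, 28))
print(img[None].shape)  # (1, 28, 28)

# Hypothetical model output for a batch of one image over 10 classes.
probs = np.array([[0.01, 0.02, 0.02, 0.05, 0.60, 0.05, 0.05, 0.10, 0.05, 0.05]])
print(np.argmax(probs))  # 4
```

For a batch with several images, `np.argmax(probs, axis=-1)` would give one predicted class per image.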

# Exercise with a convolutional model
Try to change the model to use convolutional layers (`tf.keras.layers.Conv2D`) combined with `tf.keras.layers.MaxPooling2D`.

An example model could be (`n` is the number of filters or units, `k` the kernel size):
- conv [n=8, k=3]
- conv [n=16, k=3]
- maxpool
- conv [n=32, k=3]
- maxpool
- dense [n=10]
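
To see how the spatial size of the image evolves through this architecture, the sketch below traces it layer by layer. It assumes the Keras defaults: no padding and stride 1 for `Conv2D`, and a 2 x 2 window for `MaxPooling2D`. The helpers `conv_out` and `pool_out` are hypothetical names.

```python
def conv_out(size, k):
    """Output width/height of a convolution with no padding and stride 1."""
    return size - k + 1

def pool_out(size, pool=2):
    """Output width/height of max pooling with a pool x pool window."""
    return size // pool

size = 28                 # MNIST images are 28 x 28
size = conv_out(size, 3)  # conv [n=8,  k=3] -> 26
size = conv_out(size, 3)  # conv [n=16, k=3] -> 24
size = pool_out(size)     # maxpool          -> 12
size = conv_out(size, 3)  # conv [n=32, k=3] -> 10
size = pool_out(size)     # maxpool          -> 5
print(size, size * size * 32)  # 5 800
```

Under these assumptions the final feature map is 5 x 5 with 32 channels, so the flatten layer produces 800 values for the dense layer.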

In [0]:
model = tf.keras.models.Sequential([
  tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1)), # add a channel dimension
  # todo : convolutional layers
  tf.keras.layers.Flatten(input_shape=(5, 5, 32)), # flatten the input to pass it to the dense layers
  # todo : dense layers + softmax activation at the very end
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Compare the number of parameters needed for the convolutional model and the dense model.
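
As a starting point for the comparison, the parameter count of a convolutional layer can be computed by hand. The helper `conv2d_params` below is a hypothetical name; it relies on the fact that each filter has one weight per kernel position and input channel, plus one bias.

```python
def conv2d_params(k, in_channels, n_filters):
    """Each filter has k * k * in_channels weights plus one bias."""
    return (k * k * in_channels + 1) * n_filters

# First layer of the example architecture: 8 filters of size 3 x 3
# applied to a single-channel (grayscale) image.
print(conv2d_params(3, 1, 8))  # 80
```

Unlike a dense layer, the count does not depend on the image size, because the same filters are reused at every position.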

In [0]:
# TODO complete