<a href="https://colab.research.google.com/github/zhenya-mamenko/mini-ML-piscine/blob/master/multi_class_classification_of_handwritten_digits_with_tf2_and_keras_plus_tensorboard.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#### Copyright 2017 Google LLC., 2019 Zhenya Mamenko
This notebook is based on the [Classifying Handwritten Digits with Neural Networks](https://colab.research.google.com/notebooks/mlcc/multi-class_classification_of_handwritten_digits.ipynb?utm_source=zhenya-mamenko&utm_campaign=colab-external&utm_medium=referral&utm_content=multiclass-colab&hl=en) exercise from the [Google Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/).

In [0]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Classifying Handwritten Digits with Neural Networks

![img](https://www.tensorflow.org/versions/r0.11/images/MNIST.png)

**Learning Objectives:**
  * Train both a linear model and a neural network to classify handwritten digits from the classic [MNIST](http://yann.lecun.com/exdb/mnist/) data set
  * Compare the performance of the linear and neural network classification models
  * Visualize the weights of a neural-network hidden layer

Our goal is to map each input image to the correct numeric digit. We will create a NN with a few hidden layers and a Softmax layer at the top to select the winning class.
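
To make the role of the Softmax layer concrete, here is a minimal NumPy sketch (not part of the original exercise) of how raw class scores become probabilities and how the winning class is picked. The function name `softmax` and the scores in `example_logits` are made up for illustration.

```python
import numpy as np

def softmax(logits):
    """Converts raw scores into probabilities that sum to 1."""
    # Subtract the max for numerical stability; this does not change the result.
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

# Hypothetical raw scores for the ten digit classes 0-9.
example_logits = np.array([0.1, 0.3, 0.2, 4.0, 0.0, 1.5, 0.2, 0.1, 0.9, 0.4])
example_probs = softmax(example_logits)

# The "winning" class is the one with the highest probability.
predicted_digit = int(np.argmax(example_probs))
print(predicted_digit)
```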

## Setup

First, let's download the data set, import TensorFlow and other utilities, and load the data into a *pandas* `DataFrame`. Note that this data is a sample of the original MNIST training data; we've taken 20000 rows at random.

In [0]:
import glob
import math
import os

from matplotlib import cm
from matplotlib import gridspec
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn import metrics
import logging
from IPython.display import display
pd.options.display.max_rows = 10
pd.options.display.float_format = '{:.1f}'.format
logging.getLogger('tensorflow').disabled = True

!pip install -q tensorflow==2.0.0-beta1

import tensorflow as tf
    
%load_ext tensorboard

from datetime import datetime
import io
logging.getLogger('tensorboard').disabled = True

mnist_dataframe = pd.read_csv(
  "https://download.mlcc.google.com/mledu-datasets/mnist_train_small.csv",
  sep=",",
  header=None)

# Use just the first 10,000 records for training/validation.
mnist_dataframe = mnist_dataframe.head(10000)

mnist_dataframe = mnist_dataframe.reindex(np.random.permutation(mnist_dataframe.index))
mnist_dataframe.head()

Each row represents one labeled example. Column 0 represents the label that a human rater has assigned for one handwritten digit. For example, if Column 0 contains '6', then a human rater interpreted the handwritten character as the digit '6'.  The ten digits 0-9 are each represented, with a unique class label for each possible digit. Thus, this is a multi-class classification problem with 10 classes.

![img](https://www.tensorflow.org/versions/r0.11/images/MNIST-Matrix.png)

Columns 1 through 784 contain the feature values, one per pixel for the 28×28=784 pixel values. The pixel values are on a gray scale in which 0 represents white, 255 represents black, and values between 0 and 255 represent shades of gray. Most of the pixel values are 0; you may want to take a minute to confirm that they aren't all 0. For example, adjust the following code cell to print out the values in column 72.

In [0]:
mnist_dataframe.loc[:, 72:72]
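
Rather than inspecting one column at a time, you could count the non-zero entries in every pixel column at once. The sketch below uses a small synthetic frame (`toy_frame`, an assumption standing in for `mnist_dataframe`) so that it runs without downloading anything; the same expression should work on the real data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `mnist_dataframe`: 5 rows, label in column 0,
# then 784 pixel columns that are mostly zero.
rng = np.random.RandomState(0)
toy_pixels = np.zeros((5, 784), dtype=int)
toy_pixels[:, 400:410] = rng.randint(1, 256, size=(5, 10))
toy_labels_column = rng.randint(0, 10, size=(5, 1))
toy_frame = pd.DataFrame(np.hstack([toy_labels_column, toy_pixels]))

# Count how many rows have a non-zero value in each pixel column.
nonzero_counts = (toy_frame.loc[:, 1:784] != 0).sum()

# Show only the columns that contain at least one non-zero pixel.
print(nonzero_counts[nonzero_counts > 0])
```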

Now, let's parse out the labels and features and look at a few examples. Note the use of `loc` which allows us to pull out columns based on original location, since we don't have a header row in this data set.

In [0]:
def parse_labels_and_features(dataset):
  """Extracts labels and features.
  
  This is a good place to scale or transform the features if needed.
  
  Args:
    dataset: A Pandas `DataFrame`, containing the label in the first column and
      monochrome pixel values in the remaining columns, in row-major order.
  Returns:
    A `tuple` `(labels, features)`:
      labels: A Pandas `Series`.
      features: A Pandas `DataFrame`.
  """
  labels = dataset[0]

  # DataFrame.loc index ranges are inclusive at both ends.
  features = dataset.loc[:,1:784]
  # Scale the data to [0, 1] by dividing out the max value, 255.
  features = features / 255

  return labels, features
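
The inclusive slicing behavior of `loc` is easy to get wrong, so here is a tiny illustration on a made-up frame (`toy` is hypothetical, with one label column and three "pixel" columns). It applies the same label/feature split and the same division by 255 as the function above.

```python
import pandas as pd

# Column 0 is the label; columns 1-3 play the role of pixel values.
toy = pd.DataFrame([[7, 0, 128, 255],
                    [2, 255, 0, 51]])

toy_labels = toy[0]

# `.loc` slices by label and includes both endpoints, so 1:3 selects
# columns 1, 2 and 3 (unlike Python's half-open ranges).
toy_features = toy.loc[:, 1:3] / 255

print(toy_features.shape)
```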

In [0]:
training_targets, training_examples = parse_labels_and_features(mnist_dataframe[:7500])
training_examples.describe()

In [0]:
validation_targets, validation_examples = parse_labels_and_features(mnist_dataframe[7500:10000])
validation_examples.describe()

Show a random example and its corresponding label.

In [0]:
def plot_to_image(figure):
  """Converts the matplotlib plot specified by 'figure' to a PNG image and
  returns it. The supplied figure is closed and inaccessible after this call."""
  # Save the plot to a PNG in memory.
  buf = io.BytesIO()
  plt.savefig(buf, format='png')
  # Closing the figure prevents it from being displayed directly inside
  # the notebook.
  plt.close(figure)
  buf.seek(0)
  # Convert PNG buffer to TF image
  image = tf.image.decode_png(buf.getvalue(), channels=4)
  # Add the batch dimension
  image = tf.expand_dims(image, 0)
  return image

In [0]:
!rm -rf logs/multi-class_classification_of_handwritten_digits

In [0]:
rand_example = np.random.choice(training_examples.index)
# Create the figure and its axes together so that `figure` is the figure
# that actually contains the plot.
figure, ax = plt.subplots()
ax.matshow(training_examples.loc[rand_example].values.reshape(28, 28))
ax.set_title("Label: %i" % training_targets.loc[rand_example])
ax.grid(False)
logdir="logs/multi-class_classification_of_handwritten_digits/plots" + datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir)
with file_writer.as_default():
  tf.summary.image("Example",
                   plot_to_image(figure),
                   step=0)
plt.close()

In [0]:
%tensorboard --logdir logs/multi-class_classification_of_handwritten_digits

## Task 1: Build a Linear Model for MNIST

First, let's create a baseline linear classifier model to compare against.

You'll notice that in addition to reporting accuracy, and plotting loss over time, we also display a [**confusion matrix**](https://en.wikipedia.org/wiki/Confusion_matrix).  The confusion matrix shows which classes were misclassified as other classes. Which digits get confused for each other?

Also note that we track the model's error using the [`sparse_categorical_crossentropy`](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/losses/sparse_categorical_crossentropy) function.
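
As a rough illustration of what this loss measures, the NumPy sketch below computes the mean of the negative log of the probability assigned to the true class. It is a simplified stand-in, not the Keras implementation (which, among other things, guards against `log(0)`); the function name `sparse_categorical_crossentropy_np` and the toy probabilities are assumptions.

```python
import numpy as np

def sparse_categorical_crossentropy_np(true_labels, predicted_probs):
    """Mean of -log(probability assigned to the true class)."""
    true_labels = np.asarray(true_labels)
    predicted_probs = np.asarray(predicted_probs, dtype=float)
    # Pick, for each example, the predicted probability of its true class.
    picked = predicted_probs[np.arange(len(true_labels)), true_labels]
    return float(np.mean(-np.log(picked)))

# Two hypothetical examples over three classes.
toy_probs = [[0.7, 0.2, 0.1],
             [0.1, 0.1, 0.8]]

# Labels that match the most probable class give a low loss...
confident_loss = sparse_categorical_crossentropy_np([0, 2], toy_probs)
# ...while labels that the model considers unlikely give a high loss.
wrong_loss = sparse_categorical_crossentropy_np([1, 0], toy_probs)
print(confident_loss, wrong_loss)
```

The labels are passed as integer class indices rather than one-hot vectors, which is what "sparse" refers to.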

In [0]:
def construct_feature_columns():
  """Construct the TensorFlow Feature Columns.

  Returns:
    A set of feature columns
  """ 
  
  # There are 784 pixels in each image.
  return set([tf.feature_column.numeric_column('pixels', shape=784)])

In [0]:
def fit_linear_classification_model(
    learning_rate,
    steps_per_epoch,
    batch_size,
    training_examples,
    training_targets,
    validation_examples,
    validation_targets):
  """Trains a linear classifier model.
  
  In addition to training, this function prints the training loss after each
  epoch and logs scalars, histograms, and a confusion-matrix image to TensorBoard.
  
  Args:
    learning_rate: A `float`, the learning rate.
    steps_per_epoch: A non-zero `int`, the number of training steps per epoch.
      A training step consists of a forward and backward pass using a single batch.
    batch_size: A non-zero `int`, the batch size.
    training_examples: A `DataFrame` containing the scaled pixel columns of
      `mnist_dataframe` to use as input features for training.
    training_targets: A `Series` containing the label column of
      `mnist_dataframe` to use as target for training.
    validation_examples: A `DataFrame` containing the scaled pixel columns of
      `mnist_dataframe` to use as input features for validation.
    validation_targets: A `Series` containing the label column of
      `mnist_dataframe` to use as target for validation.

  Returns:
    The trained model.
  """
  epochs = 10
  
  model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(10, activation='softmax')
  ])
  model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=learning_rate, clipnorm=5.0),
              loss=tf.keras.losses.sparse_categorical_crossentropy,
              metrics=['accuracy'])

  def log(epoch, logs):
    log_loss = logs["loss"]
    print("  epoch %02d : %0.2f" % (epoch, log_loss))

  model_callback = tf.keras.callbacks.LambdaCallback(
      on_epoch_end=lambda epoch, logs: log(epoch, logs))
  logdir="logs/multi-class_classification_of_handwritten_digits/"
  dt = datetime.now().strftime("%Y%m%d-%H%M%S")
  tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir + "scalars" + dt,
                                                        histogram_freq=1,
                                                        update_freq='epoch')
  
  print("Train model...")
  print("LogLoss (on training data):")
  history = model.fit(training_examples.values,
            training_targets.values,
            validation_data=(validation_examples.values, validation_targets.values),
            epochs=epochs,
            steps_per_epoch=steps_per_epoch,
            batch_size=batch_size,
            callbacks=[model_callback, tensorboard_callback],
            verbose=0)
  print("Model training finished.")
  accuracy = history.history["val_accuracy"][epochs - 1]
  print("Final accuracy (on validation data): %0.2f" % accuracy)
  final_predictions = model.predict_on_batch(validation_examples.values)
  final_predictions = [np.argmax(r) for r in final_predictions]

  # Output a plot of the confusion matrix.
  cm = metrics.confusion_matrix(validation_targets, final_predictions)
  # Normalize the confusion matrix by row (i.e by the number of samples
  # in each class).
  figure = plt.figure()
  cm_normalized = cm.astype("float") / cm.sum(axis=1)[:, np.newaxis]
  ax = sns.heatmap(cm_normalized, cmap="bone_r")
  ax.set_aspect(1)
  plt.title("Confusion matrix")
  plt.ylabel("True label")
  plt.xlabel("Predicted label")
  file_writer = tf.summary.create_file_writer(logdir+ "plots" + dt)
  with file_writer.as_default():
    tf.summary.image("Confusion matrix",
                     plot_to_image(figure),
                     step=0)
  plt.close()
  return model
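
The row normalization used for the confusion matrix can be seen on a small example. The sketch below builds the count matrix with plain NumPy instead of `metrics.confusion_matrix` so that it is self-contained; `confusion_matrix_np` and the toy labels are hypothetical.

```python
import numpy as np

def confusion_matrix_np(true_labels, predicted_labels, num_classes):
    """Counts examples by (true class, predicted class)."""
    counts = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at handles repeated index pairs correctly.
    np.add.at(counts, (np.asarray(true_labels), np.asarray(predicted_labels)), 1)
    return counts

toy_true = [0, 0, 1, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 2, 2]
toy_cm = confusion_matrix_np(toy_true, toy_pred, num_classes=3)

# Divide each row by its total so every row sums to 1.
toy_cm_normalized = toy_cm.astype(float) / toy_cm.sum(axis=1)[:, np.newaxis]
print(toy_cm_normalized)
```

After normalization, each row reads as "of all examples whose true label is this class, what fraction was predicted as each class", so the diagonal holds the per-class recall.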

**Spend 5 minutes seeing how well you can do on accuracy with a linear model of this form. For this exercise, limit yourself to experimenting with the hyperparameters for batch size, learning rate and steps.**

Stop if you get anything above about 0.9 accuracy.

In [0]:
classifier = fit_linear_classification_model(
             learning_rate=0.02,
             steps_per_epoch=10,
             batch_size=10,
             training_examples=training_examples,
             training_targets=training_targets,
             validation_examples=validation_examples,
             validation_targets=validation_targets)
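
When tuning these hyperparameters, it can help to estimate how much data the model sees. The sketch below assumes that each training step consumes exactly one batch of `batch_size` examples, as the docstring describes, and that `epochs` stays at the hard-coded value of 10; the helper `examples_seen` is hypothetical.

```python
def examples_seen(steps_per_epoch, batch_size, epochs=10):
    """Total training examples processed, assuming one batch per step."""
    return steps_per_epoch * batch_size * epochs

# Number of rows used for training earlier in this notebook.
training_set_size = 7500

# The starting values from the cell above.
starting_total = examples_seen(steps_per_epoch=10, batch_size=10)
print(starting_total, starting_total / training_set_size)
```

Under these assumptions, the starting configuration processes far fewer examples than the training set contains, which is one reason to try larger values.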

In [0]:
%tensorboard --logdir logs/multi-class_classification_of_handwritten_digits

### Solution

Click below for one possible solution.

Here is a set of parameters that should attain roughly 0.9 accuracy.

In [0]:
classifier = fit_linear_classification_model(
             learning_rate=0.03,
             steps_per_epoch=100,
             batch_size=100,
             training_examples=training_examples,
             training_targets=training_targets,
             validation_examples=validation_examples,
             validation_targets=validation_targets)

## Task 2: Replace the Linear Classifier with a Neural Network

**Replace the linear classifier above with a neural network with multiple layers and find a parameter combination that gives 0.95 or better accuracy.**

You may wish to experiment with additional regularization methods, such as a [`Dropout`](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/Dropout) layer.
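
To give an intuition for what dropout does during training, here is a NumPy sketch of "inverted" dropout: a random fraction of activations is zeroed and the survivors are scaled up so that the expected value stays the same. This is an illustration under simplifying assumptions, not the Keras layer itself; `inverted_dropout` and the toy activations are made up.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zeroes a random fraction `rate` of activations and rescales the rest."""
    keep_mask = rng.uniform(size=activations.shape) >= rate
    # Scaling by 1 / (1 - rate) keeps the expected activation unchanged.
    return activations * keep_mask / (1.0 - rate)

rng = np.random.RandomState(42)
toy_activations = np.ones((1000, 100))
dropped = inverted_dropout(toy_activations, rate=0.2, rng=rng)

# The mean stays close to 1, while roughly 20% of the entries are zero.
print(dropped.mean(), (dropped == 0).mean())
```

At inference time no units are dropped, which is why the rescaling is done during training.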

In [0]:
#
# YOUR CODE HERE: Replace the linear classifier with a neural network.
#

Once you have a good model, double check that you didn't overfit the validation set by evaluating on the test data that we'll load below.


In [0]:
mnist_test_dataframe = pd.read_csv(
  "https://download.mlcc.google.com/mledu-datasets/mnist_test.csv",
  sep=",",
  header=None)

test_targets, test_examples = parse_labels_and_features(mnist_test_dataframe)
test_examples.describe()

In [0]:
#
# YOUR CODE HERE: Calculate accuracy on the test set.
#

In [0]:
%tensorboard --logdir logs/multi-class_classification_of_handwritten_digits

### Solution

Click below for a possible solution.

The code below is almost identical to the original linear classifier training code, with the exception of the NN-specific configuration, such as the hyperparameter for hidden units.

In [0]:
def fit_nn_classification_model(
    learning_rate,
    steps_per_epoch,
    batch_size,
    hidden_units,
    training_examples,
    training_targets,
    validation_examples,
    validation_targets):
  """Trains a neural network classification model.
  
  In addition to training, this function prints the training loss after each
  epoch and logs scalars, histograms, and a confusion-matrix image to TensorBoard.
  
  Args:
    learning_rate: A `float`, the learning rate.
    steps_per_epoch: A non-zero `int`, the number of training steps per epoch.
      A training step consists of a forward and backward pass using a single batch.
    batch_size: A non-zero `int`, the batch size.
    hidden_units: A `list` of int values, specifying the number of neurons in each layer.
    training_examples: A `DataFrame` containing the scaled pixel columns of
      `mnist_dataframe` to use as input features for training.
    training_targets: A `Series` containing the label column of
      `mnist_dataframe` to use as target for training.
    validation_examples: A `DataFrame` containing the scaled pixel columns of
      `mnist_dataframe` to use as input features for validation.
    validation_targets: A `Series` containing the label column of
      `mnist_dataframe` to use as target for validation.
      
  Returns:
    The trained model.
  """

  epochs = 10
  
  # Create a Sequential model.
  model = tf.keras.Sequential()
  for u in hidden_units:
    model.add(tf.keras.layers.Dense(u, activation='relu'))
  model.add(tf.keras.layers.Dense(10, activation='softmax'))           
  model.compile(optimizer=tf.keras.optimizers.Adagrad(learning_rate=learning_rate, clipnorm=5.0),
                loss=tf.keras.losses.sparse_categorical_crossentropy,
                metrics=['accuracy'])
  
  def log(epoch, logs):
    log_loss = logs["loss"]
    print("  epoch %02d : %0.2f" % (epoch, log_loss))
                         
  model_callback = tf.keras.callbacks.LambdaCallback(
      on_epoch_end=lambda epoch, logs: log(epoch, logs))
  logdir="logs/multi-class_classification_of_handwritten_digits/"
  dt = datetime.now().strftime("%Y%m%d-%H%M%S")
  tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir + "scalars" + dt,
                                                        histogram_freq=1,
                                                        update_freq='epoch')
  print("Train model...")
  print("LogLoss (on training data):")
  history = model.fit(training_examples.values,
            training_targets.values,
            validation_data=(validation_examples.values, validation_targets.values),
            epochs=epochs,
            steps_per_epoch=steps_per_epoch,
            batch_size=batch_size,
            callbacks=[model_callback, tensorboard_callback],
            verbose=0)
  accuracy = history.history["val_accuracy"][epochs - 1]
  print("Final accuracy (on validation data): %0.2f" % accuracy)
  final_predictions = model.predict_on_batch(validation_examples.values)
  final_predictions = [np.argmax(r) for r in final_predictions]
  
  # Output a plot of the confusion matrix.
  cm = metrics.confusion_matrix(validation_targets, final_predictions)
  # Normalize the confusion matrix by row (i.e by the number of samples
  # in each class).
  figure = plt.figure()
  cm_normalized = cm.astype("float") / cm.sum(axis=1)[:, np.newaxis]
  ax = sns.heatmap(cm_normalized, cmap="bone_r")
  ax.set_aspect(1)
  plt.title("Confusion matrix")
  plt.ylabel("True label")
  plt.xlabel("Predicted label")
  file_writer = tf.summary.create_file_writer(logdir+ "plots" + dt)
  with file_writer.as_default():
    tf.summary.image("Confusion matrix",
                     plot_to_image(figure),
                     step=0)
  plt.close()

  return model

In [0]:
classifier = fit_nn_classification_model(
    learning_rate=0.05,
    steps_per_epoch=100,
    batch_size=30,
    hidden_units=[100, 100],
    training_examples=training_examples,
    training_targets=training_targets,
    validation_examples=validation_examples,
    validation_targets=validation_targets)

Next, we verify the accuracy on the test set.

In [0]:
mnist_test_dataframe = pd.read_csv(
  "https://download.mlcc.google.com/mledu-datasets/mnist_test.csv",
  sep=",",
  header=None)

test_targets, test_examples = parse_labels_and_features(mnist_test_dataframe)
test_examples.describe()

In [0]:
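# Use a name other than `metrics` so that the imported `sklearn.metrics`
# module is not shadowed; the training functions above still need it.
test_metrics = classifier.evaluate(test_examples.values, test_targets.values, verbose=0)

# evaluate() returns [loss, accuracy] because the model was compiled
# with metrics=['accuracy'].
accuracy = test_metrics[1]
print("Accuracy on test data: %0.2f" % accuracy)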

## Task 3: Visualize the Weights of the First Hidden Layer

Let's take a few minutes to dig into our neural network and see what it has learned by accessing the `weights` attribute of our model.

Each input to our model has `784` values, one per pixel of the `28×28` input image. The first hidden layer therefore has `784×N` weights, where `N` is the number of nodes in that layer. We can turn those weights back into `28×28` images by *reshaping* each of the `N` arrays of `784` weights into an array of size `28×28`.
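
The reshaping step can be tried on its own with a made-up weight matrix. In the sketch below, `toy_weights` is a hypothetical `784×N` array with `N = 3`, laid out so that each column holds the incoming weights of one node.

```python
import numpy as np

# Hypothetical weight matrix for a first hidden layer with N = 3 nodes.
toy_num_nodes = 3
toy_weights = np.arange(784 * toy_num_nodes, dtype=float).reshape(784, toy_num_nodes)

# Transpose so each row holds one node's 784 incoming weights,
# then view each row as a 28x28 image.
weight_images = toy_weights.T.reshape(toy_num_nodes, 28, 28)
print(weight_images.shape)
```

Each `weight_images[i]` can then be passed to a function such as `matshow` to display the weights of node `i` as an image.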

Run the following cell to plot the weights. Note that this cell requires that a neural network model called `classifier` has already been trained.

In [0]:
weights0 = classifier.weights[0].numpy()
print("weights0 shape: {}".format(weights0.shape))

num_nodes = weights0.shape[1]
num_rows = int(math.ceil(num_nodes / 10.0))
figure, axes = plt.subplots(num_rows, 10, figsize=(20, 2 * num_rows))
for coef, ax in zip(weights0.T, axes.ravel()):
    # Weights in coef is reshaped from 1x784 to 28x28.
    ax.matshow(coef.reshape(28, 28), cmap=plt.cm.pink)
    ax.set_xticks(())
    ax.set_yticks(())
logdir="logs/multi-class_classification_of_handwritten_digits/plots" + datetime.now().strftime("%Y%m%d-%H%M%S")
file_writer = tf.summary.create_file_writer(logdir)
with file_writer.as_default():
  tf.summary.image("Weights of the first hidden layer",
                   plot_to_image(figure),
                   step=0)
plt.close()

In [0]:
%tensorboard --logdir logs/multi-class_classification_of_handwritten_digits

The first hidden layer of the neural network should be modeling some pretty low level features, so visualizing the weights will probably just show some fuzzy blobs or possibly a few parts of digits.  You may also see some neurons that are essentially noise -- these are either unconverged or they are being ignored by higher layers.

It can be interesting to stop training at different numbers of iterations and see the effect.

**Train the classifier for 10, 100, and 1000 steps, respectively. Then run this visualization again.**

What differences do you see visually for the different levels of convergence?