
# Week 2: Implementing Callbacks in TensorFlow using the MNIST Dataset

In the course you learned how to classify images using Fashion MNIST, a dataset of clothing items. A similar dataset, MNIST, contains images of handwritten digits, 0 through 9.

Write an MNIST classifier that trains until it reaches 99% accuracy and then stops. In the lecture you saw how this was done by monitoring the loss; here you will monitor accuracy instead.

Some notes:
1. Your network should succeed in fewer than 9 epochs.
2. When it reaches 99% or greater it should print out the string "Reached 99% accuracy so cancelling training!" and stop training.
3. If you add any additional variables, make sure you use the same names as the ones used in the class. This is important for the function signatures (the parameters and names) of the callbacks.

In [2]:
import tensorflow as tf
from tensorflow import keras
import os  # Used to build the path to the dataset file

## Loading and inspecting the data

In [12]:
# Build the full path to mnist.npz inside the current working directory
current_dir = os.getcwd()
data_path = os.path.join(current_dir, "mnist.npz")

# Get the training set and discard the test set
(x_train, y_train), _ = keras.datasets.mnist.load_data(path=data_path)

# Scale the pixel values from the range [0, 255] to [0, 1]
x_train = x_train / 255.0


In [13]:
# grader-required-cell

data_shape = x_train.shape

print(f"There are {data_shape[0]} examples with shape ({data_shape[1]}, {data_shape[2]})")

There are 60000 examples with shape (28, 28)


## Defining a callback

In [14]:
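class myCallback(keras.callbacks.Callback):
  """Stops training once the training accuracy reaches 99%."""

  def on_epoch_end(self, epoch, logs=None):
    # `logs` holds this epoch's metrics; guard against a missing dict or key
    accuracy = (logs or {}).get("accuracy")
    if accuracy is not None and accuracy >= 0.99:
      print("Reached 99% accuracy so cancelling training!")
      self.model.stop_training = True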
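
To see the stopping rule in isolation, the sketch below replays a made-up sequence of per-epoch accuracies through the same kind of check, without TensorFlow. `FakeModel`, `ThresholdStopper`, `run_epochs`, and the accuracy values are hypothetical names and numbers chosen for illustration; they only mimic the way Keras calls `on_epoch_end` after each epoch and then checks `stop_training`. The sketch uses "99% or greater" as the threshold, as stated in the notes.

```python
class FakeModel:
    """Hypothetical stand-in for a Keras model; it only carries the stop flag."""

    def __init__(self):
        self.stop_training = False


class ThresholdStopper:
    """Sets the stop flag once accuracy reaches the threshold."""

    def __init__(self, model, threshold=0.99):
        self.model = model
        self.threshold = threshold

    def on_epoch_end(self, epoch, logs=None):
        accuracy = (logs or {}).get("accuracy")
        if accuracy is not None and accuracy >= self.threshold:
            self.model.stop_training = True


def run_epochs(accuracies, threshold=0.99):
    """Simulate a training loop and return the number of epochs actually run."""
    model = FakeModel()
    stopper = ThresholdStopper(model, threshold)
    epochs_run = 0
    for epoch, acc in enumerate(accuracies):
        epochs_run += 1
        stopper.on_epoch_end(epoch, logs={"accuracy": acc})
        if model.stop_training:
            break
    return epochs_run


# Illustrative accuracies only, not results from this notebook
print(run_epochs([0.92, 0.97, 0.985, 0.991, 0.995]))  # 4
```

The loop ends after the fourth simulated epoch, the first one whose accuracy is at least 0.99, even though a fifth value was available.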


## Creating our DNN model

In [17]:
x_train.shape

(60000, 28, 28)

In [18]:
def train_mnist(x_train, y_train):

  # Instantiate the callback
  mycallback = myCallback()

  # Design our model
  model = keras.models.Sequential([
      keras.layers.Flatten(),
      keras.layers.Dense(128, activation="relu"),
      keras.layers.Dense(10, activation="softmax")
  ])

  # Compile our model
  model.compile(optimizer="adam",
                loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])

  # Fit the model for up to ten epochs (the callback may stop it earlier) and save the history
  history = model.fit(x_train, y_train, epochs=10, callbacks=[mycallback])

  return history
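
As a rough size check on the dense model above, the sketch below counts its trainable parameters by hand. It assumes each `Dense` layer has one weight per input-unit pair plus one bias per unit (the Keras default), and `dense_params` is a hypothetical helper written for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


flattened = 28 * 28                      # Flatten turns each 28x28 image into 784 values
hidden = dense_params(flattened, 128)    # Dense(128, relu)
output = dense_params(128, 10)           # Dense(10, softmax)
total = hidden + output

print(hidden, output, total)  # 100480 1290 101770
```

Calling `model.summary()` on a built model should report the same layer-by-layer counts.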



In [19]:
hist = train_mnist(x_train, y_train)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10


## Creating an alternative CNN model

In [28]:
def train_mnist_cnn(x_train, y_train):

  # initialise our callback class
  my_callback = myCallback()

  # Conv2D layers expect a channels dimension, so reshape x_train
  # from (N, 28, 28) to (N, 28, 28, 1)
  x_train = x_train.reshape(-1, 28, 28, 1)
  # One-hot encode the labels to match the categorical_crossentropy loss
  y_train = keras.utils.to_categorical(y_train, num_classes=10)

  # Design our model
  model = keras.models.Sequential([
      keras.layers.Conv2D(16, (3,3), activation="relu", input_shape=(28,28,1)),
      keras.layers.MaxPooling2D(2,2),
      keras.layers.Conv2D(32, (3,3), activation="relu"),
      keras.layers.MaxPooling2D(2,2),
      keras.layers.Conv2D(32, (3,3), activation="relu"),

      keras.layers.Flatten(),
      keras.layers.Dense(32, activation="relu"),
      keras.layers.Dense(10, activation="softmax")
  ])

  # Compile our model
  model.compile(
      optimizer="adam",
      loss="categorical_crossentropy",
      metrics=["accuracy"]
  )

  # Fit our model to the data for up to 10 epochs (the callback may stop it earlier)
  history = model.fit(x_train, y_train, epochs=10, callbacks=[my_callback])

  return history
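
To make the layer stack above easier to follow, the sketch below traces the spatial size of a 28x28 image through the convolution and pooling layers. It assumes the Keras defaults used in the model: 3x3 convolutions with stride 1 and `"valid"` padding, and 2x2 max pooling with stride 2. `conv_valid` and `pool` are hypothetical helpers for this illustration.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with no padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Output size of non-overlapping pooling; odd sizes round down."""
    return size // window


size = 28
trace = [size]
# Same order as the model: Conv2D, MaxPooling2D, Conv2D, MaxPooling2D, Conv2D
for step in ("conv", "pool", "conv", "pool", "conv"):
    size = conv_valid(size) if step == "conv" else pool(size)
    trace.append(size)

# The last Conv2D has 32 filters, so Flatten yields size * size * 32 features
flattened_features = size * size * 32

print(trace)               # [28, 26, 13, 11, 5, 3]
print(flattened_features)  # 288
```

Under these assumptions, the final `Dense(32)` layer receives 288 features per image, compared with 784 in the fully connected model.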

In [29]:
hist_cnn = train_mnist_cnn(x_train, y_train)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
