### 28.04.25, © Dmytro Sokhin KI-21-1, 2025

Source: https://github.com/https-deeplearning-ai/tensorflow-1-public

# Workshop 2 Assignment 1: Improve MNIST with Convolutions

For this exercise, see if you can improve MNIST classification accuracy to 99.5% or more by adding only a single convolutional layer and a single MaxPooling 2D layer to the model from the previous week's assignment.

You should stop training once the accuracy goes above this amount. This should happen in fewer than 10 epochs, so it's OK to hard-code the number of epochs for training, but your training must end once it hits the above metric. If it doesn't, you'll need to redesign your callback.

When 99.5% accuracy has been hit, you should print out the string "Reached 99.5% accuracy so cancelling training!"


In [1]:
import os
import numpy as np
import tensorflow as tf
from tensorflow import keras

## Load the data

Begin by loading the data. A couple of things to notice:

- The file `mnist.npz` is already included in the current workspace under the `data` directory. By default, the `load_data` function from Keras accepts a path relative to `~/.keras/datasets`, but here the file is stored elsewhere, so you need to specify the full path.

- `load_data` returns the train and test sets as the tuples `(x_train, y_train), (x_test, y_test)`, but this exercise needs only the train set, so you can ignore the second tuple.

In [3]:
# Load the data

# Get current working directory
current_dir = os.getcwd() 

data_dir = os.path.join(current_dir, "data")
data_path = os.path.join(data_dir, "mnist.npz")
print(f"Attempting to load data from: {data_path}")

if os.path.exists(data_path):
    with np.load(data_path) as data:
        print("Keys available in the npz file:", data.files) 
        try:
            training_images = data['x_train']
            training_labels = data['y_train'] 
            print("Data loaded successfully using np.load().")
            print("Training images shape:", training_images.shape)
            print("Training labels shape:", training_labels.shape)
        except KeyError as e:
            print(f"Error loading data: {e}")
            print("Please check the available keys printed above and modify the code accordingly.")
            training_images, training_labels = None, None
else:
    print(f"Error: File not found at {data_path}")
    training_images, training_labels = None, None

Attempting to load data from: c:\Users\user\Downloads\Workshop\data\mnist.npz
Keys available in the npz file: ['x_test', 'x_train', 'y_train', 'y_test']
Data loaded successfully using np.load().
Training images shape: (60000, 28, 28)
Training labels shape: (60000,)
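
The key-based access used above can be tried on a throwaway archive. The sketch below is only an illustration: the file name `demo_mnist.npz` and the tiny zero-filled arrays are made up, and it assumes the archive uses the same four keys that the real `mnist.npz` reported.

```python
import os
import tempfile

import numpy as np

# Write a small stand-in archive with the same keys as mnist.npz (hypothetical data)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_mnist.npz")
    np.savez(
        demo_path,
        x_train=np.zeros((6, 28, 28), dtype=np.uint8),
        y_train=np.arange(6, dtype=np.uint8),
        x_test=np.zeros((2, 28, 28), dtype=np.uint8),
        y_test=np.arange(2, dtype=np.uint8),
    )

    # Read back only the training arrays and ignore the test split
    with np.load(demo_path) as data:
        keys = sorted(data.files)
        demo_images = data["x_train"]
        demo_labels = data["y_train"]

print(keys)
print(demo_images.shape, demo_labels.shape)
```

Indexing the archive by key inside the `with` block copies each array into memory, so the arrays stay usable after the file is closed.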


## Pre-processing the data

One important step when dealing with image data is to preprocess it. During the preprocessing step you can apply transformations to the dataset that will be fed into your convolutional neural network.

Here you will apply two transformations to the data:
- Reshape the data so that it has an extra dimension. Image data is commonly represented as 3-dimensional arrays (not counting the batch dimension), where the third dimension holds the color channels, such as RGB values. MNIST images are grayscale, so the third dimension has size 1 and adds no information for the classification process, but it is good practice regardless, and `Conv2D` layers expect a channel axis.

- Normalize the pixel values so that they lie between 0 and 1. You can achieve this by dividing every value in the array by the maximum.

Remember that these tensors are of type `numpy.ndarray` so you can use functions like [reshape](https://numpy.org/doc/stable/reference/generated/numpy.reshape.html) or [divide](https://numpy.org/doc/stable/reference/generated/numpy.divide.html) to complete the `reshape_and_normalize` function below:

In [4]:
def reshape_and_normalize(images):

    ### START CODE HERE

    # Reshape the images to add an extra dimension
    images = images.reshape(-1, 28, 28, 1)
    
    # Normalize pixel values
    images = images.astype('float32') / 255.0
    
    ### END CODE HERE

    return images
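
As a quick sanity check of the two transformations, the sketch below applies the same steps to a small synthetic batch instead of the real MNIST data. The array `fake_images` and the helper name `reshape_and_normalize_sketch` are made up for illustration. It uses `np.expand_dims` as an alternative to `reshape` and divides by the maximum found in the array rather than the constant `255.0`, which should give the same result whenever a pixel with value 255 is present.

```python
import numpy as np

def reshape_and_normalize_sketch(images):
    # Add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1)
    images = np.expand_dims(images, axis=-1)
    # Scale by the largest value actually present in the array
    return images.astype("float32") / np.max(images)

# Hypothetical stand-in for MNIST: 4 random 28x28 images with values 0..255
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255  # make sure the maximum value is present

processed = reshape_and_normalize_sketch(fake_images)
print(processed.shape)         # (4, 28, 28, 1)
print(processed.dtype)         # float32
print(float(processed.max()))  # 1.0
```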

Test your function with the next cell:

In [5]:
current_dir = os.getcwd()
data_dir = os.path.join(current_dir, "data")
data_path = os.path.join(data_dir, "mnist.npz")
print(f"Using data path: {data_path}")

# Reload the images in case you run this cell multiple times
training_images = None
try:
    print("Attempting to load data using tf.keras.datasets.mnist.load_data()...")
    (training_images, _), _ = tf.keras.datasets.mnist.load_data()
    print("Loaded data using tf.keras.datasets.mnist.load_data()")

except Exception as e_keras:
    print(f"tf.keras.datasets.mnist.load_data() failed: {e_keras}")
    print(f"Attempting to load data using np.load from local path: {data_path}")

    if os.path.exists(data_path):
        try:
            with np.load(data_path) as data:
                print("Keys in local npz file:", data.files)
                if 'x_train' in data.files:
                   training_images = data['x_train']
                   print("Loaded 'x_train' data using np.load()")
                else:
                   print("ERROR: 'x_train' key not found in the local npz file.")

        except Exception as e_np:
            print(f"np.load failed: {e_np}")
    else:
        print(f"ERROR: Local data file not found at {data_path}")

if training_images is not None:
    print(f"\nOriginal shape of training set: {training_images.shape}")
    print(f"Original maximum pixel value: {np.max(training_images)}\n")

    # Apply your function
    training_images = reshape_and_normalize(training_images)

    print(f"Maximum pixel value after normalization: {np.max(training_images):.2f}\n") # Using .2f for cleaner output
    print(f"Shape of training set after reshaping: {training_images.shape}\n")
    print(f"Shape of one image after reshaping: {training_images[0].shape}")
else:
    print("Skipping normalization and shape check because data loading failed.")

Using data path: c:\Users\user\Downloads\Workshop\data\mnist.npz
Attempting to load data using tf.keras.datasets.mnist.load_data()...
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step
Loaded data using tf.keras.datasets.mnist.load_data()

Original shape of training set: (60000, 28, 28)
Original maximum pixel value: 255

Maximum pixel value after normalization: 1.00

Shape of training set after reshaping: (60000, 28, 28, 1)

Shape of one image after reshaping: (28, 28, 1)


**Expected Output:**
```
Maximum pixel value after normalization: 1.0

Shape of training set after reshaping: (60000, 28, 28, 1)

Shape of one image after reshaping: (28, 28, 1)
```

## Defining your callback

Now complete the callback that will ensure that training will stop after an accuracy of 99.5% is reached:

In [6]:
# GRADED CLASS: myCallback
### START CODE HERE

# Remember to inherit from the correct class
class myCallback(tf.keras.callbacks.Callback):
    # Define the method that checks the accuracy at the end of each epoch
    def on_epoch_end(self, epoch, logs=None):
        # logs defaults to None, so fall back to an empty dict before reading metrics
        accuracy = (logs or {}).get('accuracy')
        if accuracy is not None and accuracy >= 0.995:
            print("\nReached 99.5% accuracy so cancelling training!")

            # Stop training by setting the model's stop_training attribute to True
            self.model.stop_training = True

### END CODE HERE
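
The stopping rule can also be exercised without TensorFlow. The sketch below is a framework-free illustration, not part of the graded solution: `should_stop` is a hypothetical helper that mirrors the threshold check, and `example_logs` holds made-up accuracies rather than values from a real training run.

```python
def should_stop(logs, threshold=0.995):
    """Return True when the logged accuracy has reached the threshold."""
    # Treat a missing logs dict or a missing 'accuracy' key as "keep training"
    accuracy = (logs or {}).get("accuracy")
    return accuracy is not None and accuracy >= threshold

# Made-up per-epoch logs, used only to illustrate the check
example_logs = [
    {"accuracy": 0.91},
    {"accuracy": 0.99},
    {"accuracy": 0.996},
    {"accuracy": 0.998},
]

stopped_at = None
for epoch, logs in enumerate(example_logs, start=1):
    if should_stop(logs):
        stopped_at = epoch
        break

print(stopped_at)  # 3
```

Keeping the comparison in one small function makes the boundary cases (no logs, missing key, accuracy exactly at the threshold) easy to check by hand.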


## Convolutional Model

Finally, complete the `convolutional_model` function below. This function should return your convolutional neural network.

**Your model should achieve an accuracy of 99.5% or more in fewer than 10 epochs to pass this assignment.**

**Hints:**
- You can try any architecture for the network but try to keep in mind you don't need a complex one. For instance, only one convolutional layer is needed. 

- In case you need extra help you can check out an architecture that works pretty well at the end of this notebook.

In [None]:
# GRADED FUNCTION: convolutional_model
def convolutional_model():
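    model = tf.keras.models.Sequential([
        # Declare the input shape explicitly: 28x28 grayscale images with one channel
        tf.keras.Input(shape=(28, 28, 1)),
        # A single convolutional layer followed by a single max-pooling layer
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),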

        tf.keras.layers.Flatten(),

        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
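
To see what `model.summary()` should report for the layer sizes used in `convolutional_model`, the output shapes and parameter counts can be worked out by hand. The sketch below is plain arithmetic and assumes the Keras defaults for these layers: no padding and stride 1 for `Conv2D`, and a stride equal to the pool size for `MaxPooling2D`.

```python
# Conv2D: 32 filters of size 3x3 on a 28x28x1 input, no padding, stride 1
conv_out = (28 - 3 + 1, 28 - 3 + 1, 32)   # (26, 26, 32)
conv_params = (3 * 3 * 1 + 1) * 32        # 9 weights + 1 bias per filter = 320

# MaxPooling2D with a 2x2 window halves height and width and has no parameters
pool_out = (conv_out[0] // 2, conv_out[1] // 2, conv_out[2])  # (13, 13, 32)

# Flatten turns the pooled feature maps into one vector
flat_units = pool_out[0] * pool_out[1] * pool_out[2]          # 5408

# Dense layers: (inputs + 1 bias) * units
dense1_params = (flat_units + 1) * 128    # 692352
dense2_params = (128 + 1) * 10            # 1290

total_params = conv_params + dense1_params + dense2_params
print(conv_out, pool_out, flat_units)
print(total_params)  # 693962
```

Almost all of the parameters sit in the first `Dense` layer, which is why the pooling step that shrinks the feature maps matters for model size.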

In [10]:
# Build the untrained model
model = convolutional_model()
model.summary()

# Instantiate the callback class
callbacks = myCallback()

# Train your model (this can take up to 5 minutes)
history = model.fit(training_images, training_labels, epochs=10, callbacks=[callbacks])

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 52s 26ms/step - accuracy: 0.9104 - loss: 0.2940
Epoch 2/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 49s 26ms/step - accuracy: 0.9850 - loss: 0.0503
Epoch 3/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 50s 27ms/step - accuracy: 0.9912 - loss: 0.0297
Epoch 4/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 61s 33ms/step - accuracy: 0.9947 - loss: 0.0189
Epoch 5/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 47s 25ms/step - accuracy: 0.9953 - loss: 0.0136
Epoch 6/10
1873/1875 ━━━━━━━━━━━━━━━━━━━━ 0s 24ms/step - accuracy: 0.9966 - loss: 0.0098
Reached 99.5% accuracy so cancelling training!
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 80s 24ms/step - accuracy: 0.9966 - loss: 0.0098


If the message you defined in your callback is printed after fewer than 10 epochs, your callback worked as expected. You can also double-check by running the following cell:

In [11]:
print(f"Your model was trained for {len(history.epoch)} epochs")

Your model was trained for 6 epochs
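
The `1875/1875` counter in the training log is the number of batches per epoch. The sketch below reproduces it, assuming the 60,000 training images reported earlier and the default `batch_size` of 32 that `model.fit` uses when none is given.

```python
import math

num_samples = 60000   # size of the MNIST training set loaded above
batch_size = 32       # Keras default when model.fit is called without batch_size

# The last batch may be smaller, so round up
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 1875
```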
