Homework Guidelines: Building a Custom Neural Network for MNIST Digit Classification

In this homework assignment, you will study, and optionally modify, a Python script that trains a neural network to classify handwritten digits from the MNIST dataset. The provided code uses TensorFlow and Keras. Your goal is to understand the code and, if you wish, make the modifications suggested under the optional tasks.

Instructions:

1- Importing Libraries: Understand the initial part of the code where the necessary libraries are imported. Ensure that TensorFlow is installed in your environment; Keras is accessed through tensorflow.keras.

2- Loading the MNIST Dataset: Observe how the MNIST dataset is loaded and divided into training and testing sets. Each set consists of 28x28 grayscale images and their integer labels (0 to 9).

3- Data Preprocessing: Understand how pixel values are normalized from the range [0, 255] to the range [0, 1] by dividing by 255. This preprocessing step helps the model train efficiently.
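
The normalization step can be tried in isolation. The following is a minimal sketch that uses NumPy instead of the real dataset; the 2x2 array `raw` is a made-up stand-in for an image and is not part of the assignment code.

```python
import numpy as np

# Hypothetical 2x2 "image" with raw 8-bit pixel intensities (0 to 255)
raw = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by 255.0 converts to floating point and rescales into [0, 1]
scaled = raw / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

The same division is applied to the whole training and test arrays in the script, since NumPy applies it element by element.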

4- One-Hot Encoding: Comprehend the one-hot encoding process applied to the labels. It converts class labels into a binary matrix representation. Ensure that you understand the purpose of this transformation.
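
To see what one-hot encoding produces, here is a small NumPy sketch that mimics the transformation on three example labels. It is an illustration only: the script itself uses to_categorical, and the names `labels`, `one_hot`, and `recovered` are chosen for this example.

```python
import numpy as np

# Three example digit labels (illustrative values, not taken from MNIST)
labels = np.array([3, 0, 9])
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with the labels encodes all of them at once
one_hot = np.eye(num_classes, dtype=np.float32)[labels]

print(one_hot.shape)  # one row per label, one column per class
print(one_hot[0])     # the only non-zero entry sits at index 3

# argmax along the class axis recovers the original integer labels
recovered = one_hot.argmax(axis=-1)
print(recovered)
```

The one-hot form is what allows the loss in step 7 to select the predicted probability of the true class by elementwise multiplication.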

5- Custom Dense Layer: Examine the custom dense layer definition. This layer is used within the neural network architecture and allows you to specify the number of units and activation function.

    Pay attention to how weights and biases are initialized in the build method.
    Understand how the layer performs matrix multiplication and applies the activation function in the call method.
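
The computation performed by a dense layer can be checked by hand on small numbers. The sketch below reproduces the idea (matrix multiplication, bias addition, optional activation) in NumPy rather than TensorFlow; `dense_forward`, `relu`, and all values are hypothetical and chosen so the result is easy to verify.

```python
import numpy as np

def relu(z):
    # Elementwise max(z, 0)
    return np.maximum(z, 0.0)

def dense_forward(x, w, b, activation=None):
    # Affine transform: (batch, features) @ (features, units) + (units,)
    z = x @ w + b
    return z if activation is None else activation(z)

x = np.array([[1.0, 2.0, 3.0],
              [0.0, -1.0, 1.0]])   # batch of 2 samples, 3 features each
w = np.array([[1.0, -1.0],
              [0.0, 1.0],
              [1.0, 0.0]])         # 3 input features, 2 units
b = np.array([0.5, -2.0])          # one bias per unit

out = dense_forward(x, w, b, activation=relu)
print(out)  # [[4.5 0. ] [1.5 0. ]]
```

Before the activation, the first sample gives [4.5, -1.0]; ReLU sets the negative entry to zero.
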
6- Neural Network Architecture: Analyze the definition of the neural network model. It consists of a series of layers, including custom dense layers with specified units and activation functions.

    Identify the input shape for the first layer (28x28 for MNIST images).
    Note the choice of activation functions (e.g., tf.nn.relu and tf.nn.softmax) for different layers.
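
As a rough check on the size of this architecture, the number of trainable parameters can be counted by hand. The sketch assumes the layer widths used in the provided script (784 flattened inputs, then 128, 64, and 10 units); `dense_params` is a helper introduced only for this example.

```python
def dense_params(inputs, units):
    # One weight per (input, unit) pair plus one bias per unit
    return inputs * units + units

# Flattened 28x28 image, followed by the three dense layers
layer_sizes = [28 * 28, 128, 64, 10]

per_layer = [dense_params(i, u)
             for i, u in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [100480, 8256, 650]
print(total)      # 109386
```

Most of the parameters sit in the first dense layer, because it is connected to all 784 input pixels.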
    
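7- Custom Loss Function: Study the custom loss function custom_sparse_categorical_crossentropy. This function computes the loss from the negative log probabilities of the true classes. Note that, despite the word "sparse" in its name, it is applied to the one-hot encoded labels from step 4, not to integer labels.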

    Understand how the negative log probabilities are calculated.
    Observe how the mean loss across the batch is computed.
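
The two points above can be worked through on a tiny batch. This is a NumPy sketch of mean categorical cross-entropy on one-hot labels, not the TensorFlow function from the script; `mean_cross_entropy`, the clipping constant `eps`, and the example values are assumptions made for illustration.

```python
import numpy as np

def mean_cross_entropy(y_true, y_pred, eps=1e-7):
    # Clip predictions so that log(0) cannot occur
    y_pred = np.clip(y_pred, eps, 1.0)
    # The one-hot mask keeps only the log probability of the true class
    per_sample = -np.sum(y_true * np.log(y_pred), axis=-1)
    # Average the per-sample losses across the batch
    return per_sample.mean()

y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])     # true classes: 1 and 0
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.5, 0.25, 0.25]])   # predicted probabilities

loss = mean_cross_entropy(y_true, y_pred)
print(loss)  # (-ln 0.8 - ln 0.5) / 2, roughly 0.458
```

A confident correct prediction (0.8) contributes a small loss, while the less confident one (0.5) contributes more.
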
8- Custom Accuracy Metric: Examine the custom_accuracy function, which calculates accuracy as the fraction of correct predictions (a value between 0 and 1).

    Pay attention to how it compares predicted and true labels to determine accuracy.
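
The comparison can be illustrated with a few made-up predictions. The following NumPy sketch mirrors the argmax-and-compare idea; `accuracy` and the sample arrays are hypothetical and independent of the script.

```python
import numpy as np

def accuracy(y_true, y_pred):
    # Index of the 1 in each one-hot row is the true class;
    # index of the largest probability is the predicted class
    true_classes = np.argmax(y_true, axis=-1)
    predicted_classes = np.argmax(y_pred, axis=-1)
    # Mean of the boolean matches is the fraction of correct predictions
    return np.mean(true_classes == predicted_classes)

y_true = np.eye(3)[[0, 2, 1, 1]]          # one-hot labels for 4 samples
y_pred = np.array([[0.7, 0.2, 0.1],       # predicts 0 (correct)
                   [0.1, 0.3, 0.6],       # predicts 2 (correct)
                   [0.5, 0.4, 0.1],       # predicts 0 (wrong, true is 1)
                   [0.2, 0.6, 0.2]])      # predicts 1 (correct)

acc = accuracy(y_true, y_pred)
print(acc)  # 0.75
```
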
9- Model Compilation: Observe how the model is compiled with the Adam optimizer, the custom loss function, and the custom accuracy metric. You may substitute another optimizer or write your own.

    Understand the significance of choosing an appropriate optimizer and metrics for the task.

10- Model Training: Check the training process using the model.fit method. The model is trained for a specified number of epochs and with a given batch size.
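
To relate epochs and batch size, the number of weight updates can be estimated with simple arithmetic. The sketch assumes the standard MNIST training split of 60,000 images and the values used in the provided script (batch size 32, 5 epochs).

```python
import math

num_train_images = 60_000  # size of the standard MNIST training split
batch_size = 32
epochs = 5

# Each epoch visits every training image once, one batch per step
steps_per_epoch = math.ceil(num_train_images / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)  # 1875 9375
```

A larger batch size means fewer, less noisy updates per epoch; a smaller one means more updates per epoch.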

11- Model Evaluation: Understand how the trained model is evaluated on the test dataset using the model.evaluate method. The test accuracy is printed as the result.

Assignment Tasks (Optional):

Experiment with different neural network architectures (e.g., changing the number of units, adding more layers, or trying different activation functions) to see how they affect model performance.

Implement your own custom loss function or metric and evaluate its impact on model training and performance.

If you have a larger dataset, consider adapting this code to work with it by adjusting the input shape and the number of output units.

Explore ways to visualize the model's predictions or intermediate layer activations to gain insights into its behavior.
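
One simple way to inspect predictions is a confusion matrix, which counts how often each true class is predicted as each other class. The sketch below builds one with NumPy from made-up label arrays; `confusion_matrix` and the example values are hypothetical and not part of the provided script.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes):
    # Rows are true classes, columns are predicted classes
    matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    # add.at accumulates correctly even when an index pair repeats
    np.add.at(matrix, (true_labels, predicted_labels), 1)
    return matrix

true_labels = np.array([0, 1, 2, 2, 1, 0])
predicted_labels = np.array([0, 2, 2, 2, 1, 1])

cm = confusion_matrix(true_labels, predicted_labels, num_classes=3)
print(cm)
# Diagonal entries are correct predictions
print(np.trace(cm) / cm.sum())
```

With the trained model, the predicted labels could be obtained as model.predict(test_images).argmax(axis=-1) and the true labels as test_labels.argmax(axis=-1), using num_classes=10.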

# 1 - Importing Libraries
# TensorFlow and its bundled Keras API are imported at the beginning.
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten

# 2 - Loading the MNIST Dataset
# The MNIST dataset is loaded into training and testing sets, consisting of images and labels.
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# 3 - Data Preprocessing
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

# 4 - One-Hot Encoding
# One-hot encode the labels
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

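# 5 - Custom Dense Layer
class CustomDense(tf.keras.layers.Layer):
    def __init__(self, units, activation=None):
        super(CustomDense, self).__init__()
        self.units = units
        self.activation = activation

    def build(self, input_shape):
        # Keyword arguments are used because the positional order of
        # add_weight differs between Keras versions.
        self.w = self.add_weight(
            name="weights",
            shape=(input_shape[-1], self.units),  # (input features, units)
            initializer="glorot_uniform",
            trainable=True,
        )
        self.b = self.add_weight(
            name="bias",
            shape=(self.units,),  # one bias per unit, starting at zero
            initializer="zeros",
            trainable=True,
        )

    def call(self, inputs):
        # Affine transform followed by the optional activation
        z = tf.matmul(inputs, self.w) + self.b
        if self.activation is not None:
            return self.activation(z)
        return z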

# 6 - Neural Network Architecture
model = Sequential([
    Flatten(input_shape=(28, 28)),  # Input shape for MNIST images

    # Add custom dense layers with specified units and activation functions
    CustomDense(128, activation=tf.nn.relu),
    CustomDense(64, activation=tf.nn.relu),
    CustomDense(10, activation=tf.nn.softmax)
])

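# 7 - Custom Loss Function (implemented)
def custom_sparse_categorical_crossentropy(y_true, y_pred):
    # Despite the name, y_true is expected to be one-hot encoded (see step 4)
    y_true = tf.cast(y_true, y_pred.dtype)
    # Clip predictions so that log(0) cannot occur
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)
    # Negative log probability of the true class for each sample
    per_sample_loss = -tf.reduce_sum(y_true * tf.math.log(y_pred), axis=-1)
    # Mean loss across the batch
    return tf.reduce_mean(per_sample_loss)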

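# 8 - Custom Accuracy Metric (alternative)
def custom_accuracy(y_true, y_pred):
    # Compare the true class (position of the 1 in the one-hot label)
    # with the predicted class (position of the highest probability)
    correct_predictions = tf.equal(tf.argmax(y_true, axis=-1), tf.argmax(y_pred, axis=-1))
    # Fraction of correct predictions in the batch
    accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
    return accuracy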

# 9 - Model Compilation
model.compile(optimizer='adam',  # You can choose other optimizers as well
              loss=custom_sparse_categorical_crossentropy,
              metrics=[custom_accuracy])

# 10 - Model Training
model.fit(train_images, train_labels, epochs=5, batch_size=32)  # Specify the number of epochs and batch size

# 11 - Model Evaluation
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f'Test accuracy: {test_accuracy}')


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test accuracy: 0.9772364497184753
