Homework Guidelines: Building a Custom Neural Network for MNIST Digit Classification

In this homework assignment, you will study, and optionally modify, a Python script that trains a neural network to classify handwritten digits from the MNIST dataset. The provided code uses TensorFlow and Keras. Your goal is to understand each step of the code and, if you choose, to make the modifications listed under Assignment Tasks.

Instructions:

1- Importing Libraries: Understand the initial part of the code, where the necessary libraries are imported. Ensure that TensorFlow and Keras are installed in your environment.

2- Loading the MNIST Dataset: Observe how the MNIST dataset is loaded and divided into training and testing sets (60,000 training and 10,000 test examples). Each example is a 28x28 grayscale image paired with an integer label from 0 to 9.

3- Data Preprocessing: Understand how the pixel values, stored as integers from 0 to 255, are normalized to the range [0, 1] by dividing by 255. Keeping the inputs in a small, consistent range helps gradient-based training converge.
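
As a small illustration of this step, the sketch below rescales one row of pixel values in plain Python. The values in `pixel_row` are made up for the example; the homework code applies the same division to whole NumPy arrays at once.

```python
# Hypothetical row of raw 8-bit grayscale pixel values (0 = black, 255 = white).
pixel_row = [0, 51, 128, 255]

# Dividing by 255.0 maps every value into the range [0, 1].
normalized_row = [p / 255.0 for p in pixel_row]

print(normalized_row)
```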

4- One-Hot Encoding: Comprehend the one-hot encoding process applied to the labels. It converts class labels into a binary matrix representation. Ensure that you understand the purpose of this transformation.
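
To make the transformation concrete, here is a minimal plain-Python sketch of one-hot encoding. The helper `one_hot` is a hypothetical stand-in written for this illustration, not the Keras `to_categorical` function itself, and the labels are invented.

```python
def one_hot(labels, num_classes):
    """Return one row per label with a 1.0 at the label's index and 0.0 elsewhere."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        encoded.append(row)
    return encoded

# Three made-up labels with 4 classes (MNIST would use num_classes=10).
for row in one_hot([2, 0, 3], num_classes=4):
    print(row)
```

Each row contains exactly one 1.0, so multiplying a row element-wise by a vector of predicted probabilities picks out the probability of the true class. This is what the loss function in step 7 relies on.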

5- Custom Dense Layer: Examine the custom dense layer definition. This layer is used within the neural network architecture and allows you to specify the number of units and activation function.

    Pay attention to how weights and biases are initialized in the build method.
    Understand how the layer performs matrix multiplication and applies the activation function in the call method.
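
The arithmetic inside the call method can be traced by hand on a tiny example. The following plain-Python sketch assumes a layer with 2 inputs and 2 units; the function names and all numbers are invented for illustration and are not part of the provided script.

```python
def relu(value):
    """Rectified linear unit: negative values become 0."""
    return max(0.0, value)

def dense_forward(inputs, weights, bias, activation=None):
    """Compute activation(inputs @ weights + bias) for a batch of input rows.

    weights has shape (input_dim, units); bias has length units.
    """
    units = len(bias)
    outputs = []
    for row in inputs:
        z = [
            sum(row[i] * weights[i][j] for i in range(len(row))) + bias[j]
            for j in range(units)
        ]
        outputs.append([activation(v) for v in z] if activation else z)
    return outputs

# Made-up numbers: one sample with 2 features feeding a layer with 2 units.
x = [[1.0, 2.0]]
w = [[0.5, -1.0],
     [0.25, 0.5]]
b = [0.0, -1.0]

print(dense_forward(x, w, b))        # [[1.0, -1.0]]  (no activation)
print(dense_forward(x, w, b, relu))  # [[1.0, 0.0]]   (negative value clipped)
```
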
6- Neural Network Architecture: Analyze the definition of the neural network model. It consists of a series of layers, including custom dense layers with specified units and activation functions.

    Identify the input shape for the first layer (28x28 for MNIST images).
    Note the choice of activation functions (e.g., tf.nn.relu and tf.nn.softmax) for different layers.
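
To see why softmax suits the output layer, the sketch below implements it in plain Python on three made-up scores. It is an illustration of the formula, not the TensorFlow implementation of tf.nn.softmax.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the largest score does not change the result
    # but keeps exp() from overflowing on large inputs.
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)       # the largest score receives the largest probability
print(sum(probs))  # approximately 1.0
```

Because the outputs are non-negative and sum to 1, the 10 output units can be read as a probability distribution over the 10 digit classes.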
    
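7- Custom Loss Function: Study the custom loss function custom_sparse_categorical_crossentropy. Despite the "sparse" in its name, it receives the one-hot labels from step 4, so it computes categorical cross-entropy: the negative log of the probability the model assigns to the true class.

    Understand how the negative log probabilities are calculated.
    Observe how the per-sample losses are reduced to a single loss value for the batch.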
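
The calculation can be checked by hand on a two-sample batch. The sketch below is a plain-Python illustration with invented probabilities; the function name `categorical_crossentropy` and the `epsilon` guard against log(0) are assumptions made for this example, not part of the provided script.

```python
import math

def categorical_crossentropy(y_true, y_pred, epsilon=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred)) for one-hot y_true."""
    per_sample = []
    for true_row, pred_row in zip(y_true, y_pred):
        # Only the true class has t == 1, so the sum keeps a single log term.
        loss = -sum(t * math.log(max(p, epsilon)) for t, p in zip(true_row, pred_row))
        per_sample.append(loss)
    return sum(per_sample) / len(per_sample)

# Both samples assign probability 0.5 to their true class.
y_true = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
y_pred = [[0.25, 0.5, 0.25], [0.5, 0.25, 0.25]]

print(categorical_crossentropy(y_true, y_pred))  # equals -log(0.5), about 0.693
```

A confident correct prediction (probability close to 1) gives a loss close to 0, while a probability close to 0 for the true class gives a large loss.
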
8- Custom Accuracy Metric: Examine the custom_accuracy function, which calculates accuracy as the fraction of correct predictions, a value between 0 and 1.

    Pay attention to how it compares predicted and true labels to determine accuracy.
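
The comparison amounts to taking the index of the largest entry in each row (argmax) for both the labels and the predictions. A minimal plain-Python sketch with four made-up samples, where `argmax` and `accuracy` are hypothetical helpers for this illustration:

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(y_true, y_pred):
    """Fraction of rows where the predicted class matches the true class."""
    matches = [argmax(t) == argmax(p) for t, p in zip(y_true, y_pred)]
    return sum(matches) / len(matches)

y_true = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.5, 0.2, 0.3], [0.2, 0.7, 0.1]]

print(accuracy(y_true, y_pred))  # 0.75: the third sample is misclassified
```
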
9- Model Compilation: Observe how the model is compiled with the Adam optimizer, the custom loss function, and the custom accuracy metric. You may substitute another built-in optimizer or write your own custom optimizer.

    Understand the significance of choosing an appropriate optimizer and appropriate metrics for the task.
10- Model Training: Examine the training process, which uses the model.fit method. The model is trained for a specified number of epochs (5 in the provided code) and with a given batch size (64).
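
As a quick piece of arithmetic, the number of weight updates follows from the dataset size, the batch size, and the number of epochs. The sketch below assumes the 60,000-image MNIST training set and the values used in the provided script; the final batch of each epoch is smaller than the others because 60,000 is not a multiple of 64.

```python
import math

num_train_samples = 60_000  # size of the MNIST training set
batch_size = 64             # value passed to model.fit in the provided code
epochs = 5

steps_per_epoch = math.ceil(num_train_samples / batch_size)
last_batch_size = num_train_samples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch)           # 938
print(last_batch_size)           # 32
print(steps_per_epoch * epochs)  # 4690
```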

11- Model Evaluation: Understand how the trained model is evaluated on the test dataset using the model.evaluate method. The test accuracy is printed as the result.

Assignment Tasks (Optional):

Experiment with different neural network architectures (e.g., changing the number of units, adding more layers, or trying different activation functions) to see how they affect model performance.

Implement your own custom loss function or metric and evaluate its impact on model training and performance.

If you have a larger dataset, consider adapting this code to work with it by adjusting the input shape and the number of output units.

Explore ways to visualize the model's predictions or intermediate layer activations to gain insights into its behavior.
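
One simple way to inspect predictions before reaching for a plotting library is a confusion matrix, which counts how often each true class is predicted as each class. The sketch below is a plain-Python illustration on made-up integer labels for a 3-class problem; `confusion_matrix` is a hypothetical helper written for this example. For the homework model, the predicted labels would presumably come from taking the argmax of each row returned by model.predict.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[i][j] counts samples whose true class is i and predicted class is j."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[true][predicted] += 1
    return matrix

# Made-up integer labels (not one-hot) for a 3-class problem.
true_labels = [0, 1, 2, 2, 1, 0]
predicted_labels = [0, 2, 2, 2, 1, 0]

for row in confusion_matrix(true_labels, predicted_labels, num_classes=3):
    print(row)
```

Correct predictions land on the diagonal; off-diagonal entries show which classes are confused with each other (here, one sample of class 1 predicted as class 2).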

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images =

# One-hot encode the labels
train_labels = 
test_labels = 

In [2]:
pip install tensorflow keras



In [2]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# Step 2: Loading the MNIST Dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Step 3: Data Preprocessing
# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

# Step 4: One-Hot Encoding
# One-hot encode the labels
train_labels = to_categorical(train_labels, num_classes=10)
test_labels = to_categorical(test_labels, num_classes=10)

# Step 5: Custom Dense Layer
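class CustomDenseLayer(tf.keras.layers.Layer):
    """Fully connected layer computing activation(inputs @ w + b)."""

    def __init__(self, units, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.activation = activation

    def build(self, input_shape):
        # Called once the input shape is known: create the trainable parameters.
        # Keyword arguments avoid relying on the positional order of add_weight.
        self.w = self.add_weight(
            name="weights",
            shape=(input_shape[-1], self.units),
            initializer="glorot_uniform",
        )
        # The bias starts at zero, a common convention.
        self.b = self.add_weight(name="bias", shape=(self.units,), initializer="zeros")

    def call(self, inputs):
        # Affine transform followed by the optional activation.
        z = tf.matmul(inputs, self.w) + self.b
        if self.activation is not None:
            return self.activation(z)
        return z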

# Step 6: Neural Network Architecture
model = Sequential([
    Flatten(input_shape=(28, 28)),  # Input shape for the first layer
    CustomDenseLayer(128, activation=tf.nn.relu),
    CustomDenseLayer(10, activation=tf.nn.softmax)  # Output layer with 10 units for 10 classes
])

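# Step 7: Custom Loss Function
# The labels are one-hot (Step 4), so this is categorical cross-entropy despite the name.
def custom_sparse_categorical_crossentropy(y_true, y_pred):
    y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)  # keep log() away from log(0)
    # Negative log probability of the true class for each sample
    per_sample_loss = -tf.reduce_sum(y_true * tf.math.log(y_pred), axis=-1)
    return tf.reduce_mean(per_sample_loss)  # mean loss across the batch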

# Step 8: Custom Accuracy Metric
def custom_accuracy(y_true, y_pred):
    correct_predictions = tf.equal(tf.argmax(y_true, axis=1), tf.argmax(y_pred, axis=1))
    accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
    return accuracy

# Step 9: Model Compilation
model.compile(optimizer='adam',
              loss=custom_sparse_categorical_crossentropy,
              metrics=[custom_accuracy])

# Step 10: Model Training
model.fit(train_images, train_labels, epochs=5, batch_size=64)

# Step 11: Model Evaluation
test_loss, test_acc = model.evaluate(test_images, test_labels)
print("Test accuracy number:", test_acc)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test accuracy number: 0.9745407104492188
