## Custom Dataset Class for Loading MNIST Data

This code snippet defines a Python class, `Dataset`, responsible for loading the MNIST training and testing data from the filesystem. The MNIST data files are expected to be in the IDX file format.

1. **Import Dependencies**: The required modules, `numpy as np` and `struct`, are imported at the beginning of the code to handle array operations and binary data reading, respectively.

2. **Initialization**: The `__init__` method initializes the class and immediately calls methods to load the training and testing labels and images.

3. **Reading Labels**: The `read_idx_labels` method reads the labels from an IDX formatted file. It reads the magic number and the number of items from the file header and then loads the labels into a NumPy array.

4. **Reading Images**: Similar to `read_idx_labels`, the `read_idx_images` method reads image data from an IDX formatted file. It reads the magic number, the number of items, and the dimensions (rows and cols) of each image. The image data is loaded into a 3D NumPy array and normalized by dividing by 255.

5. **Get Train and Test Data**: The `get_train_test_data` method returns the loaded training and testing image and label datasets.

6. **Instantiation and Data Retrieval**: Finally, an instance of the `Dataset` class is created, and the training and testing data are retrieved using `get_train_test_data`.

Note: While making `Dataset` a class might seem like overkill for this simple example, this approach stems from a larger project where the class had additional features and functionalities.

In [35]:
import struct
import numpy as np

class Dataset(object):
    """Loads the MNIST training and test sets from IDX files under data/."""

    def __init__(self) -> None:
        self.train_labels = self.read_idx_labels("data/train-labels.idx1-ubyte")
        self.train_images = self.read_idx_images("data/train-images.idx3-ubyte")
        self.test_labels = self.read_idx_labels("data/t10k-labels.idx1-ubyte")
        self.test_images = self.read_idx_images("data/t10k-images.idx3-ubyte")

    def read_idx_labels(self, file_path: str) -> np.ndarray:
        with open(file_path, 'rb') as f:
            # Header: magic number and item count, both big-endian uint32.
            magic, num = struct.unpack(">II", f.read(8))
            # The rest of the file holds one unsigned byte per label.
            labels = np.frombuffer(f.read(), dtype=np.uint8)
        return labels

    def read_idx_images(self, file_path: str) -> np.ndarray:
        with open(file_path, 'rb') as f:
            # Header: magic number, item count, rows, cols (big-endian uint32).
            magic, num, rows, cols = struct.unpack(">IIII", f.read(16))
            # Pixels are stored row by row, one unsigned byte each.
            images = np.frombuffer(f.read(), dtype=np.uint8).reshape(num, rows, cols)
        # Scale pixel values from [0, 255] to [0, 1].
        return images.astype('float32') / 255

    def get_train_test_data(self):
        return self.train_images, self.train_labels, self.test_images, self.test_labels

data = Dataset()
X_train, y_train, X_test, y_test = data.get_train_test_data()
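
To make the IDX header layout more concrete, here is a minimal sketch that parses an image buffer using only the standard library. The function name `parse_idx_images` and the tiny in-memory payload are hypothetical and exist only for illustration. The magic-number check is an addition that the `Dataset` class above does not perform; it relies on the IDX convention that image files (unsigned bytes, three dimensions) start with `0x00000803`.

```python
import io
import struct


def parse_idx_images(buffer: bytes):
    """Parse an IDX3 image buffer into nested lists of floats in [0, 1]."""
    stream = io.BytesIO(buffer)
    # Header: four big-endian unsigned 32-bit integers.
    magic, num, rows, cols = struct.unpack(">IIII", stream.read(16))
    # 0x08 means unsigned-byte data, 0x03 means three dimensions.
    if magic != 0x00000803:
        raise ValueError(f"unexpected magic number: {magic:#010x}")
    pixels = stream.read()
    if len(pixels) != num * rows * cols:
        raise ValueError("payload size does not match the header")
    images = []
    for i in range(num):
        start = i * rows * cols
        image = [
            # Scale each byte to [0, 1], mirroring the division by 255 above.
            [pixels[start + r * cols + c] / 255 for c in range(cols)]
            for r in range(rows)
        ]
        images.append(image)
    return images


# Hypothetical payload: two 2x2 images built in memory.
header = struct.pack(">IIII", 0x00000803, 2, 2, 2)
payload = bytes([0, 255, 51, 102, 255, 255, 0, 0])
images = parse_idx_images(header + payload)
print(images[0])  # [[0.0, 1.0], [0.2, 0.4]]
```

Validating the header this way turns a wrong or corrupted file into an immediate error instead of a confusing reshape failure later on.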

## Creating a Neural Network using Keras Functional API

In this code snippet, we are defining a neural network model to tackle the MNIST digit classification problem using TensorFlow's Keras Functional API.

1. **Import Dependencies**: We import necessary modules from Keras, such as `Model`, `Input`, `Dense`, and `Flatten`.

2. **Input Layer**: The `Input` layer is defined with shape `(28, 28)`, which corresponds to the dimensions of the MNIST images.

3. **Flatten Layer**: The `Flatten` layer is used to flatten the 2D `(28, 28)` input into a 1D array of 784 values. 

4. **Hidden Layers**: Two hidden layers are used in the network. The first hidden layer consists of 128 neurons and the second one has 64 neurons. Both layers use the ReLU (Rectified Linear Unit) activation function.

5. **Output Layer**: The `Dense` output layer has 10 neurons with a softmax activation function. Each neuron represents one of the 10 possible digit classes (0 to 9).

6. **Model Compilation**: The model is compiled with the Adam optimizer and the categorical cross-entropy loss function, with accuracy tracked as a metric.

7. **Loss Function**: We use the categorical cross-entropy loss function, which is commonly used for multi-class classification problems. It measures the dissimilarity between the predicted probability distribution and the actual distribution, aiming to minimize the loss.

8. **Optimizer**: The Adam optimizer is used for training the model. Adam is an adaptive optimization algorithm that combines ideas from momentum and RMSProp: it keeps running estimates of the first and second moments of the gradients and uses them to scale the step size of each parameter individually. In practice this often leads to faster convergence than plain stochastic gradient descent with a single fixed learning rate.
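
As a worked illustration of the softmax output and the loss described above, the following sketch computes both by hand for a single example with three classes. The helper names and the `eps` clipping value are assumptions made for this sketch; it is not how Keras implements the loss internally.

```python
import math


def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # y_true is one-hot, so only the true class contributes to the sum.
    # eps (an assumed value) guards against log(0).
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


logits = [2.0, 1.0, 0.1]          # hypothetical raw scores for 3 classes
probs = softmax(logits)           # roughly [0.659, 0.242, 0.099]
y_true = [1.0, 0.0, 0.0]          # the true class is index 0
loss = categorical_crossentropy(y_true, probs)
print([round(p, 3) for p in probs], round(loss, 3))
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the correct class, so a more confident correct prediction gives a smaller loss.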



In [36]:
from tensorflow.keras import Model, Input
from tensorflow.keras.layers import Dense, Flatten

input_layer = Input(shape=(28, 28))
x = Flatten()(input_layer)
x = Dense(128, activation='relu')(x)
x = Dense(64, activation='relu')(x)
output_layer = Dense(10, activation='softmax')(x)

model = Model(inputs=input_layer, outputs=output_layer)

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
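
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes each `Dense` layer has a bias term (the Keras default) and that `Flatten` contributes no parameters; `dense_param_count` is a hypothetical helper.

```python
def dense_param_count(n_in: int, n_out: int) -> int:
    # One weight per input/output pair, plus one bias per output unit.
    return n_in * n_out + n_out


# Flattened input (28 * 28 = 784), two hidden layers, and the output layer.
layer_sizes = [28 * 28, 128, 64, 10]
counts = [dense_param_count(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)
print(counts, total)  # [100480, 8256, 650] 109386
```

Most of the parameters sit in the first hidden layer, since it connects all 784 input pixels to 128 units.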

## Preprocessing Labels and Training the Model

This section of the code deals with the label preprocessing and the model training process, along with timing the training duration.

1. **Label Preprocessing**: The `to_categorical` function from Keras is used to convert the integer labels into one-hot encoded vectors. This step is required because the model is compiled with the `categorical_crossentropy` loss, which expects one-hot targets; with `sparse_categorical_crossentropy` the integer labels could be used directly.

2. **Model Training**: The `fit` method is called on the model object to start the training process. The training data `X_train` and one-hot encoded labels `y_train` are passed as arguments. The model is trained for 10 epochs with a batch size of 128.

3. **Timing the Training**: We import the `time` module and record the start and end times surrounding the model training. The duration is calculated as the difference between the end and start times, giving us the time taken to fit the model in seconds.
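
The one-hot encoding step can be sketched in plain Python as follows. This is only an illustration of the idea, with a hypothetical `one_hot` helper that takes the number of classes explicitly; it is not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        encoded.append(row)
    return encoded


print(one_hot([3, 0], 4))  # [[0.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 0.0]]
```

For MNIST, `num_classes` would be 10, so each label becomes a vector of length 10 that matches the 10 outputs of the softmax layer.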

In [37]:
from tensorflow.keras.utils import to_categorical
import time

y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

start_time = time.time()
model.fit(X_train, y_train, epochs=10, batch_size=128)
end_time = time.time()

time_taken = end_time - start_time
print(f"Time taken to fit the model: {time_taken:.2f} seconds")


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
Time taken to fit the model: 9.72 seconds
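
A quick back-of-the-envelope calculation relates the batch size and epoch count to the amount of work done during training. It assumes the standard MNIST training set of 60,000 images (the notebook does not print the array shapes) and reuses the 9.72 seconds reported above.

```python
import math

num_samples = 60_000   # assumed size of the MNIST training set
batch_size = 128
epochs = 10
elapsed = 9.72         # seconds, taken from the output above

# The last batch of each epoch is smaller, so round up.
steps_per_epoch = math.ceil(num_samples / batch_size)
last_batch = num_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs
throughput = num_samples * epochs / elapsed

print(steps_per_epoch, last_batch, total_updates)  # 469 96 4690
print(f"{throughput:.0f} images per second")
```

Under these assumptions the model performs 4,690 weight updates in total, which helps explain why such a small network trains in a matter of seconds.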


## Evaluating the Model on Test Data

After training the model, it's crucial to evaluate its performance on unseen data to understand its generalization capability.

1. **Model Evaluation**: The `evaluate` method from Keras is used to compute the loss and accuracy metrics on the test dataset (`X_test` and `y_test`).

2. **Print Accuracy**: The accuracy is printed to the console as a percentage with two decimal places.

In [38]:
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f'Test accuracy: {test_acc * 100:.2f}%')



Test accuracy: 97.82%
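
The accuracy metric itself is simple to reproduce: a prediction counts as correct when the index of the largest predicted probability matches the index of the 1 in the one-hot label. The sketch below uses hypothetical helper names and made-up predictions for four samples. The last lines assume the standard MNIST test set of 10,000 images to translate the reported accuracy into a count.

```python
def argmax(values):
    # Index of the largest element.
    return max(range(len(values)), key=values.__getitem__)


def accuracy(y_true_one_hot, y_pred_probs):
    correct = sum(
        argmax(t) == argmax(p) for t, p in zip(y_true_one_hot, y_pred_probs)
    )
    return correct / len(y_true_one_hot)


# Made-up labels and predictions for 4 samples and 3 classes.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]
acc = accuracy(y_true, y_pred)
print(acc)  # 0.75 (the third sample is misclassified)

# With an assumed 10,000 test images, 97.82% corresponds to:
num_correct = round(0.9782 * 10_000)
print(num_correct, 10_000 - num_correct)  # 9782 218
```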


## Conclusion

In this notebook, we walked through the process of creating a neural network model for MNIST digit classification using the Keras Functional API. We also created a custom dataset class to load the MNIST data, preprocessed the labels, trained the model, and finally evaluated its performance on the test set.

The model achieved a test accuracy of approximately 97.8% after about 10 seconds of training, which is a strong result for such a simple architecture. It indicates that even a small fully connected network can generalize well to unseen MNIST digits.

While the model performs well, there's always room for further improvement or experimentation. For instance, you could try different architectures, optimizers, or even apply data augmentation techniques.

Overall, this notebook serves as a practical example of using deep learning for image classification tasks, demonstrating the power and ease of using modern machine learning libraries like Keras.
