# Importing Necessary Libraries
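This cell imports everything the notebook uses: Keras (through TensorFlow) for the dataset, layers, and model utilities; NumPy for array handling; Matplotlib for plotting; OpenCV (`cv2`) for image processing; and `google.colab.files`, which is only available when the notebook runs in Google Colab.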

In [None]:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical
import numpy as np
import matplotlib.pyplot as plt
import cv2
from google.colab import files

# Loading the MNIST Dataset
The MNIST dataset contains 70,000 grayscale images of handwritten digits, each 28x28 pixels. `mnist.load_data()` returns it already split into a training set of 60,000 images (`train_images` and `train_labels`), used to train the model, and a test set of 10,000 images (`test_images` and `test_labels`), used to evaluate its performance.

In [None]:
# Load the dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Data Preprocessing
Neural networks train more reliably on small input values, so we scale each pixel from the integer range 0 to 255 down to the range 0 to 1 by dividing by 255, the maximum pixel value. We then reshape the images to add a single channel axis (grayscale), because `Conv2D` expects each image as a 3D array of shape (height, width, channels).

In [None]:
# Normalize the images to be between 0 and 1 and reshape them
train_images = train_images.astype('float32') / 255.0
test_images = test_images.astype('float32') / 255.0
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1)
test_images = test_images.reshape(test_images.shape[0], 28, 28, 1)
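
To see what these two steps do without downloading MNIST, the sketch below applies them to a small random stand-in batch. The array `fake_images` and its size are made up for illustration; only its dtype and per-image shape mirror what `mnist.load_data()` returns.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 random 28x28 images stored as uint8 (0-255).
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Cast to float32 (half the memory of NumPy's default float64), then scale to [0, 1].
scaled = fake_images.astype('float32') / 255.0

# Append a channel axis so each image is (height, width, channels) = (28, 28, 1).
batch = scaled.reshape(scaled.shape[0], 28, 28, 1)

print(batch.shape, batch.dtype)
print(batch.min() >= 0.0, batch.max() <= 1.0)
```

The reshape only adds an axis of length 1; no pixel values change.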

# Label Processing
Our labels are digits from 0 to 9. However, the neural network will output an array of probabilities (one for each class). By converting the labels to one-hot encoded format, we make sure the label format matches the output format of the network.

In [None]:
# Convert labels to one-hot encoded format
train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
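
As a rough illustration of what one-hot encoding produces, here is a NumPy sketch. The helper `one_hot` is hypothetical and is not part of Keras; it is only meant to mirror the effect of `to_categorical` on integer labels from 0 to 9. When `num_classes` is omitted, `to_categorical` infers the number of classes from the largest label, whereas this sketch defaults to 10.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Row i gets a 1 in the column given by labels[i].
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

example = one_hot([3, 0, 9])
print(example[0])  # the row for digit 3 has its 1 at index 3
```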

# Building the CNN Model
Here, we define the architecture of our convolutional neural network:

1. Conv2D Layer: Convolutional layer with 32 filters of size 3x3 and ReLU activation. This layer will learn local patterns in the data.
2. MaxPooling2D Layer: Reduces the spatial dimensions by taking the maximum value in each window of size 2x2.
3. Another Conv2D Layer with 64 filters to learn more complex patterns.
4. Another MaxPooling2D for dimensionality reduction.
5. Flatten: This layer reshapes the 3D output of the previous layer to 1D.
6. Dense Layer: A fully connected layer with 128 neurons and ReLU activation.
7. Dropout: Randomly sets 50% of the input units to 0 at each update during training, which helps to prevent overfitting.
8. Dense Layer: The final layer with 10 neurons (one for each class) and a softmax activation to output class probabilities.

In [None]:
# Instantiate a Sequential model
model = Sequential()

# Add layers one by one
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
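
The output shapes and parameter counts of this stack can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults used above: stride 1 and no padding for `Conv2D`, and non-overlapping windows for `MaxPooling2D`. The helpers `conv_out` and `pool_out` are illustrative names, not Keras functions.

```python
def conv_out(size, kernel):
    """Output side length of a stride-1 convolution with no padding ('valid')."""
    return size - kernel + 1

def pool_out(size, window):
    """Output side length of non-overlapping pooling (floor division)."""
    return size // window

side = conv_out(28, 3)      # Conv2D(32, 3x3): 28 -> 26
side = pool_out(side, 2)    # MaxPooling2D(2x2): 26 -> 13
side = conv_out(side, 3)    # Conv2D(64, 3x3): 13 -> 11
side = pool_out(side, 2)    # MaxPooling2D(2x2): 11 -> 5
flat = side * side * 64     # Flatten: 5 * 5 * 64 features

# Trainable parameters per layer = weights + biases.
params = {
    'conv1': 3 * 3 * 1 * 32 + 32,
    'conv2': 3 * 3 * 32 * 64 + 64,
    'dense1': flat * 128 + 128,
    'dense2': 128 * 10 + 10,
}
total = sum(params.values())
print(flat, params, total)
```

Calling `model.summary()` on the model defined above should report the same per-layer counts, which makes it a quick check that the layers were added as intended.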

# Compiling the Model
Before training, we need to define how the model should learn. We specify:

- Optimizer: The algorithm used to update the weights. 'Adam' is a popular choice.
- Loss Function: Since this is a multi-class classification problem with one-hot labels, we use categorical_crossentropy as the loss function.
- Metrics: We're interested in the accuracy of classification, so we include 'accuracy' as a metric.

In [None]:
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
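
To make the loss concrete, the following NumPy sketch computes categorical cross-entropy for two made-up predictions. It is not the Keras implementation; the function name mirrors the loss string only for readability, and the clipping value `eps` is an assumption chosen to avoid taking the logarithm of zero.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # keep log() finite
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(per_sample.mean())

# Two samples, three classes (illustrative values only).
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype=np.float32)
y_pred = np.array([[0.1, 0.2, 0.7], [0.5, 0.3, 0.2]], dtype=np.float32)

loss = categorical_crossentropy(y_true, y_pred)
print(loss)  # (-ln 0.7 - ln 0.5) / 2, roughly 0.525
```

Because the labels are one-hot, only the probability assigned to the correct class contributes to each sample's loss.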

# Training the Model
We now train the model on the training data for 5 epochs. An epoch is one full pass over all the training examples. Within each epoch the data is processed in batches of 32, and the weights are updated after every batch to reduce the loss.

In [None]:
# Train the model
model.fit(train_images, train_labels, epochs=5, batch_size=32)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
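
The number of weight updates behind these five epochs follows from the dataset and batch sizes. A small sketch of the arithmetic, assuming the standard 60,000-image MNIST training split:

```python
import math

num_samples = 60_000   # size of the standard MNIST training split
batch_size = 32
epochs = 5

steps_per_epoch = math.ceil(num_samples / batch_size)  # batches per epoch
total_updates = steps_per_epoch * epochs               # optimizer steps overall
print(steps_per_epoch, total_updates)
```

This gives 1,875 batches per epoch and 9,375 updates in total.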



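# Saving the Trained Model
After training, we save the model's architecture and its weights to a single file. This allows us to load the trained model later without having to retrain it. The `.h5` extension selects the HDF5 format, which recent Keras versions flag as legacy in favor of the native `.keras` format.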

In [None]:
# Save the trained model
model.save('mnist_cnn_model.h5')



<details>
<summary># Evaluating the Model on Test Data</summary>

**Explanation:**
After training the model, it's crucial to evaluate its performance on unseen data, which is our test dataset. The `model.evaluate()` method returns the model's loss and other metrics (in this case, accuracy) on the test dataset.


In [None]:
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f'Test accuracy: {test_accuracy}')
print(f'Test loss: {test_loss}')

Test accuracy: 0.9922000169754028
Test loss: 0.02388986013829708
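
For intuition about the accuracy metric, here is a NumPy sketch that computes it from predicted probabilities and one-hot labels. The helper `accuracy_from_probs` and the sample arrays are made up for illustration and are not how Keras computes the metric internally.

```python
import numpy as np

def accuracy_from_probs(probs, one_hot_labels):
    """Fraction of rows whose most probable class matches the true class."""
    predicted = probs.argmax(axis=1)
    actual = one_hot_labels.argmax(axis=1)
    return float(np.mean(predicted == actual))

# Made-up predictions for 4 samples and 3 classes (illustration only).
probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.3, 0.4, 0.3],
                  [0.1, 0.1, 0.8]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 0, 1]])

acc = accuracy_from_probs(probs, labels)
print(acc)  # 3 of 4 rows match

# Reading the reported figure: with the 10,000 MNIST test images,
# an accuracy of 0.9922 leaves this many misclassified digits.
errors = round(10_000 * (1 - 0.9922))
print(errors)
```

Under that assumption about the test set size, the reported accuracy corresponds to 78 misclassified images out of 10,000.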
