**Training a Convolutional Neural Network (CNN) on the MNIST dataset using TensorFlow and Keras**

This notebook loads the MNIST dataset, normalizes and reshapes the images, defines and compiles a CNN model, trains it on the data, and finally saves the trained model.

# Import Libraries

In [19]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

`tensorflow` is an open-source library for machine learning.

`Sequential` is a linear stack of layers.

`Conv2D` is a 2D convolutional layer.

`MaxPooling2D` is a max pooling layer.

`Flatten` flattens the input.

`Dense` is a fully connected layer.

# Load the MNIST Dataset

In [20]:
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

The MNIST dataset is a collection of 60,000 training images and 10,000 testing images of handwritten digits (0-9).

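`load_data()` downloads the dataset on first use (later calls read a cached copy) and returns it already split into training and testing sets.

`x_train, y_train` are the training images, with shape (60000, 28, 28), and their integer labels.

`x_test, y_test` are the testing images, with shape (10000, 28, 28), and their integer labels.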

# Normalize the Data

In [21]:
x_train, x_test = x_train / 255.0, x_test / 255.0

The pixel values of the images range from 0 to 255. Normalizing them to a range of 0 to 1 helps in faster convergence during training.

Dividing by `255.0` performs this scaling and also converts the arrays from integers to floating point.
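As a quick check of what the division does, here is a minimal sketch on a small made-up array. The array `pixels` is hypothetical and only stands in for MNIST images; it assumes the images arrive as unsigned 8-bit integers, which is how `load_data()` returns them.

```python
import numpy as np

# Hypothetical 2x2 "image" with uint8 pixel values, standing in for MNIST data.
pixels = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by a Python float promotes the array to float64 and scales it to [0, 1].
scaled = pixels / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```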

# Reshape the Data

In [22]:
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

The MNIST images are 28x28 pixels.

* `reshape` adds a single color channel (grayscale) as a trailing dimension, so each image has shape (28, 28, 1) and the full training array has shape (60000, 28, 28, 1).

* `-1` tells `reshape` to infer the size of that dimension (here, the number of images) from the total number of elements.
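The effect of `-1` can be seen on a small stand-in array. This is a sketch only: `images` is a hypothetical batch of five blank images, not the real `x_train`.

```python
import numpy as np

# Hypothetical batch of 5 blank 28x28 images, standing in for x_train.
images = np.zeros((5, 28, 28))

# -1 tells NumPy to infer the first dimension (the number of images)
# from the total element count and the other three dimensions.
reshaped = images.reshape(-1, 28, 28, 1)

print(images.shape)    # (5, 28, 28)
print(reshaped.shape)  # (5, 28, 28, 1)
```

The pixel values themselves are unchanged; only the array's shape metadata differs.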

# Build the CNN Model

In [23]:
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

`Sequential` creates a linear stack of layers.

The first layer is a `Conv2D` layer with 32 filters, a 3x3 kernel, and a ReLU activation function. `input_shape` is given only on this first layer, so the model knows the size of each input image.

`MaxPooling2D` with a 2x2 pool size halves the height and width of the feature maps (rounding down).

The second `Conv2D` layer has 64 filters and a 3x3 kernel.

The second `MaxPooling2D` layer halves the spatial dimensions again.

The `Flatten` layer converts the stack of 2D feature maps into a 1D vector.

The first `Dense` layer has 64 units and ReLU activation.

The `Dense` output layer has 10 units (one per digit class) and softmax activation, so its outputs can be read as class probabilities.
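To see how the layers fit together, the arithmetic below traces the feature-map size and the trainable parameter count through the model. It is a plain-Python sketch that assumes the Keras defaults used above: `'valid'` padding with stride 1 for `Conv2D`, and a pooling stride equal to the pool size. The helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel):
    # 'valid' padding with stride 1: the kernel must fit entirely inside the input.
    return size - kernel + 1

def pool_out(size, pool):
    # Non-overlapping pooling windows; a leftover row or column is dropped.
    return size // pool

size = 28
size = conv_out(size, 3)   # 26
size = pool_out(size, 2)   # 13
size = conv_out(size, 3)   # 11
size = pool_out(size, 2)   # 5
flat = size * size * 64    # 1600 values after Flatten

# Trainable parameters: weights plus one bias per filter or unit.
params = {
    "conv1": 3 * 3 * 1 * 32 + 32,
    "conv2": 3 * 3 * 32 * 64 + 64,
    "dense1": flat * 64 + 64,
    "dense2": 64 * 10 + 10,
}
total = sum(params.values())

print(size, flat, total)  # 5 1600 121930
```

Most of the parameters sit in the first `Dense` layer, because it connects all 1600 flattened values to each of its 64 units.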

# Compile the Model

In [24]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


`compile` configures the model for training.

`optimizer='adam'` specifies the Adam optimizer.

`loss='sparse_categorical_crossentropy'` is used for multi-class classification where the labels are integers (such as `3` or `7`) rather than one-hot vectors.

`metrics=['accuracy']` tracks the accuracy of the model.
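What the loss and the metric compute can be illustrated by hand with NumPy. The sketch below uses made-up probabilities for two samples over three classes instead of ten, to keep it short; `probs` and `labels` are hypothetical stand-ins for the model's softmax output and for `y_train`.

```python
import numpy as np

# Hypothetical softmax outputs for 2 samples over 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
# Integer labels, as in y_train (no one-hot encoding needed).
labels = np.array([0, 2])

# Pick each sample's predicted probability for its true class, then take -log.
picked = probs[np.arange(len(labels)), labels]
losses = -np.log(picked)

# Accuracy compares the highest-probability class with the label.
accuracy = np.mean(np.argmax(probs, axis=1) == labels)

print(losses.mean())
print(accuracy)  # 1.0
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1.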

# Train the Model

In [25]:
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x1476f7bd0d0>

`fit` trains the model on the training data.

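`epochs=5` specifies the number of full passes over the training data.

`validation_data=(x_test, y_test)` evaluates the model on the test data at the end of each epoch. This data is never used to update the weights, but because the test set doubles as the validation set here, no separate held-out data remains for a final evaluation.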
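Each epoch is processed in mini-batches rather than one image at a time. The short calculation below is a sketch of how many weight updates this training run performs, assuming the Keras default `batch_size` of 32, since the `fit` call above does not set one.

```python
import math

num_train = 60_000   # MNIST training images
batch_size = 32      # Keras default when fit() is called without batch_size
epochs = 5

# One weight update per batch; a final partial batch would still count as a step.
steps_per_epoch = math.ceil(num_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 1875
print(total_updates)    # 9375
```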

# Save the Model

In [26]:
model.save('mnist_cnn_model.h5')

`save` writes the trained model, including its architecture and weights, to a file named `mnist_cnn_model.h5`. The `.h5` extension selects the HDF5 file format.

This allows the model to be loaded and used later without retraining.
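Loading a saved model back can be sketched as follows. To keep the example self-contained and fast, it saves and reloads a hypothetical tiny model (`tiny`) in a temporary directory rather than the trained CNN; the same `load_model` call would apply to `mnist_cnn_model.h5`.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Hypothetical tiny model, standing in for the trained CNN.
tiny = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation='softmax'),
])

# Save to a temporary location so the example leaves no files behind in the project.
path = os.path.join(tempfile.mkdtemp(), 'tiny_model.h5')
tiny.save(path)

# Reload from disk; the architecture and weights come back with the file.
restored = tf.keras.models.load_model(path)

# The restored model should give the same predictions as the original.
x = np.ones((2, 4), dtype='float32')
same = np.allclose(tiny.predict(x, verbose=0), restored.predict(x, verbose=0))
print(same)
```

Recent Keras versions may print a warning that HDF5 is a legacy format and suggest the `.keras` extension instead; the `.h5` file still loads as shown.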