Implementing a CNN

Build a Simple CNN on the MNIST Dataset
Step 1: Load and Preprocess the Data

MNIST Dataset: This dataset consists of 70,000 28x28 grayscale images of handwritten digits (0-9), split into 60,000 training images and 10,000 test images.

In [1]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Add a channel dimension and scale pixel values from [0, 255] to [0, 1]
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
X_test = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
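
To make the preprocessing concrete, the following sketch applies the same three transformations with plain NumPy to a small synthetic batch. The arrays fake_images and fake_labels and the helper one_hot are illustrative assumptions, not part of the original notebook; one_hot is meant to mirror what to_categorical does for integer class labels.

```python
import numpy as np

# Hypothetical stand-in for a few MNIST images: uint8 pixels in [0, 255].
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 3])

# Add a trailing channel axis and rescale pixels to [0, 1].
x = fake_images.reshape(-1, 28, 28, 1).astype("float32") / 255.0


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


y = one_hot(fake_labels, 10)

print(x.shape, x.dtype)  # (4, 28, 28, 1) float32
print(y.shape)           # (4, 10)
print(y[0])              # a 1.0 at index 3, zeros elsewhere
```

The channel axis is needed because Conv2D expects inputs of shape (height, width, channels), and the one-hot labels match the categorical_crossentropy loss used when the model is compiled.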


Step 2: Build the CNN Architecture

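A simple CNN for MNIST stacks convolutional layers, each followed by a pooling layer, and ends with fully connected layers. The model below uses two convolution and pooling stages, then a 128-unit dense layer with dropout and a 10-way softmax output.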
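
As a rough illustration of what one convolution and pooling stage computes, here is a minimal NumPy sketch for a single-channel image and a single filter. The helper names conv2d_valid, relu, and max_pool_2x2, the synthetic image, and the hand-written edge kernel are all assumptions made for this example; a real Conv2D layer learns many kernels and adds a bias term. As in most deep learning libraries, the kernel is applied without flipping.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0.0)


def max_pool_2x2(x):
    """Keep the maximum of each non-overlapping 2x2 block, dropping odd edges."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    blocks = x[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


# Hypothetical input: a 28x28 image containing one bright vertical bar.
image = np.zeros((28, 28), dtype=np.float32)
image[:, 14] = 1.0

# Hypothetical hand-written kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=np.float32)

feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool_2x2(feature_map)

print(feature_map.shape)  # (26, 26)
print(pooled.shape)       # (13, 13)
```

The 28x28 input shrinks to 26x26 after the unpadded 3x3 convolution and to 13x13 after 2x2 pooling, the same spatial sizes the first Conv2D and MaxPooling2D layers of the model produce.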

In [2]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()


Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape              Param #
 conv2d (Conv2D)                 (None, 26, 26, 32)        320
 max_pooling2d (MaxPooling2D)    (None, 13, 13, 32)        0
 conv2d_1 (Conv2D)               (None, 11, 11, 64)        18496
 max_pooling2d_1 (MaxPooling2D)  (None, 5, 5, 64)          0
 flatten (Flatten)               (None, 1600)              0
 dense (Dense)                   (None, 128)               2
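
The output shapes and parameter counts reported by model.summary() follow from simple arithmetic on the layer definitions. The sketch below recomputes them in plain Python; the helper names conv_out, conv_params, and dense_params are assumptions introduced for this example, and it assumes the Keras defaults used in the model (stride 1, no padding, one bias per filter or unit).

```python
def conv_out(size, kernel, stride=1):
    """Spatial size after a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_channels, filters):
    """kernel*kernel*in_channels weights plus one bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """One weight per input plus one bias, per unit."""
    return (inputs + 1) * units


size, channels = 28, 1
rows = []

for filters in (32, 64):
    params = conv_params(3, channels, filters)
    size, channels = conv_out(size, 3), filters
    rows.append((f"conv {filters}", (size, size, channels), params))
    size //= 2  # 2x2 max pooling halves each spatial dimension, rounding down
    rows.append(("max pool", (size, size, channels), 0))

flat = size * size * channels
rows.append(("flatten", (flat,), 0))
rows.append(("dense 128", (128,), dense_params(flat, 128)))
# Dropout has no parameters and leaves the shape unchanged, so it is omitted.
rows.append(("dense 10", (10,), dense_params(128, 10)))

for name, shape, params in rows:
    print(f"{name:<10} {str(shape):<14} {params:>8,}")

total_params = sum(p for _, _, p in rows)
print(f"total      {total_params:,}")
```

Almost all of the parameters sit in the first dense layer, because it connects every one of the 1,600 flattened features to each of its 128 units.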

Step 3: Train the CNN

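Use callbacks to control training: EarlyStopping halts training once the validation loss has not improved for the number of epochs given by patience, and ModelCheckpoint with save_best_only=True writes the model to disk only when the monitored validation loss improves.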

In [10]:
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

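callbacks = [
    # Stop once val_loss has failed to improve for 5 consecutive epochs
    EarlyStopping(monitor='val_loss', patience=5),
    # Save only when val_loss (the default monitor) improves. save_format is not a
    # ModelCheckpoint argument and has been removed. In TF 2.x a path without an
    # extension is written as a SavedModel directory; Keras 3 requires a path
    # ending in .keras or .h5 instead.
    ModelCheckpoint('mnist_cnn', save_best_only=True)
]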


history = model.fit(X_train, y_train, validation_split=0.2, epochs=10, batch_size=128, callbacks=callbacks)


Epoch 1/10
INFO:tensorflow:Assets written to: mnist_cnn\assets
Epoch 2/10
INFO:tensorflow:Assets written to: mnist_cnn\assets
Epoch 3/10
Epoch 4/10
Epoch 5/10
INFO:tensorflow:Assets written to: mnist_cnn\assets
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
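
The interaction between the two callbacks can be sketched in plain Python. This is a simplified model of early stopping combined with best-only checkpointing, not the Keras implementation. The function simulate_callbacks and the validation losses are hypothetical: the losses were chosen only so that checkpoints land on epochs 1, 2, and 5, matching where the log above reports assets being written. The log does not show the actual loss values.

```python
def simulate_callbacks(val_losses, patience):
    """Return (epochs at which a checkpoint is saved, last epoch that runs).

    Epochs are 1-based. A checkpoint is saved whenever the loss reaches a new
    minimum; training stops once `patience` epochs pass without a new minimum.
    """
    best = float("inf")
    wait = 0
    saved_at = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
            saved_at.append(epoch)
        else:
            wait += 1
            if wait >= patience:
                return saved_at, epoch
    return saved_at, len(val_losses)


# Hypothetical validation losses, chosen only for illustration.
val_losses = [0.090, 0.060, 0.065, 0.061, 0.050, 0.052,
              0.055, 0.051, 0.053, 0.054, 0.049, 0.048]

saved_at, stopped_at = simulate_callbacks(val_losses, patience=5)
print(saved_at)    # [1, 2, 5]
print(stopped_at)  # 10
```

In this sketch the later, lower losses at epochs 11 and 12 are never reached, which shows the trade-off in choosing patience: a small value saves time but can stop before a late improvement.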


Step 4: Evaluate the CNN

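Evaluate the trained model on the held-out test set and report its accuracy. Because EarlyStopping was not configured with restore_best_weights=True, model holds the weights from the final epoch rather than those of the best checkpoint saved to mnist_cnn.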

In [11]:
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")


Test accuracy: 0.9921
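
A single accuracy figure hides which digits are confused with which. One common way to look further is a confusion matrix built from the predicted class probabilities. The sketch below is a minimal NumPy version run on a hypothetical 3-class example; the helper confusion_matrix and the arrays y_true and y_prob are assumptions for illustration and are not output from the model above.

```python
import numpy as np


def confusion_matrix(y_true_onehot, y_prob, num_classes):
    """Count predictions; rows are true classes, columns are predicted classes."""
    true_cls = np.argmax(y_true_onehot, axis=1)
    pred_cls = np.argmax(y_prob, axis=1)
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats.
    np.add.at(matrix, (true_cls, pred_cls), 1)
    return matrix


# Hypothetical one-hot labels and softmax outputs for four samples.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.6, 0.3]])

cm = confusion_matrix(y_true, y_prob, num_classes=3)
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(accuracy)  # 0.75
```

With the notebook's variables, the same helper could be called as confusion_matrix(y_test, model.predict(X_test), 10); off-diagonal entries then show which digit pairs the model mixes up.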
