## Load and Preprocess MNIST Data

### Subtask:
Load the MNIST dataset, normalize pixel values to [0, 1], and reshape images to include a channel dimension for CNN compatibility.

**Reasoning**:
I will load the MNIST dataset using `tf.keras.datasets.mnist.load_data()`. The images will then be normalized by dividing pixel values by 255 to scale them between 0 and 1. Finally, the images will be reshaped from `(num_samples, 28, 28)` to `(num_samples, 28, 28, 1)` to add a channel dimension, which is required for Conv2D layers in Keras.
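
As a minimal sketch of the two preprocessing steps described above, the snippet below applies them to a small synthetic array instead of the real dataset. The names `fake_images`, `scaled`, and `with_channel` are illustrative assumptions and do not appear in the notebook. It also uses `np.newaxis` rather than a hard-coded `reshape`, which is one possible alternative and not the approach taken in the cell that follows.

```python
import numpy as np

# Synthetic stand-in for MNIST (assumption): 4 random uint8 "images" of size 28x28.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Step 1: cast to float32 and scale pixel values from [0, 255] to [0, 1].
scaled = fake_images.astype("float32") / 255.0

# Step 2: append a trailing channel axis so the shape becomes (N, 28, 28, 1).
with_channel = scaled[..., np.newaxis]

print(with_channel.shape)
print(with_channel.dtype)
```

Appending the axis with `np.newaxis` avoids repeating the image size, so the same line would also work for inputs that are not 28x28.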

In [None]:
import tensorflow as tf

# Load the MNIST dataset
print("Loading MNIST dataset...")
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Reshape images to add a channel dimension (for grayscale images, channel is 1)
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

print(f"x_train shape: {x_train.shape}")
print(f"x_test shape: {x_test.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"y_test shape: {y_test.shape}")
print("MNIST dataset loaded and preprocessed.")

Loading MNIST dataset...
x_train shape: (60000, 28, 28, 1)
x_test shape: (10000, 28, 28, 1)
y_train shape: (60000,)
y_test shape: (10000,)
MNIST dataset loaded and preprocessed.


## Build CNN Model

### Subtask:
Define the Convolutional Neural Network architecture using the Keras Sequential API, including convolutional, pooling, flatten, and dense layers. Compile the model with an appropriate optimizer, loss function, and metrics.

**Reasoning**:
I will define a sequential CNN model using `tf.keras.models.Sequential`. The architecture will consist of two `Conv2D` layers with `relu` activation and `MaxPooling2D` layers for feature extraction, followed by a `Flatten` layer to convert 2D feature maps into a 1D vector. Finally, two `Dense` layers with `relu` and `softmax` activations will be used for classification. The model will be compiled using the `adam` optimizer, `sparse_categorical_crossentropy` as the loss function (suitable for integer labels), and `accuracy` as the metric.
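
To make the architecture described above concrete, the following sketch works out the feature-map sizes and trainable parameter counts by hand. It assumes the filter counts and kernel sizes used in the cell below (32 and 64 filters, 3x3 kernels, 2x2 pooling, a 128-unit dense layer) together with the Keras defaults of `'valid'` padding and stride 1 for `Conv2D`. The helper names `conv_out` and `pool_out` are hypothetical and only used for illustration.

```python
def conv_out(size, kernel):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


size = 28
size = conv_out(size, 3)   # after first Conv2D
size = pool_out(size, 2)   # after first MaxPooling2D
size = conv_out(size, 3)   # after second Conv2D
size = pool_out(size, 2)   # after second MaxPooling2D
flat = size * size * 64    # length of the flattened vector

# Parameters per layer: weights plus one bias per output unit/filter.
params = {
    "conv2d_1": 3 * 3 * 1 * 32 + 32,
    "conv2d_2": 3 * 3 * 32 * 64 + 64,
    "dense_1": flat * 128 + 128,
    "dense_2": 128 * 10 + 10,
}
total = sum(params.values())

print(f"final feature map: {size}x{size}x64, flattened length: {flat}")
for name, count in params.items():
    print(f"{name}: {count}")
print(f"total trainable parameters: {total}")
```

Under these assumptions, most of the parameters sit in the first dense layer rather than in the convolutional layers.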

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense

# Define the CNN model architecture
model = Sequential([
    # Input layer: 28x28 grayscale MNIST images, shape (28, 28, 1).
    # Declaring the shape here avoids the Keras warning about passing input_shape to a layer.
    Input(shape=(28, 28, 1)),
    # First Convolutional Layer: 32 filters, 3x3 kernel, ReLU activation.
    Conv2D(32, (3, 3), activation='relu'),
    # First MaxPooling Layer: Downsamples the feature maps by 2x2.
    MaxPooling2D((2, 2)),
    # Second Convolutional Layer: 64 filters, 3x3 kernel, ReLU activation.
    Conv2D(64, (3, 3), activation='relu'),
    # Second MaxPooling Layer: Further downsamples the feature maps by 2x2.
    MaxPooling2D((2, 2)),
    # Flatten Layer: Flattens the 2D feature maps into a 1D vector to feed into dense layers.
    Flatten(),
    # First Dense (Fully Connected) Layer: 128 units, ReLU activation.
    Dense(128, activation='relu'),
    # Output Dense Layer: 10 units (for 10 MNIST classes), softmax activation for probability distribution.
    Dense(10, activation='softmax')
])

# Compile the model
# Optimizer: 'adam' is a popular choice for its efficiency.
# Loss function: 'sparse_categorical_crossentropy' is used for integer labels (0-9) with multi-class classification.
# Metrics: 'accuracy' is monitored during training.
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Print a summary of the model architecture
print("Model summary:")
model.summary()

Model summary:




## Train CNN Model

### Subtask:
Train the built CNN model using the preprocessed training data, monitoring validation performance over several epochs.

**Reasoning**:
I will train the compiled CNN model using `model.fit()`. The training will be performed on `x_train` and `y_train` for 10 epochs. A validation split of 10% will be used to monitor the model's performance on unseen data during training, which helps detect overfitting.
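
The effect of the validation split on the number of training steps can be checked with a little arithmetic. The sketch below assumes the 60,000 MNIST training images and the default `batch_size` of 32 that `model.fit()` uses when none is given. The variable names are illustrative.

```python
import math

num_train = 60000        # MNIST training images
validation_split = 0.1   # fraction held out by model.fit
batch_size = 32          # Keras default when batch_size is not passed (assumption stated above)

# Samples actually used for gradient updates, and samples held out for validation.
num_fit = int(num_train * (1 - validation_split))
num_val = num_train - num_fit

# Each epoch runs one step per batch; the last batch may be partial.
steps_per_epoch = math.ceil(num_fit / batch_size)

print(f"training samples: {num_fit}, validation samples: {num_val}")
print(f"steps per epoch: {steps_per_epoch}")
```

Note that Keras takes the validation samples from the end of the arrays before shuffling, so the held-out 10% is always the same set of images across epochs.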

In [None]:
# Train the model
print("\nTraining the model...")
# x_train: training images, y_train: training labels.
# epochs: number of times to iterate over the entire training dataset.
# validation_split: 10% of the training data will be used for validation during training.
history = model.fit(
    x_train, y_train,
    epochs=10,
    validation_split=0.1
)
print("Model training complete.")


Training the model...
Epoch 1/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 56s 33ms/step - accuracy: 0.9020 - loss: 0.3206 - val_accuracy: 0.9858 - val_loss: 0.0507
Epoch 2/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 82s 33ms/step - accuracy: 0.9850 - loss: 0.0485 - val_accuracy: 0.9863 - val_loss: 0.0464
Epoch 3/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 81s 32ms/step - accuracy: 0.9897 - loss: 0.0313 - val_accuracy: 0.9902 - val_loss: 0.0368
Epoch 4/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 55s 32ms/step - accuracy: 0.9940 - loss: 0.0193 - val_accuracy: 0.9905 - val_loss: 0.0370
Epoch 5/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 58s 35ms/step - accuracy: 0.9954 - loss: 0.0139 - val_accuracy: 0.9917 - val_loss: 0.0353
Epoch 6/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 82s 35ms/step - accuracy: 0.9960 - loss: 0.0118 - val_accuracy: 0.9888

## Evaluate CNN Model

### Subtask:
Evaluate the trained CNN model on the test dataset to determine its accuracy and loss on unseen data.

**Reasoning**:
I will use `model.evaluate()` with `x_test` and `y_test` to calculate the loss and accuracy of the trained model on the test dataset. This will provide an objective measure of how well the model generalizes to new, unseen data.
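
To illustrate what the two reported numbers mean, the sketch below computes a sparse categorical cross-entropy loss and an accuracy by hand from toy softmax outputs. The probabilities and labels are made-up values, not results from this model, and the calculation omits the small numerical clipping that Keras applies internally.

```python
import math

# Hypothetical softmax outputs for three samples over three classes (toy data).
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
]
labels = [0, 1, 2]  # integer class labels, in the same form as y_test

# Sparse categorical cross-entropy: mean of -log(probability assigned to the true class).
loss = sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

# Accuracy: fraction of samples whose highest-probability class matches the label.
predicted = [max(range(len(p)), key=p.__getitem__) for p in probs]
accuracy = sum(int(pred == y) for pred, y in zip(predicted, labels)) / len(labels)

print(f"Toy Loss: {loss:.4f}")
print(f"Toy Accuracy: {accuracy:.4f}")
```

The third sample is misclassified, which lowers the accuracy and also contributes the largest term to the loss, since the true class received a probability of only 0.3.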

In [None]:
# Evaluate the model on the test data
print("\nEvaluating the model on the test data...")
loss, accuracy = model.evaluate(x_test, y_test)

print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")
print("Model evaluation complete.")

Loading MNIST dataset...
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
MNIST dataset loaded and preprocessed.

Training the model...




Epoch 1/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 53s 31ms/step - accuracy: 0.9021 - loss: 0.3117 - val_accuracy: 0.9840 - val_loss: 0.0534
Epoch 2/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 53s 31ms/step - accuracy: 0.9859 - loss: 0.0455 - val_accuracy: 0.9855 - val_loss: 0.0526
Epoch 3/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 52s 31ms/step - accuracy: 0.9893 - loss: 0.0311 - val_accuracy: 0.9898 - val_loss: 0.0312
Epoch 4/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 51s 30ms/step - accuracy: 0.9938 - loss: 0.0187 - val_accuracy: 0.9902 - val_loss: 0.0422
Epoch 5/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 82s 31ms/step - accuracy: 0.9953 - loss: 0.0143 - val_accuracy: 0.9888 - val_loss: 0.0406
Epoch 6/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 53s 31ms/step - accuracy: 0.9965 - loss: 0.0107 - val_accuracy: 0.9903 - val_loss: 0.0403
Epoc