# CNN on MNIST Dataset

In this notebook, we will train a **Convolutional Neural Network (CNN)** on the **MNIST dataset**, which contains 70,000 grayscale images (28×28 pixels) of handwritten digits (0–9), split into 60,000 training and 10,000 test images.

We'll explore data loading, preprocessing, CNN architecture, training, and performance evaluation.

## 📦 1. Importing Libraries

In [None]:
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
import numpy as np

## 📥 2. Load and Preprocess MNIST Dataset

In [None]:
# Load dataset
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0

# Reshape for CNN (samples, height, width, channels)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

print('Training set shape:', x_train.shape)
print('Testing set shape:', x_test.shape)
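To make the two preprocessing steps concrete, here is a small sketch on synthetic data. The array `raw` is a hypothetical stand-in for a few MNIST images, not the real dataset, and the explicit `float32` cast is an assumption added here (dividing a `uint8` array by `255.0` directly, as in the cell above, yields `float64` in NumPy).

```python
import numpy as np

# Hypothetical stand-in for raw MNIST images: 4 samples of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Scale to [0, 1]; casting to float32 first halves memory compared to float64.
scaled = raw.astype("float32") / 255.0

# Add a trailing channel axis: (samples, height, width) -> (samples, height, width, 1).
batch = scaled[..., np.newaxis]

print(batch.shape, batch.dtype)
```

Using `[..., np.newaxis]` is equivalent to `reshape(-1, 28, 28, 1)` for this data, but it does not hard-code the image size.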

### 🔍 Visualize a few samples

In [None]:
plt.figure(figsize=(10,2))
for i in range(10):
    plt.subplot(1,10,i+1)
    plt.imshow(x_train[i].reshape(28,28), cmap='gray')
    plt.axis('off')
    plt.title(str(y_train[i]))
plt.show()

## 🧩 3. Build CNN Architecture

In [None]:
model = models.Sequential([
    layers.Input(shape=(28,28,1)),  # declare the input shape explicitly
    layers.Conv2D(32, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])

model.summary()
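The shapes and parameter counts that `model.summary()` reports can be checked by hand. The sketch below redoes that arithmetic in plain Python, assuming the Keras defaults used above: `'valid'` padding and stride 1 for the convolutions, and a pool stride equal to the pool size. The helper names `conv2d`, `max_pool`, and `dense` are illustrative only, not Keras functions.

```python
def conv2d(size, in_ch, out_ch, kernel=3):
    """Output side length and parameter count for a 'valid', stride-1 convolution."""
    params = kernel * kernel * in_ch * out_ch + out_ch  # weights + one bias per filter
    return size - kernel + 1, params


def max_pool(size, pool=2):
    """Output side length for non-overlapping pooling (incomplete windows are dropped)."""
    return size // pool


def dense(in_units, out_units):
    """Parameter count of a fully connected layer: weights + biases."""
    return in_units * out_units + out_units


size, p_conv1 = conv2d(28, 1, 32)     # 28 -> 26, 320 parameters
size = max_pool(size)                 # 26 -> 13
size, p_conv2 = conv2d(size, 32, 64)  # 13 -> 11, 18,496 parameters
size = max_pool(size)                 # 11 -> 5
flat = size * size * 64               # Flatten: 5 * 5 * 64 = 1600 features

p_dense1 = dense(flat, 128)           # 204,928 parameters
p_dense2 = dense(128, 10)             # 1,290 parameters
total = p_conv1 + p_conv2 + p_dense1 + p_dense2

print(flat, total)  # 1600 225034
```

Most of the parameters sit in the first dense layer. Pooling and dropout layers add none.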

## ⚙️ 4. Compile the Model

In [None]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
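The `sparse_categorical_crossentropy` loss fits here because the labels are plain integers (0–9), not one-hot vectors. For a single sample it is the negative log of the probability the model assigns to the true class. The sketch below is a simplified plain-Python version for illustration. The probability vectors are made up, and the real Keras implementation additionally clips probabilities for numerical stability.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(probs[label])


# Hypothetical softmax outputs for a digit whose true label is 7.
confident = [0.01] * 10
confident[7] = 0.91          # the nine other classes share the remaining 0.09
uniform = [0.1] * 10         # the model has no idea which digit it is

loss_confident = sparse_categorical_crossentropy(7, confident)
loss_uniform = sparse_categorical_crossentropy(7, uniform)

print(f"confident: {loss_confident:.4f}, uniform: {loss_uniform:.4f}")
```

A uniform guess over 10 classes gives a loss of ln(10) ≈ 2.30, which is roughly where the training loss starts before the network has learned anything.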

## 🚀 5. Train the CNN

In [None]:
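# Note: the test set is reused as validation data here, so the validation
# metrics are not independent of the final test evaluation.
history = model.fit(x_train, y_train, epochs=5, batch_size=64,
                    validation_data=(x_test, y_test))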
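The number of steps Keras reports per epoch follows directly from the dataset size and `batch_size`. The sketch below works through that arithmetic, and also shows how it changes if 10% of the training data were held out for validation instead of reusing the test set (for example via the `validation_split` argument of `model.fit`). The 10% figure and the helper name `steps_per_epoch` are assumptions for illustration, not part of the original notebook.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch; the last batch may be smaller."""
    return math.ceil(num_samples / batch_size)


num_train = 60_000
batch_size = 64

# As in the training cell above: all 60,000 training images are used.
steps_full = steps_per_epoch(num_train, batch_size)

# Assumed alternative: hold out 10% of the training set for validation.
num_val = num_train // 10
steps_holdout = steps_per_epoch(num_train - num_val, batch_size)

print(steps_full, steps_holdout)  # 938 844
```

Holding out part of the training set keeps the test set untouched until the final evaluation, at the cost of slightly fewer training samples.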

## 📊 6. Plot Accuracy and Loss Curves

In [None]:
plt.figure(figsize=(12,4))
plt.subplot(1,2,1)
plt.plot(history.history['accuracy'], label='Train Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(); plt.title('Model Accuracy')
plt.subplot(1,2,2)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend(); plt.title('Model Loss')
plt.show()

## 🧮 7. Evaluate Model Performance

In [None]:
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"Test Accuracy: {test_acc*100:.2f}%")

## 🔍 8. Visualize Predictions

In [None]:
predictions = model.predict(x_test[:9])

plt.figure(figsize=(9,9))
for i in range(9):
    plt.subplot(3,3,i+1)
    plt.imshow(x_test[i].reshape(28,28), cmap='gray')
    plt.title(f"Pred: {np.argmax(predictions[i])}\nTrue: {y_test[i]}")
    plt.axis('off')
plt.show()
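Each row of `predictions` is a vector of class probabilities, and the predicted digit is the index of the largest entry. The same idea can be used to list the misclassified samples, which are often the most informative ones to plot. The sketch below uses small made-up probability vectors with three classes rather than real model output.

```python
def argmax(values):
    """Index of the largest element (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical softmax outputs for three samples and three classes.
probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.3, 0.5],
]
true_labels = [1, 1, 2]

predicted = [argmax(p) for p in probs]
wrong = [i for i, (p, t) in enumerate(zip(predicted, true_labels)) if p != t]
accuracy = 1 - len(wrong) / len(true_labels)

print(predicted, wrong)  # [1, 0, 2] [1]
```

With the real model, `np.argmax(predictions, axis=1)` performs the same step for all rows at once.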

## 🧾 Summary
- Built a CNN to classify handwritten digits.
- Reached high test accuracy (>98%) after only 5 epochs.
- Demonstrated training curves and visualized predictions.

Next Notebook → `07-CNN_on_CIFAR10.ipynb` 🖼️