# Task 2: Deep Learning with TensorFlow/PyTorch - MNIST Handwritten Digits

**Goal:**
1. Build a CNN model to classify handwritten digits.
2. Achieve >95% test accuracy.
3. Visualize the model’s predictions on 5 sample images.

## 1. Import Libraries (using TensorFlow for this example)

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt
import numpy as np

## 2. Load and Preprocess MNIST Dataset

In [None]:
# Load data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess images
# Reshape data to fit the model (add channel dimension)
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)

# Normalize pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Preprocess labels (one-hot encode)
y_train_categorical = to_categorical(y_train, num_classes=10)
y_test_categorical = to_categorical(y_test, num_classes=10)

print(f"x_train shape: {x_train.shape}")
print(f"y_train_categorical shape: {y_train_categorical.shape}")
print(f"x_test shape: {x_test.shape}")
print(f"y_test_categorical shape: {y_test_categorical.shape}")
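
To make the preprocessing steps concrete, the following is a minimal NumPy-only sketch of what the reshape, scaling, and one-hot encoding do. The helper `one_hot` and the arrays `fake_images` and `fake_labels` are hypothetical stand-ins introduced for illustration; this is not how `to_categorical` is implemented internally, only a function that should produce the same kind of output for integer labels.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 matrix with a single 1.0 per row."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Row i gets a 1.0 in the column given by labels[i].
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Hypothetical stand-in for two 28x28 uint8 images (all black, all white).
fake_images = np.stack([
    np.zeros((28, 28), dtype=np.uint8),
    np.full((28, 28), 255, dtype=np.uint8),
])

# Add the channel dimension and scale pixel values from [0, 255] to [0, 1].
scaled = fake_images.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# Hypothetical integer labels, encoded as one row of length 10 per label.
fake_labels = np.array([3, 0, 9])
encoded = one_hot(fake_labels, num_classes=10)

print(scaled.shape, scaled.min(), scaled.max())
print(encoded[0])
```

Taking `np.argmax` along each row of `encoded` recovers the original integer labels, which is the same operation used later to turn softmax outputs into predicted digits.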

## 3. Build the CNN Model

In [None]:
model = Sequential([
    # Convolutional Layer 1
    Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    
    # Convolutional Layer 2
    Conv2D(64, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    
    # Flattening Layer
    Flatten(),
    
    # Fully Connected (Dense) Layer
    Dense(128, activation='relu'),
    Dropout(0.5), # Dropout for regularization
    
    # Output Layer
    Dense(10, activation='softmax') # 10 classes for digits 0-9
])

# Compile the model
model.compile(optimizer='adam', 
              loss='categorical_crossentropy', 
              metrics=['accuracy'])

# Print model summary
model.summary()

### Model Architecture Explanation:
- **Conv2D (Convolutional Layer):** Applies filters to the input image to create feature maps. `32` and `64` are the number of filters. `kernel_size=(3,3)` is the size of the filter. `activation='relu'` introduces non-linearity.
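- **MaxPooling2D:** Downsamples each feature map by keeping the maximum value in every 2×2 window, which halves the height and width. This cuts computation in later layers and makes the model less sensitive to small shifts in where a stroke appears.
- **Flatten:** Reshapes the stack of feature maps (a 5×5×64 tensor for this model) into a 1D vector of 1,600 values to be fed into the Dense layers.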
- **Dense (Fully Connected Layer):** A standard neural network layer where each neuron is connected to all neurons in the previous layer. `128` is the number of neurons.
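- **Dropout:** A regularization technique that sets randomly selected activations to zero during training, which helps prevent overfitting. `0.5` means each output of the preceding Dense layer is dropped with probability 0.5 at every training step; dropout is disabled at evaluation and prediction time.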
- **Output Layer (Dense):** Has `10` neurons (one for each digit) and uses `softmax` activation to output probabilities for each class.
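
The layer-by-layer shapes and parameter counts can be checked by hand. The sketch below does that arithmetic in plain Python, assuming the Keras defaults used by the model above (`padding='valid'`, convolution stride 1, pooling stride equal to the pool size). The helper names are hypothetical and exist only for this illustration; the results should match what `model.summary()` reports.

```python
def conv_out(size, kernel):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Weights (kernel * kernel * in_channels per filter) plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

size = conv_out(28, 3)                 # 28 -> 26
conv1 = conv_params(3, 1, 32)          # 320
size = pool_out(size, 2)               # 26 -> 13
size = conv_out(size, 3)               # 13 -> 11
conv2 = conv_params(3, 32, 64)         # 18,496
size = pool_out(size, 2)               # 11 -> 5
flat = size * size * 64                # 5 * 5 * 64 = 1,600
dense1 = dense_params(flat, 128)       # 204,928
output = dense_params(128, 10)         # 1,290
total = conv1 + conv2 + dense1 + output

print(f"Flattened vector length: {flat}")
print(f"Total trainable parameters: {total}")
```

Most of the parameters sit in the first Dense layer, which is why the two pooling steps matter: without them the flattened vector would be far longer and that layer far larger.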

## 4. Train the Model

In [None]:
batch_size = 128
epochs = 10 # Can be increased if needed to reach >95% accuracy

history = model.fit(x_train, y_train_categorical, 
                    epochs=epochs, 
                    batch_size=batch_size, 
                    validation_split=0.1) # Use part of training data for validation

## 5. Evaluate the Model on Test Data

In [None]:
loss, accuracy = model.evaluate(x_test, y_test_categorical, verbose=0)
print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

if accuracy > 0.95:
    print("\nModel achieved >95% test accuracy!")
else:
    print("\nModel did not achieve >95% test accuracy. Consider training for more epochs or adjusting the architecture.")

## 6. Visualize Predictions on Sample Images

In [None]:
# Get predictions for the test set
predictions = model.predict(x_test)

# Select 5 random sample images from the test set
num_samples = 5
sample_indices = np.random.choice(x_test.shape[0], num_samples, replace=False)

plt.figure(figsize=(15, 5))
for i, index in enumerate(sample_indices):
    plt.subplot(1, num_samples, i + 1)
    plt.imshow(x_test[index].reshape(28, 28), cmap='gray')
    predicted_label = np.argmax(predictions[index])
    true_label = y_test[index]  # Integer label; same value as np.argmax(y_test_categorical[index])
    plt.title(f"Pred: {predicted_label}\nTrue: {true_label}")
    plt.axis('off')
plt.tight_layout()
plt.show()

### Training Loop Explanation (for Section 4):
- **`model.fit()`:** This function trains the model.
- **`x_train, y_train_categorical`:** The training data and corresponding one-hot encoded labels.
- **`epochs`:** The number of times the model will iterate over the entire training dataset.
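- **`batch_size`:** The number of samples processed before the model's weights are updated; each batch produces one gradient update.
- **`validation_split=0.1`:** Reserves 10% of the training data as validation data. Keras takes these samples from the end of the arrays, before any shuffling, and does not train on them. Validation loss and accuracy are reported after every epoch; a validation loss that rises while training loss keeps falling is a sign of overfitting.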
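
To see how these settings interact, the sketch below works out how many samples are trained on and how many weight updates happen per epoch for the values used in this notebook. The function `split_and_steps` is a hypothetical helper written for this illustration, and it assumes the validation count is simply the rounded fraction of the dataset; Keras's exact rounding may differ for dataset sizes that do not divide evenly.

```python
import math

def split_and_steps(num_samples, validation_split, batch_size):
    """Return (training samples, validation samples, weight updates per epoch)."""
    num_val = round(num_samples * validation_split)
    num_train = num_samples - num_val
    # The last batch may be smaller than batch_size, hence the ceiling.
    steps = math.ceil(num_train / batch_size)
    return num_train, num_val, steps

# Values from this notebook: 60,000 MNIST training images,
# validation_split=0.1, batch_size=128, epochs=10.
num_train, num_val, steps_per_epoch = split_and_steps(60000, 0.1, 128)
total_updates = steps_per_epoch * 10

print(f"Training samples:   {num_train}")
print(f"Validation samples: {num_val}")
print(f"Updates per epoch:  {steps_per_epoch}")
print(f"Updates in total:   {total_updates}")
```

The updates-per-epoch figure is the step count shown in the progress bar that `model.fit()` prints for each epoch.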