# Module 9 – Build a CNN Step by Step (Keras + MNIST)

A hands-on walkthrough inspired by the Medium guide. Run cells in order, read the notes, and tweak the model.

**You will practice:**
- Loading and inspecting MNIST digits
- Preparing images for a CNN (scaling, channel dimension, one-hot labels)
- Stacking `Conv2D` + `MaxPooling2D` layers in Keras
- Training, evaluating, and plotting results
- Running small experiments to improve the baseline

**Setup:** install `tensorflow` and `matplotlib` (e.g., `pip install tensorflow matplotlib`).

## 1. Imports and reproducibility
Keep dependencies minimal and set seeds so runs stay comparable. GPU runs may still vary slightly.

In [4]:
        import os
        import random
        import numpy as np
        import matplotlib.pyplot as plt
        import tensorflow as tf
        
        # Reproducibility (CPU is most deterministic)
        seed = 42
        random.seed(seed)
        np.random.seed(seed)
        tf.random.set_seed(seed)
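        # Note: set after startup, this only reaches subprocesses; to affect this
        # process, PYTHONHASHSEED must be exported before Python launches.
        os.environ['PYTHONHASHSEED'] = str(seed)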
        
        print('TensorFlow version:', tf.__version__)
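
To see what seeding buys you, here is a minimal sketch using only the standard library and NumPy. `draw_numbers` is a hypothetical helper, not part of the notebook; it reseeds both generators and draws a few values so two runs can be compared.

```python
import random
import numpy as np

def draw_numbers(seed):
    """Reseed Python's and NumPy's global generators, then draw a few values."""
    random.seed(seed)
    np.random.seed(seed)
    py_values = [random.random() for _ in range(3)]
    np_values = np.random.rand(3)
    return py_values, np_values

first_py, first_np = draw_numbers(42)
second_py, second_np = draw_numbers(42)
other_py, other_np = draw_numbers(7)

# Same seed should reproduce the draws; a different seed should not
same_seed_matches = first_py == second_py and np.array_equal(first_np, second_np)
different_seed_differs = first_py != other_py
print("Same seed, same draws:", same_seed_matches)
print("Different seed, different draws:", different_seed_differs)
```

The same idea applies to weight initialization and shuffling in TensorFlow, though GPU kernels can still introduce small differences.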
        


## 2. Load MNIST digits
MNIST has 70,000 grayscale images of handwritten digits, each 28x28 pixels, split into 60,000 training and 10,000 test images. Keras fetches and caches it for us.

In [None]:
        # Keras gives train and test splits out of the box
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
        
        print('Training set:', x_train.shape, 'Labels:', y_train.shape)
        print('Test set     :', x_test.shape, 'Labels:', y_test.shape)
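
Beyond shapes, it helps to check the dtype, pixel range, and class balance before preprocessing. The sketch below assumes a hypothetical helper, `summarize_split`, and runs on a synthetic array with MNIST's layout so it works without a download; in the notebook you would pass `x_train, y_train` instead.

```python
import numpy as np

def summarize_split(images, labels, num_classes=10):
    """Return basic facts worth checking before any preprocessing."""
    return {
        "shape": images.shape,
        "dtype": str(images.dtype),
        "pixel_range": (int(images.min()), int(images.max())),
        "class_counts": np.bincount(labels, minlength=num_classes).tolist(),
    }

# Synthetic stand-in with MNIST's layout (uint8 pixels, integer labels 0-9)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_labels = np.arange(100) % 10

summary = summarize_split(fake_images, fake_labels)
print(summary)
```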
        

## 3. Peek at a few images
Always look at the raw data first to ground the rest of the pipeline.

In [None]:
        def show_examples(images, labels, n=6):
            plt.figure(figsize=(10, 2))
            for i in range(n):
                plt.subplot(1, n, i + 1)
                plt.imshow(images[i], cmap='gray')
                plt.axis('off')
                plt.title(int(labels[i]))
            plt.show()
        
        show_examples(x_train, y_train)
        

## 4. Preprocess: scale, reshape, one-hot labels
- CNNs expect channels, so reshape from `(28, 28)` to `(28, 28, 1)`.
- Scale pixels to `[0, 1]` for stable gradients.
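- Convert labels to one-hot vectors (`y_train_cat`, `y_test_cat`), the format `categorical_crossentropy` expects for a `softmax` output. The sparse loss used later takes the integer labels directly, so these vectors are here mainly to show the encoding.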

In [None]:
        x_train = (x_train.astype('float32') / 255.0)[..., None]
        x_test = (x_test.astype('float32') / 255.0)[..., None]
        
        num_classes = 10
        y_train_cat = tf.keras.utils.to_categorical(y_train, num_classes)
        y_test_cat = tf.keras.utils.to_categorical(y_test, num_classes)
        
        print('New train shape:', x_train.shape)
        print('Example label vector:', y_train_cat[0])
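
The three preprocessing steps can also be written in plain NumPy, which makes it easier to see what each one does. This is a sketch on a tiny synthetic batch (the array contents are made up); the `np.eye` indexing is one common way to get the same one-hot layout that `to_categorical` returns.

```python
import numpy as np

# Synthetic stand-in for two 28x28 uint8 images and their integer labels
images = np.zeros((2, 28, 28), dtype=np.uint8)
images[0, 14, 14] = 255
labels = np.array([3, 7])

# Scale to [0, 1] and append a trailing channel axis: (2, 28, 28) -> (2, 28, 28, 1)
scaled = (images.astype("float32") / 255.0)[..., None]

# One-hot encode: row i of the identity matrix is the vector for class i
num_classes = 10
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(scaled.shape, scaled.min(), scaled.max())
print(one_hot)
```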
        

## 5. Build the CNN
Start with a small, readable architecture:
- `Conv2D` layers learn local patterns; the `relu` activation adds non-linearity.
- `MaxPooling2D` downsamples while keeping the strongest activation in each window.
- `Dropout` regularizes; `Dense` + `softmax` maps to 10 classes.

In [None]:
        model = tf.keras.Sequential([
            tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
            tf.keras.layers.MaxPooling2D((2, 2)),
            tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
            tf.keras.layers.MaxPooling2D((2, 2)),
            tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dropout(0.3),
            tf.keras.layers.Dense(64, activation='relu'),
            tf.keras.layers.Dense(num_classes, activation='softmax')
        ])
        
        model.summary()
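
The numbers in the summary can be checked by hand. The sketch below assumes Keras defaults for the layers above (3x3 kernels, `'valid'` padding, stride 1, and 2x2 non-overlapping pooling); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units

size = 28
size = conv_out(size)    # 26
size = pool_out(size)    # 13
size = conv_out(size)    # 11
size = pool_out(size)    # 5
size = conv_out(size)    # 3
flat = size * size * 64  # features entering the first Dense layer

total = (
    conv_params(3, 1, 32)
    + conv_params(3, 32, 64)
    + conv_params(3, 64, 64)
    + dense_params(flat, 64)
    + dense_params(64, 10)
)
print("Flattened features:", flat)
print("Trainable parameters:", total)
```

`Flatten`, `Dropout`, and the pooling layers add no parameters, so the total comes only from the convolution and dense layers.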
        

## 6. Compile the model
- Optimizer: Adam (adaptive learning rate)
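- Loss: `sparse_categorical_crossentropy`, which takes the integer labels (`y_train`) directly; switch to `categorical_crossentropy` if you train on the one-hot `y_train_cat` instead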
- Metric: accuracy keeps feedback simple

In [None]:
        model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy']
        )
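
The sparse and one-hot forms of cross-entropy compute the same per-sample value; they differ only in how the target is passed. A small NumPy sketch with made-up probabilities illustrates this:

```python
import numpy as np

# Made-up softmax outputs for two samples over three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
int_labels = np.array([0, 2])

# Sparse form: pick the probability of the true class by index
sparse_loss = -np.log(probs[np.arange(len(int_labels)), int_labels])

# Categorical form: the same quantity written with one-hot targets
one_hot = np.eye(probs.shape[1])[int_labels]
categorical_loss = -(one_hot * np.log(probs)).sum(axis=1)

print(sparse_loss)
print(categorical_loss)
```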
        

## 7. Train
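Five epochs are fast on CPU and good for a first pass. Validation uses the test set for quick feedback; if you start tuning against it, hold out part of the training data instead so the test score stays an honest estimate.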

In [None]:
        history = model.fit(
            x_train, y_train,
            epochs=5,
            batch_size=128,
            validation_data=(x_test, y_test),
            verbose=2
        )
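
One way to keep the test set untouched is to carve a validation set out of the training data. The sketch below assumes a hypothetical helper, `split_train_val`, and uses synthetic arrays so it runs without downloading MNIST.

```python
import numpy as np

def split_train_val(x, y, val_fraction=0.1, seed=42):
    """Shuffle indices once, then use the first slice for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])

# Synthetic stand-in: each "image" is just its own index
x_demo = np.arange(1000).reshape(1000, 1)
y_demo = np.arange(1000) % 10

(x_tr, y_tr), (x_val, y_val) = split_train_val(x_demo, y_demo)
print(len(x_tr), len(x_val))
```

In the notebook you would call it on `x_train, y_train` and pass `validation_data=(x_val, y_val)` to `model.fit`. Keras also accepts `validation_split=0.1` in `fit`, which holds out the last 10% of the arrays without shuffling first.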
        

## 8. Evaluate and plot learning curves
Check test accuracy, then look at loss/accuracy over epochs to spot under- or overfitting.

In [None]:
        test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
        print(f"Test accuracy: {test_acc:.4f} | Test loss: {test_loss:.4f}")
        
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
        ax1.plot(history.history['loss'], label='train')
        ax1.plot(history.history['val_loss'], label='val')
        ax1.set_title('Loss')
        ax1.legend()
        
        ax2.plot(history.history['accuracy'], label='train')
        ax2.plot(history.history['val_accuracy'], label='val')
        ax2.set_title('Accuracy')
        ax2.legend()
        plt.show()
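
The curves can also be read numerically. The sketch below uses two hypothetical helpers and a made-up history dictionary (the numbers are for illustration only, not results from this notebook); with a real run you would pass `history.history`.

```python
def best_epoch(history_dict, key="val_loss"):
    """Return the 1-based epoch with the lowest value for `key`."""
    values = history_dict[key]
    return min(range(len(values)), key=values.__getitem__) + 1

def overfit_gap(history_dict):
    """Final-epoch validation loss minus training loss."""
    return history_dict["val_loss"][-1] - history_dict["loss"][-1]

# Made-up numbers purely for illustration
example_history = {
    "loss":     [0.40, 0.20, 0.12, 0.08, 0.05],
    "val_loss": [0.25, 0.18, 0.15, 0.16, 0.19],
}

print("Best epoch by val_loss:", best_epoch(example_history))
print("Final val-train loss gap:", round(overfit_gap(example_history), 2))
```

A validation loss that bottoms out and then rises while training loss keeps falling, as in the made-up example, is the usual sign of overfitting.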
        

## 9. Predict on new images
Turn model outputs into class IDs with `argmax`, then visualize a few predictions next to the digits.

In [None]:
        num_samples = 8
        sample_images = x_test[:num_samples]
        sample_labels = y_test[:num_samples]
        preds = model.predict(sample_images)
        pred_classes = preds.argmax(axis=1)
        
        show_examples(sample_images.squeeze(-1), pred_classes, n=num_samples)
        print('True labels:', sample_labels.tolist())
        print('Predicted   :', pred_classes.tolist())
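
To make the `argmax` step concrete, here is a NumPy sketch on made-up probabilities. It also pulls out the confidence of each prediction and the indices of wrong ones, which is a handy starting point for inspecting errors.

```python
import numpy as np

# Made-up softmax outputs for three samples over four classes
probs = np.array([[0.05, 0.90, 0.03, 0.02],
                  [0.40, 0.10, 0.45, 0.05],
                  [0.10, 0.10, 0.10, 0.70]])
true_labels = np.array([1, 0, 3])

pred_ids = probs.argmax(axis=1)                  # most likely class per row
confidence = probs.max(axis=1)                   # probability of that class
wrong = np.flatnonzero(pred_ids != true_labels)  # misclassified sample indices

print("Predicted:", pred_ids.tolist())
print("Confidence:", confidence.tolist())
print("Misclassified indices:", wrong.tolist())
```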
        

## 10. Try your own experiments
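- Train longer (`epochs=10` or more) or with a smaller `batch_size`.
- Add `BatchNormalization` after convolutions.
- Replace pooling with a `Conv2D` that uses `strides=2`, so the downsampling step is learned rather than fixed.
- Add data augmentation: `RandomRotation`, `RandomZoom`, `RandomFlip`. For digits, prefer small rotations and zooms, since a flipped digit is no longer a valid example of its class.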
- Swap in `tf.keras.datasets.cifar10` (RGB, shape `(32, 32, 3)`) and adjust the model.
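
As a starting point for the `BatchNormalization` and strided-convolution experiments, here is one possible variant. It is a sketch, not a tuned architecture: `build_variant` is a hypothetical name, and the filter counts and `padding='same'` are assumptions chosen to keep the shapes easy to follow.

```python
import tensorflow as tf

def build_variant(input_shape=(28, 28, 1), num_classes=10):
    """Hypothetical variant: strided convolutions + batch norm, no pooling."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        # Stride 2 with 'same' padding halves the spatial size: 28 -> 14.
        # The bias is dropped because BatchNormalization adds its own shift.
        tf.keras.layers.Conv2D(32, 3, strides=2, padding='same', use_bias=False),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Activation('relu'),
        # 14 -> 7
        tf.keras.layers.Conv2D(64, 3, strides=2, padding='same', use_bias=False),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Activation('relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(num_classes, activation='softmax'),
    ])

variant = build_variant()
variant.summary()
```

Compile and train it with the same settings as the baseline so the comparison is fair.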