# Class 2: Convolutional Neural Networks Fundamentals

**Week 9: Convolutional Neural Networks (CNNs) and Image Processing**

## Objective
In this class, we'll dive into **Convolutional Neural Networks (CNNs)**, the backbone of deep learning for images. You'll learn how CNNs process image data and build a simple CNN to classify images from the CIFAR-10 dataset.

## Agenda
1. Why use CNNs for images?
2. Understanding CNN layers: convolutions, pooling, and fully connected layers.
3. Building a simple CNN with TensorFlow/Keras.
4. Exercise: Modify and experiment with the CNN architecture.

## Setup
Ensure you have the required libraries installed:
```bash
pip install tensorflow numpy matplotlib
```

Let's get started!

## Part 1: Why Use CNNs?

Images are high-dimensional (e.g., a 32x32x3 image has 3,072 values). Fully connected neural networks (like MLPs) struggle with images because:
- They have too many parameters, leading to overfitting and high computation.
- They ignore spatial structure (e.g., nearby pixels form patterns like edges).

**Convolutional Neural Networks (CNNs)** solve this by:
- Using **filters** to detect local patterns (e.g., edges, textures).
- Sharing weights to reduce parameters.
- Leveraging spatial hierarchies to learn complex features.
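
To make the effect of weight sharing concrete, here is a rough count of parameters for one dense layer versus one convolutional layer on a 32x32x3 input. The 1,000-unit dense layer and the helper names `dense_params` and `conv_params` are illustrative assumptions, not part of the model built later in this class.

```python
# Weights + biases for a dense layer that connects every input value to every unit.
def dense_params(n_inputs, n_units):
    return n_inputs * n_units + n_units

# Weights + biases for a conv layer: each filter spans kernel_h x kernel_w x in_channels.
def conv_params(kernel_h, kernel_w, in_channels, n_filters):
    return kernel_h * kernel_w * in_channels * n_filters + n_filters

n_pixels = 32 * 32 * 3                        # 3,072 input values
dense_count = dense_params(n_pixels, 1000)    # hypothetical 1,000-unit hidden layer
conv_count = conv_params(3, 3, 3, 16)         # 16 filters of size 3x3 over 3 channels

print(f"Dense layer parameters: {dense_count:,}")   # 3,073,000
print(f"Conv layer parameters:  {conv_count:,}")    # 448
```

The convolutional layer reuses the same small filters at every image position, so its parameter count does not depend on the image size.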

Let's explore the key components of a CNN.

## Part 2: CNN Layers

A CNN is typically built from four main components:

1. **Convolutional Layers**:
   - Apply **filters** (e.g., 3x3 kernels) to input images to produce **feature maps**.
   - Each filter detects a specific pattern (e.g., horizontal edges).
   - Parameters: Number of filters, kernel size, stride, padding.
   - Example: Applying 16 3x3 filters to a 32x32x3 image at stride 1 produces 16 feature maps, each 30x30 without padding (or 32x32 with `same` padding).

2. **Pooling Layers**:
   - Reduce spatial dimensions (e.g., from 32x32 to 16x16), which lowers computation and can help reduce overfitting.
   - Common type: **Max pooling** (takes the maximum value in a region, e.g., 2x2).
   - Example: 2x2 max pooling with stride 2 halves the width and height.

3. **Fully Connected Layers**:
   - Flatten feature maps and feed them into dense layers for classification.
   - Example: Output layer with 10 neurons for CIFAR-10's 10 classes.

4. **Activation Functions**:
   - **ReLU** (Rectified Linear Unit) is commonly used after convolutions to add non-linearity (`max(0, x)`).
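
The operations described above can be sketched in plain NumPy on a toy single-channel image. This is a minimal illustration, not how Keras implements its layers: the function names `conv2d_valid`, `relu`, and `max_pool_2x2`, the 6x6 image, and the hand-picked edge kernel are all assumptions made for the example. As in deep learning libraries, the "convolution" here slides the kernel without flipping it.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a 2D kernel over a 2D image with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel with the patch under it, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Replace negative values with zero: max(0, x)."""
    return np.maximum(0, x)

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 (a trailing odd row/column is dropped)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6x6 image: dark top half (0), bright bottom half (1)
image = np.zeros((6, 6), dtype=np.float32)
image[3:, :] = 1.0

# Hand-picked kernel that responds to a dark-to-bright horizontal edge
kernel = np.array([[-1, -1, -1],
                   [ 0,  0,  0],
                   [ 1,  1,  1]], dtype=np.float32)

feature_map = relu(conv2d_valid(image, kernel))   # shape (4, 4)
pooled = max_pool_2x2(feature_map)                # shape (2, 2)

print(feature_map)
print(pooled)
```

The feature map is non-zero only in the rows where the kernel straddles the edge, and pooling keeps that response while halving the width and height.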

**Question**: Why does pooling help prevent overfitting?

## Part 3: Building a Simple CNN

We'll use TensorFlow/Keras to build a CNN for CIFAR-10 classification. The architecture will include:
- 1 convolutional layer (16 filters, 3x3 kernel, ReLU).
- 1 max pooling layer (2x2).
- 1 fully connected layer for classification.

First, let's load and preprocess the CIFAR-10 dataset.

```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Define class names
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Verify shapes
print('Training data shape:', x_train.shape)
print('Test data shape:', x_test.shape)
```

**Explanation**:
- We normalized pixel values (0-255 to 0-1) to help the model train faster.
- `x_train`: 50,000 images of shape (32, 32, 3).
- `y_train`: Labels as integers (0-9).
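
If you want to check the preprocessing without downloading anything, the sketch below applies the same normalization to random stand-in arrays that share CIFAR-10's layout. The names `fake_images` and `fake_labels` are made up for this example, and the values are random noise rather than real images. It also shows that the labels come as a column of shape (N, 1), which is why a single label is read with `[i][0]`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins with the same layout as CIFAR-10 (random values, not real images)
fake_images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=(4, 1))

# Same normalization as above: uint8 in [0, 255] -> float32 in [0, 1]
scaled = fake_images.astype('float32') / 255.0
print(scaled.dtype, scaled.min() >= 0.0, scaled.max() <= 1.0)

# Labels have shape (N, 1); ravel() flattens them to shape (N,)
flat_labels = fake_labels.ravel()
print(fake_labels.shape, flat_labels.shape)   # (4, 1) (4,)
```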

Now, let's define the CNN model.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Build the CNN
model = Sequential([
    # Convolutional layer: 16 filters, 3x3 kernel, ReLU activation
    Conv2D(16, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    # Max pooling: 2x2 pool size
    MaxPooling2D((2, 2)),
    # Flatten feature maps for dense layer
    Flatten(),
    # Fully connected layer: 10 outputs for 10 classes
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Display model summary
model.summary()
```

**Explanation**:
- `Conv2D`: Applies 16 3x3 filters, outputs 16 feature maps.
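- `MaxPooling2D`: Reduces feature maps from 30x30 to 15x15. The input to pooling is 30x30 rather than 32x32 because `Conv2D` uses no padding by default (`padding='valid'`), so the 3x3 kernel trims one pixel from each border.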
- `Flatten`: Converts feature maps to a 1D vector.
- `Dense`: Outputs probabilities for 10 classes (softmax activation).
- `sparse_categorical_crossentropy`: Loss function for integer labels.
- `model.summary()`: Shows layer shapes and parameter counts.
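
The shapes and parameter counts that `model.summary()` reports for this architecture can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming stride 1 and no padding for the convolution and stride 2 for the pooling (the Keras defaults used above). The helper name `conv_output_size` is an assumption introduced for this example.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

conv_size = conv_output_size(32, kernel=3)    # 30 with no padding
pool_size = conv_size // 2                    # 15 after 2x2 pooling with stride 2
flat_size = pool_size * pool_size * 16        # 3,600 values fed to the dense layer

n_conv_params = 3 * 3 * 3 * 16 + 16           # 448: 16 filters of 3x3x3 weights, plus biases
n_dense_params = flat_size * 10 + 10          # 36,010: one weight per input per class, plus biases
total_params = n_conv_params + n_dense_params # 36,458

print(conv_size, pool_size, flat_size)
print(n_conv_params, n_dense_params, total_params)
```

Note that almost all of the parameters sit in the dense layer, not in the convolution.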

**Question**: How does the number of filters affect the model?

Let's train the model for a few epochs (this may take a minute).

```python
# Train the model
history = model.fit(x_train, y_train, epochs=5, batch_size=32,
                    validation_data=(x_test, y_test))

# Plot training accuracy
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
```

**Explanation**:
- Trained for 5 epochs to keep it quick (accuracy may be low due to simplicity).
- Plotted accuracy to visualize learning progress.
- Validation accuracy (on test set) shows how well the model generalizes.
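
Using the test set for validation keeps this class simple, but in practice it is more common to hold out part of the training data so the test set stays untouched until the end. The sketch below shows one way to do that with NumPy slicing. The helper `holdout_split` and the small zero-filled demo arrays are assumptions for illustration only.

```python
import numpy as np

def holdout_split(x, y, val_fraction=0.1):
    """Split off the last val_fraction of the samples for validation."""
    n_val = int(len(x) * val_fraction)
    split = len(x) - n_val
    return (x[:split], y[:split]), (x[split:], y[split:])

# Small stand-in arrays with CIFAR-10's layout (zeros, not real images)
x_demo = np.zeros((50, 32, 32, 3), dtype=np.float32)
y_demo = np.zeros((50, 1), dtype=np.int64)

(x_tr, y_tr), (x_val, y_val) = holdout_split(x_demo, y_demo, val_fraction=0.1)
print(x_tr.shape, x_val.shape)   # (45, 32, 32, 3) (5, 32, 32, 3)
```

With the real data you could pass the held-out pair as `validation_data=(x_val, y_val)` to `model.fit`, or let Keras do a similar split through the `validation_split` argument of `model.fit`.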

**Note**: This is a basic model. We'll improve it in later classes!

## Part 4: Visualizing Predictions

Let's see how the model performs on a few test images.

```python
# Predict on first 9 test images
predictions = model.predict(x_test[:9])
predicted_labels = np.argmax(predictions, axis=1)

# Display images with predicted and true labels
plt.figure(figsize=(8, 8))
for i in range(9):
    plt.subplot(3, 3, i + 1)
    plt.imshow(x_test[i])
    plt.title(f'Pred: {class_names[predicted_labels[i]]}\nTrue: {class_names[y_test[i][0]]}')
    plt.axis('off')
plt.show()
```

**Explanation**:
- `model.predict`: Outputs probabilities for each class.
- `np.argmax`: Selects the class with the highest probability.
- Compare predicted vs. true labels to evaluate performance.
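
To turn the comparison into a number, you can compute the fraction of predictions that match the true labels. The sketch below uses a tiny made-up probability array (`probs`) with 3 classes instead of real model output, so the values are purely hypothetical. The same two lines work on `predictions` and `y_test` from the cell above.

```python
import numpy as np

# Hypothetical softmax outputs for 4 images and 3 classes (made-up numbers)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
true_labels = np.array([[0], [1], [1], [0]])   # shape (N, 1), like y_test

# Index of the largest probability in each row
predicted = np.argmax(probs, axis=1)

# Flatten the labels to shape (N,) before comparing, then average the matches
accuracy = np.mean(predicted == true_labels.ravel())
print(predicted, accuracy)   # [0 1 2 0] 0.75
```

Flattening the labels first matters: comparing a shape (N,) array with a shape (N, 1) array would broadcast to an N x N matrix and give a misleading result.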

## Exercise: Experiment with the CNN

Now it's your turn! Complete the tasks below to explore CNNs further.

1. **Modify the Number of Filters**:
   - Change the number of filters in the `Conv2D` layer to 32 (instead of 16).
   - Rebuild, compile, and check the new model summary.
   - How does the number of parameters change?

2. **Add Another Convolutional Layer**:
   - Add a second `Conv2D` layer with 8 filters (3x3, ReLU) before the pooling layer.
   - Rebuild, compile, and train the model for 5 epochs.
   - Plot the training and validation accuracy.

3. **Challenge (Optional)**:
   - Change the kernel size of the first `Conv2D` layer to 5x5 (instead of 3x3).
   - Train the model and compare validation accuracy to the original model.
   - What might a larger kernel size do?

Write your code in the cells below.

```python
# Task 1: Modify number of filters to 32
# Your code here




# Task 2: Add another convolutional layer
# Your code here




# Task 3 (Optional): Change kernel size to 5x5
# Your code here
```

## Wrap-Up

In this class, you:
- Learned why CNNs are suited for image data.
- Understood convolutional, pooling, and fully connected layers.
- Built and trained a simple CNN on CIFAR-10.
- Visualized predictions and explored model performance.

**Homework**:
- Experiment with changing the number of filters or kernel size in the CNN and observe effects on the model summary.
- Read about image preprocessing (resizing, augmentation) for the next class.
- Submit your completed notebook if required.

**Next Class**: We'll focus on image preprocessing and training CNNs effectively!

Questions? Feel free to ask!