# Unit 2 CNN Fundamentals

Welcome to the next step in your journey of mastering drawing recognition using Convolutional Neural Networks (CNNs). In this lesson, we will delve into the fundamentals of CNNs, exploring what they are and how they work. This will build on your understanding of the drawing recognition problem and prepare you to create and train your own CNN models.

### What You'll Learn

Convolutional Neural Networks are a type of deep learning model specifically designed for processing structured grid data, like images. In this lesson, you will learn about the basic components of a CNN, including convolutional layers, pooling layers, and fully connected layers. We will also guide you through building a simple CNN using the MNIST dataset with Keras and TensorFlow.

Here’s a breakdown of the main concepts you’ll encounter in the code:

  * **Convolutional Layers:** These layers use filters (small matrices) that slide over the input image to detect features such as edges or patterns. The process of applying these filters is called convolution. Each filter helps the network learn different features from the image.
  * **Activation Functions:** After each convolution, an activation function (like `relu`, which stands for Rectified Linear Unit) is applied. This introduces non-linearity, allowing the network to learn more complex patterns.
  * **Pooling Layers:** These layers reduce the spatial size of the feature maps, which cuts computation and keeps the strongest responses. Max pooling, a common choice, takes the maximum value from each small region of the feature map.
  * **Fully Connected Layers:** After the convolutional and pooling layers, the data is flattened and passed through one or more fully connected (dense) layers. These layers combine the features learned by previous layers to make the final classification.
  * **Optimizers:** The optimizer (like `adam` in the code) is an algorithm that adjusts the model’s parameters (weights) to minimize the loss during training.
  * **Loss Function:** The loss function (like `categorical_crossentropy`) measures how far the model’s predictions are from the actual labels, so lower values are better. `categorical_crossentropy` expects one-hot encoded labels. The optimizer tries to minimize this value.
  * **Accuracy:** This metric is the fraction of predictions that are correct. Keras reports it as a value between 0 and 1, so 0.95 means 95% of the images were classified correctly.
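
To make the first three ideas concrete, here is a minimal sketch in plain Python (no TensorFlow required) of a filter sliding over a tiny image, followed by ReLU and 2x2 max pooling. The helper names `convolve2d`, `relu`, and `max_pool2d`, the 6x6 image, and the hand-written edge filter are all illustrative assumptions; `Conv2D` and `MaxPooling2D` do the same kind of arithmetic much more efficiently, and with filter values that are learned rather than chosen by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1), summing the products.

    Like Keras Conv2D, the kernel is applied as-is (it is not flipped).
    """
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace every negative value with 0."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + i][c * size + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# 6x6 image: dark (0) on the left, bright (1) on the right.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-written filter that responds to dark-to-bright vertical edges.
edge_filter = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

feature_map = relu(convolve2d(image, edge_filter))
pooled = max_pool2d(feature_map)

print(feature_map)  # four rows of [0, 3, 3, 0]: strong response at the edge
print(pooled)       # [[3, 3], [3, 3]]

# Mirroring the image gives negative responses, which ReLU sets to 0.
mirrored = [row[::-1] for row in image]
print(relu(convolve2d(mirrored, edge_filter)))  # four rows of [0, 0, 0, 0]
```

The 6x6 input shrinks to 4x4 after the 3x3 filter and to 2x2 after pooling, which is the same shrinking pattern you will see in the output shapes of a real CNN.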

Here’s a quick look at how you can build a simple CNN model:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_simple_cnn():
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation='relu'),  # Convolutional layer with ReLU activation
        layers.MaxPooling2D((2, 2)),                  # Pooling layer
        layers.Conv2D(64, (3, 3), activation='relu'), # Another convolutional layer
        layers.MaxPooling2D((2, 2)),                  # Another pooling layer
        layers.Flatten(),                             # Flatten to 1D for dense layers
        layers.Dense(64, activation='relu'),          # Fully connected (dense) layer
        layers.Dense(10, activation='softmax')        # Output layer for 10 classes
    ])
    return model

model = build_simple_cnn()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
```

This code snippet demonstrates how to define a simple CNN model using Keras. The model consists of convolutional layers for feature extraction, pooling layers for down-sampling, and dense layers for classification. The optimizer, loss function, and accuracy metric are specified when compiling the model.
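
As a rough illustration of what the `categorical_crossentropy` loss measures for a single image, the sketch below computes it by hand in plain Python. The function name mirrors the Keras loss string but is a hypothetical helper written for this example, and the three-class probabilities are made up; the Keras loss additionally averages the result over the examples in a batch.

```python
import math


def categorical_crossentropy(one_hot_label, predicted_probs):
    """Loss for one example: -sum(y_true * log(y_pred))."""
    return -sum(
        y * math.log(p)
        for y, p in zip(one_hot_label, predicted_probs)
        if y > 0  # with one-hot labels, only the true class contributes
    )


label = [0, 0, 1]                # the true class is index 2
confident = [0.05, 0.05, 0.90]   # model puts 90% on the true class
unsure = [0.40, 0.35, 0.25]      # model puts only 25% on the true class

loss_confident = categorical_crossentropy(label, confident)
loss_unsure = categorical_crossentropy(label, unsure)

print(round(loss_confident, 3))  # 0.105
print(round(loss_unsure, 3))     # 1.386
```

The loss is small when the model assigns high probability to the correct class and grows as that probability drops, which is the signal the optimizer uses to adjust the weights.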

`model.summary()` displays a table summarizing the structure of your CNN, including the types of layers, their output shapes, and the number of parameters in each layer. This helps you quickly understand the architecture and complexity of your model.
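
The numbers in that table can be reproduced by hand. The sketch below does so for the model above, assuming the Keras defaults for `Conv2D` (`padding='valid'`, stride 1) and `MaxPooling2D` (stride equal to the pool size). The helper names are hypothetical and exist only for this example.

```python
def conv_out(size, kernel):
    """Output width/height of a convolution with 'valid' padding and stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output width/height of non-overlapping pooling."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Each filter has kernel*kernel*in_channels weights plus one bias."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units


rows = []
side, channels = 28, 1  # 28x28 grayscale input

side = conv_out(side, 3)
rows.append(("Conv2D(32)", (side, side, 32), conv_params(3, channels, 32)))
channels = 32

side = pool_out(side, 2)
rows.append(("MaxPooling2D", (side, side, channels), 0))

side = conv_out(side, 3)
rows.append(("Conv2D(64)", (side, side, 64), conv_params(3, channels, 64)))
channels = 64

side = pool_out(side, 2)
rows.append(("MaxPooling2D", (side, side, channels), 0))

flat = side * side * channels
rows.append(("Flatten", (flat,), 0))
rows.append(("Dense(64)", (64,), dense_params(flat, 64)))
rows.append(("Dense(10)", (10,), dense_params(64, 10)))

total_params = sum(params for _, _, params in rows)

for name, shape, params in rows:
    print(f"{name:<14}{str(shape):<16}{params:>8}")
print("Total parameters:", total_params)  # 121930
```

Most of the parameters sit in the first dense layer (102,464 of 121,930), because it connects every one of the 1,600 flattened values to each of its 64 units.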

### Training the Model

Once you have defined your CNN model, the next step is to train it on a dataset. Training feeds input data through the model and adjusts its parameters to reduce the difference between the predicted and actual outputs. In this lesson, we will use the MNIST dataset to train our simple CNN model.

Here's how you can train the model:

```python
# Train the model
model.fit(train_images, train_labels, epochs=1, batch_size=64, validation_split=0.1)
```

In this snippet, `train_images` and `train_labels` are the training images and their one-hot encoded labels, assumed to be already loaded and preprocessed. The model is trained for one epoch (one full pass over the training data) with a batch size of 64, and the last 10% of the training data is held out for validation. A single epoch keeps the run short; training for more epochs usually improves accuracy until the validation accuracy stops improving.
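
To see what these settings mean in numbers, the sketch below works out how many images are used for training and validation and how many weight updates one epoch contains. `training_plan` is a hypothetical helper, and the 60,000 figure assumes the full MNIST training set rather than a smaller sample.

```python
import math


def training_plan(n_samples, batch_size, validation_split):
    """Return (training samples, validation samples, batches per epoch)."""
    n_train = math.floor(n_samples * (1 - validation_split))
    n_val = n_samples - n_train
    steps_per_epoch = math.ceil(n_train / batch_size)  # last batch may be smaller
    return n_train, n_val, steps_per_epoch


plan = training_plan(n_samples=60000, batch_size=64, validation_split=0.1)
print(plan)  # (54000, 6000, 844)
```

With these values the model sees 54,000 images per epoch in 844 batches, and the remaining 6,000 images are only used to report validation loss and accuracy.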

### Making Predictions with the Model

After training your CNN model, you can use it to make predictions on new data. This involves passing input data through the model to obtain the predicted output. Here's how you can use the trained model to make predictions:

```python
# Make predictions
predictions = model.predict(test_images)

# Example: Get the predicted class for the first test image
predicted_class = predictions[0].argmax()
```

In this snippet, `test_images` represents the new data you want to classify. `model.predict()` returns one row per input image, and each row is a probability distribution over the 10 classes. `.argmax()` returns the index of the largest value in a row, which is the model's predicted class for that image.
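
The sketch below shows the same idea for several images at once, using plain Python lists in place of the NumPy array that `model.predict()` returns. The probability values are made up for illustration, and `argmax` here is a small hypothetical helper; on a real prediction array, `predictions.argmax(axis=1)` gives the predicted class for every row in one call.

```python
def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


# Two made-up prediction rows, each a probability distribution over 10 digits.
predictions = [
    [0.01, 0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01],
    [0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.76, 0.02, 0.01],
]

predicted_classes = [argmax(row) for row in predictions]
confidences = [max(row) for row in predictions]

print(predicted_classes)  # [2, 7]
print(confidences)        # [0.9, 0.76]
```

The largest probability in each row can be read as the model's confidence in its prediction, which is useful for spotting drawings the model is unsure about.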

### Evaluating the Model

Once you've built and trained your CNN model, it's important to evaluate its performance to ensure it meets your expectations. Evaluating the model involves testing it on a separate dataset that it hasn't seen during training. This helps in assessing how well the model generalizes to new, unseen data.

Here's how you can evaluate your CNN model using Keras:

```python
# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc}")
```

In this snippet, `model.evaluate()` computes the loss and accuracy of the model on the test dataset and returns them in that order, because `accuracy` is the only metric passed to `compile()`. Since the model never saw these images during training, `test_acc` estimates how well it will perform on new drawings.
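
The accuracy value itself is simple to compute by hand: compare the predicted class with the true class for each image and take the fraction that match. Here is a minimal sketch with made-up three-class data; `argmax` and `accuracy` are hypothetical helpers written for this example, not Keras functions.

```python
def argmax(values):
    """Index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(one_hot_labels, predictions):
    """Fraction of examples whose predicted class matches the true class."""
    correct = sum(
        argmax(label) == argmax(pred)
        for label, pred in zip(one_hot_labels, predictions)
    )
    return correct / len(one_hot_labels)


labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
predictions = [
    [0.8, 0.1, 0.1],  # predicts class 0, correct
    [0.2, 0.7, 0.1],  # predicts class 1, correct
    [0.3, 0.4, 0.3],  # predicts class 1, but the true class is 2
    [0.1, 0.6, 0.3],  # predicts class 1, correct
]

test_accuracy = accuracy(labels, predictions)
print(test_accuracy)  # 0.75
```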

### Why It Matters

Understanding CNN fundamentals is crucial because CNNs are the backbone of many modern computer vision applications. They are capable of automatically learning and extracting features from images, making them highly effective for tasks like drawing recognition. By mastering CNNs, you will be equipped with the skills to tackle a wide range of image processing challenges, from recognizing handwritten digits to more complex image classification tasks.

Excited to see CNNs in action? Let's move on to the practice section and start building your own CNN models.

## Building Your First CNN Model

Great job analyzing the digit distribution in the MNIST dataset! Now it's time to build a simple Convolutional Neural Network (CNN) to recognize these handwritten digits.

CNNs excel at image recognition tasks because they can automatically learn spatial hierarchies of features. For handwritten digit recognition, the CNN will first learn to identify edges and simple patterns, then combine these features in later layers to recognize entire digits.

In this practice, you'll complete the code to build a CNN model with convolutional layers for feature extraction, pooling layers for downsampling, and dense layers for classification. You'll also prepare the data correctly by reshaping and normalizing the images, which is a critical preprocessing step for CNNs.

This model will serve as the foundation for our digit recognition system and demonstrate the core architecture used in many computer vision applications.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load and preprocess a smaller sample of the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
TRAIN_SIZE = 3000
TEST_SIZE = 1000
train_images = train_images[:TRAIN_SIZE].reshape((________)).astype('float32') / ________
test_images = test_images[:TEST_SIZE].reshape((________)).astype('float32') / ________
train_labels = to_categorical(train_labels[:TRAIN_SIZE])
test_labels = to_categorical(test_labels[:TEST_SIZE])

# Simple CNN model using MNIST dataset
def build_simple_cnn():
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation='________'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='________'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation='________'),
        layers.Dense(10, activation='________')
    ])
    return model

model = build_simple_cnn()
model.compile(optimizer='________', loss='________', metrics=['________'])
model.summary()

# Train the model
model.fit(train_images, train_labels, epochs=1, batch_size=64, validation_split=0.1)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc}")
```

A completed version of the exercise, with comments explaining each step:

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load and preprocess a smaller sample of the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
TRAIN_SIZE = 3000
TEST_SIZE = 1000
# Reshape images to include a channel dimension and normalize pixel values to [0, 1]
train_images = train_images[:TRAIN_SIZE].reshape((TRAIN_SIZE, 28, 28, 1)).astype('float32') / 255
test_images = test_images[:TEST_SIZE].reshape((TEST_SIZE, 28, 28, 1)).astype('float32') / 255
# Convert integer labels to one-hot encoded vectors
train_labels = to_categorical(train_labels[:TRAIN_SIZE])
test_labels = to_categorical(test_labels[:TEST_SIZE])

# Simple CNN model using MNIST dataset
def build_simple_cnn():
    model = models.Sequential([
        # Input layer specifying the shape of the images (28x28 pixels, 1 channel for grayscale)
        layers.Input(shape=(28, 28, 1)),
        # First Convolutional Layer: 32 filters, 3x3 kernel, ReLU activation
        layers.Conv2D(32, (3, 3), activation='relu'),
        # First MaxPooling Layer: Reduces spatial dimensions by taking the maximum value over 2x2 windows
        layers.MaxPooling2D((2, 2)),
        # Second Convolutional Layer: 64 filters, 3x3 kernel, ReLU activation
        layers.Conv2D(64, (3, 3), activation='relu'),
        # Second MaxPooling Layer
        layers.MaxPooling2D((2, 2)),
        # Flatten layer: Converts the 2D feature maps into a 1D vector for the dense layers
        layers.Flatten(),
        # First Dense (Fully Connected) Layer: 64 units, ReLU activation
        layers.Dense(64, activation='relu'),
        # Output Dense Layer: 10 units (for 10 digits), Softmax activation for probability distribution
        layers.Dense(10, activation='softmax')
    ])
    return model

# Build the CNN model
model = build_simple_cnn()
# Compile the model:
# optimizer: 'adam' is an efficient stochastic optimization algorithm
# loss: 'categorical_crossentropy' is used for multi-class classification with one-hot encoded labels
# metrics: 'accuracy' to monitor the proportion of correctly classified images
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Display a summary of the model's architecture
model.summary()

# Train the model:
# train_images: The training data
# train_labels: The corresponding labels for the training data
# epochs: Number of times the model will iterate over the entire training dataset
# batch_size: Number of samples per gradient update
# validation_split: Fraction of the training data to be used as validation data
model.fit(train_images, train_labels, epochs=1, batch_size=64, validation_split=0.1)

# Evaluate the model on the test dataset:
# test_images: The test data
# test_labels: The corresponding labels for the test data
# Returns the loss value and metrics values for the model in test mode
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc}")
```
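
To see what the preprocessing steps in the code above do to the data, here is a small sketch that applies them to a made-up 2x2 "image" using plain Python lists. The helpers `normalize`, `add_channel`, and `one_hot` are hypothetical stand-ins for the division by 255, the `reshape` call, and `to_categorical`.

```python
def normalize(image):
    """Scale pixel values from the 0-255 range to the 0-1 range."""
    return [[pixel / 255 for pixel in row] for row in image]


def add_channel(image):
    """Turn a (height, width) image into (height, width, 1)."""
    return [[[pixel] for pixel in row] for row in image]


def one_hot(label, num_classes=10):
    """Encode an integer label as a vector with a single 1.0."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


tiny_image = [[0, 255],
              [51, 204]]

prepared = add_channel(normalize(tiny_image))
print(prepared)    # [[[0.0], [1.0]], [[0.2], [0.8]]]
print(one_hot(3))  # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```

The extra channel dimension is what lets the `Input(shape=(28, 28, 1))` layer accept grayscale images, and the one-hot labels match the 10-unit softmax output that `categorical_crossentropy` compares against.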

## Training and Evaluating Your CNN Model

Great job building your first CNN model! Now let's take the next step and train it on the MNIST dataset to recognize handwritten digits.

Training a neural network involves feeding it batches of images and their labels, allowing the model to learn patterns through multiple iterations. After training, we need to evaluate how well the model performs on unseen data to measure its ability to generalize.

In this practice, you'll complete the code to train the model and evaluate its accuracy on the test set. You'll specify the training data, configure training parameters, and implement the evaluation code to measure your model's performance.

This step is crucial for understanding how well your CNN can recognize digits and will prepare you for more advanced techniques in future lessons.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load and preprocess a smaller sample of the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
TRAIN_SIZE = 3000
TEST_SIZE = 1000
train_images = train_images[:TRAIN_SIZE].reshape((TRAIN_SIZE, 28, 28, 1)).astype('float32') / 255
test_images = test_images[:TEST_SIZE].reshape((TEST_SIZE, 28, 28, 1)).astype('float32') / 255
train_labels = to_categorical(train_labels[:TRAIN_SIZE])
test_labels = to_categorical(test_labels[:TEST_SIZE])

# Simple CNN model using MNIST dataset
def build_simple_cnn():
    model = models.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    return model

model = build_simple_cnn()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

# TODO: Train the model with 1 epoch, batch size of 64, and use 10% of training data for validation
model.fit(_________, _________, epochs=_________, batch_size=_________, validation_split=_________)

# TODO: Evaluate the model on test data and print the accuracy
________, ________ = model.evaluate(_________, _________)
print(f"Test accuracy: {________}")
```

## Building Enhanced CNN with Sequential API