# Deep Learning with TensorFlow and Keras

This notebook introduces the key ideas behind deep learning using TensorFlow and Keras. We will train a convolutional neural network (CNN) from scratch and then demonstrate how to fine-tune a pre-trained model with transfer learning.

## Learning Objectives

* Understand the building blocks of deep neural networks.
* Learn how to prepare image data for training in TensorFlow.
* Train a convolutional neural network (CNN) from scratch on the CIFAR-10 dataset.
* Apply transfer learning with a pre-trained model and fine-tune it for improved performance.
* Evaluate model performance with accuracy metrics and visualizations.

## Notebook Outline

1. Introduction to Deep Learning Concepts
2. Environment Setup
3. Exploring the CIFAR-10 Dataset
4. Building and Training a Baseline CNN
5. Transfer Learning and Fine-Tuning
6. Summary and Teaching Tips

## 1. Introduction to Deep Learning Concepts

Deep learning is a subset of machine learning that uses neural networks with many layers to learn complex patterns in data. Key ideas include:

* **Neurons and Layers**: Each neuron computes a weighted sum of its inputs followed by a non-linear activation function. Layers of neurons compose to form deep networks.
* **Feature Hierarchies**: Lower layers learn low-level features (e.g., edges in images), while higher layers learn more abstract concepts (e.g., object parts).
* **Training with Backpropagation**: Networks learn parameters by minimizing a loss function using gradient descent and backpropagation.
* **Regularization**: Techniques such as dropout, data augmentation, and weight decay help reduce overfitting.
* **Transfer Learning**: Pre-trained models can be fine-tuned on new tasks, reducing the amount of data and time needed to reach high accuracy.
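
The first and third ideas above can be made concrete with a few lines of NumPy. The block below is a minimal sketch, not how Keras implements its layers: it runs the forward pass of one small dense layer (weighted sum, bias, ReLU) and then fits a single weight with plain gradient descent. All values and helper names (`dense_forward`, `gradient_descent_step`) are made up for illustration.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: element-wise max(0, z)."""
    return np.maximum(0.0, z)

def dense_forward(x, weights, bias):
    """Dense layer: weighted sum of the inputs plus a bias, then ReLU."""
    return relu(x @ weights + bias)

# Toy values, chosen only for illustration
x = np.array([[1.0, 2.0]])            # one sample with two features
weights = np.array([[0.5, -1.0],
                    [0.25, 0.5]])     # two inputs feeding two neurons
bias = np.array([0.0, -0.5])

activations = dense_forward(x, weights, bias)
print("Activations:", activations)

def gradient_descent_step(w, x_i, y_i, lr):
    """One update of a linear neuron w * x under the loss 0.5 * (w * x - y)**2."""
    grad = (w * x_i - y_i) * x_i      # derivative of the loss with respect to w
    return w - lr * grad

w_fit = 0.0
for _ in range(50):
    w_fit = gradient_descent_step(w_fit, x_i=2.0, y_i=4.0, lr=0.1)
print("Fitted weight:", w_fit)        # approaches 2.0, since 2.0 * 2.0 == 4.0
```

Backpropagation generalizes the hand-derived gradient in `gradient_descent_step` to every weight in a multi-layer network by applying the chain rule layer by layer.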

## 2. Environment Setup

We start by importing the Python packages required for the experiments. If you have access to a GPU, enabling it will accelerate training.

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, models

import numpy as np
import matplotlib.pyplot as plt

print(f'TensorFlow version: {tf.__version__}')

## 3. Exploring the CIFAR-10 Dataset

The [CIFAR-10 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) contains 60,000 color images (32x32 pixels) in 10 categories, with 6,000 images per class, split into 50,000 training images and 10,000 test images. Keras provides a built-in loader that returns NumPy arrays for the training and test sets.

In [None]:
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

print(f'Train images shape: {x_train.shape}')
print(f'Test images shape: {x_test.shape}')

Let's visualize a few images to understand what the model will learn.

In [None]:
plt.figure(figsize=(10, 4))
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(x_train[i])
    plt.title(class_names[int(y_train[i, 0])])  # labels have shape (N, 1)
    plt.axis('off')
plt.tight_layout()
plt.show()

### Data Preprocessing

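Neural networks typically converge faster when the input values are normalized, so we scale the pixels to the `[0, 1]` range. The labels are already integer class indices, so we leave them unchanged and pair them with a sparse cross-entropy loss later. For efficient training we use the `tf.data` API to shuffle, batch, and prefetch the data. CIFAR-10 ships without a validation set, so we split the 10,000 test images in half: the first 5,000 serve as validation data and the remaining 5,000 are held out for the final test.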

In [None]:
batch_size = 64
num_classes = 10

# Normalize pixel values to the [0, 1] range
x_train_norm = x_train.astype('float32') / 255.0
x_test_norm = x_test.astype('float32') / 255.0

train_ds = tf.data.Dataset.from_tensor_slices((x_train_norm, y_train))
train_ds = train_ds.shuffle(buffer_size=len(x_train_norm)).batch(batch_size).prefetch(tf.data.AUTOTUNE)

val_split = 5000
val_ds = tf.data.Dataset.from_tensor_slices((x_test_norm[:val_split], y_test[:val_split]))
val_ds = val_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

test_ds = tf.data.Dataset.from_tensor_slices((x_test_norm[val_split:], y_test[val_split:]))
test_ds = test_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
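
To see what the normalization and the validation/test split do to the arrays, here is a small NumPy sketch on synthetic data. The `fake_images` and `fake_labels` arrays are random stand-ins (an assumption for illustration), not CIFAR-10 itself, and the split index is scaled down from 5,000 to 5.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the test set: 10 RGB images of 32x32 uint8 pixels
fake_images = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=(10, 1))

# Same normalization as above: cast to float32, then divide by 255
fake_norm = fake_images.astype('float32') / 255.0

# Same slicing pattern as above: first part for validation, rest for testing
split = 5
val_images, test_images = fake_norm[:split], fake_norm[split:]
val_labels, test_labels = fake_labels[:split], fake_labels[split:]

print(fake_norm.dtype, fake_norm.min(), fake_norm.max())
print(val_images.shape, test_images.shape)
```

The two slices never overlap, so accuracy measured on the test half is not influenced by choices made while watching the validation half.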

## 4. Building and Training a Baseline CNN

Convolutional neural networks are well-suited for image tasks. We will define a simple architecture of three convolutional blocks, each combining convolution, batch normalization, max pooling, and dropout, followed by two dense layers.

In [None]:
def build_baseline_cnn():
    model = models.Sequential([
        layers.Input(shape=(32, 32, 3)),
        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
        layers.BatchNormalization(),
        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
        layers.MaxPooling2D(),
        layers.Dropout(0.25),

        layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
        layers.BatchNormalization(),
        layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
        layers.MaxPooling2D(),
        layers.Dropout(0.3),

        layers.Conv2D(128, (3, 3), activation='relu', padding='same'),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Dropout(0.4),

        layers.Flatten(),
        layers.Dense(256, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax'),
    ])
    return model

baseline_model = build_baseline_cnn()
baseline_model.summary()
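
The parameter counts that `summary()` reports can be reproduced by hand. The sketch below assumes the Keras defaults used above (a bias in every `Conv2D` and `Dense` layer, 2x2 max pooling, and four values per channel in `BatchNormalization`). The dictionary keys are hypothetical labels, not the layer names Keras assigns.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights (k * k * in_channels per filter) plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def batchnorm_params(channels):
    """Gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * channels

def dense_params(in_units, out_units):
    """One weight per input-output pair plus one bias per output."""
    return (in_units + 1) * out_units

# Three 2x2 poolings shrink 32x32 to 4x4, so Flatten yields 4 * 4 * 128 values
flat_units = 4 * 4 * 128

layer_params = {
    "conv_32_a": conv2d_params(3, 3, 32),
    "bn_32": batchnorm_params(32),
    "conv_32_b": conv2d_params(3, 32, 32),
    "conv_64_a": conv2d_params(3, 32, 64),
    "bn_64": batchnorm_params(64),
    "conv_64_b": conv2d_params(3, 64, 64),
    "conv_128": conv2d_params(3, 64, 128),
    "bn_128": batchnorm_params(128),
    "dense_256": dense_params(flat_units, 256),
    "dense_10": dense_params(256, 10),
}

total_params = sum(layer_params.values())
non_trainable_params = 2 * (32 + 64 + 128)  # moving statistics of the three BN layers

for name, count in layer_params.items():
    print(f"{name:>10}: {count:,}")
print(f"Total: {total_params:,} ({non_trainable_params:,} non-trainable)")
```

Most of the parameters sit in the first dense layer, which is one reason the transfer learning model later replaces `Flatten` with global average pooling.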

In [None]:
baseline_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
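
The `sparse_categorical_crossentropy` loss works directly on integer labels: for each sample it takes the negative log of the probability the model assigned to the true class. The NumPy function below is a simplified sketch of that computation, not the Keras implementation. The clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Mean of -log(p[true class]) over the batch; y_true holds integer class indices."""
    y_true = np.asarray(y_true).reshape(-1)          # accept shape (N,) or (N, 1)
    probs = np.clip(probs, eps, 1.0)                 # guard against log(0)
    picked = probs[np.arange(len(y_true)), y_true]   # probability of the true class
    return float(-np.log(picked).mean())

# Toy softmax outputs for two samples and three classes
toy_probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.1, 0.8]])
toy_labels = np.array([[0], [2]])                    # shape (N, 1), like the CIFAR-10 labels

toy_loss = sparse_categorical_crossentropy(toy_labels, toy_probs)
print(f"Loss: {toy_loss:.4f}")
```

With one-hot labels you would use `categorical_crossentropy` instead; both compute the same quantity.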

### Training the Baseline Model

We train for a handful of epochs to keep the runtime manageable. Increase the number of epochs for better accuracy if you have more compute time.

In [None]:
epochs = 10
baseline_history = baseline_model.fit(
    train_ds,
    epochs=epochs,
    validation_data=val_ds
)
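
As a rough guide to how much work `fit` performs, the number of weight updates follows from the dataset size and batch size. This is plain arithmetic, assuming the 50,000 training images and the batch size of 64 used above.

```python
import math

num_train = 50_000
batch_size = 64
epochs = 10

steps_per_epoch = math.ceil(num_train / batch_size)             # batches per pass over the data
last_batch_size = num_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(f"Steps per epoch: {steps_per_epoch}")
print(f"Size of the final, partial batch: {last_batch_size}")
print(f"Total weight updates: {total_updates}")
```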

### Evaluating the Baseline Model

We visualize the training and validation curves and measure the final test accuracy.

In [None]:
def plot_history(history, title):
    acc = history.history['accuracy']
    val_acc = history.history.get('val_accuracy')
    loss = history.history['loss']
    val_loss = history.history.get('val_loss')

    epochs_range = range(1, len(acc) + 1)
    plt.figure(figsize=(12, 4))
    plt.subplot(1, 2, 1)
    plt.plot(epochs_range, acc, label='Training Accuracy')
    if val_acc is not None:
        plt.plot(epochs_range, val_acc, label='Validation Accuracy')
    plt.title(f'{title} Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()

    plt.subplot(1, 2, 2)
    plt.plot(epochs_range, loss, label='Training Loss')
    if val_loss is not None:
        plt.plot(epochs_range, val_loss, label='Validation Loss')
    plt.title(f'{title} Loss')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()

plot_history(baseline_history, title='Baseline CNN')

baseline_test_loss, baseline_test_acc = baseline_model.evaluate(test_ds)
print(f'Baseline CNN test accuracy: {baseline_test_acc:.3f}')
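
The accuracy metric reported by `evaluate` is the fraction of samples whose highest-probability class matches the label. A minimal NumPy sketch on made-up predictions (the `toy_` arrays are illustrative, not model output):

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of rows where the arg-max class equals the integer label."""
    preds = np.argmax(probs, axis=1)
    return float(np.mean(preds == np.asarray(labels).reshape(-1)))

toy_probs = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3],
                      [0.1, 0.2, 0.7],
                      [0.8, 0.1, 0.1]])
toy_labels = np.array([[0], [2], [2], [0]])   # the second sample is misclassified

toy_acc = accuracy(toy_probs, toy_labels)
print(f"Accuracy: {toy_acc:.2f}")
```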

### Inspecting Predictions

Let's take a quick look at how the model performs on individual samples.

In [None]:
def show_predictions(model, dataset, class_names, num_images=8):
    plt.figure(figsize=(12, 6))
    for images, labels in dataset.take(1):
        preds = model.predict(images)
        pred_labels = np.argmax(preds, axis=1)
        for i in range(num_images):
            plt.subplot(2, 4, i + 1)
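            # Map images scaled to [-1, 1] back to [0, 1] so imshow can display them
            image = images[i].numpy()
            if image.min() < 0:
                image = (image + 1.0) / 2.0
            plt.imshow(image)
            true_label = class_names[int(labels[i][0])]
            predicted_label = class_names[int(pred_labels[i])]
            color = 'green' if true_label == predicted_label else 'red'
            plt.title(f'True: {true_label}\nPred: {predicted_label}', color=color)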
            plt.axis('off')
    plt.tight_layout()
    plt.show()

show_predictions(baseline_model, test_ds, class_names)

## 5. Transfer Learning and Fine-Tuning

Training from scratch is powerful, but leveraging pre-trained models can yield better results faster. We will:

1. Load a pre-trained MobileNetV2 model trained on ImageNet.
2. Use it as a fixed feature extractor with frozen weights.
3. Add a custom classification head for CIFAR-10.
4. Fine-tune the top layers of the base model for additional gains.

### Preparing Data for Transfer Learning

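With ImageNet weights, MobileNetV2 accepts several input resolutions; we use 160x160 to keep training reasonably fast. The network expects pixel values in `[-1, 1]`. `preprocess_input` produces that range from raw `[0, 255]` pixels, but our datasets are already scaled to `[0, 1]`, so the preprocessing pipeline resizes the images and rescales them with `2x - 1` instead.

In [None]:
target_size = (160, 160)

# The datasets already hold values in [0, 1], so map them to [-1, 1] directly.
# (preprocess_input assumes raw [0, 255] pixels and would squash [0, 1] inputs to about -1.)
resize_and_rescale = keras.Sequential([
    layers.Resizing(target_size[0], target_size[1]),
    layers.Rescaling(scale=2.0, offset=-1.0),
])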

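# Note: despite the `augmented_` prefix, these pipelines only resize and rescale;
# no random augmentation is applied.
augmented_train_ds = train_ds.map(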
    lambda x, y: (resize_and_rescale(x), y),
    num_parallel_calls=tf.data.AUTOTUNE
)
augmented_val_ds = val_ds.map(
    lambda x, y: (resize_and_rescale(x), y),
    num_parallel_calls=tf.data.AUTOTUNE
)
augmented_test_ds = test_ds.map(
    lambda x, y: (resize_and_rescale(x), y),
    num_parallel_calls=tf.data.AUTOTUNE
)
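
The input range matters more than it may seem. MobileNetV2's `preprocess_input` computes `x / 127.5 - 1`, which maps `[0, 255]` to `[-1, 1]`. The NumPy sketch below reproduces that arithmetic with two hypothetical helpers (`tf_mode_preprocess`, `unit_to_signed`) and shows what happens when it is applied to pixels that were already divided by 255.

```python
import numpy as np

def tf_mode_preprocess(x):
    """Same arithmetic as MobileNetV2's preprocess_input: x / 127.5 - 1."""
    return x / 127.5 - 1.0

def unit_to_signed(x):
    """Map values in [0, 1] to [-1, 1]."""
    return 2.0 * x - 1.0

raw = np.array([0.0, 127.5, 255.0], dtype='float32')   # raw pixel intensities
unit = raw / 255.0                                     # the same pixels scaled to [0, 1]

from_raw = tf_mode_preprocess(raw)      # spans the full [-1, 1] range
from_unit = unit_to_signed(unit)        # also spans the full [-1, 1] range
mismatched = tf_mode_preprocess(unit)   # collapses to a narrow band just above -1

print("preprocess on [0, 255]:", from_raw)
print("2x - 1 on [0, 1]:      ", from_unit)
print("preprocess on [0, 1]:  ", mismatched)
```

In the mismatched case every pixel lands close to -1, so the pre-trained filters would see almost constant images.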

### Building the Transfer Learning Model

We combine the pre-trained MobileNetV2 base with a global pooling layer and a small dense head. Initially, the base model is frozen so only the new layers are trained.

In [None]:
base_model = keras.applications.MobileNetV2(
    input_shape=target_size + (3,),
    include_top=False,
    weights='imagenet'
)
base_model.trainable = False

transfer_inputs = layers.Input(shape=target_size + (3,))
x = base_model(transfer_inputs, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.2)(x)
transfer_outputs = layers.Dense(num_classes, activation='softmax')(x)

transfer_model = keras.Model(inputs=transfer_inputs, outputs=transfer_outputs)
transfer_model.summary()
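
Global average pooling collapses each feature map to a single number by averaging over its spatial positions. The sketch below mimics this in NumPy on random data. It assumes the base model emits a 5x5x1280 feature map for 160x160 inputs, which you can confirm in the summary above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-in for the base model output: (batch, height, width, channels)
features = rng.normal(size=(2, 5, 5, 1280)).astype('float32')

# Average over the two spatial axes, leaving one value per channel
pooled = features.mean(axis=(1, 2))

# The classification head is a single dense layer on top of the pooled vector
num_classes = 10
head_params = pooled.shape[1] * num_classes + num_classes   # weights plus biases

print("Pooled shape:", pooled.shape)
print("Trainable parameters in the head:", head_params)
```

Because only these head parameters are trained while the base is frozen, the first training stage is cheap compared with training the baseline CNN.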

In [None]:
transfer_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

### Training the Feature Extractor

We train only the new classification head while keeping the base model frozen.

In [None]:
transfer_epochs = 5
feature_history = transfer_model.fit(
    augmented_train_ds,
    epochs=transfer_epochs,
    validation_data=augmented_val_ds
)

### Fine-Tuning the Top Layers

Fine-tuning allows the model to adapt high-level representations to the CIFAR-10 dataset. We unfreeze a subset of layers and continue training with a lower learning rate.

In [None]:
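# Unfreeze the base model first; while base_model.trainable is False,
# per-layer flags have no effect on what gets trained.
base_model.trainable = True

# Keep every layer below `fine_tune_at` frozen and fine-tune the layers above it
fine_tune_at = 120
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False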

transfer_model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

fine_tune_epochs = 5
fine_tune_history = transfer_model.fit(
    augmented_train_ds,
    epochs=fine_tune_epochs,
    validation_data=augmented_val_ds
)
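
The `fine_tune_at` index splits the list of base model layers into a frozen part and a trainable part. The sketch below shows the slicing on plain integers. The layer count of 154 is an assumption; check `len(base_model.layers)` in your environment.

```python
num_layers = 154     # assumption: replace with len(base_model.layers)
fine_tune_at = 120

layer_indices = list(range(num_layers))
frozen = layer_indices[:fine_tune_at]        # indices 0 .. fine_tune_at - 1 stay frozen
fine_tuned = layer_indices[fine_tune_at:]    # the remaining top layers are trained

print(f"Frozen layers: {len(frozen)}")
print(f"Fine-tuned layers: {len(fine_tuned)} (starting at index {fine_tuned[0]})")
```

Lowering `fine_tune_at` unfreezes more layers, which gives the model more freedom to adapt but also raises the risk of overfitting.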

### Evaluating Transfer Learning Performance

We compare the learning curves and final accuracy against the baseline model.

In [None]:
plot_history(feature_history, title='Transfer Learning (Frozen Base)')
plot_history(fine_tune_history, title='Transfer Learning (Fine-Tuned)')

transfer_test_loss, transfer_test_acc = transfer_model.evaluate(augmented_test_ds)
print(f'Transfer learning model test accuracy: {transfer_test_acc:.3f}')

### Visualizing Transfer Learning Predictions

Let's inspect predictions from the fine-tuned model.

In [None]:
show_predictions(transfer_model, augmented_test_ds, class_names)

## 6. Summary and Teaching Tips

* **Baseline CNN**: Demonstrates the fundamentals of building and training a model from scratch. Encourage students to experiment with additional layers, different learning rates, or regularization techniques.
* **Transfer Learning**: Shows how leveraging pre-trained models can quickly boost performance. Discuss how feature extraction and fine-tuning complement each other.
* **Evaluation and Visualization**: Plots and sample predictions make the training process tangible for learners.

### Suggested Classroom Activities

1. **Hyperparameter Search**: Have students adjust the number of epochs, batch size, or optimizer and observe the effects on accuracy.
2. **Data Augmentation Challenge**: Introduce `layers.RandomFlip`, `layers.RandomRotation`, or `layers.RandomZoom` in the preprocessing pipeline and compare results.
3. **Model Comparison**: Swap MobileNetV2 for another architecture (e.g., EfficientNet, ResNet) and evaluate the trade-offs.
4. **Real-World Discussion**: Connect model deployment considerations (latency, memory, fairness) to the models built in class.
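
For the data augmentation activity, it can help to show students what a random flip does before introducing the Keras layers. The function below is a NumPy sketch of a random horizontal flip on a batch of images. It is an illustration only, not the implementation behind `layers.RandomFlip`, and the batch is random data standing in for CIFAR-10 images.

```python
import numpy as np

def random_horizontal_flip(images, rng, p=0.5):
    """Flip each image in an (N, H, W, C) batch left-to-right with probability p."""
    flip_mask = rng.random(len(images)) < p
    flipped = images.copy()
    flipped[flip_mask] = images[flip_mask][:, :, ::-1, :]   # reverse the width axis
    return flipped, flip_mask

rng = np.random.default_rng(42)
batch = rng.random((8, 32, 32, 3)).astype('float32')   # synthetic stand-in for image data
flipped_batch, mask = random_horizontal_flip(batch, rng)

print(f"Flipped {int(mask.sum())} of {len(batch)} images")
```

The labels are unchanged by the flip, which is what makes it a valid augmentation for CIFAR-10 classes.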