# Part 1 - Regularization Techniques

## Objective

This lab applies four major regularization techniques to the deep models built in Lab 3 to combat overfitting:

- **ℓ₁ and ℓ₂ Regularization:** Adding a penalty to the loss function based on the size of the weights.
- **Dropout:** Randomly "killing" neurons during training to ensure the network doesn't over-rely on specific paths.
- **Max-Norm Regularization:** Constraining the weights of each neuron so they don't grow too large.

## Conceptual Background

| Technique | Description | Effect |
|-----------|-------------|--------|
| **ℓ₁ Regularization** | Adds penalty proportional to absolute weight values | Produces sparse models (many weights are driven to exactly zero), effectively acting as feature selection |
| **ℓ₂ Regularization** | Adds penalty proportional to squared weight values | Keeps weights small but rarely zero, helps handle multicollinearity |
| **Dropout** | Randomly deactivates neurons during training | Forces the network to learn redundant representations, making it more robust |
| **Max-Norm** | Rescales weight vector w such that \|\|w\|\|₂ ≤ c | Prevents weights from growing too large, where c is the max-norm hyperparameter |
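
To make the two penalty terms in the table concrete, here is a minimal NumPy sketch that computes them by hand. The weight matrix and the `factor` value are hypothetical, chosen only for illustration; the penalty would be added to the ordinary training loss.

```python
import numpy as np

# Hypothetical weight matrix for a tiny layer (2 inputs, 3 units).
weights = np.array([[0.5, -1.0, 0.0],
                    [2.0, 0.0, -0.5]])

factor = 0.01  # regularization strength (illustrative value)

# l1 penalty: factor * sum of absolute weight values
l1_penalty = factor * np.sum(np.abs(weights))

# l2 penalty: factor * sum of squared weight values
l2_penalty = factor * np.sum(np.square(weights))

# Gradients of each penalty with respect to the weights.
# l1 pushes every non-zero weight toward zero with constant strength,
# while the l2 push shrinks as the weight gets smaller.
l1_grad = factor * np.sign(weights)
l2_grad = 2.0 * factor * weights

print(f"l1 penalty: {l1_penalty:.4f}")
print(f"l2 penalty: {l2_penalty:.4f}")
```

The constant-size ℓ₁ gradient is one intuition for why ℓ₁ tends to zero out weights, whereas ℓ₂ only makes them small.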

## Step 1: $\ell_1$ and $\ell_2$ Regularization

We will apply these to a Dense network similar to your MNIST/CIFAR-10 tasks.

In [1]:
import tensorflow as tf
from tensorflow import keras

# 1. Define l2 regularization factor
# This adds a penalty to the loss: Loss + 0.01 * sum(weights^2)
regularizer = keras.regularizers.l2(0.01)

model_l2 = keras.Sequential([
    keras.layers.Flatten(input_shape=[32, 32, 3]), # Using CIFAR-10 shape as example
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal",
                       kernel_regularizer=regularizer),
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal",
                       kernel_regularizer=regularizer),
    keras.layers.Dense(10, activation="softmax")
])

# If you want to use l1, simply use: keras.regularizers.l1(0.01)
# For both (Elastic Net style): keras.regularizers.l1_l2(0.01, 0.01)

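# Note: "categorical_crossentropy" expects one-hot labels;
# use "sparse_categorical_crossentropy" if your labels are integer class ids.
model_l2.compile(loss="categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])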



## Step 2: Dropout

Dropout is typically placed after each hidden layer.

In [2]:
model_dropout = keras.Sequential([
    keras.layers.Flatten(input_shape=[32, 32, 3]),
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal"),
    # Each neuron's output is dropped with probability 0.2 (about 20% per training step)
    keras.layers.Dropout(rate=0.2),
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal"),
    keras.layers.Dropout(rate=0.2),
    keras.layers.Dense(10, activation="softmax")
])

model_dropout.compile(loss="categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])
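
To illustrate what a dropout layer does to its inputs, here is a minimal NumPy sketch of "inverted" dropout. It is a stand-alone illustration, not the Keras implementation: the function name `dropout_forward` and the toy activations are assumptions made for this example.

```python
import numpy as np

def dropout_forward(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction of units and rescale the rest."""
    if not training or rate == 0.0:
        return activations  # at inference time the layer does nothing
    keep_prob = 1.0 - rate
    # Each unit is kept independently with probability keep_prob
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged
    return activations * mask / keep_prob

rng = np.random.default_rng(seed=0)
hidden = np.ones((4, 100))  # hypothetical batch: 4 examples, 100 hidden units

dropped = dropout_forward(hidden, rate=0.2, rng=rng)
at_inference = dropout_forward(hidden, rate=0.2, rng=rng, training=False)

print("fraction zeroed:", np.mean(dropped == 0.0))   # roughly 0.2
print("surviving value:", dropped.max())             # about 1 / 0.8
print("unchanged at inference:", np.array_equal(at_inference, hidden))
```

Because the surviving activations are scaled up during training, no extra rescaling is needed when the layer is switched off for evaluation.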

## Step 3: Max-Norm Regularization

Max-norm doesn't add a penalty to the loss function; instead, it constrains the weights directly after each update.

In [3]:
# Define the constraint: each neuron's incoming weight vector is limited to an l2 norm of 1.0
max_norm_reg = keras.constraints.max_norm(1.0)

model_max_norm = keras.Sequential([
    keras.layers.Flatten(input_shape=[32, 32, 3]),
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal",
                       kernel_constraint=max_norm_reg),
    keras.layers.Dense(100, activation="elu", kernel_initializer="he_normal",
                       kernel_constraint=max_norm_reg),
    keras.layers.Dense(10, activation="softmax")
])

model_max_norm.compile(loss="categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])
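
The rescaling step behind a max-norm constraint can be sketched in NumPy as follows. This is a hand-written approximation of the idea for illustration; the helper `apply_max_norm`, the `eps` term, and the toy kernel are assumptions, not code taken from Keras.

```python
import numpy as np

def apply_max_norm(kernel, max_value=1.0, axis=0, eps=1e-7):
    """Rescale each unit's incoming weight vector so its l2 norm is at most max_value."""
    norms = np.sqrt(np.sum(np.square(kernel), axis=axis, keepdims=True))
    # Norms below max_value are kept as they are; larger ones are clipped to max_value
    desired = np.clip(norms, 0.0, max_value)
    # eps avoids division by zero for all-zero weight vectors
    return kernel * desired / (eps + norms)

# Hypothetical kernel of shape (inputs, units):
# column 0 has norm 5.0 (too large), column 1 has norm 0.5 (already fine)
kernel = np.array([[3.0, 0.3],
                   [4.0, 0.4]])

constrained = apply_max_norm(kernel, max_value=1.0)
new_norms = np.linalg.norm(constrained, axis=0)
print(new_norms)  # first norm is pulled down to about 1.0, second stays near 0.5
```

With `axis=0`, the norm is taken over each column of a Dense kernel, that is, over the incoming weights of one neuron, which matches the per-neuron constraint described in this step.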

# Part 2 - Data Augmentation

## Objective

This lab uses the TensorFlow Flowers (`tf_flowers`) dataset to practice data augmentation. We will explore three ways to implement it:

- **Integrated Layers:** Adding augmentation directly into the Sequential model.
- **Dataset Mapping:** Applying transformations to the `tf.data` pipeline for better performance.
- **Custom Augmentation:** Using `tf.image` for fine-grained control over specific transformations.

## Why use Data Augmentation?

Data Augmentation is a powerful technique used to artificially expand the size of your training set by creating "new" images from existing ones using transformations like rotation, flipping, and zooming. This prevents the model from memorizing specific orientations and helps it generalize better to real-world photos.

| Benefit | Description |
|---------|-------------|
| **Reduces Overfitting** | By showing the model a slightly different version of the image every time, it can't rely on the exact position of pixels |
| **Invariance** | The model learns that a flower is still the same flower even if it is upside down, flipped, or slightly rotated |
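
As a library-free illustration of the idea, the sketch below uses plain NumPy (not the Keras or `tf.image` tools covered in this lab) to produce randomly flipped copies of one toy image. The helper `random_flip` and the tiny 2x3 "image" are hypothetical.

```python
import numpy as np

def random_flip(image, rng):
    """Randomly flip a (height, width, channels) image left-right and/or up-down."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]  # horizontal flip
    if rng.random() < 0.5:
        image = image[::-1, :, :]  # vertical flip
    return image

rng = np.random.default_rng(seed=42)

# Hypothetical 2x3 RGB "image" and its class label
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)
label = 1

# Every augmented copy keeps the original label: only the pixel layout changes
augmented = [(random_flip(image, rng), label) for _ in range(4)]
for aug_image, aug_label in augmented:
    print(aug_image.shape, aug_label)
```

Each copy contains exactly the same pixel values in a different arrangement, which is what pushes the model toward the invariance described in the table.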

### Step 1: Load and Pre-process the Flowers Dataset

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_datasets as tfds

# 1. Load the tf_flowers dataset
# as_supervised=True gives us (image, label) pairs instead of a dictionary
(train_ds, val_ds), ds_info = tfds.load(
    'tf_flowers',
    # 'train[:80%]' takes everything from the beginning up to the 80% mark (training set);
    # 'train[80%:]' takes everything from the 80% mark to the end (validation set).
    split=['train[:80%]', 'train[80%:]'],
    as_supervised=True,
    with_info=True
)

# 2. Define image size and batch size
IMG_SIZE = 180
BATCH_SIZE = 32

# 3. Resize-and-rescale preprocessing pipeline
# Images come in different sizes; we need them uniform for the neural network
resize_and_rescale = keras.Sequential([
    layers.Resizing(IMG_SIZE, IMG_SIZE),  # Resizes every image to IMG_SIZE x IMG_SIZE
    layers.Rescaling(1./255)  # Scales pixels from [0, 255] to [0, 1]
])