# Unit 3 Implementing Data Augmentation

Welcome back! You've learned how to preprocess your dataset by cleaning, normalizing, and splitting it into training and testing sets. Now it's time to take your data preparation skills to the next level with **data augmentation**. This lesson guides you through enhancing your dataset so that your model becomes more robust and recognizes drawings more accurately.

-----

## What You'll Learn

In this lesson, you'll discover how to implement data augmentation using the Keras `ImageDataGenerator`. **Data augmentation** is a technique that artificially expands the size of your training dataset by creating modified versions of images. This is crucial for improving the performance of your model, especially when working with limited data.

> **Note:** Data augmentation is typically applied **only to the training set**, not the validation or test sets. This ensures that your model is evaluated on unaltered data, providing a true measure of its performance.

Here's a sneak peek at the code you'll be working with:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Data augmentation
datagen = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest')

# Fit the generator to the training data
datagen.fit(x_train)

# Example of using the generator
for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32):
    print("Batch shape:", x_batch.shape)
    break  # Just to show one batch
```

```
# Output:
Batch shape: (32, 28, 28, 1)
```

This output shows that the generator produces a batch of 32 augmented images, each with the same shape as your original training images (for example, 28x28 pixels with 1 color channel for grayscale). This code snippet demonstrates how to set up and use the `ImageDataGenerator` to augment your training data. You'll learn how to apply various transformations, such as rotation, shifting, and flipping, to create a more diverse dataset.

Here’s what each parameter in `ImageDataGenerator` does and how it helps your model generalize:

  * `rotation_range=10`: Randomly rotates images by up to 10 degrees. This helps the model recognize drawings even if they are slightly rotated.
  * `width_shift_range=0.1`: Shifts images horizontally by up to 10% of the width. This teaches the model to handle drawings that are not perfectly centered.
  * `height_shift_range=0.1`: Shifts images vertically by up to 10% of the height. This helps the model learn from drawings that are higher or lower in the frame.
  * `shear_range=0.1`: Applies a random shear (slanting the image). Keras documents this value as a shear angle in degrees, so `0.1` is a very slight slant; larger values expose the model to more visibly skewed drawings.
  * `zoom_range=0.1`: Randomly zooms in or out by up to 10%. This helps the model recognize objects at different scales.
  * `horizontal_flip=True`: Randomly flips images horizontally. This is useful if the orientation of the drawing doesn’t matter, making the model robust to left-right variations.
  * `fill_mode='nearest'`: Determines how to fill in new pixels that are created after a transformation. Using 'nearest' copies the nearest pixel value, which helps preserve the drawing’s structure after augmentation.
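
To make these ranges concrete, the sketch below draws one random set of parameters for a 28x28 image. It uses plain NumPy rather than Keras, the function name `sample_params` and its dictionary keys are hypothetical, and the sampling rules are an approximation of the documented behavior, not the library's actual code.

```python
import numpy as np

def sample_params(rng, height=28, width=28):
    """Draw one random set of augmentation parameters (illustrative only)."""
    return {
        "rotation_deg": rng.uniform(-10, 10),            # rotation_range=10
        "shift_x_px": rng.uniform(-0.1, 0.1) * width,    # width_shift_range=0.1
        "shift_y_px": rng.uniform(-0.1, 0.1) * height,   # height_shift_range=0.1
        "zoom": rng.uniform(1 - 0.1, 1 + 0.1),           # zoom_range=0.1
        "flip": bool(rng.random() < 0.5),                # horizontal_flip=True
    }

rng = np.random.default_rng(0)
params = sample_params(rng)
for name, value in params.items():
    print(name, value)
```

For a 28-pixel-wide image, a shift range of `0.1` therefore corresponds to at most about 2.8 pixels in either direction.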

After that, we perform `fit` and `flow` operations:

  * `datagen.fit(x_train)`: Calculates dataset statistics that certain options require (such as feature-wise normalization or ZCA whitening). None of the transformations used in this lesson need those statistics, so the call is optional here; it is kept so the workflow stays the same if you enable such options later.
  * `datagen.flow(x_train, y_train, batch_size=32)`: Creates an iterator that generates batches of augmented image and label pairs on the fly, applying a fresh random transformation to each image every time it is drawn.
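
The idea behind `flow` can be sketched with an ordinary Python generator. This is a simplified stand-in, not Keras's implementation: `simple_flow` and `random_flip` are hypothetical names, and the only transformation applied is a random left-right flip.

```python
import numpy as np

def simple_flow(x, y, batch_size, transform, rng):
    """Yield (x_batch, y_batch) forever, reshuffling once per pass over the data."""
    n = len(x)
    while True:  # never stops on its own, hence the `break` in the lesson code
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            # Each image gets its own independent random transform.
            x_batch = np.stack([transform(x[i], rng) for i in idx])
            yield x_batch, y[idx]

def random_flip(image, rng):
    """Stand-in transform: flip left-right with probability 0.5."""
    return image[:, ::-1, :] if rng.random() < 0.5 else image

rng = np.random.default_rng(0)
x_demo = np.zeros((100, 28, 28, 1), dtype="float32")  # placeholder images
y_demo = np.arange(100) % 5                             # placeholder labels

batches = simple_flow(x_demo, y_demo, batch_size=32, transform=random_flip, rng=rng)
x_batch, y_batch = next(batches)
print("Batch shape:", x_batch.shape)
```

Because the generator loops indefinitely, the caller decides when to stop, which is why the lesson code ends its loop with `break`.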

-----

## Why It Matters

Data augmentation is a powerful tool in machine learning that helps improve model generalization. By introducing variations in your training data, you can make your model more resilient to changes and distortions in real-world data. This is especially important in drawing recognition, where the same object can be drawn in many different ways. By the end of this lesson, you'll be equipped with the skills to enhance your dataset and boost your model's performance.

Excited to get started? Let's dive into the practice section and see data augmentation in action!

## Basic Image Augmentation for Drawings

Great job splitting your data into training and testing sets! Now, let's implement basic data augmentation to enhance your training dataset.

Data augmentation creates modified versions of your existing images, which helps your model learn to recognize drawings from different angles and variations. This is particularly useful when working with hand-drawn images, as people draw the same objects in many different ways.

In this task, you'll set up a simple `ImageDataGenerator` with a rotation transformation, generate a batch of augmented images, and verify that the augmentation is working by printing the batch shape.

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# TODO: Create a basic data augmentation generator with rotation_range=15

# TODO: Fit the generator to the training data

# TODO: Generate a batch of augmented images and print the batch shape

```

Here is the completed code for basic data augmentation.

### Completed Code

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create a basic data augmentation generator with rotation_range=15
datagen = ImageDataGenerator(rotation_range=15)

# Fit the generator to the training data
datagen.fit(x_train)

# Generate a batch of augmented images and print the batch shape
for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32):
    print("Augmented batch shape:", x_batch.shape)
    break

```

-----

### Explanation 🧑‍🏫

1.  **Create Generator**: `datagen = ImageDataGenerator(rotation_range=15)` initializes the augmentation tool. We've configured it to randomly rotate images by up to **15 degrees**.
2.  **Fit Generator**: `datagen.fit(x_train)` "fits" the generator to your training data. Rotation needs no dataset statistics, so the call is optional here. It becomes essential only when you enable options that are computed from the dataset, such as feature-wise normalization or ZCA whitening.
3.  **Generate Augmented Batch**: The `for` loop uses `datagen.flow(x_train, y_train, batch_size=32)`. This method creates an iterator that generates batches of transformed images and their corresponding labels on-the-fly.
      * `x_train, y_train`: The source data and labels.
      * `batch_size=32`: Specifies that 32 augmented images should be generated in each batch.
4.  **Print and Break**: `print("Augmented batch shape:", x_batch.shape)` confirms that the generator is producing a batch of 32 images, each with the correct shape (`28, 28, 1`). The `break` statement stops the loop after the first batch is generated, as we only need to verify its creation for this task.
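
As a rough worked example of why the `break` matters: the iterator keeps producing batches indefinitely, and a single pass over the training set already amounts to many batches. The arithmetic below assumes every category file contains at least 15,000 drawings, so that the `[:15000]` slice is always full; the variable names are illustrative.

```python
import math

num_categories = 5
images_per_category = 15000   # assumes every .npy file holds at least this many
test_fraction = 0.2
batch_size = 32

total_images = num_categories * images_per_category             # 75000
train_images = total_images - round(total_images * test_fraction)
batches_per_epoch = math.ceil(train_images / batch_size)

print(train_images, batches_per_epoch)  # 60000 1875
```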

## Enhanced Data Augmentation for Drawings

Great work implementing basic rotation augmentation! Now let's enhance our data augmentation pipeline with additional transformations to create even more variety in our training data.

By adding width and height shifts, zoom effects, and shear transformations, we can simulate drawings positioned differently, drawn at various sizes, and viewed from different angles. This creates a more diverse dataset that better represents real-world variations.

In this task, you'll expand the existing `ImageDataGenerator` with these additional transformations and visualize the results to confirm they're working correctly. This visual comparison between original and augmented images will help you understand how each transformation affects your drawing data.

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create an enhanced data augmentation generator with multiple transformations
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=________,
    height_shift_range=________,
    zoom_range=________,
    shear_range=________,
    fill_mode='nearest'
)

# Fit the generator to the training data
datagen.fit(x_train)

# TODO: Generate a batch of augmented images using only the first 5 samples
# TODO: Display original and augmented images to verify transformations
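# shuffle=False keeps x_batch[i] aligned with the i-th original; flow() shuffles by default
for x_batch, y_batch in datagen.flow(_________, _________, batch_size=5, shuffle=False):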
    print("Augmented batch shape:", x_batch.shape)
    
    # Display original and augmented images to verify transformations
    plt.figure(figsize=(10, 4))
    for i in range(5):
        # Original image
        plt.subplot(2, 5, i+1)
        plt.imshow(_________.reshape(28, 28), cmap='gray')
        plt.title(f"Original {i+1}")
        plt.axis('off')
        
        # Augmented image
        plt.subplot(2, 5, i+6)
        plt.imshow(_________.reshape(28, 28), cmap='gray')
        plt.title(f"Augmented {i+1}")
        plt.axis('off')
    
    plt.tight_layout()
    os.makedirs('static/images', exist_ok=True)  # savefig does not create missing folders
    plt.savefig('static/images/plot.png')
    break  # Just process one batch

```

Let's enhance the `ImageDataGenerator` with width and height shifts, zoom effects, and shear transformations.

Here's the completed code:

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create an enhanced data augmentation generator with multiple transformations
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,  # Shift images horizontally by up to 10% of total width
    height_shift_range=0.1, # Shift images vertically by up to 10% of total height
    zoom_range=0.2,         # Zoom in/out by up to 20%
    shear_range=0.1,        # Apply shear transformation
    fill_mode='nearest'
)

# Fit the generator to the training data
datagen.fit(x_train)

# Generate a batch of augmented images using only the first 5 samples
# Display original and augmented images to verify transformations
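# shuffle=False keeps x_batch[i] aligned with x_train[i]; flow() shuffles by default
for x_batch, y_batch in datagen.flow(x_train[:5], y_train[:5], batch_size=5, shuffle=False):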
    print("Augmented batch shape:", x_batch.shape)
    
    # Display original and augmented images to verify transformations
    plt.figure(figsize=(10, 4))
    for i in range(5):
        # Original image
        plt.subplot(2, 5, i+1)
        plt.imshow(x_train[i].reshape(28, 28), cmap='gray')
        plt.title(f"Original {i+1}")
        plt.axis('off')
        
        # Augmented image
        plt.subplot(2, 5, i+6)
        plt.imshow(x_batch[i].reshape(28, 28), cmap='gray')
        plt.title(f"Augmented {i+1}")
        plt.axis('off')
    
    plt.tight_layout()
    os.makedirs('static/images', exist_ok=True)  # savefig does not create missing folders
    plt.savefig('static/images/plot.png')
    break  # Just process one batch
```

-----

### Explanation of the Enhancements

The completed code adds the following parameters to the `ImageDataGenerator`:

  * `width_shift_range=0.1`: This allows for horizontal shifts of the image. A value of `0.1` means the image can be shifted horizontally by up to 10% of its total width.
  * `height_shift_range=0.1`: Similar to `width_shift_range`, this enables vertical shifts of the image by up to 10% of its total height.
  * `zoom_range=0.2`: This parameter randomly zooms in or out on the image. A value of `0.2` means the zoom factor is drawn from the range 0.8 to 1.2.
  * `shear_range=0.1`: A shear slants the image by shifting one edge sideways relative to the opposite edge, as if the drawing were viewed from an angle. Keras documents this value as a shear angle in degrees, so `0.1` produces only a very slight slant; larger values make the effect clearly visible.
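
To get a feel for how small a shear of `0.1` is, the sketch below estimates how far the last row of a 28-pixel-tall image moves sideways relative to the first row. It assumes `shear_range` is read as an angle in degrees, as the Keras documentation describes it, and `shear_offset_px` is a hypothetical helper based on a rough geometric estimate rather than the library's transform code.

```python
import numpy as np

def shear_offset_px(shear_deg, rows=28):
    """Approximate sideways displacement, in pixels, of the last row vs. the first."""
    return (rows - 1) * np.sin(np.deg2rad(shear_deg))

for shear_deg in (0.1, 5.0, 20.0):
    print(f"shear {shear_deg:>4} deg -> {shear_offset_px(shear_deg):.2f} px")
```

Under these assumptions, a shear of `0.1` moves the far edge by well under a tenth of a pixel, so values of several degrees are needed before the slant becomes visible in a 28x28 drawing.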

The `datagen.flow` call and the plotting lines use the first 5 samples from `x_train` and `y_train` for visualization:

  * `datagen.flow(x_train[:5], y_train[:5], batch_size=5)`: This feeds the first 5 training images and their corresponding labels into the generator, which returns augmented versions. Because `flow` shuffles its input by default, pass `shuffle=False` whenever each augmented image must line up with the original plotted above it.
  * `plt.imshow(x_train[i].reshape(28, 28), cmap='gray')`: This displays the original image.
  * `plt.imshow(x_batch[i].reshape(28, 28), cmap='gray')`: This displays the augmented image generated by the `datagen.flow`.

By running this code, you'll see a `plot.png` file in the `static/images/` directory, showing both the original and augmented versions of the first five drawings, demonstrating the effects of these new transformations. This enhanced augmentation pipeline will help your model generalize better to a wider variety of real-world drawing variations.

## Visualizing Multiple Augmentations of One Drawing

Now that you've enhanced your data augmentation pipeline with multiple transformations, let's explore the randomness and variety these transformations create. In this task, you'll focus on a single drawing and generate multiple augmented versions to visualize how data augmentation works in practice.

By examining multiple variations of the same drawing, you'll gain insight into how these transformations help your model learn to recognize drawings regardless of their position, orientation, or scale. This is crucial for building robust drawing recognition systems.

You'll select a single image, run it through the augmentation pipeline multiple times, and compare the results to confirm that each iteration produces unique transformations. This visual inspection will help you understand the power of data augmentation for expanding your training dataset.
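
Visual inspection can be complemented by a quick numerical check. The helper below is a minimal sketch: `count_unique_images` is a hypothetical function, and the arrays are placeholders standing in for augmented images collected from the generator.

```python
import numpy as np

def count_unique_images(images):
    """Count how many same-shaped arrays in `images` differ pixel for pixel."""
    return len({img.tobytes() for img in images})

# Placeholder data standing in for augmented outputs collected from the generator.
base = np.zeros((28, 28, 1), dtype="float32")
diagonal = base + np.eye(28, dtype="float32")[..., None]  # a diagonal stroke
shifted = np.roll(diagonal, 1, axis=1)                    # same stroke, moved one pixel

print(count_unique_images([diagonal, diagonal.copy(), shifted]))  # 2
```

Applied to the augmented images from one run, a result equal to the number of images collected would confirm that no two augmentations are identical.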

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create a data augmentation generator
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=________,
    fill_mode='nearest'
)

# TODO: Select a single image from the training set (use index 42 or any other)
image_index = ________
single_image = ________  # Keep dimensions for the generator
category_name = categories[int(________)]

# Create a figure to display the original and augmented images
plt.figure(figsize=(12, 8))

# TODO: Display the original image
plt.subplot(3, 4, 1)
plt.imshow(________.reshape(28, 28), cmap='gray')
plt.title(f"Original {________}")
plt.axis('off')

# TODO: Generate and display multiple augmented versions
i = 2
for batch in datagen.flow(________, batch_size=________):
    plt.subplot(3, 4, i)
    plt.imshow(________.reshape(28, 28), cmap='gray')
    plt.title(f"Augmented #{i-1}")
    plt.axis('off')
    i += 1
    if i > ________:  # Display 11 augmented versions
        break

plt.tight_layout()
os.makedirs('static/images', exist_ok=True)  # savefig does not create missing folders
plt.savefig('static/images/plot.png')
print(f"Generated 11 different augmented versions of a {category_name} drawing")

```

Let's visualize multiple augmented versions of a single drawing to understand the power of data augmentation!

Here's the completed code:

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create a data augmentation generator
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True, # Added horizontal flip for more variety
    fill_mode='nearest'
)

# Select a single image from the training set (use index 42 or any other)
image_index = 42
single_image = x_train[image_index:image_index+1]  # Keep dimensions for the generator
category_name = categories[int(y_train[image_index])]

# Create a figure to display the original and augmented images
plt.figure(figsize=(12, 8))

# Display the original image
plt.subplot(3, 4, 1)
plt.imshow(single_image.reshape(28, 28), cmap='gray')
plt.title(f"Original {category_name}")
plt.axis('off')

# Generate and display multiple augmented versions
i = 2
for batch in datagen.flow(single_image, batch_size=1):
    plt.subplot(3, 4, i)
    plt.imshow(batch[0].reshape(28, 28), cmap='gray')
    plt.title(f"Augmented #{i-1}")
    plt.axis('off')
    i += 1
    if i > 12:  # Display 11 augmented versions (1 original + 11 augmented = 12 total plots)
        break

plt.tight_layout()
os.makedirs('static/images', exist_ok=True)  # savefig does not create missing folders
plt.savefig('static/images/plot.png')
print(f"Generated 11 different augmented versions of a {category_name} drawing")

```

-----

### Understanding the Code and Visualizations

In this solution, we:

1.  **Selected a Single Image**: We picked an image from the training set using `image_index = 42` (you can change this index to see other drawings). Crucially, we sliced it as `x_train[image_index:image_index+1]` to maintain the 4-dimensional shape `(1, 28, 28, 1)` that the `ImageDataGenerator` expects.
2.  **Added Horizontal Flip**: We set `horizontal_flip=True` in the `ImageDataGenerator`. This introduces more variety by randomly flipping images horizontally, which helps when left-right orientation doesn't change the meaning (e.g., a "house" is still a house when flipped).
3.  **Displayed the Original**: The first subplot shows the **original** selected image for comparison. We also dynamically retrieve its category name using `categories[int(y_train[image_index])]`.
4.  **Generated and Displayed Augmented Versions**: We then loop through `datagen.flow(single_image, batch_size=1)`. For each iteration, `datagen.flow` applies a **new, random set** of transformations to the `single_image` and yields a batch (in this case, a single augmented image). We then plot each of these augmented images.
5.  **Controlled the Number of Augmentations**: The loop breaks once 11 augmented images have been generated and displayed (`if i > 12`). Since we start `i` at 2 (the first slot is for the original), `i > 12` means we've plotted the original and 11 augmented versions, for a total of 12 plots.
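
The difference between indexing and slicing mentioned in step 1 is easy to verify with a placeholder array. Here `x_demo` is a hypothetical stand-in for `x_train`.

```python
import numpy as np

x_demo = np.zeros((100, 28, 28, 1), dtype="float32")  # placeholder for x_train

print(x_demo[42].shape)        # (28, 28, 1)    -> batch axis dropped
print(x_demo[42:43].shape)     # (1, 28, 28, 1) -> batch axis kept
print(x_demo[42][None].shape)  # (1, 28, 28, 1) -> batch axis added back
```

Slicing with `image_index:image_index+1` is simply the shortest way to keep the leading batch axis.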

When you run this code, it will generate a `plot.png` file in the `static/images/` directory. This plot will clearly show the original drawing alongside 11 unique augmented versions. You'll observe various rotations, shifts, zooms, shears, and potentially a horizontal flip, all applied randomly to the same base image.

This visualization highlights how even a single drawing can be expanded into many diverse examples, effectively increasing your training data and making your model more robust to real-world variations in how drawings are made.

## Comparing Fill Modes for Drawing Augmentation

Now that you've explored multiple augmentations of a single drawing, let's investigate how different `fill_mode` settings affect your augmented images. When applying transformations like rotation or shifting, empty spaces can appear at the edges of your drawings. The `fill_mode` parameter controls how these spaces are filled.

In this task, you'll compare two different fill modes side by side: the standard 'nearest' mode and the 'constant' mode with a specified value. This comparison will help you understand how different fill strategies impact the quality of your augmented drawings.

By examining these differences, you'll be able to choose the most appropriate `fill_mode` for your specific drawing recognition task, ensuring your model learns from high-quality augmented data.
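
The two strategies can be illustrated on a single row of pixels. The sketch below uses `np.pad` as a stand-in for what happens at the edge of a shifted image; it is a one-dimensional analogy under that assumption, not the code Keras runs, and the pixel values are made up.

```python
import numpy as np

row = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
shift = 2  # move the content two pixels to the right

# 'nearest': the gap is filled by repeating the closest edge pixel.
nearest = np.pad(row, (shift, 0), mode="edge")[:row.size]

# 'constant': the gap is filled with a fixed value (the role played by cval).
constant = np.pad(row, (shift, 0), mode="constant", constant_values=0.5)[:row.size]

print(nearest)   # [0.2 0.2 0.2 0.4 0.6]
print(constant)  # [0.5 0.5 0.2 0.4 0.6]
```

With 'nearest', whatever sits on the border gets smeared into the gap, while 'constant' makes the gap stand out whenever the chosen value differs from the background.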


```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create two data augmentation generators with different fill modes
datagen_nearest = ImageDataGenerator(
    rotation_range=30,  # Increased rotation to make the fill mode differences more visible
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'  # Original fill mode
)

datagen_constant = ImageDataGenerator(
    rotation_range=30,  # Same parameters as above
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,
    fill_mode=________,  # TODO: Change to 'constant' fill mode
    cval=________  # TODO: Add a gray color value between 0-1 (e.g., 0.5)
)

# Select a single image from the training set
image_index = 42
single_image = x_train[image_index:image_index+1]
category_name = categories[int(y_train[image_index])]

# Create a figure to display the original and augmented images with different fill modes
plt.figure(figsize=(15, 10))

# Display the original image
plt.subplot(3, 4, 1)
plt.imshow(single_image[0].reshape(28, 28), cmap='gray')
plt.title(f"Original {category_name}")
plt.axis('off')

# Generate and display augmented versions with 'nearest' fill mode
i = 2
for batch in datagen_nearest.flow(single_image, batch_size=1, seed=42):
    plt.subplot(3, 4, i)
    plt.imshow(batch[0].reshape(28, 28), cmap='gray')
    plt.title(f"Nearest #{i-1}")
    plt.axis('off')
    i += 1
    if i > ________:  # TODO: Display 5 augmented versions
        break

# TODO: Generate and display augmented versions with 'constant' fill mode
i = 7
for batch in ________.flow(single_image, batch_size=1, seed=42):
    plt.subplot(3, 4, i)
    plt.imshow(batch[0].reshape(28, 28), cmap='gray')
    plt.title(f"Constant #{i-6}")
    plt.axis('off')
    i += 1
    if i > ________:  # Display 5 augmented versions
        break

plt.tight_layout()
os.makedirs('static/images', exist_ok=True)  # savefig does not create missing folders
plt.savefig('static/images/plot.png')
print(f"Generated augmented versions of a {category_name} drawing with different fill modes")

```

Let's compare the effects of different `fill_mode` settings on augmented drawings!

Here's the completed code:

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create two data augmentation generators with different fill modes
datagen_nearest = ImageDataGenerator(
    rotation_range=30,  # Increased rotation to make the fill mode differences more visible
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'  # Original fill mode
)

datagen_constant = ImageDataGenerator(
    rotation_range=30,  # Same parameters as above
    width_shift_range=0.2,
    height_shift_range=0.2,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,
    fill_mode='constant',  # Changed to 'constant' fill mode
    cval=0.5  # Added a gray color value (0.5 for mid-gray)
)

# Select a single image from the training set
image_index = 42
single_image = x_train[image_index:image_index+1]
category_name = categories[int(y_train[image_index])]

# Create a figure to display the original and augmented images with different fill modes
plt.figure(figsize=(15, 10))

# Display the original image
plt.subplot(3, 4, 1)
plt.imshow(single_image[0].reshape(28, 28), cmap='gray')
plt.title(f"Original {category_name}")
plt.axis('off')

# Generate and display augmented versions with 'nearest' fill mode
i = 2
for batch in datagen_nearest.flow(single_image, batch_size=1, seed=42):
    plt.subplot(3, 4, i)
    plt.imshow(batch[0].reshape(28, 28), cmap='gray')
    plt.title(f"Nearest #{i-1}")
    plt.axis('off')
    i += 1
    if i > 6:  # Stop after 5 augmented versions (subplots 2 through 6)
        break

# Generate and display augmented versions with 'constant' fill mode
i = 7
for batch in datagen_constant.flow(single_image, batch_size=1, seed=42):
    plt.subplot(3, 4, i)
    plt.imshow(batch[0].reshape(28, 28), cmap='gray')
    plt.title(f"Constant #{i-6}")
    plt.axis('off')
    i += 1
    if i > 11:  # Stop after 5 augmented versions (subplots 7 through 11)
        break

plt.tight_layout()
os.makedirs('static/images', exist_ok=True)  # Make sure the output folder exists
plt.savefig('static/images/plot.png')
print(f"Generated augmented versions of a {category_name} drawing with different fill modes")
```

-----

### Explanation of Changes and `fill_mode`

Here's what was changed and why:

1.  **`datagen_constant` Configuration:**

      * `fill_mode='constant'`: This tells the `ImageDataGenerator` to fill any newly created pixels (due to transformations like rotation or shifting) with a constant value.
      * `cval=0.5`: This parameter is used *only* when `fill_mode` is set to `'constant'`. It specifies the value to fill the new pixels with. Since our images are normalized to `0.0-1.0` (black to white), `0.5` will result in a mid-gray color. You could use `0.0` for black or `1.0` for white, depending on what best suits your data's background.

2.  **Loop for `datagen_constant`:**

      * `i = 7`: We start the counter for the `constant` fill mode images at 7 because the first 6 subplots are occupied by the original and the 5 'nearest' filled images (1 original + 5 nearest = 6 plots). In the 3-row, 4-column grid, the next plot is therefore `plt.subplot(3, 4, 7)`.
      * `for batch in datagen_constant.flow(single_image, batch_size=1, seed=42):`: We use the `datagen_constant` generator and pass the `single_image` to it. Both generators receive the same `seed=42` so that, with otherwise identical settings, they should draw the *same* sequence of random transformations, allowing a direct comparison of the `fill_mode` effect.
      * `plt.title(f"Constant #{i-6}")`: Adjusted the title to correctly number the 'constant' images from 1 to 5.
      * `if i > 11`: This condition ensures that 5 augmented images with the 'constant' fill mode are displayed (plots 7 through 11).
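
The subplot numbering above can be checked with a little arithmetic. The sketch below is only an illustration: `subplot_position` is a hypothetical helper (not a Matplotlib function) that converts a 1-based subplot index into its row and column in a grid that fills left to right, top to bottom.

```python
def subplot_position(index, n_cols):
    """Return the 1-based (row, column) for a 1-based subplot index."""
    row, col = divmod(index - 1, n_cols)
    return row + 1, col + 1

n_cols = 4  # the figure uses a 3-row, 4-column grid

layout = {
    "original": 1,
    "first nearest": 2,
    "last nearest": 6,
    "first constant": 7,
    "last constant": 11,
}

for name, index in layout.items():
    row, col = subplot_position(index, n_cols)
    print(f"{name:>14}: subplot {index:>2} -> row {row}, column {col}")
```

Under this numbering, subplot 7 sits in row 2, column 3, and subplot 11 in row 3, column 3, so the twelfth cell of the grid stays empty.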

Running the completed script generates a `plot.png` file (saved under `static/images/`) displaying:

  * The original drawing.
  * Five augmented versions using `fill_mode='nearest'`. You'll notice that the empty areas are filled by repeating the nearest pixel value, which can sometimes lead to stretched or blurred artifacts at the edges.
  * Five augmented versions using `fill_mode='constant'` with `cval=0.5`. Here, the empty areas will be filled with a uniform gray color.

This visual comparison shows how `fill_mode` affects the appearance of your augmented data, which helps you decide which method suits your task and dataset. For the Quick, Draw! bitmaps used here, background pixels are `0.0` (shown as black with `cmap='gray'`) and the strokes are bright, so a `cval` of `0.0` would likely blend in most naturally. The value `0.5` is used in this exercise mainly to make the filled regions easy to spot.
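
To see the difference between the two fill modes on raw numbers, here is a minimal NumPy sketch. It is not how Keras implements its transforms; `shift_right` is a hypothetical helper that only mimics what happens to the exposed pixels for a simple horizontal shift.

```python
import numpy as np

def shift_right(image, pixels, fill_mode="nearest", cval=0.0):
    """Shift a 2-D image right by `pixels` columns and fill the exposed columns.

    Hypothetical helper for illustration; it is not part of Keras.
    """
    if not 0 < pixels < image.shape[1]:
        raise ValueError("pixels must be between 1 and image width - 1")
    shifted = np.empty_like(image)
    shifted[:, pixels:] = image[:, :-pixels]  # move existing content right
    if fill_mode == "nearest":
        shifted[:, :pixels] = image[:, :1]    # repeat the left-edge column
    elif fill_mode == "constant":
        shifted[:, :pixels] = cval            # paint a fixed value
    else:
        raise ValueError(f"unsupported fill_mode: {fill_mode!r}")
    return shifted

row = np.array([[0.2, 0.4, 0.6, 0.8, 1.0]], dtype="float32")
nearest = shift_right(row, 2, fill_mode="nearest")
constant = shift_right(row, 2, fill_mode="constant", cval=0.5)

print("original:", row)
print("nearest: ", nearest)   # exposed columns copy the edge value 0.2
print("constant:", constant)  # exposed columns are set to cval (0.5)
```

With `'nearest'`, the two exposed columns take the edge value, while `'constant'` paints them with `cval` regardless of what the image contains.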

After exploring different fill modes, let's take a practical step forward by saving your augmented drawings to disk. This is an essential technique that allows you to inspect the augmentations visually and build a larger dataset for training.

When you save augmented images, you can verify that your transformations are working as expected and create a permanent record of your expanded dataset. This approach is particularly useful when you want to carefully review the quality of augmentations before using them for model training.

In this task, you'll configure the `flow()` method to save augmented drawings to a directory, generate multiple variations for different categories, and examine how the augmentations appear as actual image files on disk.

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# TODO: Create a directory to save augmented images
os.makedirs(________, exist_ok=True)

# Create a data augmentation generator
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Fit the generator to the training data
datagen.fit(x_train)

# Select a few images from each category to augment
samples_per_category = 2
augmentations_per_sample = 3
total_generated = 0

for category_idx in range(len(categories)):
    # Find indices of images from this category
    category_indices = np.where(y_train == category_idx)[0][:samples_per_category]
    
    for idx in category_indices:
        image = x_train[idx:idx+1]  # Keep batch dimension
        category_name = categories[int(y_train[idx])]
        
        # TODO: Generate and save augmented versions using flow() with save_to_dir parameter
        i = 0
        for batch in datagen.flow(
            image, 
            batch_size=1,
            save_to_dir=________,
            save_prefix=________,
            save_format=________
        ):
            i += 1
            total_generated += 1
            if i >= augmentations_per_sample:
                break

print(f"Generated {total_generated} augmented drawings and saved them to 'augmented_drawings' directory")

# Display a few original images for reference
plt.figure(figsize=(12, 4))
for i in range(5):
    plt.subplot(1, 5, i+1)
    plt.imshow(x_train[i].reshape(28, 28), cmap='gray')
    plt.title(f"{categories[int(y_train[i])]}")
    plt.axis('off')

plt.tight_layout()
plt.savefig('static/images/plot.png')

```

Let's get those augmented drawings saved to disk!

Here's the completed code:

```python
import urllib.request
import numpy as np
import os
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Ensure the data is downloaded
categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
base_url = 'https://storage.googleapis.com/quickdraw_dataset/full/numpy_bitmap/'

os.makedirs('quickdraw_data', exist_ok=True)

for category in categories:
    file_path = f'quickdraw_data/{category}.npy'
    if not os.path.exists(file_path):
        print(f"Downloading {category}...")
        urllib.request.urlretrieve(base_url + category + '.npy', file_path)
    else:
        print(f"{category}.npy already exists.")

# Load and prepare data
data = []
labels = []

for idx, cat in enumerate(categories):
    filepath = f'quickdraw_data/{cat}.npy'
    imgs = np.load(filepath)[:15000]  # Load up to 15000 images per category
    if imgs.dtype != np.uint8:
        imgs = imgs.astype(np.uint8)
    data.append(imgs)
    labels.append(np.full(imgs.shape[0], idx))

# Combine all data and labels
data = np.concatenate(data, axis=0)
labels = np.concatenate(labels, axis=0)

# Shuffle data
indices = np.arange(len(data))
np.random.shuffle(indices)
data, labels = data[indices], labels[indices]

# Reshape and normalize
data = data.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# Split data into training and testing sets
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print(f"Training data shape: {x_train.shape}, Testing data shape: {x_test.shape}")

# Create a directory to save augmented images
output_dir = 'augmented_drawings'
os.makedirs(output_dir, exist_ok=True)
print(f"Created directory: {output_dir}")

# Create a data augmentation generator
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    shear_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

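# fit() is only required for featurewise_center, featurewise_std_normalization,
# or zca_whitening; none are enabled here, so this call is optional
datagen.fit(x_train)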

# Select a few images from each category to augment
samples_per_category = 2
augmentations_per_sample = 3
total_generated = 0

for category_idx in range(len(categories)):
    # Find indices of images from this category
    category_indices = np.where(y_train == category_idx)[0][:samples_per_category]
    
    for idx in category_indices:
        image = x_train[idx:idx+1]  # Keep batch dimension
        category_name = categories[int(y_train[idx])]
        
        # Generate and save augmented versions using flow() with save_to_dir parameter
        i = 0
        for batch in datagen.flow(
            image, 
            batch_size=1,
            save_to_dir=output_dir,
            save_prefix=f'aug_{category_name}', # Prefix with category name
            save_format='png' # Save as PNG images
        ):
            i += 1
            total_generated += 1
            if i >= augmentations_per_sample:
                break

print(f"Generated {total_generated} augmented drawings and saved them to '{output_dir}' directory")

# Display a few original images for reference
plt.figure(figsize=(12, 4))
for i in range(5):
    plt.subplot(1, 5, i+1)
    plt.imshow(x_train[i].reshape(28, 28), cmap='gray')
    plt.title(f"{categories[int(y_train[i])]}")
    plt.axis('off')

plt.tight_layout()
os.makedirs('static/images', exist_ok=True)  # Make sure the output folder exists
plt.savefig('static/images/plot.png')
```

-----

### Key Changes

1.  **`os.makedirs(output_dir, exist_ok=True)`**: This line creates the directory named by `output_dir` (here, `augmented_drawings`) if it doesn't already exist. This is where all the augmented images will be saved.
2.  **`save_to_dir=output_dir`**: Inside the `datagen.flow()` call, this parameter is set to `output_dir` (which is `'augmented_drawings'`). This tells the generator to write each augmented image to that directory in addition to yielding it to the loop, so the `for` loop must still iterate for files to be written.
3.  **`save_prefix=f'aug_{category_name}'`**: This adds a prefix to the filenames of the saved images. The f-string includes the category name (e.g., `aug_cat_`, `aug_house_`), which makes it easier to identify the original category of an augmented image when you browse the directory.
4.  **`save_format='png'`**: This specifies the image format to save the files in. PNG is a good choice for these 28x28 grayscale images as it's lossless. Other options include `'jpeg'`, but PNG is generally better for line art.

### How to Verify the Output

After running this code, you will see a new directory named `augmented_drawings` in your current working directory (the folder you ran the script from). Inside this directory, you will find:

  * Up to `samples_per_category * len(categories) * augmentations_per_sample` images. In this case, that is `2 * 5 * 3 = 30` augmented images. The count can occasionally be slightly lower, because a filename collision overwrites an earlier file.
  * The filenames will start with prefixes like `aug_cat_`, `aug_house_`, `aug_airplane_`, `aug_apple_`, and `aug_bicycle_`, followed by the sample's index within the array passed to `flow()` (always `0` here, since one image is passed at a time) and a random number generated by Keras.
  * You can open these `png` files using any image viewer to visually inspect the transformations and confirm they look as expected. Each image will be a 28x28 grayscale drawing, reflecting the various rotations, shifts, zooms, and shears applied.
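
One way to run this check without opening a file browser is to count the saved files per category. The sketch below uses only the standard library. `count_saved_by_category` is a hypothetical helper, and it assumes the files follow the `aug_<category>_...png` naming produced by the `save_prefix` and `save_format` values used above.

```python
from collections import Counter
from pathlib import Path

def count_saved_by_category(directory, categories, prefix="aug_", ext=".png"):
    """Count files named '<prefix><category>_*<ext>' in `directory`.

    Hypothetical helper; returns an empty Counter if the directory is missing.
    """
    counts = Counter()
    root = Path(directory)
    if not root.is_dir():
        return counts
    for path in root.glob(f"{prefix}*{ext}"):
        for category in categories:
            if path.name.startswith(f"{prefix}{category}_"):
                counts[category] += 1
                break
    return counts

categories = ['cat', 'house', 'airplane', 'apple', 'bicycle']
counts = count_saved_by_category('augmented_drawings', categories)

expected_total = 2 * len(categories) * 3  # samples_per_category * categories * augmentations_per_sample
print(f"Found {sum(counts.values())} files (expected about {expected_total} after one run)")
for category in categories:
    print(f"  {category}: {counts[category]}")
```

Running the augmentation script more than once adds new files to the same directory, so the totals will grow unless you clear the folder between runs.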

This process of saving augmented images is invaluable for debugging your augmentation pipeline and for creating large, diverse datasets for training deep learning models.
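
As a rough sketch of what building a larger dataset can look like in memory, the example below appends augmented copies to a small array of images and repeats the labels to match. It uses random stand-in data, and `extend_with_flips` is a hypothetical helper: a plain horizontal flip stands in for whichever augmentations you decide to keep.

```python
import numpy as np

def extend_with_flips(x, y):
    """Return x plus horizontally flipped copies, with labels repeated to match.

    Hypothetical helper; expects x shaped (N, height, width, channels).
    """
    flipped = x[:, :, ::-1, :]  # reverse the width axis
    x_extended = np.concatenate([x, flipped], axis=0)
    y_extended = np.concatenate([y, y], axis=0)
    return x_extended, y_extended

# Stand-in data with the same layout as x_train / y_train in this lesson
rng = np.random.default_rng(0)
x_demo = rng.random((4, 28, 28, 1), dtype=np.float32)
y_demo = np.array([0, 1, 2, 3])

x_big, y_big = extend_with_flips(x_demo, y_demo)
print(x_big.shape, y_big.shape)  # (8, 28, 28, 1) (8,)
```

As noted earlier in the lesson, this kind of expansion should be applied to the training set only, so that evaluation still uses unaltered images.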