This is a companion notebook for the book [Deep Learning with Python, Second Edition](https://www.manning.com/books/deep-learning-with-python-second-edition?a_aid=keras&a_bid=76564dff). For readability, it only contains runnable code blocks and section titles, and omits everything else in the book: text paragraphs, figures, and pseudocode.

**If you want to be able to follow what's going on, I recommend reading the notebook side by side with your copy of the book.**

This notebook was generated for TensorFlow 2.6.

# Introduction to deep learning for computer vision

## Introduction to convnets

**Instantiating a small convnet**

In [None]:
from tensorflow import keras
from tensorflow.keras import layers
inputs = keras.Input(shape=(28, 28, 1)) # 28x28 grayscale images (1 channel)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(inputs) # output: (26, 26, 32)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (13, 13, 32)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x) # output: (11, 11, 64)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (5, 5, 64)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x) # output: (3, 3, 128)
x = layers.Flatten()(x) # output: (1152,)
outputs = layers.Dense(10, activation="softmax")(x) # one probability per digit class
model = keras.Model(inputs=inputs, outputs=outputs)

**Displaying the model's summary**

In [None]:
model.summary() # printing the model summary

**Training the convnet on MNIST images**

In [None]:
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data() # 60,000 training and 10,000 test images
train_images = train_images.reshape((60000, 28, 28, 1)) # add a trailing channel axis
train_images = train_images.astype("float32") / 255 # scale pixel values from [0, 255] to [0, 1]
test_images = test_images.reshape((10000, 28, 28, 1)) # add a trailing channel axis
test_images = test_images.astype("float32") / 255 # scale pixel values from [0, 255] to [0, 1]
model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy", # labels are integer class indices, not one-hot vectors
    metrics=["accuracy"])
model.fit(train_images, train_labels, epochs=5, batch_size=64)

**Evaluating the convnet**

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels) # evaluating the model on the test images and labels
print(f"Test accuracy: {test_acc:.3f}") # printing the test accuracy

### The convolution operation

#### Understanding border effects and padding
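
As a rough illustration of the border effect, the sketch below computes the output size of a convolution along one spatial dimension for the two padding modes Keras offers: `"valid"` (no padding) and `"same"` (pad so that the output has the same size as the input). The helper name `conv_output_size` is hypothetical and not part of Keras, and the sketch assumes a stride of 1 and no dilation.

```python
def conv_output_size(input_size, kernel_size, padding="valid"):
    """Output size along one dimension for a stride-1 convolution (illustrative helper)."""
    if padding == "valid":
        # The kernel must fit entirely inside the input, so border positions are lost.
        return input_size - kernel_size + 1
    if padding == "same":
        # The input is padded so that every input position gets an output position.
        return input_size
    raise ValueError(f"Unknown padding mode: {padding!r}")

# A 3x3 kernel on a 28x28 input, as in the first layer of the MNIST convnet.
print(conv_output_size(28, 3, "valid"))  # 26
print(conv_output_size(28, 3, "same"))   # 28
```

With `"valid"` padding, each 3x3 convolution shrinks the feature map by two pixels per dimension, which matches the 28 to 26 reduction reported by `model.summary()` for the first layer.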

#### Understanding convolution strides
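
To make the effect of strides concrete, here is a minimal sketch of the output-size arithmetic along one spatial dimension. The helper `strided_output_size` is a hypothetical name introduced for illustration; the formulas assume no dilation and follow the usual `"valid"` and `"same"` conventions.

```python
import math

def strided_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output size along one dimension for a strided convolution (illustrative helper)."""
    if padding == "valid":
        # Count the kernel positions that fit when stepping `stride` pixels at a time.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # With "same" padding, only the stride reduces the size.
        return math.ceil(input_size / stride)
    raise ValueError(f"Unknown padding mode: {padding!r}")

print(strided_output_size(5, 3, stride=2))                   # 2
print(strided_output_size(28, 3, stride=2))                  # 13
print(strided_output_size(28, 3, stride=2, padding="same"))  # 14
```

A stride of 2 downsamples each dimension by roughly a factor of 2, on top of any reduction caused by border effects.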

### The max-pooling operation

**An incorrectly structured convnet missing its max-pooling layers**

In [None]:
inputs = keras.Input(shape=(28, 28, 1)) # 28x28 grayscale images (1 channel)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(inputs) # output: (26, 26, 32)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x) # output: (24, 24, 64)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x) # output: (22, 22, 128)
x = layers.Flatten()(x) # output: (61952,), far larger than with max pooling
outputs = layers.Dense(10, activation="softmax")(x) # one probability per digit class
model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)

In [None]:
model_no_max_pool.summary() # printing the model summary
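
The following sketch traces the spatial size of the feature maps through both architectures, to show why dropping the max-pooling layers inflates the flattened feature vector. The helper `trace_sizes` and its `("conv", k)` / `("pool", p)` layer descriptions are hypothetical conventions used only here; the arithmetic assumes `"valid"` convolutions with stride 1 and pooling windows that do not overlap.

```python
def trace_sizes(input_size, layer_specs):
    """Return the spatial size after each layer; layers are ("conv", k) or ("pool", p)."""
    sizes = [input_size]
    for kind, k in layer_specs:
        size = sizes[-1]
        if kind == "conv":
            size = size - k + 1   # "valid" convolution, stride 1
        elif kind == "pool":
            size = size // k      # pooling with stride equal to the window size
        else:
            raise ValueError(f"Unknown layer kind: {kind!r}")
        sizes.append(size)
    return sizes

with_pool = trace_sizes(
    28, [("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2), ("conv", 3)])
without_pool = trace_sizes(28, [("conv", 3), ("conv", 3), ("conv", 3)])

print(with_pool)     # [28, 26, 13, 11, 5, 3]
print(without_pool)  # [28, 26, 24, 22]

# Size of the flattened vector fed to the Dense layer (128 channels in both models).
print(with_pool[-1] ** 2 * 128)     # 1152
print(without_pool[-1] ** 2 * 128)  # 61952
```

The final `Dense(10)` layer therefore has to process 61,952 inputs instead of 1,152, which is the main source of the extra parameters in `model_no_max_pool`.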

## Training a convnet from scratch on a small dataset

### The relevance of deep learning for small-data problems

### Downloading the data

In [None]:
from google.colab import files # only available when running in Google Colab
files.upload() # opens a file picker; select your kaggle.json API key file

In [None]:
!mkdir -p ~/.kaggle # create the Kaggle config directory (no error if it already exists)
!cp kaggle.json ~/.kaggle/ # copy the API key file into it
!chmod 600 ~/.kaggle/kaggle.json # restrict the key file to owner read/write

In [None]:
!kaggle competitions download -c dogs-vs-cats # downloading the dogs-vs-cats dataset

In [None]:
!unzip -qq dogs-vs-cats.zip # unzipping the dogs-vs-cats.zip file

In [None]:
!unzip -qq train.zip # unzipping the train.zip file

**Copying images to training, validation, and test directories**

In [None]:
import os, shutil, pathlib

original_dir = pathlib.Path("train") # directory where the original dataset was unzipped
new_base_dir = pathlib.Path("cats_vs_dogs_small") # directory that will hold the smaller dataset

def make_subset(subset_name, start_index, end_index): # copy images start_index to end_index - 1 of each class
    for category in ("cat", "dog"):
        target_dir = new_base_dir / subset_name / category # destination for this subset and category
        os.makedirs(target_dir, exist_ok=True) # exist_ok lets the cell be re-run without raising FileExistsError
        fnames = [f"{category}.{i}.jpg" for i in range(start_index, end_index)]
        for fname in fnames:
            shutil.copyfile(src=original_dir / fname,
                            dst=target_dir / fname)

make_subset("train", start_index=0, end_index=1000) # first 1,000 images of each category
make_subset("validation", start_index=1000, end_index=1500) # next 500 images of each category
make_subset("test", start_index=1500, end_index=2500) # next 1,000 images of each category

### Building the model

**Instantiating a small convnet for dogs vs. cats classification**

In [None]:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(180, 180, 3)) # 180x180 RGB images
x = layers.Rescaling(1./255)(inputs) # scale pixel values from [0, 255] to [0, 1]
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x) # output: (178, 178, 32)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (89, 89, 32)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x) # output: (87, 87, 64)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (43, 43, 64)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x) # output: (41, 41, 128)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (20, 20, 128)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x) # output: (18, 18, 256)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (9, 9, 256)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x) # output: (7, 7, 256)
x = layers.Flatten()(x) # output: (12544,)
outputs = layers.Dense(1, activation="sigmoid")(x) # single probability for binary classification
model = keras.Model(inputs=inputs, outputs=outputs)

In [None]:
model.summary() # printing the model summary

**Configuring the model for training**

In [None]:
model.compile(loss="binary_crossentropy", 
              optimizer="rmsprop",
              metrics=["accuracy"]) # compiling the model with the loss function as binary_crossentropy, optimizer as rmsprop and metrics as accuracy

### Data preprocessing

**Using `image_dataset_from_directory` to read images**

In [None]:
from tensorflow.keras.utils import image_dataset_from_directory # importing image_dataset_from_directory from tensorflow.keras.utils

train_dataset = image_dataset_from_directory( # creating the train dataset
    new_base_dir / "train", # defining the directory as the train directory
    image_size=(180, 180), # defining the image size as 180x180 pixels
    batch_size=32) # defining the batch size as 32
validation_dataset = image_dataset_from_directory( # creating the validation dataset
    new_base_dir / "validation", # defining the directory as the validation directory
    image_size=(180, 180), # defining the image size as 180x180 pixels
    batch_size=32) # defining the batch size as 32
test_dataset = image_dataset_from_directory( # creating the test dataset
    new_base_dir / "test", # defining the directory as the test directory
    image_size=(180, 180), # defining the image size as 180x180 pixels
    batch_size=32) # defining the batch size as 32

In [None]:
import numpy as np # importing numpy as np
import tensorflow as tf # importing tensorflow as tf
random_numbers = np.random.normal(size=(1000, 16)) # generating random numbers
dataset = tf.data.Dataset.from_tensor_slices(random_numbers) # creating a dataset from the random numbers

In [None]:
for i, element in enumerate(dataset): # the dataset yields one sample at a time
    print(element.shape) # each sample has shape (16,)
    if i >= 2: # stop after the first three elements
        break

In [None]:
batched_dataset = dataset.batch(32) # group consecutive samples into batches of 32
for i, element in enumerate(batched_dataset):
    print(element.shape) # each batch has shape (32, 16)
    if i >= 2: # stop after the first three batches
        break

In [None]:
reshaped_dataset = dataset.map(lambda x: tf.reshape(x, (4, 4))) # reshape each 16-element sample to (4, 4)
for i, element in enumerate(reshaped_dataset):
    print(element.shape) # each element now has shape (4, 4)
    if i >= 2: # stop after the first three elements
        break

**Displaying the shapes of the data and labels yielded by the `Dataset`**

In [None]:
for data_batch, labels_batch in train_dataset: # iterating over the train dataset
    print("data batch shape:", data_batch.shape) # printing the shape of the data batch
    print("labels batch shape:", labels_batch.shape) # printing the shape of the labels batch
    break # breaking the loop
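
The loop above only shows the first batch. As a small worked example of how a dataset of a given size is split into batches, the sketch below computes the number of batches and the size of the last one. It assumes the final, smaller batch is kept rather than dropped; `batch_layout` is a hypothetical helper, and 2,000 is the size of the training subset built earlier (1,000 images per category).

```python
import math

def batch_layout(num_samples, batch_size):
    """Return (number of batches, size of the last batch) when the remainder is kept."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch_size = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch_size

print(batch_layout(2000, 32))  # (63, 16): training images in batches of 32
print(batch_layout(1000, 32))  # (32, 8): the 1,000 random vectors batched earlier
```

Under that assumption, one epoch over the training set corresponds to 63 steps, the last of which contains only 16 images.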

**Fitting the model using a `Dataset`**

In [None]:
callbacks = [ # defining the callbacks
    keras.callbacks.ModelCheckpoint( # defining the model checkpoint callback
        filepath="convnet_from_scratch.keras", # defining the file path 
        save_best_only=True, # saving the best model only
        monitor="val_loss") # monitoring the validation loss
] 
history = model.fit( # fitting the model
    train_dataset, # using the train dataset
    epochs=30, # using 30 epochs
    validation_data=validation_dataset, # using the validation dataset
    callbacks=callbacks) # using the callbacks
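
To illustrate what `save_best_only=True` with `monitor="val_loss"` selects, the sketch below picks the epoch with the lowest validation loss from a list of per-epoch values. This is only an approximation of the callback's behavior written in plain Python; `best_epoch` is a hypothetical helper, and the loss values are made up for the example rather than taken from a real training run.

```python
def best_epoch(val_losses):
    """Return (1-based epoch number, loss) of the first epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1, val_losses[best_index]

# Made-up validation losses for five epochs (illustrative only).
example_val_losses = [0.69, 0.62, 0.58, 0.61, 0.66]
print(best_epoch(example_val_losses))  # (3, 0.58)
```

In a real run, the same helper could be applied to `history.history["val_loss"]` to see which epoch's weights ended up in the saved file.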

**Displaying curves of loss and accuracy during training**

In [None]:
import matplotlib.pyplot as plt # importing matplotlib.pyplot as plt
accuracy = history.history["accuracy"] # getting the accuracy
val_accuracy = history.history["val_accuracy"] # getting the validation accuracy
loss = history.history["loss"] # getting the loss
val_loss = history.history["val_loss"] # getting the validation loss
epochs = range(1, len(accuracy) + 1) # getting the epochs
plt.plot(epochs, accuracy, "bo", label="Training accuracy") # plotting the training accuracy
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy") # plotting the validation accuracy
plt.title("Training and validation accuracy") # setting the title
plt.legend() # adding the legend
plt.figure() # creating a new figure
plt.plot(epochs, loss, "bo", label="Training loss") # plotting the training loss
plt.plot(epochs, val_loss, "b", label="Validation loss") # plotting the validation loss
plt.title("Training and validation loss") # setting the title
plt.legend() # adding the legend
plt.show() # showing the plot

**Evaluating the model on the test set**

In [None]:
test_model = keras.models.load_model("convnet_from_scratch.keras") # loading the model
test_loss, test_acc = test_model.evaluate(test_dataset) # evaluating the model on the test dataset
print(f"Test accuracy: {test_acc:.3f}") # printing the test accuracy

### Using data augmentation

**Define a data augmentation stage to add to an image model**

In [None]:
data_augmentation = keras.Sequential( # defining the data augmentation
    [
        layers.RandomFlip("horizontal"), # horizontally flip a random 50% of the images
        layers.RandomRotation(0.1), # rotate by a random angle within ±10% of a full circle (±36 degrees)
        layers.RandomZoom(0.2), # zoom in or out by a random factor within ±20%
    ]
)

**Displaying some randomly augmented training images**

In [None]:
plt.figure(figsize=(10, 10))
for images, _ in train_dataset.take(1): # take(1) yields only the first batch
    for i in range(9): # draw nine random augmentations of the same batch
        augmented_images = data_augmentation(images) # each call applies new random transformations
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8")) # display the first image of the augmented batch
        plt.axis("off")

**Defining a new convnet that includes image augmentation and dropout**

In [None]:
inputs = keras.Input(shape=(180, 180, 3)) # 180x180 RGB images
x = data_augmentation(inputs) # random augmentation, active only during training
x = layers.Rescaling(1./255)(x) # scale pixel values from [0, 255] to [0, 1]
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x) # output: (178, 178, 32)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (89, 89, 32)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x) # output: (87, 87, 64)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (43, 43, 64)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x) # output: (41, 41, 128)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (20, 20, 128)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x) # output: (18, 18, 256)
x = layers.MaxPooling2D(pool_size=2)(x) # output: (9, 9, 256)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x) # output: (7, 7, 256)
x = layers.Flatten()(x) # output: (12544,)
x = layers.Dropout(0.5)(x) # randomly zero out 50% of the features during training
outputs = layers.Dense(1, activation="sigmoid")(x) # single probability for binary classification
model = keras.Model(inputs=inputs, outputs=outputs)

model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"]) # compiling the model with the loss function as binary_crossentropy, optimizer as rmsprop and metrics as accuracy

**Training the regularized convnet**

In [None]:
callbacks = [ # defining the callbacks
    keras.callbacks.ModelCheckpoint( # defining the model checkpoint callback
        filepath="convnet_from_scratch_with_augmentation.keras", # defining the file path 
        save_best_only=True, # saving the best model only
        monitor="val_loss") # monitoring the validation loss
]
history = model.fit( # fitting the model
    train_dataset, # using the train dataset
    epochs=100, # using 100 epochs
    validation_data=validation_dataset, # using the validation dataset
    callbacks=callbacks) # using the callbacks

**Evaluating the model on the test set**

In [None]:
test_model = keras.models.load_model( # loading the model
    "convnet_from_scratch_with_augmentation.keras") # loading the model
test_loss, test_acc = test_model.evaluate(test_dataset) # evaluating the model on the test dataset
print(f"Test accuracy: {test_acc:.3f}") # printing the test accuracy

## Leveraging a pretrained model

### Feature extraction with a pretrained model

**Instantiating the VGG16 convolutional base**

In [None]:
conv_base = keras.applications.vgg16.VGG16(
    weights="imagenet", # initialize with weights pretrained on ImageNet
    include_top=False, # leave out the densely connected classifier on top of the network
    input_shape=(180, 180, 3)) # shape of the image tensors fed to the network

In [None]:
conv_base.summary() # printing the model summary

#### Fast feature extraction without data augmentation

**Extracting the VGG16 features and corresponding labels**

In [None]:
import numpy as np # importing numpy as np

def get_features_and_labels(dataset): # defining a function to get the features and labels
    all_features = [] # initializing the list of all features
    all_labels = [] # initializing the list of all labels
    for images, labels in dataset: # iterating over the dataset
        preprocessed_images = keras.applications.vgg16.preprocess_input(images) # preprocessing the images
        features = conv_base.predict(preprocessed_images) # getting the features
        all_features.append(features) # appending the features
        all_labels.append(labels) # appending the labels
    return np.concatenate(all_features), np.concatenate(all_labels) # returning the features and labels

train_features, train_labels = get_features_and_labels(train_dataset) # getting the features and labels of the train dataset
val_features, val_labels = get_features_and_labels(validation_dataset) # getting the features and labels of the validation dataset
test_features, test_labels = get_features_and_labels(test_dataset) # getting the features and labels of the test dataset

In [None]:
train_features.shape # printing the shape of the train features
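
The spatial size of the extracted features can be worked out by hand. The sketch below assumes the standard VGG16 convolutional base: its convolutions use `"same"` padding, so they keep the spatial size, and each of its five blocks ends with a 2x2 max-pooling layer that halves it (rounding down). `vgg16_feature_map_size` is a hypothetical helper written for this illustration.

```python
def vgg16_feature_map_size(input_size, num_pool_stages=5):
    """Spatial size of the final VGG16 feature map for a square input (illustrative helper)."""
    size = input_size
    for _ in range(num_pool_stages):
        size //= 2  # each 2x2 max-pooling stage halves the size, rounding down
    return size

print(vgg16_feature_map_size(180))  # 5  (180 -> 90 -> 45 -> 22 -> 11 -> 5)
print(vgg16_feature_map_size(224))  # 7  (the input size VGG16 was originally trained with)
```

With 512 channels in the last block and 2,000 training images, this gives features of shape `(2000, 5, 5, 512)`, which is why the classifier below takes inputs of shape `(5, 5, 512)`.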

**Defining and training the densely connected classifier**

In [None]:
inputs = keras.Input(shape=(5, 5, 512)) # defining the input shape
x = layers.Flatten()(inputs) # flattening the input
x = layers.Dense(256)(x) # defining a dense layer
x = layers.Dropout(0.5)(x) # adding dropout
outputs = layers.Dense(1, activation="sigmoid")(x) # defining the output layer
model = keras.Model(inputs, outputs) # creating the model
model.compile(loss="binary_crossentropy", 
              optimizer="rmsprop",
              metrics=["accuracy"]) # compiling the model with the loss function as binary_crossentropy, optimizer as rmsprop and metrics as accuracy

callbacks = [ # defining the callbacks
    keras.callbacks.ModelCheckpoint( # defining the model checkpoint callback
      filepath="feature_extraction.keras", # defining the file path
      save_best_only=True, # saving the best model only
      monitor="val_loss") # monitoring the validation loss
]
history = model.fit( # fitting the model
    train_features, train_labels, # using the train features and labels
    epochs=20, # using 20 epochs
    validation_data=(val_features, val_labels), # using the validation features and labels
    callbacks=callbacks) # using the callbacks

**Plotting the results**

In [None]:
import matplotlib.pyplot as plt # importing matplotlib.pyplot as plt
acc = history.history["accuracy"] # getting the accuracy
val_acc = history.history["val_accuracy"] # getting the validation accuracy
loss = history.history["loss"] # getting the loss
val_loss = history.history["val_loss"] # getting the validation loss
epochs = range(1, len(acc) + 1) # getting the epochs
plt.plot(epochs, acc, "bo", label="Training accuracy") # plotting the training accuracy
plt.plot(epochs, val_acc, "b", label="Validation accuracy") # plotting the validation accuracy
plt.title("Training and validation accuracy") # setting the title
plt.legend() # adding the legend
plt.figure() # creating a new figure
plt.plot(epochs, loss, "bo", label="Training loss") # plotting the training loss
plt.plot(epochs, val_loss, "b", label="Validation loss") # plotting the validation loss
plt.title("Training and validation loss") # setting the title
plt.legend() # adding the legend
plt.show() # showing the plot

#### Feature extraction together with data augmentation

**Instantiating and freezing the VGG16 convolutional base**

In [None]:
conv_base = keras.applications.vgg16.VGG16(
    weights="imagenet", # initialize with weights pretrained on ImageNet
    include_top=False) # leave out the densely connected classifier on top of the network
conv_base.trainable = False # freeze the convolutional base so its weights are not updated during training

**Printing the list of trainable weights before and after freezing**

In [None]:
conv_base.trainable = True # unfreezing the convolutional base
print("This is the number of trainable weights " 
      "before freezing the conv base:", len(conv_base.trainable_weights)) # printing the number of trainable weights

In [None]:
conv_base.trainable = False # freezing the convolutional base
print("This is the number of trainable weights "
      "after freezing the conv base:", len(conv_base.trainable_weights)) # printing the number of trainable weights

**Adding a data augmentation stage and a classifier to the convolutional base**

In [None]:
data_augmentation = keras.Sequential( # defining the data augmentation
    [
        layers.RandomFlip("horizontal"), # horizontally flip a random 50% of the images
        layers.RandomRotation(0.1), # rotate by a random angle within ±10% of a full circle (±36 degrees)
        layers.RandomZoom(0.2), # zoom in or out by a random factor within ±20%
    ]
)

inputs = keras.Input(shape=(180, 180, 3)) # defining the input shape
x = data_augmentation(inputs) # augmenting the input
x = keras.applications.vgg16.preprocess_input(x) # preprocessing the input
x = conv_base(x) # adding the convolutional base
x = layers.Flatten()(x) # flattening the output
x = layers.Dense(256)(x) # defining a dense layer
x = layers.Dropout(0.5)(x) # adding dropout
outputs = layers.Dense(1, activation="sigmoid")(x) # defining the output layer
model = keras.Model(inputs, outputs) # creating the model
model.compile(loss="binary_crossentropy", 
              optimizer="rmsprop",
              metrics=["accuracy"]) # compiling the model with the loss function as binary_crossentropy, optimizer as rmsprop and metrics as accuracy

In [None]:
callbacks = [ # defining the callbacks
    keras.callbacks.ModelCheckpoint( # defining the model checkpoint callback
        filepath="feature_extraction_with_data_augmentation.keras", # defining the file path
        save_best_only=True, # saving the best model only
        monitor="val_loss") # monitoring the validation loss
]
history = model.fit( # fitting the model
    train_dataset, # using the train dataset
    epochs=50, # using 50 epochs
    validation_data=validation_dataset, # using the validation dataset
    callbacks=callbacks) # using the callbacks

**Evaluating the model on the test set**

In [None]:
test_model = keras.models.load_model( # loading the model
    "feature_extraction_with_data_augmentation.keras") # loading the model
test_loss, test_acc = test_model.evaluate(test_dataset) # evaluating the model on the test dataset
print(f"Test accuracy: {test_acc:.3f}") # printing the test accuracy

### Fine-tuning a pretrained model

In [None]:
conv_base.summary() # printing the model summary

**Freezing all layers until the fourth from the last**

In [None]:
conv_base.trainable = True # unfreezing the convolutional base
for layer in conv_base.layers[:-4]: # iterating over the layers of the convolutional base except the last 4 layers
    layer.trainable = False # freezing the layers
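
To see which layers the `[:-4]` slice leaves trainable, the sketch below applies the same slicing to a plain list of the layer names in the VGG16 convolutional base. The list is written out by hand rather than read from `conv_base`, and the first entry is a placeholder because the input layer's actual name is auto-generated by Keras.

```python
# Layer names of the VGG16 convolutional base (include_top=False), in order.
layer_names = [
    "input",  # placeholder: the real input layer name is auto-generated
    "block1_conv1", "block1_conv2", "block1_pool",
    "block2_conv1", "block2_conv2", "block2_pool",
    "block3_conv1", "block3_conv2", "block3_conv3", "block3_pool",
    "block4_conv1", "block4_conv2", "block4_conv3", "block4_pool",
    "block5_conv1", "block5_conv2", "block5_conv3", "block5_pool",
]

frozen = layer_names[:-4]      # same slice as conv_base.layers[:-4]
trainable = layer_names[-4:]   # the layers that remain trainable

print(len(frozen))  # 15
print(trainable)    # ['block5_conv1', 'block5_conv2', 'block5_conv3', 'block5_pool']

# Each conv layer has a kernel and a bias; pooling layers have no weights.
num_trainable_weights = 2 * sum("conv" in name for name in trainable)
print(num_trainable_weights)  # 6
```

Only the three convolution layers of the last block are fine-tuned; `block5_pool` is technically unfrozen but has no weights to update.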

**Fine-tuning the model**

In [None]:
model.compile(loss="binary_crossentropy",
              optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
              metrics=["accuracy"]) # compiling the model with the loss function as binary_crossentropy, optimizer as RMSprop with a learning rate of 1e-5 and metrics as accuracy

callbacks = [ # defining the callbacks
    keras.callbacks.ModelCheckpoint( # defining the model checkpoint callback
        filepath="fine_tuning.keras", # defining the file path
        save_best_only=True, # saving the best model only
        monitor="val_loss") # monitoring the validation loss
]
history = model.fit( # fitting the model
    train_dataset, # using the train dataset
    epochs=30, # using 30 epochs
    validation_data=validation_dataset, # using the validation dataset
    callbacks=callbacks) # using the callbacks

In [None]:
model = keras.models.load_model("fine_tuning.keras") # loading the model
test_loss, test_acc = model.evaluate(test_dataset) # evaluating the model on the test dataset
print(f"Test accuracy: {test_acc:.3f}") # printing the test accuracy

## Summary