[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AMLA-UBC/100-Exploring-the-World-of-Modern-Machine-Learning/blob/main/Deep_Dive_into_Learning_Rate_Tutorial.ipynb)

# Install and Import the Required Modules

In [None]:
!pip install -q tensorflow
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

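# Load the Animals Dataset

The dataset contains images of 10 different animals: cats, dogs, cows, horses, sheep, goats, ducks, chickens, rabbits, and pigs.

In [None]:
# Load the animal dataset
# NOTE: tf.keras.datasets does not ship an `animals` loader, so the call below
# fails as written. Replace it with your own loading code that returns NumPy
# arrays of 150x150 RGB images and integer labels from 0 to 9.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.animals.load_data()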

# Reshape the data
x_train = x_train.reshape(x_train.shape[0], 150, 150, 3)
x_test = x_test.reshape(x_test.shape[0], 150, 150, 3)

# Normalize the data
x_train = x_train / 255.0
x_test = x_test / 255.0

# Change the Learning Rate

Train the CNN with different learning rates. Play around with it.

In [None]:
learning_rate = 0.001

# Build a Convolutional Neural Network (CNN)

In [None]:
# Create the model
model = tf.keras.Sequential([
    # Add a convolutional layer
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),
    # Add a pooling layer
    tf.keras.layers.MaxPooling2D(2, 2),
    # Add a flatten layer
    tf.keras.layers.Flatten(),
    # Add a dense layer
    tf.keras.layers.Dense(128, activation='relu'),
    # Add a final output layer
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10)

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)

# Save the model (recent Keras versions require a .keras or .h5 extension)
model.save("10animals.keras")

# How to Find the Best Learning Rate?

The learning rate is a critical hyperparameter in deep learning: it sets the size of the step the optimizer takes each time it updates the model parameters. If it is too small, training is slow; if it is too large, the loss oscillates or diverges. Here are a few ways to find a good learning rate.
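To make the "step size" idea concrete, here is a minimal sketch that uses plain gradient descent on the toy function f(w) = w². This is an illustration only: the notebook itself trains with Adam, and the function name `gradient_descent` and the three learning rates are assumptions chosen for the demo.

```python
def gradient_descent(learning_rate, steps=20, w0=1.0):
    """Minimize f(w) = w**2 with plain gradient descent and return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * grad   # the learning rate scales every update
    return w

# Compare a small, a moderate, and an overly large learning rate
for lr in (0.01, 0.1, 1.1):
    print(f"lr={lr}: final w = {gradient_descent(lr):.4g}")
```

The smallest rate creeps toward the minimum at w = 0, the moderate one gets much closer in the same number of steps, and the largest one overshoots on every step and moves away from the minimum.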

One of the most straightforward methods is to **plot the loss against the learning rate**. Start with a low learning rate, say 1e-7, and gradually increase it, for example by multiplying it by 10 for each new run. Plot the loss for each learning rate and look for the region where the loss falls fastest, just before it flattens out or shoots back up. A learning rate in that region, somewhat below the point where the loss starts to climb, is usually a good choice.

In [None]:
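import matplotlib.pyplot as plt

learning_rates = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
loss_values = []

for lr in learning_rates:
    # Clone the architecture so each run starts from freshly initialized weights
    # instead of continuing from the previous learning rate's training
    trial_model = tf.keras.models.clone_model(model)
    trial_model.compile(loss='sparse_categorical_crossentropy',
                        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                        metrics=['accuracy'])

    history = trial_model.fit(x_train, y_train, epochs=10)
    # Record the training loss after the final epoch
    loss_values.append(history.history['loss'][-1])

# Plot the final loss against the learning rate on a log-scaled x-axis
plt.plot(learning_rates, loss_values)
plt.xscale('log')
plt.xlabel('Learning Rate')
plt.ylabel('Loss')
plt.show()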
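Reading the sweet spot off a plot by eye can be subjective. As a rough sketch, one common heuristic is to pick the learning rate where the loss drops fastest per decade of learning rate. The helper name `suggest_learning_rate` and the loss values below are made up for illustration; they are not results from this notebook.

```python
import numpy as np

def suggest_learning_rate(learning_rates, losses):
    """Return the learning rate at which the loss falls fastest per decade."""
    log_lrs = np.log10(np.asarray(learning_rates, dtype=float))
    losses = np.asarray(losses, dtype=float)
    # Slope of the loss curve with respect to log10(learning rate)
    slopes = np.gradient(losses, log_lrs)
    return float(learning_rates[int(np.argmin(slopes))])

# Hypothetical loss values, for illustration only
example_lrs = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3, 1e-2]
example_losses = [2.30, 2.29, 2.20, 1.60, 0.90, 3.50]
print(suggest_learning_rate(example_lrs, example_losses))
```

For these made-up numbers the helper returns 1e-4, one step before the lowest loss, because that is where the curve is steepest. Treat the result as a starting point to confirm with a full training run.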

Another method is to use a **learning rate schedule**, which starts with a relatively high learning rate and gradually decreases it during training. A common choice is step decay, where the learning rate is multiplied by a fixed factor (0.1, for example) after a set number of epochs. The cell below uses `ExponentialDecay` with `staircase=True`, which produces a similar step-shaped curve. Other options include cyclical learning rates and cosine annealing.

In [None]:
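# Define the learning rate schedule
# Adam is sensitive to large step sizes, so 0.01 is a safer starting point than 0.1
initial_learning_rate = 0.01
# decay_steps counts optimizer steps (batches), not epochs;
# Keras uses a batch size of 32 by default, so this decays every 2 epochs
steps_per_epoch = max(1, len(x_train) // 32)
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(initial_learning_rate,
                                                             decay_steps=2 * steps_per_epoch,
                                                             decay_rate=0.96,
                                                             staircase=True)

# Compile the model with the schedule in place of a fixed learning rate
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10)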
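To show what step decay and cosine annealing compute, here is a small framework-free sketch of both formulas. The function names, the default drop factor, and the epoch counts are assumptions picked for illustration rather than values taken from this notebook.

```python
import math

def step_decay(initial_lr, epoch, drop=0.1, epochs_per_drop=10):
    """Multiply the rate by `drop` once every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)

def cosine_annealing(initial_lr, epoch, total_epochs, min_lr=0.0):
    """Decay smoothly from initial_lr to min_lr along half a cosine wave."""
    progress = min(epoch, total_epochs) / total_epochs
    return min_lr + 0.5 * (initial_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Print both schedules at a few epochs to compare their shapes
for epoch in (0, 10, 20, 30):
    print(epoch, step_decay(0.01, epoch), cosine_annealing(0.01, epoch, total_epochs=30))
```

Step decay holds the rate constant and then drops it abruptly, while cosine annealing lowers it a little every epoch. In Keras, a function like these can be wrapped in a lambda and passed to the `tf.keras.callbacks.LearningRateScheduler` callback, which calls it with the epoch index and the current learning rate.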

There are also **pre-written tools** that can help you find a good learning rate, usually called a learning rate finder or a learning rate range test. They plot the loss against the learning rate as in the first method, but do it within a single short training run by automatically increasing the learning rate after every batch. The cell below is a hand-written version of that idea.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Define the learning rate range
lr_start = 1e-7
lr_end = 1
num_steps = 200
batch_size = 32

# Create exponentially spaced learning rates from lr_start to lr_end
learning_rates = np.geomspace(lr_start, lr_end, num_steps)

# Compile a fresh copy of the model once; only the learning rate changes per step
finder_model = tf.keras.models.clone_model(model)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_start)
finder_model.compile(loss='sparse_categorical_crossentropy',
                     optimizer=optimizer,
                     metrics=['accuracy'])

# Store the loss values
losses = []

# Loop over the learning rates
for i, lr in enumerate(learning_rates):
    # Update the optimizer's learning rate in place
    optimizer.learning_rate = float(lr)

    # Train the model for a single step on the next mini-batch
    start = (i * batch_size) % max(1, len(x_train) - batch_size)
    x_batch = x_train[start:start + batch_size]
    y_batch = y_train[start:start + batch_size]
    history = finder_model.fit(x_batch, y_batch, batch_size=batch_size, epochs=1, verbose=0)

    # Record the training loss and stop once it diverges
    loss = history.history['loss'][0]
    if not np.isfinite(loss):
        break
    losses.append(loss)

# Plot the learning rate versus the loss
plt.plot(learning_rates[:len(losses)], losses)
plt.xscale('log')
plt.xlabel('Learning rate')
plt.ylabel('Loss')
plt.show()

# Choose the learning rate with the lowest loss as a rough pick;
# a value somewhat below it is usually a safer choice
best_learning_rate = learning_rates[np.argmin(losses)]
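Losses recorded one batch at a time tend to be noisy, which can make the minimum of the curve misleading. A common remedy, sketched below under the assumption that a simple exponential moving average is good enough, is to smooth the losses before plotting or picking a value. The helper name `smooth_losses` and the sample numbers are hypothetical.

```python
def smooth_losses(losses, beta=0.9):
    """Exponential moving average of the losses, with bias correction."""
    smoothed = []
    running = 0.0
    for step, loss in enumerate(losses, start=1):
        running = beta * running + (1.0 - beta) * loss
        # Dividing by (1 - beta**step) corrects the bias toward the initial zero
        smoothed.append(running / (1.0 - beta ** step))
    return smoothed

# Made-up noisy losses, for illustration only
noisy = [2.3, 2.5, 2.1, 2.2, 1.8, 1.9, 1.4, 1.6, 1.1, 3.0]
print([round(value, 3) for value in smooth_losses(noisy)])
```

A larger `beta` smooths more aggressively but also delays the point at which a genuine blow-up in the loss becomes visible, so it is worth plotting the raw and smoothed curves together.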

Once you've found a promising learning rate, **validate your model** on data it was not trained on. Ideally, compare learning rates on a validation set and keep the test set for one final evaluation, so the reported test accuracy is not biased by the tuning. Keep in mind that the best learning rate depends on the architecture, the optimizer, and the dataset, so repeat the experiment whenever any of them changes.

In [None]:
# Define a list of learning rates to experiment with
learning_rates = [1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1, 5e-1]

# Initialize a list to store the validation accuracy for each learning rate
accuracies = []

# Loop through each learning rate
for lr in learning_rates:
    # Start from freshly initialized weights so the runs are comparable
    trial_model = tf.keras.models.clone_model(model)
    trial_model.compile(loss='sparse_categorical_crossentropy',
                        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                        metrics=['accuracy'])

    # Train the model, holding out the last 20% of the training data for validation
    history = trial_model.fit(x_train, y_train, epochs=10, validation_split=0.2)

    # Record the validation accuracy after the final epoch
    val_acc = history.history['val_accuracy'][-1]
    print('Validation accuracy for learning rate', lr, ':', val_acc)
    accuracies.append(val_acc)

# Find the learning rate with the highest validation accuracy
best_learning_rate = learning_rates[accuracies.index(max(accuracies))]
print('Best learning rate:', best_learning_rate)

# Compile a freshly initialized model with the best learning rate
model = tf.keras.models.clone_model(model)
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=best_learning_rate),
              metrics=['accuracy'])

# Train the model with the best learning rate
model.fit(x_train, y_train, epochs=10)

# Evaluate on the test set, which was not used to choose the learning rate
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy with best learning rate:', test_acc)

# Save the model with the best learning rate
model.save("10animals_best_lr.keras")
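One way to keep the test set untouched while tuning is to carve a validation set out of the training data. Keras's `validation_split` argument takes the last fraction of the arrays without shuffling, so a shuffled split can be safer when the data is ordered by class. The sketch below uses NumPy only; the helper name `train_val_split` and the tiny stand-in arrays are assumptions for illustration.

```python
import numpy as np

def train_val_split(x, y, val_fraction=0.2, seed=0):
    """Shuffle the samples and hold out a fraction of them for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = int(len(x) * val_fraction)
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])

# Tiny stand-in arrays; in the notebook these would be x_train and y_train
demo_x = np.arange(20).reshape(10, 2)
demo_y = np.arange(10)
(train_x, train_y), (val_x, val_y) = train_val_split(demo_x, demo_y)
print(len(train_x), len(val_x))
```

The resulting arrays can be passed to `model.fit` through its `validation_data` argument, leaving `x_test` and `y_test` for a single evaluation at the very end.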

In conclusion, finding the best learning rate usually takes a combination of trial and error, experimentation, and validation. The methods above give you a good starting point, but be prepared to fine-tune the learning rate for your specific use case.

# Popular Image Classification Datasets on Kaggle

The Kaggle image classification datasets are a treasure trove of visual data for all to explore. From the complex patterns of Deepfake images to the simple strokes of the QuickDraw dataset, these collections offer a fascinating glimpse into the ever-evolving world of AI. For those looking to build image classification models for practice, here are some of the most popular image classification datasets available on Kaggle:

- Doodles Dataset: `A collection of hand-drawn sketches from the Google Quick, Draw! project, covering hundreds of object categories to classify.`

- Chest X-Ray Images: `A dataset of over 100,000 chest X-ray images from 15 different classes, with labels that identify the condition diagnosed in each scan.`

- Fruits 360 Dataset: `A dataset of images of over 120 different types of fruits, labeled by fruit type.`

- CelebFaces Attributes Dataset (CelebA): `A dataset of over 200,000 celebrity face images, each annotated with 40 binary attributes such as smiling, eyeglasses, or wavy hair.`

- Deepfake Detection Challenge: `A large dataset of short video clips, mixing real footage with deepfakes generated using artificial intelligence algorithms.`

- Plant Seedlings Dataset: `A dataset of several thousand images of plant seedlings covering 12 species, classified by species.`

- Traffic Sign Recognition: `A dataset of over 40,000 images of traffic signs, categorized by type of sign.`