# Transfer Learning Concepts

You may recall *Practicum AI*'s heroine Amelia, the AI-savvy nutritionist. At the end of our *[Deep Learning Foundations course](https://practicumai.org/courses/deep_learning/)*, Amelia was helping with a computer vision project. Her colleague, an entomologist named Kevin, had a dataset of images of bees and wasps and wanted to classify them.

![Image of bees and wasps from the dataset cover image](https://github.com/PracticumAI/deep_learning/blob/main/images/bees_wasps_dataset-cover.png?raw=true)


## AI Pathway review for Bees vs Wasps

If you have taken our [Getting Started with AI course](https://practicumai.org/courses/getting_started/), you may remember this figure of the AI Application Development Pathway. Let's quickly review how we applied this pathway to the Bees vs Wasps example.

![AI Application Development Pathway image showing the 7 steps in developing an AI application](https://practicumai.org/getting_started/images/application_dev_pathway.png)

1. **Choose a problem to solve:** In this example, we need to classify images as bees, wasps, other insects, or non-insects.
2. **Gather data:** The data for the example come from [Kaggle](https://www.kaggle.com/), a great repository of datasets, code, and models.
3. **Clean and prepare the data:** In the *Deep Learning Foundations* course, we assumed that this was done for us. One issue we ran into was class imbalance: some classes have many more images than others, which led to a poorly performing model.
4. **Choose a model:** In the *Deep Learning Foundations* course, we presented the model with little detail. Now that we know more about Convolutional Neural Networks and some other tools at our disposal, we will explore the model in more detail.
   * As part of the iterative process among this and the next steps, we noticed that most of our models were overfitting: they performed better on the training data than on the testing data. Essentially, the models memorized the training data but did not generalize well to new, unseen data.
      * In this notebook, we will explore **dropout** as one mechanism to mitigate overfitting.
5. **Train the model:** While training the model, we may have run into a few issues. With so many hyperparameters to tune, it's easy to lose track of which combinations have been tried and how each change affected model performance.
   * In this notebook, we introduce you to [TensorBoard](https://www.tensorflow.org/tensorboard), one popular tool in a class of tools known as **experiment tracking** or **MLOps (Machine learning operations) tools**. These tools help track changes to hyperparameters, the training process, and the data. They allow comparison among runs and can even automate multiple runs for you. Learning to use MLOps tools will help you as you continue to learn more about AI workflows.
6. **Evaluate the model:** We will continue to assess how the model performs on the test set and adjust the model and hyperparameters to attempt to produce a better model. However, as noted above in step 3, one issue is the class imbalance.    
   * This is a common issue with real data, and in notebook [02.1_data_imbalance.ipynb](02.1_data_imbalance.ipynb), we will explore some methods to handle this.
7. **Deploy the model:** We won't get to this stage in this exercise, but we hope to end up with a model that achieves relatively good accuracy on the problem and could be deployed.
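
The dropout mechanism mentioned in step 4 can be illustrated outside any framework. The following is a minimal NumPy sketch of *inverted dropout*, the variant in which the surviving activations are scaled up by `1 / (1 - rate)` during training, as the Keras `Dropout` layer does. The function name `dropout` and the toy array of ones are illustrative assumptions, not code from this course's helper files.

```python
import numpy as np


def dropout(activations, rate, training, rng):
    """Inverted dropout: zero a random fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        # At inference time dropout is a no-op.
        return activations
    keep_prob = 1.0 - rate
    # Boolean mask: True for units that are kept in this training step.
    mask = rng.random(activations.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged.
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
x = np.ones((1000, 128))  # toy "activations"

train_out = dropout(x, rate=0.5, training=True, rng=rng)
eval_out = dropout(x, rate=0.5, training=False, rng=rng)

print("Fraction zeroed during training:", np.mean(train_out == 0))
print("Mean activation during training:", train_out.mean())
print("Unchanged at inference:", np.array_equal(eval_out, x))
```

Because a different random subset of units is silenced at every training step, the network cannot rely on any single unit, which is what discourages memorizing the training data.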


## A refresher

If you need a refresher or haven't taken the *Deep Learning Foundations* course, the final notebook is part of this repository: [DLF_03_bees_vs_wasps.ipynb](DLF_03_bees_vs_wasps.ipynb). No need to worry, though: we will cover what we did before and the new changes as we work through this notebook. Some of the code has been moved into the [helpers_01.py](helpers_01.py) file and is imported below to keep things cleaner.

### What, Why, and How: Transfer Learning in Agriculture

**What:**
Transfer learning involves using a model pre-trained on one task and adapting it to a new, related task. In this notebook, we apply transfer learning to agricultural applications, such as plant disease detection.

**Why:**
Training a model from scratch requires extensive data and computational resources, which are often limited in specialized fields like agriculture. Transfer learning allows us to leverage pre-trained models to achieve high performance with fewer resources.

**How:**
We'll demonstrate three approaches:
- Training a baseline model from scratch.
- Fine-tuning a model pre-trained on ImageNet.
- Fine-tuning a model pre-trained on AgriNet, a domain-specific dataset.

### What, Why, and How: Data Preparation

**What:**
We'll prepare a subset of the AgriNet dataset, focusing on plant disease detection. This includes loading images, preprocessing them, and splitting the data into training, validation, and test sets.

**Why:**
High-quality data preparation ensures that our models are trained on consistent, representative datasets, leading to better generalization and performance.

**How:**
Using TensorFlow's `ImageDataGenerator`, we'll resize images to 224x224 pixels, rescale pixel values to the [0, 1] range, and hold out 20% of the images for validation. Augmentation techniques like rotation and flipping can also be added to the generator to increase data variety.
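
As a rough illustration of two of these steps, the NumPy sketch below rescales 8-bit pixel values to the [0, 1] range and mirrors an image left to right. The helper names and the tiny 2x3 "image" are made up for the example; in the notebook itself, `ImageDataGenerator` performs the rescaling through its `rescale` argument.

```python
import numpy as np


def rescale(image_uint8):
    """Map 8-bit pixel values (0-255) to floats in [0, 1]."""
    return image_uint8.astype("float32") / 255.0


def horizontal_flip(image):
    """Mirror an image of shape (height, width, channels) left to right."""
    return image[:, ::-1, :]


# Toy 2x3 RGB image; each pixel repeats one value across its three channels
# so the effect of the flip is easy to read.
toy = np.array(
    [
        [[0] * 3, [128] * 3, [255] * 3],
        [[10] * 3, [20] * 3, [30] * 3],
    ],
    dtype=np.uint8,
)

scaled = rescale(toy)
flipped = horizontal_flip(scaled)

print("First row, red channel:        ", scaled[0, :, 0])
print("First row after flip, red only:", flipped[0, :, 0])
```

A flip changes where things appear in the image but not what they are, which is why it is a common way to add variety without collecting new images.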

# Transfer Learning in Agricultural Applications: A Case Study

This notebook demonstrates the power of transfer learning in agricultural applications using the AgriNet dataset and pre-trained models. We'll compare the performance of:
- A model trained from scratch on a subset of the data.
- A model fine-tuned using ImageNet pre-trained weights.
- A model fine-tuned using AgriNet pre-trained weights.

By the end, you'll gain insights into how domain-specific pre-trained models can enhance performance in agricultural tasks.

### What, Why, and How: Baseline Model

**What:**
We'll train a simple convolutional neural network (CNN) from scratch as a baseline for comparison.

**Why:**
The baseline provides a performance benchmark, allowing us to assess the impact of transfer learning.

**How:**
We'll define a CNN with basic layers, such as convolutional, pooling, and fully connected layers. The model will be compiled with the Adam optimizer and categorical cross-entropy loss, then trained on the dataset.
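
For reference, categorical cross-entropy is the negative log of the probability that the softmax output assigns to the true class, averaged over the batch. The NumPy sketch below works through that calculation on made-up logits for two examples and three classes; the function names and the `eps` clipping value are assumptions for illustration rather than the exact Keras internals.

```python
import numpy as np


def softmax(logits):
    """Turn each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def categorical_cross_entropy(one_hot_labels, probabilities, eps=1e-7):
    """Mean over the batch of -log(probability of the true class)."""
    clipped = np.clip(probabilities, eps, 1.0)  # avoid log(0)
    per_example = -(one_hot_labels * np.log(clipped)).sum(axis=1)
    return per_example.mean()


# Made-up scores for two examples and three classes.
logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.2, 3.0]])
labels = np.array([[1, 0, 0], [0, 0, 1]])  # one-hot true classes

probs = softmax(logits)
loss = categorical_cross_entropy(labels, probs)

# A model that guesses uniformly scores log(3) on a three-class problem.
uniform_loss = categorical_cross_entropy(labels, np.full((2, 3), 1 / 3))

print("Loss for the confident, correct predictions:", loss)
print("Loss for uniform guessing:", uniform_loss)
```

The uniform-guess value is a handy sanity check: a freshly initialized classifier with `n` classes usually starts near a loss of `log(n)`.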

## Data Preparation

We will load a subset of the AgriNet dataset, specifically focused on plant disease detection. The dataset will be split into training, validation, and test sets for model evaluation.
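
The `validation_split` argument used in this notebook produces only training and validation subsets. One way to also hold out a test set is to split the list of image paths before building any generators. The sketch below uses only the standard library; `three_way_split`, the 70/15/15 proportions, and the file names are assumptions for illustration.

```python
import random


def three_way_split(items, val_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle `items` reproducibly and return (train, val, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical file names standing in for real image paths.
paths = [f"leaf_{i:03d}.jpg" for i in range(100)]
train, val, test = three_way_split(paths)

print(len(train), len(val), len(test))
```

This simple version ignores class labels. With imbalanced classes, applying the same split within each class directory keeps the class proportions similar across the three subsets.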

### What, Why, and How: Transfer Learning with ImageNet

**What:**
We'll use the VGG19 model pre-trained on ImageNet and fine-tune it for plant disease detection.

**Why:**
ImageNet pre-trained models have learned general features (e.g., edges, textures) that can be adapted to our specific task. This significantly reduces the training time and data requirements.

**How:**
The base layers of VGG19 will be frozen to retain their pre-trained features. We'll add custom layers for classification and fine-tune the model on our dataset.
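
To get a feel for what freezing buys us, we can count parameters by hand. The sketch below tallies the weights in VGG19's 16 convolutional layers (all with 3x3 kernels) and in the small classification head this notebook attaches: global average pooling, a 128-unit dense layer, and a softmax layer. The class count of 4 is a placeholder assumption; substitute the number of classes in your dataset.

```python
# (input_channels, output_channels) for VGG19's 16 convolutional layers.
VGG19_CONV_LAYERS = (
    [(3, 64), (64, 64)]                      # block 1
    + [(64, 128), (128, 128)]                # block 2
    + [(128, 256)] + [(256, 256)] * 3        # block 3
    + [(256, 512)] + [(512, 512)] * 3        # block 4
    + [(512, 512)] * 4                       # block 5
)


def conv_params(in_ch, out_ch, kernel=3):
    """Weights plus biases of one 2D convolution layer."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """Weights plus biases of one fully connected layer."""
    return in_units * out_units + out_units


frozen = sum(conv_params(i, o) for i, o in VGG19_CONV_LAYERS)

num_classes = 4  # placeholder; use len(train_gen.class_indices) in the notebook
# Global average pooling has no parameters and outputs 512 values per image.
trainable = dense_params(512, 128) + dense_params(128, num_classes)

print(f"Frozen base parameters:    {frozen:,}")
print(f"Trainable head parameters: {trainable:,}")
print(f"Share that is trained:     {trainable / (frozen + trainable):.2%}")
```

With the base frozen, only the head's parameters receive gradient updates, which is a small fraction of the network and the main reason training is faster and needs less data.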

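```python
import os
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import matplotlib.pyplot as plt

# Path to the dataset; flow_from_directory expects one subdirectory per class
data_dir = "/path/to/agri_subset"  # Replace with actual path

# Rescale pixel values to [0, 1] and reserve 20% of the images for validation
datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)

# Training subset (80% of the images) with one-hot encoded labels
train_gen = datagen.flow_from_directory(
    data_dir,
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
    subset="training",
)

# Validation subset (the remaining 20%)
val_gen = datagen.flow_from_directory(
    data_dir,
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical",
    subset="validation",
)
```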

### What, Why, and How: Transfer Learning with AgriNet

**What:**
We'll use the VGG19 model pre-trained on the AgriNet dataset, which is domain-specific to agriculture.

**Why:**
Domain-specific pre-training captures features relevant to agricultural tasks, such as plant patterns and disease characteristics, which can further improve model performance compared to generic pre-trained models.

**How:**
Similar to the ImageNet approach, we'll freeze the base layers of the AgriNet model, add custom classification layers, and fine-tune the model on our dataset.

## Baseline Model

We'll train a simple convolutional neural network from scratch and use it as our baseline for performance comparison.

### What, Why, and How: Performance Comparison

**What:**
We'll compare the performance of the three models (baseline, ImageNet pre-trained, and AgriNet pre-trained) using metrics like accuracy and F1-score.

**Why:**
This step helps quantify the benefits of transfer learning and highlights the impact of using domain-specific pre-trained models.

**How:**
We'll evaluate each model on the test set and visualize the results using performance metrics and charts.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Define baseline model: one convolution block followed by a small dense classifier
baseline_model = Sequential(
    [
        Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
        MaxPooling2D(2, 2),
        Flatten(),
        Dense(128, activation="relu"),
        Dropout(0.5),  # Randomly drop half of the units during training to reduce overfitting
        Dense(len(train_gen.class_indices), activation="softmax"),  # One output per class
    ]
)

# Categorical cross-entropy matches the one-hot labels produced by the generators
baseline_model.compile(
    optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
)

# Train the baseline model
history_baseline = baseline_model.fit(train_gen, validation_data=val_gen, epochs=10)
```

### Conclusion: Key Insights

- Transfer learning significantly improves performance compared to training from scratch, especially with limited data.
- Domain-specific pre-training (e.g., AgriNet) can further enhance accuracy and generalization for specialized tasks.
- These findings demonstrate the importance of transfer learning in tackling real-world challenges in agriculture.

## Transfer Learning with ImageNet

We'll use a pre-trained VGG19 model with ImageNet weights and fine-tune it on our dataset.

```python
from tensorflow.keras.applications import VGG19
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D

# Load the VGG19 convolutional base with ImageNet weights (no classification head).
# Note: these weights were trained on inputs prepared with
# tensorflow.keras.applications.vgg19.preprocess_input, whereas the generators
# above rescale pixels to [0, 1]; matching the original preprocessing may help.
imagenet_base = VGG19(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Freeze base layers so the pre-trained features are not updated
for layer in imagenet_base.layers:
    layer.trainable = False

# Add custom top layers
x = GlobalAveragePooling2D()(imagenet_base.output)
x = Dense(128, activation="relu")(x)
x = Dropout(0.5)(x)
output = Dense(len(train_gen.class_indices), activation="softmax")(x)

imagenet_model = Model(inputs=imagenet_base.input, outputs=output)

imagenet_model.compile(
    optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
)

# Train the model (only the new top layers are updated)
history_imagenet = imagenet_model.fit(train_gen, validation_data=val_gen, epochs=10)
```

## Transfer Learning with AgriNet

We'll take the VGG19 model pre-trained on AgriNet and fine-tune it on the same dataset.

```python
# Assuming AgriNet weights are available locally
agri_weights_path = "/path/to/agri_vgg19_weights.h5"  # Replace with actual path

# Build the VGG19 convolutional base without any pre-trained weights
agri_base = VGG19(weights=None, include_top=False, input_shape=(224, 224, 3))

# Load AgriNet weights; the file must contain weights for the same
# no-top VGG19 architecture, or loading will fail
agri_base.load_weights(agri_weights_path)

# Freeze base layers so the pre-trained features are not updated
for layer in agri_base.layers:
    layer.trainable = False

# Add custom top layers (same head as the ImageNet model for a fair comparison)
x = GlobalAveragePooling2D()(agri_base.output)
x = Dense(128, activation="relu")(x)
x = Dropout(0.5)(x)
output = Dense(len(train_gen.class_indices), activation="softmax")(x)

agri_model = Model(inputs=agri_base.input, outputs=output)

agri_model.compile(
    optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
)

# Train the model (only the new top layers are updated)
history_agri = agri_model.fit(train_gen, validation_data=val_gen, epochs=10)
```

## Performance Comparison

Let's compare the performance metrics (accuracy, F1-score) of the three models:
- Baseline model (trained from scratch)
- Transfer learning with ImageNet pre-trained weights
- Transfer learning with AgriNet pre-trained weights
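
Below is a minimal sketch of how these two metrics could be computed from true and predicted class indices, using only NumPy. The function names and the toy label arrays are illustrative assumptions, not results from the models above; in practice the predictions would come from something like `model.predict(...)` followed by an `argmax` over the class axis.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted class matches the true class."""
    return float(np.mean(y_true == y_pred))


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


# Toy labels for six images and three classes (made up for illustration).
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

print("Accuracy:", accuracy(y_true, y_pred))
print("Macro F1:", macro_f1(y_true, y_pred, num_classes=3))
```

Macro averaging gives every class the same weight regardless of how many images it has, so it is a useful complement to accuracy when classes are imbalanced.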

## Conclusion

In this notebook, we demonstrated the benefits of transfer learning in agricultural tasks. The AgriNet pre-trained model outperformed the ImageNet model and the baseline, showing the importance of domain-specific pre-training for specialized applications.