# Transfer Learning Evaluation and Optimization

In this notebook, we evaluate the models trained in the previous module using accuracy, precision, recall, F1-score, and confusion matrices. We then critique how suitable each model is for a feature extraction task.

### 1. Import Required Libraries

We will start by importing the necessary libraries for evaluation and visualization.

In [1]:
import os
import torch
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix, ConfusionMatrixDisplay



### 2. Load Pre-trained Models

We start by recording the file paths of the three pre-trained models provided for this module. Two of them are a poor match for the transfer task and one is better suited.

In [None]:
# Paths to the models
model_paths = {
    'model_1': 'models/model_1.pt',
    'model_2': 'models/model_2.pt',
    'model_3': 'models/model_3.pt'
}
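
The cell above only records file paths. A minimal sketch of the loading step follows, assuming each `.pt` file holds a whole `nn.Module` saved with `torch.save(model, path)`. If the files hold a `state_dict` instead, the architecture has to be built first and filled in with `load_state_dict`. The helper name `load_models` is hypothetical.

```python
import os
import torch

def load_models(model_paths, device="cpu"):
    """Load every model file that exists and return (models, missing).

    Assumes each file stores a full nn.Module, not a state_dict.
    """
    models, missing = {}, []
    for name, path in model_paths.items():
        if not os.path.exists(path):
            missing.append(name)  # report missing files instead of raising
            continue
        # weights_only=False is needed to unpickle a full module on recent
        # PyTorch versions; only use it for files from a trusted source.
        model = torch.load(path, map_location=device, weights_only=False)
        model.eval()  # switch to inference mode
        models[name] = model
    return models, missing
```

Calling `models, missing = load_models(model_paths)` would return the loaded models keyed by name, plus the names of any files that were not found.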

### 3. Evaluate Models

We evaluate each model with accuracy, precision, recall, and F1-score, and generate a confusion matrix to show which classes each model confuses.
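
To make these metrics concrete, here is a minimal sketch that computes precision, recall, and F1 for a single class directly from label lists. The helper `per_class_metrics` and the toy labels are illustrative only and are not part of the module's data.

```python
def per_class_metrics(true_labels, predictions, cls):
    """Return (precision, recall, f1), treating `cls` as the positive class."""
    pairs = list(zip(true_labels, predictions))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)  # true positives
    fp = sum(1 for t, p in pairs if t != cls and p == cls)  # false positives
    fn = sum(1 for t, p in pairs if t == cls and p != cls)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Toy labels for illustration only.
toy_true = [0, 0, 1, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 0, 2]
precision, recall, f1 = per_class_metrics(toy_true, toy_pred, cls=1)
```

Precision is the share of predictions for a class that are correct, recall is the share of that class's true examples that were found, and F1 is their harmonic mean.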

In [None]:
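def evaluate_model(model, dataloader, device):
    """Run inference over a dataloader and return (predictions, true_labels)."""
    model.to(device)  # keep the model on the same device as the inputs
    model.eval()  # disable dropout and use running batch-norm statistics
    predictions, true_labels = [], []
    with torch.no_grad():  # gradients are not needed for evaluation
        for inputs, labels in dataloader:
            inputs = inputs.to(device)
            outputs = model(inputs)  # assumed shape: (batch_size, num_classes)
            preds = outputs.argmax(dim=1)  # index of the highest-scoring class
            predictions.extend(preds.cpu().tolist())
            true_labels.extend(labels.tolist())
    return predictions, true_labels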
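
The two lists returned by `evaluate_model` can be passed straight to scikit-learn. A minimal sketch, using short hand-written lists as stand-ins for real model output:

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Stand-ins for the (predictions, true_labels) returned by evaluate_model.
true_labels = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]

accuracy = accuracy_score(true_labels, predictions)
# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(true_labels, predictions)
# zero_division=0 avoids warnings when a class is never predicted.
report = classification_report(true_labels, predictions, zero_division=0)

print(f"accuracy: {accuracy:.3f}")
print(report)
print(cm)
```

In a notebook, `ConfusionMatrixDisplay(confusion_matrix=cm).plot()` followed by `plt.show()` renders the matrix as a heatmap.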

### 4. Critique Model Suitability

Based on the evaluation results, we will critique the suitability of each model for the feature extraction task.
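
One way to ground the critique is to rank the models on a single summary score. The sketch below uses macro-averaged F1, which weights every class equally. That choice of metric, the helper name `rank_by_macro_f1`, and the toy labels are assumptions for illustration, not results from the module's models.

```python
from sklearn.metrics import f1_score

def rank_by_macro_f1(results):
    """Rank models by macro F1, best first.

    `results` maps a model name to (predictions, true_labels).
    """
    scores = {
        name: f1_score(true_labels, predictions, average="macro", zero_division=0)
        for name, (predictions, true_labels) in results.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy inputs only; real values would come from evaluate_model.
toy_results = {
    "model_a": ([0, 1, 1, 0], [0, 1, 1, 0]),  # every prediction correct
    "model_b": ([0, 0, 0, 0], [0, 1, 1, 0]),  # always predicts class 0
}
ranking = rank_by_macro_f1(toy_results)
```

A single score hides per-class behaviour, so the ranking is best read alongside the confusion matrices rather than in place of them.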

### 5. Conclusion

Summarize the findings and discuss potential improvements for the models that performed poorly.