# Assignment 4. Deep Learning

*Foundations of Data Science*  
*Dr. Khalaj (Fall 2023)*  

*For questions 2-4 refer to @alregamo on Telegram.*

### Description  
This homework consists of four questions, each aimed at one category in the world of Deep Learning.   
1. Getting familiarized with sentiment analysis (A subject also covered in the course project).
   
2. Multi-layer perceptron (MLP). 
   
3. Convolutional Neural Networks (CNN).
   
4. Variational Autoencoders (VAE).

### Information  
Complete the information box below.

### Full Name : Parishad Mokhber
### Student Number : 98100537

## 3 Convolutional Neural Networks (CNN)

In this problem, you are going to compare the results of a simple CNN with a pre-trained deep learning model such as VGG16 for a classification task.

For this purpose, we use the publicly available CIFAR-10 dataset, a popular benchmark in machine learning for image recognition tasks. Here are the key points about this dataset:

1. **Content**: The CIFAR-10 dataset consists of 60,000 32x32 color images. These images are divided into 10 different classes, representing different objects. The classes are airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.

2. **Structure**: The dataset is split into two parts: 50,000 images for training and 10,000 images for testing. Each class in the dataset is represented equally, with 6,000 images per class.

3. **Purpose**: CIFAR-10 is widely used for training and evaluating machine learning and image processing systems. It's a benchmark dataset for developing and testing machine learning algorithms, especially in the field of computer vision.

4. **Challenge**: The relatively low resolution of the images (32x32 pixels) makes it a challenging dataset for image classification tasks. The small size of the images means that the details that distinguish between the classes can be quite subtle.

### Data Loading

Load the dataset with <code>torchvision.datasets</code> or <code>tensorflow.keras.datasets</code> and split the data into training and test sets.

In [1]:
import torch
from torchvision import datasets, transforms

# Define a transform to normalize the data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load the training data
trainset = datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=16,
                                          shuffle=True, num_workers=4)

# Load the test data
testset = datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=16,
                                         shuffle=False, num_workers=2)

# Classes in CIFAR10
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')


Files already downloaded and verified
Files already downloaded and verified
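
As a quick sanity check on the transform above: `ToTensor` scales 8-bit pixel values to [0, 1], and `Normalize` with mean 0.5 and standard deviation 0.5 then maps each channel to roughly [-1, 1] via (x - mean) / std. The sketch below reproduces that arithmetic in plain Python. The helper name `normalize` is illustrative and not part of torchvision.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Per-channel formula applied by transforms.Normalize: (x - mean) / std."""
    return (pixel - mean) / std

# ToTensor maps raw 8-bit values 0..255 to 0.0..1.0 before Normalize runs.
raw_values = [0, 128, 255]
scaled = [v / 255 for v in raw_values]
normalized = [normalize(s) for s in scaled]

for raw, norm in zip(raw_values, normalized):
    print(f"raw={raw:3d} -> normalized={norm:+.4f}")
```

Black pixels land at -1, white pixels at +1, and mid-gray close to 0.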


## CNN Model

Build a simple 3-layer CNN model that takes CIFAR-10 images as input and classifies them. Feel free to use <code>BatchNorm</code> or <code>Pooling</code> layers between your <code>Conv</code> layers. Use 2 fully connected <code>Linear</code> or <code>Dense</code> layers for classification.

After building your model, print a summary of your architecture using <code>model.summary()</code> in Keras or the <code>torchsummary</code> library for PyTorch models.

In [4]:
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        
        # Convolutional layer (sees 32x32x3 image tensor)
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
        # Batch normalization
        self.conv1_bn = nn.BatchNorm2d(16)
        # Max pooling layer
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        
        # Convolutional layer (sees 16x16x16 tensor)
        self.conv2 = nn.Conv2d(16, 32, 3, padding=1)
        # Batch normalization
        self.conv2_bn = nn.BatchNorm2d(32)

        # Convolutional layer (sees 8x8x32 tensor)
        self.conv3 = nn.Conv2d(32, 64, 3, padding=1)
        # Batch normalization
        self.conv3_bn = nn.BatchNorm2d(64)

        # Fully connected layer (sees 4x4x64 tensor)
        self.fc1 = nn.Linear(64 * 4 * 4, 512)
        # Fully connected layer
        self.fc2 = nn.Linear(512, 10) # Assuming 10 classes

    def forward(self, x):
        # Add sequence of convolutional and max pooling layers
        x = self.pool(F.relu(self.conv1_bn(self.conv1(x))))
        x = self.pool(F.relu(self.conv2_bn(self.conv2(x))))
        x = self.pool(F.relu(self.conv3_bn(self.conv3(x))))

        # Flatten image input
        x = x.view(-1, 64 * 4 * 4)
        
        # Add fully connected layers
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        
        return x

# Create the CNN model
model = CNN().cuda()
print(model)


CNN(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv1_bn): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv2_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv3_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (fc1): Linear(in_features=1024, out_features=512, bias=True)
  (fc2): Linear(in_features=512, out_features=10, bias=True)
)
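
The `in_features=1024` of `fc1` follows from how the spatial size shrinks through the network. A minimal sketch of that arithmetic, assuming dilation 1 and the standard output-size formula floor((n + 2p - k) / s) + 1 (the helper `out_size` is a name introduced here for illustration):

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1

size = 32  # CIFAR-10 images are 32x32
sizes = [size]
for _ in range(3):
    size = out_size(size, kernel=3, stride=1, padding=1)  # 3x3 conv with padding 1 keeps the size
    size = out_size(size, kernel=2, stride=2)             # 2x2 max pool with stride 2 halves it
    sizes.append(size)

flattened = 64 * size * size  # conv3 produces 64 channels
print(sizes, flattened)
```

The sizes go 32, 16, 8, 4, so the flattened vector fed to `fc1` has 64 * 4 * 4 = 1024 entries.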


In [6]:
def print_model_parameters(model):
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    
    print(f"Total parameters: {total_params}")
    print(f"Trainable parameters: {trainable_params}")

# Example usage with a model
print_model_parameters(model)  # Report the parameter counts of the CNN defined above


Total parameters: 553738
Trainable parameters: 553738
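
The reported total can be cross-checked by hand. The sketch below counts the parameters layer by layer, assuming each convolution and linear layer has a bias and each batch-norm layer has one learnable scale and one shift per channel (running statistics are buffers, not parameters). The helper names are illustrative.

```python
def conv_params(c_in, c_out, k):
    return (k * k * c_in + 1) * c_out  # kernel weights plus one bias per output channel

def bn_params(channels):
    return 2 * channels  # learnable scale and shift per channel

def linear_params(n_in, n_out):
    return (n_in + 1) * n_out  # weight matrix plus bias vector

counts = {
    "conv1": conv_params(3, 16, 3),
    "conv1_bn": bn_params(16),
    "conv2": conv_params(16, 32, 3),
    "conv2_bn": bn_params(32),
    "conv3": conv_params(32, 64, 3),
    "conv3_bn": bn_params(64),
    "fc1": linear_params(64 * 4 * 4, 512),
    "fc2": linear_params(512, 10),
}
total = sum(counts.values())
print(counts)
print(f"Total parameters: {total}")
```

Almost all of the weights sit in `fc1`, which is common for small CNNs with a wide first dense layer.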


### Train your Model

Train your model for 20 epochs using the Adam optimizer. Plot the accuracy curves for your training and test data during training, and plot the loss curves as well.

You can use interactive tools such as <code>tensorboard</code> for these visualizations.

In [5]:
import torch
import torch.optim as optim
import torch.nn as nn
import tqdm

# Check if GPU is available and set the device accordingly
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Transfer the model to GPU
model = CNN().to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Function for calculating accuracy
def get_accuracy(logit, target, batch_size):
    ''' Obtain accuracy for training round '''
    corrects = (torch.max(logit, 1)[1].view(target.size()).data == target.data).sum()
    accuracy = 100.0 * corrects / batch_size
    return accuracy.item()

# Training the model
for epoch in tqdm.tqdm(range(20)):  # loop over the dataset multiple times
    running_loss = 0.0
    running_accuracy = 0.0
    model.train()  # Set model to training mode

    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # accumulate statistics for the epoch summary
        running_loss += loss.item()
        running_accuracy += get_accuracy(outputs, labels, inputs.size(0))

    model.eval()  # Set model to evaluation mode for validation
    val_loss = 0.0
    val_accuracy = 0.0

    with torch.no_grad():
        for data in testloader:
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            val_loss += loss.item()
            val_accuracy += get_accuracy(outputs, labels, inputs.size(0))

    print(f"Epoch {epoch+1}, Train loss: {running_loss/len(trainloader)}, Train Accuracy: {running_accuracy/len(trainloader)}")
    print(f"Epoch {epoch+1}, Validation loss: {val_loss/len(testloader)}, Validation Accuracy: {val_accuracy/len(testloader)}")

print('Finished Training')

Using device: cuda:0


  0%|          | 0/20 [00:00<?, ?it/s]

  5%|▌         | 1/20 [00:27<08:47, 27.77s/it]

Epoch 1, Train loss: 1.2100104375362397, Train Accuracy: 56.48
Epoch 1, Validation loss: 0.9460107670783997, Validation Accuracy: 66.91


 10%|█         | 2/20 [00:56<08:30, 28.37s/it]

Epoch 2, Train loss: 0.8688630001354217, Train Accuracy: 69.266
Epoch 2, Validation loss: 0.8330921003103257, Validation Accuracy: 71.19


 15%|█▌        | 3/20 [01:25<08:09, 28.82s/it]

Epoch 3, Train loss: 0.7349911655282975, Train Accuracy: 74.022
Epoch 3, Validation loss: 0.8031288730859757, Validation Accuracy: 72.6


 20%|██        | 4/20 [01:55<07:47, 29.19s/it]

Epoch 4, Train loss: 0.6387602717852593, Train Accuracy: 77.676
Epoch 4, Validation loss: 0.7839220308661461, Validation Accuracy: 74.01


 25%|██▌       | 5/20 [02:26<07:25, 29.67s/it]

Epoch 5, Train loss: 0.5582013236474991, Train Accuracy: 80.5
Epoch 5, Validation loss: 0.7199432440757751, Validation Accuracy: 75.8


 30%|███       | 6/20 [02:55<06:53, 29.57s/it]

Epoch 6, Train loss: 0.48502898522734644, Train Accuracy: 82.86
Epoch 6, Validation loss: 0.7304063143372536, Validation Accuracy: 76.08


 35%|███▌      | 7/20 [03:26<06:30, 30.06s/it]

Epoch 7, Train loss: 0.42728957171797755, Train Accuracy: 84.992
Epoch 7, Validation loss: 0.7476140356123447, Validation Accuracy: 76.45


 40%|████      | 8/20 [03:54<05:52, 29.38s/it]

Epoch 8, Train loss: 0.3733110846364498, Train Accuracy: 86.682
Epoch 8, Validation loss: 0.7701577781915665, Validation Accuracy: 75.75


 45%|████▌     | 9/20 [04:21<05:15, 28.70s/it]

Epoch 9, Train loss: 0.3260544341531396, Train Accuracy: 88.498
Epoch 9, Validation loss: 0.8184497742891311, Validation Accuracy: 75.53


 50%|█████     | 10/20 [04:50<04:46, 28.69s/it]

Epoch 10, Train loss: 0.2844632201051712, Train Accuracy: 89.896
Epoch 10, Validation loss: 0.8463505880072713, Validation Accuracy: 75.85


 55%|█████▌    | 11/20 [05:20<04:22, 29.12s/it]

Epoch 11, Train loss: 0.2462750483454764, Train Accuracy: 91.282
Epoch 11, Validation loss: 0.88678726978302, Validation Accuracy: 75.88


 60%|██████    | 12/20 [05:48<03:51, 28.89s/it]

Epoch 12, Train loss: 0.22102047663345933, Train Accuracy: 92.046
Epoch 12, Validation loss: 0.9341981368228793, Validation Accuracy: 75.54


 65%|██████▌   | 13/20 [06:20<03:28, 29.72s/it]

Epoch 13, Train loss: 0.19833665776602924, Train Accuracy: 92.976
Epoch 13, Validation loss: 1.0016332280457019, Validation Accuracy: 75.56


 70%|███████   | 14/20 [06:49<02:56, 29.41s/it]

Epoch 14, Train loss: 0.17213695020347833, Train Accuracy: 93.904
Epoch 14, Validation loss: 1.0498228014439344, Validation Accuracy: 75.51


 75%|███████▌  | 15/20 [07:17<02:25, 29.01s/it]

Epoch 15, Train loss: 0.1611183960681036, Train Accuracy: 94.308
Epoch 15, Validation loss: 1.1041414007849992, Validation Accuracy: 75.37


 80%|████████  | 16/20 [07:47<01:57, 29.48s/it]

Epoch 16, Train loss: 0.14480305770784616, Train Accuracy: 94.776
Epoch 16, Validation loss: 1.1224643905371428, Validation Accuracy: 75.29


 85%|████████▌ | 17/20 [08:18<01:29, 29.70s/it]

Epoch 17, Train loss: 0.13652133431114255, Train Accuracy: 95.126
Epoch 17, Validation loss: 1.123552916648984, Validation Accuracy: 75.94


 90%|█████████ | 18/20 [08:46<00:58, 29.28s/it]

Epoch 18, Train loss: 0.12118477410771651, Train Accuracy: 95.638
Epoch 18, Validation loss: 1.2518049395352602, Validation Accuracy: 75.48


 95%|█████████▌| 19/20 [09:15<00:29, 29.31s/it]

Epoch 19, Train loss: 0.11337628873200621, Train Accuracy: 95.994
Epoch 19, Validation loss: 1.2481394139915705, Validation Accuracy: 75.28


100%|██████████| 20/20 [09:43<00:00, 29.17s/it]

Epoch 20, Train loss: 0.10935036741292803, Train Accuracy: 96.248
Epoch 20, Validation loss: 1.3161755866914988, Validation Accuracy: 75.01
Finished Training
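
One way to read the log above: training loss keeps falling while validation loss starts rising after a few epochs, which suggests overfitting. The sketch below locates the best epoch by each validation metric. The values are copied from the log and rounded, and the variable names are introduced here for illustration.

```python
# Validation metrics per epoch, rounded from the training log above.
val_loss = [0.9460, 0.8331, 0.8031, 0.7839, 0.7199, 0.7304, 0.7476, 0.7702,
            0.8184, 0.8464, 0.8868, 0.9342, 1.0016, 1.0498, 1.1041, 1.1225,
            1.1236, 1.2518, 1.2481, 1.3162]
val_acc = [66.91, 71.19, 72.60, 74.01, 75.80, 76.08, 76.45, 75.75,
           75.53, 75.85, 75.88, 75.54, 75.56, 75.51, 75.37, 75.29,
           75.94, 75.48, 75.28, 75.01]

# Epochs are 1-indexed in the log, hence the + 1.
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_acc_epoch = max(range(len(val_acc)), key=val_acc.__getitem__) + 1

print(f"Lowest validation loss at epoch {best_loss_epoch}: {val_loss[best_loss_epoch - 1]}")
print(f"Highest validation accuracy at epoch {best_acc_epoch}: {val_acc[best_acc_epoch - 1]}")
```

Stopping early or adding regularization such as dropout or data augmentation are common responses to this pattern, though neither was tried in this notebook.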





### Evaluate your Model

Now that you have trained your model, do the following:

* plot your model's confusion matrix on the test set.
* report its final accuracy on your test set.
* show some images from the test set with their corresponding true label and your predictions.
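
The notebook does not include code for these steps. A minimal sketch of the confusion matrix and final accuracy is given below, using NumPy and toy labels in place of the real test-set predictions. The function `confusion_matrix` is defined here for illustration and is not a library call.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # Unbuffered add so repeated (true, pred) pairs are all counted.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

# Toy labels standing in for the collected test-set labels and predictions.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
accuracy = 100.0 * np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal
print(cm)
print(f"Accuracy: {accuracy:.2f}%")
```

With the trained model, the predictions could be gathered by looping over `testloader` under `torch.no_grad()` and collecting `outputs.argmax(dim=1).cpu().numpy()` together with the labels, then calling the function with `num_classes=10`.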

## VGG16 Model and Transfer Learning

VGG16 is a popular convolutional neural network (CNN) architecture that was introduced by Karen Simonyan and Andrew Zisserman from the University of Oxford in a 2014 paper titled "Very Deep Convolutional Networks for Large-Scale Image Recognition." Here are the key points about the VGG16 model:

1. **Architecture**: VGG16 is named for its 16 layers that have weights. The architecture is characterized by its simplicity, using only 3x3 convolutional layers stacked on top of each other in increasing depth. Reducing volume size is handled by max pooling. The final architecture includes several fully connected layers.

2. **Uniform Design**: One of the defining aspects of VGG16 is its uniformity. All hidden layers use the same 3x3 convolutional filters with a stride of 1 and the same max pooling filters of 2x2 with a stride of 2. This consistency makes the architecture easy to scale and adapt.

3. **Depth**: The depth of the network (16 layers) was a significant feature at the time of its introduction. The increased depth helps the network to learn more complex patterns in the data.

4. **Performance**: In the ImageNet competition, which is a benchmark in image classification, VGG16 significantly improved upon the architectures that had been used previously, demonstrating the power of deeper neural networks.

5. **Applications**: VGG16, and its larger counterpart VGG19, are widely used in image processing. They are used both as standalone models for image classification tasks and as feature extraction parts of larger models in more complex tasks.

6. **Transfer Learning**: Due to its simplicity and high performance on benchmark datasets, VGG16 is often used as a pre-trained model for transfer learning, especially in tasks where training data might be limited. In this context, VGG16 trained on a large dataset like ImageNet is adapted to a new task with a relatively small amount of new data.

7. **Resource Intensity**: One downside of VGG16 is that it is resource-intensive, both in terms of the number of parameters and computation. This can make it less practical for deployment in resource-constrained environments.

VGG16 represents a key milestone in the development of deep learning architectures for image recognition, and it remains a popular choice for both academic and practical applications in the field of computer vision.

Here we want to use a VGG16 pre-trained model (trained on the ImageNet dataset) and use a transfer learning approach to fine-tune the model for our dataset. 

Fine-tuning a pre-trained VGG16 model on the CIFAR-10 dataset is a common practice in deep learning, especially to demonstrate the power of transfer learning. The key concepts are summarized below.

#### Understanding Transfer Learning and Fine-Tuning
- **Transfer Learning**: It's a technique where a model developed for one task is reused as the starting point for a model on a second task. It's especially popular in deep learning where large models take a lot of resources to train.
- **Fine-Tuning**: Involves tweaking the pre-trained model slightly to adapt it to a new, but similar task. In this case, fine-tuning a VGG16 model pre-trained on ImageNet to work on CIFAR-10.
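
The two ideas above can be sketched on a toy model: a small frozen "backbone" stands in for the pretrained network, and only a new classification head is trained. The layer sizes and names are assumptions made for illustration, not the VGG setup used later.

```python
import torch
import torch.nn as nn

# Toy stand-ins: a small "pretrained" backbone and a fresh 10-class head.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
head = nn.Linear(8, 10)
model = nn.Sequential(backbone, head)

# Freeze the backbone so its weights receive no gradients.
for p in backbone.parameters():
    p.requires_grad = False

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

# One training step on random data shaped like a CIFAR-10 batch.
x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
before = backbone[0].weight.detach().clone()

loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
frozen_unchanged = torch.equal(before, backbone[0].weight)
print(f"Trainable parameters: {trainable}, backbone unchanged: {frozen_unchanged}")
```

Fine-tuning more aggressively would mean setting `requires_grad = True` on some backbone layers and passing them to the optimizer as well, usually with a smaller learning rate.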


### Building your Model

Import the VGG16 model from TensorFlow or PyTorch and load it with pre-trained weights.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models

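# Load a pretrained VGG19 model (the deeper sibling of VGG16 is used in this solution)
# Note: torchvision 0.13+ deprecates `pretrained=True` in favor of
# `weights=models.VGG19_Weights.IMAGENET1K_V1`.
vgg19 = models.vgg19(pretrained=True)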

CIFAR-10 images are 32x32 pixels, much smaller than the ImageNet images VGG16 was trained on (224x224 pixels). Decide on a strategy to handle this (e.g., resize CIFAR-10 images or modify the VGG16 input layer). Also, CIFAR-10 images need to be preprocessed to be compatible with VGG16. This includes normalizing pixel values in the same way as was done for the ImageNet images.

For these preprocessing steps, you can use <code>torchvision.transforms</code> in PyTorch or <code>tensorflow.keras.preprocessing.image.ImageDataGenerator</code> in TensorFlow.

Besides, you need to replace the output layer (or fully connected layers) of VGG16 to match the number of classes in CIFAR-10 (10 classes). This is because the original VGG16 model output is designed for 1,000 classes (ImageNet).
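
To see why the input size matters, note that the VGG feature extractor contains five 2x2 max-pooling layers with stride 2, while its 3x3 convolutions with padding 1 preserve the spatial size. The sketch below traces the resulting feature-map size for ImageNet-sized and CIFAR-10-sized inputs. The helper `vgg_feature_size` is a name introduced here for illustration.

```python
def vgg_feature_size(input_size, num_pools=5):
    """Spatial size after the VGG feature extractor: each max pool halves it."""
    size = input_size
    for _ in range(num_pools):
        size //= 2
    return size

channels = 512  # channel count of the last VGG convolutional block
for input_size in (224, 32):
    out = vgg_feature_size(input_size)
    print(f"{input_size}x{input_size} input -> {out}x{out}x{channels} features "
          f"({channels * out * out} values)")
```

A 224x224 image yields a 7x7x512 feature map, which is what the original classifier expects, whereas a 32x32 image is reduced to 1x1x512. Resizing the inputs (for example with `transforms.Resize(224)`) or replacing the pooling and classifier layers are two possible ways to reconcile the mismatch.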

In [27]:
# Freeze the convolutional layers
for param in vgg19.features.parameters():
    param.requires_grad = False

# Pool the features to 2x2 instead of the default 7x7
vgg19.avgpool = nn.AdaptiveAvgPool2d((2, 2))

# Replace the classifier with a new, smaller head
vgg19.classifier = nn.Sequential(
    nn.Linear(512 * 2 * 2, 1024),  # Input features match the 2x2x512 output of avgpool
    nn.ReLU(),
    nn.Dropout(),
    nn.Linear(1024, 1024),
    nn.ReLU(),
    nn.Dropout(),
    nn.Linear(1024, 10)  # CIFAR10 has 10 classes
)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
vgg19 = vgg19.to(device)
# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(vgg19.classifier.parameters(), lr=0.001)  # Only optimize the classifier parameters

In [28]:
def print_model_parameters(model):
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    
    print(f"Total parameters: {total_params}")
    print(f"Trainable parameters: {trainable_params}")

# Example usage with a model
print_model_parameters(vgg19)  # Replace 'vgg19' with your model's name

Total parameters: 23182410
Trainable parameters: 3158026


In [25]:
def get_accuracy(logit, target, batch_size):
    ''' Obtain accuracy for training round '''
    corrects = (torch.max(logit, 1)[1].view(target.size()).data == target.data).sum()
    accuracy = 100.0 * corrects / batch_size
    return accuracy.item()

In [30]:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Load the training data
trainset = datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=16,
                                          shuffle=True, num_workers=4)

# Load the test data
testset = datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=16,
                                         shuffle=False, num_workers=2)

Files already downloaded and verified
Files already downloaded and verified


### Training your Model
Train the model on the CIFAR-10 training data for 20 epochs using the Adam optimizer. Remember that you only need to update the weights of the unfrozen layers to adapt the model to the CIFAR-10 dataset.

Plot the accuracy curves for your training and test data during training, and plot the loss curves as well.

You can use interactive tools such as <code>tensorboard</code> for these visualizations.

In [31]:
import tqdm

losses_train = []
losses_valid = []
acc_train = []
acc_valid = []

for epoch in tqdm.tqdm(range(20)):  # loop over the dataset multiple times
    running_loss = 0.0
    running_accuracy = 0.0
    vgg19.train()  # Set model to training mode

    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        inputs, labels = inputs.to(device), labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = vgg19(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # accumulate statistics for the epoch summary
        running_loss += loss.item()
        running_accuracy += get_accuracy(outputs, labels, inputs.size(0))

    vgg19.eval()  # Set model to evaluation mode for validation
    val_loss = 0.0
    val_accuracy = 0.0
    
    with torch.no_grad():
        for data in testloader:
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = vgg19(inputs)
            loss = criterion(outputs, labels)
            val_loss += loss.item()
            val_accuracy += get_accuracy(outputs, labels, inputs.size(0))

    print(f"Epoch {epoch+1}, Train loss: {running_loss/len(trainloader)}, Train Accuracy: {running_accuracy/len(trainloader)}")
    print(f"Epoch {epoch+1}, Validation loss: {val_loss/len(testloader)}, Validation Accuracy: {val_accuracy/len(testloader)}")

    losses_train.append(running_loss/len(trainloader))
    losses_valid.append(val_loss/len(testloader))
    acc_train.append(running_accuracy/len(trainloader))
    acc_valid.append(val_accuracy/len(testloader))
print('Finished Training')

  5%|▌         | 1/20 [00:28<09:06, 28.74s/it]

Epoch 1, Train loss: 1.3690724831676484, Train Accuracy: 55.476
Epoch 1, Validation loss: 1.094175492477417, Validation Accuracy: 63.92


 10%|█         | 2/20 [00:57<08:36, 28.68s/it]

Epoch 2, Train loss: 1.264910141801834, Train Accuracy: 58.802
Epoch 2, Validation loss: 1.0607292023420334, Validation Accuracy: 65.06


 15%|█▌        | 3/20 [01:25<08:04, 28.52s/it]

Epoch 3, Train loss: 1.231211493883133, Train Accuracy: 60.048
Epoch 3, Validation loss: 1.079652437210083, Validation Accuracy: 64.44


 20%|██        | 4/20 [01:53<07:33, 28.34s/it]

Epoch 4, Train loss: 1.208364975156784, Train Accuracy: 60.6
Epoch 4, Validation loss: 1.05852837100029, Validation Accuracy: 65.04


 25%|██▌       | 5/20 [02:21<07:02, 28.19s/it]

Epoch 5, Train loss: 1.1965010613822937, Train Accuracy: 60.972
Epoch 5, Validation loss: 1.0420117020845414, Validation Accuracy: 66.78


 30%|███       | 6/20 [02:49<06:35, 28.23s/it]

Epoch 6, Train loss: 1.1748923247146605, Train Accuracy: 61.538
Epoch 6, Validation loss: 1.0783070336341858, Validation Accuracy: 64.99


 35%|███▌      | 7/20 [03:18<06:09, 28.45s/it]

Epoch 7, Train loss: 1.1682719102954864, Train Accuracy: 61.988
Epoch 7, Validation loss: 1.0463652221679687, Validation Accuracy: 66.01


 40%|████      | 8/20 [03:46<05:39, 28.29s/it]

Epoch 8, Train loss: 1.1447690533781052, Train Accuracy: 62.812
Epoch 8, Validation loss: 1.0400897037982941, Validation Accuracy: 66.29


 45%|████▌     | 9/20 [04:14<05:10, 28.19s/it]

Epoch 9, Train loss: 1.1313574957561492, Train Accuracy: 63.0
Epoch 9, Validation loss: 1.0516019164800643, Validation Accuracy: 66.12


 50%|█████     | 10/20 [04:43<04:44, 28.46s/it]

Epoch 10, Train loss: 1.124359544057846, Train Accuracy: 63.504
Epoch 10, Validation loss: 1.0572678737401962, Validation Accuracy: 65.8


 50%|█████     | 10/20 [05:01<05:01, 30.11s/it]


KeyboardInterrupt: 
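
The loop above collects `losses_train`, `losses_valid`, `acc_train`, and `acc_valid`, but the notebook does not plot them. A minimal sketch of the requested curves with matplotlib follows. The function `plot_history` is a hypothetical helper, and the sample values are the first three epochs of the log above, rounded.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt

def plot_history(losses_train, losses_valid, acc_train, acc_valid):
    """Plot loss and accuracy curves side by side, one point per epoch."""
    epochs = range(1, len(losses_train) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    ax_loss.plot(epochs, losses_train, label="train")
    ax_loss.plot(epochs, losses_valid, label="validation")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()

    ax_acc.plot(epochs, acc_train, label="train")
    ax_acc.plot(epochs, acc_valid, label="validation")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy (%)")
    ax_acc.legend()

    fig.tight_layout()
    return fig

# Sample values: first three epochs of the VGG19 log, rounded.
fig = plot_history([1.37, 1.26, 1.23], [1.09, 1.06, 1.08],
                   [55.5, 58.8, 60.0], [63.9, 65.1, 64.4])
```

In the notebook, calling `plot_history(losses_train, losses_valid, acc_train, acc_valid)` after the loop would show the curves for however many epochs completed, and `fig.savefig(...)` could be used to keep a copy.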

## Discussion and Comparison
- Discuss the advantages of transfer learning in terms of training time and accuracy scores.
- Also, cover potential drawbacks, like overfitting if the new dataset is too small or too different from the original dataset the model was trained on.

As the results show, placing a lightweight MLP head on top of a large pretrained model means that only a small fraction of the weights is trained (about 3.2M of 23.2M parameters here), which keeps training cheap relative to the size of the model. However, because the VGG model was pretrained on 224x224 ImageNet images and its convolutional layers were kept frozen, it did not outperform the CNN trained from scratch: its validation accuracy plateaued around 66% in the 10 epochs completed before the run was interrupted, compared with roughly 75% for the simple CNN. In addition, when the pretrained model comes from a very different domain (for example, ImageNet weights applied to medical images), the frozen features may represent the new data poorly, and unfreezing and training more layers may help.