### Objective:
To classify images in the Caltech-256 dataset, a larger successor to the Caltech-101 dataset, using a Convolutional Neural Network.

### Problem Statement
To build and implement a Convolutional Neural Network model to classify images in the Caltech-256 dataset.

At the end of this competition, you will be able to:

* Load and extract features of images available in the Caltech-256 dataset using ImageDataGenerator

* Build convolutional neural networks using either Keras or PyTorch deep learning libraries

* Use the pre-trained models using either Keras or PyTorch deep learning libraries

### Description:
Caltech-256 is an object recognition dataset containing approximately 30,000 real-world images of different sizes, spanning 257 classes (256 object classes and an additional clutter class). Each class is represented by at least 80 images. The dataset is a superset of the Caltech-101 dataset.

Here is a handy link to Kaggle's competition documentation (https://www.kaggle.com/docs/competitions), which includes, among other things, instructions on submitting predictions (https://www.kaggle.com/docs/competitions#making-a-submission).

### Instructions for downloading train and test data are as follows:

### 1. Create an API key in Kaggle.

To do this, go to the competition site on Kaggle at https://www.kaggle.com/t/185418aa7ed24db3b98ed851a4db2b41, click your user icon, open your profile as shown below, and then click Account.

![alt text](https://cdn.iisc.talentsprint.com/DLFA/Experiment_related_data/Capture-NLP.PNG)



### 2. Next, scroll down to the API access section and click on **Create New Token** to download an API key (kaggle.json).

![alt text](https://cdn.iisc.talentsprint.com/DLFA/Experiment_related_data/Capture-NLP_1.PNG)

### 3. Upload your kaggle.json file using the following snippet in a code cell:



In [1]:
# from google.colab import files
# files.upload()

In [2]:
#If successfully uploaded in the above step, the 'ls' command here should display the kaggle.json file.
# %ls

### 4. Install the Kaggle API using the following command


In [3]:
# !pip install -U -q kaggle==1.5.8

### 5. Move the kaggle.json file into ~/.kaggle, which is where the API client expects your token to be located:



In [4]:
# !mkdir -p ~/.kaggle
# !cp kaggle.json ~/.kaggle/

In [5]:
#Execute the following command to verify whether the kaggle.json is stored in the appropriate location: ~/.kaggle/kaggle.json
# !ls ~/.kaggle

In [6]:
# !chmod 600 /root/.kaggle/kaggle.json #run this command to ensure your Kaggle API token is secure on colab

### 6. Now download the Test Data from Kaggle

**NOTE: If you get a '403 - Forbidden' error after running the cell below, the most likely cause is that the user whose kaggle.json was uploaded above has not accepted the competition rules and therefore has not joined the competition.**

If you encounter a **401 - Unauthorized** error, download the latest **kaggle.json** by repeating steps 1 and 2.

In [7]:
#If you get a forbidden link, you have most likely not joined the competition.
# !kaggle competitions download -c classification-of-caltech-256-images

In [8]:
# !mkdir /content/Kaggle2Test
# !unzip classification-of-caltech-256-images -d /content/Kaggle2Test/

In [9]:
# !mkdir Kaggle2Test/test
# !mv test/* Kaggle2Test/test/

### 7. Download the Train Data

In [10]:
%%capture
# !wget https://cdn.iisc.talentsprint.com/DLFA/Experiment_related_data/Caltech_256_Train.zip

# !unzip "Caltech_256_Train.zip"

## Grading = 10 Marks

## YOUR CODING STARTS FROM HERE

### Import Required packages

In [11]:
# import the libraries
import numpy as np
import pandas as pd
import os,shutil,glob,PIL
import pathlib
from PIL import Image
import torch
import torch.nn as nn
import torch.nn.init as init
from torch.utils.data import DataLoader
from torchvision import transforms, datasets, utils
import matplotlib.pyplot as plt
import torchvision.models as models
import torch.optim as optim
from tqdm import tqdm
from collections import defaultdict
from collections import Counter



### **Stage 1:** Data Loading and preprocessing of Images (3 points)

#### Analyze the shape of images and distribution of classes

**The below two cells were run with transform without normalization to get appropriate mean and standard deviation.**

In [12]:
# mean = 0.0
# for img, _ in train_data:
#     mean += img.mean([1,2])
# mean = mean/len(train_data)
# print(mean)

In [13]:
# sumel = 0.0
# countel = 0
# for img, _ in train_data:
#     img = (img - mean.unsqueeze(1).unsqueeze(1))**2
#     sumel += img.sum([1, 2])
#     countel += torch.numel(img[0])
# std = torch.sqrt(sumel/countel)
# print(std)
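
The two commented cells above compute the per-channel statistics in two passes (mean first, then the squared deviations). As a rough alternative sketch, the same numbers can be obtained in one pass from the running sums of `x` and `x**2`. The function name `channel_mean_std` and the random stand-in images are assumptions for illustration; in the notebook the iterable would be a dataset built with a transform that resizes and calls `ToTensor()` but does not normalize.

```python
import torch

def channel_mean_std(images):
    """Single-pass per-channel mean/std for float tensors shaped (3, H, W)."""
    channel_sum = torch.zeros(3, dtype=torch.float64)
    channel_sq_sum = torch.zeros(3, dtype=torch.float64)
    pixel_count = 0
    for img in images:
        img = img.double()                       # accumulate in float64 for stability
        channel_sum += img.sum(dim=(1, 2))
        channel_sq_sum += (img ** 2).sum(dim=(1, 2))
        pixel_count += img.shape[1] * img.shape[2]
    mean = channel_sum / pixel_count
    variance = channel_sq_sum / pixel_count - mean ** 2   # E[x^2] - (E[x])^2
    std = torch.sqrt(torch.clamp(variance, min=0.0))
    return mean.float(), std.float()

# Hypothetical stand-in data: five random "images" instead of the real dataset.
torch.manual_seed(0)
fake_images = [torch.rand(3, 8, 8) for _ in range(5)]
mean, std = channel_mean_std(fake_images)
print(mean, std)
```

The resulting tensors are what would be passed to `transforms.Normalize(mean, std)`.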

In [14]:
# Normalize with mean and std
# train_transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor(), transforms.Normalize((0.4839, 0.4528, 0.3962), (0.2702, 0.2655, 0.2745))])
# train_transform = transforms.Compose([transforms.Resize((256, 256)), transforms.ToTensor()])
# transform = transforms.Compose([transforms.Resize((256, 256)), transforms.ToTensor(), transforms.Normalize((0.5503, 0.5315, 0.5028), (0.3162, 0.3125, 0.3263))])

In [15]:
# Define train and test transforms for data preprocessing and image augmentation
train_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4839, 0.4528, 0.3962), (0.2702, 0.2655, 0.2745))])

test_transform = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize((0.4839, 0.4528, 0.3962), (0.2702, 0.2655, 0.2745))])

In [16]:
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

test_transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])


**The 224 x 224 input size and the normalization values may be adjusted later, after the mean and standard deviation of the training images have been calculated.**

In [17]:
# Loading the train set file
train_data_folder = "./classification-of-caltech-256-images/train" # Train directory for loading images
total_dataset = datasets.ImageFolder(root=train_data_folder, transform=train_transform)
# train_data = datasets.ImageFolder(root=train_data_folder)

In [18]:
# splitting between train and Validation set
train_size = int(0.9 * len(total_dataset))  # 90% for training
val_size = len(total_dataset) - train_size  # Remaining 10% for validation
train_data, val_data = torch.utils.data.random_split(total_dataset, [train_size, val_size])
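
One thing to be aware of with the split above: both subsets returned by `random_split` read from the same `total_dataset`, so the validation images also pass through the random augmentations in `train_transform`. A possible workaround, sketched below under the assumption that the base dataset is created without a transform, is a small wrapper that applies a transform per subset. `TransformedSubset`, the toy `raw_dataset`, and the lambda transforms are hypothetical stand-ins, not part of the original notebook.

```python
import torch
from torch.utils.data import Dataset, random_split

class TransformedSubset(Dataset):
    """Wraps a subset and applies its own transform to each sample."""

    def __init__(self, subset, transform):
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, index):
        sample, label = self.subset[index]
        return self.transform(sample), label

# Hypothetical stand-in for datasets.ImageFolder(root=..., transform=None).
raw_dataset = [(float(i), i % 3) for i in range(10)]

generator = torch.Generator().manual_seed(0)   # fixed seed keeps the split reproducible
train_raw, val_raw = random_split(raw_dataset, [9, 1], generator=generator)

# Stand-ins for train_transform (augmenting) and test_transform (deterministic).
train_view = TransformedSubset(train_raw, transform=lambda x: x + 0.5)
val_view = TransformedSubset(val_raw, transform=lambda x: x)

print(len(train_view), len(val_view))
```

With real data, `train_view` and `val_view` would be passed to the `DataLoader` calls in place of `train_data` and `val_data`.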

In [19]:
# Defining train_batch_size
train_batch_size = 32

In [20]:
# Create data loaders for training and validation
train_loader = torch.utils.data.DataLoader(train_data, batch_size=train_batch_size,shuffle=True)
val_loader = torch.utils.data.DataLoader(val_data, batch_size=train_batch_size,shuffle=False)

In [21]:
# Loading the test set file
test_data_folder = "./classification-of-caltech-256-images/test_images" # Test directory for loading images
test_data = datasets.ImageFolder(root=test_data_folder, transform=test_transform)

In [22]:
test_data.classes
total_dataset.classes

['001.ak47',
 '002.american-flag',
 '003.backpack',
 '004.baseball-bat',
 '005.baseball-glove',
 '006.basketball-hoop',
 '007.bat',
 '008.bathtub',
 '009.bear',
 '010.beer-mug',
 '011.billiards',
 '012.binoculars',
 '013.birdbath',
 '014.blimp',
 '015.bonsai-101',
 '016.boom-box',
 '017.bowling-ball',
 '018.bowling-pin',
 '019.boxing-glove',
 '020.brain-101',
 '021.breadmaker',
 '022.buddha-101',
 '023.bulldozer',
 '024.butterfly',
 '025.cactus',
 '026.cake',
 '027.calculator',
 '028.camel',
 '029.cannon',
 '030.canoe',
 '031.car-tire',
 '032.cartman',
 '033.cd',
 '034.centipede',
 '035.cereal-box',
 '036.chandelier-101',
 '037.chess-board',
 '038.chimp',
 '039.chopsticks',
 '040.cockroach',
 '041.coffee-mug',
 '042.coffin',
 '043.coin',
 '044.comet',
 '045.computer-keyboard',
 '046.computer-monitor',
 '047.computer-mouse',
 '048.conch',
 '049.cormorant',
 '050.covered-wagon',
 '051.cowboy-hat',
 '052.crab-101',
 '053.desk-globe',
 '054.diamond-ring',
 '055.dice',
 '056.dog',
 '057.dol

In [23]:
# # Initializing batch size
# batch_size = 64

# # Loading the train dataset
# train_loader = torch.utils.data.DataLoader(train_data, batch_size=batch_size, shuffle=True)

In [24]:
# Generate a batch of images and labels
train_images, train_labels = next(iter(train_loader))
train_images.shape, train_labels.shape

(torch.Size([32, 3, 224, 224]), torch.Size([32]))

In [25]:
# labels Translator
label_names = {v: k for k, v in total_dataset.class_to_idx.items()}
# label_names

In [26]:
# Create a grid of images along with their corresponding labels
# L = 3
# W = 3

# fig, axes = plt.subplots(L, W, figsize = (10, 10))
# axes = axes.reshape(-1)

# for i in np.arange(0, L*W):
#     axes[i].imshow(train_images[i].permute(1, 2, 0))
#     axes[i].set_title(label_names[train_labels[i].item()])
#     axes[i].axis('on')

# plt.tight_layout()

In [27]:
# mean = 0.0
# for img, _ in train_data:
#     mean += img.mean([1,2])
# mean = mean/len(train_data)
# print(mean)

In [28]:
# sumel = 0.0
# countel = 0
# for img, _ in train_data:
#     img = (img - mean.unsqueeze(1).unsqueeze(1))**2
#     sumel += img.sum([1, 2])
#     countel += torch.numel(img[0])
# std = torch.sqrt(sumel/countel)
# print(std)

In [29]:
# sum = 0
# for label in train_data.classes:
#     num_img = len(train_data.targets[train_data.targets == train_data.class_to_idx[label]])
#     print (num_img)
# print (sum)

num_classes = len(total_dataset.classes)
dataset_size = len(total_dataset)
classes = total_dataset.classes
img_dict = {}
for i in range(num_classes):
    img_dict[classes[i]] = 0

for i in range(dataset_size):
    img, label = total_dataset[i]
    img_dict[classes[label]] += 1

# img_dict
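
The loop above indexes `total_dataset[i]`, which loads and transforms every image just to read its label. A lighter sketch, assuming only that the labels are available as a list of integer indices (as `ImageFolder.targets` provides), counts them directly. The class names and targets below are made-up stand-ins for `total_dataset.classes` and `total_dataset.targets`.

```python
from collections import Counter

# Hypothetical stand-ins for total_dataset.classes and total_dataset.targets.
classes = ["001.ak47", "002.american-flag", "257.clutter"]
targets = [0, 0, 1, 2, 2, 2, 2]

index_counts = Counter(targets)   # label index -> number of images
img_dict = {name: index_counts.get(i, 0) for i, name in enumerate(classes)}

largest = max(img_dict, key=img_dict.get)
smallest = min(img_dict, key=img_dict.get)
print(img_dict)
print("largest:", largest, "| smallest:", smallest)
```

The resulting `img_dict` has the same shape as the one built by the loop, so later cells such as `max(img_dict, key=img_dict.get)` work unchanged.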

**Observation: the classes are imbalanced, so the minority classes need additional samples generated through augmentation.**

In [30]:
# No of Categories
print(len(total_dataset.classes))

257


In [31]:
# total_dataset.targets

In [32]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

device(type='cuda')

In [33]:
# Calculate class weights
class_weights = torch.tensor([1.0] * num_classes)  # Initialize with equal weights

# Calculate class frequencies
class_counts = torch.bincount(torch.tensor(total_dataset.targets))
total_samples = len(total_dataset)
class_frequencies = class_counts.float() / total_samples

# Update class weights based on class frequencies
class_weights = 1.0 / class_frequencies

# Normalize class weights
class_weights /= torch.max(class_weights)
class_weights = class_weights.to(device)  # .to() returns a new tensor, so keep the result
class_weights

tensor([0.8163, 0.8247, 0.5298, 0.6299, 0.5405, 0.8889, 0.7547, 0.3448, 0.7843,
        0.8511, 0.2878, 0.3704, 0.8163, 0.9302, 0.6557, 0.8791, 0.7692, 0.7921,
        0.6452, 0.9639, 0.5634, 0.8247, 0.7273, 0.7143, 0.7018, 0.7547, 0.8000,
        0.7273, 0.7767, 0.7692, 0.8889, 0.7921, 0.7843, 0.8000, 0.9195, 0.7547,
        0.6667, 0.7273, 0.9412, 0.6452, 0.9195, 0.9195, 0.6452, 0.6612, 0.9412,
        0.6015, 0.8511, 0.7767, 0.7547, 0.8247, 0.7018, 0.9412, 0.9756, 0.6780,
        0.8163, 0.7843, 0.7547, 0.8602, 0.9639, 0.9195, 0.7843, 0.9639, 0.6557,
        0.6107, 0.7921, 0.9639, 0.9639, 0.7273, 0.8081, 0.9524, 0.8081, 0.6780,
        0.8000, 0.6957, 0.9639, 0.9524, 0.8696, 0.8889, 0.8081, 0.6897, 0.8421,
        0.9877, 0.8421, 0.9524, 0.7143, 1.0000, 0.8602, 0.8163, 0.7273, 0.3774,
        0.8421, 0.3980, 0.7143, 0.7692, 0.9302, 0.2807, 0.8989, 0.8000, 1.0000,
        0.8602, 0.5797, 0.9091, 0.7207, 0.8247, 0.2963, 0.9195, 0.8989, 0.9412,
        0.5128, 0.9412, 0.9524, 0.9524, 
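
To make the weighting above concrete, here is a plain-Python restatement of the same arithmetic: each weight is the inverse class frequency divided by the largest inverse frequency, which works out to `min_count / count`. The rarest class therefore gets weight 1.0 and larger classes get proportionally less. The function name and the counts are hypothetical values chosen for easy arithmetic.

```python
def inverse_frequency_weights(class_counts):
    """Inverse-frequency class weights, scaled so the largest weight is 1.0."""
    total = sum(class_counts)
    raw = [total / count for count in class_counts]   # 1 / frequency per class
    largest = max(raw)
    return [weight / largest for weight in raw]

# Hypothetical per-class image counts (not taken from the dataset).
counts = [80, 100, 200, 800]
weights = inverse_frequency_weights(counts)
print([round(w, 4) for w in weights])
```

This matches the pattern in the output above, where the values that equal 1.0000 correspond to the classes with the fewest images.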

In [34]:
print(device)

cuda


In [35]:
# Number of training samples
print(len(total_dataset))

30607


In [36]:
# Size of one training image
print(train_data[0][0].size(), val_data[0][0].size())

torch.Size([3, 224, 224]) torch.Size([3, 224, 224])


In [37]:
max(img_dict,key=img_dict.get)

'257.clutter'

In [38]:
# print(img_dict.values())
# total = sum(img_dict.values())

In [39]:
# Check whether a GPU is available on this system.
use_cuda = torch.cuda.is_available()
print('Using PyTorch version:', torch.__version__, 'CUDA:', use_cuda)

Using PyTorch version: 1.12.1+cu116 CUDA: True


In [40]:
torch.cuda.empty_cache()

In [41]:
model_ft = models.googlenet(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs,num_classes)
#model.softmax = nn.softmax()
model_ft = model_ft.to(device)
print(model_ft)



GoogLeNet(
  (conv1): BasicConv2d(
    (conv): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (maxpool1): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
  (conv2): BasicConv2d(
    (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (conv3): BasicConv2d(
    (conv): Conv2d(64, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (maxpool2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
  (inception3a): Inception(
    (branch1): BasicConv2d(
      (conv): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track

In [42]:
PATH='CalTech-256-GoogleNet_v302_2107.pth'
# PATH='CalTech-256-GoogleNet_v302_2307.pth'
model_ft.load_state_dict(torch.load(PATH))

<All keys matched successfully>

In [43]:
criterion = nn.CrossEntropyLoss(weight=class_weights).to(device)
optimizer = optim.Adam(model_ft.parameters(), lr=0.001)

In [44]:
epsilon = 0.005  # Perturbation strength for FGSM
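
For reference, the perturbation that the commented-out FGSM lines in the training loop would compute can be written as follows, where $x$ is the input batch, $y$ the labels, $f_\theta$ the model, $\mathcal{L}$ the loss, and $\epsilon$ the value set above. This is only a restatement of those commented lines, not code that runs in the notebook:

$$
x_{\text{adv}} = \operatorname{clip}\Big(x + \epsilon \cdot \operatorname{sign}\big(\nabla_x \mathcal{L}(f_\theta(x), y)\big),\ 0,\ 1\Big)
$$

Two caveats if this is ever enabled: the gradient $\nabla_x \mathcal{L}$ is only available when `inputs.requires_grad` is set before the forward pass, and the clip range $[0, 1]$ assumes unnormalized pixel values, whereas the inputs here have already passed through `transforms.Normalize` and can fall outside that range.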

In [45]:
train_accu = []     # Empty list for saving train accuracy
train_losses = []   # Empty list for saving train losses
val_accu = []

In [46]:
def train(epoch):
  print('\nEpoch : %d'%epoch)

  model_ft.train()    # Initiate the model in training mode

  running_loss = 0
  correct = 0
  total = 0

  for data in tqdm(train_loader):

    inputs,labels=data[0].to(device),data[1].to(device)   # Loading the input tensors into CUDA GPU
    # print(inputs,labels)
    # Generate adversarial examples using FGSM
    # inputs.requires_grad = True

    optimizer.zero_grad()
    outputs = model_ft(inputs)
    loss = criterion(outputs,labels)  # Calculating the loss
    loss.backward()                   # Backpropagation: compute gradients (weights are updated by optimizer.step() below)

    # Create perturbations based on the sign of gradients
    # inputs_grad = inputs.grad.data
    # perturbed_inputs = torch.clamp(inputs + epsilon * torch.sign(inputs_grad), 0, 1)
    # perturbed_inputs = perturbed_inputs.detach()

    # # Perform forward pass with perturbed inputs
    # outputs = model_ft(perturbed_inputs)
    # loss = criterion(outputs, labels)
    # loss.backward()

    optimizer.step()

    running_loss += loss.item()

    _, predicted = outputs.max(1)
    total += labels.size(0)
    correct += predicted.eq(labels).sum().item()




  train_loss = running_loss/len(train_loader)     # Calculating the mean of training loss
  accu = 100.*correct/total                       # Calculating the accuracy

  train_accu.append(accu)
  train_losses.append(train_loss)
  print('Train Loss: %.3f | Accuracy: %.3f'%(train_loss,accu))

In [47]:
  # Validation
def eval_val(epoch):
    model_ft.eval()
    wrong_predictions = []
    val_loss = 0
    correct = 0
    total = 0

    with torch.no_grad():
      for i, data in enumerate(val_loader, 0):
        inputs, labels = data[0].to(device), data[1].to(device)
        outputs = model_ft(inputs)
        loss = criterion(outputs, labels)
        # Get predicted labels
        _, predicted = torch.max(outputs.data, 1)

        # Find wrong predictions
        for j in range(len(predicted)):
          if predicted[j] != labels[j]:
              wrong_predictions.append({
                  'input': inputs[j],
                  'predicted_label': predicted[j].item(),
                  'true_label': labels[j].item()
              })
        # print(val_loss)
        val_loss += loss.item()
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()


    val_batch_loss = val_loss/len(val_loader)
    val_batch_accu = 100.*correct/total                       # Calculating the accuracy
    val_accu.append(val_batch_accu)
    print('Val Loss: %.3f| Accuracy: %.3f' %(val_batch_loss,val_batch_accu))

    print('Top 5 Wrong Predictions in Validation Set:')
    counter = Counter([prediction['true_label'] for prediction in wrong_predictions])
    top_wrong_predictions = counter.most_common(5)

    for true_label, count in top_wrong_predictions:
        print(f'True Label: {true_label}, Count: {count}')
        filtered_predictions = [prediction for prediction in wrong_predictions
                                if prediction['true_label'] == true_label][:5]
        for prediction in filtered_predictions:
            input_image = prediction['input']
            predicted_label = prediction['predicted_label']
            true_label = prediction['true_label']
            print(f'Predicted Label: {predicted_label}, True Label: {true_label}')


    # # Print details of wrong predictions
    # print('Wrong Predictions in Validation Set:')
    # for prediction in wrong_predictions:
    #     input_image = prediction['input']
    #     predicted_label = prediction['predicted_label']
    #     true_label = prediction['true_label']
    #     print(f'Predicted Label: {predicted_label}, True Label: {true_label}')
    #     # Additional code to visualize or process the input image if needed


In [48]:
epochs = 15 # 20 run modify it later on
for epoch in range(1, epochs+1):
  train(epoch)
  eval_val(epoch)


Epoch : 1


100%|████████████████████████████████████████████████████████████████████████████████| 861/861 [11:33<00:00,  1.24it/s]


Train Loss: 1.040 | Accuracy: 74.160
Val Loss: 0.906| Accuracy: 77.654
Top 5 Wrong Predictions in Validation Set:
True Label: 256, Count: 45
Predicted Label: 230, True Label: 256
Predicted Label: 215, True Label: 256
Predicted Label: 49, True Label: 256
Predicted Label: 249, True Label: 256
Predicted Label: 97, True Label: 256
True Label: 158, Count: 18
Predicted Label: 25, True Label: 158
Predicted Label: 136, True Label: 158
Predicted Label: 7, True Label: 158
Predicted Label: 164, True Label: 158
Predicted Label: 228, True Label: 158
True Label: 104, Count: 17
Predicted Label: 232, True Label: 104
Predicted Label: 253, True Label: 104
Predicted Label: 172, True Label: 104
Predicted Label: 133, True Label: 104
Predicted Label: 184, True Label: 104
True Label: 146, Count: 17
Predicted Label: 189, True Label: 146
Predicted Label: 114, True Label: 146
Predicted Label: 83, True Label: 146
Predicted Label: 147, True Label: 146
Predicted Label: 25, True Label: 146
True Label: 231, Count: 1

100%|████████████████████████████████████████████████████████████████████████████████| 861/861 [09:23<00:00,  1.53it/s]


Train Loss: 1.010 | Accuracy: 74.584
Val Loss: 0.905| Accuracy: 76.674
Top 5 Wrong Predictions in Validation Set:
True Label: 256, Count: 36
Predicted Label: 28, True Label: 256
Predicted Label: 97, True Label: 256
Predicted Label: 186, True Label: 256
Predicted Label: 24, True Label: 256
Predicted Label: 77, True Label: 256
True Label: 104, Count: 23
Predicted Label: 27, True Label: 104
Predicted Label: 84, True Label: 104
Predicted Label: 55, True Label: 104
Predicted Label: 55, True Label: 104
Predicted Label: 150, True Label: 104
True Label: 231, Count: 18
Predicted Label: 34, True Label: 231
Predicted Label: 74, True Label: 231
Predicted Label: 158, True Label: 231
Predicted Label: 74, True Label: 231
Predicted Label: 221, True Label: 231
True Label: 158, Count: 17
Predicted Label: 256, True Label: 158
Predicted Label: 28, True Label: 158
Predicted Label: 89, True Label: 158
Predicted Label: 107, True Label: 158
Predicted Label: 97, True Label: 158
True Label: 89, Count: 11
Predic

100%|████████████████████████████████████████████████████████████████████████████████| 861/861 [10:49<00:00,  1.33it/s]


Train Loss: 0.946 | Accuracy: 75.496
Val Loss: 0.892| Accuracy: 77.524
Top 5 Wrong Predictions in Validation Set:
True Label: 256, Count: 46
Predicted Label: 70, True Label: 256
Predicted Label: 49, True Label: 256
Predicted Label: 97, True Label: 256
Predicted Label: 125, True Label: 256
Predicted Label: 67, True Label: 256
True Label: 3, Count: 16
Predicted Label: 38, True Label: 3
Predicted Label: 17, True Label: 3
Predicted Label: 207, True Label: 3
Predicted Label: 38, True Label: 3
Predicted Label: 38, True Label: 3
True Label: 231, Count: 15
Predicted Label: 110, True Label: 231
Predicted Label: 158, True Label: 231
Predicted Label: 148, True Label: 231
Predicted Label: 74, True Label: 231
Predicted Label: 78, True Label: 231
True Label: 104, Count: 13
Predicted Label: 84, True Label: 104
Predicted Label: 84, True Label: 104
Predicted Label: 63, True Label: 104
Predicted Label: 84, True Label: 104
Predicted Label: 63, True Label: 104
True Label: 95, Count: 12
Predicted Label: 15

 16%|████████████▊                                                                   | 138/861 [01:38<08:35,  1.40it/s]


KeyboardInterrupt: 

In [None]:
# epochs = 30 # 20 run modify it later on
# for epoch in range(1, epochs+1):
#   train(epoch)
#   eval_val(epoch)

In [None]:
# epochs = 5 # 20 run modify it later on
# for epoch in range(1, epochs+1):
#   train(epoch)
#   eval_val(epoch)

In [None]:
epochs = 3 # 20 run modify it later on
for epoch in range(1, epochs+1):
  train(epoch)
  eval_val(epoch)

In [None]:
# epochs = 28 # 20 run modify it later on
# for epoch in range(1, epochs+1):
#   train(epoch)
#   eval_val(epoch)

TODO: evaluate on the validation dataset as well.

Hyperparameters: epochs, learning rate

In [49]:
PATH = 'CalTech-256-GoogleNet_v302_2307.pth'
torch.save(model_ft.state_dict(), PATH)

In [None]:
plt.plot(train_accu,'-o')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend(['Train'])
plt.show()

In [None]:
plt.plot(train_losses,'-o')
plt.xlabel('epoch')
plt.ylabel('Loss')
plt.legend(['Train'])
plt.title('Train Loss')
plt.show()

In [50]:
PATH='CalTech-256-GoogleNet_v302_2307.pth'
model_ft.load_state_dict(torch.load(PATH))

<All keys matched successfully>

In [51]:
model_ft.eval()

GoogLeNet(
  (conv1): BasicConv2d(
    (conv): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (maxpool1): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
  (conv2): BasicConv2d(
    (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
    (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (conv3): BasicConv2d(
    (conv): Conv2d(64, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (bn): BatchNorm2d(192, eps=0.001, momentum=0.1, affine=True, track_running_stats=True)
  )
  (maxpool2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=True)
  (inception3a): Inception(
    (branch1): BasicConv2d(
      (conv): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn): BatchNorm2d(64, eps=0.001, momentum=0.1, affine=True, track

In [52]:
# Loading the test dataset
test_batch_size = 1
test_loader = torch.utils.data.DataLoader(test_data, batch_size=test_batch_size,shuffle=False)

In [53]:
predictions = []
with torch.no_grad():                          # inference only, so skip gradient tracking
  for images, _ in test_loader:
    output = model_ft(images.to(device))
    pred_index = output.argmax(dim=1).item()   # batch size is 1, so one class index per step
    predictions.append(total_dataset.classes[pred_index])

In [70]:
# print([(x[0].split("/",3)[2]).split('\\',1)[1].replace('\\','/') for x in list(test_loader.dataset.imgs)])
# test_loader.dataset.imgs

In [65]:
results = pd.DataFrame()
results['img_path'] = pd.Series([(x[0].split("/",3)[2]).split('\\',1)[1].replace('\\','/') for x in list(test_loader.dataset.imgs)])
results['label'] = pd.Series(predictions)
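
The path expression above appears to rely on Windows-style backslashes in the file paths, so it would likely fail where paths use only forward slashes. A more portable sketch, assuming the submission needs just the last two path components (for example `test/1.jpg`, as in the output further down), is shown here. The helper name `submission_path` and the example paths are hypothetical.

```python
def submission_path(image_path: str, keep_parts: int = 2) -> str:
    """Return the last `keep_parts` components of a path, joined with '/'."""
    normalized = image_path.replace("\\", "/")          # treat both separators alike
    parts = [p for p in normalized.split("/") if p not in ("", ".")]
    return "/".join(parts[-keep_parts:])

# Hypothetical example paths in Windows-style and POSIX-style form.
print(submission_path(r"./classification-of-caltech-256-images/test_images\test\1.jpg"))
print(submission_path("./classification-of-caltech-256-images/test_images/test/1.jpg"))
```

Both calls produce the same relative path, so the `img_path` column would not depend on the operating system the notebook ran on.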

In [68]:
results['label'] = results['label'].apply(lambda x: x.split('.')[1].title())
results['label']

0           Watch-101
1           Watch-101
2       American-Flag
3                Toad
4                Toad
            ...      
9172             Toad
9173             Toad
9174             Toad
9175             Toad
9176             Toad
Name: label, Length: 9177, dtype: object
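
The label conversion in the cell above can be checked in isolation. The sketch below applies the same `split` and `title` steps to three folder names that appear in this notebook's output; the helper name `to_submission_label` is hypothetical, and `split(".", 1)` is used so that only the numeric prefix is removed.

```python
def to_submission_label(class_dir: str) -> str:
    """Drop the numeric prefix ('240.') and title-case the remaining name."""
    _, name = class_dir.split(".", 1)
    return name.title()

for class_dir in ["240.watch-101", "002.american-flag", "256.toad"]:
    print(class_dir, "->", to_submission_label(class_dir))
```

Whether the competition expects exactly this capitalization is not stated here, so it is worth comparing against the sample submission file before uploading.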

In [69]:
results.to_csv("Group11_Kaggle2_Submission_v302_2307.csv")

In [67]:
results.head(15)

Unnamed: 0,img_path,label
0,test/1.jpg,240.watch-101
1,test/10.jpg,240.watch-101
2,test/100.jpg,002.american-flag
3,test/1000.jpg,256.toad
4,test/1001.jpg,256.toad
5,test/1002.jpg,256.toad
6,test/1003.jpg,256.toad
7,test/1004.jpg,256.toad
8,test/1005.jpg,256.toad
9,test/1006.jpg,256.toad


In [None]:
# results.head(5)

In [None]:
from google.colab import files
# files.download('Group4_Pred_Submission_v75_lC_withques.csv')
files.download('Group11_Kaggle2_Submission_v302_2107.csv')
files.download('/content/CalTech-256-GoogleNet_v302_2107.pth')

In [None]:
# submit the file to kaggle
# !kaggle competitions submit classification-of-caltech-256-images -f Group11_Kaggle2_Submission_v4_adv.csv -m "Model"

#### Visualize the sample images of each class


### **Stage 2:** Build and train the CNN model using Keras/Pytorch (5 points)

You can train both a CNN model and a pre-trained model, and then compare their performance on the Kaggle test set.


### Transfer learning

Transfer learning consists of taking features learned on one problem, and leveraging them on a new, similar problem.

A pre-trained model is a saved network that was previously trained on a large dataset, typically on a large-scale image-classification task.

The intuition behind transfer learning for image classification is that if a model is trained on a large and general enough dataset, this model will effectively serve as a generic model of the visual world. You can then take advantage of these learned feature maps without having to start from scratch by training a large model on a large dataset.
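
One common way to reuse learned features is to freeze the pre-trained layers and train only a new classification layer. The sketch below shows the mechanics on a tiny stand-in network so that it runs without downloading weights. `TinyBackbone` and `freeze_all_but_new_head` are hypothetical names, and the only assumption about the model is that its final layer is an `nn.Linear` called `fc`, as it is for the GoogLeNet model used earlier in this notebook.

```python
import torch.nn as nn

def freeze_all_but_new_head(model: nn.Module, num_classes: int) -> nn.Module:
    """Freeze every existing parameter, then attach a fresh trainable `fc` layer."""
    for param in model.parameters():
        param.requires_grad = False
    # A newly created layer has requires_grad=True by default.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

class TinyBackbone(nn.Module):
    """Hypothetical stand-in for a pre-trained network with an `fc` head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(8, 1000)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

model = freeze_all_but_new_head(TinyBackbone(), num_classes=257)
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

When only the head is trained, the optimizer can be given just the trainable parameters, for example `optim.Adam(p for p in model.parameters() if p.requires_grad)`. Fine-tuning all layers, as the notebook does above, is the other common option.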



#### Use the pre-trained models

* Load the pre-trained model
* Train and evaluate the images

In [None]:
# YOUR CODE HERE

###   **Stage 3**: Evaluate the Model and get model predictions on the Kaggle testset (2 Points)









In [None]:
# YOUR CODE HERE

### Report Analysis

- Compare the accuracies for the Pre-trained vs CNN models
- What process was followed to tune the hyperparameters?
- Plot the confusion matrix in terms of the misclassifications
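
For the confusion-matrix item, a minimal pure-Python sketch is given below. It builds the matrix from true and predicted label indices and lists the most frequent off-diagonal (misclassified) pairs by class name. The function names, the three-class `label_names` mapping, and the toy labels are hypothetical; with real results, the indices collected in the validation loop and the notebook's own `label_names` dictionary would be used instead.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels, num_classes):
    """matrix[t][p] = number of samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def top_confusions(matrix, label_names, k=3):
    """Most frequent (true, predicted) pairs, ignoring correct predictions."""
    pairs = Counter()
    for t, row in enumerate(matrix):
        for p, count in enumerate(row):
            if t != p and count:
                pairs[(label_names[t], label_names[p])] = count
    return pairs.most_common(k)

# Hypothetical toy data covering three classes.
label_names = {0: "009.bear", 1: "056.dog", 2: "257.clutter"}
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 0, 0, 1, 2]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm)
print(top_confusions(cm, label_names))
```

The nested list can be passed to `plt.imshow(cm)` for a quick heat map. With 257 classes the full matrix is hard to read, so the list of top confused pairs is often the more useful view.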