**Caltech 101 Dataset Image Classification with a Pretrained PyTorch ResNet34 Model**
---

* Introduction to the Caltech 101 Dataset
* Approach to Train a Model
    * Neural Network (ResNet34)
    * Tools and Libraries
    * Directory Structure

**Dataset Download : [Caltech101](http://www.vision.caltech.edu/Image_Datasets/Caltech101/#Download)** 
---
![](https://debuggercafe.com/wp-content/uploads/2020/03/caltech101exp-min.png)

**Total Images : 8677 Images**
---
* If we ignore the `BACKGROUND_Google` category and its images, we have 8677 images in total. If you have worked with deep learning for a while, you know that this is not many images for reaching very high accuracy. Still, we will do our best.
* The second issue with the dataset is its class distribution, which is far from balanced, as the bar plot below illustrates.
![Bar plot showing the distribution of number of images in each category in the Caltech101 dataset.](https://debuggercafe.com/wp-content/uploads/2020/03/dist.png)  
Bar plot showing the distribution of number of images in each category in the Caltech101 dataset.
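
The imbalance can be checked directly by counting the images in each class folder. The following is a minimal sketch, assuming the paths follow a `<root>/<class>/<file>` layout; `count_per_class` and the sample paths are hypothetical and only illustrate the idea.

```python
import os
from collections import Counter

def count_per_class(image_paths, skip=("BACKGROUND_Google",)):
    """Count images per class, taking the parent folder name as the label."""
    counts = Counter()
    for path in image_paths:
        label = os.path.basename(os.path.dirname(path))
        if label in skip:
            continue
        counts[label] += 1
    return counts

# Hypothetical paths, only to show the shape of the result
sample_paths = [
    os.path.join("101_ObjectCategories", "airplanes", "image_0001.jpg"),
    os.path.join("101_ObjectCategories", "airplanes", "image_0002.jpg"),
    os.path.join("101_ObjectCategories", "ant", "image_0001.jpg"),
    os.path.join("101_ObjectCategories", "BACKGROUND_Google", "image_0001.jpg"),
]
counts = count_per_class(sample_paths)
print(counts.most_common())
```

Run on the real list of image paths, `counts.most_common()` would list the largest classes first, which is the same information the bar plot conveys.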


**1. Load Modules**
---

In [4]:
!pip install pretrainedmodels
!pip install imutils



In [5]:
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib
import joblib
import cv2
import os
import time
import random
import pretrainedmodels
import numpy as np

from imutils import paths
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from tqdm import tqdm

# Load torch...!!!
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader

# Load torchvision ...!!!
from torchvision import transforms

# Seed everything for reproducibility
def seed_everything(SEED=42):
    random.seed(SEED)
    np.random.seed(SEED)
    torch.manual_seed(SEED)
    torch.cuda.manual_seed(SEED)
    torch.cuda.manual_seed_all(SEED)
    # benchmark mode is faster when all inputs have the same size,
    # but cuDNN may then select non-deterministic algorithms
    torch.backends.cudnn.benchmark = True
SEED = 42
seed_everything(SEED=SEED)

Going over some of the important imports in the above code block.

* `torch.nn` will help us access the neural network layers in the PyTorch library.
* `torch.optim` to access all the optimizer functions in PyTorch.
* `pretrainedmodels` to access pre-trained models such as ResNet34 and many more. We installed this library in the install cell above.
* Using `torchvision.transforms` we can apply transforms to our images, such as normalization and resizing.
* `DataLoader` and `Dataset` from `torch.utils.data` will help us create our own custom image dataset class and iterable data loaders.
* `cv2` to read images in the dataset.

**2. Define the Device, Epochs, and Batch Size**
---

In [6]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") # GPU
epochs = 5 # Number of epochs
BS = 16 # Batch size

**3. Preparing the Labels and Images**
---

In [7]:
image_paths = list(paths.list_images('./101_ObjectCategories'))

data = []
labels = []
for img_path in tqdm(image_paths):
    label = img_path.split(os.path.sep)[-2]
    if label == "BACKGROUND_Google":
        continue
    img = cv2.imread(img_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    
    data.append(img)
    labels.append(label)
    
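# The images differ in size, so keep them in a 1-D object array
# instead of asking NumPy to build a regular ndarray from ragged data
images = np.empty(len(data), dtype=object)
for i, img in enumerate(data):
    images[i] = img
data = images
labels = np.array(labels)

100%|██████████| 9144/9144 [01:10<00:00, 130.34it/s]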


**4. Label Encoder**
---

In [9]:
lb = LabelEncoder()
labels = lb.fit_transform(labels)
print(f"Total Number of Classes: {len(lb.classes_)}")

Total Number of Classes: 101
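
`LabelEncoder` maps each distinct class name to an integer index, with the classes kept in sorted order. The plain-Python sketch below mirrors that behaviour for illustration only; it is not scikit-learn's implementation, and `encode_labels` is a hypothetical helper.

```python
def encode_labels(labels):
    """Map each distinct label to an integer index, in sorted class order."""
    classes = sorted(set(labels))
    index_of = {name: i for i, name in enumerate(classes)}
    return [index_of[name] for name in labels], classes

encoded, classes = encode_labels(["ant", "airplanes", "ant", "brain"])
print(encoded)   # [1, 0, 1, 2]
print(classes)   # ['airplanes', 'ant', 'brain']
```

Because the encoded labels are plain class indices, they can be passed to a classification loss as they are, and `lb.inverse_transform` turns predicted indices back into class names.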


**5. Define the Transforms**
---

In [10]:
train_transforms = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean = [0.485,0.456,0.406], std=[0.229,0.224,0.225]),
])

val_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean = [0.485,0.456,0.406], std=[0.229,0.224,0.225]),
])    
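
To see what the last two transforms do to a single pixel: `ToTensor` scales 8-bit values into the range 0 to 1, and `Normalize` then subtracts the per-channel mean and divides by the per-channel standard deviation (the values used here are the usual ImageNet statistics). A small plain-Python sketch of that arithmetic, where `normalize_pixel` is a hypothetical helper:

```python
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Scale 0-255 values to 0-1, then apply per-channel (x - mean) / std."""
    return [((value / 255.0) - m) / s for value, m, s in zip(rgb, MEAN, STD)]

black = normalize_pixel([0, 0, 0])
white = normalize_pixel([255, 255, 255])
print(black)
print(white)
```

Black pixels end up at roughly -2 and white pixels at roughly +2, which is the input range the ImageNet-pretrained weights were trained on.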

**6. Split the Data into Train, Validation, and Test Sets**
---

In [11]:
# divide the data into train, validation, and test set
(X, x_val , Y, y_val) = train_test_split(data, labels, test_size=0.2,  stratify=labels,random_state=42)
(x_train, x_test, y_train, y_test) = train_test_split(X, Y, test_size=0.25, random_state=42)
print(f"x_train examples: {x_train.shape}\nx_test examples: {x_test.shape}\nx_val examples: {x_val.shape}")

x_train examples: (5205,)
x_test examples: (1736,)
x_val examples: (1736,)
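
The printed sizes follow from the two fractions: 20% of all images are held out for validation, then 25% of the remainder for testing. The sketch below reproduces the arithmetic, assuming the held-out count is rounded up, which matches the shapes printed above; `split_sizes` is a hypothetical helper.

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_kept, n_held_out) for a fractional test_size, rounding the held-out part up."""
    n_held_out = math.ceil(n_samples * test_size)
    return n_samples - n_held_out, n_held_out

n_rest, n_val = split_sizes(8677, 0.2)       # first split: hold out validation
n_train, n_test = split_sizes(n_rest, 0.25)  # second split: hold out test
print(n_train, n_test, n_val)  # 5205 1736 1736
```

The result is roughly a 60/20/20 split of the 8677 images.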


**7. Creating Custom Dataset**
---

In [12]:
# custom dataset class
class CustomDataset(Dataset):
    def __init__(self, images, labels= None, transforms = None):
        self.labels = labels
        self.images = images
        self.transforms = transforms
        
    def __len__(self):
        return len(self.images)
    
    def __getitem__(self, index):
        data = self.images[index][:]
        
        if self.transforms:
            data = self.transforms(data)
            
        if self.labels is not None:
            return (data, self.labels[index])
        else:
            return data

train_data = CustomDataset(x_train, y_train, train_transforms)
val_data = CustomDataset(x_val, y_val, val_transform)
test_data = CustomDataset(x_test, y_test, val_transform)       

trainLoader = DataLoader(train_data, batch_size=BS, shuffle=True, num_workers=4)
# evaluation does not need shuffling
valLoader = DataLoader(val_data, batch_size=BS, shuffle=False, num_workers=4)
testLoader = DataLoader(test_data, batch_size=BS, shuffle=False, num_workers=4)
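
With a batch size of 16 and the split sizes printed earlier, the number of batches per epoch can be worked out by hand. A short sketch, assuming the `DataLoader` default of keeping the last, smaller batch; `num_batches` is a hypothetical helper.

```python
import math

def num_batches(n_examples, batch_size, drop_last=False):
    """Number of batches a loader yields per epoch."""
    if drop_last:
        return n_examples // batch_size
    return math.ceil(n_examples / batch_size)

print(num_batches(5205, 16))  # 326
print(num_batches(1736, 16))  # 109
```

These are the values `len(trainLoader)` and `len(valLoader)` should report, and the totals a progress bar shows while iterating over the loaders.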


**8. ResNet34 Model**
---

In [13]:
# the resnet34 model
class ResNet34(nn.Module):
    def __init__(self, pretrained):
        super(ResNet34, self).__init__()
        if pretrained is True:
            self.model = pretrainedmodels.__dict__['resnet34'](pretrained='imagenet')
        else:
            self.model = pretrainedmodels.__dict__['resnet34'](pretrained = None)
        # change the classification layer
        self.l0 = nn.Linear(512, len(lb.classes_))
        # the pooled features are 2-D (batch, 512), so use plain Dropout rather than Dropout2d
        self.dropout = nn.Dropout(0.4)
        
    def forward(self, x):
        # get the batch size only, ignore (c, h, w)
        batch, _, _, _ = x.shape
        x = self.model.features(x)
        x = F.adaptive_avg_pool2d(x, 1).reshape(batch, -1)
        x = self.dropout(x)
        l0 = self.l0(x)
        return l0

model = ResNet34(pretrained=True).to(device)
print(model)

ResNet34(
  (model): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_runn

In [14]:
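# the torchsummary module used here is provided by the torch-summary package (pip install torch-summary)
from torchsummary import summary
print(summary(model, input_size=(3, 224, 224)))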

Layer (type:depth-idx)                   Param #
├─ResNet: 1-1                            --
|    └─Conv2d: 2-1                       9,408
|    └─BatchNorm2d: 2-2                  128
|    └─ReLU: 2-3                         --
|    └─MaxPool2d: 2-4                    --
|    └─Sequential: 2-5                   --
|    |    └─BasicBlock: 3-1              73,984
|    |    └─BasicBlock: 3-2              73,984
|    |    └─BasicBlock: 3-3              73,984
|    └─Sequential: 2-6                   --
|    |    └─BasicBlock: 3-4              230,144
|    |    └─BasicBlock: 3-5              295,424
|    |    └─BasicBlock: 3-6              295,424
|    |    └─BasicBlock: 3-7              295,424
|    └─Sequential: 2-7                   --
|    |    └─BasicBlock: 3-8              919,040
|    |    └─BasicBlock: 3-9              1,180,672
|    |    └─BasicBlock: 3-10             1,180,672
|    |    └─BasicBlock: 3-11             1,180,672
|    |    └─BasicBlock: 3-12             1,180,672
| 
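
The parameter counts in the summary can be verified by hand from the layer shapes in the printed model. A small sketch of that arithmetic; the helper names are hypothetical, and the block formula assumes a `BasicBlock` with equal input and output channels and no downsampling branch.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=False):
    """Weights of a square-kernel convolution, plus optional bias."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)

def batchnorm_params(channels):
    """One learnable scale and one shift per channel."""
    return 2 * channels

def basic_block_params(channels):
    """Two 3x3 convolutions and two batch norms, same channel count throughout."""
    return 2 * conv2d_params(channels, channels, 3) + 2 * batchnorm_params(channels)

def linear_params(in_features, out_features):
    """Weight matrix plus bias vector."""
    return in_features * out_features + out_features

print(conv2d_params(3, 64, 7))   # 9408
print(batchnorm_params(64))      # 128
print(basic_block_params(64))    # 73984
print(basic_block_params(128))   # 295424
print(linear_params(512, 101))   # 51813
```

The first four values match the corresponding rows of the summary; the last is the size of the new 101-class classification layer.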

**9. Define the Loss Function and Optimizer**
---


In [15]:
# loss function
criterion = nn.CrossEntropyLoss()

# optimizer
optimizer = optim.Adam(model.parameters(), lr = 1e-4)
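
`nn.CrossEntropyLoss` takes the raw logits and an integer class index per example, and computes the negative log of the softmax probability assigned to the correct class. The plain-Python sketch below shows the same computation for a single example; `cross_entropy` is a hypothetical helper and not PyTorch's implementation.

```python
import math

def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]

uniform_logits = [0.0] * 101
loss_value = cross_entropy(uniform_logits, target=3)
print(loss_value)  # ln(101), about 4.615
```

A classifier that spreads its probability evenly over all 101 classes has a loss of about 4.6, which is a useful reference point when reading the loss values during training.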

**10. Training, Validation, and Test Functions**
---

In [16]:
# training function: runs one epoch over the given loader
def fit(model, dataloader):
    print('Training')
    model.train()
    running_loss = 0.0
    running_correct = 0
    progress = tqdm(dataloader)
    for data in progress:
        # labels come from LabelEncoder, so target already holds integer class indices
        data, target = data[0].to(device), data[1].to(device).long()
        optimizer.zero_grad()
        outputs = model(data)
        loss = criterion(outputs, target)
        running_loss += loss.item()
        _, preds = torch.max(outputs.data, 1)
        running_correct += (preds == target).sum().item()
        loss.backward()
        optimizer.step()
        # set_description must be called on the tqdm instance, not on the class
        progress.set_description(f"Loss: {loss.item():.4f}")
        
    loss = running_loss/len(dataloader.dataset)
    accuracy = 100. * running_correct/len(dataloader.dataset)
    
    print(f"Train Loss: {loss:.4f}, Train Acc: {accuracy:.2f}")
    
    return loss, accuracy

In [None]:
#validation function
def validate(model, dataloader):
    print('Validating')
    model.eval()
    running_loss = 0.0
    running_correct = 0
    with torch.no_grad():
        for data in tqdm(dataloader):
            # labels are integer class indices, so they are compared directly
            data, target = data[0].to(device), data[1].to(device).long()
            outputs = model(data)
            loss = criterion(outputs, target)
            
            running_loss += loss.item()
            _, preds = torch.max(outputs.data, 1)
            running_correct += (preds == target).sum().item()
        
        loss = running_loss/len(dataloader.dataset)
        accuracy = 100. * running_correct/len(dataloader.dataset)
        print(f'Val Loss: {loss:.4f}, Val Acc: {accuracy:.2f}')
        
        return loss, accuracy

In [None]:
def test(model, dataloader):
    model.eval()  # disable dropout and use running batch-norm statistics
    correct = 0
    total = 0
    with torch.no_grad():
        for data in dataloader:
            inputs, target = data[0].to(device), data[1].to(device).long()
            outputs = model(inputs)
            _, predicted = torch.max(outputs.data, 1)
            total += target.size(0)
            correct += (predicted == target).sum().item()
    return correct, total

**11. Model Training**
---

In [None]:
train_loss , train_accuracy = [], []
val_loss , val_accuracy = [], []
print(f"Training on {len(train_data)} examples, validating on {len(val_data)} examples...")
start = time.time()
for epoch in range(epochs):
    print(f"Epoch {epoch+1} of {epochs}")
    train_epoch_loss, train_epoch_accuracy = fit(model, trainLoader)
    val_epoch_loss, val_epoch_accuracy = validate(model, valLoader)
    train_loss.append(train_epoch_loss)
    train_accuracy.append(train_epoch_accuracy)
    val_loss.append(val_epoch_loss)
    val_accuracy.append(val_epoch_accuracy)
end = time.time()
print((end-start)/60, 'minutes')

In [None]:
# accuracy plots
plt.figure(figsize=(10, 7))
plt.subplot(121)
plt.plot(train_accuracy, color='green', label='train accuracy')
plt.plot(val_accuracy, color='blue', label='validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
# plt.savefig('../outputs/plots/accuracy.png')
# loss plots
plt.subplot(122)
plt.plot(train_loss, color='orange', label='train loss')
plt.plot(val_loss, color='red', label='validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
# plt.savefig('../outputs/plots/loss.png')
plt.show()