**Data Preprocessing**
- Mount Google Drive
- Set up the Kaggle environment
- Download the data from Kaggle
- Unpack the data into a working folder

In [None]:
#Mounting Gdrive
from google.colab import drive
drive.mount('/content/drive')

In [1]:
#Setting kaggle environment
import os 
os.environ['KAGGLE_CONFIG_DIR'] = '/content/drive/MyDrive/kaggle'
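
The Kaggle CLI looks for an API token file named `kaggle.json` inside the directory that `KAGGLE_CONFIG_DIR` points to. Below is a minimal sketch of a check that reports a missing token before the download step runs. The helper name `kaggle_token_path` is hypothetical and not part of the Kaggle package.

```python
import os

def kaggle_token_path(config_dir):
    """Return the path to kaggle.json in config_dir, or None if it is missing."""
    token = os.path.join(config_dir, "kaggle.json")
    return token if os.path.isfile(token) else None

# Fall back to the folder used in this notebook if the variable is not set.
config_dir = os.environ.get("KAGGLE_CONFIG_DIR", "/content/drive/MyDrive/kaggle")
if kaggle_token_path(config_dir) is None:
    print("kaggle.json not found in", config_dir)
```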

In [2]:
#Downloading data
!kaggle datasets download -d gauravduttakiit/ants-bees

Downloading ants-bees.zip to /content
 98% 44.0M/45.1M [00:00<00:00, 61.7MB/s]
100% 45.1M/45.1M [00:00<00:00, 51.9MB/s]


In [3]:
#Unpack data
!mkdir ants-bees 
!mv ants-bees.zip ants-bees
%cd ants-bees/
!unzip ants-bees.zip

/content/ants-bees
Archive:  ants-bees.zip
  inflating: hymenoptera_data/train/ants/0013035.jpg  
  inflating: hymenoptera_data/train/ants/1030023514_aad5c608f9.jpg  
  inflating: hymenoptera_data/train/ants/1095476100_3906d8afde.jpg  
  inflating: hymenoptera_data/train/ants/1099452230_d1949d3250.jpg  
  inflating: hymenoptera_data/train/ants/116570827_e9c126745d.jpg  
  inflating: hymenoptera_data/train/ants/1225872729_6f0856588f.jpg  
  inflating: hymenoptera_data/train/ants/1262877379_64fcada201.jpg  
  inflating: hymenoptera_data/train/ants/1269756697_0bce92cdab.jpg  
  inflating: hymenoptera_data/train/ants/1286984635_5119e80de1.jpg  
  inflating: hymenoptera_data/train/ants/132478121_2a430adea2.jpg  
  inflating: hymenoptera_data/train/ants/1360291657_dc248c5eea.jpg  
  inflating: hymenoptera_data/train/ants/1368913450_e146e2fb6d.jpg  
  inflating: hymenoptera_data/train/ants/1473187633_63ccaacea6.jpg  
  inflating: hymenoptera_data/train/ants/148715752_302c84f5a4.jpg  
  inflat

**Transfer Learning for Classification** <br>
Transfer learning reuses what a model has already learned on a similar dataset. Our task is to classify images as ants or bees. Rather than building a custom CNN from scratch, we start from a well-trained model, ResNet18. ResNet is an award-winning architecture proposed by researchers at Microsoft, and the version used here is pretrained on the well-known ImageNet dataset. ResNet18 has 18 layers. For more details, see [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385).

Transfer learning is particularly helpful when little data is available: we can take what was learned from another dataset and reuse it in our model. There are two ways to do this.


1.   **Finetuning the network**
2.   **Use Network as a feature extractor**



In [4]:
#Importing dependencies
import time
import copy
import torch
import torchvision
import torch.nn as nn
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
import matplotlib.pyplot as plt

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Augmenting and normalizing the data helps the model generalize, especially when the data is limited.

In [5]:
#Transforms
data_transforms = {
    "train": transforms.Compose([
                                 transforms.RandomResizedCrop(224),
                                 transforms.RandomHorizontalFlip(),
                                 transforms.ToTensor(),
                                 transforms.Normalize([0.5,0.5,0.5],[0.2,0.2,0.2])
    ]),
    "val": transforms.Compose([
                              transforms.Resize(256),
                              transforms.CenterCrop(224),
                              transforms.ToTensor(),
                              transforms.Normalize([0.5,0.5,0.5],[0.2,0.2,0.2])
    ])
}
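
`transforms.Normalize(mean, std)` applies `(x - mean) / std` per channel to the tensor produced by `ToTensor()`, whose values lie in [0, 1]. Here is a small pure-Python sketch of that arithmetic for a single pixel value, using the mean 0.5 and std 0.2 from the cell above. The helper `normalize_value` is hypothetical and exists only for illustration.

```python
def normalize_value(x, mean, std):
    """Per-channel arithmetic used by Normalize: (x - mean) / std."""
    return (x - mean) / std

# ToTensor() scales pixels to [0, 1]; check both ends and the midpoint.
for x in (0.0, 0.5, 1.0):
    print(x, "->", round(normalize_value(x, 0.5, 0.2), 4))
```

With these settings the inputs span roughly [-2.5, 2.5]. The ImageNet-pretrained torchvision models were trained with mean [0.485, 0.456, 0.406] and std [0.229, 0.224, 0.225], so those values are a common alternative. Whether they would change the results here was not measured in this notebook.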

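The `train_model` function defined further below runs the model for a specified number of epochs. In each epoch the model trains on the training set and is then evaluated on the validation set. The weights from the epoch with the best validation accuracy are kept and loaded back into the model at the end.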

In [6]:
#Data directory path
data_dir = '/content/ants-bees/hymenoptera_data'

#Data reading
datasets = {d: ImageFolder(os.path.join(data_dir,d), 
                           transform = data_transforms[d]) for d in ['train','val']} 

#Data loading
dataloaders = {d: DataLoader(datasets[d],
                             batch_size = 4, 
                             shuffle = True, 
                             num_workers = 2) for d in ['train','val']}
#Dataset lengths
dataset_sizes = {d: len(datasets[d]) for d in ['train','val']}

#Classes
classes = datasets['train'].classes
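
`ImageFolder` derives labels from the sub-folder names: it sorts the class directories by name and assigns indices in that order, so folders named `ants` and `bees` would get labels 0 and 1. Below is a minimal standard-library sketch of that mapping, run on a throwaway directory. The helper `find_class_indices` is a hypothetical stand-in, not torchvision's implementation.

```python
import os
import tempfile

def find_class_indices(root):
    """Map each sub-directory of root to an index, in sorted name order."""
    classes = sorted(e.name for e in os.scandir(root) if e.is_dir())
    return {name: idx for idx, name in enumerate(classes)}

with tempfile.TemporaryDirectory() as root:
    for name in ("bees", "ants"):  # created out of order on purpose
        os.makedirs(os.path.join(root, name))
    mapping = find_class_indices(root)

print(mapping)
```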


In [8]:
def train_model(model, criterion, optimizer, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history if only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            #if phase == 'train':
             #   scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:.4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model
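
Inside `train_model`, `loss.item()` is the mean loss over one batch (the default reduction of `nn.CrossEntropyLoss`), so it is multiplied by the batch size before being accumulated. Dividing by the dataset size then gives a per-sample average that stays correct even when the last batch is smaller. Here is a pure-Python sketch with invented batch values; the numbers are illustrative and not taken from the runs in this notebook.

```python
# (mean batch loss, correct predictions, batch size); values are invented.
batches = [(0.50, 3, 4), (0.30, 4, 4), (0.80, 1, 2)]

running_loss = 0.0
running_corrects = 0
for mean_loss, corrects, batch_size in batches:
    running_loss += mean_loss * batch_size  # undo the per-batch mean
    running_corrects += corrects

dataset_size = sum(size for _, _, size in batches)
epoch_loss = running_loss / dataset_size
epoch_acc = running_corrects / dataset_size
print(f"Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}")
```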

**1. Finetune the network**
- Reusing a well-known architecture saves us from designing an entire CNN ourselves.
- Starting from pretrained weights, we finetune them for our use case instead of training from random weights, which is less resource-intensive.

A pretrained model's output layer may not match our use case (ResNet18 pretrained on ImageNet predicts 1000 classes, while we need 2), so we modify the last layer to fit it.
Here, we replace the final fully connected layer with two Linear layers.
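
Two `nn.Linear` layers stacked without an activation compose into a single linear map, so a head built this way has the same expressive power as one `Linear(512, 2)`. The sketch below checks this numerically and then shows one possible variant with a ReLU in between. The ReLU variant is an assumption for illustration, not what this notebook trains. The input width 512 matches `fc.in_features` of ResNet18.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
head = nn.Sequential(nn.Linear(512, 128), nn.Linear(128, 2))

# Fold the two layers into one: W = W2 @ W1, b = W2 @ b1 + b2.
first, second = head[0], head[1]
merged = nn.Linear(512, 2)
with torch.no_grad():
    merged.weight.copy_(second.weight @ first.weight)
    merged.bias.copy_(second.weight @ first.bias + second.bias)

x = torch.randn(4, 512)
with torch.no_grad():
    same = torch.allclose(head(x), merged(x), atol=1e-4)
print("two stacked Linear layers equal one Linear layer:", same)

# Hypothetical variant: a non-linearity makes the extra layer meaningful.
head_relu = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 2))
```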

In [7]:
#Model building
model = torchvision.models.resnet18(pretrained=True)
input_ftrs = model.fc.in_features
model.fc = nn.Sequential(
    nn.Linear(input_ftrs,128),
    nn.Linear(128,2)
)
model = model.to(device)

#Loss function
criterion = nn.CrossEntropyLoss()

#Optimizer 
optimizer = torch.optim.SGD(model.parameters(), lr = 0.001, momentum = 0.9)

Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /root/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth


  0%|          | 0.00/44.7M [00:00<?, ?B/s]

In [9]:
#Model Running
model = train_model(model,criterion,optimizer)

Epoch 0/24
----------
train Loss: 0.5574 Acc: 0.6967
val Loss: 0.3145 Acc: 0.8562

Epoch 1/24
----------
train Loss: 0.3768 Acc: 0.8361
val Loss: 0.1950 Acc: 0.9412

Epoch 2/24
----------
train Loss: 0.4535 Acc: 0.7951
val Loss: 0.2704 Acc: 0.8954

Epoch 3/24
----------
train Loss: 0.3670 Acc: 0.8648
val Loss: 0.2674 Acc: 0.8889

Epoch 4/24
----------
train Loss: 0.3567 Acc: 0.8402
val Loss: 0.2051 Acc: 0.9216

Epoch 5/24
----------
train Loss: 0.2949 Acc: 0.8770
val Loss: 0.2279 Acc: 0.9216

Epoch 6/24
----------
train Loss: 0.3765 Acc: 0.8279
val Loss: 0.2285 Acc: 0.9085

Epoch 7/24
----------
train Loss: 0.3906 Acc: 0.8115
val Loss: 0.2290 Acc: 0.9346

Epoch 8/24
----------
train Loss: 0.4058 Acc: 0.8074
val Loss: 0.2106 Acc: 0.9412

Epoch 9/24
----------
train Loss: 0.2702 Acc: 0.8770
val Loss: 0.2236 Acc: 0.9281

Epoch 10/24
----------
train Loss: 0.2892 Acc: 0.8852
val Loss: 0.1810 Acc: 0.9216

Epoch 11/24
----------
train Loss: 0.2529 Acc: 0.9016
val Loss: 0.2066 Acc: 0.9216

Ep

**2. Transfer Learning as feature extractor** <br>
Rather than training the whole network, we train only the new final layers and keep the pretrained weights unchanged. In other words, we trust the pretrained weights in the earlier layers as good feature estimators. This reduces resource usage even further. The technique is especially useful when our dataset is similar to the dataset the model was pretrained on.
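
A quick way to confirm that only the new head will be updated is to count the parameters with `requires_grad=True` after freezing. The sketch below uses a tiny stand-in network instead of ResNet18 so that it runs without downloading weights. The layer sizes and the names `backbone` and `net` are hypothetical.

```python
import torch.nn as nn

# Tiny stand-in for a pretrained network: a "backbone" plus a final layer.
backbone = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
net = nn.Sequential(backbone, nn.Linear(8, 4))

# Freeze everything, mirroring the loop over parameters() used for freezing.
for param in net.parameters():
    param.requires_grad = False

# Replace the final layer; freshly created layers are trainable by default.
net[1] = nn.Linear(8, 2)

trainable = sum(p.numel() for p in net.parameters() if p.requires_grad)
total = sum(p.numel() for p in net.parameters())
print(f"trainable parameters: {trainable} of {total}")
```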

In [10]:
model_fex = torchvision.models.resnet18(pretrained= True)

#Freezing all the parameters
for param in model_fex.parameters():
  param.requires_grad = False

#Replacing the fully connected layer with new layers to train
input_ftr = model_fex.fc.in_features
model_fex.fc = nn.Sequential(
    nn.Linear(input_ftr,128),
    nn.Linear(128,2)
)
model_fex = model_fex.to(device)

#Loss function
criterion_fex = nn.CrossEntropyLoss()

#Optimizer (only the new layers have trainable parameters)
optimizer_fex = torch.optim.SGD(model_fex.fc.parameters(), lr = 0.001, momentum = 0.9)

In [11]:
#Model Running
model_fex = train_model(model_fex,criterion_fex,optimizer_fex)

Epoch 0/24
----------
train Loss: 0.6369 Acc: 0.6393
val Loss: 0.3039 Acc: 0.9216

Epoch 1/24
----------
train Loss: 0.4687 Acc: 0.7705
val Loss: 0.2480 Acc: 0.9216

Epoch 2/24
----------
train Loss: 0.3918 Acc: 0.8238
val Loss: 0.1930 Acc: 0.9346

Epoch 3/24
----------
train Loss: 0.4143 Acc: 0.8074
val Loss: 0.1882 Acc: 0.9281

Epoch 4/24
----------
train Loss: 0.4679 Acc: 0.7951
val Loss: 0.2297 Acc: 0.9150

Epoch 5/24
----------
train Loss: 0.3600 Acc: 0.8443
val Loss: 0.1977 Acc: 0.9346

Epoch 6/24
----------
train Loss: 0.3288 Acc: 0.8361
val Loss: 0.1662 Acc: 0.9542

Epoch 7/24
----------
train Loss: 0.3284 Acc: 0.8525
val Loss: 0.1577 Acc: 0.9542

Epoch 8/24
----------
train Loss: 0.3680 Acc: 0.8279
val Loss: 0.1518 Acc: 0.9608

Epoch 9/24
----------
train Loss: 0.4315 Acc: 0.8156
val Loss: 0.1661 Acc: 0.9477

Epoch 10/24
----------
train Loss: 0.3320 Acc: 0.8689
val Loss: 0.2304 Acc: 0.9216

Epoch 11/24
----------
train Loss: 0.3981 Acc: 0.8402
val Loss: 0.1779 Acc: 0.9477

Ep