<a href="https://colab.research.google.com/github/aryangoyal7/Deep-Learning-Hello-Foss/blob/main/ResNett9.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Import the necessary libraries

In [None]:
import numpy as np
import torch
import torch.nn as nn
from torchvision import datasets
from torchvision import transforms
from torch.utils.data.sampler import SubsetRandomSampler
from torchvision.datasets import ImageFolder


# Device configuration
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Let's import the data. There are multiple ways of doing it; here we use the `opendatasets` package.

In [None]:
!pip install opendatasets --upgrade --quiet
import opendatasets as od

In [None]:
dataset_url = 'https://www.kaggle.com/datasets/aneesh10/cricket-shot-dataset'



In [None]:
od.download(dataset_url)


Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: aryanx07
Your Kaggle Key: ··········
Downloading cricket-shot-dataset.zip to ./cricket-shot-dataset


100%|██████████| 645M/645M [00:17<00:00, 38.9MB/s]





In [None]:
data_dir = './cricket-shot-dataset/data'
import os

In [None]:
os.listdir(data_dir)

['legglance-flick', 'drive', 'pullshot', 'sweep']

So we have 4 classes of shots, and the task is to classify each image as one of them.

In [None]:
dataset = ImageFolder(data_dir)

In [None]:
len(dataset)

4724

Let's apply the necessary transformations to the dataset images before feeding them to our model
<br>
-> Normalization changes the range of pixel intensity values: each channel has a mean subtracted and is then divided by a standard deviation

In [None]:
import torchvision.transforms as tt

# Normalization. The mean/std values below are placeholders: compute the
# per-channel mean and standard deviation of the dataset and substitute them
# here for better results.
normalize = tt.Normalize(
        mean=[0.5, 0.5, 0.5],
        std=[0.2, 0.2, 0.2],
)
# Resize, crop, and convert to a tensor.
# Note: `normalize` is defined above but is not part of this pipeline;
# append it after tt.ToTensor() to apply it.
dataset = ImageFolder(data_dir, tt.Compose([tt.Resize(64), 
                                            tt.RandomCrop(64), 
                                            tt.ToTensor()]))
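
To make the effect of normalization concrete, here is a minimal sketch in plain Python of the per-channel arithmetic that `tt.Normalize` performs, `(value - mean) / std`. The mean and std values are the ones from the cell above; the helper name `normalize_pixel` is made up for illustration. Since `tt.ToTensor()` scales pixels to the range 0 to 1, these values map that range to roughly -2.5 to 2.5.

```python
MEAN = [0.5, 0.5, 0.5]  # per-channel means from the notebook's tt.Normalize call
STD = [0.2, 0.2, 0.2]   # per-channel standard deviations from the same call


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]


black = normalize_pixel([0.0, 0.0, 0.0])     # lowest possible input values
mid_gray = normalize_pixel([0.5, 0.5, 0.5])  # equal to the mean, so maps to 0
white = normalize_pixel([1.0, 1.0, 1.0])     # highest possible input values

print("black   ->", black)
print("midgray ->", mid_gray)
print("white   ->", white)
```

With statistics measured on the actual dataset instead of these placeholder values, each channel would end up with approximately zero mean and unit variance.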

We'll split the dataset into two parts: training and validation

In [None]:
val_pct = 0.1
val_size = int(val_pct * len(dataset))
train_size = len(dataset) - val_size

train_size, val_size

(4252, 472)

In [None]:
from torch.utils.data import random_split

train_ds, valid_ds = random_split(dataset, [train_size, val_size])
len(train_ds), len(valid_ds)

(4252, 472)
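
`random_split` shuffles the indices randomly, so the validation set changes on every run. If a repeatable split is wanted, one option is to pass a seeded generator. The sketch below uses `range(4724)` as a stand-in for the image dataset; the helper name `seeded_split` and the seed value 42 are arbitrary choices for illustration, not part of the original notebook.

```python
import torch
from torch.utils.data import random_split


def seeded_split(dataset, val_pct=0.1, seed=42):
    """Split a dataset into train/validation subsets reproducibly."""
    val_size = int(val_pct * len(dataset))
    train_size = len(dataset) - val_size
    # A generator with a fixed seed makes the shuffle identical across runs
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [train_size, val_size], generator=generator)


# range(4724) stands in for the 4724-image dataset
train_a, valid_a = seeded_split(range(4724))
train_b, valid_b = seeded_split(range(4724))

print(len(train_a), len(valid_a))
print(valid_a.indices == valid_b.indices)  # same seed, same validation indices
```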

We'll load the datasets in batches. But what is __batch size__?
<br>
The batch size is the number of samples propagated through the network in one step.
<br>
For example, with a dataset of 100 images and a batch size of 10, the network processes the first 10 images and updates its weights, then repeats the process for each of the remaining batches (10 batches in total).
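
The batch arithmetic can be sketched in a few lines of plain Python. The helper names below are made up for illustration; `drop_last` mirrors the `DataLoader` option of the same name. The second call uses the training split of 4252 images and the batch size of 128 used in this notebook, which gives the 34 steps per epoch that appear in the training log.

```python
import math


def batches_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of batches a loader yields in one pass over the data."""
    if drop_last:
        # An incomplete final batch is discarded
        return num_samples // batch_size
    # Otherwise the final batch may be smaller than batch_size
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples, batch_size):
    """Size of the final batch when incomplete batches are kept."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


print(batches_per_epoch(100, 10))    # 10
print(batches_per_epoch(4252, 128))  # 34
print(last_batch_size(4252, 128))    # 28
```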

In [None]:
from torch.utils.data import DataLoader

batch_size = 128

train_dl = DataLoader(train_ds, 
                      batch_size, 
                      shuffle=True, 
                      num_workers=4, 
                      pin_memory=True)

valid_dl = DataLoader(valid_ds, 
                    batch_size, 
                    num_workers=4, 
                    pin_memory=True)



Here is our model. We are using a standard ResNet9 architecture; read more about it here: [link]

In [None]:
def conv_block(in_channels, out_channels, pool=False):
    layers = [nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1), 
              nn.BatchNorm2d(out_channels), 
              nn.ReLU(inplace=True)]
    if pool: layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

class ResNet9(nn.Module):
    def __init__(self, in_channels, num_classes):
        super().__init__()
        
        self.conv1 = conv_block(in_channels, 64) 
        self.conv2 = conv_block(64, 128, pool=True) 
        self.res1 = nn.Sequential(conv_block(128, 128), 
                                  conv_block(128, 128))
        
        self.conv3 = conv_block(128, 256, pool=True) 
        self.conv4 = conv_block(256, 512, pool=True) 
        self.res2 = nn.Sequential(conv_block(512, 512),  
                                  conv_block(512, 512)) 
        
        self.classifier = nn.Sequential(nn.AdaptiveMaxPool2d(1), 
                                        nn.Flatten(), 
                                        nn.Dropout(0.2),
                                        nn.Linear(512, num_classes))
        
    def forward(self, xb):
        out = self.conv1(xb)
        out = self.conv2(out)
        out = self.res1(out) + out
        out = self.conv3(out)
        out = self.conv4(out)
        out = self.res2(out) + out
        out = self.classifier(out)
        return out
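
To see why the residual additions in `forward` are valid, it can help to trace the tensor shapes through the network. The sketch below does this in plain Python, assuming 64x64 RGB inputs as produced by the transforms earlier; the helper names are made up for illustration. A 3x3 convolution with padding 1 keeps the height and width, and `MaxPool2d(2)` halves them.

```python
def conv3x3_same(size):
    """Output size of a 3x3 convolution with stride 1 and padding 1."""
    kernel, stride, padding = 3, 1, 1
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet9(image_size=64, in_channels=3):
    """Return (stage, channels, height/width) after each conv stage."""
    # (name, output channels, whether the stage ends with MaxPool2d(2))
    stages = [
        ("conv1", 64, False),
        ("conv2", 128, True),
        ("res1", 128, False),  # two conv blocks; output shape equals input shape
        ("conv3", 256, True),
        ("conv4", 512, True),
        ("res2", 512, False),  # two conv blocks; output shape equals input shape
    ]
    size = image_size
    shapes = [("input", in_channels, size)]
    for name, channels, pool in stages:
        size = conv3x3_same(size)
        if pool:
            size = size // 2  # MaxPool2d(2) halves height and width
        shapes.append((name, channels, size))
    return shapes


for name, channels, size in trace_resnet9():
    print(f"{name:<6} channels={channels:<4} size={size}x{size}")
# The classifier then pools each of the 512 maps to a single value,
# flattens them into a 512-vector, and maps that to num_classes scores.
```

Because `res1` and `res2` keep both the channel count and the spatial size, `self.res1(out) + out` and `self.res2(out) + out` add tensors of identical shape.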

Here are some of the hyperparameters of our model. They strongly influence the accuracy of the model, so try changing them to improve it.

In [None]:
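num_classes = 4
num_epochs = 30
# The DataLoaders above were already built with batch_size = 128, and
# reassigning batch_size here does not change them (the log shows 34 steps
# per epoch, i.e. 4252 training images / 128 per batch, rounded up).
batch_size = 128
learning_rate = 0.005
in_channels = 3
model = ResNet9(in_channels, num_classes).to(device)


# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate, weight_decay=0.005, momentum=0.9)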
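
To show how the learning rate, momentum, and weight decay interact, here is a minimal sketch of one SGD update on a single scalar parameter. It follows the update rule as I understand it from the `torch.optim.SGD` documentation (default dampening, no Nesterov momentum); the function name `sgd_step` and the example gradient of 0.5 are made up for illustration.

```python
def sgd_step(param, grad, velocity, lr=0.005, momentum=0.9, weight_decay=0.005):
    """One SGD update with momentum and L2 weight decay on a scalar."""
    # Weight decay adds a pull toward zero proportional to the parameter
    grad = grad + weight_decay * param
    # The velocity buffer starts as the first gradient, then accumulates
    velocity = grad if velocity is None else momentum * velocity + grad
    # Step against the accumulated direction, scaled by the learning rate
    param = param - lr * velocity
    return param, velocity


param, velocity = 1.0, None
for step in range(2):
    param, velocity = sgd_step(param, grad=0.5, velocity=velocity)
    print(f"step {step + 1}: param={param:.6f} velocity={velocity:.6f}")
```

With a constant gradient the velocity grows from step to step, which is why momentum speeds up movement along a consistent direction.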

In [None]:
total_step = len(train_dl)

for epoch in range(num_epochs):
    model.train()  # enable dropout and batch-norm statistic updates
    for i, (images, labels) in enumerate(train_dl):  
        # Move tensors to the configured device
        images = images.to(device)
        labels = labels.to(device)
        
        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)
        
        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Report the loss of the last batch in this epoch
    print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'
          .format(epoch+1, num_epochs, i+1, total_step, loss.item()))
            
    # Validation
    model.eval()  # use running batch-norm statistics and disable dropout
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in valid_dl:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    
        # `total` is the number of validation images, not the full dataset size
        print('Accuracy of the network on the {} validation images: {} %'.format(total, 100 * correct / total)) 
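
To put the logged loss and accuracy values in context, the sketch below computes both quantities by hand in plain Python. With 4 classes, a model that gives every class the same score has a cross-entropy of ln 4, about 1.386, and would score about 25% accuracy if the classes were balanced (an assumption; per-class counts are not shown in this notebook). The helper names and the example scores are made up for illustration.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one sample: -log(softmax(logits)[label])."""
    # Subtract the max before exponentiating for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def accuracy(batch_logits, labels):
    """Percentage of samples whose highest-scoring class equals the label."""
    correct = 0
    for logits, label in zip(batch_logits, labels):
        predicted = max(range(len(logits)), key=lambda k: logits[k])
        correct += int(predicted == label)
    return 100 * correct / len(labels)


# Equal scores for all 4 classes: the "knows nothing" baseline
uniform_loss = cross_entropy([0.0, 0.0, 0.0, 0.0], label=2)
print(f"uniform loss: {uniform_loss:.4f}")

# Four hypothetical samples; the third one is misclassified
example_logits = [[2.0, 0.0, 0.0, 0.0],
                  [0.0, 3.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0, 0.0]]
example_labels = [0, 1, 2, 0]
print(f"accuracy: {accuracy(example_logits, example_labels)} %")
```

Against this baseline, a validation accuracy in the mid-20s during the first epochs is close to chance level.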



Epoch [1/30], Step [34/34], Loss: 1.6213
Accuracy of the network on the 4724 validation images: 26.483050847457626 %
Epoch [2/30], Step [34/34], Loss: 2.3673
Accuracy of the network on the 4724 validation images: 27.33050847457627 %
Epoch [3/30], Step [34/34], Loss: 1.1373
Accuracy of the network on the 4724 validation images: 30.93220338983051 %
Epoch [4/30], Step [34/34], Loss: 1.1467
Accuracy of the network on the 4724 validation images: 47.03389830508475 %
Epoch [5/30], Step [34/34], Loss: 1.0356
Accuracy of the network on the 4724 validation images: 42.79661016949152 %
Epoch [6/30], Step [34/34], Loss: 1.5511
Accuracy of the network on the 4724 validation images: 37.5 %
Epoch [7/30], Step [34/34], Loss: 1.0742
Accuracy of the network on the 4724 validation images: 57.41525423728814 %
Epoch [8/30], Step [34/34], Loss: 0.8544
Accuracy of the network on the 4724 validation images: 54.23728813559322 %
Epoch [9/30], Step [34/34], Loss: 0.8991
Accuracy of the network on the 4724 validat