#### Today we solve a problem related to computer vision

# Computer Vision:

Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. Using digital images from cameras and videos and deep learning models, machines can accurately identify and classify objects — and then react to what they “see.”


#### Libraries for computer vision

1) torch

2) torchvision

In [3]:
import torch
from torch import nn
from torchvision import datasets
from torchvision.transforms import ToTensor

import matplotlib.pyplot as plt
import numpy as np

# 1) Load Dataset (FashionMNIST)

To download it, we provide the following parameters:

root: str - which folder do you want to download the data to?
    
train: Bool - do you want the training or test split?

download: Bool - should the data be downloaded?

transform: torchvision.transforms - what transformations would you like to do on the data?

target_transform - you can transform the targets (labels) if you like too.

In [4]:
train = datasets.FashionMNIST(
    root="fashionmnist",
    train=True, 
    download=True, 
    transform=ToTensor(), 
    target_transform=None 
)


test = datasets.FashionMNIST(
    root="fashionmnist",
    train=False, 
    download=True,
    transform=ToTensor()
)

In [5]:
image, label = train[0]
image.shape, label

(torch.Size([1, 28, 28]), 9)

In [6]:
# Size of data

len(train), len(test)

(60000, 10000)

### Visualize data

In [None]:
plt.imshow(image.numpy().reshape(28,28))
plt.title(label);

# 2) Prepare Data Loader

In [7]:
from torch.utils.data import DataLoader


BATCH_SIZE = 32


train_dataloader = DataLoader(train,
    batch_size=BATCH_SIZE, 
    shuffle=True 
)

test_dataloader = DataLoader(test,
    batch_size=BATCH_SIZE,
    shuffle=False 
)


print(f"Dataloaders: {train_dataloader, test_dataloader}") 
print(f"Length of train dataloader: {len(train_dataloader)} batches of {BATCH_SIZE}")
print(f"Length of test dataloader: {len(test_dataloader)} batches of {BATCH_SIZE}")

Dataloaders: (<torch.utils.data.dataloader.DataLoader object at 0x00000214EBB69700>, <torch.utils.data.dataloader.DataLoader object at 0x00000214F41A8E80>)
Length of train dataloader: 1875 batches of 32
Length of test dataloader: 313 batches of 32


In [8]:
train_features_batch, train_labels_batch = next(iter(train_dataloader))
train_features_batch.shape, train_labels_batch.shape

(torch.Size([32, 1, 28, 28]), torch.Size([32]))

# 3) Train Model

In [9]:
class FashionMNISTModel0(nn.Module):
    def __init__(self,input_shape,hidden_units,output_shape):
        super().__init__()
        self.layer_stack = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=input_shape, out_features=hidden_units), # in_features = number of features in a data sample (784 pixels)
            nn.Linear(in_features=hidden_units, out_features=hidden_units),
            nn.Linear(in_features=hidden_units, out_features=output_shape)
        )
        
    def forward(self,data):
        return self.layer_stack(data)
    
    

In [10]:
class_names = train.classes

model_0 = FashionMNISTModel0(input_shape=784, # one for every pixel (28x28)
    hidden_units=10,
    output_shape=len(class_names) 
)

In [11]:
# Setup optimizer and loss function

loss_fn = nn.CrossEntropyLoss() 
optimizer = torch.optim.SGD(params=model_0.parameters(), lr=0.1)

In [12]:
# Calculate accuracy (a classification metric)
def accuracy_fn(y_true, y_pred):
    correct = torch.eq(y_true, y_pred).sum().item() 
    acc = (correct / len(y_pred)) * 100 
    return acc

In [13]:
# Now its time for training loop


epochs = 10
total_train_loss = []
total_test_loss = []
train_acc = []
test_acc = []


for epoch in range(epochs):
    
    train_batch_loss = 0
    train_batch_acc = 0
    
    for batch, (X,y) in enumerate(train_dataloader):
        model_0.train()
        
        #forward pass
        y_pred = model_0(X)
        
        #Calculate loss
        loss = loss_fn(y_pred,y)
        train_batch_loss += loss
        
        # Calculate accuracy
        train_batch_acc += accuracy_fn(y,y_pred.argmax(dim=1))
        
        
        # Optimizer zero grad
        
        optimizer.zero_grad()
        
        # Loss Backward
        
        loss.backward()
        
        # Update weights
        
        optimizer.step()
        
        if batch % 300==0:
            print(f"Train batch Loss: {train_batch_loss} and Accuracy: {train_batch_acc/len(train_dataloader)} ")
        
        
    total_train_loss.append(train_batch_loss/len(train_dataloader)) # Average the loss on batces
    train_acc.append(train_batch_acc/len(train_dataloader))
    
    model_0.eval()
    with torch.inference_mode():
        test_batch_loss = 0
        test_batch_acc = 0
        for X, y in test_dataloader:
            test_pred = model_0(X)
            
            test_batch_loss += loss_fn(test_pred,y)
            test_batch_acc += accuracy_fn(y,test_pred.argmax(dim=1))
        
        total_test_loss.append(test_batch_loss/len(test_dataloader))
        test_acc.append(test_batch_acc/len(test_dataloader))
        
    print('----------------------------------------------------------------------\n')
    print(f'Epoch {epoch} Loss: {train_batch_loss/len(train_dataloader)}, Accuracy: {train_batch_acc/len(train_dataloader)}, Val_loss: {test_batch_loss/len(test_dataloader)}, Val_Accuracy: {test_batch_acc/len(test_dataloader)}')
    print('----------------------------------------------------------------------\n')
        
        
            
        
            
            
            
            
    
    
    
    

Train batch Loss: 2.392350912094116 and Accuracy: 0.0033333333333333335 
Train batch Loss: 284.5870056152344 and Accuracy: 10.433333333333334 
Train batch Loss: 474.440673828125 and Accuracy: 22.768333333333334 
Train batch Loss: 645.0167236328125 and Accuracy: 35.45333333333333 
Train batch Loss: 809.6401977539062 and Accuracy: 48.31166666666667 
Train batch Loss: 971.3385620117188 and Accuracy: 61.24166666666667 
Train batch Loss: 1134.1739501953125 and Accuracy: 74.11 
----------------------------------------------------------------------

Epoch 0 Loss: 0.6263759732246399, Accuracy: 77.325, Val_loss: 0.5245091915130615, Val_Accuracy: 81.47963258785943
----------------------------------------------------------------------

Train batch Loss: 0.24139608442783356 and Accuracy: 0.04666666666666667 
Train batch Loss: 152.02468872070312 and Accuracy: 13.188333333333333 
Train batch Loss: 305.626708984375 and Accuracy: 26.226666666666667 
Train batch Loss: 454.1046142578125 and Accuracy: 39

## Now lets create a CNN based Model for classifying images

In [19]:
class FashionMNIST_CNN(nn.Module):
    def __init__(self,input_channel,hidden_units,output_shape):
        super().__init__()
        
        self.conv1 = nn.Conv2d(in_channels=input_channel,out_channels=hidden_units
                              ,kernel_size=3,stride=1,padding=1)
        
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2)
        
        self.conv2 = nn.Conv2d(in_channels=hidden_units, 
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=1)
        
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2)
        
        
        self.flatten = nn.Flatten()
        self.linear = nn.Linear(in_features=hidden_units*7*7,out_features=output_shape)
        
    def forward(self,X):
        
        X = self.conv1(X)
        X = self.relu1(X)
        X = self.pool1(X)
        X = self.conv2(X)
        X = self.relu2(X)
        X = self.pool2(X)
        X = self.flatten(X)
        X = self.linear(X)
        
        return X
    
    
        
        
        
        
        
        
        
        
        

In [20]:
# Setup optimizer and loss function

loss_fn = nn.CrossEntropyLoss() 
optimizer = torch.optim.SGD(params=model_0.parameters(), lr=0.1)



In [21]:
# here hidden_units also refereing to channels of output feature maps from conv2d
model_1 = FashionMNIST_CNN(input_channel=1,hidden_units=64,\
                          output_shape=len(class_names))



In [22]:
# Now its time for training loop


epochs = 10
total_train_loss = []
total_test_loss = []
train_acc = []
test_acc = []


for epoch in range(epochs):
    
    train_batch_loss = 0
    train_batch_acc = 0
    
    for batch, (X,y) in enumerate(train_dataloader):
        model_1.train()
        
        #forward pass
        y_pred = model_1(X)
        
        #Calculate loss
        loss = loss_fn(y_pred,y)
        train_batch_loss += loss
        
        # Calculate accuracy
        train_batch_acc += accuracy_fn(y,y_pred.argmax(dim=1))
        
        
        # Optimizer zero grad
        
        optimizer.zero_grad()
        
        # Loss Backward
        
        loss.backward()
        
        # Update weights
        
        optimizer.step()
        
        if batch % 300==0:
            print(f"Train batch Loss: {train_batch_loss} and Accuracy: {train_batch_acc/len(train_dataloader)} ")
        
        
    total_train_loss.append(train_batch_loss/len(train_dataloader)) # Average the loss on batces
    train_acc.append(train_batch_acc/len(train_dataloader))
    
    model_1.eval()
    with torch.inference_mode():
        test_batch_loss = 0
        test_batch_acc = 0
        for X, y in test_dataloader:
            test_pred = model_1(X)
            
            test_batch_loss += loss_fn(test_pred,y)
            test_batch_acc += accuracy_fn(y,test_pred.argmax(dim=1))
        
        total_test_loss.append(test_batch_loss/len(test_dataloader))
        test_acc.append(test_batch_acc/len(test_dataloader))
        
    print('----------------------------------------------------------------------\n')
    print(f'Epoch {epoch} Loss: {train_batch_loss/len(train_dataloader)}, Accuracy: {train_batch_acc/len(train_dataloader)}, Val_loss: {test_batch_loss/len(test_dataloader)}, Val_Accuracy: {test_batch_acc/len(test_dataloader)}')
    print('----------------------------------------------------------------------\n')
        
        
            
        
            
            
            
            
    
    
    
    

Train batch Loss: 2.327446460723877 and Accuracy: 0.0 
Train batch Loss: 695.8174438476562 and Accuracy: 1.4766666666666666 
Train batch Loss: 1389.3365478515625 and Accuracy: 3.13 
Train batch Loss: 2082.8076171875 and Accuracy: 4.748333333333333 
Train batch Loss: 2776.32373046875 and Accuracy: 6.283333333333333 
Train batch Loss: 3469.78955078125 and Accuracy: 7.85 
Train batch Loss: 4163.10546875 and Accuracy: 9.438333333333333 
----------------------------------------------------------------------

Epoch 0 Loss: 2.311635732650757, Accuracy: 9.788333333333334, Val_loss: 2.31148099899292, Val_Accuracy: 9.994009584664537
----------------------------------------------------------------------

Train batch Loss: 2.309089422225952 and Accuracy: 0.008333333333333333 
Train batch Loss: 695.858154296875 and Accuracy: 1.5366666666666666 
Train batch Loss: 1389.4171142578125 and Accuracy: 3.0366666666666666 
Train batch Loss: 2082.84375 and Accuracy: 4.628333333333333 
Train batch Loss: 2776.