## Scalable Data Mining (Autumn 2021)
## Assignment 2 
Name : <b>Varun Gupta</b> <br>
Roll Number : <b>18MA20050</b> <br>

## Question 1:
<b>Task: </b> The aim of this assignment is to train and test Convolutional Neural Networks for image classification on the CIFAR10 dataset using PyTorch, and to get acquainted with wandb, a tool for monitoring large experiments.

## Importing the required libraries
Here, I use the provided code for importing the libraries, and add imports for the extra libraries used to build the model and the supporting functions.

In [1]:
## all the torch libraries for Deep Learning Computation, and required functions, like loss and optimizers
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torch.backends.cudnn as cudnn

## torchvision provides with the saved dataset and models
import torchvision
import torchvision.transforms as transforms

## required for other computation tasks for the project
import os
import argparse
import numpy as np

## importing the required wandb library for logging all the required information
import wandb

## this just prevents wandb from printing its information, to decrease the clutter in the output 
%env WANDB_SILENT=true

## checking if cuda is available, for faster computation
device = 'cuda' if torch.cuda.is_available() else 'cpu'

## global var saves the best loss for each one of the models
best_loss = float('inf')

## logging into wandb account
wandb.login()

ModuleNotFoundError: No module named 'torch'

## Data:
- train_images: Consists of 50000 RGB images of size 32 x 32.
- train_labels: Consists of 50000 labels from 10 classes for the images in train_images. The labels are described below.
- test_images: Consists of 10000 RGB images of size 32 x 32.
- test_labels: Consists of 10000 labels from 10 classes for the images in test_images.
<br>
<b>Labels: </b>Each training and test image is classified into <b>ANY ONE</b> of the following labels:<br>


| Labels | Labels
| :- | :-|
| 0 - Airplane | 1 - Automobile
| 2 - Bird | 3 - Cat
| 4 - Deer | 5 - Dog
| 6 - Frog | 7 - Horse
| 8 - Ship | 9 - Truck

### 1. Data Load:
Use the file <b>main.py</b> given in the [link](https://github.com/SoumiDas/CS60021_A2021) to load the data and carry on further experiments. The code here loads both the train and test splits of CIFAR10 through torchvision and applies the transformations required before the data is fed into the ResNet18 model.<br>
The `classes` tuple contains the class names, indexed by their label values.

In [2]:
# Data Load: 
# Just used the main.py code given to download and transform the CIFAR10 dataset as required
# for training the ResNet18 Model
print('Data transformation')

## transformation for the training set
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

## transformation for the test set
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

## loading the training set (download=False assumes it is already present in ./data)
trainset = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=False, transform=transform_train)

## using a DataLoader so that we can iterate through mini-batches of data while training
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=128, shuffle=True, num_workers=2)

## loading the test set (again assumed to be already present in ./data)
testset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=False, transform=transform_test)

## similarly defining the testloader
testloader = torch.utils.data.DataLoader(
    testset, batch_size=100, shuffle=False, num_workers=2)

## class names
classes = ('Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
           'Dog', 'Frog', 'Horse', 'Ship', 'Truck')

Data transformation
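
The `Normalize` transform in the cell above standardizes each channel as `(x - mean) / std`, after `ToTensor` has scaled the pixel values to [0, 1]. A minimal pure-Python sketch of that arithmetic for a single pixel follows. The pixel value is made up for illustration, and the helper name `normalize_pixel` is hypothetical, not part of torchvision.

```python
# Per-channel statistics copied from the transforms in the cell above.
MEAN = (0.4914, 0.4822, 0.4465)
STD = (0.2023, 0.1994, 0.2010)

def normalize_pixel(rgb_uint8):
    """Mimic ToTensor (scale to [0, 1]) followed by Normalize for one RGB pixel."""
    scaled = [v / 255.0 for v in rgb_uint8]
    return [(x - m) / s for x, m, s in zip(scaled, MEAN, STD)]

# Hypothetical mid-grey pixel, used only to illustrate the arithmetic.
pixel = (128, 128, 128)
normalized = normalize_pixel(pixel)
print([round(v, 3) for v in normalized])
```

A pixel whose channels equal the dataset means maps to approximately zero in every channel, which is the point of the transform.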


### 2. Model:
Use the pretrained Resnet18 model to train your Convolutional Neural Network in the following ways.<br>
a) <b>Train all the layers.</b><br>
b) <b>Freeze the other layers and finetune only the last layer.</b><br><br>
Here, I initialize the four models required by the question. For two of the models, the parameters in all layers are frozen. For all four models, the final FC layer is replaced so that it outputs 10 classes instead of 1000; because the replacement layer is newly created, its parameters are trainable by default, even in the frozen models. Finally, all the models are moved to the defined device so that computation is faster if CUDA is available.<br><br> 
<b>Question: </b>You may get differences in accuracy in the above two methods. Describe what is the reason for the same.<br>
<b>Solution: </b><br>

In [3]:
# Model
print('Model creation')

# defining all the required models 
net1_SGD = torchvision.models.resnet18(pretrained = True)
net1_Adam = torchvision.models.resnet18(pretrained = True)
net2_SGD = torchvision.models.resnet18(pretrained = True)
net2_Adam = torchvision.models.resnet18(pretrained = True)

# now, for two of the models, freezing all the layers
for param in net2_SGD.parameters():
    param.requires_grad = False
    
for param in net2_Adam.parameters():
    param.requires_grad = False

## number of input features in the last FC layer
num_ftrs = net1_SGD.fc.in_features

## for all models replacing the last layers number of output classes
net1_SGD.fc = nn.Linear(num_ftrs,10)
net1_Adam.fc = nn.Linear(num_ftrs,10)
net2_SGD.fc = nn.Linear(num_ftrs,10)
net2_Adam.fc = nn.Linear(num_ftrs,10)

# using the models in the respective device, gives advantage if CUDA is available
net1_SGD = net1_SGD.to(device)
net1_Adam = net1_Adam.to(device)
net2_SGD = net2_SGD.to(device)
net2_Adam = net2_Adam.to(device)

Model creation
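
To give a sense of how little is trained in the frozen models: the replacement layer `nn.Linear(num_ftrs, 10)` holds one weight matrix and one bias vector. The sketch below counts those parameters in plain Python. It assumes `num_ftrs` is 512, the input width of the final layer in torchvision's ResNet-18, and the helper name is hypothetical.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Number of learnable scalars in a fully connected layer."""
    weights = in_features * out_features
    biases = out_features if bias else 0
    return weights + biases

# Assumption: resnet18.fc.in_features == 512 (torchvision's ResNet-18).
num_ftrs = 512
trainable_in_frozen_model = linear_param_count(num_ftrs, 10)
print(trainable_in_frozen_model)  # 5130
```

In the frozen models only these parameters receive gradient updates, whereas the unfrozen models also update every earlier layer.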


### 3. Training Module:
Implement a mini-batch SGD using <b>main.py</b> to train the CNN in both the ways described above. Use the following configurations while training:<br>
- Use SGD optimizer with learning rate = 0.001, momentum = 0.9 and cross-entropy as the loss function.
- Use Adam optimizer with learning rate = 0.01 and cross-entropy as the loss function.
<br>
You can use early stopping too if loss converges beforehand.<br><br>
For each of the two configurations above under each way (described under Model), retain/save the best model (yielding best test set accuracy while testing at each epoch).<br>
<span style="color:blue">Use the best saved models to report the final test set accuracies for all the four combinations.</span><br><br>
<b>Working:</b><br>First, I define the optimizers for all four models. For the two frozen models, the optimizer is explicitly given only the last FC layer's parameters. Then, I define the train and test functions, which run the model over every mini-batch of the dataset for one epoch.

In [4]:
## defining the optimizers: suffix 1 is for the unfrozen models, 2 for the frozen models, and _SGD/_Adam names the optimizer
optimizer1_SGD = optim.SGD(net1_SGD.parameters(),lr = 0.001,momentum=0.9)
optimizer1_Adam = optim.Adam(net1_Adam.parameters(),lr = 0.01)
optimizer2_SGD = optim.SGD(net2_SGD.fc.parameters(),lr = 0.001,momentum=0.9)
optimizer2_Adam = optim.Adam(net2_Adam.fc.parameters(),lr = 0.01)

## defining the required cross entropy loss function
criterion = nn.CrossEntropyLoss()

# required function for Training the model
## inputs -> model and optimizer for that model 
## outputs -> loss, and accuracy for each epoch for that model

def train(model,optimizer):
    print('\nEpoch {}/{}'.format(epoch+1, 200))
    print('------------------------------------')
    model.train()
    batch_loss = 0.0             ## adds loss for each mini-batch for that epoch
    correct = 0                  ## stores all the correct predictions for that epoch
    
    ## moving through batches
    for batch_idx, (inputs, targets) in enumerate(trainloader): 
        inputs, targets = inputs.to(device), targets.to(device)
        
        ## zeroing the gradients for that mini batch
        optimizer.zero_grad()
        
        ## outputs from the model
        outputs = model(inputs)
        
        ## evaluate the loss
        loss = criterion(outputs,targets)
        
        ## backpropagate the loss
        loss.backward()
        
        ## take the gradient step
        optimizer.step()
        
        ## adding the loss for that mini batch
        batch_loss += loss.item()
        
        ## evaluate the prediction classes for that mini batch
        _,pred = torch.max(outputs,dim = 1)
        
        ## summing the correct predictions 
        correct += torch.sum(pred == targets).item()
        
    ## printing the training loss and accuracy for each epoch
    print(f'Training Loss: {batch_loss/len(trainloader.dataset):.4f}, Training Accuracy: {(100*correct/len(trainloader.dataset)):.4f}')
    loss = batch_loss/len(trainloader.dataset)
    accuracy = 100*correct/len(trainloader.dataset)
    
    ## returning the epoch loss and accuracy
    return (loss,accuracy)
    
## inputs -> 
## model to evaluate

## early_stop is a mutable list [stop_flag, count]: count is the number of consecutive epochs
## without a better model, and stop_flag becomes True once the early-stopping condition is met
## no_improve is the number of epochs without improvement that is tolerated before stopping

## model_name and optim_name are strings identifying the model, used in the filename under which
## the best model is saved in the current directory

## outputs -> loss, accuracy, and some prediction_samples for each epoch for that model

# required function for Testing the model
def test(model,early_stop,model_name,optim_name,no_improve = 10):
    
    ## using the global best_loss to track the best model on the test set
    global best_loss
    
    ## switching to evaluation mode, so that layers such as BatchNorm use their inference behaviour
    model.eval()
    
    ## required variables
    batch_loss = 0             ## stores the total loss for that epoch
    correct_t = 0              ## number of correct predictions for that epoch
    prediction_samples = []    ## saves some prediction samples for each mini-batch
    predictions = []           ## predictions list for that epoch, to be used for confusion matrix for best model
    targetlist = []            ## target list for that epoch, to be used for confusion matrix for best model
    
    ## disabling gradient tracking, since no backward pass is needed during testing
    with torch.no_grad():
        
        ## iterating through batches
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.to(device), targets.to(device)
            
            ## taking outputs
            outputs = model(inputs)
            
            ## adding the loss to total loss
            batch_loss += criterion(outputs,targets).item()
            
            ## evaluating the predictions 
            _,pred_t = torch.max(outputs,dim = 1)
            
            ## summing the correct predictions 
            correct_t += torch.sum(pred_t == targets).item()
            
            ## adding the predictions and targets in their list
            predictions.extend(pred_t)
            targetlist.extend(targets)
            
            ## list for sample of predictions samples
            prediction_samples.append(wandb.Image(inputs[0],caption = "Prediction: {} Truth: {}".format(pred_t[0].item(),targets[0].item())))
        ## batch loss and accuracy
        loss = batch_loss/len(testloader.dataset)
        curr_acc = 100*correct_t/len(testloader.dataset)
        print(f'Testing Loss: {loss:.4f}, Testing Accuracy: {curr_acc:.4f}')
        
        # Save a checkpoint whenever the test loss improves
        ## checking if the current loss is lower than the best loss so far
        if loss < best_loss:
            
            ## since the elements would be tensors, hence taking their values only
            predictions = [x.item() for x in predictions]
            targetlist = [x.item() for x in targetlist]
            
            early_stop[1] = 0      ## a better model was found, so the no-improvement count resets to 0
            best_loss = loss    ## updating the best loss
            
            ## saving this model with all the required information
            torch.save({
            'model_state_dict': model.state_dict(),
            'Testing Loss': loss,
            'Testing Accuracy': curr_acc,
            'Predictions': predictions,
            'Target List': targetlist
            },'resnet18_'+ model_name + '_' + optim_name + '.pth')
            
        else:
            early_stop[1] += 1      ## one more epoch without an improved model
            
        ## if early stop condition becomes true, then changing the respective boolean true
        if early_stop[1] > no_improve:
            early_stop[0] = True
            
        ## returning the loss, accuracy, and predictions
        return (loss,curr_acc,prediction_samples)
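
The early-stopping bookkeeping in `test` can be isolated from the model code. The sketch below replays the same rule (reset the counter when the loss improves, otherwise increment it, and stop once the counter exceeds `no_improve`) on a made-up sequence of losses. The function name and the loss values are hypothetical.

```python
def epochs_until_stop(losses, no_improve=10):
    """Return the 1-based epoch at which training would stop, or None."""
    best_loss = float('inf')
    early_stop = [False, 0]  # [stop flag, epochs since last improvement]
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            best_loss = loss
            early_stop[1] = 0
        else:
            early_stop[1] += 1
        if early_stop[1] > no_improve:
            early_stop[0] = True
        if early_stop[0]:
            return epoch
    return None

# Made-up losses: improvement for three epochs, then a plateau.
losses = [0.9, 0.7, 0.6] + [0.65] * 5
print(epochs_until_stop(losses, no_improve=2))  # 6
```

Because the comparison is `>`, training stops after `no_improve + 1` consecutive epochs without improvement.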

### Model1 : Unfrozen ResNet-18 Model with SGD Optimizer
This is the ResNet-18 model in which all layers are left unfrozen, so that all parameters are available for training, and the optimizer used is SGD. I initialize the wandb run with its run name and project name, and save the model's configuration parameters.<br>
Using wandb, I log the values for each epoch under that run and project. After the best model has been saved with torch during testing, I also save it as an artifact of the run using wandb. Finally, I finish the run so that nothing further is logged to it.

In [5]:
## starting the respective run
run = wandb.init(name = "Unfreezed_SGD",project = "18MA20050_Assignment_2",resume = True)

## for watching all the respective information for that model
wandb.watch(net1_SGD, log = "all")

## configuration parameters
config = wandb.config
config.batch_size = 128
config.test_batch_size = 100
config.lr = 0.001
config.momentum = 0.9

## required for early stopping
early_stop = [False,0]

## using the respective best loss variable 
global best_loss
best_loss = float('inf')

## running for a large number of epochs, since early stopping is likely to end training sooner
for epoch in range(0,200):
    
    ## running the train and test functions, and collecting their outputs
    train_loss,train_acc = train(net1_SGD,optimizer1_SGD)
    test_loss,test_acc,samples = test(net1_SGD,early_stop,"Unfreezed","SGD")
    
    ## logging each of those results
    wandb.log({
                "Training Loss":train_loss,
                "Training Accuracy":train_acc,
                "Testing Loss":test_loss,
                "Testing Accuracy": test_acc,
                "Prediction Samples":samples
                })
    
    ## if want to stop early
    if early_stop[0] :
        break

## saving the respective model as an artifact in that run
resnet18_Unfreezed_SGD = wandb.Artifact('resnet18_Unfreezed_SGD', type='model')
resnet18_Unfreezed_SGD.add_file('resnet18_Unfreezed_SGD.pth')
run.log_artifact(resnet18_Unfreezed_SGD)

## loading the best model saved, to access the best predictions and accuracy
best_resnet18_Unfreezed_SGD = torch.load('resnet18_Unfreezed_SGD.pth')

## logging the confusion matrix for the best model
wandb.log({
    "resnet18_Unfreezed_SGD_Confusion_Matrix": wandb.plot.confusion_matrix(
    preds = best_resnet18_Unfreezed_SGD['Predictions'],
    y_true = best_resnet18_Unfreezed_SGD['Target List'],
    class_names = classes) 
})

## logging the best test accuracy 
config.best_test_accuracy = best_resnet18_Unfreezed_SGD['Testing Accuracy']

## hence finishing the run
wandb.finish()
print('##################################')
print('Run Finished')
print('Hence,The best model Accuracy for Resnet-18 Unfreezed with SGD Optimization: ',best_resnet18_Unfreezed_SGD['Testing Accuracy'])
print('##################################')


Epoch 1/200
------------------------------------
Training Loss: 0.0102, Training Accuracy: 54.0980
Testing Loss: 0.0094, Testing Accuracy: 67.1700

Epoch 2/200
------------------------------------
Training Loss: 0.0070, Training Accuracy: 68.6060
Testing Loss: 0.0080, Testing Accuracy: 72.4300

Epoch 3/200
------------------------------------
Training Loss: 0.0060, Training Accuracy: 72.9060
Testing Loss: 0.0070, Testing Accuracy: 76.0900

Epoch 4/200
------------------------------------
Training Loss: 0.0054, Training Accuracy: 75.5920
Testing Loss: 0.0065, Testing Accuracy: 77.4500

Epoch 5/200
------------------------------------
Training Loss: 0.0050, Training Accuracy: 77.5980
Testing Loss: 0.0063, Testing Accuracy: 77.9200

Epoch 6/200
------------------------------------
Training Loss: 0.0047, Training Accuracy: 78.8060
Testing Loss: 0.0060, Testing Accuracy: 79.0800

Epoch 7/200
------------------------------------
Training Loss: 0.0044, Training Accuracy: 79.9700
Testing Loss
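
A note on reading the logged losses: `nn.CrossEntropyLoss` averages over each mini-batch by default, and `train` and `test` sum those per-batch means and then divide by the number of samples rather than the number of batches. The printed values are therefore smaller than the average per-sample loss by roughly the batch size. The sketch below converts a logged value back, using the batch sizes and dataset sizes from the data-loading cell. The helper name is hypothetical, and the result is approximate because the logged values are rounded and the last training batch is smaller than the rest.

```python
import math

def per_sample_loss(logged_loss, dataset_size, batch_size):
    """Undo the division by dataset size and divide by the batch count instead."""
    num_batches = math.ceil(dataset_size / batch_size)
    summed_batch_means = logged_loss * dataset_size
    return summed_batch_means / num_batches

# Epoch 1 of the unfrozen SGD run, as printed in the log above.
train_loss = per_sample_loss(0.0102, dataset_size=50000, batch_size=128)
test_loss = per_sample_loss(0.0094, dataset_size=10000, batch_size=100)
print(round(train_loss, 3), round(test_loss, 3))
```

This does not affect model selection or early stopping, since both only compare losses computed on the same scale.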

### Model2 : Unfrozen ResNet-18 Model with Adam Optimizer
This is the ResNet-18 model in which all layers are left unfrozen, so that all parameters are available for training, and the optimizer used is Adam. I initialize the wandb run with its run name and project name, and save the model's configuration parameters.<br>
Using wandb, I log the values for each epoch under that run and project. After the best model has been saved with torch during testing, I also save it as an artifact of the run using wandb. Finally, I finish the run so that nothing further is logged to it.

In [6]:
## starting the respective run
run = wandb.init(name = "Unfreezed_Adam",project = "18MA20050_Assignment_2")

## for watching all the respective information for that model
wandb.watch(net1_Adam, log = "all")

## configuration parameters
config = wandb.config
config.batch_size = 128
config.test_batch_size = 100
config.lr = 0.01

## required for early stopping
early_stop = [False,0]

## using the respective best loss variable 
global best_loss
best_loss = float('inf')

## running for a large number of epochs, since early stopping is likely to end training sooner
for epoch in range(0,200):
    
    ## running the train and test functions, and collecting their outputs
    train_loss,train_acc = train(net1_Adam,optimizer1_Adam)
    test_loss,test_acc,samples = test(net1_Adam,early_stop,"Unfreezed","Adam")
    
    ## logging each of those results
    wandb.log({
                "Training Loss":train_loss,
                "Training Accuracy":train_acc,
                "Testing Loss":test_loss,
                "Testing Accuracy": test_acc,
                "Prediction Samples":samples
                })
    ## for early stopping
    if early_stop[0] :
        break
    
## saving the respective model as an artifact in that run
resnet18_Unfreezed_Adam = wandb.Artifact('resnet18_Unfreezed_Adam', type='model')
resnet18_Unfreezed_Adam.add_file('resnet18_Unfreezed_Adam.pth')
run.log_artifact(resnet18_Unfreezed_Adam)

## loading the best model saved, to access the best predictions and accuracy
best_resnet18_Unfreezed_Adam = torch.load('resnet18_Unfreezed_Adam.pth')

## logging the confusion matrix for the best model
wandb.log({
    "resnet18_Unfreezed_Adam_Confusion_Matrix": wandb.plot.confusion_matrix(
    preds = best_resnet18_Unfreezed_Adam['Predictions'],
    y_true = best_resnet18_Unfreezed_Adam['Target List'],
    class_names = classes) 
})

## logging the best test accuracy 
config.best_test_accuracy = best_resnet18_Unfreezed_Adam['Testing Accuracy']

## hence finishing the run
wandb.finish()
print('##################################')
print('Run Finished')
print('Hence,The best model Accuracy for Resnet-18 Unfreezed with Adam Optimization: ',best_resnet18_Unfreezed_Adam['Testing Accuracy'])
print('##################################')


Epoch 1/200
------------------------------------
Training Loss: 0.0164, Training Accuracy: 25.4280
Testing Loss: 0.2203, Testing Accuracy: 34.0100

Epoch 2/200
------------------------------------
Training Loss: 0.0130, Training Accuracy: 38.6140
Testing Loss: 0.0149, Testing Accuracy: 47.4200

Epoch 3/200
------------------------------------
Training Loss: 0.0112, Training Accuracy: 46.9600
Testing Loss: 0.0133, Testing Accuracy: 51.9400

Epoch 4/200
------------------------------------
Training Loss: 0.0098, Training Accuracy: 54.7420
Testing Loss: 0.0146, Testing Accuracy: 50.9500

Epoch 5/200
------------------------------------
Training Loss: 0.0087, Training Accuracy: 59.9920
Testing Loss: 0.0101, Testing Accuracy: 64.5500

Epoch 6/200
------------------------------------
Training Loss: 0.0079, Training Accuracy: 64.2440
Testing Loss: 0.0109, Testing Accuracy: 63.1200

Epoch 7/200
------------------------------------
Training Loss: 0.0082, Training Accuracy: 62.8400
Testing Loss

Training Loss: 0.0025, Training Accuracy: 89.0520
Testing Loss: 0.0057, Testing Accuracy: 82.0300

Epoch 57/200
------------------------------------
Training Loss: 0.0024, Training Accuracy: 89.3920
Testing Loss: 0.0052, Testing Accuracy: 84.2700

Epoch 58/200
------------------------------------
Training Loss: 0.0024, Training Accuracy: 89.1460
Testing Loss: 0.0055, Testing Accuracy: 83.7000

Epoch 59/200
------------------------------------
Training Loss: 0.0023, Training Accuracy: 89.4620
Testing Loss: 0.0053, Testing Accuracy: 84.2100

Epoch 60/200
------------------------------------
Training Loss: 0.0023, Training Accuracy: 89.7500
Testing Loss: 0.0054, Testing Accuracy: 83.8200

Epoch 61/200
------------------------------------
Training Loss: 0.0023, Training Accuracy: 89.7820
Testing Loss: 0.0053, Testing Accuracy: 84.1100

Epoch 62/200
------------------------------------
Training Loss: 0.0022, Training Accuracy: 90.0360
Testing Loss: 0.0056, Testing Accuracy: 83.5900
########
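
The assignment asks for the model with the best test accuracy, whereas `test` saves a checkpoint when the test loss improves. In general the two criteria can select different epochs. The sketch below applies both to epochs 57 to 62 from the log excerpt above. The variable names are hypothetical, and ties in the rounded loss go to the earliest epoch, as the strict `<` comparison in `test` would decide.

```python
# (epoch, test loss, test accuracy) copied from the log excerpt above.
log = [
    (57, 0.0052, 84.27),
    (58, 0.0055, 83.70),
    (59, 0.0053, 84.21),
    (60, 0.0054, 83.82),
    (61, 0.0053, 84.11),
    (62, 0.0056, 83.59),
]

# min/max return the first extreme element, so earlier epochs win ties.
best_by_loss = min(log, key=lambda row: row[1])
best_by_accuracy = max(log, key=lambda row: row[2])

print(best_by_loss[0], best_by_accuracy[0])  # 57 57
```

Within this excerpt both criteria agree on epoch 57. Earlier epochs are not shown here, so this says nothing about the overall best checkpoint of the run.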

### Model3 : Frozen ResNet-18 Model with SGD Optimizer
This is the ResNet-18 model in which all layers except the last (FC) layer are frozen, so that only the parameters of the last layer are available for training, and the optimizer used is SGD. I initialize the wandb run with its run name and project name, and save the model's configuration parameters.<br>
Using wandb, I log the values for each epoch under that run and project. After the best model has been saved with torch during testing, I also save it as an artifact of the run using wandb. Finally, I finish the run so that nothing further is logged to it.

In [7]:
## starting the respective run
run = wandb.init(name = "Freezed_SGD",project = "18MA20050_Assignment_2")

## for watching all the respective information for that model
wandb.watch(net2_SGD, log = "all")

## configuration parameters
config = wandb.config
config.batch_size = 128
config.test_batch_size = 100
config.lr = 0.001
config.momentum = 0.9

## required for early stopping
early_stop = [False,0]

## using the respective best loss variable 
global best_loss
best_loss = float('inf')

## running for a large number of epochs, since early stopping is likely to end training sooner
for epoch in range(0,200):
    
    ## running the train and test functions, and collecting their outputs
    train_loss,train_acc = train(net2_SGD,optimizer2_SGD)
    test_loss,test_acc,samples = test(net2_SGD,early_stop,"Freezed","SGD")
    
    ## logging each of those results
    wandb.log({
                "Training Loss":train_loss,
                "Training Accuracy":train_acc,
                "Testing Loss":test_loss,
                "Testing Accuracy": test_acc,
                "Prediction Samples":samples
                })
    
    ## for early stopping
    if early_stop[0] :
        break
        
## saving the respective model as an artifact in that run
resnet18_Freezed_SGD = wandb.Artifact('resnet18_Freezed_SGD', type='model')
resnet18_Freezed_SGD.add_file('resnet18_Freezed_SGD.pth')
run.log_artifact(resnet18_Freezed_SGD)

## loading the best model saved, to access the best predictions and accuracy
best_resnet18_Freezed_SGD = torch.load('resnet18_Freezed_SGD.pth')

## logging the confusion matrix for the best model
wandb.log({
    "resnet18_Freezed_SGD_Confusion_Matrix": wandb.plot.confusion_matrix(
    preds = best_resnet18_Freezed_SGD['Predictions'],
    y_true = best_resnet18_Freezed_SGD['Target List'],
    class_names = classes) 
})

## logging the best test accuracy
config.best_test_accuracy = best_resnet18_Freezed_SGD['Testing Accuracy']

## hence finishing the run
wandb.finish()
print('##################################')
print('Run Finished')
print('Hence,The best model Accuracy for Resnet-18 Freezed with SGD Optimization: ',best_resnet18_Freezed_SGD['Testing Accuracy'])
print('##################################')


Epoch 1/200
------------------------------------
Training Loss: 0.0153, Training Accuracy: 31.1780
Testing Loss: 0.0180, Testing Accuracy: 37.3200

Epoch 2/200
------------------------------------
Training Loss: 0.0137, Training Accuracy: 38.7100
Testing Loss: 0.0176, Testing Accuracy: 38.9400

Epoch 3/200
------------------------------------
Training Loss: 0.0134, Training Accuracy: 40.3580
Testing Loss: 0.0173, Testing Accuracy: 40.1400

Epoch 4/200
------------------------------------
Training Loss: 0.0132, Training Accuracy: 40.9620
Testing Loss: 0.0171, Testing Accuracy: 41.0600

Epoch 5/200
------------------------------------
Training Loss: 0.0131, Training Accuracy: 41.5400
Testing Loss: 0.0169, Testing Accuracy: 41.6900

Epoch 6/200
------------------------------------
Training Loss: 0.0131, Training Accuracy: 41.3500
Testing Loss: 0.0170, Testing Accuracy: 40.8600

Epoch 7/200
------------------------------------
Training Loss: 0.0130, Training Accuracy: 41.7880
Testing Loss

### Model4 : Frozen ResNet-18 Model with Adam Optimizer
This is the ResNet-18 model in which all layers except the last (FC) layer are frozen, so that only the parameters of the last layer are available for training, and the optimizer used is Adam. I initialize the wandb run with its run name and project name, and save the model's configuration parameters.<br>
Using wandb, I log the values for each epoch under that run and project. After the best model has been saved with torch during testing, I also save it as an artifact of the run using wandb. Finally, I finish the run so that nothing further is logged to it.

In [8]:
## starting the respective run
run = wandb.init(name = "Freezed_Adam",project = "18MA20050_Assignment_2")

## for watching all the respective information for that model
wandb.watch(net2_Adam, log = "all")

## configuration parameters
config = wandb.config
config.batch_size = 128
config.test_batch_size = 100
config.lr = 0.01

## required for early stopping
early_stop = [False,0]

## using the respective best loss variable 
global best_loss
best_loss = float('inf')

## running for a large number of epochs, since early stopping is likely to end training sooner
for epoch in range(0,200):
    
    ## running the train and test functions, and collecting their outputs
    train_loss,train_acc = train(net2_Adam,optimizer2_Adam)
    test_loss,test_acc,samples = test(net2_Adam,early_stop,"Freezed","Adam")
    
    ## logging each of those results
    wandb.log({
                "Training Loss":train_loss,
                "Training Accuracy":train_acc,
                "Testing Loss":test_loss,
                "Testing Accuracy": test_acc,
                "Prediction Samples":samples
                })
    
    ## for early stopping
    if early_stop[0] :
        break
        
## saving the respective model as an artifact in that run
resnet18_Freezed_Adam = wandb.Artifact('resnet18_Freezed_Adam', type='model')
resnet18_Freezed_Adam.add_file('resnet18_Freezed_Adam.pth')
run.log_artifact(resnet18_Freezed_Adam)

## loading the best model saved, to access the best predictions and accuracy
best_resnet18_Freezed_Adam = torch.load('resnet18_Freezed_Adam.pth')

## logging the confusion matrix for the best model
wandb.log({
    "resnet18_Freezed_Adam_Confusion_Matrix": wandb.plot.confusion_matrix(
    preds = best_resnet18_Freezed_Adam['Predictions'],
    y_true = best_resnet18_Freezed_Adam['Target List'],
    class_names = classes) 
})

## logging the best test accuracy
config.best_test_accuracy = best_resnet18_Freezed_Adam['Testing Accuracy']

## hence finishing the run
wandb.finish()
print('##################################')
print('Run Finished')
print('Hence,The best model Accuracy for Resnet-18 Freezed with Adam Optimization: ',best_resnet18_Freezed_Adam['Testing Accuracy'])
print('##################################')


Epoch 1/200
------------------------------------
Training Loss: 0.0162, Training Accuracy: 33.7540
Testing Loss: 0.0212, Testing Accuracy: 34.0200

Epoch 2/200
------------------------------------
Training Loss: 0.0160, Training Accuracy: 35.0900
Testing Loss: 0.0206, Testing Accuracy: 36.6600

Epoch 3/200
------------------------------------
Training Loss: 0.0158, Training Accuracy: 35.3820
Testing Loss: 0.0220, Testing Accuracy: 34.0100

Epoch 4/200
------------------------------------
Training Loss: 0.0161, Training Accuracy: 35.2860
Testing Loss: 0.0234, Testing Accuracy: 33.3400

Epoch 5/200
------------------------------------
Training Loss: 0.0160, Training Accuracy: 35.1920
Testing Loss: 0.0209, Testing Accuracy: 36.6100

Epoch 6/200
------------------------------------
Training Loss: 0.0161, Training Accuracy: 35.1420
Testing Loss: 0.0221, Testing Accuracy: 34.1100

Epoch 7/200
------------------------------------
Training Loss: 0.0161, Training Accuracy: 35.2960
Testing Loss
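
Each run logs a confusion matrix built from the `Predictions` and `Target List` stored in the best checkpoint. For reference, a minimal pure-Python sketch of how such a matrix is tallied follows, with rows as true classes and columns as predicted classes. The toy labels are made up, and this is not wandb's implementation.

```python
def confusion_matrix(y_true, preds, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for truth, pred in zip(y_true, preds):
        matrix[truth][pred] += 1
    return matrix

# Toy example with 3 classes; values are illustrative only.
y_true = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(y_true, preds, num_classes=3)
for row in matrix:
    print(row)

# Accuracy is the sum of the diagonal divided by the number of samples.
accuracy = sum(matrix[i][i] for i in range(3)) / len(y_true)
```

Off-diagonal entries show which pairs of classes a model confuses, which the single accuracy figure hides.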