# Hyperparameter Tuning (Deep Learning)
Varishu Pant  
D19033

#### **Note: To get reproducible results, set the seed to 101 and restart the kernel before training each model.**
**All observations have been documented in the Excel file 'Hyperparameter_Tracking_D19033.xlsx'.**

## Importing Required Libraries

In [1]:
import torch
from torch import nn
import torch.nn.functional as F
from torchvision import datasets,transforms
from collections import OrderedDict
from torch import optim
import pandas as pd
import numpy as np
from tqdm import tqdm
import random
seed=101
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
torch.cuda.manual_seed_all(seed)  # if you are using multi-GPU.
np.random.seed(seed)  # Numpy module.
random.seed(seed)  # Python random module.
torch.manual_seed(seed)
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
def _init_fn(worker_id):
    np.random.seed(int(seed))
def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

## Default Neural Network for optimizing accuracy

In [2]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
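
As a sanity check on the model size, the parameter count of a fully connected network can be worked out by hand: each `nn.Linear(n_in, n_out)` layer holds `n_in * n_out` weights plus `n_out` biases. The sketch below is plain Python and uses a hypothetical helper `linear_param_count` (not part of the notebook); for the layer sizes above it should agree with what `count_parameters(model)` reports.

```python
def linear_param_count(layer_sizes):
    """Count weights and biases of a stack of fully connected layers.

    layer_sizes lists the width of each layer, input first, e.g. [784, 256, 10].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

default_sizes = [784, 256, 128, 64, 10]
print(linear_param_count(default_sizes))  # 242762
```

Dropout adds no trainable parameters, so it does not appear in the count.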

## Function to train model
Takes the number of epochs and a configured optimizer as inputs. Returns the validation accuracy for each epoch.

In [2]:
def train_nn(epochs,optimizer):
    test_accuracy=[]
    for e in range(epochs):
        running_loss=0
        for images,labels in trainloader:
            optimizer.zero_grad()
            log_ps=model(images)
            loss=criterion(log_ps,labels) 
            loss.backward()
            optimizer.step()
            running_loss += loss.item() * images.shape[0]

        else:
            test_loss=0
            accuracy=0

            with torch.no_grad():
                
                model.eval()
                for images,labels in testloader:
                    log_ps=model(images)
                    test_loss+=criterion(log_ps,labels) *images.shape[0]
                    ps=torch.exp(log_ps)
                    top_p,top_class=ps.topk(1,dim=1)
                    equals=top_class==labels.view(*top_class.shape)
                    accuracy+=torch.sum(equals).item()
            model.train()
#             train_losses.append(running_loss/len(trainloader.dataset))
#             test_losses.append(test_loss.item()/len(testloader.dataset))
            test_accuracy.append(accuracy/len(testloader.dataset))
#             print("Epoch: {}/{}.. ".format(e+1, epochs),
#                   "Training Loss: {:.3f}.. ".format(running_loss/len(trainloader.dataset)),
#                   "Test Loss: {:.3f}.. ".format(test_loss/len(testloader.dataset)),
#                   "Test Accuracy: {:.3f}".format(accuracy/len(testloader.dataset)))
    return(test_accuracy)
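
The accuracy bookkeeping inside `train_nn` amounts to counting how often the highest-scoring class matches the label and dividing by the dataset size. A minimal plain-Python sketch of that calculation, using made-up log-probabilities and a hypothetical helper name `top1_accuracy`:

```python
def top1_accuracy(log_probs, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    correct = 0
    for row, label in zip(log_probs, labels):
        predicted = max(range(len(row)), key=lambda i: row[i])  # argmax over classes
        correct += int(predicted == label)
    return correct / len(labels)

# Illustrative values only: 4 samples, 3 classes.
log_probs = [
    [-0.1, -2.5, -3.0],
    [-2.0, -0.3, -1.9],
    [-1.2, -1.1, -0.9],
    [-0.2, -1.8, -2.9],
]
labels = [0, 1, 2, 1]
print(top1_accuracy(log_probs, labels))  # 0.75
```

Taking the argmax of the log-probabilities selects the same class as taking it after `torch.exp`, since the exponential is monotonic.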

# Optimizer Tuning using non-normalized data

### Trainloader for non-normalized data

In [3]:
torch.manual_seed(101)
transform=transforms.Compose([transforms.ToTensor()])
trainset=datasets.MNIST('~/.pytorch/MNIST_data/',train=True,transform=transform,download=True)
testset=datasets.MNIST('~/.pytorch/MNIST_data/',train=False,transform=transform,download=True)

trainloader=torch.utils.data.DataLoader(trainset,batch_size=64,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
testloader=torch.utils.data.DataLoader(testset,batch_size=64,shuffle=True,num_workers=0,worker_init_fn=_init_fn)

In [4]:
epoch_list=['Epoch:'+str(i) for i in range(1,11)]

## Fitting models with different values of Hyperparameters

### 1.Optimizer - Stochastic Gradient Descent

#### 1.Tuning Learning Rate

**Gradually increasing learning rate while keeping momentum constant at 0.9 and observing the change in validation accuracy**
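
Because every run should start from a freshly initialized model, a sweep like this could also be written as a loop instead of restarting the kernel by hand. The sketch below shows one possible structure and is not the notebook's code: `sweep` and `fake_run` are hypothetical names, and `fake_run` stands in for a function that would seed the RNGs, build a new `Network()`, and call `train_nn` with the given learning rate.

```python
from statistics import mean

def sweep(values, run_fn):
    """Run run_fn once per value and rank values by mean per-epoch accuracy."""
    summary = {value: mean(run_fn(value)) for value in values}
    # Highest mean accuracy first
    return sorted(summary.items(), key=lambda item: item[1], reverse=True)

def fake_run(lr):
    """Placeholder for a real training run; returns made-up accuracies."""
    peak = 0.04  # arbitrary choice so the toy curve has a maximum
    score = 0.98 - abs(lr - peak)
    return [score - 0.01, score]

ranking = sweep([0.002, 0.01, 0.04, 0.1], fake_run)
print(ranking[0][0])  # 0.04
```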

Fitting model with SGD without Nesterov, learning rate as 0.002 and momentum as 0.9

In [11]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.002,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.002_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.002_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_1=pd.DataFrame(test_acc,columns=epoch_list)
results_1

Model saved as NN_SGD_10E_lr0.002_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.789,0.894,0.9191,0.9315,0.9418,0.9521,0.958,0.9628,0.965,0.967


Fitting model with SGD without Nesterov, learning rate as 0.01 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.01,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.01_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.01_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_2=pd.DataFrame(test_acc,columns=epoch_list)
results_2

Model saved as NN_SGD_10E_lr0.01_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9296,0.9552,0.9687,0.9716,0.9714,0.9745,0.9772,0.9772,0.9766,0.9806


Fitting model with SGD without Nesterov, learning rate as 0.015 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.015,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.015_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.015_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_3=pd.DataFrame(test_acc,columns=epoch_list)
results_3

Model saved as NN_SGD_10E_lr0.015_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9473,0.9591,0.9697,0.9734,0.9755,0.9764,0.9786,0.9793,0.9792,0.9785


In [7]:
# results=train_nn(epochs=8,optimizer=optim.SGD(model.parameters(),lr=0.015,momentum=0.9))
# test_acc=[]
# test_acc.append(results)
# torch.save(model.state_dict(), 'NN_SGD_10E_lr0.015_mm0.9_e8.pth')
# print('Model saved as NN_SGD_10E_lr0.015_mm0.9_e8.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,9)]
print('Number of Parameters:',count_parameters(model))
results_3_1=pd.DataFrame(test_acc,columns=epoch_list)
results_3_1

Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8
0,0.9473,0.9591,0.9697,0.9734,0.9755,0.9764,0.9786,0.9793


Fitting model with SGD without Nesterov, learning rate as 0.02 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.02,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.02_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.02_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_4=pd.DataFrame(test_acc,columns=epoch_list)
results_4

Model saved as NN_SGD_10E_lr0.02_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9525,0.9639,0.9716,0.9754,0.976,0.9778,0.9779,0.9793,0.9781,0.9812


Fitting model with SGD without Nesterov, learning rate as 0.03 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.03,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.03_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.03_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_5=pd.DataFrame(test_acc,columns=epoch_list)
results_5

Model saved as NN_SGD_10E_lr0.03_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9565,0.9654,0.9723,0.9761,0.9756,0.9759,0.9789,0.9812,0.9791,0.9777


Fitting model with SGD without Nesterov, learning rate as 0.04 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.04,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.04_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.04_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_6=pd.DataFrame(test_acc,columns=epoch_list)
results_6

Model saved as NN_SGD_10E_lr0.04_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9589,0.9658,0.9752,0.9722,0.9736,0.9766,0.9802,0.9786,0.978,0.9829


Fitting model with SGD without Nesterov, learning rate as 0.1 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.1,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.1_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.1_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_7=pd.DataFrame(test_acc,columns=epoch_list)
results_7

Model saved as NN_SGD_10E_lr0.1_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9511,0.9585,0.96,0.9688,0.9659,0.9655,0.9727,0.9714,0.973,0.9696


**So far, a learning rate of 0.04 gives the best average accuracy, but accuracy drops between 0.04 and 0.1. So we now check learning rates between 0.04 and 0.1.**
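
"Average accuracy" here means the mean of the ten per-epoch validation accuracies. The snippet below recomputes it for a few of the runs; the numbers are copied from the result tables above, while the dictionary name `acc_by_lr` is only an illustrative choice.

```python
from statistics import mean

# Per-epoch validation accuracies copied from the result tables above.
acc_by_lr = {
    0.02: [0.9525, 0.9639, 0.9716, 0.9754, 0.976, 0.9778, 0.9779, 0.9793, 0.9781, 0.9812],
    0.03: [0.9565, 0.9654, 0.9723, 0.9761, 0.9756, 0.9759, 0.9789, 0.9812, 0.9791, 0.9777],
    0.04: [0.9589, 0.9658, 0.9752, 0.9722, 0.9736, 0.9766, 0.9802, 0.9786, 0.978, 0.9829],
    0.1: [0.9511, 0.9585, 0.96, 0.9688, 0.9659, 0.9655, 0.9727, 0.9714, 0.973, 0.9696],
}

for lr, accs in acc_by_lr.items():
    print(f"lr={lr}: mean accuracy {mean(accs):.4f}")

# Learning rate with the highest mean accuracy among the runs listed here
best_lr = max(acc_by_lr, key=lambda lr: mean(acc_by_lr[lr]))
print("best:", best_lr)  # best: 0.04
```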

Fitting model with SGD without Nesterov, learning rate as 0.09 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.09,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.09_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.09_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_8=pd.DataFrame(test_acc,columns=epoch_list)
results_8

Model saved as NN_SGD_10E_lr0.09_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9558,0.9592,0.9704,0.973,0.9699,0.9696,0.9762,0.9747,0.9759,0.973


Fitting model with SGD without Nesterov, learning rate as 0.08 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.08,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.08_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.08_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_9=pd.DataFrame(test_acc,columns=epoch_list)
results_9

Model saved as NN_SGD_10E_lr0.08_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9485,0.9584,0.968,0.9679,0.9711,0.9713,0.9739,0.974,0.9751,0.9768


Fitting model with SGD without Nesterov, learning rate as 0.07 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.07,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.07_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.07_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_10=pd.DataFrame(test_acc,columns=epoch_list)
results_10

Model saved as NN_SGD_10E_lr0.07_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.953,0.9509,0.9668,0.9714,0.9734,0.9772,0.9777,0.9751,0.9761,0.9785


Fitting model with SGD without Nesterov, learning rate as 0.06 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.06,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.06_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.06_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_11=pd.DataFrame(test_acc,columns=epoch_list)
results_11

Model saved as NN_SGD_10E_lr0.06_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9552,0.9679,0.9697,0.9746,0.9746,0.9767,0.9808,0.9789,0.9802,0.9811


Fitting model with SGD without Nesterov, learning rate as 0.05 and momentum as 0.9

In [8]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.05,momentum=0.9))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.05_mm0.9.pth')
print('Model saved as NN_SGD_10E_lr0.05_mm0.9.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
results_12=pd.DataFrame(test_acc,columns=epoch_list)
results_12

Model saved as NN_SGD_10E_lr0.05_mm0.9.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9597,0.9687,0.9736,0.9742,0.9745,0.9763,0.9796,0.9774,0.9792,0.9747


**We see that 0.04 still gives the best average validation accuracy. Now, keeping the learning rate constant at 0.04, we check different values of momentum.**

#### 2.Tuning Momentum (keeping lr=0.04)

Fitting model with SGD without Nesterov, learning rate as 0.04 and momentum as 0.7

In [9]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.04,momentum=0.7))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.04_mm0.7.pth')
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Model saved as NN_SGD_10E_lr0.04_mm0.7.pth')
print('Number of Parameters:',count_parameters(model))
results_13=pd.DataFrame(test_acc,columns=epoch_list)
results_13

Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9469,0.9626,0.9707,0.9736,0.9729,0.9752,0.9771,0.9792,0.9805,0.9798


Fitting model with SGD without Nesterov, learning rate as 0.04 and momentum as 0.8

In [9]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.04,momentum=0.8))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.04_mm0.8.pth')
print('Model saved as NN_SGD_10E_lr0.04_mm0.8.pth')
print('Number of Parameters:',count_parameters(model))
results_14=pd.DataFrame(test_acc,columns=epoch_list)
results_14

Model saved as NN_SGD_10E_lr0.04_mm0.8.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9535,0.9648,0.9709,0.9746,0.9746,0.9754,0.978,0.9791,0.9797,0.9822


Fitting model with SGD without Nesterov, learning rate as 0.04 and momentum as 0.95

In [9]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.04,momentum=0.95))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.04_mm0.95.pth')
print('Model saved as NN_SGD_10E_lr0.04_mm0.95.pth')
print('Number of Parameters:',count_parameters(model))
results_16=pd.DataFrame(test_acc,columns=epoch_list)
results_16

Model saved as NN_SGD_10E_lr0.04_mm0.95.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9496,0.9579,0.9673,0.9714,0.9705,0.9691,0.9736,0.9758,0.9758,0.9765


**Of all the models fitted so far, the combination of learning rate 0.04 and momentum 0.9 gives the best average accuracy. We now try that configuration with Nesterov momentum.**
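
For reference, the difference between classical and Nesterov momentum can be seen on a one-dimensional toy problem. The sketch below follows the update form used by `torch.optim.SGD` with zero dampening and no weight decay: the buffer is updated as `velocity = momentum * velocity + grad`, and the step is taken along `velocity` for classical momentum or along `grad + momentum * velocity` for Nesterov. The function name `sgd_momentum_minimize` and the objective f(x) = x**2 are illustrative choices, not part of the notebook.

```python
def sgd_momentum_minimize(lr, momentum, nesterov, steps=200, x0=1.0):
    """Minimize f(x) = x**2 with SGD plus momentum, starting from x0."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        grad = 2.0 * x                           # derivative of x**2
        velocity = momentum * velocity + grad    # update the momentum buffer
        if nesterov:
            step = grad + momentum * velocity    # look-ahead correction
        else:
            step = velocity
        x -= lr * step
    return x

classical = sgd_momentum_minimize(lr=0.04, momentum=0.9, nesterov=False)
with_nesterov = sgd_momentum_minimize(lr=0.04, momentum=0.9, nesterov=True)
print(abs(classical), abs(with_nesterov))  # both should be close to 0
```

On a real network the two variants differ far less predictably than on this toy problem, which is why the comparison is made empirically here.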

In [9]:
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.04,momentum=0.9,nesterov=True))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.04_mm0.9_nest.pth')
print('Model saved as NN_SGD_10E_lr0.04_mm0.9_nest.pth')
print('Number of Parameters:',count_parameters(model))
results_17=pd.DataFrame(test_acc,columns=epoch_list)
results_17

Model saved as NN_SGD_10E_lr0.04_mm0.9_nest.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9581,0.9664,0.9725,0.9756,0.977,0.9759,0.98,0.9795,0.9798,0.9783


**Best Average accuracy yet.**

### 2.Optimizer - Adam

#### Tuning Learning Rate

Fitting model with Adam where learning rate is 0.1.

In [9]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.1))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.1.pth')
print('Model saved as NN_Adam_10E_lr0.1.pth')
print('Number of Parameters:',count_parameters(model))
results_18=pd.DataFrame(test_acc,columns=epoch_list)
results_18

Model saved as NN_Adam_10E_lr0.1.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.1009,0.1032,0.1028,0.1028,0.0974,0.1135,0.0982,0.1135,0.098,0.1009


Fitting model with Adam where learning rate is 0.

In [9]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.pth')
print('Model saved as NN_Adam_10E_lr0.pth')
print('Number of Parameters:',count_parameters(model))
results_19=pd.DataFrame(test_acc,columns=epoch_list)
results_19

Model saved as NN_Adam_10E_lr0.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.0744,0.0744,0.0744,0.0744,0.0744,0.0744,0.0744,0.0744,0.0744,0.0744


Fitting model with Adam where learning rate is 0.02.

In [9]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.02))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.02.pth')
print('Model saved as NN_Adam_10E_lr0.02.pth')
print('Number of Parameters:',count_parameters(model))
results_20=pd.DataFrame(test_acc,columns=epoch_list)
results_20

Model saved as NN_Adam_10E_lr0.02.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9138,0.9044,0.9153,0.929,0.9336,0.934,0.9272,0.9276,0.9247,0.9264


Fitting model with Adam where learning rate is 0.04.

In [9]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.04))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.04.pth')
print('Model saved as NN_Adam_10E_lr0.04.pth')
print('Number of Parameters:',count_parameters(model))
results_21=pd.DataFrame(test_acc,columns=epoch_list)
results_21

Model saved as NN_Adam_10E_lr0.04.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.6084,0.5722,0.5559,0.4812,0.4523,0.4845,0.3754,0.3753,0.3464,0.3333


Fitting model with Adam where learning rate is 0.01.

In [7]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.01))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.01.pth')
print('Model saved as NN_Adam_10E_lr0.01.pth')
print('Number of Parameters:',count_parameters(model))
results_22=pd.DataFrame(test_acc,columns=epoch_list)
results_22

Model saved as NN_Adam_10E_lr0.01.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.948,0.9485,0.9602,0.9602,0.9525,0.9635,0.9596,0.9611,0.9577,0.9647


Fitting model with Adam where learning rate is 0.005.

In [5]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.005))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.005.pth')
print('Model saved as NN_Adam_10E_lr0.005.pth')
print('Number of Parameters:',count_parameters(model))
results_23=pd.DataFrame(test_acc,columns=epoch_list)
results_23

Model saved as NN_Adam_10E_lr0.005.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9571,0.9572,0.9657,0.9675,0.9682,0.9706,0.9739,0.974,0.9709,0.9744


Fitting model with Adam where learning rate is 0.003.

In [5]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.003))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.003.pth')
print('Model saved as NN_Adam_10E_lr0.003.pth')
print('Number of Parameters:',count_parameters(model))
results_24=pd.DataFrame(test_acc,columns=epoch_list)
results_24

Model saved as NN_Adam_10E_lr0.003.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9618,0.9662,0.9723,0.9733,0.9744,0.9766,0.9769,0.9795,0.976,0.9769


Fitting model with Adam where learning rate is 0.007.

In [5]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.007))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.007.pth')
print('Model saved as NN_Adam_10E_lr0.007.pth')
print('Number of Parameters:',count_parameters(model))
results_25=pd.DataFrame(test_acc,columns=epoch_list)
results_25

Model saved as NN_Adam_10E_lr0.007.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9525,0.9594,0.9619,0.9625,0.9672,0.968,0.9689,0.9671,0.971,0.9686


Fitting model with Adam where learning rate is 0.001.

In [5]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001.pth')
print('Model saved as NN_Adam_10E_lr0.001.pth')
print('Number of Parameters:',count_parameters(model))
results_26=pd.DataFrame(test_acc,columns=epoch_list)
results_26

Model saved as NN_Adam_10E_lr0.001.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9561,0.9707,0.974,0.9771,0.9792,0.9768,0.9792,0.981,0.9787,0.9819


Fitting model with Adam where learning rate is 0.0005.

In [5]:
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.0005))
test_acc=[]
test_acc.append(results)
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.0005.pth')
print('Model saved as NN_Adam_10E_lr0.0005.pth')
print('Number of Parameters:',count_parameters(model))
results_27=pd.DataFrame(test_acc,columns=epoch_list)
results_27

Model saved as NN_Adam_10E_lr0.0005.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9451,0.9633,0.9701,0.9757,0.9767,0.9784,0.978,0.9794,0.9785,0.9813


**Among the Adam runs, a learning rate of 0.001 gives the best average accuracy.**

## Fitting good models with normalized data

#### Trainloader for normalized data

In [4]:
transform = transforms.Compose([transforms.ToTensor(),
                              transforms.Normalize((0.5,), (0.5,)),
                              ])
trainset=datasets.MNIST('~/.pytorch/MNIST_data/',train=True,transform=transform,download=True)
testset=datasets.MNIST('~/.pytorch/MNIST_data/',train=False,transform=transform,download=True)

trainloader=torch.utils.data.DataLoader(trainset,batch_size=64,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
testloader=torch.utils.data.DataLoader(testset,batch_size=64,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
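
`transforms.Normalize((0.5,), (0.5,))` applies `(x - mean) / std` to each channel. Since `ToTensor()` already scales pixel values to [0, 1], this maps them to [-1, 1]. A plain-Python illustration of the same arithmetic, with a hypothetical helper `normalize`:

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Same arithmetic as transforms.Normalize, applied to a single value."""
    return (pixel - mean) / std

for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
# 0.0 -> -1.0
# 0.5 -> 0.0
# 1.0 -> 1.0
```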

Fitting model with Adam where learning rate is 0.001.

In [7]:
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_norm.pth')
print('Model saved as NN_Adam_10E_lr0.001_norm.pth')
results_28=pd.DataFrame(test_acc,columns=epoch_list)
results_28

Number of Parameters: 242762
Model saved as NN_Adam_10E_lr0.001_norm.pth


Fitting model with SGD with Nesterov, learning rate as 0.04 and momentum as 0.9.

In [8]:
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.04,momentum=0.9,nesterov=True))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.04_mm0.9_nest_norm.pth')
print('Model saved as NN_SGD_10E_lr0.04_mm0.9_nest_norm.pth')
print('Number of Parameters:',count_parameters(model))
results_29=pd.DataFrame(test_acc,columns=epoch_list)
results_29

Model saved as NN_SGD_10E_lr0.04_mm0.9_nest_norm.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9263,0.9373,0.9445,0.9524,0.9597,0.961,0.9583,0.961,0.9619,0.9688


**No improvement in average accuracy**

## Fitting good models with scaled data

#### Trainloader for scaled non normalized data

In [4]:
train=pd.read_csv(r'C:\Users\Varishu Pant\Desktop\Praxis docs\ml\train.csv')
test=pd.read_csv(r'C:\Users\Varishu Pant\Desktop\Praxis docs\ml\test.csv')
train.shape,test.shape
x=train.drop("label",axis=1)
y=np.array(train['label'])
torch_X_train = torch.from_numpy(x.values).type(torch.FloatTensor)/255
torch_y_train = torch.from_numpy(y).type(torch.LongTensor)
myDataset = torch.utils.data.TensorDataset(torch_X_train,torch_y_train)
valid_no  = int(0.2 * len(myDataset))
# so divide the data into trainset and testset
trainSet,testSet = torch.utils.data.random_split(myDataset,(len(myDataset)-valid_no,valid_no))
print(f"len of trainSet {len(trainSet)} , len of testSet {len(testSet)}")
trainloader  = torch.utils.data.DataLoader(trainSet , batch_size=64 ,shuffle=True,num_workers=0,worker_init_fn=_init_fn) 
testloader  = torch.utils.data.DataLoader(testSet , batch_size=64 ,shuffle=True,num_workers=0,worker_init_fn=_init_fn)

len of trainSet 33600 , len of testSet 8400
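
The preprocessing above divides the raw pixel values (0 to 255) by 255 and holds out 20% of the rows for validation. The arithmetic can be checked without loading the CSV. The row count 42000 below is inferred from the printed split sizes (33600 + 8400), and `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_rows, valid_fraction=0.2):
    """Return (train_size, valid_size) the way the cell above computes them."""
    valid_no = int(valid_fraction * n_rows)
    return n_rows - valid_no, valid_no

print(split_sizes(42000))   # (33600, 8400)
print(0 / 255, 255 / 255)   # 0.0 1.0
```

Note that this validation set is a random 20% of `train.csv`, so its accuracies are not directly comparable with those measured on the 10,000-image MNIST test set used in the earlier runs.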


Fitting model with Adam where learning rate is 0.001.

In [5]:
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_255.pth')
print('Model saved as NN_Adam_10E_lr0.001_255.pth')
results_30=pd.DataFrame(test_acc,columns=epoch_list)
results_30

Number of Parameters: 242762
Model saved as NN_Adam_10E_lr0.001_255.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.939881,0.955357,0.961548,0.968095,0.969643,0.970357,0.972024,0.97,0.972619,0.9725


Fitting model with SGD with Nesterov, learning rate as 0.04 and momentum as 0.9.

In [5]:
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.SGD(model.parameters(),lr=0.04,momentum=0.9,nesterov=True))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
torch.save(model.state_dict(), 'NN_SGD_10E_lr0.04_mm0.9_nest_255.pth')
print('Model saved as NN_SGD_10E_lr0.04_mm0.9_nest_255.pth')
print('Number of Parameters:',count_parameters(model))
results_31=pd.DataFrame(test_acc,columns=epoch_list)
results_31

Model saved as NN_SGD_10E_lr0.04_mm0.9_nest_255.pth
Number of Parameters: 242762


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.939286,0.95619,0.959405,0.952738,0.967976,0.96631,0.971429,0.96869,0.97,0.972738


**No improvement in average accuracy**

### So far, the highest average accuracy was given by Adam with learning rate 0.001 on non-normalized, non-scaled data. So we use that as the default model going forward.

## Tuning batch size 

Fitting model with batch size 32

In [4]:
torch.manual_seed(101)
transform=transforms.Compose([transforms.ToTensor()])
trainset=datasets.MNIST('~/.pytorch/MNIST_data/',train=True,transform=transform,download=True)
testset=datasets.MNIST('~/.pytorch/MNIST_data/',train=False,transform=transform,download=True)

trainloader=torch.utils.data.DataLoader(trainset,batch_size=32,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
testloader=torch.utils.data.DataLoader(testset,batch_size=32,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_32b.pth')
print('Model saved as NN_Adam_10E_lr0.001_32b.pth')
results_32=pd.DataFrame(test_acc,columns=epoch_list)
results_32

Number of Parameters: 242762
Model saved as NN_Adam_10E_lr0.001_32b.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9614,0.9638,0.9716,0.9763,0.9773,0.9788,0.9773,0.9772,0.9786,0.9819


Fitting model with batch size 128

In [4]:
torch.manual_seed(101)
transform=transforms.Compose([transforms.ToTensor()])
trainset=datasets.MNIST('~/.pytorch/MNIST_data/',train=True,transform=transform,download=True)
testset=datasets.MNIST('~/.pytorch/MNIST_data/',train=False,transform=transform,download=True)

trainloader=torch.utils.data.DataLoader(trainset,batch_size=128,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
testloader=torch.utils.data.DataLoader(testset,batch_size=128,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_128b.pth')
print('Model saved as NN_Adam_10E_lr0.001_128b.pth')
results_33=pd.DataFrame(test_acc,columns=epoch_list)
results_33

Number of Parameters: 242762
Model saved as NN_Adam_10E_lr0.001_128b.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9489,0.9667,0.9724,0.9755,0.9759,0.9778,0.9781,0.9798,0.9798,0.9798


Fitting model with batch size 16

In [4]:
torch.manual_seed(101)
transform=transforms.Compose([transforms.ToTensor()])
trainset=datasets.MNIST('~/.pytorch/MNIST_data/',train=True,transform=transform,download=True)
testset=datasets.MNIST('~/.pytorch/MNIST_data/',train=False,transform=transform,download=True)

trainloader=torch.utils.data.DataLoader(trainset,batch_size=16,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
testloader=torch.utils.data.DataLoader(testset,batch_size=16,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_16b.pth')
print('Model saved as NN_Adam_10E_lr0.001_16b.pth')
results_34=pd.DataFrame(test_acc,columns=epoch_list)
results_34

Number of Parameters: 242762
Model saved as NN_Adam_10E_lr0.001_16b.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9568,0.9715,0.9738,0.9769,0.975,0.9789,0.9789,0.9783,0.9784,0.9794


Fitting model with batch size 8

In [4]:
torch.manual_seed(101)
transform=transforms.Compose([transforms.ToTensor()])
trainset=datasets.MNIST('~/.pytorch/MNIST_data/',train=True,transform=transform,download=True)
testset=datasets.MNIST('~/.pytorch/MNIST_data/',train=False,transform=transform,download=True)

trainloader=torch.utils.data.DataLoader(trainset,batch_size=8,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
testloader=torch.utils.data.DataLoader(testset,batch_size=8,shuffle=True,num_workers=0,worker_init_fn=_init_fn)
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_8b.pth')
print('Model saved as NN_Adam_10E_lr0.001_8b.pth')
results_35=pd.DataFrame(test_acc,columns=epoch_list)
results_35

Number of Parameters: 242762
Model saved as NN_Adam_10E_lr0.001_8b.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9587,0.9663,0.9756,0.9757,0.9773,0.9786,0.9788,0.9776,0.9774,0.9791


**Changing the batch size gave no improvement in average accuracy.**
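To compare the three runs above on a single number, the sketch below averages the logged validation accuracies. It assumes that "average accuracy" means the mean over the ten epochs of a run; the notebook does not define the term, so this is one plausible reading. The accuracy values are copied from the output tables above.

```python
from statistics import mean

# Validation accuracy per epoch, copied from the three runs above
# (keys are the batch sizes).
runs = {
    128: [0.9489, 0.9667, 0.9724, 0.9755, 0.9759, 0.9778, 0.9781, 0.9798, 0.9798, 0.9798],
    16: [0.9568, 0.9715, 0.9738, 0.9769, 0.975, 0.9789, 0.9789, 0.9783, 0.9784, 0.9794],
    8: [0.9587, 0.9663, 0.9756, 0.9757, 0.9773, 0.9786, 0.9788, 0.9776, 0.9774, 0.9791],
}

# Assumption: "average accuracy" is the mean over the ten epochs of a run.
mean_acc = {batch: mean(acc) for batch, acc in runs.items()}

for batch, value in mean_acc.items():
    print(f"batch size {batch:>3}: mean accuracy {value:.4f}")
```

Under this reading the three means lie within about 0.0013 of each other.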

## Tuning Parameters

**Keeping all else constant, we change the model architecture to minimize the number of parameters while maintaining good accuracy.**
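The parameter counts printed by `count_parameters` follow directly from the layer widths: each `nn.Linear(n_in, n_out)` contributes `n_in * n_out` weights plus `n_out` biases, and dropout adds no trainable parameters. The helper below, `count_dense_parameters`, is a hypothetical name introduced here for illustration; it is a minimal sketch for checking a candidate architecture before training it.

```python
def count_dense_parameters(layer_sizes):
    """Trainable parameters of a stack of fully connected layers with biases.

    layer_sizes lists the widths from input to output, e.g. [784, 256, 128, 64, 10].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total


print(count_dense_parameters([784, 256, 128, 64, 10]))  # default network: 242762
print(count_dense_parameters([784, 128, 32, 32, 10]))   # 105994
```

Because the first layer maps 784 inputs, its width dominates the total, which is why narrowing `fc1` reduces the count far more than narrowing the later layers.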

In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 32)
        self.fc4 = nn.Linear(32, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_128.64.32.pth')
print('Model saved as NN_Adam_10E_lr0.001_128.64.32.pth')
results_37=pd.DataFrame(test_acc,columns=epoch_list)
results_37

Number of Parameters: 111146
Model saved as NN_Adam_10E_lr0.001_128.64.32.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9474,0.9616,0.9683,0.9704,0.9757,0.9748,0.9772,0.9776,0.9782,0.9767


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 32)
        self.fc3 = nn.Linear(32, 32)
        self.fc4 = nn.Linear(32, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_128.32.32.pth')
print('Model saved as NN_Adam_10E_lr0.001_128.32.32.pth')
results_38=pd.DataFrame(test_acc,columns=epoch_list)
results_38

Number of Parameters: 105994
Model saved as NN_Adam_10E_lr0.001_128.32.32.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9444,0.9594,0.9666,0.9702,0.973,0.9732,0.9735,0.9758,0.9794,0.9781


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 32)
        self.fc3 = nn.Linear(32, 16)
        self.fc4 = nn.Linear(16, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_256.32.16.pth')
print('Model saved as NN_Adam_10E_lr0.001_256.32.16.pth')
results_39=pd.DataFrame(test_acc,columns=epoch_list)
results_39

Number of Parameters: 209882
Model saved as NN_Adam_10E_lr0.001_256.32.16.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9389,0.9581,0.9653,0.9715,0.9731,0.9749,0.9742,0.9767,0.9792,0.9793


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_128.128.64.pth')
print('Model saved as NN_Adam_10E_lr0.001_128.128.64.pth')
results_40=pd.DataFrame(test_acc,columns=epoch_list)
results_40


Number of Parameters: 125898
Model saved as NN_Adam_10E_lr0.001_128.128.64.pth


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc3 = nn.Linear(128, 32)
        self.fc4 = nn.Linear(32, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_128.128.32.pth')
print('Model saved as NN_Adam_10E_lr0.001_128.128.32.pth')
results_41=pd.DataFrame(test_acc,columns=epoch_list)
results_41

Number of Parameters: 121450
Model saved as NN_Adam_10E_lr0.001_128.128.32.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9405,0.963,0.9693,0.9714,0.9739,0.9746,0.976,0.9748,0.9746,0.9772


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_128.64.64.pth')
print('Model saved as NN_Adam_10E_lr0.001_128.64.64.pth')
results_42=pd.DataFrame(test_acc,columns=epoch_list)
results_42

Number of Parameters: 113546
Model saved as NN_Adam_10E_lr0.001_128.64.64.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9508,0.9643,0.9683,0.9731,0.974,0.9729,0.9767,0.9781,0.9779,0.9784


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 100)
        self.fc2 = nn.Linear(100, 50)
        self.fc3 = nn.Linear(50, 30)
        self.fc4 = nn.Linear(30, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_100.50.30.pth')
print('Model saved as NN_Adam_10E_lr0.001_100.50.30.pth')
results_43=pd.DataFrame(test_acc,columns=epoch_list)
results_43

Number of Parameters: 85390
Model saved as NN_Adam_10E_lr0.001_100.50.30.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9406,0.9524,0.9644,0.9689,0.9694,0.9697,0.9735,0.9737,0.9753,0.9761


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 120)
        self.fc2 = nn.Linear(120, 60)
        self.fc3 = nn.Linear(60, 40)
        self.fc4 = nn.Linear(40, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_120.60.40.pth')
print('Model saved as NN_Adam_10E_lr0.001_120.60.40.pth')
results_44=pd.DataFrame(test_acc,columns=epoch_list)
results_44

Number of Parameters: 104310
Model saved as NN_Adam_10E_lr0.001_120.60.40.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9486,0.9605,0.9666,0.9705,0.9718,0.9739,0.9719,0.9759,0.9749,0.9779


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 120)
        self.fc2 = nn.Linear(120, 50)
        self.fc3 = nn.Linear(50, 20)
        self.fc4 = nn.Linear(20, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_120.50.20.pth')
print('Model saved as NN_Adam_10E_lr0.001_120.50.20.pth')
results_45=pd.DataFrame(test_acc,columns=epoch_list)
results_45

Number of Parameters: 101480
Model saved as NN_Adam_10E_lr0.001_120.50.20.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9411,0.9584,0.9655,0.9702,0.9719,0.9737,0.974,0.9766,0.9756,0.9761


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_64.64.64.pth')
print('Model saved as NN_Adam_10E_lr0.001_64.64.64.pth')
results_46=pd.DataFrame(test_acc,columns=epoch_list)
results_46

Number of Parameters: 59210
Model saved as NN_Adam_10E_lr0.001_64.64.64.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9411,0.9569,0.9637,0.9668,0.9677,0.9667,0.9713,0.971,0.9731,0.9712


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 80)
        self.fc2 = nn.Linear(80, 60)
        self.fc3 = nn.Linear(60, 50)
        self.fc4 = nn.Linear(50, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_80.60.50.pth')
print('Model saved as NN_Adam_10E_lr0.001_80.60.50.pth')
results_47=pd.DataFrame(test_acc,columns=epoch_list)
results_47

Number of Parameters: 71220
Model saved as NN_Adam_10E_lr0.001_80.60.50.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9352,0.9551,0.9655,0.9676,0.9693,0.9688,0.9689,0.973,0.9753,0.9753


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 90)
        self.fc2 = nn.Linear(90, 55)
        self.fc3 = nn.Linear(55, 50)
        self.fc4 = nn.Linear(50, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_90.55.50.pth')
print('Model saved as NN_Adam_10E_lr0.001_90.55.50.pth')
results_48=pd.DataFrame(test_acc,columns=epoch_list)
results_48

Number of Parameters: 78965
Model saved as NN_Adam_10E_lr0.001_90.55.50.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9473,0.9604,0.9674,0.9708,0.9721,0.9736,0.9735,0.9741,0.9747,0.9758


In [4]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 95)
        self.fc2 = nn.Linear(95, 60)
        self.fc3 = nn.Linear(60, 50)
        self.fc4 = nn.Linear(50, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_95.60.50.pth')
print('Model saved as NN_Adam_10E_lr0.001_95.60.50.pth')
results_49=pd.DataFrame(test_acc,columns=epoch_list)
results_49

Number of Parameters: 83895
Model saved as NN_Adam_10E_lr0.001_95.60.50.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9454,0.9612,0.9689,0.9698,0.9717,0.9736,0.973,0.9753,0.9749,0.9768


**Keeping all else constant, we change the model architecture to increase the number of parameters while maintaining good accuracy.**
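Each experiment in this section redefines `Network` with different layer widths. As an alternative, the sketch below builds the same layout (flatten, then Linear, ReLU and Dropout per hidden layer, then a log-softmax output) from a list of hidden sizes. `build_mlp` and its argument names are hypothetical, introduced here for illustration; they are not part of the notebook.

```python
import torch
from torch import nn


def build_mlp(hidden_sizes, n_inputs=784, n_classes=10, p_drop=0.2):
    """Hypothetical helper: the Network layout with configurable hidden widths."""
    layers = [nn.Flatten()]  # (batch, 1, 28, 28) -> (batch, 784)
    n_in = n_inputs
    for n_out in hidden_sizes:
        # Same order as Network.forward: linear, ReLU, then dropout
        layers += [nn.Linear(n_in, n_out), nn.ReLU(), nn.Dropout(p=p_drop)]
        n_in = n_out
    # Output layer: no dropout, log-probabilities for use with nn.NLLLoss
    layers += [nn.Linear(n_in, n_classes), nn.LogSoftmax(dim=1)]
    return nn.Sequential(*layers)


wide_model = build_mlp([512, 256, 128])
n_params = sum(p.numel() for p in wide_model.parameters() if p.requires_grad)
print(n_params)  # 567434

out = wide_model(torch.zeros(2, 1, 28, 28))
print(out.shape)  # torch.Size([2, 10])
```

A model built this way has different `state_dict` keys (`1.weight` rather than `fc1.weight`), so the checkpoints saved in this notebook would not load into it directly.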

In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784,512)
        self.fc2 = nn.Linear(512,256)
        self.fc3 = nn.Linear(256,128)
        self.fc4 = nn.Linear(128,10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_512.256.128.pth')
print('Model saved as NN_Adam_10E_lr0.001_512.256.128.pth')
results_50=pd.DataFrame(test_acc,columns=epoch_list)
results_50

Number of Parameters: 567434
Model saved as NN_Adam_10E_lr0.001_512.256.128.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9643,0.973,0.9755,0.978,0.9768,0.9815,0.9804,0.982,0.9828,0.9823


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784,1024)
        self.fc2 = nn.Linear(1024,512)
        self.fc3 = nn.Linear(512,256)
        self.fc4 = nn.Linear(256,10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_1024.512.256.pth')
print('Model saved as NN_Adam_10E_lr0.001_1024.512.256.pth')
results_51=pd.DataFrame(test_acc,columns=epoch_list)
results_51

Number of Parameters: 1462538
Model saved as NN_Adam_10E_lr0.001_1024.512.256.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9643,0.9689,0.9774,0.9787,0.9758,0.9774,0.9803,0.9784,0.9819,0.9813


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784,2048)
        self.fc2 = nn.Linear(2048,1024)
        self.fc3 = nn.Linear(1024,512)
        self.fc4 = nn.Linear(512,10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_2048.1024.512.pth')
print('Model saved as NN_Adam_10E_lr0.001_2048.1024.512.pth')
results_52=pd.DataFrame(test_acc,columns=epoch_list)
results_52

Number of Parameters: 4235786
Model saved as NN_Adam_10E_lr0.001_2048.1024.512.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9581,0.9742,0.9776,0.9781,0.977,0.9789,0.9805,0.9805,0.9811,0.9802


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 400)
        self.fc2 = nn.Linear(400, 200)
        self.fc3 = nn.Linear(200, 100)
        self.fc4 = nn.Linear(100, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_400.200.100.pth')
print('Model saved as NN_Adam_10E_lr0.001_400.200.100.pth')
results_53=pd.DataFrame(test_acc,columns=epoch_list)
results_53

Number of Parameters: 415310
Model saved as NN_Adam_10E_lr0.001_400.200.100.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9589,0.9706,0.9766,0.9734,0.98,0.9795,0.9811,0.9793,0.981,0.9813


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 300)
        self.fc2 = nn.Linear(300, 150)
        self.fc3 = nn.Linear(150, 100)
        self.fc4 = nn.Linear(100, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_300.150.100.pth')
print('Model saved as NN_Adam_10E_lr0.001_300.150.100.pth')
results_54=pd.DataFrame(test_acc,columns=epoch_list)
results_54

Number of Parameters: 296760
Model saved as NN_Adam_10E_lr0.001_300.150.100.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9606,0.9685,0.9747,0.9768,0.9781,0.9784,0.9823,0.9821,0.9806,0.9818


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 350)
        self.fc2 = nn.Linear(350, 128)
        self.fc3 = nn.Linear(128, 100)
        self.fc4 = nn.Linear(100, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_350.128.100.pth')
print('Model saved as NN_Adam_10E_lr0.001_350.128.100.pth')
results_55=pd.DataFrame(test_acc,columns=epoch_list)
results_55

Number of Parameters: 333588
Model saved as NN_Adam_10E_lr0.001_350.128.100.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.9599,0.9727,0.9749,0.9752,0.98,0.9808,0.9828,0.9825,0.9793,0.9794


In [5]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 350)
        self.fc2 = nn.Linear(350, 128)
        self.fc3 = nn.Linear(128, 128)
        self.fc4 = nn.Linear(128, 10)

        # Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p=0.2)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)

        # Now with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))

        # output so no dropout here
        x = F.log_softmax(self.fc4(x), dim=1)

        return x
model=Network()       
criterion=nn.NLLLoss()
test_acc=[]
results=train_nn(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
test_acc.append(results)
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001_350.128.128.pth')
print('Model saved as NN_Adam_10E_lr0.001_350.128.128.pth')
results_56=pd.DataFrame(test_acc,columns=epoch_list)
results_56

Number of Parameters: 337480
Model saved as NN_Adam_10E_lr0.001_350.128.128.pth


Unnamed: 0,Epoch:1,Epoch:2,Epoch:3,Epoch:4,Epoch:5,Epoch:6,Epoch:7,Epoch:8,Epoch:9,Epoch:10
0,0.963,0.9749,0.9734,0.9753,0.9759,0.9811,0.9787,0.9795,0.9802,0.9806


**Validation accuracy fluctuates across epochs instead of improving steadily, which suggests the models may be overfitting in later epochs. So we tune the number of epochs.**

## Tuning Epochs

#### We'll use Adam with learning rate 0.001 and the default starting network architecture to tune the number of epochs

In [5]:
def train_nn_epochs_tune(epochs,optimizer):
    # Uses the global model, criterion, trainloader and testloader
    test_accuracy=[]
    train_losses,test_losses=[],[]

    for e in range(epochs):
        model.train()
        running_loss=0
        for images,labels in trainloader:
            optimizer.zero_grad()
            log_ps=model(images)
            loss=criterion(log_ps,labels)
            loss.backward()
            optimizer.step()
            # criterion returns the batch mean, so weight it by the batch size
            running_loss += loss.item() * images.shape[0]

        # Evaluate on the test set after every epoch with dropout disabled
        test_loss=0
        accuracy=0
        model.eval()
        with torch.no_grad():
            for images,labels in testloader:
                log_ps=model(images)
                test_loss+=criterion(log_ps,labels).item() * images.shape[0]
                # the class with the highest log-probability is the prediction
                top_class=log_ps.argmax(dim=1)
                accuracy+=(top_class==labels).sum().item()

        train_losses.append(running_loss/len(trainloader.dataset))
        test_losses.append(test_loss/len(testloader.dataset))
        test_accuracy.append(accuracy/len(testloader.dataset))

        print("Epoch: {}/{}.. ".format(e+1, epochs),
              "Training Loss: {:.3f}.. ".format(running_loss/len(trainloader.dataset)),
              "Test Loss: {:.3f}.. ".format(test_loss/len(testloader.dataset)),
              "Test Accuracy: {:.3f}".format(accuracy/len(testloader.dataset)))

    # Return the per-epoch curves so the optimum epoch can be picked afterwards
    return train_losses,test_losses,test_accuracy

In [6]:
train_nn_epochs_tune(epochs=10,optimizer=optim.Adam(model.parameters(),lr=0.001))
epoch_list=['Epoch:'+str(i) for i in range(1,11)]
print('Number of Parameters:',count_parameters(model))
torch.save(model.state_dict(), 'NN_Adam_10E_lr0.001.pth')
print('Model saved as NN_Adam_10E_lr0.001.pth')

Epoch: 1/10..  Training Loss: 0.372..  Test Loss: 0.139..  Test Accuracy: 0.956
Epoch: 2/10..  Training Loss: 0.150..  Test Loss: 0.097..  Test Accuracy: 0.971
Epoch: 3/10..  Training Loss: 0.110..  Test Loss: 0.087..  Test Accuracy: 0.974
Epoch: 4/10..  Training Loss: 0.090..  Test Loss: 0.074..  Test Accuracy: 0.977
Epoch: 5/10..  Training Loss: 0.076..  Test Loss: 0.071..  Test Accuracy: 0.979
Epoch: 6/10..  Training Loss: 0.067..  Test Loss: 0.074..  Test Accuracy: 0.977
Epoch: 7/10..  Training Loss: 0.060..  Test Loss: 0.073..  Test Accuracy: 0.979
Epoch: 8/10..  Training Loss: 0.058..  Test Loss: 0.073..  Test Accuracy: 0.981
Epoch: 9/10..  Training Loss: 0.050..  Test Loss: 0.083..  Test Accuracy: 0.979
Epoch: 10/10..  Training Loss: 0.048..  Test Loss: 0.065..  Test Accuracy: 0.982
Number of Parameters: 242762
Model saved as NN_Adam_10E_lr0.001.pth


**The test loss increased at the 6th epoch and the test accuracy dropped, so we take 5 epochs as the optimum. The accuracy at the 5th epoch, 0.979, is taken as the model accuracy.**
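The stopping rule described here can be written down explicitly: take the last epoch before the test loss first rises. The sketch below applies that rule to the losses and accuracies logged above; `optimum_epoch` is a hypothetical helper name, not a function from the notebook.

```python
def optimum_epoch(test_losses):
    """Return the 1-based epoch just before the test loss first increases."""
    for i in range(1, len(test_losses)):
        if test_losses[i] > test_losses[i - 1]:
            return i  # index i is epoch i + 1, so epoch i is the last one before the rise
    return len(test_losses)  # the loss never increased: keep every epoch


# Values copied from the training log above
test_losses = [0.139, 0.097, 0.087, 0.074, 0.071, 0.074, 0.073, 0.073, 0.083, 0.065]
test_accuracy = [0.956, 0.971, 0.974, 0.977, 0.979, 0.977, 0.979, 0.981, 0.979, 0.982]

best = optimum_epoch(test_losses)
print(best, test_accuracy[best - 1])  # 5 0.979
```

The rule is deliberately conservative: it stops at the first rise and ignores later epochs, even where the logged test loss falls again (epoch 10 in this run).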

Similarly, for models with different architectures, we optimize the number of epochs and take the accuracy at the optimum epoch as the model accuracy. We use it to find the models with maximum accuracy for increased and for decreased parameter counts.
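A minimal sketch of that comparison follows, using four of the runs recorded above. The per-architecture tables log accuracy only, so the sketch assumes the analogous rule of stopping just before validation accuracy first drops. The full comparison is in the Excel file, and `stop_before_first_drop` is a hypothetical helper name.

```python
def stop_before_first_drop(accuracies):
    """Return (epoch, accuracy) for the last epoch before accuracy first drops."""
    for i in range(1, len(accuracies)):
        if accuracies[i] < accuracies[i - 1]:
            return i, accuracies[i - 1]
    return len(accuracies), accuracies[-1]


# name -> (parameter count, validation accuracy per epoch), copied from the runs above
runs = {
    "128.32.32": (105994, [0.9444, 0.9594, 0.9666, 0.9702, 0.973, 0.9732, 0.9735, 0.9758, 0.9794, 0.9781]),
    "100.50.30": (85390, [0.9406, 0.9524, 0.9644, 0.9689, 0.9694, 0.9697, 0.9735, 0.9737, 0.9753, 0.9761]),
    "350.128.100": (333588, [0.9599, 0.9727, 0.9749, 0.9752, 0.98, 0.9808, 0.9828, 0.9825, 0.9793, 0.9794]),
    "512.256.128": (567434, [0.9643, 0.973, 0.9755, 0.978, 0.9768, 0.9815, 0.9804, 0.982, 0.9828, 0.9823]),
}
DEFAULT_PARAMS = 242762  # parameter count of the default 256.128.64 network

# name -> (params, optimum epoch, accuracy at that epoch)
summary = {name: (params, *stop_before_first_drop(acc)) for name, (params, acc) in runs.items()}

smaller = {n: s for n, s in summary.items() if s[0] < DEFAULT_PARAMS}
larger = {n: s for n, s in summary.items() if s[0] > DEFAULT_PARAMS}

best_small = max(smaller, key=lambda n: smaller[n][2])
best_large = max(larger, key=lambda n: larger[n][2])
print(best_small, summary[best_small])
print(best_large, summary[best_large])
```

On this subset the rule selects 128.32.32 at epoch 9 and 350.128.100 at epoch 7.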

## Results

**Note-** See the Excel file 'Hyperparameter_Tracking_D19033.xlsx' for the comparative analysis behind these conclusions.

**Maximum accuracy among the networks with increased parameters:**  

**Model Saved As-** NN_Adam_10E_lr0.001_350.128.100.pth  
**Number of parameters-** 333588  
**Batch Size-** 64  
**Data Normalized-** No  
**Data Scaled-** No  
**Optimizer Used-** Adam with 0.001 learning rate  
**Optimum Epoch-** 7  
**Validation Accuracy at optimum epoch-** 98.28%


**Maximum accuracy among the networks with decreased parameters:**  

**Model Saved As-** NN_Adam_10E_lr0.001_128.32.32.pth  
**Number of parameters-** 105994  
**Batch Size-** 64  
**Data Normalized-** No  
**Data Scaled-** No  
**Optimizer Used-** Adam with 0.001 learning rate  
**Optimum Epoch-** 9  
**Validation Accuracy at optimum epoch-** 97.94%
