# Lab 9.1: About Adam with MNIST Classifier

**Jonathan Choi 2021**

**[Deep Learning By Torch] End-to-end deep learning study scripts with hands-on PyTorch code practice.**

If you find an issue, please open a PR at the repository below.

[[Deep Learning By Torch] - Github @JonyChoi](https://github.com/jonychoi/Deep-Learning-By-Torch)

In this lab (9.0 and 9.1), we learn about various optimizers: SGD (Stochastic Gradient Descent), which we have used so far, as well as Adam, Adagrad, Momentum, GD, Adadelta, RMSProp, and others. At the end, we train a neural network with the Adam optimizer. Please read the script "09.0 About optimizers" for more background.

![optimizers](https://cdn-images-1.medium.com/max/2000/1*3mbLR7aSgbg_UoueBymw5g.png)

Reference: https://medium.com/octavian-ai/which-optimizer-and-learning-rate-should-i-use-for-deep-learning-5acb418f9b2
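
To make the difference between the optimizers concrete, here is a minimal sketch of one plain gradient descent step next to one Adam step for a single scalar parameter. It follows the standard Adam update rule with its usual default constants; the helper names `sgd_step`, `adam_step`, and `grad_f`, and the toy objective itself, are assumptions made for illustration and are not part of this lab.

```python
import math

def sgd_step(w, grad, lr):
    """Plain gradient descent: move against the gradient, scaled by lr."""
    return w - lr * grad

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

def grad_f(w):
    # Gradient of the hypothetical toy objective f(w) = 0.001 * w**2.
    return 0.002 * w

lr = 0.001
w_sgd = 1.0
w_adam, m, v = 1.0, 0.0, 0.0
for t in range(1, 101):
    w_sgd = sgd_step(w_sgd, grad_f(w_sgd), lr)
    w_adam, m, v = adam_step(w_adam, grad_f(w_adam), m, v, t, lr=lr)

print(f"SGD:  w = {w_sgd:.6f}")
print(f"Adam: w = {w_adam:.6f}")
```

On this toy problem the gradient is tiny, so each SGD step barely moves `w`, while each Adam step moves it by roughly `lr` because the gradient is divided by its own running magnitude. This is only one aspect of Adam's behavior, shown under the assumptions above.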

## Imports

In [218]:
import torch
import torchvision.datasets as datasets
import torchvision.transforms as transforms
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import random

In [219]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

random.seed(1)
torch.manual_seed(1)
if device == 'cuda':
    torch.cuda.manual_seed_all(1)
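
The effect of the seeding cell above can be checked with a small sketch: after re-seeding with the same value, the same random numbers come out again. The helper name `seed_everything` is hypothetical and only wraps the three calls used in this lab.

```python
import random
import torch

def seed_everything(seed):
    # Hypothetical helper: seeds Python's RNG and PyTorch's CPU (and CUDA) RNGs.
    random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)

seed_everything(1)
first = (random.random(), torch.rand(3))

seed_everything(1)
second = (random.random(), torch.rand(3))

same_python_value = first[0] == second[0]
same_torch_values = torch.equal(first[1], second[1])
print(same_python_value, same_torch_values)
```

Seeding this way makes the random draws on one machine repeatable; it does not by itself guarantee identical results across different hardware or library versions.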

## Set Hyperparameters

In [220]:
learning_rate = 0.001
training_epochs = 15
batch_size = 100

## Load MNIST Data

In [221]:
mnist_train = datasets.MNIST(root='MNIST_data/',
                             train=True,
                             transform=transforms.ToTensor(),
                             download=True)
mnist_test = datasets.MNIST(root='MNIST_data/',
                            train=False,
                            transform=transforms.ToTensor(),
                            download=True)

In [222]:
data_loader = torch.utils.data.DataLoader(dataset=mnist_train,
                                          batch_size=batch_size,
                                          shuffle=True,
                                          drop_last=True)
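
To see what `batch_size` and `drop_last=True` do, the following sketch runs a `DataLoader` over a small synthetic dataset instead of MNIST, so it needs no download. The tensors `fake_images` and `fake_labels` are made-up stand-ins with the same per-sample shape as MNIST images.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 250 fake 1x28x28 images with labels 0-9.
fake_images = torch.rand(250, 1, 28, 28)
fake_labels = torch.randint(0, 10, (250,))
fake_dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(fake_dataset, batch_size=100, shuffle=True, drop_last=True)
loader_keep = DataLoader(fake_dataset, batch_size=100, shuffle=True, drop_last=False)

num_batches = len(loader)            # trailing 50 samples are dropped
num_batches_keep = len(loader_keep)  # last batch holds the remaining 50 samples

X, Y = next(iter(loader))
batch_shape = tuple(X.shape)                    # one batch of images
flat_shape = tuple(X.view(-1, 28 * 28).shape)   # the same batch, flattened

print(num_batches, num_batches_keep, batch_shape, flat_shape)
```

With `drop_last=True` every batch has exactly `batch_size` samples, which keeps the per-batch average used in the training loop consistent. The 60,000 MNIST training images divide evenly by 100, so nothing is actually dropped in this lab.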

## Define the Model

In [223]:
class LinearMNISTClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(784, 10)
        nn.init.normal_(self.linear.weight)

    def forward(self, x):
        return self.linear(x)
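
The model above re-initializes its weights with `nn.init.normal_`, which draws them with mean 0 and standard deviation 1. The sketch below compares the starting loss of such a layer with a layer that keeps PyTorch's default initialization. The random inputs and labels are hypothetical stand-ins for flattened images, so the exact numbers will differ from the MNIST runs in this lab.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical batch standing in for flattened images with values in [0, 1].
X = torch.rand(100, 784)
Y = torch.randint(0, 10, (100,))

default_layer = nn.Linear(784, 10)      # keeps PyTorch's default (small) init
normal_layer = nn.Linear(784, 10)
nn.init.normal_(normal_layer.weight)    # mean 0, std 1, as in the lab's model

with torch.no_grad():
    loss_default = F.cross_entropy(default_layer(X), Y).item()
    loss_normal = F.cross_entropy(normal_layer(X), Y).item()

chance_level = math.log(10)  # loss of a uniform guess over 10 classes
print(f"default init: {loss_default:.3f}")
print(f"normal init:  {loss_normal:.3f}")
print(f"chance level: {chance_level:.3f}")
```

Unit-variance weights summed over 784 inputs produce large logits, so the untrained model is confidently wrong and starts far above the chance-level loss. This may help explain why the first-epoch costs reported in this lab are much larger than ln(10).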

## Train with SGD and ADAM

In [228]:
def train(data_loader, model, optimizer):
    # Select the optimizer: "sgd" uses SGD, any other value uses Adam
    if optimizer == "sgd":
        optimizer = optim.SGD(model.parameters(), lr=learning_rate)
    else:
        optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    total_batch = len(data_loader)

    for epoch in range(training_epochs):

        avg_cost = 0
        
        for X, Y in data_loader:

            # Flatten each image: [100, 1, 28, 28] -> [100, 784].
            # Labels are class indices, not one-hot vectors.
            X = X.view(-1, 28 * 28).to(device)
            Y = Y.to(device)

            #prediction
            pred = model(X)

            #cost
            cost = F.cross_entropy(pred, Y)

            # Backpropagate and update the parameters
            optimizer.zero_grad()
            cost.backward()
            optimizer.step()

            # .item() gives a plain float, so the autograd graph is not kept alive
            avg_cost += cost.item() / total_batch

        print('Epoch: {:d}/{:d}, Cost: {:.6f}'.format(epoch+1, training_epochs, avg_cost))

    print('Learning Finished')
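
The `train` function treats every string other than `"sgd"` as Adam, so a typo such as `"adm"` silently selects Adam. One possible alternative is an explicit lookup that rejects unknown names. The names `OPTIMIZERS` and `make_optimizer` below are hypothetical and not part of this lab; the optimizer classes themselves are the ones provided by `torch.optim`.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical lookup table; the keys are illustrative names chosen for this sketch.
OPTIMIZERS = {
    "sgd": optim.SGD,
    "adam": optim.Adam,
    "rmsprop": optim.RMSprop,
    "adagrad": optim.Adagrad,
}

def make_optimizer(name, params, lr):
    """Build an optimizer by name, failing loudly on unknown names."""
    try:
        optimizer_cls = OPTIMIZERS[name.lower()]
    except KeyError:
        raise ValueError(
            f"unknown optimizer {name!r}; choose from {sorted(OPTIMIZERS)}"
        )
    return optimizer_cls(params, lr=lr)

model = nn.Linear(784, 10)
opt = make_optimizer("Adam", model.parameters(), lr=0.001)
print(type(opt).__name__)
```

Inside `train`, the `if`/`else` could then be replaced by a single call such as `make_optimizer(optimizer, model.parameters(), learning_rate)`, assuming this helper is defined in the notebook.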

## Train with SGD

In [229]:
model = LinearMNISTClassifier().to(device)
train(data_loader, model, 'sgd')

Epoch: 1/15, Cost: 12.765790
Epoch: 2/15, Cost: 10.279278
Epoch: 3/15, Cost: 8.915923
Epoch: 4/15, Cost: 7.992557
Epoch: 5/15, Cost: 7.297637
Epoch: 6/15, Cost: 6.743856
Epoch: 7/15, Cost: 6.280459
Epoch: 8/15, Cost: 5.879832
Epoch: 9/15, Cost: 5.527321
Epoch: 10/15, Cost: 5.214455
Epoch: 11/15, Cost: 4.935022
Epoch: 12/15, Cost: 4.684261
Epoch: 13/15, Cost: 4.458256
Epoch: 14/15, Cost: 4.253722
Epoch: 15/15, Cost: 4.067929
Learning Finished


In [230]:
#Test the model using test sets

with torch.no_grad():
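    # .data and .targets replace the deprecated .test_data and .test_labels.
    # Note: .data holds raw 0-255 pixel values; the ToTensor() scaling used in training is not applied here.
    X_test = mnist_test.data.view(-1, 28 * 28).float().to(device)
    Y_test = mnist_test.targets.to(device)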

    pred = model(X_test)
    correct_prediction = torch.argmax(pred, 1) == Y_test
    accuracy = correct_prediction.float().mean()

    print('Accuracy: ', accuracy.item())

    #Get one and predict
    r = random.randint(0, len(mnist_test) - 1)

    # Shapes to keep in mind:
    #   X_test[r].shape       => torch.Size([784])     (single sample, no batch dimension)
    #   X_test[r:r + 1].shape => torch.Size([1, 784])  (batch of one)
    # For a [784] input the model output is 1-D, so torch.argmax(single_prediction, 1) raises
    # IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1).
    # torch.argmax(single_prediction, 0) works for the 1-D output.
    #
    # Batch-of-one alternative (shape [1, 784]):
    # X_single_data = mnist_test.data[r:r + 1].view(-1, 28 * 28).float().to(device)
    # Y_single_data = mnist_test.targets[r:r + 1].to(device)
    X_single_data = X_test[r]
    Y_single_data = Y_test[r]

    print('Label: ', Y_single_data.item())
    single_prediction = model(X_single_data)
    print('Prediction: ', torch.argmax(single_prediction, 0).item())

Accuracy:  0.4203999936580658
Label:  0
Prediction:  0
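
The comments in the cell above describe how the `argmax` dimension depends on whether the sample keeps its batch dimension. The sketch below reproduces this with a random, untrained linear layer and made-up inputs, so only the shapes matter here, not the predicted values.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(784, 10)
X_test = torch.rand(5, 784)   # hypothetical stand-in for the flattened test images
r = 2

with torch.no_grad():
    single = model(X_test[r])          # input [784]    -> output [10]
    batched = model(X_test[r:r + 1])   # input [1, 784] -> output [1, 10]

pred_single = torch.argmax(single, dim=0).item()    # 1-D output: reduce over dim 0
pred_batched = torch.argmax(batched, dim=1).item()  # 2-D output: reduce over dim 1
pred_last = torch.argmax(single, dim=-1).item()     # dim=-1 is valid for both shapes

print(tuple(single.shape), tuple(batched.shape))
print(pred_single, pred_batched, pred_last)
```

Using `dim=-1` always reduces over the class dimension, so the same line works for a single sample and for a batch.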


## Train with ADAM

In [231]:
model = LinearMNISTClassifier().to(device)
train(data_loader, model, 'adam')

Epoch: 1/15, Cost: 5.672674
Epoch: 2/15, Cost: 1.664780
Epoch: 3/15, Cost: 1.087722
Epoch: 4/15, Cost: 0.856496
Epoch: 5/15, Cost: 0.727496
Epoch: 6/15, Cost: 0.643582
Epoch: 7/15, Cost: 0.584310
Epoch: 8/15, Cost: 0.541188
Epoch: 9/15, Cost: 0.508095
Epoch: 10/15, Cost: 0.481329
Epoch: 11/15, Cost: 0.459101
Epoch: 12/15, Cost: 0.440584
Epoch: 13/15, Cost: 0.424890
Epoch: 14/15, Cost: 0.411225
Epoch: 15/15, Cost: 0.399128
Learning Finished
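
The same kind of comparison can be tried in a few seconds on a small synthetic task. This is a sketch under several assumptions: the data (`X`, `Y`, `hidden_W`), the layer size, and the number of steps are all made up, and it uses full-batch updates rather than the mini-batches of this lab. Only the learning rate and the `nn.init.normal_` initialization mirror the setup above.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

# Hypothetical synthetic task: 20 features, 10 classes, labels from a hidden linear map.
X = torch.rand(256, 20)
hidden_W = torch.randn(20, 10)
Y = ((X - 0.5) @ hidden_W).argmax(dim=1)

base = nn.Linear(20, 10)
nn.init.normal_(base.weight)  # same initialization style as the lab's model

def run(optimizer_cls, steps=300, lr=0.001):
    model = copy.deepcopy(base)   # identical starting weights for both runs
    optimizer = optimizer_cls(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = F.cross_entropy(model(X), Y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return F.cross_entropy(model(X), Y).item()

with torch.no_grad():
    initial_loss = F.cross_entropy(base(X), Y).item()

sgd_loss = run(optim.SGD)
adam_loss = run(optim.Adam)
print(f"initial {initial_loss:.3f} | SGD {sgd_loss:.3f} | Adam {adam_loss:.3f}")
```

Both runs start from the same weights and use the same learning rate, so any difference in the final loss comes from the update rule alone. A result on a toy task like this should not be read as a general ranking of the two optimizers.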


In [241]:
#Test the model using test sets

with torch.no_grad():
    X_test = mnist_test.data.view(-1 ,28 * 28).float().to(device)
    Y_test = mnist_test.targets.to(device)

    pred = model(X_test)
    correct_prediction = torch.argmax(pred, 1) == Y_test
    accuracy = correct_prediction.float().mean()

    print('Accuracy: ', accuracy.item())

    #Get one and predict
    r = random.randint(0, len(mnist_test) - 1)

    print(mnist_test.data[r: r+1].view(-1, 28 * 28).shape, X_test[r: r+1].shape)

    # Shapes to keep in mind:
    #   X_test[r].shape       => torch.Size([784])     (single sample, no batch dimension)
    #   X_test[r:r + 1].shape => torch.Size([1, 784])  (batch of one)
    # For a [784] input the model output is 1-D, so torch.argmax(single_prediction, 1) raises
    # IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1).
    # torch.argmax(single_prediction, 0) works for the 1-D output.
    #
    # Batch-of-one alternative (shape [1, 784]):
    # X_single_data = mnist_test.data[r:r + 1].view(-1, 28 * 28).float().to(device)
    # Y_single_data = mnist_test.targets[r:r + 1].to(device)
    X_single_data = X_test[r]
    Y_single_data = Y_test[r]

    print('Label: ', Y_single_data.item())
    single_prediction = model(X_single_data)
    print('Prediction: ', torch.argmax(single_prediction, 0).item())

Accuracy:  0.8872999548912048
torch.Size([1, 784]) torch.Size([1, 784])
Label:  3
Prediction:  3
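
The test cells above read `mnist_test.data` directly, which bypasses the `transforms.ToTensor()` scaling applied during training, so the model is evaluated on raw 0-255 pixel values. One possible way to evaluate on inputs preprocessed the same way as the training data is to iterate over a `DataLoader`. The function `evaluate` below is a hypothetical helper, and it is run here on made-up data so that the sketch works without downloading MNIST.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, device="cpu"):
    """Return the accuracy over all batches yielded by loader."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for X, Y in loader:
            X = X.view(X.size(0), -1).to(device)   # flatten each image
            Y = Y.to(device)
            correct += (model(X).argmax(dim=1) == Y).sum().item()
            total += Y.size(0)
    return correct / total

# Hypothetical stand-in data: 250 fake images with random labels.
torch.manual_seed(0)
fake_test = TensorDataset(torch.rand(250, 1, 28, 28), torch.randint(0, 10, (250,)))
test_loader = DataLoader(fake_test, batch_size=100)

accuracy = evaluate(nn.Linear(784, 10), test_loader)
print(f"Accuracy: {accuracy:.4f}")
```

With the objects from this lab, a call might look like `evaluate(model, torch.utils.data.DataLoader(mnist_test, batch_size=1000), device)`. The accuracy obtained this way may differ from the values printed above, since the inputs are scaled differently.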
