# Lab 9.3: ReLU Activation with an MNIST Classifier

Edited By Steve Ive

Here, we train a multi-layer neural network that uses the ReLU activation function.
You can learn more about activation functions in "09.2 About Activations".
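As a quick refresher, ReLU computes `max(0, x)`. The sketch below is only an illustration with made-up input values (not part of the original lab). It compares the gradient that flows back through ReLU with the gradient through a sigmoid on the same inputs.

```python
import torch

# Made-up sample inputs spanning negative, zero, and positive values.
x = torch.tensor([-3.0, -1.0, 0.0, 2.0, 5.0], requires_grad=True)

# ReLU keeps positive inputs and zeroes out the rest: relu(x) = max(0, x).
relu_out = torch.relu(x)
relu_out.sum().backward()
relu_grad = x.grad.clone()

# Reset the accumulated gradient before the second backward pass.
x.grad.zero_()

# Sigmoid squashes inputs into (0, 1); its derivative never exceeds 0.25.
sigmoid_out = torch.sigmoid(x)
sigmoid_out.sum().backward()
sigmoid_grad = x.grad.clone()

print('relu(x):      ', relu_out.detach())
print('relu grad:    ', relu_grad)
print('sigmoid grad: ', sigmoid_grad)
```

For positive inputs the ReLU gradient is exactly 1, while the sigmoid gradient shrinks toward 0 as the input moves away from 0. This is the usual motivation for using ReLU in networks with several layers.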

Reference:

https://github.com/deeplearningzerotoall/PyTorch/blob/master/lab-09_2_mnist_nn.ipynb

In [4]:
## Imports

In [5]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision.datasets as datasets
import torchvision.transforms as transforms
import random

In [6]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
random.seed(1)
torch.manual_seed(1)

if device == 'cuda':
    torch.cuda.manual_seed_all(1)

## Set Hyperparameters

In [7]:
learning_rate = 0.001
training_epochs = 15
batch_size = 100

## Load MNIST Dataset

In [8]:
mnist_train = datasets.MNIST(root='MNIST_data/',
                             transform=transforms.ToTensor(),
                             download=True,
                             train=True)
mnist_test = datasets.MNIST(root='MNIST_data/',
                            transform=transforms.ToTensor(),
                            download=True,
                            train=False)

In [9]:
data_loader = torch.utils.data.DataLoader(dataset=mnist_train, shuffle=True, drop_last=True, batch_size=batch_size)
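To see what `drop_last=True` and `batch_size` do without downloading MNIST, here is a small sketch on a hypothetical stand-in dataset. The names `fake_images`, `fake_labels`, and `fake_dataset`, and the sample count of 250, are assumptions made for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 250 random 1x28x28 "images" with labels 0-9.
fake_images = torch.rand(250, 1, 28, 28)
fake_labels = torch.randint(0, 10, (250,))
fake_dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(fake_dataset, batch_size=100, shuffle=True, drop_last=True)

# 250 samples at 100 per batch give 2 full batches; the last 50 samples are dropped.
num_batches = len(loader)

# Each batch arrives as (N, 1, 28, 28); flattening gives the (N, 784) input the model expects.
X, Y = next(iter(loader))
flat = X.view(-1, 28 * 28)

print(num_batches, tuple(X.shape), tuple(flat.shape), tuple(Y.shape))
```

With `drop_last=True`, every batch has exactly `batch_size` samples, so dividing the summed cost by `len(data_loader)` gives a plain per-batch average.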

## Define Model

In [28]:
class Relu_MNIST_Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.sq = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 256),
            nn.ReLU(),
            nn.Linear(256, 10),
        )
        self.weighter()

    def forward(self, x):
        return self.sq(x)

    def weighter(self):
        # Draw every Linear layer's weights from N(0, 1); ReLU layers have no weights.
        for layer in self.sq:
            if isinstance(layer, nn.Linear):
                nn.init.normal_(layer.weight)

In [29]:
model = Relu_MNIST_Classifier().to(device)

Linear(in_features=784, out_features=256, bias=True)
Linear(in_features=256, out_features=256, bias=True)
Linear(in_features=256, out_features=10, bias=True)
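The `weighter` method uses `nn.init.normal_`, which draws weights with a standard deviation of 1. A common alternative for ReLU networks is He (Kaiming) initialization, which scales the standard deviation by the layer's fan-in. The sketch below swaps that technique in for comparison only. The helper `init_linear_layers` and the variable `sketch` are hypothetical names, and this is not the method used in this lab.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(1)


def init_linear_layers(module):
    """Hypothetical helper: He-initialize Linear weights and zero the biases."""
    if isinstance(module, nn.Linear):
        nn.init.kaiming_normal_(module.weight, nonlinearity='relu')
        nn.init.zeros_(module.bias)


# Same layer sizes as the classifier above, built only for this comparison.
sketch = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Module.apply visits every submodule, so no index bookkeeping is needed.
sketch.apply(init_linear_layers)

# With fan_in = 784, the target standard deviation is sqrt(2 / 784).
expected_std = math.sqrt(2.0 / 784)
first_std = sketch[0].weight.std().item()
print('expected std: {:.4f}, measured std: {:.4f}'.format(expected_std, first_std))
```

Smaller initial weights keep the first logits in a moderate range, which usually makes the early training cost less extreme. Whether that changes the final accuracy in this lab has not been measured here.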


Sometimes the optimizer raises an error saying it cannot find the model's parameters.
Rewriting the `__init__` part of the model class solved it,
so it seems to be a problem with how the parameters are initialized.
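The exact cause of the error above is not recorded. One plausible cause, shown here as an assumption rather than a diagnosis, is a model whose layers are not registered with `nn.Module`, which leaves `model.parameters()` empty. The classes `Unregistered` and `Registered` and the helper `count_parameters` are hypothetical names used only to illustrate the check.

```python
import torch.nn as nn
import torch.optim as optim


class Unregistered(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list hides the layer from nn.Module's parameter tracking.
        self.layers = [nn.Linear(784, 10)]


class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList (or nn.Sequential) registers the layer properly.
        self.layers = nn.ModuleList([nn.Linear(784, 10)])


def count_parameters(module):
    """Total number of scalar parameters the optimizer would see."""
    return sum(p.numel() for p in module.parameters())


unregistered_count = count_parameters(Unregistered())
registered_count = count_parameters(Registered())
print(unregistered_count, registered_count)

# Building an optimizer from an empty parameter list raises a ValueError.
try:
    optim.Adam(Unregistered().parameters(), lr=0.001)
    empty_list_error = None
except ValueError as err:
    empty_list_error = str(err)
print(empty_list_error)
```

Checking the parameter count before building the optimizer is a cheap way to confirm that the model's layers were registered.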

In [12]:
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

## Train Model

In [13]:
total_batch = len(data_loader)

for epoch in range(training_epochs):

    avg_cost = 0

    for X, Y in data_loader:

        X = X.view(-1, 28 * 28).to(device)
        Y = Y.to(device)

        #prediction
        pred = model(X)
        
        #cost
        cost = F.cross_entropy(pred, Y)

        #Reduce cost
        optimizer.zero_grad()
        cost.backward()
        optimizer.step()

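        avg_cost += cost.item()
        
    avg_cost = avg_cost / total_batch
    # Report the epoch-average cost. The log below is from a run that printed the last batch's cost, so its values fluctuate.
    print('Epoch: {:d} / {:d}, Cost: {:.6f}'.format(epoch + 1, training_epochs, avg_cost))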

print('Learning Finished')

Epoch: 1 / 15, Cost: 0.134539
Epoch: 2 / 15, Cost: 0.162091
Epoch: 3 / 15, Cost: 0.041155
Epoch: 4 / 15, Cost: 0.092741
Epoch: 5 / 15, Cost: 0.007326
Epoch: 6 / 15, Cost: 0.108856
Epoch: 7 / 15, Cost: 0.032949
Epoch: 8 / 15, Cost: 0.004769
Epoch: 9 / 15, Cost: 0.024743
Epoch: 10 / 15, Cost: 0.010422
Epoch: 11 / 15, Cost: 0.010646
Epoch: 12 / 15, Cost: 0.009288
Epoch: 13 / 15, Cost: 0.002405
Epoch: 14 / 15, Cost: 0.001527
Epoch: 15 / 15, Cost: 0.020818
Learning Finished
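The model ends in a plain `nn.Linear` with no softmax because `F.cross_entropy` expects raw logits and applies log-softmax internally. The sketch below checks that equivalence on made-up logits and targets (the values are arbitrary and chosen only for illustration).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)

# Made-up batch: 4 samples, 10 class scores each (raw logits, no softmax applied).
logits = torch.randn(4, 10)
targets = torch.tensor([3, 0, 7, 1])

# Built-in loss applied directly to the logits.
loss_builtin = F.cross_entropy(logits, targets)

# The same loss assembled by hand: log-softmax followed by negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(loss_builtin.item(), loss_manual.item())
```

Adding an explicit softmax layer before `F.cross_entropy` would apply the normalization twice and distort the loss.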


In [19]:
#Test the model using test sets

with torch.no_grad():
    X_test = mnist_test.data.view(-1, 28 * 28).float().to(device)
    Y_test = mnist_test.targets.to(device)

    #prediction
    pred = model(X_test)

    correct_prediction = torch.argmax(pred, 1) == Y_test
    accuracy = correct_prediction.float().mean()
    print('Accuracy: {:.9f}'.format(accuracy.item()))

    r = random.randint(0, len(mnist_test) - 1)

    X_single_test = X_test[r]
    Y_single_test = Y_test[r]

    print('Label: {}'.format(Y_single_test.item()))
    print('Prediction: {}'.format(torch.argmax(model(X_single_test)).item()))


Accuracy: 0.979900002
Label: 0
Prediction: 0
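One thing worth checking in the test cell is input scaling. `transforms.ToTensor()` scales training pixels to [0, 1], whereas `mnist_test.data` holds raw `uint8` pixels in [0, 255] and the cell only casts them to float. The sketch below uses a tiny hypothetical tensor `raw` in place of `mnist_test.data` to show the difference.

```python
import torch

# Hypothetical stand-in for mnist_test.data: raw uint8 pixel values in [0, 255].
raw = torch.tensor([[0, 128, 255]], dtype=torch.uint8)

# Casting alone keeps the 0-255 range that the network never saw during training.
unscaled = raw.float()

# Dividing by 255 reproduces the [0, 1] range that ToTensor() produces.
scaled = raw.float() / 255.0

print('unscaled range:', unscaled.min().item(), unscaled.max().item())
print('scaled range:  ', scaled.min().item(), scaled.max().item())
```

Applied to the test cell, this would read `mnist_test.data.view(-1, 28 * 28).float().div(255).to(device)`. The accuracy shown above was measured without this scaling, so a rescaled run may report a slightly different number.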
