<a href="https://colab.research.google.com/github/jonkrohn/pytorch/blob/master/notebooks/deep_net_in_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep Neural Network in PyTorch

_Remember to change your Runtime type to GPU or TPU_
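
Selecting a GPU runtime does not by itself move any computation: PyTorch keeps modules and tensors on the CPU unless they are moved explicitly, and the cells in this notebook never do so. The following is a minimal sketch of device placement, assuming a CUDA device may or may not be present. The names `device`, `layer`, and `batch` are illustrative and are not used elsewhere in the notebook.

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no CUDA device is visible.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

layer = nn.Linear(784, 10).to(device)      # moves the layer's parameters
batch = torch.randn(128, 784).to(device)   # inputs must live on the same device
out = layer(batch)

print(out.device.type, tuple(out.shape))
```

Applying the same idea here would mean calling `.to(device)` on `model` once and on each `X` and `y` inside the training and test loops.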

#### Load dependencies

In [0]:
import torch
import torch.nn as nn

from torchvision.datasets import MNIST
from torchvision import transforms

from torchsummary import summary

#### Load data

In [2]:
train = MNIST('data', train=True, transform=transforms.ToTensor(), download=True)
test = MNIST('data', train=False, transform=transforms.ToTensor())

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Processing...
Done!


#### Batch data

In [0]:
train_loader = torch.utils.data.DataLoader(train, batch_size=128) 
test_loader = torch.utils.data.DataLoader(test, batch_size=128) 
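
With `batch_size=128`, the number of batches per epoch follows from the dataset sizes (MNIST has 60,000 training and 10,000 test images). A small sketch of that arithmetic, assuming the default `drop_last=False` so that the final, shorter batch is kept; `count_batches` and `last_batch_size` are helper names introduced here for illustration:

```python
import math

def count_batches(n_samples, batch_size):
    # The final partial batch is kept unless drop_last=True.
    return math.ceil(n_samples / batch_size)

def last_batch_size(n_samples, batch_size):
    remainder = n_samples % batch_size
    return remainder if remainder else batch_size

print(count_batches(60000, 128), last_batch_size(60000, 128))  # 469 96
print(count_batches(10000, 128), last_batch_size(10000, 128))  # 79 16
```

These counts match the `len(train_loader)` and `len(test_loader)` values reported later in the notebook.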

#### Design neural network architecture

In [0]:
n_input = 784
n_dense_1 = 64
n_dense_2 = 64
n_dense_3 = 64
n_out = 10

In [0]:
model = nn.Sequential(
    
    # first hidden layer: 
    nn.Linear(n_input, n_dense_1), 
    nn.ReLU(), 
    
    # second hidden layer: 
    nn.Linear(n_dense_1, n_dense_2), 
    nn.ReLU(), 
    
    # third hidden layer: 
    nn.Linear(n_dense_2, n_dense_3), 
    nn.ReLU(), 
    nn.Dropout(),  
    
    # output layer: 
    nn.Linear(n_dense_3, n_out) 
)
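
The `nn.Dropout()` layer after the third hidden layer uses the default drop probability `p=0.5`. In training mode it zeroes each activation with that probability and scales the survivors by `1 / (1 - p)`; in evaluation mode it passes its input through unchanged. A minimal sketch of that behaviour on a standalone layer (the names `drop` and `x` are illustrative):

```python
import torch
import torch.nn as nn

drop = nn.Dropout()          # default p=0.5
x = torch.ones(1000)

drop.train()
train_out = drop(x)          # entries are either zeroed or scaled to 2.0

drop.eval()
eval_out = drop(x)           # identity in evaluation mode

print(sorted(train_out.unique().tolist()))
print(torch.equal(eval_out, x))
```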

In [7]:
summary(model, (1, n_input))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Linear-1                [-1, 1, 64]          50,240
              ReLU-2                [-1, 1, 64]               0
            Linear-3                [-1, 1, 64]           4,160
              ReLU-4                [-1, 1, 64]               0
            Linear-5                [-1, 1, 64]           4,160
              ReLU-6                [-1, 1, 64]               0
           Dropout-7                [-1, 1, 64]               0
            Linear-8                [-1, 1, 10]             650
Total params: 59,210
Trainable params: 59,210
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.23
Estimated Total Size (MB): 0.23
----------------------------------------------------------------
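
The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases. A short sketch of that arithmetic, assuming 32-bit floats for the size estimate (`linear_params` is a helper introduced here for illustration):

```python
def linear_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return n_in * n_out + n_out

layer_sizes = [784, 64, 64, 64, 10]
per_layer = [linear_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
size_mb = total * 4 / 1024 ** 2  # assuming 4 bytes per parameter

print(per_layer)          # [50240, 4160, 4160, 650]
print(total)              # 59210
print(round(size_mb, 2))  # 0.23
```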


#### Configure training hyperparameters

In [0]:
cost_fxn = nn.CrossEntropyLoss() # applies log-softmax internally, so the model outputs raw logits
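
Because `nn.CrossEntropyLoss` folds the softmax into the loss, the network's output layer has no activation of its own. As a rough check, the loss should equal a log-softmax followed by negative log-likelihood. The sketch below uses random stand-in logits rather than real model output; `logits` and `targets` are illustrative names.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # stand-in for raw model output
targets = torch.tensor([3, 0, 9, 1])  # integer class labels

loss_builtin = nn.CrossEntropyLoss()(logits, targets)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(loss_builtin, loss_manual))  # True
```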

In [0]:
optimizer = torch.optim.Adam(model.parameters())

#### Train

In [0]:
def accuracy_pct(pred_y, true_y):
  _, prediction = torch.max(pred_y, 1) # reducing over dim 1 returns (max values, indices); the indices are the predicted classes
  correct = (prediction == true_y).sum().item()
  return (correct / true_y.shape[0]) * 100.0
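
A quick sanity check of `accuracy_pct` on a hand-made batch, where the expected answer is easy to verify by eye. The `scores` and `labels` tensors are made-up illustrative values, not MNIST data.

```python
import torch

def accuracy_pct(pred_y, true_y):
    _, prediction = torch.max(pred_y, 1)  # index of the largest score per row
    correct = (prediction == true_y).sum().item()
    return (correct / true_y.shape[0]) * 100.0

scores = torch.tensor([[0.1, 0.9],
                       [0.8, 0.2],
                       [0.3, 0.7],
                       [0.6, 0.4]])
labels = torch.tensor([1, 0, 0, 0])

# Predicted classes are [1, 0, 1, 0], so three of four match the labels.
print(accuracy_pct(scores, labels))  # 75.0
```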

In [12]:
n_batches = len(train_loader)
n_batches

469

In [13]:
n_epochs = 10 

print('Training for {} epochs. \n'.format(n_epochs))

for epoch in range(n_epochs):
  
  avg_cost = 0.0
  avg_accuracy = 0.0
  
  for i, (X, y) in enumerate(train_loader): # enumerate() provides count of iterations  
    
    # forward propagation:
    X_flat = X.view(X.shape[0], -1)
    y_hat = model(X_flat)
    cost = cost_fxn(y_hat, y)
    avg_cost += cost.item() / n_batches # .item() yields a Python float, so the autograd graph is not retained
    
    # backprop and optimization via gradient descent: 
    optimizer.zero_grad() # set gradients to zero; .backward() accumulates them in buffers
    cost.backward()
    optimizer.step()
    
    # calculate accuracy metric:
    accuracy = accuracy_pct(y_hat, y)
    avg_accuracy += accuracy / n_batches
    
    if (i + 1) % 100 == 0:
      print('Step {}'.format(i + 1))
    
  print('Epoch {}/{} complete: Cost: {:.3f}, Accuracy: {:.1f}% \n'
        .format(epoch + 1, n_epochs, avg_cost, avg_accuracy)) 

print('Training complete.')

Training for 10 epochs. 

Step 100
Step 200
Step 300
Step 400
Epoch 1/10 complete: Cost: 0.642, Accuracy: 80.5% 

Step 100
Step 200
Step 300
Step 400
Epoch 2/10 complete: Cost: 0.270, Accuracy: 92.7% 

Step 100
Step 200
Step 300
Step 400
Epoch 3/10 complete: Cost: 0.195, Accuracy: 94.7% 

Step 100
Step 200
Step 300
Step 400
Epoch 4/10 complete: Cost: 0.158, Accuracy: 95.7% 

Step 100
Step 200
Step 300
Step 400
Epoch 5/10 complete: Cost: 0.128, Accuracy: 96.5% 

Step 100
Step 200
Step 300
Step 400
Epoch 6/10 complete: Cost: 0.117, Accuracy: 96.9% 

Step 100
Step 200
Step 300
Step 400
Epoch 7/10 complete: Cost: 0.099, Accuracy: 97.3% 

Step 100
Step 200
Step 300
Step 400
Epoch 8/10 complete: Cost: 0.087, Accuracy: 97.5% 

Step 100
Step 200
Step 300
Step 400
Epoch 9/10 complete: Cost: 0.077, Accuracy: 97.8% 

Step 100
Step 200
Step 300
Step 400
Epoch 10/10 complete: Cost: 0.066, Accuracy: 98.2% 

Training complete.
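
The training loop calls `optimizer.zero_grad()` on every step because `.backward()` adds to any gradients already stored on the parameters. A minimal sketch of that accumulation on a single tensor, using `w.grad.zero_()` in place of an optimizer (`w` is an illustrative name):

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * 3).sum().backward()
first = w.grad.clone()         # gradient of sum(3 * w) is 3 for each element

(w * 3).sum().backward()       # no zeroing in between
accumulated = w.grad.clone()   # gradients have been summed: 6 for each element

w.grad.zero_()                 # what zero_grad() does for every parameter
(w * 3).sum().backward()
after_reset = w.grad.clone()   # back to 3 for each element

print(first, accumulated, after_reset)
```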


#### Test model

In [14]:
n_test_batches = len(test_loader)
n_test_batches

79

In [15]:
model.eval() # inference mode: dropout is disabled; batch norm layers, if any, use their running statistics

with torch.no_grad(): # disables gradient tracking, reducing memory consumption
  
  avg_test_cost = 0.0
  avg_test_acc = 0.0
  
  for X, y in test_loader:
    
    # make predictions: 
    X_flat = X.view(X.shape[0], -1)
    y_hat = model(X_flat)
    
    # calculate cost: 
    cost = cost_fxn(y_hat, y)
    avg_test_cost += cost.item() / n_test_batches # accumulate as a Python float
    
    # calculate accuracy:
    test_accuracy = accuracy_pct(y_hat, y)
    avg_test_acc += test_accuracy / n_test_batches

print('Test cost: {:.3f}, Test accuracy: {:.1f}%'.format(avg_test_cost, avg_test_acc))

# model.train() # restores training mode (re-enables dropout) if training resumes

Test cost: 0.102, Test accuracy: 97.3%
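
The loops above average per-batch metrics with equal weight, although the final batch is shorter than the rest (16 test images instead of 128). The effect on the reported figures is likely small, but a sample-weighted mean removes it. A sketch of the difference, using hypothetical per-batch accuracies chosen to exaggerate the gap (both helper functions are introduced here for illustration):

```python
def mean_of_batch_means(batch_means):
    # Every batch counts equally, regardless of its size.
    return sum(batch_means) / len(batch_means)

def sample_weighted_mean(batch_means, batch_sizes):
    # Every sample counts equally.
    total = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)

# Hypothetical accuracies: two full batches and a short final one.
accs = [100.0, 100.0, 50.0]
sizes = [128, 128, 16]

print(mean_of_batch_means(accs))
print(sample_weighted_mean(accs, sizes))
```

In the test loop this would amount to accumulating `cost.item() * X.shape[0]` and the count of correct predictions, then dividing by the total number of test images.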
