<a href="https://colab.research.google.com/github/ccarpenterg/LearningPyTorch1.x/blob/master/01_getting_started_with_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Getting Started in PyTorch: Training a NN on MNIST

This is a small series of notebooks in which I introduce PyTorch, Facebook's machine learning framework.

Let's start by importing torch (the main library), torchvision, and numpy:

In [0]:
import numpy as np

import torch
import torch.nn.functional as F
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

from torchvision.datasets import MNIST
from torch.utils.data import DataLoader

Let's check which versions of PyTorch and torchvision are installed:

In [15]:
print('PyTorch version:', torch.__version__)
print('Torchvision version:', torchvision.__version__)

PyTorch version: 1.1.0
Torchvision version: 0.3.0


### Simple Neural Network

We will start with a basic example of a shallow NN: an input layer, one hidden layer, and an output layer. We'll use dropout to reduce overfitting.

In [0]:
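class BasicNN(nn.Module):
    """Shallow fully connected network: input -> hidden (ReLU, dropout) -> output."""

    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)   # input -> hidden
        self.fc2 = nn.Linear(hidden_size, num_classes)  # hidden -> class scores
        self.drop = nn.Dropout(0.2)                     # zeroes 20% of hidden units during training
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.drop(x)
        x = self.fc2(x)  # raw scores (logits); no softmax is applied here
        return x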
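
Dropout behaves differently during training and evaluation, which is easy to overlook. The following is a minimal sketch, not part of the original notebook, that isolates an `nn.Dropout(0.2)` layer and feeds it a vector of ones (an arbitrary choice that makes the effect easy to see). The seed is only there to make the illustration repeatable.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the illustration repeatable

drop = nn.Dropout(0.2)  # same drop probability as in BasicNN
x = torch.ones(1000)    # illustrative input, not MNIST data

# Training mode: each element is zeroed with probability 0.2 and the
# survivors are scaled by 1 / (1 - 0.2) = 1.25 to preserve the expected value.
drop.train()
y_train = drop(x)
kept = y_train[y_train != 0]

# Evaluation mode: dropout is the identity function.
drop.eval()
y_eval = drop(x)

print('fraction zeroed in train mode:', (y_train == 0).float().mean().item())
print('values of surviving elements:', kept.unique())
print('eval output equals input:', torch.equal(y_eval, x))
```

Calling `model.train()` or `model.eval()` on a module such as `BasicNN` switches this mode for all of its submodules, including `self.drop`.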

### MNIST

In [0]:
train_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.1307], [0.3081])
])

# No augmentation is used, so validation shares the training transform
valid_transform = train_transform

train_set = MNIST('./data/mnist', train=True, download=True, transform=train_transform)
valid_set = MNIST('./data/mnist', train=False, download=True, transform=valid_transform)
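
To make the effect of `transforms.Normalize([0.1307], [0.3081])` concrete, here is a small sketch that applies the same arithmetic by hand, `(x - mean) / std`, using plain torch. The random tensor is a hypothetical stand-in for one image as produced by `ToTensor()` (floats in [0, 1]); 0.1307 and 0.3081 are the commonly quoted mean and standard deviation of the MNIST training pixels.

```python
import torch

mean, std = 0.1307, 0.3081  # the values passed to transforms.Normalize

# Hypothetical stand-in for one MNIST image after ToTensor(): shape 1x28x28, values in [0, 1]
img = torch.rand(1, 28, 28)

# Normalize subtracts the mean and divides by the std, channel by channel
normalized = (img - mean) / std

# Where the extremes of the pixel range end up after normalization
black = (0.0 - mean) / std  # background pixel
white = (1.0 - mean) / std  # brightest stroke pixel
print(round(black, 4), round(white, 4))
print(normalized.shape)
```

After this step the background pixels sit slightly below zero and the inputs are roughly zero-centered over the dataset.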

In [18]:
print(train_set.data.shape)
print(valid_set.data.shape)

torch.Size([60000, 28, 28])
torch.Size([10000, 28, 28])


In [0]:
train_loader = DataLoader(train_set, batch_size=128, num_workers=0, shuffle=True)
valid_loader = DataLoader(valid_set, batch_size=512, num_workers=0, shuffle=False)
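
The batch size determines how many iterations make up one pass over the data. The sketch below uses a hypothetical in-memory dataset of 1000 blank images (so that it runs without downloading MNIST) to show how a `DataLoader` splits the data, and then repeats the arithmetic for the two loaders defined above.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for MNIST: 1000 blank images with dummy labels
fake_images = torch.zeros(1000, 1, 28, 28)
fake_labels = torch.zeros(1000, dtype=torch.long)
fake_set = TensorDataset(fake_images, fake_labels)

loader = DataLoader(fake_set, batch_size=128, shuffle=True)

# Collect the size of every batch; the last one holds the remainder
batch_sizes = [images.shape[0] for images, labels in loader]
print(len(loader), batch_sizes[-1])

# Same arithmetic for the real loaders: ceil(dataset size / batch size)
train_batches = math.ceil(60000 / 128)
valid_batches = math.ceil(10000 / 512)
print(train_batches, valid_batches)
```

With the default `drop_last=False`, the final batch is simply smaller than the others.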

In [20]:
# Assumes a CUDA-capable GPU is available (e.g. a GPU runtime in Colab)
cuda = torch.device('cuda')

model = BasicNN(784, 128, 10)
model.to(cuda)

BasicNN(
  (fc1): Linear(in_features=784, out_features=128, bias=True)
  (fc2): Linear(in_features=128, out_features=10, bias=True)
  (drop): Dropout(p=0.2)
  (relu): ReLU()
)

In [21]:
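# A random vector stands in for one flattened 28x28 image
# (named `sample` to avoid shadowing Python's built-in `input`)
sample = torch.randn(784, device=cuda)
out = model(sample)
print(out)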

tensor([-0.5404,  0.0699,  0.0364, -0.4216, -0.2057, -0.0990, -0.0782,  0.4534,
         0.1385,  0.2457], device='cuda:0', grad_fn=<AddBackward0>)
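
The ten numbers above are raw scores (logits), one per digit class. One common way to read them, sketched here as an illustration rather than as part of the original notebook, is to pass them through a softmax to get probabilities and take the argmax as the predicted class. The values are copied from the printed output and kept on the CPU for simplicity; since the model is untrained, the prediction itself is meaningless.

```python
import torch
import torch.nn.functional as F

# Logits copied from the printed output above
logits = torch.tensor([-0.5404, 0.0699, 0.0364, -0.4216, -0.2057,
                       -0.0990, -0.0782, 0.4534, 0.1385, 0.2457])

probs = F.softmax(logits, dim=0)         # non-negative values that sum to 1
predicted_class = probs.argmax().item()  # index of the largest score

print(probs.sum().item())
print(predicted_class)
```

Softmax preserves the ordering of the scores, so taking the argmax of the logits directly gives the same class.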


In [22]:
X = torch.randn(28, 28) # matrix
print(X.shape)

x = torch.flatten(X) # vector
print(x.shape)

torch.Size([28, 28])
torch.Size([784])
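
The loaders deliver images in batches of shape `[batch, 1, 28, 28]`, while `nn.Linear(784, ...)` expects the last dimension to be 784. A minimal sketch of flattening a whole batch at once, using a random tensor as a hypothetical stand-in for one batch of 128 images:

```python
import torch

# Hypothetical stand-in for one batch from train_loader
batch = torch.randn(128, 1, 28, 28)

# Keep the batch dimension, flatten everything else
flat_a = torch.flatten(batch, start_dim=1)

# Equivalent alternative using view
flat_b = batch.view(batch.size(0), -1)

print(flat_a.shape)
print(torch.equal(flat_a, flat_b))
```

Either form produces a `[128, 784]` tensor that can be fed to a model such as `BasicNN(784, 128, 10)`.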
