<a href="https://colab.research.google.com/github/cibergus/ML-open/blob/main/Exercise_2_MNIST_Solution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import torch
from torch import nn, optim
from torchvision import datasets, transforms

torch.manual_seed(42);

In [2]:
# Design NN model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.features = nn.Sequential(
                            nn.Conv2d(1, 32, 3, 1),
                            nn.ReLU(),
                            nn.Conv2d(32, 64, 3, 1),
                            nn.ReLU(),
                            nn.MaxPool2d(2),
                            nn.Dropout(0.25),
                            nn.Flatten()
                        )
        self.classifier = nn.Sequential(
                            nn.Linear(9216, 128),
                            nn.ReLU(),
                            nn.Dropout(0.5),
                            nn.Linear(128, 10),
                            nn.LogSoftmax(1)
                        )
    
    def forward(self, x):
        x = self.features(x)
        output =  self.classifier(x)
        return output

In [3]:
# Create NN
model=Net()

In [4]:
# Load and Pre-process Data
transform=transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
    ])
trainset = datasets.MNIST('../data', train=True, download=True,
                    transform=transform)
train_loader = torch.utils.data.DataLoader(trainset, 
                                           batch_size=64, 
                                           shuffle=True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz


  0%|          | 0/9912422 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/train-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../data/MNIST/raw/train-labels-idx1-ubyte.gz


  0%|          | 0/28881 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz


  0%|          | 0/1648877 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz


  0%|          | 0/4542 [00:00<?, ?it/s]

Extracting ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw



In [5]:
# Function to test model
def test_model(model):
    # Load Test Data
    testset = datasets.MNIST('../data', train=False,
                        transform=transform)
    test_loader = torch.utils.data.DataLoader(testset, batch_size=1000)

    model.eval()
    device = next(model.parameters()).device

    # Test model
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()

    # Print Accuracy
    print('\nTest set: Accuracy: {}/{} ({:.0f}%)\n'.format(
        correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))

In [6]:
# Training (3+5)
# 3 
device = 'cuda' if torch.cuda.is_available() else 'cpu'

loss_fn = nn.NLLLoss()
optimizer = optim.Adadelta(model.parameters(), lr=1.0)
model.train()
model=model.to(device)

for epoch in range(3):
    for batch_idx, (batch_data, batch_target) in enumerate(train_loader):
        # 5
        batch_data, batch_target = batch_data.to(device), batch_target.to(device)
        optimizer.zero_grad()
        batch_output = model(batch_data)
        loss = loss_fn(batch_output, batch_target)
        loss.backward()
        optimizer.step()
        # print results
        if batch_idx % 50 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch+1, batch_idx * len(batch_data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
    test_model(model)


Test set: Accuracy: 9860/10000 (99%)


Test set: Accuracy: 9893/10000 (99%)


Test set: Accuracy: 9909/10000 (99%)



In [7]:
# Save Model
torch.save(model.state_dict(), "mnist_cnn.pt")

### This notebook is a modified version of the PyTorch MNIST Example [here](https://github.com/pytorch/examples/tree/main/mnist).

Things to Try:
- Run the example on Github in a local or server development environment with an IDE like VSCode, PyCharm. Notice how the authors use argparse to pass in configuration parameters.
- Experiment with different NN model designs, loss functions, and optimizers.
- View the data for examples that fail. Is anything wrong with the data, or are they just difficult examples?
