# Intro to AI in PyTorch

In this notebook we will explore the basics of doing classification with PyTorch.

We first demonstrate how easy PyTorch makes GPU computation: if a GPU is available, moving a matrix multiplication onto it takes very little code.

In [None]:
import os
os.chdir("..")

from datetime import datetime
import numpy as np
import torch

Initialize a random square matrix of size `N × N`, with `N = 10000`.

Multiply the matrix by itself and time the result (can use `datetime`).

Now move the matrix to GPU and perform the matrix multiplication there. Time the result again.
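One possible way to approach the two timing exercises above is sketched here. The helper name `time_matmul` and the reduced size `n = 1000` are assumptions made for illustration (the exercise itself asks for `N = 10000`). The sketch calls `torch.cuda.synchronize()` around the GPU product because CUDA kernels launch asynchronously, so without it the clock may stop before the multiplication has finished.

```python
from datetime import datetime

import torch


def time_matmul(n, device):
    """Multiply a random n x n matrix by itself on `device`.

    Returns the product and the elapsed time in seconds.
    `time_matmul` is a hypothetical helper, not part of the notebook.
    """
    a = torch.rand(n, n, device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # make sure setup work is done before timing
    start = datetime.now()
    product = a @ a
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the asynchronous kernel to finish
    elapsed = (datetime.now() - start).total_seconds()
    return product, elapsed


n = 1000  # assumption: smaller than the notebook's N = 10000 so this runs quickly
_, cpu_seconds = time_matmul(n, torch.device("cpu"))
print(f"CPU: {cpu_seconds:.4f} s")

if torch.cuda.is_available():
    _, gpu_seconds = time_matmul(n, torch.device("cuda"))
    print(f"GPU: {gpu_seconds:.4f} s")
else:
    print("No GPU available; skipping the GPU timing.")
```

The first GPU call typically includes one-off CUDA initialization, so timing a second call may give a fairer comparison.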

We now turn to deep learning, starting with a classification problem. The dataset is a set of spiral arms, each belonging to a different class. There are 3 classes, 2 input features, and 1000 samples per class.

In [None]:
import math
import random

import torch.nn.functional as F
from IPython import display
from matplotlib import pyplot as plt
from torch import nn, optim


def plot_scatter(X, y, d=0, auto=False, zoom=1):
    X = X.cpu()
    y = y.cpu()
    plt.scatter(X.numpy()[:, 0], X.numpy()[:, 1], c=y, s=20, cmap=plt.cm.Spectral)
    plt.axis('square')
    plt.axis(np.array((-1.1, 1.1, -1.1, 1.1)) * zoom)
    if auto:
        plt.axis('equal')
    # plt.axis('off')

    # _m, _c = 0, '.15'
    # plt.axvline(0, ymin=_m, color=_c, lw=1, zorder=0)
    # plt.axhline(0, xmin=_m, color=_c, lw=1, zorder=0)

def plot_model(X, y, model):
    model.cpu()  # note: this moves the model to the CPU in place
    mesh = np.arange(-1.1, 1.1, 0.01)
    xx, yy = np.meshgrid(mesh, mesh)
    # Evaluate the model on every grid point and keep the highest-scoring class.
    with torch.no_grad():
        data = torch.from_numpy(np.vstack((xx.reshape(-1), yy.reshape(-1))).T).float()
        Z = model(data).argmax(dim=1).numpy().reshape(xx.shape)
    plt.contourf(xx, yy, Z, cmap=plt.cm.Spectral, alpha=0.3)
    plot_scatter(X, y)

plt.rc('figure', figsize=(5,5), dpi=100)

In [None]:
seed = 12345
random.seed(seed)
torch.manual_seed(seed)
N = 1000  # num_samples_per_class
D = 2  # dimensions
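C = 3  # num_classes

# Later cells refer to `device`; use the GPU when one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")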

Create and visualise the data.

In [None]:
X_cpu = torch.zeros(N * C, D)
y_cpu = torch.zeros(N * C, dtype=torch.long)
for c in range(C):
    index = 0
    t = torch.linspace(0, 1, N)
    inner_var = torch.linspace((2 * math.pi / C) * c,
                               (2 * math.pi / C) * (2 + c), N) + torch.randn(N) * 0.2
    
    for ix in range(N * c, N * (c + 1)):
        X_cpu[ix] = t[index] * torch.FloatTensor((
            math.sin(inner_var[index]), math.cos(inner_var[index])
        ))
        y_cpu[ix] = c
        index += 1

print(f"Shapes: X = {tuple(X_cpu.size())}, Y = {tuple(y_cpu.size())} ")
plot_scatter(X_cpu, y_cpu)

Create a `Dataset` and a `DataLoader` over the data `X_cpu` and `y_cpu`.

In [None]:
from torch.utils.data import DataLoader, TensorDataset

# TODO:
dataset = None 
train_loader = None 
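As a hint for the exercise above, here is a minimal sketch of how `TensorDataset` and `DataLoader` fit together. It uses stand-in tensors (`X_demo`, `y_demo`) with the same shapes as `X_cpu` and `y_cpu`, so it runs on its own. The names and the batch size of 64 are assumptions, not values prescribed by the notebook.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in tensors shaped like X_cpu (3000 x 2) and y_cpu (3000,).
X_demo = torch.rand(3000, 2)
y_demo = torch.randint(0, 3, (3000,))

# TensorDataset pairs each row of X_demo with the matching label in y_demo.
demo_dataset = TensorDataset(X_demo, y_demo)

# DataLoader batches the pairs; shuffle=True reorders them every epoch.
# batch_size=64 is an arbitrary choice for this sketch.
demo_loader = DataLoader(demo_dataset, batch_size=64, shuffle=True)

xb, yb = next(iter(demo_loader))
print(xb.shape, yb.shape)
```

Shuffling matters for this dataset because the samples are generated class by class; without it, each batch would mostly contain a single class.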

Implement the function `train()` below, which encapsulates the training procedure for one epoch (a single pass over the entire training dataset).

In [None]:
def train(epoch, model, criteria, train_loader, optimizer, device):
    model.train()
    
    # TODO:
    # main loop to train the NN over batches;
    # it must define `loss`, which is reported below

    if epoch % 10 == 0:
        print(f'Train Epoch: {epoch} \tLoss: {loss.item():.6f}')
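The usual shape of such an epoch loop is sketched below under a different name, `train_one_epoch`, so that it does not replace the exercise. It is one reasonable implementation rather than the intended solution. Unlike the template above, which prints the loss of the last batch, this sketch reports the mean loss over the epoch. The toy data and model at the end are made up purely to exercise the function.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(epoch, model, criteria, train_loader, optimizer, device):
    """Run one pass over train_loader and return the mean loss per sample."""
    model.train()
    total_loss = 0.0
    for inputs, targets in train_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()                    # clear gradients from the last step
        loss = criteria(model(inputs), targets)  # forward pass
        loss.backward()                          # backpropagate
        optimizer.step()                         # update the parameters
        total_loss += loss.item() * inputs.size(0)
    mean_loss = total_loss / len(train_loader.dataset)
    if epoch % 10 == 0:
        print(f"Train Epoch: {epoch} \tLoss: {mean_loss:.6f}")
    return mean_loss


# Toy usage on a linearly separable problem (illustrative data only).
torch.manual_seed(0)
toy_X = torch.randn(256, 2)
toy_y = (toy_X[:, 0] > 0).long()
toy_loader = DataLoader(TensorDataset(toy_X, toy_y), batch_size=32, shuffle=True)
toy_model = nn.Linear(2, 2)
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=1e-2)
cpu = torch.device("cpu")
losses = [
    train_one_epoch(e, toy_model, nn.CrossEntropyLoss(), toy_loader, toy_optimizer, cpu)
    for e in range(20)
]
```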

To see why neural networks are powerful, we will first train a completely linear model: a network with no nonlinear activations between its layers.

In [None]:
H = 100

class LinearNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(D, H),
            nn.Linear(H, C),
        )

    def forward(self, x):
        output = self.layers(x)
        return output

model = LinearNetwork().to(device)

learning_rate = 1e-3
decay_factor = 1e-5
criteria = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=decay_factor)

for epoch in range(200):
    train(epoch, model, criteria, train_loader, optimizer, device)
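The network above stacks two `nn.Linear` layers with nothing between them. The following sketch checks numerically that such a stack computes the same function as a single affine map with `W = W2 W1` and `b = W2 b1 + b2`. The layer sizes mirror `D = 2`, `H = 100`, `C = 3`; the variable names are illustrative only.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Two linear layers with no activation in between, as in LinearNetwork.
stacked = nn.Sequential(nn.Linear(2, 100), nn.Linear(100, 3))
first, second = stacked[0], stacked[1]

with torch.no_grad():
    # Compose the two affine maps by hand: W = W2 W1, b = W2 b1 + b2.
    W = second.weight @ first.weight
    b = second.weight @ first.bias + second.bias

    x = torch.randn(5, 2)
    stacked_out = stacked(x)
    collapsed_out = x @ W.T + b

# Reports whether the outputs agree up to floating-point rounding.
print(torch.allclose(stacked_out, collapsed_out, atol=1e-5))
```

Because the whole model reduces to one affine map, adding hidden units does not by itself let it bend its decision boundaries.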

How does the loss change? We can plot the model's region of classification for each class by running the cell below. What sort of classification boundaries are produced?

In [None]:
plot_model(X_cpu, y_cpu, model)

Now implement a neural network with nonlinear activations (e.g. ReLU) and see how the model performs.

In [None]:
class MultiLayerPerceptron(nn.Module):
    # TODO
    pass

model = MultiLayerPerceptron().to(device)

learning_rate = 1e-3
decay_factor = 1e-5
criteria = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate, weight_decay=decay_factor)

for epoch in range(200):
    train(epoch, model, criteria, train_loader, optimizer, device)
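For reference, a minimal sketch of a network with one hidden layer and a ReLU activation is shown here. The class name `SketchMLP` and its constructor arguments are hypothetical, and the single hidden layer of 100 units is an assumption chosen to match `H`; deeper or wider variants are equally valid answers to the exercise.

```python
import torch
from torch import nn


class SketchMLP(nn.Module):
    """Hypothetical example: Linear -> ReLU -> Linear."""

    def __init__(self, in_features=2, hidden=100, num_classes=3):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),  # the nonlinearity that the linear model lacks
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        # Return raw logits; nn.CrossEntropyLoss applies log-softmax itself.
        return self.layers(x)


sketch = SketchMLP()
logits = sketch(torch.rand(4, 2))
print(logits.shape)
```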

In [None]:
print(model)
plot_model(X_cpu, y_cpu, model)