# Notebook A – PyTorch crash course

Learn tensors and train a small MLP on MNIST.

For a longer PyTorch introduction, see the official tutorial "Deep Learning with PyTorch: A 60 Minute Blitz": https://docs.pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

In [None]:
import torch, torchvision
from torchvision import transforms
from torch import nn
from torch.utils.data import DataLoader

In [None]:
# Tensor arithmetic

A = torch.randn(2, 3)
B = torch.randn(2, 3)

print(A.shape, B.shape)

# elementwise operations
print(A + B)
print(A - B)
print(A * B)
print(A / B)

In [None]:
# matrix operations
B_transpose = B.T # transpose
print(B)
print(B_transpose)

C = A @ B_transpose # matrix multiplication
print(C.shape)

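As a minimal sketch of the shape rule behind `@`: the inner dimensions of the two operands must match, which is why `B` is transposed above. The `try`/`except` and the `shapes_ok` flag are illustrative additions, not part of the notebook.

```python
import torch

A = torch.randn(2, 3)
B = torch.randn(2, 3)

# (2, 3) @ (3, 2) -> (2, 2): inner dimensions (3 and 3) match
C = A @ B.T
# the same product written with torch.matmul
C_matmul = torch.matmul(A, B.T)
print(C.shape)

# A @ B fails: the inner dimensions are 3 and 2
try:
    A @ B
    shapes_ok = True
except RuntimeError as err:
    shapes_ok = False
    print("shape mismatch:", err)
```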

In [None]:
# scalar operations
A = torch.ones((2, 2))
print(A + 1)
print(A * 10)

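The scalar operations above are a special case of broadcasting, where PyTorch stretches the smaller operand to match the larger one. The sketch below illustrates this with a row and a column tensor; the names `row` and `col` are chosen here for illustration only.

```python
import torch

A = torch.ones((2, 3))
row = torch.tensor([1., 2., 3.])    # shape (3,)
col = torch.tensor([[10.], [20.]])  # shape (2, 1)

# the scalar is applied to every element
plus_one = A + 1
# shape (3,) is stretched across both rows of A
plus_row = A + row
# shape (2, 1) is stretched across all three columns of A
plus_col = A + col

print(plus_one.shape, plus_row.shape, plus_col.shape)
print(plus_row)
print(plus_col)
```
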
In [None]:
# slicing, indexing
A = torch.randn(4, 2)
print(A[:, 1]) # select second column
print(A[0, :]) # select first row
A[0, :] = torch.tensor([1., 2.]) # assign to first row (modifies A in place)
print(A)

In [None]:
# Gradients

# computing gradients by hand is error-prone; PyTorch's autograd engine does it for us.

# tensors can track their gradients:
x = torch.tensor([2.], requires_grad=True)
y = x ** 2 # y = x^2 -> dy/dx = 2x -> dy/dx evaluated at x=2 should be 4.
y.backward() # compute dy/dx and store it in x.grad (for every input with requires_grad=True)
print(x.grad)

# for more info, see https://docs.pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html

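The same mechanism works for functions of several variables. As a small sketch, the block below compares the autograd result with a gradient derived by hand for f(w) = (w · x)^2; the values of `w` and `x` are arbitrary and chosen only for illustration.

```python
import torch

# hypothetical inputs chosen for illustration
w = torch.tensor([1.0, -2.0, 3.0], requires_grad=True)
x = torch.tensor([0.5, 1.5, -1.0])

# scalar output: f(w) = (w . x)^2
f = (w @ x) ** 2
f.backward()  # fills w.grad with df/dw

# gradient derived by hand: df/dw = 2 * (w . x) * x
with torch.no_grad():
    expected = 2 * (w @ x) * x

print(w.grad)
print(expected)
```

Note that `x` has no `requires_grad=True`, so `x.grad` stays `None`: autograd only tracks the tensors it is asked to.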

In [None]:
# Load MNIST
train_ds = torchvision.datasets.MNIST(root='./data', train=True, download=True,
                                      transform=transforms.ToTensor())
train_loader = DataLoader(train_ds, batch_size=128, shuffle=True)

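To see what a `DataLoader` yields without downloading anything, here is a sketch that uses a synthetic stand-in for MNIST. The `fake_*` names and the sample count of 300 are assumptions made for this example; only the image shape and `batch_size=128` mirror the cell above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# synthetic stand-in for MNIST (assumption: 300 random "images" and labels)
fake_images = torch.rand(300, 1, 28, 28)
fake_labels = torch.randint(0, 10, (300,))
fake_ds = TensorDataset(fake_images, fake_labels)
fake_loader = DataLoader(fake_ds, batch_size=128, shuffle=True)

batch_shapes = []
for images, labels in fake_loader:
    # each iteration yields one batch of images and the matching labels
    batch_shapes.append((tuple(images.shape), tuple(labels.shape)))
    print(images.shape, labels.shape, images.dtype, labels.dtype)

# 300 samples with batch_size=128 -> batches of 128, 128 and 44
print(len(fake_loader))
```

The last batch is smaller because 300 is not a multiple of 128; passing `drop_last=True` to `DataLoader` would discard it instead.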

In [None]:
import matplotlib.pyplot as plt

X, y = train_ds[0]

plt.imshow(X[0])
plt.title(f"Label: {y}")

### Exercises
- What are the types and shapes of X and y? What data do they contain? How do they relate to the image and label you are seeing?

In [None]:

# Define MLP
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(), # flatten each image: (batch, 1, 28, 28) -> (batch, 784)
            nn.Linear(28*28, 128), # hidden layer: 784 inputs -> 128 units
            nn.ReLU(), # activation
            nn.Linear(128, 10) # output layer: one score (logit) per digit class
        )
    def forward(self, x):
        return self.net(x)

model = MLP()
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

for name, parameter in model.named_parameters():
    # tensors containing network weights and biases
    print(name, parameter.shape)

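A quick way to sanity-check a model like this is to push a dummy batch through it and inspect the output. The sketch below rebuilds the same layers as a bare `nn.Sequential` so it runs on its own; `net` and `dummy` are names introduced for this example.

```python
import torch
from torch import nn

# same layers as the MLP above, written as a bare nn.Sequential
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

# dummy batch of 4 random "images" (assumption: MNIST-shaped input)
dummy = torch.rand(4, 1, 28, 28)
logits = net(dummy)                   # raw scores, shape (4, 10)
probs = torch.softmax(logits, dim=1)  # each row sums to 1
preds = logits.argmax(dim=1)          # predicted class per image

# total number of trainable weights and biases
n_params = sum(p.numel() for p in net.parameters())
print(logits.shape, preds.shape, n_params)
```

The model returns raw logits, not probabilities. That is the input `nn.CrossEntropyLoss` expects, since it applies the log-softmax internally.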

In [None]:
from tqdm import tqdm

# One training epoch
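for images, labels in tqdm(train_loader, leave=False):
    optimizer.zero_grad()            # clear gradients left over from the previous step
    outputs = model(images)          # forward pass: logits of shape (batch, 10)
    loss = loss_fn(outputs, labels)  # cross-entropy between logits and true labels
    loss.backward()                  # backward pass: compute gradients of the loss
    optimizer.step()                 # update the parameters using those gradients
print("Finished one epoch")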

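After training, a natural check is classification accuracy. The helper below is one possible sketch, not part of the notebook: `accuracy` is a hypothetical function, and the data is a synthetic stand-in so the block runs without downloading MNIST.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def accuracy(model, loader):
    """Fraction of samples in `loader` whose top-scoring class matches the label."""
    model.eval()                # switch off training-only behaviour (e.g. dropout)
    correct, total = 0, 0
    with torch.no_grad():       # no gradients are needed for evaluation
        for images, labels in loader:
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.numel()
    model.train()               # restore training mode
    return correct / total

# synthetic stand-in data (assumption: 64 random MNIST-shaped samples)
images = torch.rand(64, 1, 28, 28)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16)

net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
acc = accuracy(net, loader)
print(f"accuracy: {acc:.3f}")
```

With real data, one would pass the trained `model` and a `DataLoader` built from `torchvision.datasets.MNIST(..., train=False)`, so the score reflects images the model has not seen during training.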

### Exercises
- Change the number of hidden units from 128 to 256 and rerun.
- Print the loss at each step. Is it decreasing?
- Print the network parameters before and after training. How have they changed?
