# PyTorch: modules

In the previous notebook, we laboriously trained a neural network written by hand. PyTorch provides many modules that make defining a network, loading the data, and training much easier.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# Neural network module

The `torch.nn` module makes it easy to define complicated networks. It implements basic building blocks, such as linear and convolutional layers, that can be combined freely. To do so, we define a subclass of `nn.Module`.

This class must define two methods:

* `__init__` specifies the different layers of the network
* `forward` defines how to go from the input to the output using the layers defined in `__init__`.

For instance, this is how you would define a simple neural network with one hidden layer of size $100$ that takes $10$ inputs and returns $3$ outputs:

In [None]:
class Two_layers(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 100)
        self.layer2 = nn.Linear(100, 3)

    def forward(self, x):
        y = self.layer1(x)
        y = torch.tanh(y)
        y = self.layer2(y)
        return y 
    
net = Two_layers()

In [None]:
x = torch.randn(1000, 10)
net(x)
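
As a quick sanity check, the sketch below rebuilds the same network and inspects the shape of its output and its number of learnable parameters. The parameter count is plain arithmetic on the layer sizes used above; the variable names `out`, `n_params` and `expected` are introduced here for illustration.

```python
import torch
import torch.nn as nn


class Two_layers(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 100)
        self.layer2 = nn.Linear(100, 3)

    def forward(self, x):
        return self.layer2(torch.tanh(self.layer1(x)))


net = Two_layers()
x = torch.randn(1000, 10)   # a batch of 1000 samples with 10 features each
out = net(x)
print(out.shape)            # one row of 3 outputs per sample

# Each nn.Linear(n_in, n_out) holds an (n_out, n_in) weight matrix and a bias of size n_out.
n_params = sum(p.numel() for p in net.parameters())
expected = (10 * 100 + 100) + (100 * 3 + 3)
print(n_params, expected)
```

The first dimension of the input is always treated as the batch dimension, so the same network accepts batches of any size.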

# Optim module

The `torch.optim` module then makes it easy to learn the parameters of the network with backpropagation. An optimizer, such as SGD, takes the parameters of the network as input.

In [None]:
import torch.optim as optim

optimizer = optim.SGD(net.parameters(), lr=1e-2)

We can then compute a (toy) loss, backpropagate, and use the `optimizer.step` method to make an update:

In [None]:
optimizer.zero_grad()
y = net(x)
loss = y.sum()
loss.backward()
optimizer.step()
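
The single update above can be repeated in a loop to actually fit something. The following is a minimal sketch, assuming a synthetic regression problem and a mean squared error loss, neither of which comes from this notebook; `nn.Sequential` is used only as shorthand for the same two-layer architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)

net = nn.Sequential(nn.Linear(10, 100), nn.Tanh(), nn.Linear(100, 3))
optimizer = optim.SGD(net.parameters(), lr=1e-2)

# Synthetic regression data (an assumption, for illustration only).
x = torch.randn(1000, 10)
target = x @ torch.randn(10, 3)

losses = []
for step in range(50):
    optimizer.zero_grad()               # clear gradients left over from the previous step
    loss = F.mse_loss(net(x), target)   # forward pass and loss
    loss.backward()                     # compute gradients by backpropagation
    optimizer.step()                    # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Calling `optimizer.zero_grad()` at every iteration matters because PyTorch accumulates gradients across calls to `backward()`.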

And that is it!

# Datasets
There are also some convenient tools to load datasets. For instance, we will work with MNIST:

In [None]:
from torchvision import datasets, transforms

Let's download the MNIST training and test sets, and apply a simple normalizing transform (the values `0.1307` and `0.3081` are the mean and standard deviation of the MNIST training pixels).

In [None]:
transform = transforms.Compose([
    transforms.ToTensor(),                       # PIL image -> float tensor in [0, 1]
    transforms.Normalize((0.1307,), (0.3081,)),  # subtract the mean, divide by the std
])
dataset1 = datasets.MNIST('../data', train=True, download=True,
                          transform=transform)   # training set
dataset2 = datasets.MNIST('../data', train=False,
                          transform=transform)   # test set
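
To make the normalization concrete, here is a small sketch of the arithmetic that `transforms.Normalize` applies to each channel, written in plain `torch` so that it runs without downloading anything. The tiny `img` tensor is made up for illustration.

```python
import torch

mean, std = 0.1307, 0.3081

# A fake 1x2x2 "image" with pixel values already scaled to [0, 1],
# as ToTensor would produce (the values are made up for illustration).
img = torch.tensor([[[0.0, 1.0],
                     [0.1307, 0.5]]])

normalized = (img - mean) / std   # same arithmetic as Normalize((mean,), (std,))
print(normalized)
```

A pixel equal to the mean maps to zero, black pixels become negative, and white pixels become positive.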

We can then define loaders that automatically serve the images in batches when we train or test:

In [None]:
batch_size = 128

train_loader = torch.utils.data.DataLoader(dataset1, batch_size=batch_size)
test_loader = torch.utils.data.DataLoader(dataset2, batch_size=batch_size)
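
To see how a loader cuts a dataset into batches, the sketch below uses a small synthetic `TensorDataset` instead of MNIST. The names `toy_dataset` and `toy_loader` are hypothetical, and `shuffle=True` is shown only as an option that is commonly used for training data; the loaders defined above do not shuffle.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 300 fake samples with 10 features and one integer label each (made-up data).
features = torch.randn(300, 10)
labels = torch.randint(0, 3, (300,))
toy_dataset = TensorDataset(features, labels)

toy_loader = DataLoader(toy_dataset, batch_size=128, shuffle=True)

batch_sizes = [x.shape[0] for x, y in toy_loader]
print(batch_sizes)   # the last batch is smaller: 300 = 128 + 128 + 44
```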

We can then loop over these loaders to get the images:

In [None]:
import matplotlib.pyplot as plt

for x, y in train_loader:
    print(x.shape)
    print(y.shape)
    plt.imshow(x[0, 0])
    break
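
Each MNIST batch has shape `(batch_size, 1, 28, 28)`, while a linear layer expects one flat vector of features per sample. A possible way to bridge the two, sketched here on a random stand-in batch rather than on real images, is to flatten everything but the batch dimension. The 10-output layer is an arbitrary placeholder.

```python
import torch
import torch.nn as nn

# Stand-in for one MNIST batch: 128 grayscale images of 28x28 pixels.
x = torch.randn(128, 1, 28, 28)

flat = x.view(x.shape[0], -1)     # keep the batch dimension, flatten the rest
print(flat.shape)

layer = nn.Linear(28 * 28, 10)    # a linear layer on 784 input features
print(layer(flat).shape)
```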

We are now ready to train a network on MNIST!

# Training on MNIST

**Exercise 1**: Define a neural network for MNIST with one hidden layer of size $1000$.

In [None]:
class Two_layers(nn.Module):
    def __init__(self):
        super().__init__()
        # Your code here
    def forward(self, x):
        # Your code here
    
net = Two_layers()

In [None]:
for x, y in train_loader:
    print(net(x))
    break

**Exercise 2**: Define an SGD optimizer for this network.

In [None]:
optimizer = 

**Exercise 3**: Perform one training epoch on the data, and compute the test accuracy. Use the cross-entropy as the loss.

In [None]:
from torch.nn.functional import cross_entropy
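
Note that `cross_entropy` takes the raw outputs of the network (logits) together with integer class labels, and applies the log-softmax itself. The sketch below checks this against a manual computation on made-up logits and labels.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)            # raw network outputs: 4 samples, 10 classes
labels = torch.tensor([3, 0, 9, 1])    # integer class indices (made-up values)

loss = F.cross_entropy(logits, labels)

# Equivalent manual computation: mean negative log-probability of the true class.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), labels].mean()

print(loss.item(), manual.item())
```

In particular, the network should not end with a softmax layer when it is trained with this loss.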

In [None]:
for i, (x, y) in enumerate(train_loader):
    # Your code here

In [None]:
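good_preds = 0
with torch.no_grad():  # no gradients are needed for evaluation
    for x, y in test_loader:
        pred = net(x)
        good_preds += (pred.argmax(dim=1) == y).sum().item()

print(good_preds / len(dataset2))  # the MNIST test set has 10000 images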

You should get about 90% accuracy.
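
The accuracy computation can be checked by hand on a tiny example. The scores and labels below are made up: the predicted class of each sample is the index of its largest score, and the accuracy is the fraction of predictions that match the labels.

```python
import torch

# Made-up network outputs for 4 samples and 3 classes.
pred = torch.tensor([[2.0, 1.0, 0.0],
                     [0.0, 3.0, 1.0],
                     [1.0, 0.0, 5.0],
                     [4.0, 0.0, 0.0]])
y = torch.tensor([0, 1, 1, 0])

predicted_class = pred.argmax(dim=1)              # index of the largest score per row
good_preds = (predicted_class == y).sum().item()  # number of correct predictions
accuracy = good_preds / len(y)
print(predicted_class, accuracy)
```

Here the third sample is misclassified, so three predictions out of four are correct.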

**Exercise 4**: Now, train a convolutional network, and try to get the best accuracy.
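
When stacking convolutions, it helps to track how each layer changes the shape of a batch. The sketch below is only a shape check on a random stand-in batch, not a suggested architecture: the number of filters, the kernel size and the pooling size are arbitrary choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(128, 1, 28, 28)          # stand-in for one MNIST batch

conv = nn.Conv2d(1, 8, kernel_size=3)    # 1 input channel, 8 filters (arbitrary choice)
h = F.relu(conv(x))                      # 28 - 3 + 1 = 26 -> (128, 8, 26, 26)
h = F.max_pool2d(h, 2)                   # halves height and width -> (128, 8, 13, 13)
flat = h.flatten(start_dim=1)            # (128, 8 * 13 * 13), ready for a linear layer
print(h.shape, flat.shape)
```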

In [None]:
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Your code here

    def forward(self, x):
        # Your code here
    
net = CNN()

for x, y in train_loader:
    print(net(x))
    break

In [None]:
for i, (x, y) in enumerate(train_loader):
    # Your code here

In [None]:
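good_preds = 0
with torch.no_grad():  # no gradients are needed for evaluation
    for x, y in test_loader:
        pred = net(x)
        good_preds += (pred.argmax(dim=1) == y).sum().item()

print(good_preds / len(dataset2))  # the MNIST test set has 10000 images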

**Exercise 5**: Put this model on a GPU, and compare the training time.
To put a model on the GPU, you can call `net.to('cuda')`. The data `x` and `y` should also be put on the GPU.
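
A common pattern, sketched below with a placeholder model, is to pick the device once and fall back to the CPU when no GPU is available, so that the same code runs everywhere. The `device` variable and the small linear model are assumptions made for illustration.

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no GPU is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

net = nn.Linear(10, 3).to(device)       # placeholder model, moved to the device
x = torch.randn(128, 10).to(device)     # inputs must live on the same device
out = net(x)
print(out.device)
```

When timing GPU code, keep in mind that CUDA operations run asynchronously; calling `torch.cuda.synchronize()` before reading the clock gives more reliable measurements.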

In [None]:
# Your code here.