<div class="alert alert-block alert-info">
<b>Number of points for this notebook:</b> 1
<br>
<b>Deadline:</b> March 2, 2020 (Monday). 23:00
</div>

# Exercise 1.2. Multilayer perceptron

The goal of this exercise is to get familiar with the basics of PyTorch and train a multilayer perceptron (MLP) model.

If you are not familiar with PyTorch, there are a number of good tutorials [here](https://pytorch.org/tutorials/index.html). We recommend the following ones:
* [What is PyTorch?](https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py)
* [Autograd: Automatic Differentiation](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html#sphx-glr-beginner-blitz-autograd-tutorial-py)
* [Learning PyTorch with Examples](https://pytorch.org/tutorials/beginner/pytorch_with_examples.html)
* [Neural Networks](https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py)

In [1]:
skip_training = True  # Set this flag to True before validation and submission

In [2]:
# During evaluation, this cell sets skip_training to True
# skip_training = True

In [3]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

import torch
import torch.nn as nn
import torch.nn.functional as F

import tools
import data

In [4]:
# When running on your own computer, you can specify the data directory by:
# data_dir = tools.select_data_dir('/your/local/data/directory')
data_dir = tools.select_data_dir()

The data directory is /coursedata


In [5]:
# Select device which you are going to use for training
#device = torch.device("cuda:0")
device = torch.device("cpu")

In [6]:
if skip_training:
    # The models are always evaluated on CPU
    device = torch.device("cpu")

# Data

We will use the same *winequality* dataset as in the logistic regression notebook.

In [7]:
trainset = data.WineQuality(data_dir, train=True, normalize=False)
train_inputs, train_targets = trainset.tensors
print(train_inputs.shape, train_targets.shape)

testset = data.WineQuality(data_dir, train=False, normalize=False)
test_inputs, test_targets = testset.tensors

torch.Size([5197, 11]) torch.Size([5197])


In [8]:
# Convert to a binary classification problem
train_targets = (train_targets >= 7).float().view(-1, 1)  
test_targets = (test_targets >= 7).float().view(-1, 1)  
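
To make the conversion concrete, here is a minimal sketch on a made-up tensor of quality scores (the values are illustrative and not taken from the dataset). The `view(-1, 1)` call reshapes the targets into a column so that they match the `(n_samples, 1)` shape of the model outputs.

```python
import torch

# Hypothetical quality scores, for illustration only
quality = torch.tensor([5., 7., 6., 8., 3.])

# 1.0 for scores of 7 or higher, 0.0 otherwise, stored as a column vector
binary = (quality >= 7).float().view(-1, 1)
print(binary.shape)      # torch.Size([5, 1])
print(binary.flatten())  # tensor([0., 1., 0., 1., 0.])
```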

In [9]:
# Normalize inputs to zero mean and unit variance
mean = train_inputs.mean(dim=0)
std = train_inputs.std(dim=0)
scaler = lambda x: (x - mean.to(x.device)) / std.to(x.device)

train_inputs = scaler(train_inputs)
test_inputs = scaler(test_inputs)
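
The cell above computes the mean and standard deviation on the training set only and reuses them for the test set. The sketch below illustrates the effect on synthetic data (the `toy_*` tensors are hypothetical stand-ins, not the wine data): the training features end up standardized up to floating-point error, while the test features are only approximately standardized.

```python
import torch

# Local random generator so the global random state is left untouched
g = torch.Generator().manual_seed(0)

# Synthetic stand-ins for the real inputs (hypothetical data)
toy_train = torch.randn(1000, 11, generator=g) * 3.0 + 5.0
toy_test = torch.randn(200, 11, generator=g) * 3.0 + 5.0

# Statistics come from the training set only
toy_mean = toy_train.mean(dim=0)
toy_std = toy_train.std(dim=0)

toy_train_scaled = (toy_train - toy_mean) / toy_std
toy_test_scaled = (toy_test - toy_mean) / toy_std  # training statistics reused

print(toy_train_scaled.mean(dim=0).abs().max())  # close to 0
print(toy_test_scaled.mean(dim=0).abs().max())   # small, but not exactly 0
```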

# Multilayer perceptron (MLP) network with two hidden layers

We will create a simple multilayer perceptron (MLP) network. The model has
- input dimensionality 11
- one hidden layer with 200 units with ReLU nonlinearity
- one hidden layer with 100 units with ReLU nonlinearity
- linear output layer with output dimensionality 1 and sigmoid nonlinearity.

Hints:
* You may want to look at [this tutorial](https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py) for reference.
* You can use the [`nn.Linear`](https://pytorch.org/docs/stable/nn.html?highlight=nn%20linear#torch.nn.Linear) module to define the fully-connected layers of the MLP.
* Simple architectures are usually created using module [`torch.nn.Sequential`](https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential). You do not have to use this module in this exercise.
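
For reference, the architecture listed above could be written with `torch.nn.Sequential` roughly as follows. This is only a sketch of the alternative mentioned in the hints, not the required solution; the name `sequential_mlp` is chosen here for illustration.

```python
import torch
import torch.nn as nn

# Same layer sizes as described above: 11 -> 200 -> 100 -> 1
sequential_mlp = nn.Sequential(
    nn.Linear(11, 200),
    nn.ReLU(),
    nn.Linear(200, 100),
    nn.ReLU(),
    nn.Linear(100, 1),
    nn.Sigmoid(),  # squashes the output into (0, 1)
)

y = sequential_mlp(torch.randn(10, 11))
print(y.shape)  # torch.Size([10, 1])
```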

In [12]:
class MLP(nn.Module):
    def __init__(self, n_inputs=11):
        # YOUR CODE HERE
        
        super(MLP, self).__init__()
        self.w1 = nn.Linear(n_inputs, 200, bias=True)
        self.w2 = nn.Linear(200, 100, bias=True)
        self.w3 = nn.Linear(100, 1, bias=True)
        
#         raise NotImplementedError()

    def forward(self, x):
        """
        Args:
          x of shape (n_samples, n_inputs): Model inputs.
        
        Returns:
          y of shape (n_samples, 1): Model outputs.
        """
        # YOUR CODE HERE
        
        out = torch.relu(self.w1(x))
        out = torch.relu(self.w2(out))
        out = torch.sigmoid(self.w3(out))
        return out
        
#         raise NotImplementedError()

In [13]:
# Let us create the network and make sure it can process a random input of the right shape
def test_MLP_shapes():
    n_inputs = 11
    n_samples = 10
    net = MLP()
    y = net(torch.randn(n_samples, n_inputs))
    assert y.shape == torch.Size([n_samples, 1]), f"Bad y.shape={y.shape}"
    print('Success')

test_MLP_shapes()

Success


## Train the MLP network

### Training loop

Your task is to implement the training loop.
Your training loop should have the same steps as in the logistic regression notebook.

Recommended hyperparameters:
* [Adam optimizer](https://pytorch.org/docs/stable/optim.html#torch.optim.Adam) with learning rate 0.01.
* You can process the data in the full-batch mode (computing the gradients using all training data).
* Number of iterations (parameter updates): 2000.

Hints:
- We recommend printing the classification accuracy during training. You can compute the accuracy using the function `compute_accuracy`.
- The accuracy on the training set should be close to 1.0 (the model overfits to the training data).
The test accuracy should be above 0.84.
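
The steps of a full-batch training loop (reset the gradients, run the forward pass, compute the loss, backpropagate, update the parameters) can be sketched on a toy problem. Everything prefixed with `toy_` below is hypothetical: the data is synthetic and the small network is not the MLP required in this exercise.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

g = torch.Generator().manual_seed(0)

# Hypothetical toy problem: the label is 1 when the first feature is positive
toy_x = torch.randn(256, 11, generator=g)
toy_y = (toy_x[:, :1] > 0).float()  # shape (256, 1), same as the model output

toy_model = nn.Sequential(nn.Linear(11, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.01)

toy_losses = []
for it in range(200):
    toy_optimizer.zero_grad()                          # reset accumulated gradients
    toy_out = toy_model(toy_x)                         # forward pass on all data
    toy_loss = F.binary_cross_entropy(toy_out, toy_y)  # loss on sigmoid outputs
    toy_loss.backward()                                # backpropagation
    toy_optimizer.step()                               # parameter update
    toy_losses.append(toy_loss.item())

print(f"first loss: {toy_losses[0]:.3f}, last loss: {toy_losses[-1]:.3f}")
```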

In [14]:
# Compute the accuracy of the model on the given dataset
def compute_accuracy(model, inputs, targets):
    with torch.no_grad():
        inputs, targets = inputs.to(device), targets.to(device)
        outputs = (model(inputs) > 0.5).float()
        accuracy = (outputs == targets).sum().float() / targets.numel()
        return accuracy
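
As a small worked example of the computation inside `compute_accuracy`, consider four made-up predictions (the numbers are illustrative only). The last two lines also show why the targets were reshaped to `(n_samples, 1)` earlier: comparing a column of predictions with a flat vector of targets broadcasts to a square matrix instead of an element-wise comparison.

```python
import torch

# Hypothetical sigmoid outputs and targets, for illustration only
probs = torch.tensor([[0.9], [0.2], [0.6], [0.4]])
labels = torch.tensor([[1.], [0.], [0.], [0.]])

preds = (probs > 0.5).float()  # thresholded predictions: 1, 0, 1, 0
toy_accuracy = (preds == labels).sum().float() / labels.numel()
print(toy_accuracy.item())  # 0.75 (three of four predictions are correct)

# Shape mismatch pitfall: (4, 1) compared with (4,) broadcasts to (4, 4)
mismatch = preds == labels.flatten()
print(mismatch.shape)  # torch.Size([4, 4])
```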

In [15]:
# Create the model
model = MLP()
model.to(device)

MLP(
  (w1): Linear(in_features=11, out_features=200, bias=True)
  (w2): Linear(in_features=200, out_features=100, bias=True)
  (w3): Linear(in_features=100, out_features=1, bias=True)
)

In [17]:
# Implement the training loop here
if not skip_training:
    # YOUR CODE HERE
    
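    # Binary cross-entropy on the sigmoid outputs, optimized with Adam
    criterion = nn.functional.binary_cross_entropy
    optim = torch.optim.Adam(model.parameters(), lr=0.01)
    # Move the data to the same device as the model
    inputs, targets = train_inputs.to(device), train_targets.to(device)
    for epoch in range(1000):  # full batch: one parameter update per iteration
        optim.zero_grad()
        out = model(inputs)
        loss = criterion(out, targets)
        if epoch % 100 == 0:
            print("epoch: {}, loss: {}, acc: {}".format(epoch+1, loss.item(), compute_accuracy(model, train_inputs, train_targets)))
        loss.backward()
        optim.step()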
    
#     raise NotImplementedError()

epoch: 1, loss: 0.6723265647888184, acc: 0.7436982989311218
epoch: 101, loss: 0.2455398142337799, acc: 0.8901289105415344
epoch: 201, loss: 0.20512372255325317, acc: 0.9020588994026184
epoch: 301, loss: 0.09938793629407883, acc: 0.9590148329734802
epoch: 401, loss: 0.04690491408109665, acc: 0.9894169569015503
epoch: 501, loss: 0.03797384351491928, acc: 0.9924956560134888
epoch: 601, loss: 0.023734984919428825, acc: 0.9976909756660461
epoch: 701, loss: 0.01579287089407444, acc: 0.9994227290153503
epoch: 801, loss: 0.010910815559327602, acc: 0.9996151328086853
epoch: 901, loss: 0.007824904285371304, acc: 0.999807596206665


In [18]:
# Save the model to disk (the pth-files will be submitted automatically together with your notebook)
if not skip_training:
    tools.save_model(model, '1_mlp.pth')
else:
    model = MLP()
    tools.load_model(model, '1_mlp.pth', device)

Do you want to save the model (type yes to confirm)? yes
Model saved to 1_mlp.pth.
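
The `tools.save_model` and `tools.load_model` helpers are specific to this course and their implementation is not shown here. A common PyTorch pattern, which these helpers may or may not follow, is to save the model's `state_dict` and load it into a freshly created model of the same architecture. The sketch below uses a small stand-in model and a temporary file; all names in it are chosen for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# A small stand-in model (hypothetical, not the MLP from this exercise)
small_net = nn.Linear(11, 1)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example.pth')
    # Save only the parameters, not the whole Python object
    torch.save(small_net.state_dict(), path)

    # Re-create the architecture, then load the parameters onto the CPU
    restored_net = nn.Linear(11, 1)
    restored_net.load_state_dict(torch.load(path, map_location=torch.device('cpu')))
    restored_net.eval()

same = torch.equal(small_net.weight, restored_net.weight)
print(same)  # True
```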


In [19]:
accuracy = compute_accuracy(model, test_inputs, test_targets)
print('Accuracy on test set:', accuracy.item())
assert accuracy >= 0.84, 'MLP classifier has poor accuracy.'
print('Success')

Accuracy on test set: 0.8584615588188171
Success


In [None]:
# This cell tests MLP