# PyTorch 1.5/1.6 Discovery

This is a first demo run with PyTorch. The goal of this notebook is to explore the basics of PyTorch: the layers available in the `torch.nn` package, and how to assemble them into a simple neural network.

## Creating the first Neural Network in PyTorch

Define the EasyNet: two fully connected layers, which in PyTorch terms are `Linear` layers.

In [None]:
%%writefile model_def.py
import torch
import torch.nn as nn

class EasyNet(nn.Module):

    def __init__(self):
        super(EasyNet, self).__init__()
        # A very simple NN with two fully connected ("dense") layers.
        # It receives 13 input values (the Boston Housing dataset is used for testing),
        # reduces them to 6, then outputs a single value: this is a regression problem.
        self.hidden_1 = nn.Linear(13, 6)
        self.hidden_2 = nn.Linear(6, 1)

    def forward(self, x):
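        # Forward pass: apply the layers one after the other ("sequentially").
        # `tanh` and `sigmoid` are the activation functions applied to each layer's output.
        # Note: `sigmoid` bounds the prediction to (0, 1), so the targets would need
        # to be scaled to that range for the network to be able to fit them.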
        x = torch.tanh(self.hidden_1(x))
        x = torch.sigmoid(self.hidden_2(x))
        return x

Test: does it generate a prediction?

In [None]:
from model_def import EasyNet
import torch

# Instantiate the EasyNet
model = EasyNet()
# Create an input
i = torch.randn(13)
# Generate the output
out = model(i)
print(out)
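
As a quick sanity check on the architecture, the sketch below counts the trainable parameters of two `Linear` layers with the same sizes as EasyNet. The standalone `hidden_1` and `hidden_2` variables are stand-ins created for illustration; they are not the layers of the `model` instance above.

```python
import torch.nn as nn

# Stand-ins for the two layers of EasyNet (same sizes as in model_def.py)
hidden_1 = nn.Linear(13, 6)
hidden_2 = nn.Linear(6, 1)

for name, layer in [("hidden_1", hidden_1), ("hidden_2", hidden_2)]:
    # weight has shape (out_features, in_features); bias has shape (out_features,)
    print(name, tuple(layer.weight.shape), tuple(layer.bias.shape))

n_params = sum(p.numel() for layer in (hidden_1, hidden_2) for p in layer.parameters())
print("trainable parameters:", n_params)  # 13*6 + 6 + 6*1 + 1 = 91
```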

Let's define the training process. A training step is composed of:

- a forward pass - just call `model` on the input
- an estimation of the error - use a loss function such as `MSE`
- a backward propagation of the error to compute the gradients - PyTorch provides a `backward()` function which makes it easy
- an update of the weights of the NN via an optimizer - PyTorch provides `torch.optim`
  - SGD could also be done manually via `weight = weight - learning_rate * gradient`
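
To make the steps listed above concrete, here is a minimal sketch on a single scalar weight, small enough to check by hand. The values of `w`, `x` and `target` are arbitrary choices for illustration.

```python
import torch

w = torch.tensor(2.0, requires_grad=True)  # a single trainable weight
x, target = torch.tensor(3.0), torch.tensor(1.0)

loss = (w * x - target) ** 2   # forward pass + squared error
loss.backward()                # fills w.grad with d(loss)/dw

# Analytically: d/dw (w*x - target)^2 = 2 * (w*x - target) * x = 2 * 5 * 3 = 30
print(w.grad)  # tensor(30.)

learning_rate = 0.01
with torch.no_grad():
    w_new = w - learning_rate * w.grad  # one manual SGD step: 2.0 - 0.01 * 30 = 1.7
print(w_new)
```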

In [None]:
# Loss definition. In this case, MSE (example)
target = torch.randn(1) # random target value
print('Target: '+str(target))
criterion = torch.nn.MSELoss() # Use MSE as Loss
loss = criterion(out, target) # Compute the MSE between output and target

print('Loss: '+str(loss))

In [None]:
# Backward propagation of the gradients
model.zero_grad()     # zeroes the gradient buffers of all parameters
loss.backward()     # backpropagation

Note: an example of a manual update of the weights via SGD:

```python
learning_rate = 0.01
with torch.no_grad():  # the update itself must not be tracked by autograd
    for p in model.parameters():
        p -= learning_rate * p.grad
```
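
The following sketch checks that a manual update of this kind and a step of `optim.SGD` (without momentum or weight decay) lead to the same weights. It uses a single `nn.Linear(13, 1)` layer and random data as stand-ins, not EasyNet or the housing data.

```python
import copy
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
learning_rate = 0.01

model_a = nn.Linear(13, 1)          # updated manually
model_b = copy.deepcopy(model_a)    # updated by optim.SGD, same initial weights
x, target = torch.randn(4, 13), torch.randn(4, 1)
criterion = nn.MSELoss()

# Manual update: weight = weight - learning_rate * gradient
criterion(model_a(x), target).backward()
with torch.no_grad():
    for p in model_a.parameters():
        p -= learning_rate * p.grad

# Optimizer update
optimizer = optim.SGD(model_b.parameters(), lr=learning_rate)
optimizer.zero_grad()
criterion(model_b(x), target).backward()
optimizer.step()

same = all(torch.allclose(pa, pb)
           for pa, pb in zip(model_a.parameters(), model_b.parameters()))
print("same weights after one step:", same)
```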

In [None]:
# Update the weights via an optimizer
import torch.optim as optim
optimizer = optim.SGD(model.parameters(), lr=0.01)
optimizer.step()

The training loop will be:

```python
# In the training loop:
optimizer.zero_grad()               # zero the gradient buffers
output = model(i)                   # forward pass
loss = criterion(output, target)    # computation of the loss
loss.backward()                     # Backpropagation of the loss
optimizer.step()                    # Weights update
```
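
As a self-contained illustration of this loop, the sketch below runs it for 100 steps on synthetic data. The data (`X`, `Y`, `w_true`) and the single-layer model are assumptions made for the example; the real dataset is loaded further down.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic regression data (an assumption for illustration): Y = X @ w_true
X = torch.randn(64, 13)
w_true = torch.randn(13, 1)
Y = X @ w_true

model = nn.Linear(13, 1)            # stand-in model, not EasyNet
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

losses = []
for step in range(100):
    optimizer.zero_grad()            # zero the gradient buffers
    output = model(X)                # forward pass
    loss = criterion(output, Y)      # computation of the loss
    loss.backward()                  # backpropagation of the loss
    optimizer.step()                 # weights update
    losses.append(loss.item())

print("first loss: {:.4f}, last loss: {:.4f}".format(losses[0], losses[-1]))
```

The recorded losses should decrease over the steps, since the target is an exact linear function of the input and the learning rate is small.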

### Training script

The training script should:

1. Load a batch of data - via `torch.utils.data.DataLoader`
2. Predict on the batch of data - via `model`
3. Compute the loss between the predicted and true values - via `torch.nn.MSELoss()`
4. Clear the gradients accumulated from the previous step - via `optimizer.zero_grad()`
5. Backpropagate the loss - via `loss.backward()`
6. Update the weights - via `optimizer.step()`
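
For the first step of the list above, this small sketch shows what a `DataLoader` yields. The 25 fake samples are an assumption chosen so that the last batch is incomplete.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

features = torch.arange(25 * 13, dtype=torch.float).view(25, 13)  # 25 fake samples, 13 features
targets = torch.arange(25, dtype=torch.float).view(-1, 1)         # one target per sample

loader = DataLoader(TensorDataset(features, targets), batch_size=10, shuffle=True)

batch_sizes = []
for x_batch, y_batch in loader:
    batch_sizes.append(x_batch.shape[0])
    print(tuple(x_batch.shape), tuple(y_batch.shape))

print(batch_sizes)  # [10, 10, 5]: the last batch holds the remainder
```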

In [None]:
# Let's try loading data from the scikit-learn Boston Housing dataset
# Note: `load_boston` was removed in scikit-learn 1.2, so this cell needs an older version
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

data = load_boston()
x, y = data['data'], data['target']
x_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.2, random_state=42)

In [None]:
# Store them for later use
import os, numpy as np
local_dir = './'

np.save(os.path.join(local_dir, 'x_train.npy'), x_train)
np.save(os.path.join(local_dir, 'x_test.npy'), x_test)
np.save(os.path.join(local_dir, 'y_train.npy'), y_train)
np.save(os.path.join(local_dir, 'y_test.npy'), y_test)

In [None]:
x_train = torch.tensor(x_train, dtype=torch.float)
x_test = torch.tensor(x_test, dtype=torch.float)
y_train = torch.tensor(y_train, dtype=torch.float).view(-1, 1)
y_test = torch.tensor(y_test, dtype=torch.float).view(-1, 1)
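
The `.view(-1, 1)` on the targets matters because the model outputs a column of shape `(N, 1)`. The sketch below, on three hand-picked values, shows what can happen when the target is left flat: the loss is computed over a broadcast grid instead of element by element.

```python
import warnings
import torch

criterion = torch.nn.MSELoss()
prediction = torch.tensor([[1.0], [2.0], [3.0]])   # shape (3, 1), like the model output
flat_target = torch.tensor([1.0, 2.0, 3.0])        # shape (3,), like the raw NumPy target

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # PyTorch warns about the size mismatch
    broadcast_loss = criterion(prediction, flat_target).item()

column_loss = criterion(prediction, flat_target.view(-1, 1)).item()

print(broadcast_loss)  # mean over a broadcast (3, 3) grid, so not 0
print(column_loss)     # 0.0: prediction and target match element-wise
```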

In [None]:
from torch.utils.data import TensorDataset, DataLoader

# Let’s construct a Dataset of tensors.
datasets = TensorDataset(x_train, y_train)
# Then, create a DataLoader from this Dataset
train_iter = DataLoader(datasets, batch_size=10, shuffle=True)

In [None]:
num_epochs = 5
for epoch in range(num_epochs):
    for x, y in train_iter:
        output = model(x)
        loss = criterion(output, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("epoch {} loss: {:.4f}".format(epoch + 1, loss.item()))


In [None]:
# let’s check its performance on the testing dataset
print(criterion(model(x_test), y_test).item())
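
A possible variant of this evaluation, sketched under the assumption of a stand-in model and random test data, disables gradient tracking and also reports the RMSE, which is in the same units as the target.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(13, 1)                                   # stand-in for the trained model
x_test, y_test = torch.randn(20, 13), torch.randn(20, 1)   # stand-in test split
criterion = nn.MSELoss()

model.eval()                 # switch layers such as dropout/batch-norm to inference mode
with torch.no_grad():        # no autograd graph is built during evaluation
    mse = criterion(model(x_test), y_test)

rmse = torch.sqrt(mse).item()  # RMSE is the square root of the MSE
print("MSE: {:.4f}, RMSE: {:.4f}".format(mse.item(), rmse))
```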

Now that we have successfully run a PyTorch training, let's write a training script:

In [None]:
%%writefile train.py

import argparse, torch, numpy as np, os
from torch.utils.data import TensorDataset, DataLoader
from torch.nn import MSELoss
from torch.optim import SGD
from model_def import EasyNet

def parse_args():
    parser = argparse.ArgumentParser()
    # hyperparameters sent by the client are passed as command-line arguments to the script
    parser.add_argument('--epochs', type=int, default=1)
    parser.add_argument('--batch_size', type=int, default=64)
    parser.add_argument('--learning_rate', type=float, default=0.1)
    # dataset info - we will use numpy
    parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
    parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))
    # model directory: useful later for SageMaker
    parser.add_argument('--model_dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
    # return the parsed arguments
    return parser.parse_known_args()


def get_train_data(train_dir):
    x_train = np.load(os.path.join(train_dir, 'x_train.npy'))
    y_train = np.load(os.path.join(train_dir, 'y_train.npy'))
    return x_train, y_train


def get_test_data(test_dir):
    x_test = np.load(os.path.join(test_dir, 'x_test.npy'))
    y_test = np.load(os.path.join(test_dir, 'y_test.npy'))
    return x_test, y_test
    


if __name__ == "__main__":
    # Parameters and hyper-parameters
    args, _ = parse_args()
    batch_size = args.batch_size
    epochs = args.epochs
    learning_rate = args.learning_rate
    print('batch_size = {}, epochs = {}, learning rate = {}'.format(batch_size, epochs, learning_rate))
    
    # Load the dataset
    x_train, y_train = get_train_data(args.train)
    x_test, y_test = get_test_data(args.test)
    # Convert it to torch tensors
    x_train = torch.tensor(x_train, dtype=torch.float)
    x_test = torch.tensor(x_test, dtype=torch.float)
    y_train = torch.tensor(y_train, dtype=torch.float).view(-1, 1)
    y_test = torch.tensor(y_test, dtype=torch.float).view(-1, 1)
    # Create the TensorDataset and the DataLoader
    datasets = TensorDataset(x_train, y_train)
    train_iter = DataLoader(datasets, batch_size=batch_size, shuffle=True)
    
    # Setup NN, Loss, and SGD
    model = EasyNet()
    criterion = MSELoss()
    optimizer = SGD(model.parameters(), lr=learning_rate)
    
    # Training loop
    for epoch in range(epochs):
        for x, y in train_iter:
            output = model(x)
            loss = criterion(output, y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        print("epoch {} loss: {:.4f}".format(epoch + 1, loss.item()))
        
    # Evaluate on test dataset
    print('MSE: '+str(criterion(model(x_test), y_test).item()))
    
    # Save the model
    model_path = os.path.join(args.model_dir, 'model.pt')
    torch.save(model.state_dict(), model_path)
    print('Model stored at: '+model_path)
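
The script unpacks `parse_args()` as `args, _` because `parse_known_args()` returns a pair: the parsed namespace and the list of arguments it did not recognise. A minimal sketch, where `--some_extra_flag` is a made-up argument used only for illustration:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--epochs', type=int, default=1)
parser.add_argument('--learning_rate', type=float, default=0.1)

# '--some_extra_flag' is a made-up argument the parser does not know about
args, unknown = parser.parse_known_args(['--epochs', '5', '--some_extra_flag', 'value'])

print(args.epochs)         # 5
print(args.learning_rate)  # 0.1 (default)
print(unknown)             # ['--some_extra_flag', 'value']
```

With `parse_args()` instead, the unknown flag would make the script exit with an error.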
    

Let's test the new training script (locally)!

In [None]:
!python train.py \
    --epochs 5 \
    --batch_size 128 \
    --learning_rate 0.01 \
    --train . \
    --test . \
    --model_dir .
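
Since the script saves only the `state_dict`, reloading the model requires re-creating the architecture first. The sketch below shows one way to do it; it repeats the EasyNet definition and writes to a temporary directory (both choices made here so the snippet runs on its own), rather than reading the `model.pt` produced above.

```python
import os
import tempfile
import torch
import torch.nn as nn

class EasyNet(nn.Module):
    # Copy of the model in model_def.py, repeated so the snippet is self-contained
    def __init__(self):
        super().__init__()
        self.hidden_1 = nn.Linear(13, 6)
        self.hidden_2 = nn.Linear(6, 1)

    def forward(self, x):
        x = torch.tanh(self.hidden_1(x))
        return torch.sigmoid(self.hidden_2(x))

trained = EasyNet()
model_path = os.path.join(tempfile.mkdtemp(), 'model.pt')  # temporary directory, for illustration
torch.save(trained.state_dict(), model_path)

restored = EasyNet()                               # same architecture, fresh random weights
restored.load_state_dict(torch.load(model_path))   # overwrite them with the saved ones
restored.eval()

sample = torch.randn(1, 13)
with torch.no_grad():
    identical = torch.equal(trained(sample), restored(sample))
print("same prediction after reload:", identical)
```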