<a href="https://colab.research.google.com/github/caocscar/workshops/blob/master/pytorch/Workshop_Regression_Class.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Regression Problem**

In [27]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import numpy as np
import pandas as pd
from torch.utils.data import TensorDataset, DataLoader

print('Torch version', torch.__version__)
print('Pandas version', pd.__version__)
print('Numpy version', np.__version__)

Torch version 1.3.1
Pandas version 0.25.3
Numpy version 1.17.4


The following should say `cuda:0`. If it does not, go to *Edit* -> *Notebook settings* and change the hardware accelerator from `None` to `GPU`. You only have to do this once per notebook.

In [28]:
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
device

'cuda:0'

Read in the training and validation datasets.

In [0]:
df_train = pd.read_csv('https://raw.githubusercontent.com/greght/Workshop-Keras-DNN/master/ChallengeProblems/dataRegression_train.csv', header=None)
df_val = pd.read_csv('https://raw.githubusercontent.com/greght/Workshop-Keras-DNN/master/ChallengeProblems/dataRegression_test.csv', header=None)

Split each DataFrame into our x variables (the first two columns) and our y variable (the third column), for both the training and validation datasets.

In [0]:
x_train = df_train.iloc[:,0:2]
y_train = df_train.iloc[:,2]
x_val = df_val.iloc[:,0:2]
y_val = df_val.iloc[:,2]

Preprocess our data to go from a `pandas` DataFrame to a `numpy` array to a `torch` tensor.

In [0]:
# Inputs and targets are data, not learnable parameters, so they do not need gradients.
x_train_tensor = torch.tensor(x_train.to_numpy(), device=device, dtype=torch.float)
y_train_tensor = torch.tensor(y_train.to_numpy(), device=device, dtype=torch.float)
x_val_tensor = torch.tensor(x_val.to_numpy(), device=device, dtype=torch.float)
y_val_tensor = torch.tensor(y_val.to_numpy(), device=device, dtype=torch.float)
y_train_tensor = y_train_tensor.view(-1,1)
y_val_tensor = y_val_tensor.view(-1,1)
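The `view(-1,1)` calls matter because a final `nn.Linear(..., 1)` layer produces predictions of shape `(N, 1)`, while the raw targets have shape `(N,)`. The sketch below, using made-up values (`pred`, `target_flat`, and `target_col` are illustrative names, not part of the workshop data), shows how mismatched shapes broadcast to an `(N, N)` grid and give a different mean squared error.

```python
import torch

# Illustrative values only: predictions shaped (N, 1), like the output of a Linear(..., 1) layer.
pred = torch.tensor([[1.0], [2.0], [3.0]])
target_flat = torch.tensor([1.0, 2.0, 3.0])   # shape (3,)
target_col = target_flat.view(-1, 1)          # shape (3, 1)

# Matching shapes: element-wise difference, shape (3, 1).
diff_ok = pred - target_col
# Mismatched shapes: (3, 1) - (3,) broadcasts to shape (3, 3).
diff_bad = pred - target_flat

mse_ok = (diff_ok ** 2).mean().item()
mse_bad = (diff_bad ** 2).mean().item()

print(diff_ok.shape, mse_ok)
print(diff_bad.shape, mse_bad)
```

Even though the predictions are perfect in this example, the broadcast version reports a nonzero error, which is why the targets are reshaped up front.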

We'll write a Python class to define our neural network.

In [0]:
class ThreeLayerNN(nn.Module):
    def __init__(self, dim_input, H):
        super().__init__()
        self.fc1 = nn.Linear(dim_input, H)
        self.fc2 = nn.Linear(H,H)
        self.fc3 = nn.Linear(H,1)
    
    def forward(self, x):
        x1 = F.relu(self.fc1(x))
        x2 = F.relu(self.fc2(x1))
        y_pred = self.fc3(x2)
        return y_pred
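To make the `forward` method concrete, here is a hand-rolled sketch of the same computation for a single input, in plain Python. The weights and biases are made-up values (not the model's), `linear` and `relu` are hypothetical helper names, and the hidden size is 2 instead of `H` to keep the arithmetic short.

```python
def linear(W, b, x):
    """Compute W @ x + b for one input vector x (W is a list of rows)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    """Element-wise max(0, a)."""
    return [max(0.0, a) for a in v]

# Made-up parameters for a 2 -> 2 -> 2 -> 1 network
W1, b1 = [[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]
W2, b2 = [[1.0, 1.0], [-1.0, 2.0]], [0.5, 0.0]
W3, b3 = [[2.0, -1.0]], [0.1]

x = [2.0, 1.0]
x1 = relu(linear(W1, b1, x))    # first hidden layer
x2 = relu(linear(W2, b2, x1))   # second hidden layer
y_pred = linear(W3, b3, x2)     # output layer, no activation

print(x1, x2, y_pred)
```

The last layer has no activation, so the output can take any real value, which is what a regression target needs.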

We create an instance of this class.

In [33]:
model = ThreeLayerNN(x_train_tensor.shape[1],5).to(device)
print(model)

ThreeLayerNN(
  (fc1): Linear(in_features=2, out_features=5, bias=True)
  (fc2): Linear(in_features=5, out_features=5, bias=True)
  (fc3): Linear(in_features=5, out_features=1, bias=True)
)


`model.parameters()` yields the **weight** and **bias** tensors (alternating) for each of the 3 layers.



In [34]:
params = list(model.parameters())
print(f'There are {len(params)} parameters')
for param in params:
    print(param)

There are 6 parameters
Parameter containing:
tensor([[ 0.1991, -0.3374],
        [ 0.6217,  0.2454],
        [ 0.2356, -0.4803],
        [ 0.2875, -0.2138],
        [ 0.5510, -0.4722]], device='cuda:0', requires_grad=True)
Parameter containing:
tensor([-0.2620, -0.4441,  0.2302, -0.4542, -0.0018], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([[ 0.2439,  0.3755,  0.2190, -0.3126,  0.3004],
        [-0.4456,  0.1666,  0.1028, -0.2943, -0.1549],
        [ 0.1558,  0.1287, -0.4446,  0.1832, -0.0817],
        [-0.1445, -0.1680, -0.2676,  0.0634,  0.1835],
        [ 0.2595, -0.4457,  0.1425,  0.0747,  0.2135]], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([-0.2887,  0.3904, -0.1645, -0.2864,  0.2194], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([[ 0.1612, -0.0190,  0.4062, -0.1642,  0.4348]], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([0.0838], device='cuda:0', requires_grad=True)
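As a quick sanity check on the printout above, each `nn.Linear(n_in, n_out)` layer holds an `n_out x n_in` weight matrix and an `n_out` bias vector. Here is a small sketch of the arithmetic for this 2 -> 5 -> 5 -> 1 network (`layer_sizes` is just an illustrative helper list):

```python
# (in_features, out_features) for fc1, fc2, fc3, as shown by print(model)
layer_sizes = [(2, 5), (5, 5), (5, 1)]

total = 0
for n_in, n_out in layer_sizes:
    n_weights = n_in * n_out   # weight matrix has shape (n_out, n_in)
    n_biases = n_out           # one bias per output unit
    print(f'Linear({n_in}, {n_out}): {n_weights} weights + {n_biases} biases')
    total += n_weights + n_biases

print('Total trainable scalars:', total)
```

With the real model, `sum(p.numel() for p in model.parameters())` should give the same total.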

We'll define a `fit_model` function that builds and returns `train` and `validate` functions for a given model, loss function, and optimizer.

---



In [0]:
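def fit_model(model, loss_fn, optimizer):
    def train(x,y):
        model.train()              # put the model in training mode
        yhat = model(x)            # forward pass
        loss = loss_fn(yhat,y)
        optimizer.zero_grad()      # clear gradients left over from the previous step
        loss.backward()            # backpropagate
        optimizer.step()           # update the parameters
        return loss.item()
    
    def validate(x,y):
        model.eval()               # put the model in evaluation mode
        with torch.no_grad():      # no gradients are needed for evaluation
            yhat = model(x)
            loss = loss_fn(yhat,y)
        return loss.item()
    
    return train, validate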

We define our *loss function*, *learning rate*, and our *optimizer*. We pass this to `fit_model` to return our `train` and `validate` functions.


In [0]:
loss_fn = nn.MSELoss(reduction='mean') #default
learning_rate = 0.1
optimizer = optim.Adagrad(model.parameters(), lr=learning_rate)
train, validate = fit_model(model, loss_fn, optimizer)
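For intuition, the sketch below computes a mean squared error by hand and then applies a single Adagrad-style update to one scalar parameter. All numbers are made up, and the update rule is a simplified form of Adagrad (accumulated squared gradients, with no learning-rate decay or weight decay); `eps` is assumed to be a small constant for numerical stability.

```python
import math

# --- MSE with reduction='mean': the average of the squared differences ---
yhat = [2.5, 0.0, 2.0]   # made-up predictions
y = [3.0, -0.5, 2.0]     # made-up targets
mse = sum((p - t) ** 2 for p, t in zip(yhat, y)) / len(y)
print('MSE:', mse)

# --- One simplified Adagrad step for a single scalar parameter ---
lr = 0.1
eps = 1e-10          # assumed small constant to avoid division by zero
w = 0.5              # made-up parameter value
grad = -2.0          # made-up gradient of the loss with respect to w
grad_sq_sum = 0.0    # running sum of squared gradients

grad_sq_sum += grad ** 2
w_new = w - lr * grad / (math.sqrt(grad_sq_sum) + eps)
print('w after one step:', w_new)
```

On the first step the accumulated term equals the gradient's magnitude, so the parameter moves by roughly `lr` regardless of the gradient's scale. Later steps shrink as the sum of squared gradients grows.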

## Mini-batches
From the documentation: `torch.nn` only supports mini-batches. The entire `torch.nn` package only supports inputs that are a mini-batch of samples, and not a single sample.
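In practice, the usual way to pass a single sample through a model is to give it a leading batch dimension first. Here is a minimal sketch, using a standalone `nn.Linear` layer and a made-up two-feature sample rather than the workshop's model:

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 1)                 # standalone layer, for illustration only
sample = torch.tensor([0.5, -1.2])      # one made-up sample, shape (2,)

batch_of_one = sample.unsqueeze(0)      # add a batch dimension -> shape (1, 2)
out = layer(batch_of_one)               # output keeps the batch dimension -> shape (1, 1)

print(batch_of_one.shape, out.shape)
```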

In [0]:
train_data = TensorDataset(x_train_tensor, y_train_tensor)
train_loader = DataLoader(dataset=train_data, batch_size=10, shuffle=True)
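The number of mini-batches per epoch follows from the dataset size and `batch_size`. Here is a short sketch of the arithmetic, using a hypothetical dataset size `n_samples = 95` (the actual size of the training set is not shown here):

```python
import math

n_samples = 95      # hypothetical; substitute len(train_data)
batch_size = 10

# With the DataLoader default drop_last=False, the final partial batch is kept.
n_batches = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (n_batches - 1) * batch_size

print(n_batches, last_batch_size)
```

`len(train_loader)` reports the same batch count for the real loader.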

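Here is our training loop with mini-batch processing. Each mini-batch is moved to `device` before the training step. Because our tensors were already created on `device`, these `.to(device)` calls are no-ops here, but they are required whenever the dataset lives on the CPU.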

In [38]:
epochs = 100
for epoch in range(epochs):
    # training
    losses = []
    for i, (xbatch, ybatch) in enumerate(train_loader):
        xbatch = xbatch.to(device)
        ybatch = ybatch.to(device)
        loss = train(xbatch, ybatch)
        losses.append(loss)
    training_loss = np.mean(losses)
    # validation
    validation_loss = validate(x_val_tensor, y_val_tensor)
    # print intermediate results
    if epoch%10 == 9:
        print(epoch, training_loss, validation_loss)

9 4.1129890788685195 5.250312328338623
19 3.794783592224121 4.710597515106201
29 3.6706963669170034 4.282569885253906
39 3.54070782661438 4.075663089752197
49 3.459462035786022 3.831831455230713
59 3.435289729725231 3.7195568084716797
69 3.3426577394658867 3.662083864212036
79 3.3177800286899912 3.5623490810394287
89 3.288137435913086 3.5723767280578613
99 3.268090009689331 3.445207357406616
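One caveat about the training column: `np.mean(losses)` averages the per-batch means, which equals the per-sample mean only when every batch has the same size. If the last batch is smaller, a size-weighted average is slightly more accurate. Here is a sketch with made-up batch losses and sizes:

```python
# Made-up per-batch mean losses and batch sizes (the last batch is smaller)
batch_losses = [4.0, 3.0, 2.0]
batch_sizes = [10, 10, 5]

# Unweighted: every batch counts equally, whatever its size
simple_mean = sum(batch_losses) / len(batch_losses)

# Weighted: each batch contributes in proportion to its number of samples
weighted_mean = sum(l * n for l, n in zip(batch_losses, batch_sizes)) / sum(batch_sizes)

print(simple_mean, weighted_mean)
```

In the training loop, this would mean recording `len(xbatch)` alongside each loss. The difference is usually small when only one batch is short.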


We can view the current state of our model using the `state_dict` method, which returns an ordered dictionary mapping each parameter name to its tensor.

In [39]:
model.state_dict()

OrderedDict([('fc1.weight', tensor([[ 0.1991, -0.3374],
                      [ 1.7515, -0.0551],
                      [ 0.6901, -0.0575],
                      [ 0.2875, -0.2138],
                      [ 1.0115, -0.0454]], device='cuda:0')),
             ('fc1.bias',
              tensor([-0.2620, -0.8242,  0.5889, -0.4542,  0.3503], device='cuda:0')),
             ('fc2.weight',
              tensor([[ 0.2439,  2.5966,  0.6142, -0.3126,  0.6872],
                      [-0.4456,  2.2888,  0.2973, -0.2943,  0.0274],
                      [ 0.1558,  0.1287, -0.4446,  0.1832, -0.0817],
                      [-0.1445, -0.1680, -0.2676,  0.0634,  0.1835],
                      [ 0.2595,  1.5328,  0.5067,  0.0747,  0.5695]], device='cuda:0')),
             ('fc2.bias',
              tensor([ 0.0871,  0.5670, -0.1645, -0.2864,  0.5354], device='cuda:0')),
             ('fc3.weight',
              tensor([[ 1.0504,  0.7688,  0.4062, -0.1642,  0.9954]], device='cuda:0')),
             ('fc3.b