<a href="https://colab.research.google.com/github/caocscar/workshops/blob/master/pytorch/Workshop_Regression_Class.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Regression Problem**

In [2]:
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader
import numpy as np
import pandas as pd

print('Torch version', torch.__version__)
print('Pandas version', pd.__version__)
print('Numpy version', np.__version__)

Torch version 1.3.1
Pandas version 0.25.3
Numpy version 1.17.4


The following cell should print `cuda:0`. If it does not, go to *Edit* -> *Notebook settings* and change the hardware accelerator from `None` to `GPU`. You only have to do this once per notebook.

In [3]:
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
device

'cuda:0'

Read in dataset

In [0]:
df_train = pd.read_csv('https://raw.githubusercontent.com/greght/Workshop-Keras-DNN/master/ChallengeProblems/dataRegression_train.csv', header=None)
df_val = pd.read_csv('https://raw.githubusercontent.com/greght/Workshop-Keras-DNN/master/ChallengeProblems/dataRegression_test.csv', header=None)

Split each DataFrame into features `x` (the first two columns) and target `y` (the third column) for both the training and validation sets.

In [0]:
x_train = df_train.iloc[:,0:2]
y_train = df_train.iloc[:,2]
x_val = df_val.iloc[:,0:2]
y_val = df_val.iloc[:,2]

Preprocess our data to go from a `pandas` DataFrame to a `numpy` array to a `torch` tensor.

In [0]:
# Data tensors do not need requires_grad; only the model parameters are trained.
x_train_tensor = torch.tensor(x_train.to_numpy(), device=device, dtype=torch.float)
y_train_tensor = torch.tensor(y_train.to_numpy(), device=device, dtype=torch.float)
x_val_tensor = torch.tensor(x_val.to_numpy(), device=device, dtype=torch.float)
y_val_tensor = torch.tensor(y_val.to_numpy(), device=device, dtype=torch.float)
# Reshape targets from (N,) to (N,1) so they match the model output shape.
y_train_tensor = y_train_tensor.view(-1,1)
y_val_tensor = y_val_tensor.view(-1,1)
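
The `view(-1,1)` step matters because the network's last layer produces one column per sample. As a minimal sketch with made-up values (the tensors below are stand-ins, not the workshop data), this shows what broadcasting would do if the targets were left one-dimensional:

```python
import torch
import torch.nn.functional as F

# Synthetic stand-ins (assumption): 4 predictions of shape (4, 1), as a final
# nn.Linear(H, 1) layer would produce, and 4 targets of shape (4,).
yhat = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y_flat = torch.tensor([1.0, 2.0, 3.0, 4.0])

# Without reshaping, subtraction broadcasts (4, 1) against (4,) into (4, 4),
# so every prediction is compared with every target.
diff_bad = yhat - y_flat
print(diff_bad.shape)

# After view(-1, 1) the shapes match and the difference is elementwise.
y_col = y_flat.view(-1, 1)
diff_good = yhat - y_col
print(diff_good.shape)

# With matching shapes the loss is the intended per-sample mean squared error.
loss_good = F.mse_loss(yhat, y_col)
print(loss_good.item())
```

Here the predictions equal the targets, so the correctly shaped loss is zero, while the broadcast difference contains non-zero off-diagonal entries.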

We'll write a Python class to define our neural network.

In [0]:
class ThreeLayerNN(nn.Module):
    def __init__(self, dim_input, H):
        super().__init__()
        self.fc1 = nn.Linear(dim_input, H)
        self.fc2 = nn.Linear(H,H)
        self.fc3 = nn.Linear(H,1)
    
    def forward(self, x):
        x1 = F.relu(self.fc1(x))
        x2 = F.relu(self.fc2(x1))
        y_pred = self.fc3(x2)
        return y_pred

We create an instance of this class.

In [8]:
model = ThreeLayerNN(x_train_tensor.shape[1],5).to(device)
print(model)

ThreeLayerNN(
  (fc1): Linear(in_features=2, out_features=5, bias=True)
  (fc2): Linear(in_features=5, out_features=5, bias=True)
  (fc3): Linear(in_features=5, out_features=1, bias=True)
)


`model.parameters()` yields the **weight** and **bias** tensors of each of the 3 layers, alternating weight then bias.



In [9]:
params = list(model.parameters())
print(f'There are {len(params)} parameters')
for param in params:
    print(param)

There are 6 parameters
Parameter containing:
tensor([[-0.6722, -0.1253],
        [ 0.3271, -0.5386],
        [-0.4360, -0.6635],
        [-0.0597,  0.2654],
        [-0.4511, -0.1803]], device='cuda:0', requires_grad=True)
Parameter containing:
tensor([ 0.4774,  0.0608,  0.3351,  0.6132, -0.1335], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([[-0.4279,  0.0746, -0.2874, -0.4331,  0.0757],
        [-0.1138, -0.2704,  0.0156,  0.3182,  0.1802],
        [ 0.1589, -0.3853,  0.0769,  0.0236,  0.2774],
        [ 0.4160,  0.0268,  0.0658,  0.0249,  0.0023],
        [-0.1503,  0.1482, -0.0260,  0.2199,  0.2633]], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([ 0.1400,  0.2608,  0.2217, -0.2910,  0.0465], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([[ 0.1069,  0.0756, -0.3563,  0.3523, -0.4246]], device='cuda:0',
       requires_grad=True)
Parameter containing:
tensor([-0.3458], device='cuda:0', requires_grad=True)
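
The six tensors printed above hold 51 trainable values in total. A short sketch of how one might count them, rebuilding the same layer sizes in a stand-alone `nn.ModuleDict` (the container is an illustrative assumption, not part of the workshop code):

```python
import torch.nn as nn

# Same layer sizes as the printed model: 2 inputs, hidden width 5, 1 output.
layers = nn.ModuleDict({
    'fc1': nn.Linear(2, 5),
    'fc2': nn.Linear(5, 5),
    'fc3': nn.Linear(5, 1),
})

# numel() returns the number of scalar entries in each parameter tensor.
counts = {name: p.numel() for name, p in layers.named_parameters()}
for name, n in counts.items():
    print(name, n)

total = sum(counts.values())
print('total', total)  # 10 + 5 + 25 + 5 + 5 + 1 = 51
```

Each `nn.Linear(in, out)` contributes an `out x in` weight matrix and a length-`out` bias vector, which is why the shapes alternate between 2-D and 1-D.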

We'll define a `fit_model` function that builds and returns a `train` function and a `validate` function for a given model, loss function, and optimizer.

---



In [0]:
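def fit_model(model, loss_fn, optimizer):
    # One optimization step on a mini-batch; returns the batch loss as a float.
    def train(x, y):
        yhat = model(x)
        loss = loss_fn(yhat, y)
        optimizer.zero_grad()  # clear gradients left over from the previous step
        loss.backward()        # backpropagate through the network
        optimizer.step()       # update the parameters
        return loss.item()

    # Loss on held-out data; no gradients are needed, so skip graph building.
    def validate(x, y):
        with torch.no_grad():
            yhat = model(x)
            loss = loss_fn(yhat, y)
        return loss.item()

    return train, validate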

We define our *loss function*, *learning rate*, and *optimizer*, then pass them to `fit_model` to get back our `train` and `validate` functions.


In [0]:
loss_fn = nn.MSELoss(reduction='mean') #default
learning_rate = 0.1
optimizer = optim.Adagrad(model.parameters(), lr=learning_rate)
train, validate = fit_model(model, loss_fn, optimizer)
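
To make the `reduction` argument concrete, here is a small worked example on made-up numbers (the tensors are illustrative, not taken from the dataset). With squared errors of 0, 1, and 4, `'mean'` gives 5/3, `'sum'` gives 5, and `'none'` returns the individual terms:

```python
import torch
import torch.nn as nn

# Illustrative predictions and targets; squared errors are 0, 1 and 4.
yhat = torch.tensor([[1.0], [2.0], [3.0]])
y = torch.tensor([[1.0], [1.0], [1.0]])

mean_loss = nn.MSELoss(reduction='mean')(yhat, y)  # average of squared errors
sum_loss = nn.MSELoss(reduction='sum')(yhat, y)    # total of squared errors
none_loss = nn.MSELoss(reduction='none')(yhat, y)  # one value per element

print(mean_loss.item())
print(sum_loss.item())
print(none_loss.flatten().tolist())
```

Because `'mean'` divides by the number of elements, the loss scale does not depend on the batch size, which keeps a fixed learning rate comparable across different batch sizes.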

## Mini-batches
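From the documentation: `torch.nn` only supports mini-batches. The entire `torch.nn` package only supports inputs that are a mini-batch of samples, and not a single sample. We wrap our training tensors in a `TensorDataset` and let a `DataLoader` serve them in shuffled mini-batches of 10 samples.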

In [0]:
train_data = TensorDataset(x_train_tensor, y_train_tensor)
train_loader = DataLoader(dataset=train_data, batch_size=10, shuffle=True)
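
As a rough illustration of what the loader yields, the sketch below uses a synthetic dataset of 25 samples (an assumption chosen so that the last batch is incomplete) and records the size of each batch:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Synthetic data (assumption): 25 samples with 2 features and 1 target each.
x = torch.arange(50, dtype=torch.float).view(25, 2)
y = torch.arange(25, dtype=torch.float).view(-1, 1)

loader = DataLoader(TensorDataset(x, y), batch_size=10, shuffle=True)

# Each iteration yields one (features, targets) pair of tensors.
batch_sizes = [xb.shape[0] for xb, yb in loader]
print(len(loader), batch_sizes)
```

With the default `drop_last=False`, the final batch holds the 5 leftover samples. Shuffling changes which samples land in each batch on every epoch, not the batch sizes.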

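Here is our training loop with mini-batch processing. Each mini-batch is moved to `device` before the training step. Because our tensors were already created on `device`, these `.to(device)` calls are no-ops here, but they are required when the dataset lives in CPU memory.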

In [13]:
epochs = 100
for epoch in range(epochs):
    # training
    losses = []
    for xbatch, ybatch in train_loader:
        xbatch = xbatch.to(device)
        ybatch = ybatch.to(device)
        loss = train(xbatch, ybatch)
        losses.append(loss)
    training_loss = np.mean(losses)
    # validation
    validation_loss = validate(x_val_tensor, y_val_tensor)
    # print intermediate results
    if epoch%10 == 9:
        print(epoch, training_loss, validation_loss)

9 5.217282251878218 8.100061416625977
19 4.6458352262323555 6.509875774383545
29 4.617666352878917 6.0749030113220215
39 4.465590021827004 5.876566410064697
49 4.46304219419306 5.840087413787842
59 4.436497558246959 5.683042049407959
69 4.447906385768544 5.73892068862915
79 4.456741766496138 5.724264144897461
89 4.4289374351501465 5.7146830558776855
99 4.434686617417769 5.704777717590332
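
Note that `training_loss` is a plain average of per-batch means, so a smaller final batch counts as much as a full one. If the training set size is not a multiple of the batch size, a sample-weighted average is one way to recover the exact per-sample mean. A sketch with hypothetical batch losses and sizes:

```python
import numpy as np

# Hypothetical per-batch mean losses and the number of samples in each batch.
batch_losses = [4.0, 4.0, 10.0]
batch_sizes = [10, 10, 5]

# Plain mean treats every batch equally, regardless of its size.
unweighted = np.mean(batch_losses)

# Weighting by batch size gives the mean over all 25 samples:
# (4*10 + 4*10 + 10*5) / 25 = 5.2
weighted = np.average(batch_losses, weights=batch_sizes)

print(unweighted, weighted)
```

When all batches have the same size the two averages coincide, so the difference only matters for the incomplete last batch.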


We can view the current state of our model using the `state_dict` method, which returns an ordered dictionary mapping each parameter name to its tensor.

In [14]:
model.state_dict()

OrderedDict([('fc1.weight', tensor([[-0.9870, -0.4540],
                      [ 2.0965, -0.3272],
                      [-0.4208, -0.8602],
                      [ 1.4232,  0.2407],
                      [-0.4511, -0.1803]], device='cuda:0')),
             ('fc1.bias',
              tensor([ 0.0582,  0.2425,  0.0584,  0.6218, -0.1335], device='cuda:0')),
             ('fc2.weight',
              tensor([[-0.2153,  1.3850, -0.1548,  0.3375,  0.0757],
                      [ 0.1091,  1.0617,  0.1496,  1.1005,  0.1802],
                      [ 0.0043, -0.5234, -0.0231, -0.1097,  0.2774],
                      [ 0.4160,  0.0268,  0.0658,  0.0249,  0.0023],
                      [-0.2503, -0.0960, -0.1260,  0.0717,  0.2633]], device='cuda:0')),
             ('fc2.bias',
              tensor([ 0.3495,  0.5070,  0.0802, -0.2910, -0.1100], device='cuda:0')),
             ('fc3.weight',
              tensor([[ 1.0817,  0.8173, -0.2413,  0.3523, -0.3157]], device='cuda:0')),
             ('fc3.b