<center>
    <tr>
    <td><img src="images/Quansight_Logo_Lockup_1.png" width="25%"></img></td>
    </tr>
</center>

---
# PyTorch Linear Regression With One Hidden Layer

---

## Lesson plan

We will construct a neural network to solve a 1D regression problem.  This neural network will consist of one hidden layer.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pprint as pp
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torch.utils.data import Dataset
import torch.nn.functional as F

## Generating test data

In [None]:
np.random.seed(0)

n_samples = 10
x = np.arange(n_samples)
y = np.sin(2 * np.pi * x / n_samples) * 4

plt.figure(figsize=(4,4))
plt.plot(x, y, 'o')
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-1,10)
plt.ylim(-5,5)

## Data utilities

Deep learning models are data intensive.  In many cases a large fraction of time is spent organizing data to support training deep neural networks.  PyTorch provides the `Dataset` and `DataLoader` classes in its `torch.utils.data` module to construct data loaders appropriate for deep network training.

### Constructing a `Dataset`

In [None]:
class MyDataset(Dataset):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def __len__(self):
        return len(self.x)
    
    def __getitem__(self, idx):
        # Return one sample as a dict of 1-element float32 tensors
        sample = {
            'feature': torch.tensor([self.x[idx]], dtype=torch.float32),
            'label': torch.tensor([self.y[idx]], dtype=torch.float32)}
        return sample

In [None]:
dataset = MyDataset(x, y)
print('length: ', len(dataset))
for i in range(5):
    pp.pprint(dataset[i])
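
For data that already fits in memory as arrays, a hand-written `Dataset` is not strictly required. As a minimal sketch of an alternative (not the approach used in the rest of this lesson), `torch.utils.data.TensorDataset` wraps tensors directly. The names `features`, `labels`, and `tensor_dataset` are introduced here for illustration only.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

# Same toy data as above: 10 points on one period of a sine wave.
n_samples = 10
x = np.arange(n_samples)
y = np.sin(2 * np.pi * x / n_samples) * 4

# Shape (n_samples, 1) so each sample is a length-1 feature/label vector.
features = torch.tensor(x, dtype=torch.float32).unsqueeze(1)
labels = torch.tensor(y, dtype=torch.float32).unsqueeze(1)

tensor_dataset = TensorDataset(features, labels)

# Samples come back as (feature, label) tuples rather than dictionaries.
first_feature, first_label = tensor_dataset[0]
print(len(tensor_dataset), first_feature, first_label)
```

The trade-off is that samples are indexed by position in a tuple instead of by the `'feature'` and `'label'` keys used in `MyDataset`.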

### Constructing a `DataLoader`

We use the `DataLoader` class to construct the batches needed during training.

In [None]:
dataset = MyDataset(x, y)
batch_size = 4
shuffle = True
num_workers = 4  # Worker subprocesses; set to 0 if they cause errors on your platform
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=shuffle, num_workers=num_workers)
for i_batch, samples in enumerate(dataloader):
    print('\nbatch# = %s' % i_batch)
    print('samples: ')
    pp.pprint(samples)
    break # Otherwise it prints too much stuff
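
To see how a `DataLoader` splits a dataset, the sketch below iterates over every batch and records its size. It uses a stand-in dataset of 10 samples built with `TensorDataset` (an assumption made to keep the example self-contained) and `num_workers=0`, which loads data in the main process.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset with 10 samples (hypothetical, for illustration only).
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)
labels = torch.zeros(10, 1)
toy_dataset = TensorDataset(features, labels)

# num_workers=0 loads data in the main process, which is enough here.
loader = DataLoader(toy_dataset, batch_size=4, shuffle=True, num_workers=0)

batch_sizes = [batch_features.shape[0] for batch_features, _ in loader]
print(batch_sizes)  # [4, 4, 2]

# drop_last=True discards the final, smaller batch.
loader_drop = DataLoader(toy_dataset, batch_size=4, shuffle=True, drop_last=True)
print(len(loader), len(loader_drop))  # 3 2
```

With 10 samples and a batch size of 4, the last batch holds only 2 samples unless `drop_last=True` is set.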

## Neural network model

In [None]:
class Regression(nn.Module):
    def __init__(self, input_size=1, hidden_size=10):
        super(Regression, self).__init__()
        
        self.hidden = nn.Linear(in_features=input_size, out_features=hidden_size, bias=True)
        self.hidden_activation = nn.Tanh()
             
        self.output = nn.Linear(in_features=hidden_size, out_features=1, bias=True)
    
    def forward(self, x):
        x1 = self.hidden(x)
        x2 = self.hidden_activation(x1)
        x3 = self.output(x2)  # No activation for output, since we are
                              # dealing with a regression problem
        return x3

### Model summary

In [None]:
dummy = Regression()
print(dummy)

### Model parameters

In [None]:
for i, parameter in enumerate(dummy.parameters()):
    print(i, '\n', parameter)
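
The parameter tensors printed above can also be summarized by name, shape, and element count. The sketch below uses an `nn.Sequential` stand-in with the same layer sizes as `Regression(input_size=1, hidden_size=10)`; the name `model_sketch` is introduced here for illustration.

```python
import torch.nn as nn

# Stand-in with the same layer sizes as Regression(input_size=1, hidden_size=10).
model_sketch = nn.Sequential(
    nn.Linear(in_features=1, out_features=10),
    nn.Tanh(),
    nn.Linear(in_features=10, out_features=1),
)

# Each Linear layer contributes a weight matrix and a bias vector.
for name, parameter in model_sketch.named_parameters():
    print(name, tuple(parameter.shape), parameter.numel())

n_params = sum(p.numel() for p in model_sketch.parameters())
print(n_params)  # 31
```

The hidden layer has 1 × 10 weights plus 10 biases, and the output layer has 10 × 1 weights plus 1 bias, for 31 trainable parameters in total.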

## Loss

In [None]:
class MyLoss(nn.Module):
    def __init__(self):
        super(MyLoss, self).__init__()
        
    def forward(self, predictions, targets):
        d = torch.sub(predictions, targets)
        d2 = torch.pow(d, 2)
        d2sum = torch.sum(d2)
        
        return d2sum
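
`MyLoss` computes the sum of squared errors over a batch. As a quick check, the sketch below compares the same computation against the built-in `nn.MSELoss` with `reduction='sum'`, using random tensors as stand-ins for real predictions and targets.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
predictions = torch.randn(4, 1)
targets = torch.randn(4, 1)

# Sum of squared errors written out by hand, as in MyLoss.forward.
manual_sse = torch.sum(torch.pow(torch.sub(predictions, targets), 2))

# Built-in equivalent: reduction='sum' skips the averaging step.
builtin_sse = nn.MSELoss(reduction='sum')(predictions, targets)

# The default reduction='mean' divides by the number of elements.
builtin_mse = nn.MSELoss()(predictions, targets)

print(manual_sse.item(), builtin_sse.item(), builtin_mse.item())
```

Because the summed loss grows with the batch size, its gradients do too, so a learning rate tuned for one reduction may not suit the other.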

## Training

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))

In [None]:
model = Regression(1, 10).to(device)
criterion = MyLoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

In [None]:
dataset = MyDataset(x, y)
batch_size = 4
shuffle = True
num_workers = 4  # Worker subprocesses; set to 0 if they cause errors on your platform
training_sample_generator = DataLoader(dataset, 
                                       batch_size=batch_size, 
                                       shuffle=shuffle, 
                                       num_workers=num_workers)

In [None]:
num_epochs = 1000
for epoch in range(num_epochs):
    for batch_i, samples in enumerate(training_sample_generator):
        features = samples['feature'].to(device)
        targets = samples['label'].to(device)
        predictions = model(features)
        error = criterion(predictions, targets)
        optimizer.zero_grad()
        error.backward()        
        optimizer.step()
    if epoch % 100 == 0:
        # `error` holds the loss of the last batch in this epoch
        print('epoch %d:' % epoch, error.item())
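
The loop above reports the loss of a single batch, which can be noisy when batches are shuffled. A minimal sketch of one alternative is to accumulate the loss over all batches in an epoch. It rebuilds the data and model with stand-ins (`TensorDataset`, `nn.Sequential`, `nn.MSELoss(reduction='sum')`) so that it runs on its own, and trains for only 20 epochs; all names ending in `_sketch` are introduced here for illustration.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Same toy data as in this lesson.
x = np.arange(10)
y = np.sin(2 * np.pi * x / 10) * 4
data = TensorDataset(torch.tensor(x, dtype=torch.float32).unsqueeze(1),
                     torch.tensor(y, dtype=torch.float32).unsqueeze(1))
loader = DataLoader(data, batch_size=4, shuffle=True)

model_sketch = nn.Sequential(nn.Linear(1, 10), nn.Tanh(), nn.Linear(10, 1))
criterion_sketch = nn.MSELoss(reduction='sum')
optimizer_sketch = torch.optim.SGD(model_sketch.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(20):
    running = 0.0
    for features, targets in loader:
        loss = criterion_sketch(model_sketch(features), targets)
        optimizer_sketch.zero_grad()
        loss.backward()
        optimizer_sketch.step()
        running += loss.item()  # Accumulate the summed loss over the epoch
    epoch_losses.append(running)

print(epoch_losses[0], epoch_losses[-1])
```

Since the loss uses a sum reduction, each entry of `epoch_losses` is the total squared error over all 10 samples, measured while the weights are still being updated within the epoch.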

## Results

In [None]:
x_try = torch.tensor(np.linspace(-1, 10, 1000), dtype=torch.float32)
y_try = model(x_try.unsqueeze(1).to(device))

First, detach the predictions from the computation graph; this is required for tensors that track gradients.  Next, move them to the CPU, convert them to a NumPy array, and flatten.

In [None]:
yy_try = y_try.detach().cpu().numpy().flatten()
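
An alternative to calling `detach()` afterwards is to run inference inside a `torch.no_grad()` block, so that no computation graph is built in the first place. The sketch below illustrates the difference with an untrained stand-in model; `model_sketch` and `x_new` are names introduced here for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the trained model above.
model_sketch = nn.Sequential(nn.Linear(1, 10), nn.Tanh(), nn.Linear(10, 1))
x_new = torch.linspace(-1, 10, 5).unsqueeze(1)

# A regular forward pass tracks gradients through the model parameters.
y_tracked = model_sketch(x_new)
print(y_tracked.requires_grad)  # True

# Inside no_grad, the output carries no gradient history.
with torch.no_grad():
    y_plain = model_sketch(x_new)
print(y_plain.requires_grad)  # False

# No detach() needed before converting to NumPy.
y_np = y_plain.cpu().numpy().flatten()
```

Both approaches give the same values; `torch.no_grad()` additionally avoids the memory cost of storing intermediate results for backpropagation.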

In [None]:
plt.figure(figsize=(4,4))
plt.plot(x, y, 'o', label='Ground truth')
plt.plot(x_try.numpy(), yy_try, 'r', label='Prediction')
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-1,10)
plt.ylim(-5,5)
plt.legend()

### If you're struggling with the breadth of NN hyperparameters...

Check out the [TensorFlow Playground](https://playground.tensorflow.org/).
