<center>
    <tr>
    <td><img src="images/Quansight_Logo_Lockup_1.png" width="25%"></img></td>
    </tr>
</center>

---
# Linear Regression with PyTorch

---
## Outline

1. Using PyTorch data utilities
2. Constructing Linear Regression model in PyTorch
3. Training the neural network

+ Objective: use 1D linear regression as example of workflow with PyTorch
  + Framework extends to training much larger deep network models

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pprint as pp

+ Generate simple data: straight line with some noise

In [None]:
n_samples = 10 # number of data points
m = .7
c = 0
x = np.linspace(0, 9, n_samples) 
y = m*x + c + np.random.normal(0,.3,x.shape) - 3
plt.figure()
plt.plot(x,y,'o')
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-1,10)
plt.ylim(-4,4)
plt.title(f'2D data (#data = {n_samples})');

+ More interesting data; line obviously poor model

In [None]:
np.random.seed(0)

n_samples = 10
x = np.arange(n_samples)
y = np.sin(2 * np.pi * x / n_samples) * 4

plt.figure(figsize=(4,4))
plt.plot(x, y, 'o')
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-1,10)
plt.ylim(-5,5)
plt.title(f'2D data (#data = {n_samples})');

---
## Using PyTorch data utilities

+ Deep learning models are data intensive
   + Organizing data to support training deep neural networks time-consuming

#### PyTorch `Dataset` & `DataLoader` classes

+ [PyTorch `Dataset` class](https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset) (in `torch.utils.data`) for constructing appropriate _data loaders_ for deep network training
+ Abstraction generalizes to work with large on-disk data sets
+ [PyTorch `DataLoader` class](https://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader) (again, in `torch.utils.data`) combines dataset and sampling strategy (e.g., in random batches without replacement) as Python iterable

In [None]:
import torch

from torch.utils.data import Dataset
class MyDataset(Dataset):
    def __init__(self, x, y):
        self.x = x
        self.y = y       
    def __len__(self):
        return len(self.x)    
    def __getitem__(self, idx):
        # Notice 'feature' is the pair [1, x]: the constant 1 multiplies the intercept term
        sample = {
            'feature': torch.tensor([1,self.x[idx]], dtype=torch.double), 
            'label': torch.tensor([self.y[idx]], dtype=torch.double)}
        return sample

In [None]:
# NumPy arrays of data created previously
print(f'Features shape: {x.shape}')
print(f'Targets shape: {y.shape}')
# Wrap features & labels in MyDataset class
dataset = MyDataset(x, y)
dataset

In [None]:
for k, sample in enumerate(dataset):
    print(f"Observation{k:2d}:\tFeatures: {sample['feature']}\tLabel: {sample['label']}")

+ Construct a `DataLoader` for batches in training

In [None]:
from torch.utils.data import DataLoader

dataset = MyDataset(x, y) # Instantiate dataset
batch_size = 3
shuffle = True
num_workers = 4
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=shuffle, num_workers=num_workers)

In [None]:
for i_batch, samples in enumerate(dataloader):
    print('\nbatch# = %s' % i_batch)
    print('samples: ')
    pp.pprint(samples)
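
A minimal sketch of what "random batches without replacement" means in practice: with `shuffle=True`, each epoch visits every index exactly once, and the last batch is smaller when the data set size is not a multiple of `batch_size`. The sketch uses `TensorDataset` on bare indices (an assumption for brevity, not the `MyDataset` class above) and the default `num_workers=0`.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

n_samples, batch_size = 10, 3

# Hypothetical stand-in data set: each sample is just its own index
indices = torch.arange(n_samples)
loader = DataLoader(TensorDataset(indices), batch_size=batch_size, shuffle=True)

seen = []
batch_sizes = []
for (batch,) in loader:          # TensorDataset yields 1-tuples
    batch_sizes.append(len(batch))
    seen.extend(batch.tolist())

print('batch sizes:', batch_sizes)
print('batches per epoch:', len(loader), '= ceil(n_samples / batch_size) =', math.ceil(n_samples / batch_size))
print('each index seen exactly once:', sorted(seen) == list(range(n_samples)))
```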

---
## Constructing Linear Regression model in PyTorch

+ With data ready, next construct PyTorch model
+ Inherit model class from PyTorch `nn.Module` class
   + Needs to provide `forward` method

In [None]:
import torch.nn as nn
from torch.nn.parameter import Parameter

In [None]:
class MyModel(nn.Module):
    def __init__(self, dim_input, dim_output):
        super(MyModel, self).__init__()
        
        # Defining model parameters m and c
        # for model y = mx + c.
        # Specifically, m represents weight matrix,
        #  c represents bias vector
        self.weight = Parameter(torch.empty([dim_output, dim_input], dtype=torch.double))
        self.bias = Parameter(torch.empty([dim_output, 1], dtype=torch.double))
        
        # Initialize parameters with uniformly distributed random values
        stdv = 1.0
        self.weight.data.uniform_(-stdv, stdv)
        self.bias.data.uniform_(-stdv, stdv)
        
    def forward(self, x):
        # Stack as [c, m] so columns line up with features [1, x]
        bias_and_weight = torch.cat((self.bias, self.weight), 1)
        # Output of model is y = [1, x] @ [c, m].T = c + m*x
        # Important: input x may not be single observation. 
        # Input x may be batch of inputs as tensor
        #
        # (Common to use dimension 0 as batch dimension)
        out = x.matmul(bias_and_weight.t())
        return out
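
The design-matrix formulation (rows `[1, x]` multiplied by stacked intercept and slope) computes the same thing as an affine layer applied to `x` alone. A minimal sketch, using the built-in `nn.Linear` as a swapped-in reference rather than the `MyModel` class above; variable names here are illustrative only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
linear = nn.Linear(1, 1).double()      # built-in affine layer: y = x @ W.T + b

x = torch.arange(5, dtype=torch.double).unsqueeze(1)   # shape (5, 1)
design = torch.cat((torch.ones_like(x), x), dim=1)     # shape (5, 2), rows [1, x]

# Stack intercept and slope as [c, m] to line up with the [1, x] rows
c_and_m = torch.cat((linear.bias.unsqueeze(1), linear.weight), dim=1)   # shape (1, 2)

out_builtin = linear(x)                     # shape (5, 1)
out_design = design.matmul(c_and_m.t())     # shape (5, 1)
print('Same predictions:', torch.allclose(out_builtin, out_design))
```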

+ Instantiate model for 1D linear regression problem

In [None]:
input_dim = 1
output_dim = 1
model = MyModel(input_dim, output_dim)

+ `dataloader` from before draws samples from data set in batches of fixed size
+ `model.forward` is prediction from model with current `weight`/`bias` values

In [None]:
for k_batch, sample in enumerate(dataloader):
    print(f'Batch number {k_batch}')
    prediction = model.forward(sample['feature'])
    print('Prediction:\n', prediction)

#### Defining a Loss Function

+ Can be defined using class `nn.Module` again
+ When instantiated, method `MyLoss.forward(predictions, targets)` computes loss (squared error)
  $$ \mathcal{L}(\hat{y}, y) = \sum_{k=1}^{N}  \left[ y_{k} - \hat{y}_{k} \right]^2 $$
+ Typical loss function for *regression* problems (not scaled below)

In [None]:
class MyLoss(nn.Module):
    def __init__(self):
        super(MyLoss, self).__init__()
        
    def forward(self, predictions, targets):       
        diff = torch.sub(predictions, targets)
        diff2 = torch.pow(diff, 2)
        err = torch.sum(diff2)
        return err
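
A small worked check of the unscaled squared-error formula on made-up numbers (the values below are illustrative, not from the data set). The sketch compares a direct evaluation against the built-in `nn.MSELoss`: `reduction='sum'` matches the unscaled sum, while the default `reduction='mean'` divides by the number of elements.

```python
import torch
import torch.nn as nn

# Illustrative values only
predictions = torch.tensor([[1.0], [2.0], [4.0]], dtype=torch.double)
targets = torch.tensor([[1.5], [2.0], [2.0]], dtype=torch.double)

# Direct evaluation: (-0.5)^2 + 0^2 + 2^2 = 4.25
manual = torch.sum((predictions - targets) ** 2)

builtin_sum = nn.MSELoss(reduction='sum')(predictions, targets)
builtin_mean = nn.MSELoss()(predictions, targets)   # default reduction='mean'

print('manual sum of squares:', manual.item())
print('MSELoss(reduction="sum"):', builtin_sum.item())
print('MSELoss() (mean):', builtin_mean.item())
```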

---
## Training the neural network

+ Module [`torch.optim`](https://pytorch.org/docs/stable/optim.html) supports numerous optimization algorithms
  + Includes [*stochastic gradient descent*](https://en.wikipedia.org/wiki/Stochastic_gradient_descent), [Adam](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam), [AdaGrad](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#AdaGrad), etc.; custom strategies can also be defined
+ Here, define `optimizer` using `torch.optim.SGD`


In [None]:
num_epochs = 1000  # Number of passes over the entire training data
l_rate = 0.01
# Instantiate objects for optimizer & loss function
#   Requires model.parameters() & a learning rate
optimizer = torch.optim.SGD(model.parameters(), lr = l_rate)
loss = MyLoss()
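
For plain `torch.optim.SGD` with its default settings (no momentum, no weight decay), one call to `optimizer.step()` amounts to `p = p - lr * p.grad` for each parameter. A minimal sketch comparing the two on a toy quadratic; the parameter values and the helper `toy_loss` are hypothetical, chosen only for illustration.

```python
import torch

lr = 0.01

# Two copies of the same illustrative parameter vector
w_opt = torch.tensor([1.0, -2.0], dtype=torch.double, requires_grad=True)
w_manual = w_opt.detach().clone().requires_grad_(True)

def toy_loss(w):
    # Hypothetical loss: squared distance from 3 in each coordinate
    return torch.sum((w - 3.0) ** 2)

# Update via the optimizer
optimizer = torch.optim.SGD([w_opt], lr=lr)
optimizer.zero_grad()
toy_loss(w_opt).backward()
optimizer.step()

# Same update written out by hand
toy_loss(w_manual).backward()
with torch.no_grad():
    w_manual -= lr * w_manual.grad

print('optimizer.step():', w_opt.data)
print('manual update:   ', w_manual.data)
```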

In [None]:
# Re-initialize dataset & dataloader, just in case
dataset = MyDataset(x, y)
batch_size = 4
shuffle = True
num_workers = 4
training_data_loader = DataLoader(dataset, batch_size=batch_size, shuffle=shuffle, num_workers=num_workers)

In [None]:
sample = next(iter(training_data_loader))
print(model(sample['feature']))
print(model.forward(sample['feature']))
print('Model weight', model.weight)
print('Model bias', model.bias)

In [None]:
for epoch in range(num_epochs):
    if epoch % 100 == 0:
        print('Epoch = %s' % epoch)
    for k_batch, samples in enumerate(training_data_loader):
        predictions = model.forward(samples['feature'])
        error = loss.forward(predictions, samples['label'])
        if epoch % 100 == 0:
            print('\tBatch = %s, Error = %s' % (k_batch, error.item()))
        
        # Before the backward pass, use optimizer object to zero out all
        # gradients for variables to update (i.e., learnable parameters).
        # By default, gradients are accumulated in buffers (i.e., not overwritten)
        # whenever .backward() called.
        # See docs of torch.autograd.backward for details
        optimizer.zero_grad()
        
        # Backward pass: compute gradient of loss wrt model parameters
        error.backward()
        
        # Calling method optimizer.step to compute updated parameters
        optimizer.step()
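
A minimal sketch of why the loop calls `optimizer.zero_grad()` before every backward pass: gradients from successive `.backward()` calls add up in `.grad` unless they are reset. The single parameter `w` and the toy loss `3 * w` are illustrative assumptions.

```python
import torch

w = torch.tensor([2.0], dtype=torch.double, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

# First backward pass: d(3w)/dw = 3
(3.0 * w).sum().backward()
first = w.grad.clone()

# Second backward pass without resetting: gradients accumulate, 3 + 3 = 6
(3.0 * w).sum().backward()
accumulated = w.grad.clone()

# Reset, then backward again: back to 3
optimizer.zero_grad()
(3.0 * w).sum().backward()
after_reset = w.grad.clone()

print('first backward:      ', first.item())
print('without zero_grad(): ', accumulated.item())
print('after zero_grad():   ', after_reset.item())
```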

#### Visualizing result of training

+ Make grid of values to sample `model`
+ Preprocess inputs into form suitable for `model.forward`

In [None]:
x_for_plotting = np.linspace(-1, 10, 1000)
design_matrix = torch.tensor(np.vstack([np.ones(x_for_plotting.shape), x_for_plotting]).T, dtype=torch.double)
print('Design matrix shape:', design_matrix.shape)

In [None]:
y_for_plotting = model.forward(design_matrix)
print('y_for_plotting shape:', y_for_plotting.shape)

In [None]:
plt.figure(figsize=(4,4))
plt.plot(x, y, 'o', label='data')
plt.plot(x_for_plotting, y_for_plotting.detach().numpy(), 'r-', label='model')
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-1,10)
plt.ylim(-10,10)
plt.legend(loc='upper right')
plt.title('Data & linear regression model');
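
As a sanity check on the plotted line, 1D linear regression also has a closed-form least-squares solution. The sketch below rebuilds the sine data from earlier and solves it with `np.linalg.lstsq`; if training has converged (an assumption, since SGD with small batches keeps jittering), the learned slope and intercept should land near these values.

```python
import numpy as np

# Same sine data as generated earlier in the notebook
n_samples = 10
x = np.arange(n_samples)
y = np.sin(2 * np.pi * x / n_samples) * 4

# Design matrix with rows [1, x], matching the 'feature' layout
design = np.column_stack([np.ones(n_samples), x])

# Least-squares solution of design @ [intercept, slope] ~= y
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, slope = coef

# At the optimum the residual is orthogonal to both design columns
residual = y - design @ coef

print(f'intercept = {intercept:.4f}, slope = {slope:.4f}')
print('design.T @ residual =', design.T @ residual)
```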

#### Saving model for later use
+ `model` has method `state_dict` that returns parameters

In [None]:
# Print model's state_dict
print("Model's state_dict:")
for key, val in model.state_dict().items():
    print(f'Key: {key}\tValue: {val}\t(Size: {val.size()})')

In [None]:
# Print optimizer's state_dict
print("Optimizer's state_dict:")
for var_name in optimizer.state_dict():
    print(var_name, "\t", optimizer.state_dict()[var_name])

In [None]:
import pathlib
filepath = pathlib.Path('.') / 'model01.pt'
torch.save(model.state_dict(), filepath)

#### Loading saved model
+ `torch.load` reads saved `state_dict` from disk; `nn.Module` method `load_state_dict` copies it into a model instance

In [None]:
del model, x_for_plotting, y_for_plotting

In [None]:
model_loaded = MyModel(1, 1)
model_loaded.load_state_dict(torch.load('model01.pt'));

In [None]:
model_loaded.state_dict()
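
A minimal round-trip sketch of the save/load pattern, assuming a built-in `nn.Linear` in place of `MyModel` and a temporary directory in place of the working directory (the file name `roundtrip.pt` is hypothetical). A freshly constructed model starts with its own random parameters; after `load_state_dict` they match the saved ones exactly.

```python
import pathlib
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)
model_a = nn.Linear(1, 1).double()   # stands in for the trained model
model_b = nn.Linear(1, 1).double()   # fresh instance with its own random init

with tempfile.TemporaryDirectory() as tmpdir:
    path = pathlib.Path(tmpdir) / 'roundtrip.pt'
    torch.save(model_a.state_dict(), path)   # write parameters to disk
    state = torch.load(path)                 # read them back as a dict
    model_b.load_state_dict(state)           # copy into the fresh instance

same = all(
    torch.equal(p, q)
    for p, q in zip(model_a.state_dict().values(), model_b.state_dict().values())
)
print('Keys:', list(state.keys()))
print('Parameters identical after round trip:', same)
```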

In [None]:
x_for_plotting = np.linspace(-1, 10, 1000)
design_matrix = torch.tensor(np.vstack([np.ones(x_for_plotting.shape), x_for_plotting]).T, dtype=torch.double)
print('Design matrix shape:', design_matrix.shape)

y_for_plotting = model_loaded.forward(design_matrix)
print('y_for_plotting shape:', y_for_plotting.shape)

plt.figure(figsize=(4,4))
plt.plot(x, y, 'o', label='data')
plt.plot(x_for_plotting, y_for_plotting.detach().numpy(), 'r-', label='model')
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(-1,10)
plt.ylim(-10,10)
plt.legend(loc='upper right')
plt.title('Data & linear regression model');

## Summary

+ Using PyTorch data utilities
+ Constructing Linear Regression model in PyTorch
+ Training the neural network
+ Framework extends to training much larger deep network models
