## Linear Regression using PyTorch built-ins

Credits: \
https://jovian.ai/aakashns/02-linear-regression

In [1]:
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset

In [2]:
# features
X = np.array([[73, 67, 43], [91, 88, 64], [87, 134, 58], 
             [102, 43, 37], [69, 96, 70], [73, 67, 43], 
             [91, 88, 64], [87, 134, 58], [102, 43, 37], 
             [69, 96, 70], [73, 67, 43], [91, 88, 64], 
             [87, 134, 58], [102, 43, 37], [69, 96, 70]], 
             dtype='float32')

print(X)
print(X.shape)

[[ 73.  67.  43.]
 [ 91.  88.  64.]
 [ 87. 134.  58.]
 [102.  43.  37.]
 [ 69.  96.  70.]
 [ 73.  67.  43.]
 [ 91.  88.  64.]
 [ 87. 134.  58.]
 [102.  43.  37.]
 [ 69.  96.  70.]
 [ 73.  67.  43.]
 [ 91.  88.  64.]
 [ 87. 134.  58.]
 [102.  43.  37.]
 [ 69.  96.  70.]]
(15, 3)


In [3]:
# target
y = np.array([[56, 70], [81, 101], [119, 133],
              [22, 37], [103, 119], [56, 70],
              [81, 101], [119, 133], [22, 37],
              [103, 119], [56, 70], [81, 101],
              [119, 133], [22, 37], [103, 119]],
             dtype='float32')
print(y)
print(y.shape)

[[ 56.  70.]
 [ 81. 101.]
 [119. 133.]
 [ 22.  37.]
 [103. 119.]
 [ 56.  70.]
 [ 81. 101.]
 [119. 133.]
 [ 22.  37.]
 [103. 119.]
 [ 56.  70.]
 [ 81. 101.]
 [119. 133.]
 [ 22.  37.]
 [103. 119.]]
(15, 2)


In [4]:
# Convert features and target to tensors
X = torch.from_numpy(X)
y = torch.from_numpy(y)
#print('X:')
#print(X)

#print('y:')
#print(y)

We'll create a `TensorDataset`, which allows access to rows from inputs and targets as tuples, and provides standard APIs for working with many different types of datasets in PyTorch. This also allows us to access a small section of the training data using the array indexing notation (`[0:3]` for instance).

In [5]:
# Define dataset
train_ds = TensorDataset(X, y)
train_ds[0:3]

(tensor([[ 73.,  67.,  43.],
         [ 91.,  88.,  64.],
         [ 87., 134.,  58.]]),
 tensor([[ 56.,  70.],
         [ 81., 101.],
         [119., 133.]]))
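
To make the indexing behaviour concrete, here is a minimal plain-Python sketch of the idea behind `TensorDataset`: indexing the dataset applies the same index to every stored sequence and returns the results as a tuple. `PairedDataset` is a hypothetical name used only for illustration, and it works on lists rather than tensors, so treat it as a sketch of the idea and not as PyTorch's implementation.

```python
class PairedDataset:
    """Hypothetical stand-in for TensorDataset, using plain lists."""

    def __init__(self, *columns):
        # All columns must describe the same number of samples.
        if any(len(c) != len(columns[0]) for c in columns):
            raise ValueError("all columns must have the same length")
        self.columns = columns

    def __len__(self):
        return len(self.columns[0])

    def __getitem__(self, index):
        # Apply the same index (int or slice) to every column.
        return tuple(c[index] for c in self.columns)


inputs = [[73, 67, 43], [91, 88, 64], [87, 134, 58]]
targets = [[56, 70], [81, 101], [119, 133]]

ds = PairedDataset(inputs, targets)
first_x, first_y = ds[0]   # one (input, target) pair
xs, ys = ds[0:2]           # a slice of inputs and the matching targets
print(first_x, first_y)
print(xs, ys)
```

Because every column is indexed identically, inputs and targets stay aligned no matter which rows are requested.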

We'll also create a `DataLoader`, which can split the data into batches of a predefined size while training. It also provides other utilities like shuffling and random sampling of the data.

In [6]:
from torch.utils.data import DataLoader

In [7]:
# Define data loader
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)

The data loader is typically used in a `for-in` loop:

In [8]:
for xb, yb in train_dl:
    print(xb)
    print(yb)
    break

tensor([[102.,  43.,  37.],
        [ 87., 134.,  58.],
        [ 69.,  96.,  70.],
        [102.,  43.,  37.],
        [ 87., 134.,  58.]])
tensor([[ 22.,  37.],
        [119., 133.],
        [103., 119.],
        [ 22.,  37.],
        [119., 133.]])


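In each iteration, the data loader returns one batch of data with the given batch size (the last batch may be smaller if the dataset size is not a multiple of it). If `shuffle` is set to `True`, it reshuffles the training data at the start of every epoch, before creating batches.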
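
The shuffling and batching described above can be sketched in plain Python. This is only an illustration of the idea, not how `DataLoader` is implemented; `iterate_batches` and its `rng` parameter are hypothetical names introduced here.

```python
import random


def iterate_batches(n_samples, batch_size, shuffle=True, rng=None):
    """Yield lists of sample indices, one list per batch (illustrative only)."""
    rng = rng or random.Random()
    indices = list(range(n_samples))
    if shuffle:
        rng.shuffle(indices)  # a fresh order on every call, i.e. every epoch
    for start in range(0, n_samples, batch_size):
        yield indices[start:start + batch_size]


# 15 samples in batches of 5, as in the data loader defined above.
batches = list(iterate_batches(15, 5, rng=random.Random(0)))
for batch in batches:
    print(batch)
```

Every index appears exactly once per epoch. The batch printed earlier still contains repeated rows because the 15-row dataset itself repeats five distinct rows three times.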

### nn.Linear

Instead of initializing the weights & biases manually, we can define the model using the `nn.Linear` class from PyTorch, which does it automatically.

In [9]:
# Define model
model = nn.Linear(3, 2)
print(model.weight)
print(model.bias)

Parameter containing:
tensor([[ 0.2305, -0.5226, -0.2284],
        [ 0.3058,  0.0065, -0.5566]], requires_grad=True)
Parameter containing:
tensor([-0.3548,  0.0347], requires_grad=True)


PyTorch models also have a helpful `.parameters` method, which returns an iterator (a generator) over all the weight and bias tensors in the model. Wrap it in `list(...)` to inspect them.

In [10]:
model.parameters()

<generator object Module.parameters at 0x7fb6f1487450>

In [11]:
list(model.parameters())

[Parameter containing:
 tensor([[ 0.2305, -0.5226, -0.2284],
         [ 0.3058,  0.0065, -0.5566]], requires_grad=True),
 Parameter containing:
 tensor([-0.3548,  0.0347], requires_grad=True)]

In [12]:
y_pred = model(X)
y_pred

tensor([[-28.3640,  -1.1371],
        [-39.9864,  -7.1837],
        [-63.5798,  -4.7672],
        [ -7.7640,  10.9142],
        [-50.6102, -17.1988],
        [-28.3640,  -1.1371],
        [-39.9864,  -7.1837],
        [-63.5798,  -4.7672],
        [ -7.7640,  10.9142],
        [-50.6102, -17.1988],
        [-28.3640,  -1.1371],
        [-39.9864,  -7.1837],
        [-63.5798,  -4.7672],
        [ -7.7640,  10.9142],
        [-50.6102, -17.1988]], grad_fn=<AddmmBackward>)
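
`nn.Linear` computes `x @ weight.T + bias`. As a sanity check, the plain-Python sketch below redoes that arithmetic for the first row of `X`, using the weight and bias values copied from the printed output above. Those printed values are rounded to four decimals, so the result only approximately matches the first row of `y_pred`. The helper name `linear` is an assumption made for this example.

```python
# Values copied from the printed model.weight and model.bias above
# (rounded to four decimals, so the match is only approximate).
weight = [[0.2305, -0.5226, -0.2284],
          [0.3058, 0.0065, -0.5566]]
bias = [-0.3548, 0.0347]

x = [73.0, 67.0, 43.0]  # first row of X


def linear(x, weight, bias):
    """Compute x @ weight.T + bias for a single input row."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b
            for row, b in zip(weight, bias)]


pred = linear(x, weight, bias)
print(pred)  # close to the first row of y_pred: [-28.3640, -1.1371]
```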

### Loss function

In [13]:
import torch.nn.functional as F

The `nn.functional` package contains many useful loss functions and several other utilities.

In [14]:
# Define loss function
loss_fn = F.mse_loss

In [15]:
loss = loss_fn(model(X), y)
print(loss)

tensor(13454.6963, grad_fn=<MseLossBackward>)
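
With its default settings, `F.mse_loss` averages the squared differences over every element of the two tensors. A minimal plain-Python sketch of that calculation, using made-up numbers that are not taken from the model above:

```python
def mse(pred, target):
    """Mean of squared differences over every element of two 2-D lists."""
    squared = [(p - t) ** 2
               for p_row, t_row in zip(pred, target)
               for p, t in zip(p_row, t_row)]
    return sum(squared) / len(squared)


pred = [[1.0, 2.0], [3.0, 4.0]]    # made-up predictions
target = [[1.0, 0.0], [0.0, 4.0]]  # made-up targets
print(mse(pred, target))  # (0 + 4 + 9 + 0) / 4 = 3.25
```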


### Optimizer

Instead of manually manipulating the model's weights & biases using gradients, we can use the optimizer `optim.SGD`. SGD stands for `stochastic gradient descent`. It is called `stochastic` because each update uses the gradient computed on a batch of samples (usually drawn with random shuffling) rather than on the entire training set at once.

In [17]:
# Define optimizer
opt = torch.optim.SGD(model.parameters(), lr=1e-5)

Note that `model.parameters()` is passed as an argument to `optim.SGD`, so that the optimizer knows which matrices should be modified during the update step. Also, we can specify a learning rate which controls the amount by which the parameters are modified.
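
The update that plain SGD applies (no momentum or weight decay, which are the defaults used here) is `param = param - lr * grad` for every parameter. The sketch below shows one such step in plain Python. The parameter and gradient values are made up, and `sgd_step` is a hypothetical helper, not part of PyTorch.

```python
def sgd_step(params, grads, lr):
    """Return the parameters after one plain SGD step: p - lr * grad."""
    return [p - lr * g for p, g in zip(params, grads)]


params = [0.5, -0.25]        # made-up parameter values
grads = [2000.0, -4000.0]    # made-up gradients of the loss
new_params = sgd_step(params, grads, lr=1e-5)
print(new_params)  # approximately [0.48, -0.21]
```

Each parameter moves against its gradient, and the learning rate scales the size of the move.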

### Train the model

We'll work with batches of data instead of processing the entire training set in every iteration. Let's define a utility function `fit` that trains the model for a given number of epochs.

In [18]:
# Utility function to train the model
def fit(num_epochs, model, loss_fn, opt, train_dl):
    
    # Repeat for given number of epochs
    for epoch in range(num_epochs):
        
        # Train with batches of data
        for xb,yb in train_dl:
            
            # 1. Generate predictions
            pred = model(xb)
            
            # 2. Calculate loss
            loss = loss_fn(pred, yb)
            
            # 3. Compute gradients
            loss.backward()
            
            # 4. Update parameters using gradients
            opt.step()
            
            # 5. Reset the gradients to zero
            opt.zero_grad()
        
        # Print the loss of the last batch every 10 epochs
        if (epoch+1) % 10 == 0:
            print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, loss.item()))

In [20]:
fit(100, model, loss_fn, opt, train_dl)

Epoch [10/100], Loss: 666.2463
Epoch [20/100], Loss: 694.8779
Epoch [30/100], Loss: 287.3572
Epoch [40/100], Loss: 286.5207
Epoch [50/100], Loss: 343.4597
Epoch [60/100], Loss: 82.1083
Epoch [70/100], Loss: 171.7728
Epoch [80/100], Loss: 100.4312
Epoch [90/100], Loss: 110.1005
Epoch [100/100], Loss: 53.5318


In [21]:
# Generate predictions
y_pred = model(X)
y_pred

tensor([[ 59.1456,  72.4246],
        [ 81.1555,  95.7189],
        [118.4337, 141.5985],
        [ 32.2722,  48.0136],
        [ 93.4522, 104.0567],
        [ 59.1456,  72.4246],
        [ 81.1555,  95.7189],
        [118.4337, 141.5985],
        [ 32.2722,  48.0136],
        [ 93.4522, 104.0567],
        [ 59.1456,  72.4246],
        [ 81.1555,  95.7189],
        [118.4337, 141.5985],
        [ 32.2722,  48.0136],
        [ 93.4522, 104.0567]], grad_fn=<AddmmBackward>)

In [22]:
y

tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.],
        [ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.],
        [ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])
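
The loss printed by `fit` is the loss of the last batch in each reported epoch, which is one reason the values fluctuate from line to line. To summarize how close the final predictions are across the whole dataset, the plain-Python sketch below recomputes the mean squared error from the predictions and targets printed above. The numbers are copied from those outputs, which are rounded to four decimals, so the result is approximate.

```python
# The five distinct rows of y_pred and y printed above. Each row appears
# three times in the data, so the mean over all 15 rows is the same.
pred = [[59.1456, 72.4246], [81.1555, 95.7189], [118.4337, 141.5985],
        [32.2722, 48.0136], [93.4522, 104.0567]]
target = [[56.0, 70.0], [81.0, 101.0], [119.0, 133.0],
          [22.0, 37.0], [103.0, 119.0]]

squared = [(p - t) ** 2
           for p_row, t_row in zip(pred, target)
           for p, t in zip(p_row, t_row)]
full_mse = sum(squared) / len(squared)
print(round(full_mse, 2))
```

This comes to roughly 66, compared with 13454.6963 before training.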