# Linear Models

> Linear models use a linear function to combine features into a prediction

When your data has only a single scalar feature (e.g. height) and a single scalar target (e.g. age), the model is the equation of a straight line:

![](https://github.com/AI-Core/Content-Public/blob/main/Content/units/Towards%20ChatGPT/2.%20Pytorch/2.%20Linear%20Models/images/Linear%20Model%20Graph%20and%20Equation.png?raw=1)

> When a linear model is used to make a prediction that can be any continuous number, the model is known as _linear regression_.

- Regression: Output can be any continuous value
- Linear: Output is a weighted sum of inputs, plus some constant (bias)

Of course, in many problems of interest, the data can have many features. This makes it harder to visualise, but the concept is exactly the same.

![](https://github.com/AI-Core/Content-Public/blob/main/Content/units/Towards%20ChatGPT/2.%20Pytorch/2.%20Linear%20Models/images/Linear%20Regression%20Equation.png?raw=1)

This means that each feature has an associated _weight_ which represents its influence on the prediction.

The prediction made by linear regression changes by a constant amount each time a feature's value increases by one, with the other features held fixed. That constant is the feature's weight.

![](https://github.com/AI-Core/Content-Public/blob/main/Content/units/Towards%20ChatGPT/2.%20Pytorch/2.%20Linear%20Models/images/Linear%20Regression%20Parameter%20Example.png?raw=1)
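
To make the role of a weight concrete, here is a minimal sketch in plain Python. The weights, bias, and feature values are made-up numbers chosen for illustration; they are not taken from the figures above.

```python
# Hypothetical parameters for a model with three features (illustrative values only)
weights = [2.0, -1.0, 0.5]
bias = 4.0


def predict(features):
    """Weighted sum of the features plus the bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


base = predict([1.0, 3.0, 2.0])    # 2*1 - 1*3 + 0.5*2 + 4 = 4.0
bumped = predict([2.0, 3.0, 2.0])  # same input, first feature increased by one

# The prediction moves by exactly the first feature's weight
print(bumped - base)  # 2.0
```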

Linear regression model parameters:

- Weight: Influence of each feature
- Bias: A constant value for each output

> Note that, although the bias is a single scalar when the layer has one output, a linear layer has as many biases as it has outputs
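
The one-bias-per-output point can be checked from the parameter shapes. This is a small sketch; the layer sizes (3 inputs, 2 outputs) are arbitrary choices for illustration.

```python
import torch

# A layer mapping 3 input features to 2 outputs (sizes chosen for illustration)
layer = torch.nn.Linear(3, 2)

print(layer.weight.shape)  # torch.Size([2, 3]): one row of weights per output
print(layer.bias.shape)    # torch.Size([2]): one bias per output
```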

The linear layer can be represented graphically too:

![](https://github.com/AI-Core/Content-Public/blob/main/Content/units/Towards%20ChatGPT/2.%20Pytorch/2.%20Linear%20Models/images/Linear%20Layer%20Graphical%20Model%20with%20Labels.png?raw=1)

In PyTorch, the _`Linear` layer_ can be used to create an (initially random) linear model:


In [5]:
import torch

linear_layer = torch.nn.Linear(3, 1)  # 3 input features, 1 output prediction


PyTorch's `Linear` layers contain the model parameters - both the weights and the bias.


In [6]:
print("weight:", linear_layer.weight)
print("bias:", linear_layer.bias)


weight: Parameter containing:
tensor([[ 0.5183,  0.5672, -0.1289]], requires_grad=True)
bias: Parameter containing:
tensor([-0.5486], requires_grad=True)
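
These two parameters are all the layer needs to make a prediction. As a rough check, the sketch below computes the weighted sum by hand and compares it with the layer's own output; the input batch `X` is random data made up for the example.

```python
import torch

linear_layer = torch.nn.Linear(3, 1)
X = torch.randn(4, 3)  # a made-up batch of 4 examples with 3 features each

# Weighted sum of the features plus the bias, written out by hand
manual = X @ linear_layer.weight.T + linear_layer.bias
from_layer = linear_layer(X)

# Compare with a small tolerance to allow for floating-point rounding
print(torch.allclose(manual, from_layer, atol=1e-6))
```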


We can use this linear layer in a PyTorch model to create a linear regression model (which is initialised with random parameters):


In [7]:
class LinearRegression(torch.nn.Module):
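    def __init__(self):
        super().__init__()  # must run first so nn.Module can register the layer
        self.linear_layer = torch.nn.Linear(3, 1)  # 3 input features, 1 output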

    def forward(self, X):
        return self.linear_layer(X)
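
A quick way to check the model is to pass a batch through it. The sketch below repeats the class definition so that it runs on its own (note the call to `super().__init__()`, which `torch.nn.Module` needs before submodules can be assigned); the input batch is random data made up for the example.

```python
import torch


class LinearRegression(torch.nn.Module):
    def __init__(self):
        super().__init__()  # lets nn.Module register the layer's parameters
        self.linear_layer = torch.nn.Linear(3, 1)

    def forward(self, X):
        return self.linear_layer(X)


model = LinearRegression()
X = torch.randn(5, 3)   # made-up batch: 5 examples, 3 features
predictions = model(X)  # calling the model runs forward()

print(predictions.shape)  # torch.Size([5, 1])
print([name for name, _ in model.named_parameters()])
# ['linear_layer.weight', 'linear_layer.bias']
```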


> Linear models are differentiable, which means that they can be trained from end to end using gradient descent

This can be done in a typical PyTorch training loop:


In [8]:
import torch.nn.functional as F
from torch.utils.tensorboard import SummaryWriter


def train(model, dataloader, epochs=10):

    # SET UP LOGGING
    writer = SummaryWriter()
    batch_idx = 0

    optimiser = torch.optim.SGD(
        model.parameters(), lr=0.001)  # SET UP OPTIMISER

    for epoch in range(epochs):
        for batch in dataloader:

            features, labels = batch  # UNPACK BATCH OF DATA

            # COMPUTE LOSS
            predictions = model(features)
            loss = F.mse_loss(predictions, labels)

            # OPTIMISE
            loss.backward()
            optimiser.step()
            optimiser.zero_grad()

            # LOGGING
            writer.add_scalar("Loss/Train", loss.item(), batch_idx)
            batch_idx += 1
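
The loop clears the gradients after every step because PyTorch adds each newly computed gradient to whatever is already stored. A minimal sketch of that behaviour, using a single made-up scalar parameter `w` rather than a full model:

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(3 * w).backward()
first = w.grad.item()
print(first)  # 3.0

(3 * w).backward()  # nothing was cleared, so the new gradient is added on top
second = w.grad.item()
print(second)  # 6.0

# Reset by hand; optimiser.zero_grad() clears the gradients of every
# parameter the optimiser manages
w.grad.zero_()
print(w.grad.item())  # 0.0
```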


Usually, you'd start with the data. But in our case we don't have it yet, so let's get some, and then use PyTorch's `DataLoader` to shuffle it and batch examples together.


In [10]:
from torch.utils.data import DataLoader

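dataset = 0  # Placeholder: replace with a Dataset of (features, label) pairs; see practice 1

# random_split needs the size of each split. The 70/15/15 proportions are an
# example; fractions that sum to 1 require PyTorch >= 1.13
train_set, val_set, test_set = torch.utils.data.random_split(dataset, [0.7, 0.15, 0.15])

train_loader = DataLoader(train_set, shuffle=True, batch_size=16)
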
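
`random_split` has to be told how large each split should be. Here is a minimal sketch that uses randomly generated tensors as a stand-in for the real dataset; the data, the split sizes, and the names `features` and `labels` are purely illustrative and are not the dataset from practice 1.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic stand-in data: 100 examples, 3 features and 1 target each
features = torch.randn(100, 3)
labels = torch.randn(100, 1)
dataset = TensorDataset(features, labels)

# Integer lengths must sum to len(dataset)
train_set, val_set, test_set = random_split(dataset, [70, 15, 15])

train_loader = DataLoader(train_set, shuffle=True, batch_size=16)

batch_features, batch_labels = next(iter(train_loader))
print(batch_features.shape, batch_labels.shape)
# torch.Size([16, 3]) torch.Size([16, 1])
```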

Now we have everything we need to train our linear regression model, so let's train it.


In [None]:
model = LinearRegression()
train(model, train_loader)


Once trained, we can check out the values of the linear layer's parameters to see where they ended up:


In [None]:
print("Final weight:", model.linear_layer.weight)
print("Final bias:", model.linear_layer.bias)
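
When the data comes from a known linear relationship, the trained parameters can be compared with the ones that generated it. The sketch below is one way to try this: it uses a bare `torch.nn.Linear`, made-up "true" parameters, noise-free synthetic data, and a larger learning rate than the loop above so that it converges within a few epochs. All of these are assumptions for illustration, not part of the lesson's dataset.

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Made-up ground truth used to generate noise-free synthetic data
true_weight = torch.tensor([[2.0, -1.0, 0.5]])
true_bias = torch.tensor([0.3])
X = torch.randn(256, 3)
y = X @ true_weight.T + true_bias
loader = DataLoader(TensorDataset(X, y), shuffle=True, batch_size=32)

model = torch.nn.Linear(3, 1)
optimiser = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(30):
    for features, labels in loader:
        loss = F.mse_loss(model(features), labels)
        loss.backward()
        optimiser.step()
        optimiser.zero_grad()

# With noise-free data these should end up close to the true values above
print("Learned weight:", model.weight.data)
print("Learned bias:", model.bias.data)
```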
