*Synthetic data* can help us to evaluate the properties of our learning algorithms and to confirm that our implementations work as expected.

First, we define the *SyntheticRegressionData* class, which draws an input tensor $X$ from a standard normal distribution and computes the output $y = Xw + b + \epsilon$, where $\epsilon$ is Gaussian noise.
Using this class, we generate data with chosen values of $w$ and $b$, wrap the tensors in a *TensorDataset*, and pass the dataset to a *DataLoader*, an iterable that hides the details of forming minibatches, reshuffling the data at every epoch, and so on.

In [41]:
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader
import warnings
warnings.filterwarnings("ignore") # no warnings for this notebook

class SyntheticRegressionData(nn.Module):
    def __init__(self, w, b, noise=0.01, num_train=1000, num_val=1000, batch_size=32):
        super().__init__()
        n = num_train + num_val
        self.X = torch.randn(n, len(w))
        noise = torch.randn(n, 1) * noise
        # w.reshape((-1, 1)) turns w into a column vector of shape [len(w), 1]
        self.y = torch.matmul(self.X, w.reshape((-1,1))) + b + noise
        self.batch_size = batch_size

data = SyntheticRegressionData(w=torch.tensor([2, -3.4]), b=4.2)
dataset = TensorDataset(data.X, data.y)
dataloader = DataLoader(dataset, batch_size=data.batch_size, shuffle=True)
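
To check that the loader behaves as described, one can pull a single minibatch and inspect its shapes. The sketch below is a standalone stand-in that rebuilds the tensors directly rather than through the class above; the seed and the names `true_w`, `true_b`, `loader`, `X_batch`, and `y_batch` are assumptions made for illustration.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)  # assumed seed, only for reproducibility

# Rebuild the synthetic data directly: y = Xw + b + noise
true_w, true_b = torch.tensor([2, -3.4]), 4.2
X = torch.randn(2000, len(true_w))
y = X @ true_w.reshape(-1, 1) + true_b + 0.01 * torch.randn(2000, 1)

loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

# Take the first minibatch from the iterable
X_batch, y_batch = next(iter(loader))
print(X_batch.shape, y_batch.shape)

# 2000 examples in batches of 32: 62 full batches plus one batch of 16
print(len(loader))
```

Each minibatch pairs 32 rows of features with the 32 matching targets, and the last batch of an epoch is smaller because 2000 is not a multiple of 32.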

Then, we create a *LinearRegression* class. We fill the weights with values sampled from a normal distribution and the bias with zeros. *nn.LazyLinear* is a fully connected layer, here with 1 output feature, that automatically infers __in_features__ during the first forward pass. When we call forward, it applies a linear transformation to the input. We use *nn.MSELoss()* as the loss function and *torch.optim.SGD* as the optimizer, then instantiate the model.

In [42]:
class LinearRegression(nn.Module):
    def __init__(self, lr):
        super().__init__()
        self.lr = lr  # keep the learning rate instead of discarding the argument
        self.net = nn.LazyLinear(1)
        self.net.weight.data.normal_(0, 0.01)
        self.net.bias.data.fill_(0)

    def forward(self, X):
        return self.net(X)

    def loss(self, y_hat, y):
        fn = nn.MSELoss()
        return fn(y_hat, y)
    
    def configure_optimizers(self, lr):
        return torch.optim.SGD(self.parameters(), lr)

model = LinearRegression(lr=0.01)
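
The lazy shape inference mentioned above can be observed in isolation. This is a minimal sketch, assuming a toy input with 4 rows and 2 features; the names `layer`, `lazy_before`, and `out` are illustrative and not part of the notebook.

```python
import warnings
import torch
import torch.nn as nn

warnings.filterwarnings("ignore")  # lazy modules may emit a warning

layer = nn.LazyLinear(1)

# Before any data is seen, the weight has no concrete shape yet
lazy_before = isinstance(layer.weight, nn.parameter.UninitializedParameter)
print(lazy_before)

out = layer(torch.randn(4, 2))  # first forward pass with 2 input features

# The layer has now inferred in_features and materialized its weight
print(layer.in_features, tuple(layer.weight.shape), tuple(out.shape))
```

After the first call the weight has shape `(1, 2)`, one row per output feature and one column per input feature.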

Finally, we create the *Trainer* class, which manages the training loop for our model. It initializes the number of epochs and configures the optimizer at a specific learning rate.

The model is set to training mode. During each epoch, for every minibatch, the gradients from the previous step are reset, the forward pass computes predictions from the input batch and the current parameters, the loss function calculates the mean squared error between the predictions and the actual values, *loss.backward()* computes the gradient of the loss with respect to each parameter, and the optimizer updates the model's parameters using the computed gradients.
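
A single iteration of this loop can be sketched on its own. The example below is an illustration under stated assumptions, not the notebook's Trainer: it uses a plain `nn.Linear(2, 1)` in place of the lazy layer, a hypothetical noiseless minibatch, and an assumed seed.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # assumed seed

# Hypothetical minibatch drawn from y = Xw + b with w = [2, -3.4], b = 4.2
X_batch = torch.randn(32, 2)
y_batch = X_batch @ torch.tensor([[2.0], [-3.4]]) + 4.2

net = nn.Linear(2, 1)  # plain Linear stand-in for the lazy layer
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

net.train()
optimizer.zero_grad()                          # clear gradients from any previous step
loss_before = loss_fn(net(X_batch), y_batch)   # forward pass and MSE
loss_before.backward()                         # gradients for weight and bias
optimizer.step()                               # SGD parameter update

# Re-evaluate on the same batch without tracking gradients
with torch.no_grad():
    loss_after = loss_fn(net(X_batch), y_batch)

print(f"{loss_before.item():.4f} -> {loss_after.item():.4f}")
```

On the same batch, the loss after the update should be slightly lower than before, because the step size is small relative to the curvature of the squared error.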


In [43]:
import torch

class Trainer:
    def __init__(self, num_epochs=10, lr=0.01):
        self.num_epochs = num_epochs
        self.lr = lr  # use the argument rather than a hard-coded 0.01

    def fit(self, model, dataloader):
        self.model = model
        self.optimizer = self.model.configure_optimizers(lr=self.lr) 
        self.epoch = 0 

        for self.epoch in range(self.num_epochs):
            self.fit_epoch(dataloader)

    def fit_epoch(self, dataloader):
        self.model.train() 
        
        for batch_idx, (X_batch, y_batch) in enumerate(dataloader):
            self.optimizer.zero_grad()
            y_pred = self.model(X_batch)
            loss = self.model.loss(y_pred, y_batch)
            loss.backward()
            self.optimizer.step()
            
            print(f"Epoch {self.epoch+1}/{self.num_epochs}, Batch {batch_idx+1}, Loss: {loss.item():.4f}")

trainer = Trainer(num_epochs=10)
trainer.fit(model, dataloader)


Epoch 1/10, Batch 1, Loss: 28.9584
Epoch 1/10, Batch 2, Loss: 30.6595
Epoch 1/10, Batch 3, Loss: 30.0558
Epoch 1/10, Batch 4, Loss: 22.6798
Epoch 1/10, Batch 5, Loss: 18.0296
Epoch 1/10, Batch 6, Loss: 27.2961
Epoch 1/10, Batch 7, Loss: 28.1786
Epoch 1/10, Batch 8, Loss: 25.4579
Epoch 1/10, Batch 9, Loss: 21.9250
Epoch 1/10, Batch 10, Loss: 22.9026
Epoch 1/10, Batch 11, Loss: 25.8844
Epoch 1/10, Batch 12, Loss: 19.5837
Epoch 1/10, Batch 13, Loss: 21.8024
Epoch 1/10, Batch 14, Loss: 17.1907
Epoch 1/10, Batch 15, Loss: 23.7872
Epoch 1/10, Batch 16, Loss: 19.0034
Epoch 1/10, Batch 17, Loss: 16.1263
Epoch 1/10, Batch 18, Loss: 12.2449
Epoch 1/10, Batch 19, Loss: 18.3186
Epoch 1/10, Batch 20, Loss: 9.3419
Epoch 1/10, Batch 21, Loss: 14.0986
Epoch 1/10, Batch 22, Loss: 12.9609
Epoch 1/10, Batch 23, Loss: 22.3205
Epoch 1/10, Batch 24, Loss: 11.2398
Epoch 1/10, Batch 25, Loss: 14.7712
Epoch 1/10, Batch 26, Loss: 14.0446
Epoch 1/10, Batch 27, Loss: 8.6223
Epoch 1/10, Batch 28, Loss: 16.2983
Epo