In [None]:
import numpy as np
import torch
import torch.optim as optim
import torch.nn as nn

# Chapter 3: A Simple Regression Problem

Now that you've learned PyTorch's basics, it's time to put your knowledge into action :-)

We're using the same synthetic dataset from Challenge #0 (*b = 0.5* and *w = -3* for a **linear regression with a single feature (x)**), but this time you'll be using PyTorch instead of NumPy.

$$
\Large
y = b + w x
$$

## Data Generation

In [None]:
true_b = .5
true_w = -3
N = 100

# Data Generation
np.random.seed(42)
x = np.random.rand(N, 1)
epsilon = (.1 * np.random.randn(N, 1))
y = true_b + true_w * x + epsilon

# Shuffles the indices
idx = np.arange(N)
np.random.shuffle(idx)

# Uses the first 80 shuffled indices for training
train_idx = idx[:int(N*.8)]
# Uses the remaining indices for validation
val_idx = idx[int(N*.8):]

# Generates train and validation sets
x_train, y_train = x[train_idx], y[train_idx]
x_val, y_val = x[val_idx], y[val_idx]

## Data Preparation

Data preparation includes **converting the data points** from NumPy arrays to PyTorch tensors and sending them to the available **device**:

In [None]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Our data is in NumPy arrays, so we convert it to float32
# PyTorch tensors and send them to the chosen device
x_train_tensor = torch.as_tensor(x_train).float().to(device)
y_train_tensor = torch.as_tensor(y_train).float().to(device)
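
The `.float()` call matters because NumPy generates `float64` values by default, while the parameters of `nn.Linear` are `float32` by default. The snippet below is a minimal sketch of what each step of the conversion does; the array `x_np` and the names `x_t` and `x_f` are made up for illustration only:

```python
import numpy as np
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Illustrative array (not the chapter's dataset); NumPy defaults to float64
x_np = np.random.rand(5, 1)

# as_tensor() keeps the NumPy dtype (float64)
x_t = torch.as_tensor(x_np)

# float() casts to float32 and to() places the tensor on the chosen device
x_f = x_t.float().to(device)

print(x_np.dtype, x_t.dtype, x_f.dtype)
print(x_f.shape, x_f.device)
```

Feeding a `float64` tensor to a `float32` model raises a dtype mismatch error, so the cast is not optional here.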

## Model Configuration

Your first task is to define a **model**, an **optimizer**, and a **loss function** to tackle our **linear** regression with a **single input** and **single output**:

Hint: you can use a simple `Sequential` model for this task

In [None]:
n_epochs = 1000

torch.manual_seed(42)

lr = 0.1

# The model must live on the same device as the data
model = nn.Sequential(nn.Linear(1, 1)).to(device)
optimizer = optim.SGD(model.parameters(), lr=lr)
loss_fn = nn.MSELoss()
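
Before training, it can help to check what the model actually contains. The sketch below rebuilds the same `Sequential` model on the CPU and lists its parameters; the variable `n_params` is just an illustrative name:

```python
import torch
import torch.nn as nn

torch.manual_seed(42)
model = nn.Sequential(nn.Linear(1, 1))

# Sequential prefixes each parameter name with the layer's position ("0")
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.shape))

# One weight (w) and one bias (b): two trainable parameters in total
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(n_params)
```

The two entries, `0.weight` and `0.bias`, play the roles of *w* and *b* in the equation at the top of the chapter.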

## Model Training

Your second task is to implement gradient descent steps 1 to 4 using PyTorch:

Hint: you can use the `backward()` method to automatically compute gradients, and you can use the optimizer defined in the previous task to update the parameters.
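
One detail worth keeping in mind: `backward()` **accumulates** gradients instead of overwriting them, which is why gradients need to be zeroed on every iteration. The following is a small sketch using a made-up single-parameter loss (the tensors `w` and `x` below are illustrative and unrelated to the dataset):

```python
import torch

# Toy parameter and input, chosen only to make the arithmetic easy
w = torch.tensor([1.0], requires_grad=True)
x = torch.tensor([2.0])

# loss = (w * x)^2, so d(loss)/dw = 2 * w * x^2 = 8
loss = ((w * x) ** 2).sum()
loss.backward()
first_grad = w.grad.clone()

# A second backward pass WITHOUT zeroing adds to the stored gradient
loss = ((w * x) ** 2).sum()
loss.backward()
second_grad = w.grad.clone()

print(first_grad, second_grad)

# Zeroing resets the stored gradient; optimizer.zero_grad() plays this
# role for every parameter the optimizer manages
w.grad.zero_()
```

In the training loop, this reset is done by calling `optimizer.zero_grad()` once per epoch.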

In [None]:
n_epochs = 1000

for epoch in range(n_epochs):
    
    model.train()
    
    # Step 1 - Computes model's predicted output - forward pass
    yhat = model(x_train_tensor)
    
    # Step 2 - Computes the loss (predictions first, then targets)
    loss = loss_fn(yhat, y_train_tensor)
    
    # Step 3 - Computes gradients for both "b" and "w" parameters
    loss.backward()
    
    # Step 4 - Updates parameters using gradients and the learning rate,
    # then zeroes the gradients so they don't accumulate across epochs
    optimizer.step()
    optimizer.zero_grad()
    
print(model.state_dict())
print(loss)

OrderedDict([('0.weight', tensor([[-3.0310]])), ('0.bias', tensor([0.5235]))])
tensor(0.0080, grad_fn=<MseLossBackward>)


Congratulations! Your PyTorch model learned values for *b* (0.5235) and *w* (-3.0310) that are **really close** to their true values (0.5 and -3), just like your previous NumPy version.
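
The validation split from the data generation step is not used in this chapter's tasks. As a possible next step, the sketch below repeats the whole pipeline in a self-contained form and then computes the loss on the held-out points. The helper `to_tensor` and the names `x_val_tensor`, `y_val_tensor`, and `val_loss` are assumptions introduced here for illustration:

```python
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

# Same synthetic data and 80/20 split as in the Data Generation cell
true_b, true_w, N = .5, -3, 100
np.random.seed(42)
x = np.random.rand(N, 1)
y = true_b + true_w * x + .1 * np.random.randn(N, 1)
idx = np.arange(N)
np.random.shuffle(idx)
train_idx, val_idx = idx[:int(N*.8)], idx[int(N*.8):]

device = 'cuda' if torch.cuda.is_available() else 'cpu'

def to_tensor(a):
    # Hypothetical helper: NumPy array -> float32 tensor on the device
    return torch.as_tensor(a).float().to(device)

x_train_tensor, y_train_tensor = to_tensor(x[train_idx]), to_tensor(y[train_idx])
x_val_tensor, y_val_tensor = to_tensor(x[val_idx]), to_tensor(y[val_idx])

# Same configuration as in the Model Configuration cell
torch.manual_seed(42)
model = nn.Sequential(nn.Linear(1, 1)).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for epoch in range(1000):
    model.train()
    loss = loss_fn(model(x_train_tensor), y_train_tensor)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Evaluation: no parameter updates, so gradient tracking is switched off
model.eval()
with torch.no_grad():
    val_loss = loss_fn(model(x_val_tensor), y_val_tensor).item()

print(f'train loss: {loss.item():.4f} | validation loss: {val_loss:.4f}')
```

A validation loss of the same order as the training loss suggests the fitted line generalizes to points the model has not seen.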