# Exercise: Implementing a Simple Linear Regression Model with PyTorch

## Introduction

In this exercise, we will implement a simple linear regression model using PyTorch, a popular open-source machine learning library. Linear regression is a fundamental machine learning algorithm that predicts a target (dependent) variable by fitting the best linear relationship between it and one or more independent variables.

## Requirements

1. Import the necessary libraries.
2. Prepare the data: Create tensors for our input data (x) and output data (y). For simplicity, we will use a simple linear relationship y = 2x + 1.
3. Initialize the parameters: Initialize our weight (w) and bias (b) using `torch.nn.Parameter`. `torch.nn.Parameter` is a `Tensor` subclass that has `requires_grad=True` by default and, when assigned as an attribute of a `torch.nn.Module`, is automatically added to that module's list of parameters.
4. Define the model: Our model is a simple linear model: y = wx + b.
5. Define the loss function: We will use Mean Squared Error (MSE) as our loss function.
6. Train the model: Train our model for 100 epochs. In each epoch, perform the following steps:
    - Forward pass: Compute predicted y by passing x to the model.
    - Compute and print loss.
    - Use autograd to compute the backward pass. This call will compute the gradient of loss with respect to all Tensors with `requires_grad=True`.
    - Update weights using gradient descent. We need to use `torch.no_grad()` to prevent tracking history (and consuming memory).
    - Manually zero the gradients after updating weights.
7. Print out the final values of w and b.
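
The backward pass, the update, and the zeroing of gradients in step 6 are easiest to see on a single scalar. The following is a minimal sketch and not part of the exercise solution: the parameter `p`, the learning rate `lr`, and the toy loss `(3p)^2` are assumptions chosen only for illustration. It shows that `.backward()` adds to `.grad` instead of overwriting it, which is why the gradients are zeroed after each update.

```python
import torch

# Hypothetical scalar parameter and toy loss, chosen only for illustration.
p = torch.nn.Parameter(torch.tensor(1.0))
lr = 0.1  # assumed learning rate

loss = (3 * p) ** 2      # d(loss)/dp = 18 * p, which is 18 at p = 1
loss.backward()
first_grad = p.grad.clone()

# A second backward pass without zeroing adds to the stored gradient.
loss = (3 * p) ** 2
loss.backward()
accumulated_grad = p.grad.clone()

# Update inside no_grad so the in-place step is not recorded by autograd,
# then reset the gradient for the next iteration.
with torch.no_grad():
    p -= lr * first_grad
    p.grad.zero_()

print(first_grad.item(), accumulated_grad.item(), p.item(), p.grad.item())
```

The second gradient is twice the first because the two backward passes were summed; after `zero_()` the stored gradient is back at zero.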

By the end of this exercise, you should be able to understand the basic workflow of creating and training a simple linear regression model with PyTorch.
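
For requirement 3, it can help to see what `torch.nn.Parameter` adds over a plain tensor. The sketch below is illustrative only: the names `p_demo`, `plain`, and the `TinyLinear` class are hypothetical and are not needed for the exercise, where `w` and `b` are standalone parameters outside any module.

```python
import torch

# A Parameter is a Tensor subclass that requires gradients by default.
p_demo = torch.nn.Parameter(torch.randn(1))
plain = torch.randn(1)
print(isinstance(p_demo, torch.Tensor), p_demo.requires_grad, plain.requires_grad)


# Hypothetical module, used only to show automatic registration.
class TinyLinear(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.zeros(1))  # registered as a parameter
        self.b = torch.nn.Parameter(torch.zeros(1))  # registered as a parameter
        self.scratch = torch.zeros(1)                # plain tensor, not registered


names = [name for name, _ in TinyLinear().named_parameters()]
print(names)
```

Only the attributes wrapped in `torch.nn.Parameter` show up in `named_parameters()`; the plain tensor does not.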

In [None]:
# Import the necessary libraries
import torch
import matplotlib.pyplot as plt

In [None]:
# We will create a tensor for our input data (x) and output data (y)
# For simplicity, we will use a simple linear relationship y = 2x + 1
x = torch.arange(-10, 10, 0.1).reshape(-1, 1)
y = 2 * x + 1

In [None]:
# We will initialize our weight (w) and bias (b) using torch.nn.Parameter
# torch.nn.Parameter is a Tensor subclass with requires_grad=True by default;
# inside a torch.nn.Module it is automatically added to the module's parameters
# TODO: Initialize the weight and bias parameters (w and b)
w =
b =

In [None]:
# Our model is a simple linear model: y = wx + b
def model(x):
    return w * x + b

The Mean Squared Error (MSE) cost function is defined as:

$$
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
$$

Where:
- $n$ is the number of observations
- $y_i$ is the actual value
- $\hat{y}_i$ is the predicted value
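
One way to check a hand-written loss function is to compare it against PyTorch's built-in `torch.nn.functional.mse_loss` on a few values whose MSE can be worked out by hand. The tensors below are made-up illustration data, not part of the exercise: the errors are 1, 0, and -2, so the formula gives (1 + 0 + 4) / 3 = 5/3.

```python
import torch
import torch.nn.functional as F

# Made-up values for illustration: the errors y - y_hat are 1, 0 and -2.
y_true_demo = torch.tensor([[3.0], [5.0], [7.0]])
y_pred_demo = torch.tensor([[2.0], [5.0], [9.0]])

# Built-in MSE with the default reduction="mean": (1 + 0 + 4) / 3.
reference = F.mse_loss(y_pred_demo, y_true_demo)
print(reference.item())
```

If your own implementation returns the same value for `y_pred_demo` and `y_true_demo`, it agrees with the formula above.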

In [None]:
# We will use Mean Squared Error (MSE) as our loss function
def loss_fn(y_pred, y_true):
    # TODO: Compute the MSE loss
    # Note: do not loop over the samples; use PyTorch operations that work on whole tensors instead
    loss =
    
    return loss


Gradient descent is an iterative optimization algorithm. Here we use it to find the parameters (weight and bias) that minimize the cost function of our linear regression model.

Given a cost function defined as the Mean Squared Error (MSE):

$$
J(w, b) = \frac{1}{n} \sum_{i=1}^{n} (y^{(i)} - (wx^{(i)} + b))^2
$$

Where:
- $n$ is the number of observations
- $y^{(i)}$ is the actual value for the $i$-th observation
- $x^{(i)}$ is the input value for the $i$-th observation
- $w$ is the weight
- $b$ is the bias

The goal of gradient descent is to find the values of $w$ and $b$ that minimize the cost function $J(w, b)$.

The update rules for $w$ and $b$ are:

$$
w := w - \alpha \frac{\partial J(w, b)}{\partial w}
$$

$$
b := b - \alpha \frac{\partial J(w, b)}{\partial b}
$$

Where:
- $\alpha$ is the learning rate
- The partial derivatives are the gradients of the cost function with respect to $w$ and $b$.
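
For the MSE cost above, the two partial derivatives follow from the chain rule. They are written out here for reference only; in the exercise, autograd computes them for you:

$$
\frac{\partial J(w, b)}{\partial w} = -\frac{2}{n} \sum_{i=1}^{n} x^{(i)} \left( y^{(i)} - (wx^{(i)} + b) \right)
$$

$$
\frac{\partial J(w, b)}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} \left( y^{(i)} - (wx^{(i)} + b) \right)
$$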

Each gradient is the slope of the cost function with respect to one parameter at the current point, and it points in the direction of steepest increase. The algorithm therefore subtracts the gradient, scaled by the learning rate, from each parameter, so that every step moves downhill toward a lower cost.
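
To see that autograd produces the same numbers as the calculus, the sketch below compares `.grad` with the analytic MSE gradients on a tiny made-up dataset and then applies one update step. All names (`x_demo`, `w_demo`, `alpha`, and so on) and the starting values are assumptions for illustration, and the built-in `torch.nn.functional.mse_loss` stands in for the loss function you write yourself.

```python
import torch
import torch.nn.functional as F

# Made-up data following y = 2x + 1, and arbitrary starting parameters.
x_demo = torch.tensor([[0.0], [1.0], [2.0]])
y_demo = 2 * x_demo + 1
w_demo = torch.tensor([0.5], requires_grad=True)
b_demo = torch.tensor([0.0], requires_grad=True)

# Forward pass and MSE cost J(w, b), then backward pass via autograd.
cost = F.mse_loss(w_demo * x_demo + b_demo, y_demo)
cost.backward()

# Analytic gradients of the MSE cost:
#   dJ/dw = -(2/n) * sum(x * residual),  dJ/db = -(2/n) * sum(residual)
with torch.no_grad():
    residual = y_demo - (w_demo * x_demo + b_demo)
    grad_w = -2 * (x_demo * residual).mean()
    grad_b = -2 * residual.mean()

print(w_demo.grad.item(), grad_w.item())
print(b_demo.grad.item(), grad_b.item())

# One gradient descent step with an assumed learning rate.
alpha = 0.01
with torch.no_grad():
    w_demo -= alpha * w_demo.grad
    b_demo -= alpha * b_demo.grad

print(w_demo.item(), b_demo.item())
```

Both gradients are negative here, so subtracting them increases `w_demo` and `b_demo`, moving them toward the true values 2 and 1.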


In [None]:
# Store the loss values for each epoch
loss_values = []

# We will train our model for 100 epochs
for epoch in range(100):
    # Forward pass: Compute predicted y by passing x to the model
    # TODO: Compute the predicted y using the model defined above
    y_pred =

    # Compute loss
    # TODO: Compute the loss using the loss function defined above
    loss =
    
    # Append the loss value to the list
    loss_values.append(loss.item())
    
    # Print the loss every 10 epochs
    if epoch % 10 == 0:
        print(f'Epoch {epoch}: {loss.item()}')

    # Use autograd to compute the backward pass. This call will compute the gradient of loss with respect to all Tensors with requires_grad=True.
    # After this call w.grad and b.grad will be Tensors holding the gradient of the loss with respect to w and b respectively
    # TODO: Compute the gradients of w and b using autograd
    

    # Update weights using gradient descent. We need to use torch.no_grad() to prevent tracking history (and consuming memory)
    with torch.no_grad():
        w -= w.grad * 0.01
        b -= b.grad * 0.01

        # Manually zero the gradients after updating weights
        # The reason for this is that .backward() accumulates the gradients on subsequent backward passes
        w.grad.zero_()
        b.grad.zero_()

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(loss_values)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()

In [None]:
# Print the learned parameters
print(f'w = {w.item()}, b = {b.item()}')

In conclusion, this exercise set up and trained a linear regression model in a few lines of PyTorch code: autograd computed the gradients of the loss with respect to `w` and `b`, and a manual gradient descent loop used them to update the parameters. The same cycle of forward pass, loss, backward pass, and update underlies the training of larger models, which is why automatic differentiation and a Pythonic API make PyTorch a widely used tool for both beginners and experts in deep learning.