# Predict points in a Line

**Learning Source:**

freeCodeCamp.org : PyTorch for Deep Learning & Machine Learning – Full Course 

This notebook is an implementation of linear regression, following the tutorial listed above.

**References:**
- Link to resource: https://www.youtube.com/watch?v=V_xro1bcAuA
- https://pytorch.org/tutorials/beginner/ptcheat.html

## 01. Setup Environment

### 01.01 Setting up environment

In [None]:
# install basic packages required for the project
%conda install numpy pandas matplotlib

In [None]:
%conda install pytorch -c pytorch

### 01.02 Importing Libraries to python

In [None]:
import torch
from torch import nn

import matplotlib.pyplot as plt
import numpy as np

torch.__version__

### 01.03 Setting up Device Agnostic code and device selection

In [None]:
# Set up device-agnostic code
print("Setting up device-agnostic code ...")
if torch.cuda.is_available():               # Check if CUDA is available
    device = torch.device("cuda")           # Set device to CUDA
elif torch.backends.mps.is_available():     # Check if MPS is available (torch.backends.mps also exists in older PyTorch releases)
    device = torch.device("mps")            # Set device to MPS
else:                                       # Default device selection
    device = torch.device("cpu")            # Fall back to CPU, the default behaviour

print(f"Selected device for processing : {device}")

## 02. Data Preparation and Loading

Machine learning is a game of 2 parts:
1. Get data into a numerical representation
2. Build a model to learn patterns in the numerical representation


To showcase this, let's create some data using the linear regression formula:

$$Y_i = f(X_i, \beta) + e_i$$

where:
* $Y_i$ = dependent variable
* $f$ = function
* $X_i$ = independent variable
* $\beta$ = unknown parameters (beta)
* $e_i$ = error terms

We'll use a linear regression formula to make a straight line with known **parameters**.

### 02.01 Preparing (Generating) data

In [None]:
# Let us create some data using the linear formula y = m * x + c
m = 0.7
c = 0.3

# Create range values already normalized to be between 0 and 1
start = 0
end = 1
step = 0.02

# Create Features and Labels for our dataset
X = torch.arange(start, end, step).unsqueeze(dim=1) # Features: unsqueeze the [num_samples] tensor to shape [num_samples, 1], i.e. one feature per sample
y = m * X + c   # Labels: the dependent variable y, computed from the features, i.e. the independent variable X


### 02.02 Train Test Split data

In [None]:
# Let's split our dataset into training and testing sets

# We will use 80% of the data for training and 20% for testing
train_split = int(0.8*len(X))

# Split our dataset into train and test sets
X_train, y_train = X[:train_split], y[:train_split] # Prepare our training data
X_test, y_test = X[train_split:], y[train_split:]   # Prepare our testing data

Let us visualize our data in a plot. The `plot_predictions` helper uses matplotlib to plot the training and testing data.

In [None]:
#installing required packages for helper_functions.py
%conda install requests

In [None]:
%conda install torchvision -c pytorch

In [None]:
# Fix issues: ImportError: cannot import name 'is_directory' from 'PIL._util' for torchvision
%conda install pillow

In [None]:
from helpers.helper_functions import plot_predictions, plot_loss_curves, accuracy_close_fn

In [None]:
plot_predictions(X_train, y_train, X_test, y_test)

## 03. Build a Model

Here we will build a simple linear regression model that inherits from `nn.Module`. This is a basic implementation with limited functionality.  
What does our model do?
* Start with arbitrary initial values for the parameters `wt` and `bias`
* Look at the training data and adjust those values to better represent the ideal values.  

How does the model achieve this? Through two main algorithms:
- Gradient descent
- Backpropagation
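
As a minimal sketch of these two algorithms, the snippet below fits a weight and a bias to points on the line y = 0.7x + 0.3 without using `nn.Module` or an optimizer. The names `w`, `b`, the learning rate `lr`, and the step count are illustrative assumptions, not values used elsewhere in this notebook.

```python
import torch

torch.manual_seed(0)

# Toy data on the line y = 0.7 * x + 0.3
x = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = 0.7 * x + 0.3

# Start from a random weight and a zero bias
w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

lr = 0.1  # illustrative step size (assumption)
for step in range(2000):
    loss = ((w * x + b - y) ** 2).mean()  # mean squared error of the current line
    loss.backward()                       # backpropagation: fills w.grad and b.grad
    with torch.no_grad():                 # gradient descent: step against the gradient
        w -= lr * w.grad
        b -= lr * b.grad
    w.grad.zero_()                        # reset gradients for the next step
    b.grad.zero_()

print(w.item(), b.item())  # both should end up close to 0.7 and 0.3
```

Backpropagation supplies the gradients and gradient descent uses them to update the parameters; the rest of this notebook delegates both to `loss.backward()` and an optimizer.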

#### PyTorch model building essentials

* torch.nn - contains all of the building blocks for computational graphs (a neural network can be considered a computational graph)
* torch.nn.Parameter - the parameters our model should try to learn; often a PyTorch layer from torch.nn will set these for us
* torch.nn.Module - the base class for all neural network modules; if you subclass it, you should override forward()
* torch.optim - where the optimizers in PyTorch live; they help with gradient descent
* def forward() - all nn.Module subclasses require you to override forward(); this method defines what happens in the forward computation

https://pytorch.org/tutorials/beginner/ptcheat.html
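
To illustrate the point that a layer from `torch.nn` sets up its own parameters, here is a small sketch that inspects a standalone `nn.Linear` layer. The variable names `layer`, `param_names`, and `out` are chosen for illustration only.

```python
import torch
from torch import nn

# A linear layer with one input feature and one output feature
layer = nn.Linear(in_features=1, out_features=1)

# The layer registers its learnable parameters for us
param_names = [name for name, _ in layer.named_parameters()]
print(param_names)  # ['weight', 'bias']

# Calling the layer runs its forward() computation
out = layer(torch.zeros(4, 1))
print(out.shape)  # torch.Size([4, 1])
```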

In [None]:
import numpy as np

from torch import nn

class LinearRegressionBasicModelV1(nn.Module):                        # inherit nn.Module
    def __init__(self):
        super().__init__()
        # We have only 1 feature in X and we want to predict Y so 1 output neuron
        self.wt = nn.Parameter(torch.randn(1,                       # Initialize wt with a random value; 1 is the number of features in our dataset
                                           requires_grad=True,      # True means the gradient will be calculated for this parameter
                                           dtype=torch.float))      # Make sure the datatype of the wt parameter is float
        self.bias = nn.Parameter(torch.zeros(1,                     # Initialize bias with zero; a single bias for the single output
                                             requires_grad=True,    # True means the gradient will be calculated for this parameter
                                             dtype=torch.float))    # Make sure the datatype of the bias parameter is float
    def forward(self, x: torch.Tensor) -> torch.Tensor:             # Define the forward pass of our model, takes in a tensor and returns a tensor
        '''
           Forward pass through the network.
           Args:
               x (torch.Tensor): Input tensor of shape (batch_size, 1).
           Returns:
               torch.Tensor: Output tensor of shape (batch_size, 1).
        '''
        return self.wt * x + self.bias

class LinearRegressionBasicModelV2(nn.Module):
    def __init__(self):
        super().__init__()
        # We have only 1 feature in X and we want to predict Y so 1 output neuron
        self.linear_layer = nn.Linear(in_features=1, out_features=1) # Define a linear layer with 1 input and 1 output
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        '''
        Forward pass through the network.
           Args:
               x (torch.Tensor): Input tensor of shape (batch_size, 1).
           Returns:
               torch.Tensor: Output tensor of shape (batch_size, 1).
        '''
        return self.linear_layer(x)


In [None]:
torch.manual_seed(42)
model_0 = LinearRegressionBasicModelV1()
print("Linear Regression Model 0:")
print("Parameters:")
print(list(model_0.parameters()))
print("StateDict:")
print(model_0.state_dict())

model_1 = LinearRegressionBasicModelV2()
print("Linear Regression Model 1:")
print("Parameters:")
print(list(model_1.parameters()))
print("StateDict:")
print(model_1.state_dict())

In [None]:
#set the model to use the target device
model_0.to(device)
model_1.to(device)

# put data on the target device
X_train = X_train.to(device)
y_train = y_train.to(device)
X_test = X_test.to(device)
y_test = y_test.to(device)


In [None]:
# Make predictions with the model (making predictions is called inference).
# inference_mode does not track gradients, so it uses less memory and runs faster.
# torch.no_grad() does something similar, but inference_mode is preferred.
with torch.inference_mode():
    y_pred_0 = model_0(X_test)
    y_pred_1 = model_1(X_test)
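
A short sketch of what "does not track gradients" means in practice, using a throwaway `nn.Linear` layer rather than the models above (the names `layer`, `tracked`, and `untracked` are illustrative):

```python
import torch
from torch import nn

layer = nn.Linear(in_features=1, out_features=1)
x = torch.ones(2, 1)

tracked = layer(x)                  # normal call: output is attached to the autograd graph
with torch.inference_mode():
    untracked = layer(x)            # inference mode: no graph is recorded

print(tracked.requires_grad)        # True
print(untracked.requires_grad)      # False
```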

In [None]:
plot_predictions(X_train.cpu(), y_train.cpu(), X_test.cpu(), y_test.cpu(), predictions = y_pred_0.cpu())

In [None]:
plot_predictions(X_train.cpu(), y_train.cpu(), X_test.cpu(), y_test.cpu(), predictions = y_pred_1.cpu())

## 04. Training

The idea is to move from some unknown parameters to known parameters, i.e. from a poor representation of the data to a better one.

### 04.01 Setting up HyperParameter, Loss Function and Optimizer

One way to measure how poor or how wrong your model's predictions are is to use a loss function, also known as a criterion or cost function.

Things we need for training:

* Loss function: a function that measures how wrong our model's predictions are.
* Optimizer: takes into account the loss of a model and adjusts the model's parameters to reduce the loss.
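
As a small worked example with made-up numbers, the sketch below computes the mean squared error by hand and with `nn.MSELoss`, then performs a single SGD step on a one-element parameter. The tensors and the learning rate of 0.1 are illustrative assumptions.

```python
import torch
from torch import nn

# Loss function: mean of the squared differences
y_true = torch.tensor([[1.0], [2.0], [3.0]])
y_pred = torch.tensor([[1.5], [2.0], [2.0]])

manual_mse = ((y_pred - y_true) ** 2).mean()   # (0.25 + 0.0 + 1.0) / 3
builtin_mse = nn.MSELoss()(y_pred, y_true)
print(manual_mse.item(), builtin_mse.item())   # the two values agree

# Optimizer: one SGD step moves the parameter against its gradient
w = nn.Parameter(torch.tensor([1.0]))
optimizer = torch.optim.SGD([w], lr=0.1)
loss = (w ** 2).sum()                          # d(loss)/dw = 2 * w = 2.0
loss.backward()
optimizer.step()                               # w <- 1.0 - 0.1 * 2.0
print(w.item())                                # approximately 0.8
```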

In [None]:
lr = 0.01                                                   # learning rate
loss_fn = nn.MSELoss()                                      # setting up loss function as mean squared error 
optimizer_0 = torch.optim.SGD(params=model_0.parameters(),  # SGD is a stochastic gradient descent optimizer
                              lr=lr)

optimizer_1 = torch.optim.SGD(params=model_1.parameters(),
                              lr=lr)

### 04.02 Training Model

#### Building a training & testing loop in PyTorch
Okay, now we've got a loss function and optimizer ready to go, let's train a model.

Steps in training:

<details>
    <summary>PyTorch training loop steps</summary>
    <ol>
        <li><b>Forward pass</b> - The model goes through all of the training data once, performing its
            <code>forward()</code> function
            calculations (<code>model(x_train)</code>).
        </li>
        <li><b>Calculate the loss</b> - The model's outputs (predictions) are compared to the ground truth and evaluated
            to see how
            wrong they are (<code>loss = loss_fn(y_pred, y_train)</code>).</li>
        <li><b>Zero gradients</b> - The optimizer's gradients are set to zero (they are accumulated by default) so they
            can be
            recalculated for the specific training step (<code>optimizer.zero_grad()</code>).</li>
        <li><b>Perform backpropagation on the loss</b> - Computes the gradient of the loss with respect to every model
            parameter to
            be updated (each parameter
            with <code>requires_grad=True</code>). This is known as <b>backpropagation</b>, hence "backwards"
            (<code>loss.backward()</code>).</li>
        <li><b>Step the optimizer (gradient descent)</b> - Update the parameters with <code>requires_grad=True</code>
            with respect to the loss
            gradients in order to improve them (<code>optimizer.step()</code>).</li>
    </ol>
</details>
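
The "accumulated by default" remark in the zero-gradients step can be seen in a tiny sketch. The tensor `w` and the toy expression `3 * w` are assumptions chosen so the gradient is always 3.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

(3 * w).sum().backward()
first = w.grad.clone()          # tensor([3.])

(3 * w).sum().backward()        # no zeroing: the new gradient is added to the old one
accumulated = w.grad.clone()    # tensor([6.])

w.grad.zero_()                  # what optimizer.zero_grad() achieves for its parameters
(3 * w).sum().backward()
after_zero = w.grad.clone()     # tensor([3.])

print(first, accumulated, after_zero)
```

Without the zeroing step, each call to `backward()` would add to the previous gradients and the parameter updates would no longer reflect the current loss alone.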



#### Model: LinearRegressionBasicModelV1 

In [None]:
epochs = 5000

#track different metrics
results_0 = {
    "train_loss": [],
    "train_acc": [],
    "test_loss": [],
    "test_acc": []
    }

# Train model LinearRegressionBasicModelV1
for epoch in range(epochs):
    # Training Loop
    model_0.train()                                 # Set model to training mode
    y_pred_0 = model_0(X_train)                     # Forward Pass
    train_loss = loss_fn(y_pred_0, y_train)         # Calculate training loss
    train_acc = accuracy_close_fn(y_train,          # Calculate training accuracy
                                  y_pred_0,
                                  rtol=0.01,
                                  atol=0.01)
    optimizer_0.zero_grad()                         # Zero Grad
    train_loss.backward()                           # Backpropagation
    optimizer_0.step()                              # Gradient Descent

    # Testing Loop
    model_0.eval()                                  # Set model to evaluation mode
    with torch.inference_mode():                    # Disable gradient calculation for performance reasons
        test_pred = model_0(X_test)                 # Make Prediction using the trained model
        test_loss = loss_fn(test_pred, y_test)      # Calculate model loss
        test_acc = accuracy_close_fn(y_test,        # Calculate model accuracy
                                     test_pred,
                                     rtol=0.01,
                                     atol=0.01)
    
    if epoch %10 == 0:
        results_0["train_loss"].append(train_loss.cpu().detach().numpy())
        results_0['train_acc'].append(train_acc)
        results_0["test_loss"].append(test_loss.cpu().detach().numpy())
        results_0["test_acc"].append(test_acc)
        print(f"Epoch: {epoch} | Train Loss: {train_loss:.4f} | Train Acc: {train_acc} | Test Loss: {test_loss:.4f} | Test Acc: {test_acc}")

#### Model V1 Training Evaluation

In [None]:
plot_loss_curves(results=results_0)

In [None]:
plot_predictions(X_train.cpu(), y_train.cpu(), X_test.cpu(), y_test.cpu(), predictions = test_pred.cpu())

#### Model: LinearRegressionBasicModelV2

In [None]:
epochs = 5000

#track different metrics
results_1 = {
    "train_loss": [],
    "train_acc": [],
    "test_loss": [],
    "test_acc": []
    }

# Train model LinearRegressionBasicModelV2
for epoch in range(epochs):
    # Training Loop
    model_1.train()                                 # Set model to training mode
    y_pred_1 = model_1(X_train)                     # Forward Pass
    train_loss = loss_fn(y_pred_1, y_train)         # Calculate training loss
    train_acc = accuracy_close_fn(y_train,          # Calculate training accuracy
                                  y_pred_1,
                                  rtol=0.01,
                                  atol=0.01)
    optimizer_1.zero_grad()                         # Zero Grad
    train_loss.backward()                           # Backpropagation
    optimizer_1.step()                              # Gradient Descent

    # Testing Loop
    model_1.eval()                                  # Set model to evaluation mode
    with torch.inference_mode():                    # Disable gradient calculation for performance reasons
        test_pred = model_1(X_test)                 # Make Prediction using the trained model
        test_loss = loss_fn(test_pred, y_test)      # Calculate model loss
        test_acc = accuracy_close_fn(y_test,        # Calculate model accuracy
                                     test_pred,
                                     rtol=0.01,
                                     atol=0.01)
    
    if epoch %10 == 0:
        results_1["train_loss"].append(train_loss.cpu().detach().numpy())
        results_1['train_acc'].append(train_acc)
        results_1["test_loss"].append(test_loss.cpu().detach().numpy())
        results_1["test_acc"].append(test_acc)
        print(f"Epoch: {epoch} | Train Loss: {train_loss:.4f} | Train Acc: {train_acc} | Test Loss: {test_loss:.4f} | Test Acc: {test_acc}")

#### Model V2 Training Results Evaluation

In [None]:
plot_loss_curves(results=results_1)

In [None]:
plot_predictions(X_train.cpu(), y_train.cpu(), X_test.cpu(), y_test.cpu(), predictions = test_pred.cpu())

## 05. Storing & Loading Model

For saving and loading models in PyTorch, there are three main methods we should know about:

1. `torch.save()` - allows you to save a PyTorch object in Python's pickle format.
2. `torch.load()` - allows you to load a saved PyTorch object.
3. `torch.nn.Module.load_state_dict()` - allows you to load a saved state dictionary into a model.

You can either save just the model's state dictionary or save and load the whole model. Both approaches have pros and cons.
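
To make the state-dictionary option concrete, here is a minimal round-trip sketch that saves to an in-memory buffer instead of a file. The throwaway `nn.Linear` models and the names `buffer`, `restored`, and `same` are illustrative assumptions.

```python
import io

import torch
from torch import nn

model = nn.Linear(in_features=1, out_features=1)

# A state dict maps parameter names to tensors
state = model.state_dict()
print(list(state.keys()))  # ['weight', 'bias']

# Save the state dict, then load it into a fresh instance of the same class
buffer = io.BytesIO()
torch.save(state, buffer)
buffer.seek(0)

restored = nn.Linear(in_features=1, out_features=1)
restored.load_state_dict(torch.load(buffer))

# The restored model holds exactly the same parameter values
same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(same)  # True
```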

### 05.01 Saving Model state to file

In [None]:
from pathlib import Path
MODEL_PATH = Path("saved_models")
MODEL_PATH.mkdir(parents=True, exist_ok=True)

MODEL_0_NAME = "001_LinearRegressionBasicModelV1.pth"
MODEL_0_SAVE_PATH = MODEL_PATH / MODEL_0_NAME

MODEL_1_NAME = "001_LinearRegressionBasicModelV2.pth"
MODEL_1_SAVE_PATH = MODEL_PATH / MODEL_1_NAME

print(f"Saving model LinearRegressionBasicModelV1 to : {MODEL_0_SAVE_PATH}")
torch.save(obj=model_0.state_dict(),
           f=MODEL_0_SAVE_PATH)
print("Done")

print(f"Saving model LinearRegressionBasicModelV2 to : {MODEL_1_SAVE_PATH}")
torch.save(obj=model_1.state_dict(),    # save model_1's state dict (not model_0's) under the V2 file name
           f=MODEL_1_SAVE_PATH)
print("Done")

### 05.02 Loading Model state from file

Since we saved the model's `state_dict()` rather than the entire model, we create a new instance of the model class and load the saved `state_dict()` into that new instance.

In [None]:
loaded_model_0 = LinearRegressionBasicModelV1()                 # Create an instance of the model
loaded_model_0.load_state_dict(torch.load(f=MODEL_0_SAVE_PATH)) # Load the saved state dict

loaded_model_0.to(device)

### 05.03 Testing loaded model

In [None]:
# make prediction from loaded model
loaded_model_0.eval()
with torch.inference_mode():
  loaded_model_preds = loaded_model_0(X_test)

#make prediction from old model
model_0.eval()
with torch.inference_mode():
  model_0_preds = model_0(X_test)

In [None]:
torch.eq(loaded_model_preds, model_0_preds).all()               # check if predictions are the same

## 06. Conclusion

In this example we saw two different ways to use PyTorch to create a linear regression model. One method defined the parameters ourselves, while the other used the built-in nn.Linear as a layer in our custom model. We also looked at the different stages in building and training a model. We used device-agnostic code to run our training calculations on GPUs where available. We used the MSE (Mean Squared Error) loss function to calculate the loss and SGD (Stochastic Gradient Descent) as the optimizer for a linear regression problem. Lastly, we looked at how to save our model's state in PyTorch format so that it can be loaded later.  

As this was a simple linear regression example with a single independent variable, the model was simple in many ways, including the dimensional requirements for the feature data. A next step from this example would be building a regression model with multiple independent variables, called a multiple linear regression model.  

Further steps to learn would be tuning hyperparameters and trying different optimization algorithms, such as gradient descent and stochastic gradient descent, on the models built here.