In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# WorkFlow

<img src="Workflow.png" alt="The WorkFlow" width="1000">

## The Objectives

`torch.nn` gives us the fundamental building blocks (layers) for neural networks. The workflow has five steps:

1.  Prepare and load data
2.  Make the model
3.  Train the model
4.  Evaluate the model
5.  Save and load the model



We will create a neural network for linear regression.

In other words, we give the network some points that lie on a straight line and let it learn that line, so that it can predict the next points on it.

## Prepare and Load Data

Data can be almost anything, as long as it can be put into a numerical representation:

*   Spread Sheet
*   Images
*   Videos
*   Audios
*   Text
*   And More...

We can then build a model that finds patterns in those numerical representations and generalizes them to new data.
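As a small illustration of what "numerical representation" means, here is a sketch that turns a sentence into a tensor of integers. The sentence and the word-to-index scheme are made up for this example; real projects would normally use a proper tokenizer.

```python
import torch

# Hypothetical example text
sentence = "the cat sat on the mat"
words = sentence.split()

# Give every distinct word an integer id (alphabetical order, an arbitrary choice)
vocab = {word: idx for idx, word in enumerate(sorted(set(words)))}

# The sentence is now a tensor of ids that a model can work with
encoded = torch.tensor([vocab[word] for word in words])
print(vocab)
print(encoded)
```

The same idea applies to images (pixel values) and audio (sample amplitudes): once the data is a tensor, the rest of the workflow stays the same.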

### Creating Data

In this case, we generate data points on a straight line (y = ax + b), where a is the weight and b is the bias.

In [None]:
#Known parameters of the line y = weight * x + bias
weight = 0.7
bias = 0.3

#Creating the x values as a column tensor of shape (N, 1)
start = 0
end = 1
step = 0.02
x = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * x + bias

x[0:10], y[0:10]

### Splitting Training and Testing Data

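The data we created is divided into two sets: a training set and a testing set.

The model learns its parameters from the training set. The testing set is held back and only used to check how well the model does on data it has never seen.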

In [None]:
#Split ratio between the training and testing sets (80% / 20%)
#Everything before index [split] goes into the training set, everything from [split] onward goes into the testing set

split = int(len(x) * 0.8)
x_train, y_train = x[:split], y[:split]
x_test, y_test = x[split:], y[split:]
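The slice above keeps the points in order, so the testing set is the right-most 20% of the line. A shuffled split is a common alternative. The sketch below is one way to do it with `torch.randperm`; the `_demo` and `_shuffled` names and the seed value are assumptions made for this example, not part of the notebook's workflow.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed only so the shuffle is repeatable

# Same kind of synthetic line as above: y = 0.7 * x + 0.3
x_demo = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y_demo = 0.7 * x_demo + 0.3

# Shuffle the indices, then cut them 80/20
indices = torch.randperm(len(x_demo))
split_demo = int(len(x_demo) * 0.8)
train_idx, test_idx = indices[:split_demo], indices[split_demo:]

# Index x and y with the same indices so the pairs stay aligned
x_train_shuffled, y_train_shuffled = x_demo[train_idx], y_demo[train_idx]
x_test_shuffled, y_test_shuffled = x_demo[test_idx], y_demo[test_idx]

print(len(x_train_shuffled), len(x_test_shuffled))
```

For this notebook the ordered split is kept, because it makes the plots easy to read: the model has to extend the line beyond the points it trained on.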

In [None]:
#Visualizing data using matplotlib

def plot_predictions(train_x=x_train, train_y=y_train, test_x=x_test, test_y=y_test, predictions=None):
  plt.figure(figsize=(10, 7))

  #plot training data
  plt.scatter(train_x, train_y, c="b", s=4, label="Training Data")

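  #plot testing data (in the same plot)
  plt.scatter(test_x, test_y, c="g", s=4, label="Testing Data")

  #when we want to compare predictions with testing
  #("is not None" is the safe check here; "!=" is not reliable when predictions is a tensor or array)
  if predictions is not None: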
    plt.scatter(test_x, predictions, c="r", s=4, label="Predictions")

  #makes a legend
  plt.legend(prop={"size": 14})

In [None]:
plot_predictions(x_train, y_train, x_test, y_test)

## Making Model

In [None]:
#Creating a Python class for the model
#It subclasses nn.Module, which gives our class all the functionality of that base class
class LinearRegressionModel(torch.nn.Module):
    def __init__(self):

        #Initialize the super class
        super().__init__()

        #Start from random values. nn.Parameter registers each tensor with the module, so it shows up in parameters() and state_dict() and its gradients are tracked for gradient descent
        self.weights = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float))
        self.bias = torch.nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float))


    #Every nn.Module subclass must define its own forward() method
    #It describes the computation the model performs on its input, here y = ax + b
    def forward(self, values: torch.Tensor) -> torch.Tensor:
        return self.weights * values + self.bias

### PyTorch Model Building Essentials


What does the model do?

1.  Start with random values of weight and bias
2.  Adjust those random values to perform better on training data


How does it do that?

1.   Backpropagation
2.   Gradient Descent
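To make these two steps concrete, here is a minimal sketch of a single update done by hand. The toy data (a line y = 2x), the starting weight, and the learning rate are all assumptions chosen so the numbers are easy to follow; the notebook itself lets `torch.optim` do this work.

```python
import torch

# Hypothetical toy setup: one weight, target line y = 2x
w = torch.tensor([0.5], requires_grad=True)
x_toy = torch.tensor([1.0, 2.0, 3.0])
y_toy = 2.0 * x_toy
lr = 0.1

# Forward pass and mean squared error
loss_before = ((w * x_toy - y_toy) ** 2).mean()

# Backpropagation: fills w.grad with d(loss)/d(w)
loss_before.backward()

# Gradient descent: move the weight against the gradient, outside autograd tracking
with torch.no_grad():
    w -= lr * w.grad

loss_after = ((w * x_toy - y_toy) ** 2).mean()
print(loss_before.item(), loss_after.item())
```

Backpropagation only computes the gradient; gradient descent is the rule that uses it to change the parameters. After this one step the weight has moved from 0.5 toward 2 and the loss is smaller.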




Cheatsheet for nn building modules (really check this out)
https://pytorch.org/tutorials/beginner/ptcheat.html?highlight=cheat



*   torch.nn - contains all the building blocks of a neural network
*   torch.nn.Parameter - the parameters used by PyTorch layers
*   torch.nn.Module - the base class for all neural network modules; our model subclasses it
*   torch.optim - contains the optimization algorithms, such as gradient descent
*   def forward() - you have to define this yourself in every nn.Module subclass; it defines what happens in forward propagation
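As a sketch of how these pieces fit together, the same linear model can also be written with a ready-made layer instead of hand-made parameters. The class name `LinearRegressionModelV2` and the attribute name `linear_layer` are made up for this example; `nn.Linear` is the real PyTorch layer.

```python
import torch
from torch import nn

class LinearRegressionModelV2(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        # nn.Linear(1, 1) creates one weight and one bias as nn.Parameter objects
        self.linear_layer = nn.Linear(in_features=1, out_features=1)

    def forward(self, values: torch.Tensor) -> torch.Tensor:
        # Same computation as before: y = ax + b
        return self.linear_layer(values)

torch.manual_seed(42)
model_v2 = LinearRegressionModelV2()

# The layer's parameters are registered automatically
print(model_v2.state_dict())

# A batch of 5 inputs with 1 feature each gives 5 outputs
out = model_v2(torch.zeros(5, 1))
print(out.shape)
```

Writing the parameters by hand, as the notebook does, shows what is going on inside; using `nn.Linear` is the shorter route once that is clear.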

### Checking the model's internals using "parameters()"

In [None]:
#Our model is really simple, so there isn't a lot that we can peek inside


#Setting a random seed allows us to get the same values from the random functions every time
torch.manual_seed(42)

#Create an object of our model class
cool_model = LinearRegressionModel()

#parameters() returns a generator, so we wrap it in a list to see the values
list(cool_model.parameters())

In [None]:
#Checking the parameters together with their names using state_dict()
cool_model.state_dict()

We also run the untrained network once and keep its predictions, so that we can compare them with the trained model in the Seeing Results section.

In [None]:
y_predictions_original = cool_model(x_test)
y_predictions_original = y_predictions_original.detach().numpy()
y_predictions_original = y_predictions_original.tolist()
y_predictions_original

## Training/Testing Model

A model is used in three ways: training, evaluation, and inference.

* Training is where we tweak the parameters to get a better-performing network
* Evaluation measures how good the network is, without updating the parameters
* Inference is using the model to make predictions on new data, basically putting the network to real-world use

`model.train()` and `model.eval()` switch the model between training and evaluation behavior, and `torch.inference_mode()` turns off gradient tracking, so PyTorch does less work behind the scenes and the code runs quicker.
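A quick way to see what inference mode changes is to check whether a result is still attached to the autograd graph. This is a minimal sketch with a made-up one-element parameter, not the notebook's model.

```python
import torch

# Hypothetical stand-in for a model parameter
demo_weight = torch.nn.Parameter(torch.tensor([0.7]))
demo_inputs = torch.tensor([[1.0], [2.0]])

# Normal forward pass: the result is attached to the autograd graph
tracked = demo_weight * demo_inputs
print(tracked.requires_grad)    # True

# Inside inference_mode no graph is recorded
with torch.inference_mode():
    untracked = demo_weight * demo_inputs
print(untracked.requires_grad)  # False
```

Because nothing is recorded, tensors created in inference mode cannot be used for `backward()` later, which is exactly what we want when we are only making predictions.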

### Setting Up

Well... the fun of math kicks in here: this is where the loss (cost) function and the optimizer (gradient descent) come in.

The loss function measures how wrong the predictions are, and the optimizer uses the gradients of that loss to update the parameters.
Be aware that different problems are better served by different loss functions and optimizers. The full lists are here:

* https://pytorch.org/docs/stable/nn.html#loss-functions
* https://pytorch.org/docs/stable/optim.html
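To show how the choice of loss function changes the number you get, here is a small sketch comparing mean absolute error (`L1Loss`) with mean squared error (`MSELoss`) on the same made-up predictions and targets.

```python
import torch

# Hypothetical predictions and targets; the errors are -0.5, 0.5 and -2.0
demo_predictions = torch.tensor([2.5, 0.0, 2.0])
demo_targets = torch.tensor([3.0, -0.5, 4.0])

# Mean absolute error: (0.5 + 0.5 + 2.0) / 3
mae = torch.nn.L1Loss()(demo_predictions, demo_targets)

# Mean squared error: (0.25 + 0.25 + 4.0) / 3
mse = torch.nn.MSELoss()(demo_predictions, demo_targets)

print(mae.item())  # 1.0
print(mse.item())  # 1.5
```

The squared error punishes the one large miss much more than the two small ones, which is one reason the two losses can lead to different trained models.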


In [None]:
#We are using the mean absolute error (L1) loss
loss_function = torch.nn.L1Loss()

#We are using Stochastic Gradient Descent on our model's parameters, with a learning rate of 0.01
optimizer = torch.optim.SGD(params=cool_model.parameters(), lr=0.01)

In the demonstration video, the training and testing code are written together in one loop.

Here we define the testing step as a function and call it from the training loop below.

In [None]:
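def testing(model):
    #Evaluation mode: layers such as dropout and batch norm switch to their test behavior
    model.eval()

    #No gradient tracking is needed for testing
    with torch.inference_mode():
        test_predictions = model(x_test)
        test_loss = loss_function(test_predictions, y_test)

    return test_loss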

### Training/Testing Loop

This is what should happen in a training loop:

0. Get Data
1. Forward Propagation
2. Loss Calculation
3. Zero the Gradients
4. Back Propagation
5. Update Parameters
6. Repeat (Gradient Descent)
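The gradient-clearing step is easy to overlook, so here is a minimal sketch of what happens without it. It uses a single made-up tensor rather than the notebook's model: PyTorch adds each new gradient to whatever is already stored in `.grad`.

```python
import torch

# Hypothetical single parameter
p = torch.tensor([1.0], requires_grad=True)

# First backward pass: d(3p)/dp = 3
(3 * p).sum().backward()
first = p.grad.clone()

# Second backward pass without clearing: the new gradient is added on top
(3 * p).sum().backward()
accumulated = p.grad.clone()

# Clearing first gives the gradient of the latest pass only
p.grad.zero_()
(3 * p).sum().backward()
fresh = p.grad.clone()

print(first, accumulated, fresh)  # tensor([3.]) tensor([6.]) tensor([3.])
```

`optimizer.zero_grad()` does this clearing for every parameter the optimizer manages, which is why it is called once per iteration of the loop.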

In [None]:
#Number of training steps (each one is a full pass over the training data)
epochs = 160

epoch_count = []
training_loss_values = []
testing_loss_values = []


#Repeat (Gradient Descent)
for epoch in range(epochs):
    epoch_count.append(epoch)

    #Set model to training mode
    cool_model.train()

    #Get data, and forward propagation
    y_predictions = cool_model(x_train)

    #Loss calculation
    train_loss = loss_function(y_predictions, y_train)
    #Store a plain Python float so the recorded value does not keep the autograd graph alive
    training_loss_values.append(train_loss.item())

    #Zero the gradients. PyTorch stores each parameter's gradient in its .grad attribute
    #and adds to it on every backward() call, so it has to be cleared before each new backward pass.
    #Otherwise the optimizer step would use the sum of the gradients from all previous epochs.
    optimizer.zero_grad()

    #Back propagation
    train_loss.backward()

    #Update Parameters
    optimizer.step()



    #Testing inside the training loop

    #Set model to evaluation mode
    cool_model.eval()

    #Testing, recorded as a plain Python float
    test_loss = testing(cool_model).item()
    testing_loss_values.append(test_loss)

    #Printing out model's parameters
    print(cool_model.state_dict())
    print(f"epoch {epoch}, training_loss {train_loss}, testing_loss {test_loss}")

### Seeing Results

The initial (random) values with the seed of 42
* weight = 0.3367
* bias = 0.1288

The target values (the ones used to create the data)
* weight = 0.7
* bias = 0.3

In [None]:
#Original prediction
plot_predictions(predictions=y_predictions_original)

In [None]:
#Final prediction
with torch.inference_mode():
    y_predictions_new = cool_model(x_test)
plot_predictions(predictions=y_predictions_new)

In [None]:
#Plotting changes in loss over epochs

#Turning the lists of recorded losses into NumPy arrays for matplotlib
training_loss_values = torch.tensor(training_loss_values).numpy()
testing_loss_values = torch.tensor(testing_loss_values).numpy()

#The plot for real
plt.plot(epoch_count, training_loss_values, label="Training Loss")
plt.plot(epoch_count, testing_loss_values, label="Testing Loss")
plt.title("Training and Testing Loss Curves")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend();

## Saving and Loading Model

Pickles!

1. `torch.save()` saves a PyTorch object to disk in pickle format
2. `torch.load()` loads a PyTorch object back from pickle format
3. `torch.nn.Module.load_state_dict()` loads a saved state dictionary (basically a Python dictionary of parameter names and tensors) into a model

### Save

In [1]:
#We can use os.path or pathlib.Path for saving
from pathlib import Path


#Create directory
model_path = Path("models")
model_path.mkdir(parents=True, exist_ok=True)

#Create the save path; PyTorch files usually use the ".pt" or ".pth" extension
model_name = "pytorch_workflow_model_0.pth"
model_save_path = model_path / model_name

#Saving the state dict
torch.save(obj=cool_model.state_dict(),
           f=model_save_path)


### Load

In [None]:
#We saved only the state_dict(), so we need to create a new model object and load the saved state_dict() into it
cooler_model = LinearRegressionModel()

#Loading the saved state dict into the new model, with torch.load()
cooler_model.load_state_dict(torch.load(f=model_save_path))
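A useful check after loading is to confirm that the loaded model predicts exactly the same values as the one that was saved. The sketch below does the full round trip on a stand-in `nn.Linear` model in a temporary directory; the file name `demo_model.pth` and the stand-in model are assumptions for this example, and the same check would apply to `cool_model` and `cooler_model`.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn

# Stand-in model (assumption: any nn.Module is saved and loaded the same way)
source_model = nn.Linear(in_features=1, out_features=1)

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = Path(tmp_dir) / "demo_model.pth"  # hypothetical file name
    torch.save(obj=source_model.state_dict(), f=save_path)

    # A fresh model starts with different random parameters until the state dict is loaded
    loaded_model = nn.Linear(in_features=1, out_features=1)
    loaded_model.load_state_dict(torch.load(f=save_path))

# Compare predictions from both models on the same inputs
sample = torch.tensor([[0.0], [0.5], [1.0]])
source_model.eval()
loaded_model.eval()
with torch.inference_mode():
    same = torch.equal(source_model(sample), loaded_model(sample))
print(same)  # True
```

If the comparison is `False`, the usual suspects are loading the wrong file or comparing against a model that was trained further after saving.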