
# PyTorch Workflow

Explore an end-to-end PyTorch workflow.

Resources:

https://github.com/mrdbourke/pytorch-deep-learning/blob/main/01_pytorch_workflow.ipynb

https://www.learnpytorch.io/01_pytorch_workflow/

https://github.com/mrdbourke/pytorch-deep-learning/discussions


In [1]:
what_were_covering = {
    1: "data (prepare and load)",
    2: "build model",
    3: "fitting the model to the data (training)",
    4: "making predictions (=inference) and evaluating a model",
    5: "saving and loading a model",
    6: "putting it all together"
}

what_were_covering

{1: 'data (prepare and load)',
 2: 'build model',
 3: 'fitting the model to the data (training)',
 4: 'making predictions (=inference) and evaluating a model',
 5: 'saving and loading a model',
 6: 'putting it all together'}

`torch.nn` is to PyTorch roughly what Keras is to TensorFlow.

GPT:

>torch.nn provides a wide range of pre-defined layers, activation functions, loss functions, and other building blocks commonly used in deep learning. These components allow you to easily define the architecture of your neural network and specify how the data flows through the network.
>
>Similarly, Keras is a high-level neural networks API that provides a simplified and user-friendly interface for building and training neural networks. It also offers a wide range of pre-defined layers, activation functions, and loss functions, among other features. Keras is often praised for its ease of use and its ability to quickly prototype and experiment with different network architectures.

https://pytorch.org/docs/stable/nn.html
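
To make the "building blocks" idea concrete, here is a minimal sketch that combines one layer, one activation function and one loss function from `torch.nn`. The input values, the targets and the seed are made up for illustration and are not part of the course material.

```python
import torch
from torch import nn

torch.manual_seed(42)  # arbitrary seed so the random initial weights are repeatable

# Three kinds of building blocks from torch.nn
layer = nn.Linear(in_features=1, out_features=1)  # a layer with learnable weight and bias
activation = nn.ReLU()                            # an activation function
loss_fn = nn.L1Loss()                             # a loss function (mean absolute error)

# Hypothetical inputs and targets: two samples with one feature each
x = torch.tensor([[0.5], [1.0]])
target = torch.tensor([[0.65], [1.0]])

# Data flows through the layer, then the activation; the loss compares prediction and target
pred = activation(layer(x))
loss = loss_fn(pred, target)

print(pred.shape)  # torch.Size([2, 1])
print(loss.item())
```

The loss value depends on the randomly initialised weights, so no specific number is shown here.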

In [2]:
import torch
from torch import nn # nn contains all of PyTorch's building blocks for neural networks / graphs
import matplotlib.pyplot as plt

## Data (preparing and loading)

Data can be almost anything in machine learning.

- Excel spreadsheet
- Images
- Videos
- Audio
- DNA
- Text

Machine learning is a game of two parts:

1. Get data into a numerical representation
2. Build a model to learn patterns in that numerical representation
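
As a rough illustration of step 1, text can be turned into numbers by mapping each character to its Unicode code point. This is only one simple encoding chosen for the sketch (real text pipelines usually use a tokenizer), and the example string is arbitrary.

```python
import torch

# Hypothetical example text; any string would work
text = "DNA"

# Map every character to its Unicode code point and wrap the result in a tensor
encoded = torch.tensor([ord(ch) for ch in text])

print(encoded)        # tensor([68, 78, 65])
print(encoded.dtype)  # torch.int64
```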

Let's create some *known* data using the [linear regression formula](https://www.google.com/search?q=linear+regression+formula).

![linear regression formula](https://drive.google.com/uc?id=1ScI7WdGHNNJB9lvrAFO50QHXcDABMjSl)

[how to embed google drive images](https://medium.com/analytics-vidhya/embedding-your-image-in-google-colab-markdown-3998d5ac2684)

Linear regression is a statistical modeling technique used to establish a relationship between a dependent variable and one or more independent variables. It assumes a linear relationship between the independent variables and the dependent variable. The general formula for linear regression can be expressed as follows:

y = β0 + β1 * x1 + β2 * x2 + ... + βn * xn + ε

In this formula:

- y is the dependent variable (the variable we want to predict or explain).
- x1, x2, ..., xn are the independent variables (also known as features or predictors).
- β1, β2, ..., βn are the coefficients (parameters) that represent the slope or weight of each independent variable. β0 is the intercept term (the bias).
- ε represents the error term, which accounts for the variability in the dependent variable that is not explained by the independent variables.
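
The formula above can be sketched in PyTorch as a matrix-vector product plus the intercept. The coefficient values and the sample matrix below are hypothetical, chosen only so the arithmetic is easy to check by hand, and the error term ε is left out.

```python
import torch

# Hypothetical coefficients, for illustration only
beta_0 = 0.3                       # intercept
betas = torch.tensor([0.7, -0.2])  # beta_1 and beta_2

# Three samples with two features (x1, x2) each
features = torch.tensor([[0.0, 1.0],
                         [1.0, 0.0],
                         [2.0, 2.0]])

# y = beta_0 + beta_1 * x1 + beta_2 * x2, computed for all samples at once
y_lin = beta_0 + features @ betas

print(y_lin)  # tensor([0.1000, 1.0000, 1.3000])
```

With a single feature this reduces to `y = weight * X + bias`, where the weight plays the role of β1 and the bias the role of β0.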

We'll use a linear regression formula to make a straight line with *known* parameters.

In [3]:
# create known parameters
weight = 0.7 # "b" in the formula from the image
bias = 0.3 # "a" in the formula from the image

# Create the input data
start = 0
end = 1
step = 0.02
# X is capitalized by convention because it is a matrix (2-D tensor).
# unsqueeze(dim=1) adds an extra dimension: shape [50] becomes [50, 1].
X = torch.arange(start, end, step).unsqueeze(dim=1)
# Compute the labels from the relationship we defined. The model should learn
# this relationship (we already know it, because we chose the parameters).
# The formula is applied element-wise to every value in X.
y = weight * X + bias

X[:10], y[:10], len(X), len(y)

(tensor([[0.0000],
         [0.0200],
         [0.0400],
         [0.0600],
         [0.0800],
         [0.1000],
         [0.1200],
         [0.1400],
         [0.1600],
         [0.1800]]),
 tensor([[0.3000],
         [0.3140],
         [0.3280],
         [0.3420],
         [0.3560],
         [0.3700],
         [0.3840],
         [0.3980],
         [0.4120],
         [0.4260]]),
 50,
 50)
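
A quick way to see what `unsqueeze(dim=1)` does is to compare the shapes before and after. The sketch below reuses the same `arange` values as the cell above; the names `flat` and `column` are made up for this example.

```python
import torch

flat = torch.arange(0, 1, 0.02)  # 1-D tensor of 50 values
column = flat.unsqueeze(dim=1)   # same values, arranged as 50 rows of 1 column

print(flat.shape)    # torch.Size([50])
print(column.shape)  # torch.Size([50, 1])
```

The `[50, 1]` layout reads as "50 samples with 1 feature each", which is the shape that layers such as `nn.Linear(in_features=1, ...)` expect for their input.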

### Splitting data into training and test sets

Splitting data is one of the most important concepts in machine learning.

Analogy:

- Course materials = training set (model learns patterns)
- Practice exam = validation set (model tunes on this data)
- Final exam = test set (data the model hasn't seen before)
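
A minimal sketch of such a split for the straight-line data, assuming an 80/20 ratio between training and test data and no validation set. The ratio and the variable names are choices made for this example, not requirements.

```python
import torch

# Recreate the straight-line data from the earlier cell
weight, bias = 0.7, 0.3
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = weight * X + bias

# Assumed ratio: the first 80% of the samples for training, the remaining 20% for testing
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

print(len(X_train), len(y_train), len(X_test), len(y_test))  # 40 40 10 10
```

Slicing by index keeps the example simple; for real datasets a random split is usually a better choice, so that both sets cover the whole range of the data.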

Exception: language models.

GPT:

> the traditional concept of splitting data into separate training and test sets may not directly apply to language models in the same way as it does to supervised learning tasks with labeled data.
>
>In the context of language models, the primary goal is to train the model to generate coherent and contextually appropriate text based on the patterns and structures it learns from a given dataset. The evaluation of a language model is typically based on the quality and fluency of the generated text rather than a comparison with specific "correct outcomes" or ground truth labels.
>
>However, that doesn't mean that language models don't benefit from some form of evaluation or testing. Here are a few common approaches:
>
>- **Perplexity**: Perplexity is a metric commonly used to evaluate language models. It quantifies how well a language model predicts a given sequence of words. Lower perplexity values indicate better performance. You can compute perplexity on a held-out validation set or a separate portion of the dataset that was not used for training. This helps gauge the model's generalization ability and its ability to capture the underlying patterns in the language.
>- **Human Evaluation**: Another approach is to conduct human evaluations where human judges assess the quality of the generated text. This can involve subjective assessments such as fluency, coherence, and relevance to a given prompt. Human evaluations can provide valuable insights into the performance and shortcomings of the language model.
>- **Prompt Completion Evaluation**: In some cases, language models are evaluated based on their ability to complete given prompts or generate text in response to specific input. This can involve providing incomplete sentences or partial text and assessing how well the model generates the rest of the text.
>
>While traditional train-test splits may not be the primary approach for evaluating language models, the concept of having separate data for evaluation is still important. It allows you to assess the model's performance on unseen or held-out data and helps in understanding how well the model generalizes to different text inputs.
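
Perplexity, mentioned in the quote above, is commonly computed as the exponential of the mean negative log-likelihood of the held-out tokens. The sketch below uses made-up logits (a uniform guess over a 4-word vocabulary) in place of a real language model, so the result is easy to verify.

```python
import torch
import torch.nn.functional as F

# Hypothetical model output: 3 token positions, vocabulary of 4 words.
# All-zero logits mean the "model" assigns equal probability to every word.
logits = torch.zeros(3, 4)
targets = torch.tensor([0, 2, 1])  # made-up indices of the correct next words

# cross_entropy returns the mean negative log-likelihood over the positions
nll = F.cross_entropy(logits, targets)
perplexity = torch.exp(nll)

print(perplexity.item())  # approximately 4.0
```

A uniform guess over 4 words gives a perplexity of about 4: the model is as uncertain as if it were choosing between 4 equally likely options. A better model would score lower on the same held-out text.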

