# Three Core Components of PyTorch

<img src="../asssets/a1-three-components-of-pytorch.png"/>

1. **Tensor Library:** Extends the array-oriented programming model of *`NumPy`* with *`GPU`* support.

2. **Automatic Differentiation Engine `(Autograd)`:** Automatically computes gradients for tensor operations, which simplifies backpropagation and model optimization.

3. **Deep Learning Library:** Offers modular, flexible, and efficient building blocks, including pretrained models, loss functions, and optimizers, for designing and training a wide range of deep learning models.
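
As a rough illustration of how the three components fit together, the sketch below runs a single training step on a toy linear model. The input values, the `torch.nn.Linear` layer sizes, the MSE loss, and the learning rate are all assumptions chosen for illustration only.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # make the random weight initialization repeatable

# 1. Tensor library: plain tensors hold the (made-up) data
x = torch.tensor([[1.0, 2.0, 3.0]])
target = torch.tensor([[1.0]])

# 3. Deep learning library: a ready-made layer and optimizer
model = torch.nn.Linear(in_features=3, out_features=1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

loss_before = F.mse_loss(model(x), target)

# 2. Autograd: backward() fills in .grad for every model parameter
optimizer.zero_grad()
loss_before.backward()
optimizer.step()

loss_after = F.mse_loss(model(x), target)
print(loss_before.item(), loss_after.item())
```

After one small gradient step, the loss on this single example should be lower than before the step.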


> In the news, **LLMs** are often referred to as **AI models**. More precisely, LLMs are a type of **deep neural network**, and **PyTorch** is a deep learning library.

<img src="../asssets/ai-ml-dl.png"/>

<img src="../asssets/a3-supervised-learning.png"/>

<img src="../asssets/apple silicon.png"/>

# Tensors

In [1]:
import torch

torch.__version__



'2.2.0'

In [2]:
torch.cuda.is_available()

False
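
Since `torch.cuda.is_available()` returns `False` on this machine, a common pattern is to pick the best available device at runtime and fall back to the CPU. The following is a minimal sketch; the fallback order (CUDA, then Apple's MPS backend, then CPU) is an assumption, and the `mps` check assumes a PyTorch version that ships `torch.backends.mps`.

```python
import torch

# Pick the first available accelerator, otherwise use the CPU.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():  # Apple silicon GPUs
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Move a tensor to the selected device
t = torch.tensor([1.0, 2.0, 3.0]).to(device)
print(t.device)
```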

## Understanding Tensors

<img src="../asssets/tensors.png">

## Scalars, Vectors, Matrices and Tensors

<img src="../asssets/create-tensors.png"/>

In [3]:
import torch

tensor0d: torch.Tensor = torch.tensor(data=1)
print(tensor0d)

tensor(1)


In [4]:
import torch

tensor1d: torch.Tensor = torch.tensor(data=[1, 2, 3])
print(tensor1d)

tensor([1, 2, 3])


In [6]:
import torch

tensor2d: torch.Tensor = torch.tensor(data=[[1, 2], [3, 4]])
print(tensor2d)

tensor([[1, 2],
        [3, 4]])


In [7]:
import torch

tensor3d: torch.Tensor = torch.tensor(
    data=[[[1, 2], [3, 4]], [[5, 6], [7, 8]]],
)
print(tensor3d)

tensor([[[1, 2],
         [3, 4]],

        [[5, 6],
         [7, 8]]])
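
To make the difference between these four objects concrete, the short sketch below prints the number of dimensions (`.ndim`) and the shape of each one. The dictionary and its keys are only illustrative names.

```python
import torch

examples = {
    "scalar": torch.tensor(1),
    "vector": torch.tensor([1, 2, 3]),
    "matrix": torch.tensor([[1, 2], [3, 4]]),
    "3d tensor": torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]),
}

# .ndim is the number of dimensions; .shape gives the size of each one
for name, t in examples.items():
    print(name, t.ndim, tuple(t.shape))

# scalar 0 ()
# vector 1 (3,)
# matrix 2 (2, 2)
# 3d tensor 3 (2, 2, 2)
```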


## Tensor Datatypes

- When we create tensors from Python integers, PyTorch uses the `64-bit integer` data type (`torch.int64`) by default. We can access the data type of a tensor via its *`.dtype`* attribute:

In [8]:
import torch

tensor1d: torch.Tensor = torch.tensor(data=[1, 2, 3])
print(tensor1d)
print(tensor1d.dtype)

tensor([1, 2, 3])
torch.int64


- If we create tensors from Python floats, PyTorch creates tensors with *`32-bit precision`* by default:

In [10]:
import torch

floatvector: torch.Tensor = torch.tensor(data=[1.0, 2.0, 3.0])
print(floatvector)
print(floatvector.dtype)

tensor([1., 2., 3.])
torch.float32


**This choice is primarily due to the *balance between precision and computational efficiency*. A 32-bit floating-point number offers sufficient precision for most deep learning tasks while consuming less memory and computational resources than a 64-bit floating-point number. Moreover, *GPU architectures are optimized for 32-bit* computations, and using this data type can significantly speed up model training and inference.**
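
The memory side of this trade-off can be checked directly. The sketch below compares the bytes per element of a `float32` tensor and a `float64` copy of it; the variable names are illustrative.

```python
import torch

f32 = torch.tensor([1.0, 2.0, 3.0])   # float32 by default
f64 = f32.to(dtype=torch.float64)     # same values, double precision

# element_size() is the number of bytes used per element
print(f32.dtype, f32.element_size())  # torch.float32 4
print(f64.dtype, f64.element_size())  # torch.float64 8

# Total storage for the data: bytes per element times number of elements
print(f32.element_size() * f32.numel())  # 12
print(f64.element_size() * f64.numel())  # 24
```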

> It is also possible to change a tensor's data type, and with it the precision, using the tensor's **`.to`** method.

In [16]:
import torch

intvector64: torch.Tensor = torch.tensor(data=[1, 2, 3])  # Python ints give int64, not a float type
print(intvector64)
print(intvector64.dtype, "\n")

floatvector32: torch.Tensor = torch.tensor(data=[1, 2, 3]).to(dtype=torch.float32)
print(floatvector32)
print(floatvector32.dtype)


tensor([1, 2, 3])
torch.int64 

tensor([1., 2., 3.])
torch.float32


## Common PyTorch Tensor Operations

- The *`.shape`* attribute allows us to access the shape of a tensor:

In [19]:
import torch

tensor2d: torch.Tensor = torch.tensor(data=[[1, 2, 3], [4, 5, 6]])
print(tensor2d)
print(tensor2d.shape)

tensor([[1, 2, 3],
        [4, 5, 6]])
torch.Size([2, 3])


As you can see, *`.shape`* returns `torch.Size([2, 3])`, meaning the tensor has *2 rows* and *3 columns*. To reshape the tensor into a `3 × 2` tensor, we can use the *`.reshape`* method:

In [20]:
print(tensor2d.reshape(3, 2))

tensor([[1, 2],
        [3, 4],
        [5, 6]])


However, note that the *more common command for reshaping* tensors in PyTorch is *`.view()`*:

In [23]:
print(tensor2d.view(3, 2))

tensor([[1, 2],
        [3, 4],
        [5, 6]])


The key difference between `.view()` and `.reshape()` in PyTorch lies in how they handle memory layout. `.view()` only *provides a new "view" into the existing data* **without copying it**, so the requested shape must be compatible with the tensor's existing memory layout. This always holds for a **contiguous** tensor (data stored in one continuous block of memory in row-major order); for an incompatible layout, `.view()` raises an error. In contrast, `.reshape()` works regardless of whether the tensor is contiguous: it returns a view when it can and otherwise creates a new, contiguous copy of the data. Use `.view()` when you want a guarantee that no copy is made and `.reshape()` for flexibility.
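
The sketch below illustrates this difference. It uses a transposed tensor (`.T`) as an example of a non-contiguous tensor; the variable names are illustrative.

```python
import torch

tensor2d = torch.tensor([[1, 2, 3], [4, 5, 6]])

# On a contiguous tensor, .view() shares memory with the original
viewed = tensor2d.view(3, 2)
print(viewed.data_ptr() == tensor2d.data_ptr())  # True

# Transposing yields a non-contiguous tensor over the same memory
transposed = tensor2d.T
print(transposed.is_contiguous())  # False

# .view() cannot flatten this layout without copying, so it raises
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)  # True

# .reshape() falls back to copying the data into a contiguous block
flat = transposed.reshape(6)
print(flat)  # tensor([1, 4, 2, 5, 3, 6])
```

Calling `transposed.contiguous().view(6)` would also work, because `.contiguous()` makes the copy explicitly before the view is taken.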


- We can use **`.T`** to transpose a tensor, which means flipping it across its diagonal so that rows become columns. Note that the result has the same `3 × 2` shape as the reshaped tensor above, but the elements are arranged differently, as you can see in the following result:

In [24]:
print(tensor2d)
print(tensor2d.T)

tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[1, 4],
        [2, 5],
        [3, 6]])
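
A quick check, sketched below with illustrative variable names, confirms that transposing and reshaping to `3 × 2` produce tensors of the same shape but with different contents:

```python
import torch

tensor2d = torch.tensor([[1, 2, 3], [4, 5, 6]])

transposed = tensor2d.T
reshaped = tensor2d.reshape(3, 2)

print(transposed.shape == reshaped.shape)  # True: both are 3 x 2
print(torch.equal(transposed, reshaped))   # False: elements are ordered differently

# Transposition swaps indices: transposed[i, j] equals tensor2d[j, i]
print(transposed[0, 1].item(), tensor2d[1, 0].item())  # 4 4
```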


The common way to multiply two matrices in PyTorch is the **`.matmul`** method:

In [27]:
print(tensor2d)
print(tensor2d.T)
print(tensor2d.matmul(other=tensor2d.T))

tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[1, 4],
        [2, 5],
        [3, 6]])
tensor([[14, 32],
        [32, 77]])


We can also adopt the **`@`** operator, which accomplishes the same thing more compactly:

In [28]:
print(tensor2d)
print(tensor2d.T)
print(tensor2d @ tensor2d.T)

tensor([[1, 2, 3],
        [4, 5, 6]])
tensor([[1, 4],
        [2, 5],
        [3, 6]])
tensor([[14, 32],
        [32, 77]])
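
To see where these numbers come from, the sketch below recomputes the product by hand: each entry is the dot product of a row of the first matrix with a column of the second. Because the second matrix here is the transpose of the first, its columns are just the rows of `tensor2d`. The helper names `rows` and `manual` are illustrative.

```python
import torch

tensor2d = torch.tensor([[1, 2, 3], [4, 5, 6]])
product = tensor2d @ tensor2d.T  # shape (2, 3) @ (3, 2) -> (2, 2)

# Manual computation with plain Python lists
rows = tensor2d.tolist()
manual = [
    [sum(a * b for a, b in zip(row_i, row_j)) for row_j in rows]
    for row_i in rows
]

print(manual)                      # [[14, 32], [32, 77]]
print(product.tolist() == manual)  # True
```

For example, the top-left entry is `1*1 + 2*2 + 3*3 = 14`.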


# Autograd

## Seeing Models as Computational Graphs
Now let’s look at PyTorch’s *`automatic differentiation engine`*, also known as *`autograd`*. PyTorch’s autograd system provides *functions to compute gradients* in dynamic computational graphs automatically. 

- A **`computational graph`** is a `directed graph` that allows us to **express** and **visualize mathematical expressions**. In the context of deep learning, a computation graph lays out the sequence of calculations needed to compute the output of a neural network. We will need this to compute the required gradients for backpropagation, the main training algorithm for neural networks.

The code in the following listing implements the **forward pass (prediction step)** of a **simple logistic regression classifier**, which can be seen as a `single-layer neural network`. It returns a score between 0 and 1, which is compared to the true class label (0 or 1) when computing the loss.


<img src="../asssets/logistic-regression-forward-pass.png"/>


In [None]:
import torch
import torch.nn.functional as F

y: torch.Tensor = torch.tensor(data=[1.0])  # True Label

x1: torch.Tensor = torch.tensor(data=[1.1])  # Independent Variable
w1: torch.Tensor = torch.tensor(data=[2.2])  # Weight

b: torch.Tensor = torch.tensor(data=[0.0])  # Bias

z: torch.Tensor = x1 * w1 + b  # Linear Function
a: torch.Tensor = torch.sigmoid(input=z)  # Activation Function

loss: torch.Tensor = F.binary_cross_entropy(input=a, target=y)  # Loss

<img src="../asssets/computational-graph.png">

PyTorch builds such a computation graph in the background, and we can use it to *`calculate the gradients (slopes) of a loss function with respect to the model parameters`* (here **`w1`** and **`b`**) *`to train the model.`*

## Automatic Differentiation Made Easy
If we carry out computations in PyTorch, it builds a computational graph internally by default if one of its terminal (leaf) nodes has the **`requires_grad`** attribute set to `True`. This is useful if we want to compute gradients. **Gradients are required when training neural networks** via the popular **`backpropagation algorithm`**, which can be considered an *`implementation of the chain rule`* from calculus for neural networks.
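
As a minimal sketch of this idea, the logistic regression forward pass can be repeated with `requires_grad=True` set on `w1` and `b`, after which `loss.backward()` computes the gradients. The manual comparison relies on the standard chain-rule result for a sigmoid followed by binary cross-entropy, `dL/dz = a - y`; the names `manual_w1` and `manual_b` are illustrative.

```python
import torch
import torch.nn.functional as F

y = torch.tensor([1.0])   # true label
x1 = torch.tensor([1.1])  # input feature

# requires_grad=True tells autograd to track operations on these tensors
w1 = torch.tensor([2.2], requires_grad=True)
b = torch.tensor([0.0], requires_grad=True)

z = x1 * w1 + b
a = torch.sigmoid(z)
loss = F.binary_cross_entropy(a, y)

# A non-None grad_fn means a computation graph was recorded
print(loss.grad_fn is not None)  # True

# Backpropagation: populates w1.grad and b.grad
loss.backward()
print(w1.grad, b.grad)

# Chain rule by hand: dL/dz = a - y, dz/dw1 = x1, dz/db = 1
with torch.no_grad():
    manual_w1 = (a - y) * x1
    manual_b = a - y

print(torch.allclose(w1.grad, manual_w1, atol=1e-6))  # True
print(torch.allclose(b.grad, manual_b, atol=1e-6))    # True
```

Both gradients are negative here, so increasing `w1` or `b` would lower the loss for this training example.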

<img src="../asssets/partial-derivative.png">

## Partial Derivatives and Gradients

- A **`partial derivative`** measures *`the rate at which a function changes with respect to one of its variables`*, holding the other variables fixed.

- A **`gradient (slope)`** is a *`vector containing all of the partial derivatives of a multivariate function`*, a function with more than one variable as input.
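
To connect these two definitions, the sketch below uses an assumed toy function, `f(w0, w1) = w0**2 + 3 * w1`, whose partial derivatives are `2 * w0` and `3`. `torch.autograd.grad` returns them together as the gradient vector.

```python
import torch

# Evaluate at the point (w0, w1) = (2, 5)
w = torch.tensor([2.0, 5.0], requires_grad=True)

# Toy multivariate function (an assumption for illustration)
f = w[0] ** 2 + 3 * w[1]

# The gradient collects both partial derivatives: [df/dw0, df/dw1]
(grad,) = torch.autograd.grad(f, w)
print(grad)  # tensor([4., 3.])
```

The first entry is `2 * 2.0 = 4` and the second is the constant `3`, matching the partial derivatives computed by hand.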