# APPENDIX A


This file covers installing PyTorch and the basics of working with it.



**A.1 WHAT IS PYTORCH?**



PyTorch is a deep learning library. It is useful because it works with tensors, automatically computes gradients for tensor operations, and ships with many built-in loss functions and optimizers.
Deep learning is a subfield of machine learning.

Installing it from the terminal
-pip3 install torch torchvision torchaudio (the PyPI package is named torch; pip install pytorch fails)
-pip show torch reports the installed version (2.4.0 here)

In [3]:
import torch
torch.cuda.is_available()

False
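
Since `torch.cuda.is_available()` returns False on this machine, code that hard-codes a GPU would fail. A minimal sketch of the usual workaround is to pick the device once and move tensors to it. The names `device` and `example` are illustrative, not part of the original notes.

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors are created on the CPU by default; .to(device) moves them.
example = torch.tensor([1.0, 2.0, 3.0]).to(device)
print(example.device)
```

On a machine without CUDA this prints `cpu`, and the same code runs unchanged when a GPU is present.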

**A.2 UNDERSTANDING TENSORS**

Tensors generalize scalars, vectors, and matrices to an arbitrary number of dimensions.
They are data containers for array-like structures.

In [4]:
import torch

# 0 dimensional tensor
tensor0d = torch.tensor(1)

# 1 dimensional tensor
tensor1d = torch.tensor([1,2,3])

# 2 dimensional tensor
tensor2d = torch.tensor([[1,2],[3,4]])

# 3 dimensional tensor
tensor3d = torch.tensor([[[1,2],[3,4]],[[5,6],[7,8]]])

In [None]:
# Data types: both tensors are 64-bit integers, the default PyTorch picks for Python integers. 64 bits gives a wider value range than 32 bits but uses twice the memory.
print(tensor0d.dtype, tensor1d.dtype)

torch.int64 torch.int64
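
The dtype depends on the Python values the tensor is built from. As a small sketch (the variable names are illustrative), Python floats produce 32-bit floats by default, `.to()` converts between dtypes, and `element_size()` shows the memory cost per element in bytes:

```python
import torch

int_tensor = torch.tensor([1, 2, 3])          # Python ints   -> torch.int64
float_tensor = torch.tensor([1.0, 2.0, 3.0])  # Python floats -> torch.float32 (default)
print(int_tensor.dtype, float_tensor.dtype)

# Convert the integer tensor to 32-bit floats.
converted = int_tensor.to(torch.float32)
print(converted.dtype)

# Bytes used per element: 8 for int64, 4 for float32.
print(int_tensor.element_size(), converted.element_size())
```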


In [12]:
# Operations

# Obtaining the tensor
print(tensor0d)

# Obtaining the size
print(tensor2d.shape)

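# Reshaping the tensor (the number of elements must stay the same: 2*2*2 = 4*2)
print(tensor3d.reshape(4,2))
# .view(4,2) is the more common choice here; it never copies data but requires a compatible memory layout

# Transposing the tensor: .T on a tensor that is not 2-dimensional reverses all dimensions but is deprecated,
# so the equivalent permute(2, 1, 0) is used instead
print(tensor3d.permute(2, 1, 0))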


tensor(1)
torch.Size([2, 2])
tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])
tensor([[[1, 5],
         [3, 7]],

        [[2, 6],
         [4, 8]]])
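
The difference between `.view` and `.reshape` only shows up on non-contiguous tensors. The following is a minimal sketch, assuming the same `tensor3d` as above; the helper names `shares_memory` and `view_failed` are illustrative.

```python
import torch

tensor3d = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# On a contiguous tensor, view returns a new shape over the same memory.
viewed = tensor3d.view(4, 2)
shares_memory = viewed.data_ptr() == tensor3d.data_ptr()
print(shares_memory)

# Reordering the dimensions makes the tensor non-contiguous.
permuted = tensor3d.permute(2, 1, 0)
print(permuted.is_contiguous())

# view cannot express this layout without copying, so it raises an error ...
try:
    permuted.view(4, 2)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)

# ... while reshape falls back to copying the data.
reshaped = permuted.reshape(4, 2)
print(reshaped)
```

In short, `.view` is a good default when the tensor is known to be contiguous, and `.reshape` is the safer choice when it might not be.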




**A.3 Seeing models as computation graphs**

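Autograd is PyTorch's built-in automatic differentiation engine. It records the operations applied to tensors in a computation graph and uses that graph to compute gradients automatically.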

In [None]:
# Logistic regression classifier

import torch.nn.functional as F

y = torch.tensor([1.0]) # label
x1 = torch.tensor([1.1]) # input
w1 = torch.tensor([2.2]) # weight
b = torch.tensor([0.0]) # bias
z = x1 * w1 + b # net input: weighted input plus bias
a = torch.sigmoid(z) # sigmoid activation: squashes any number into the range (0, 1)
loss = F.binary_cross_entropy(a, y) # loss: how wrong the prediction is
print(loss)

tensor(0.0852)
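
To see where 0.0852 comes from, the same forward pass can be reproduced with plain Python. This is a sketch using the standard sigmoid and binary cross-entropy formulas with the values from the cell above:

```python
import math

y, x1, w1, b = 1.0, 1.1, 2.2, 0.0

z = x1 * w1 + b                      # net input
a = 1.0 / (1.0 + math.exp(-z))       # sigmoid activation
loss = -(y * math.log(a) + (1.0 - y) * math.log(1.0 - a))  # binary cross-entropy

print(round(loss, 4))
```

Because the label is 1, only the `y * log(a)` term contributes, so the loss is simply the negative log of the predicted probability.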


**A.4 Automatic Differentiation Made Easy**

Setting requires_grad=True on a tensor makes PyTorch track the operations applied to it and build a computation graph internally, which is what we need to compute gradients.
Gradients are the partial derivatives of the loss with respect to each parameter, obtained by applying the chain rule from right to left (from the loss back to the inputs) through the computation graph.
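
For the logistic regression example from A.3, the chain rule can be written out by hand. The derivation below is a sketch that assumes the standard definitions of the sigmoid $\sigma$ and of binary cross-entropy; the symbols match the variable names used in the code.

```latex
\begin{aligned}
z &= x_1 w_1 + b, \qquad a = \sigma(z), \qquad
L = -\bigl[\, y \ln a + (1 - y) \ln (1 - a) \,\bigr] \\[4pt]
\frac{\partial L}{\partial w_1}
  &= \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial w_1}
   = \frac{a - y}{a (1 - a)} \cdot a (1 - a) \cdot x_1
   = (a - y)\, x_1 \\[4pt]
\frac{\partial L}{\partial b}
  &= \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial b}
   = (a - y)
\end{aligned}
```

With $a \approx 0.9183$, $y = 1$, and $x_1 = 1.1$, this gives roughly $-0.0898$ for $w_1$ and $-0.0817$ for $b$.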

In [18]:
# Computing gradients with autograd
import torch.nn.functional as F
from torch.autograd import grad


y = torch.tensor([1.0])
x1 = torch.tensor([1.1])
w1 = torch.tensor([2.2], requires_grad=True) # trainable parameter: requires_grad=True tracks operations on it
b = torch.tensor([0.0], requires_grad=True)
z = x1 * w1 + b
a = torch.sigmoid(z)
loss = F.binary_cross_entropy(a, y)

grad_L_w1 = grad(loss, w1, retain_graph=True) # loss is a scalar measuring the model's error; grad returns a tuple with one gradient per input
grad_L_b = grad(loss, b, retain_graph=True) # retain_graph=True keeps the computation graph in memory so it can be traversed again later

print(grad_L_w1)
print(grad_L_b)

# loss.backward() computes the gradients for all leaf tensors with requires_grad=True at once and stores them in their .grad attributes
loss.backward()
b.grad


(tensor([-0.0898]),)
(tensor([-0.0817]),)


tensor([-0.0817])
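
The autograd results can be cross-checked against the closed-form gradients that the chain rule gives for a sigmoid followed by binary cross-entropy, namely `(a - y) * x1` for the weight and `(a - y)` for the bias. This is a sketch; the names `manual_grad_w1` and `manual_grad_b` are illustrative.

```python
import torch
import torch.nn.functional as F

y = torch.tensor([1.0])
x1 = torch.tensor([1.1])
w1 = torch.tensor([2.2], requires_grad=True)
b = torch.tensor([0.0], requires_grad=True)

# Forward pass and backward pass with autograd.
loss = F.binary_cross_entropy(torch.sigmoid(x1 * w1 + b), y)
loss.backward()

# Closed-form gradients, computed without tracking operations.
with torch.no_grad():
    a = torch.sigmoid(x1 * w1 + b)
    manual_grad_w1 = (a - y) * x1
    manual_grad_b = a - y

print(w1.grad, manual_grad_w1)
print(b.grad, manual_grad_b)
```

Both pairs should agree up to floating-point rounding.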

**A.5 Implementing Multilayer Neural Networks**