<a href="https://colab.research.google.com/github/JWiseman-git/ml_sandbox/blob/main/pytorch_recap.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
import torch

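Tensor - a multi-dimensional array, e.g. of size i x j x k, with elements $T_{ijk}$

Reductive view - a higher-order generalisation of a matrix

Formal view - an object that is independent of the underlying coordinate system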


In [5]:
scalar = torch.tensor(3)
scalar.ndim

0

In [6]:
vector = torch.tensor([1,2])
vector.ndim

1

In [7]:
matrix = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
matrix.shape

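# shape[0], shape[1], shape[2] -> sizes of the first, second and third dims
# (the extra pair of outer brackets makes this a 3-D tensor, not a 2-D matrix)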

torch.Size([1, 3, 3])
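
To make the shape readout concrete, here is a minimal sketch comparing a 2-D matrix with the same values wrapped in one extra dimension. The names `matrix_2d` and `tensor_3d` are illustrative and not part of the cells above.

```python
import torch

# Two pairs of brackets -> 2 dimensions (a true matrix)
matrix_2d = torch.tensor([[1, 2, 3],
                          [3, 6, 9],
                          [2, 4, 5]])

# unsqueeze(0) adds a leading dimension of size 1,
# which is equivalent to wrapping the data in a third pair of brackets
tensor_3d = matrix_2d.unsqueeze(0)

print(matrix_2d.ndim, matrix_2d.shape)
print(tensor_3d.ndim, tensor_3d.shape)

# shape can be indexed like a tuple: dim 0, dim 1, dim 2
dim0, dim1, dim2 = tensor_3d.shape[0], tensor_3d.shape[1], tensor_3d.shape[2]
print(dim0, dim1, dim2)
```

Counting the opening brackets at the start of the literal is a quick way to predict `ndim`.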

In [8]:
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.1053, 0.2695, 0.3588, 0.1994],
        [0.5472, 0.0062, 0.9516, 0.0753],
        [0.8860, 0.5832, 0.3376, 0.8090]])

In [9]:
# Create a random tensor of size (224, 224, 3)
random_image_size_tensor = torch.rand(size=(224, 224, 3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

In [10]:
# torch.zeros - useful for masking purposes
zeros = torch.zeros(size=(3, 4))
zeros, zeros.dtype

(tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]),
 torch.float32)
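
One way a zeros tensor might be used for masking is sketched below. The choice of which columns to keep and the names `values`, `mask` and `masked` are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)
values = torch.rand(3, 4)

# Start from all zeros, then switch on the entries we want to keep
mask = torch.zeros(3, 4)
mask[:, :2] = 1.0  # keep the first two columns only

# Element-wise multiplication zeroes out everything the mask does not cover
masked = values * mask
print(masked)
```

`torch.zeros_like(values)` would build the same starting mask while copying the shape, dtype and device from `values`.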

- The floating-point dtype (e.g. `float16`, `float32`, `float64`) sets the precision with which each value is stored, and therefore the amount of memory used to represent it
- Every tensor is allocated on a specific device, which determines which hardware's memory holds it
- Tensors on a GPU live in VRAM; tensors on the CPU live in RAM
- Specifying the device at the start of a script:



```
import torch

# 1. Check for GPU (CUDA) or Apple Silicon (MPS)
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps") # For Mac M1/M2/M3 chips
else:
    device = torch.device("cpu")

print(f"Using device: {device}")

# 2. Use the 'device' variable for all future initializations
data = torch.ones((3, 3), device=device)
```
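
An existing tensor can also be moved to the chosen device after creation. The sketch below assumes only CUDA or CPU is available, so that it runs anywhere; the variable names are illustrative.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cpu_tensor = torch.ones((3, 3))   # created on the CPU by default
moved = cpu_tensor.to(device)     # returns a tensor on `device` (a copy if the device differs)

print(cpu_tensor.device, moved.device)

# Operands of an operation must live on the same device
result = moved + torch.ones((3, 3), device=device)
print(result.device)
```

Mixing a CPU tensor with a GPU tensor in one operation raises an error, which is why a single `device` variable is usually threaded through the whole script.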



In [11]:
float_16_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype=torch.float16) # torch.half would also work

float_16_tensor.dtype

torch.float16
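
The trade-off between precision and memory can be checked directly. This is a small sketch using `element_size()`, which reports the bytes used per element; the names `sizes`, `x16` and `x32` are illustrative.

```python
import torch

# Bytes per element for each floating-point dtype
sizes = {}
for dtype in (torch.float16, torch.float32, torch.float64):
    t = torch.tensor([3.0, 6.0, 9.0], dtype=dtype)
    sizes[dtype] = t.element_size()
    print(dtype, sizes[dtype], "bytes per element")

# Fewer bits also means a coarser approximation of values such as 0.1
x16 = torch.tensor(0.1, dtype=torch.float16)
x32 = torch.tensor(0.1, dtype=torch.float32)
err16 = abs(x16.item() - 0.1)
err32 = abs(x32.item() - 0.1)
print(err16, err32)
```

Halving the precision halves the memory per element, at the cost of a larger rounding error.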

Basic Operations

- inner dimensions of a matrix must match to allow for multiplication
- for example (3, 2) @ (2, 3)

The Dimensional Rule

If matrix $A$ has dimensions $(m \times n)$ and matrix $B$ has dimensions $(p \times q)$, they can only be multiplied if: $$n = p$$ The resulting matrix $AB$ will have the dimensions $(m \times q)$.
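
The rule can be checked on small random tensors. A minimal sketch, with shapes chosen purely for illustration:

```python
import torch

A = torch.rand(3, 2)  # (m x n) = (3 x 2)
B = torch.rand(2, 4)  # (p x q) = (2 x 4), so n == p

C = A @ B
print(C.shape)        # (m x q)

# Mismatched inner dimensions: (3 x 2) @ (3 x 4) has n = 2 but p = 3
bad = torch.rand(3, 4)
failed = False
try:
    A @ bad
except RuntimeError as err:
    failed = True
    print(f"matmul failed: {err}")
```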

When are Transposed Matrices used?

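While it isn't a requirement, there are specific scenarios where you multiply a matrix by its transpose ($A \times A^T$ or $A^T \times A$). Both products are always defined, because transposing swaps the dimensions so that the inner ones match. A transpose is also used to make two otherwise incompatible shapes multiply, as with `tensor_B.T` further down.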
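
As one illustration (not necessarily among the scenarios the note had in mind), the sketch below shows that both products with the transpose are always defined and always symmetric. The names `gram_cols` and `gram_rows` are illustrative.

```python
import torch

A = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])  # shape (3, 2)

gram_cols = A.T @ A  # (2, 3) @ (3, 2) -> (2, 2)
gram_rows = A @ A.T  # (3, 2) @ (2, 3) -> (3, 3)

print(gram_cols)
print(gram_rows.shape)

# Each product equals its own transpose
print(torch.equal(gram_cols, gram_cols.T), torch.equal(gram_rows, gram_rows.T))
```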

In [12]:
# matrix multiplication
# (store the result under its own name; assigning to `torch` would shadow the module)
result = torch.matmul(torch.rand(3, 2), torch.rand(2, 3))
result

tensor([[0.2055, 0.5883, 1.0170],
        [0.5719, 1.3500, 1.2323],
        [0.4267, 0.9260, 0.4668]])

In [16]:
import torch

# Inner dimensions must match: (3, 2) @ (3, 2) fails, (3, 2) @ (2, 3) works
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

# The operation works when tensor_B is transposed
print(f"Original shapes: tensor_A = {tensor_A.shape}, tensor_B = {tensor_B.shape}\n")
print(f"New shapes: tensor_A = {tensor_A.shape} (same as above), tensor_B.T = {tensor_B.T.shape}\n")
print(f"Multiplying: {tensor_A.shape} * {tensor_B.T.shape} <- inner dimensions match\n")
print("Output:\n")
output = torch.matmul(tensor_A, tensor_B.T)
print(output)
print(f"\nOutput shape: {output.shape}")

Original shapes: tensor_A = torch.Size([3, 2]), tensor_B = torch.Size([3, 2])

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B.T = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimensions match

Output:

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

Output shape: torch.Size([3, 3])


In [17]:
# torch.mm gives the same result for 2-D tensors (unlike torch.matmul, it does not broadcast)
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])
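
The difference between `torch.mm`, `torch.matmul` and the `@` operator only shows up beyond two dimensions. A short sketch, with the batch size of 4 chosen arbitrarily:

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(2, 3)

# For 2-D inputs all three spellings agree
same_2d = (torch.allclose(torch.mm(A, B), torch.matmul(A, B))
           and torch.allclose(A @ B, torch.matmul(A, B)))
print(same_2d)

# matmul broadcasts over a leading batch dimension
batch = torch.rand(4, 3, 2)
batched_out = torch.matmul(batch, B)  # (4, 3, 2) @ (2, 3) -> (4, 3, 3)
print(batched_out.shape)

# torch.mm only accepts 2-D tensors
try:
    torch.mm(batch, B)
    mm_rejected_batch = False
except RuntimeError:
    mm_rejected_batch = True
print(mm_rejected_batch)
```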

- A feed-forward network applies a sequence of matrix multiplications, one per layer, each followed by adding a bias
- Fully connected/dense layer - every neuron connects to every neuron in the previous layer

In [19]:
import torch
torch.manual_seed(42)  # fix the RNG seed so the randomly initialised weights are reproducible
# nn.Linear computes y = x @ A.T + b for an input x, a weight matrix A and a bias b
linear = torch.nn.Linear(in_features=2,  # in_features = size of the last dimension of the input
                         out_features=6) # out_features = size of the last dimension of the output
x = tensor_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 6])
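
To connect the layer back to plain matrix multiplication, the sketch below reproduces the output of `nn.Linear` by hand from its `weight` and `bias` attributes. The input is redefined here so the block runs on its own; `manual` and `matches` are illustrative names.

```python
import torch

torch.manual_seed(42)
linear = torch.nn.Linear(in_features=2, out_features=6)

x = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])  # shape (3, 2)

# The weight is stored as (out_features, in_features), hence the transpose
print(linear.weight.shape, linear.bias.shape)

# (3, 2) @ (2, 6) -> (3, 6), then the bias is broadcast across the rows
manual = x @ linear.weight.T + linear.bias
matches = torch.allclose(linear(x), manual)
print(matches)
```

This is the same transpose trick used earlier: the stored weight has shape (6, 2), so it is transposed to make the inner dimensions match the input.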
