## 00. PyTorch Fundamentals


In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

print(torch.__version__)

2.3.1+cu121


## Intro to Tensors

### Creating Tensors

In [None]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
# get number out of scalar tensor
scalar.item()

7

In [None]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
# Matrix - capitalize MATRIX var names
MATRIX = torch.tensor([[7, 8],
                       [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [None]:
MATRIX.ndim

2

In [None]:
MATRIX[0][1]

tensor(8)

In [None]:
MATRIX.shape

torch.Size([2, 2])

In [None]:
# Tensor - capitalize TENSOR var names
TENSOR = torch.tensor([[[1, 2, 3],
                        [3, 6, 9],
                        [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

In [None]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]])

### Random Tensors

Why random tensors?

Random tensors are important because many neural networks start from randomly initialized weights, which are then adjusted during training.

`Start with random numbers -> look at data -> update numbers -> look at data -> ...`
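The loop described above can be sketched with a single random weight. This is only an illustration: the toy data (`y = 2 * x`), the learning rate of `0.05`, and the 100 steps are assumptions chosen to keep the example small, and the update rule is plain gradient descent rather than anything specific to this notebook.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

# Toy data following y = 2 * x (an assumption made for illustration)
x = torch.arange(0., 5.)
y = 2 * x

# Start with a random number...
weight = torch.rand(1, requires_grad=True)
initial_loss = ((weight * x - y) ** 2).mean().item()

# ...then look at the data and update the number, over and over
for _ in range(100):
    loss = ((weight * x - y) ** 2).mean()  # how wrong is the current weight?
    loss.backward()                        # gradient of the loss w.r.t. weight
    with torch.no_grad():
        weight -= 0.05 * weight.grad       # nudge the weight against the gradient
    weight.grad.zero_()                    # reset the gradient for the next step

final_loss = ((weight * x - y) ** 2).mean().item()
print(initial_loss, final_loss, weight.item())
```

The weight starts somewhere in `[0, 1)` and ends close to `2`, the value that generated the data.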

In [None]:
random_tensor = torch.rand(1, 3, 4)
random_tensor

tensor([[[0.3344, 0.6149, 0.3963, 0.7457],
         [0.6621, 0.4232, 0.3409, 0.0813],
         [0.2755, 0.4578, 0.9696, 0.9533]]])

In [None]:
# create random tensor with similar shape to image tensor
random_image_size_tensor = torch.rand(size=(224, 224, 3)) # height, width, color channels
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

### Zeros and Ones

In [None]:
# create a tensor of zeros
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
# zero out a tensor
random_tensor * zeros

tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])

In [None]:
# create a tensor of ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [None]:
ones.dtype

torch.float32

### Creating a range and tensors like other tensors

In [None]:
# arange
zero_to_nine = torch.arange(start=0, end=10, step=1)
zero_to_nine

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
# create a tensor of zeros with the same shape and dtype as another tensor
ten_zeros = torch.zeros_like(input=zero_to_nine)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor Data Types

**NOTE** - three of the biggest causes of errors in PyTorch:

1. Tensors not the right datatype
2. Tensors not in the right shape
3. Tensors not on the right device
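When one of these three errors appears, a quick first check is to print the matching attribute of each tensor involved. A minimal sketch, where `some_tensor` is a hypothetical name used only for this example:

```python
import torch

some_tensor = torch.rand(3, 4)

print(f"Datatype: {some_tensor.dtype}")   # 1. is it the right datatype?
print(f"Shape: {some_tensor.shape}")      # 2. is it the right shape?
print(f"Device: {some_tensor.device}")    # 3. is it on the right device?
```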

In [None]:
# Float 32
float_32_tensor = torch.tensor([3., 6., 9.],
                               dtype=None, # data type of the tensor (None infers it from the data; floats become torch.float32)
                               device=None, # where the tensor lives (None uses the default device, normally "cpu"; alt "cuda")
                               requires_grad=False) # whether PyTorch should track gradients for operations on this tensor
float_32_tensor

tensor([3., 6., 9.])

In [None]:
float_32_tensor.dtype

torch.float32

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [None]:
float_16_tensor.dtype

torch.float16

In [None]:
# some operations work across datatypes: PyTorch promotes the result to the wider type (float32 here)
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])
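The result above has no `dtype=` in its printout, which suggests it was promoted to the default `float32`. The sketch below checks that directly and also shows one case where mixing datatypes is refused. The names `float_16`, `float_32`, and `int_64` are made up for this example.

```python
import torch

float_16 = torch.tensor([3., 6., 9.], dtype=torch.float16)
float_32 = torch.tensor([3., 6., 9.], dtype=torch.float32)
int_64 = torch.tensor([3, 6, 9])  # integers default to int64

# Element-wise operations promote to the wider type
mixed_floats = float_16 * float_32
int_times_float = int_64 * float_32
print(mixed_floats.dtype, int_times_float.dtype)

# Matrix multiplication does not promote: both inputs must share a dtype
try:
    torch.matmul(int_64, float_32)
    matmul_failed = False
except RuntimeError as error:
    matmul_failed = True
    print(f"matmul refused mixed dtypes: {error}")
```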

In [None]:
float_16_tensor.device

device(type='cpu')

## Tensor Operations

#### Operations include
1. Addition
2. Subtraction
3. Multiplication (element-wise)
4. Multiplication (matrix)
5. Division



**In-place operations** Operations that store the result into the
operand are called in-place. They are denoted by a `_` suffix. For
example: `x.copy_(y)`, `x.t_()`, will change `x`.


<div style="background-color: #54c7ec; color: #fff; font-weight: 700; padding-left: 10px; padding-top: 5px; padding-bottom: 5px"><strong>NOTE:</strong></div>
<div style="background-color: #f3f4f7; padding-left: 10px; padding-top: 10px; padding-bottom: 10px; padding-right: 10px">
<p>In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.</p>
</div>
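A short sketch of the difference, using `add` and its in-place counterpart `add_`. The variable names are chosen only for this example.

```python
import torch

x = torch.tensor([1, 2, 3])

# Out-of-place: returns a new tensor and leaves x alone
y = x.add(10)
x_after_add = x.clone()   # snapshot of x to show it is unchanged
print(x, y)

# In-place (note the trailing underscore): overwrites x itself
x.add_(10)
print(x)
```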

In [None]:
# create a tensor
tensor = torch.tensor([1, 2, 3])

# addition
tensor + 10

tensor([11, 12, 13])

In [None]:
# multiplication (element-wise)
tensor * 10
tensor * tensor

tensor([1, 4, 9])

In [None]:
# subtraction
tensor - 10

tensor([-9, -8, -7])

In [None]:
# PyTorch also has built-in functions such as torch.mul, but the Python operators are usually more readable
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [None]:
# division
tensor / 10

tensor([0.1000, 0.2000, 0.3000])

### Matrix multiplication

Two ways of performing multiplication:
1. Simple, element-wise
2. Matrix multiplication (perhaps the most common operation in neural networks; for two 1-D tensors it is the dot product)

In [None]:
# 1. Element-wise
tensor * tensor

tensor([1, 4, 9])

In [None]:
# 2. Matrix multiplication --> the @ operator also works, but torch.matmul is more explicit
torch.matmul(tensor, tensor)

tensor(14)

In [None]:
tensor @ tensor

tensor(14)

In [None]:
%%time
# the %%time cell magic must be the first line of the cell
# matmul is usually much faster than an equivalent Python loop
torch.matmul(tensor, tensor)

CPU times: user 349 µs, sys: 63 µs, total: 412 µs
Wall time: 345 µs


tensor(14)
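For comparison, here is the same calculation written as an explicit Python loop. It is a sketch to show that both approaches give the same answer. Timings depend on the machine, so none are quoted here; prefixing each version with `%%time` in its own cell is one way to compare them.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Iterative version: multiply matching elements and accumulate the sum
loop_result = 0
for i in range(len(tensor)):
    loop_result += tensor[i] * tensor[i]

# Vectorized version: one call does all the work
matmul_result = torch.matmul(tensor, tensor)

print(loop_result, matmul_result)
```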

### One of the most common errors in DL is shape errors.

Remember!
1. The inner dimensions must match, i.e. in `(n, m) @ (m, p)` both inner dimensions are `m`
2. The resulting matrix takes the shape of the outer dimensions, i.e. `(n, m) @ (m, p)` has shape `(n, p)`
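Both rules can be checked on two small random matrices. In this sketch `A` and `B` are hypothetical names, and the transpose `.T` is one way to make the inner dimensions line up.

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(3, 2)

# (3, 2) @ (3, 2): inner dimensions are 2 and 3, so this fails
try:
    torch.matmul(A, B)
    shapes_compatible = True
except RuntimeError as error:
    shapes_compatible = False
    print(f"Shape error: {error}")

# (3, 2) @ (2, 3): inner dimensions match, result takes the outer dimensions
print(torch.matmul(A, B.T).shape)  # torch.Size([3, 3])

# (2, 3) @ (3, 2): also valid, but the result is a different shape
print(torch.matmul(A.T, B).shape)  # torch.Size([2, 2])
```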

In [None]:
ten_A = torch.tensor([[1, 2],
                      [3, 4],
                      [5, 6]])

ten_B = torch.tensor([[7, 8],
                      [9, 10],
                      [11, 12]])

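# the inner dimensions must match, so transpose ten_B: (3, 2) @ (2, 3) -> (3, 3)
# torch.mm only handles 2-D matrices (no broadcasting); torch.matmul is the general version
torch.mm(ten_A, ten_B.T)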

tensor([[ 23,  29,  35],
        [ 53,  67,  81],
        [ 83, 105, 127]])
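For 2-D inputs like the ones above, `torch.mm` and `torch.matmul` give the same result. They differ once a batch dimension is involved: `torch.matmul` broadcasts over it, while `torch.mm` accepts only 2-D matrices. A small sketch, where `batch` and `weights` are made-up names:

```python
import torch

ten_A = torch.tensor([[1, 2], [3, 4], [5, 6]])
ten_B = torch.tensor([[7, 8], [9, 10], [11, 12]])

# Same answer for plain 2-D matrices
same_for_2d = torch.equal(torch.mm(ten_A, ten_B.T), torch.matmul(ten_A, ten_B.T))
print(same_for_2d)

# A batch of ten (3, 4) matrices multiplied by one (4, 5) matrix
batch = torch.rand(10, 3, 4)
weights = torch.rand(4, 5)
batched = torch.matmul(batch, weights)
print(batched.shape)  # torch.Size([10, 3, 5])

# torch.mm rejects the 3-D input
try:
    torch.mm(batch, weights)
    mm_accepts_3d = True
except RuntimeError as error:
    mm_accepts_3d = False
    print(f"torch.mm error: {error}")
```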

## Tensor Aggregation - min, max, mean, sum, etc

In [None]:
x = torch.arange(0., 100., 10.)
x

tensor([ 0., 10., 20., 30., 40., 50., 60., 70., 80., 90.])

In [None]:
x.min(), torch.min(x)

(tensor(0.), tensor(0.))

In [None]:
torch.argmin(x)

tensor(0)

In [None]:
x.max(), torch.max(x)

(tensor(90.), tensor(90.))

In [None]:
torch.argmax(x)

tensor(9)

In [None]:
x.mean(), torch.mean(x)

(tensor(45.), tensor(45.))

In [None]:
x.sum(), torch.sum(x)

(tensor(450.), tensor(450.))

### Reshaping, stacking, squeezing, and unsqueezing tensors

* Reshape - reshapes an input tensor to a new shape with the same number of elements
* View - returns a view of an input tensor with a new shape; the view shares memory with the original tensor
* Stack - combines multiple tensors along a new dimension chosen with the `dim` argument (`torch.vstack` and `torch.hstack` are related helpers that stack vertically or horizontally)
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a tensor at a given position
* Permute - returns a view of a tensor with its dimensions permuted (reordered)

In [None]:
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# Must be compatible with original tensor dimensions
x_reshaped = x.reshape(9, 1)
x_reshaped

tensor([[1.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])

In [None]:
# change the view - aka make a new alias for the same allocated memory
# RECALL that a view shares memory with the original tensor,
# so modifying z will modify x, and vice versa
z = x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
z[:, 0] = 5
x, z

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]))

In [None]:
# stack em!
x_stacked = torch.stack([x, x, x, x]) # default is dim=0 --> same result as torch.vstack for 1-D tensors
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [None]:
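# stack em along dim=1: each tensor becomes a column
x_stacked = torch.stack([x, x, x, x], dim=1) # --> like torch.column_stack (torch.hstack would give one long 1-D tensor)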
x_stacked

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])
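`torch.stack` always creates a new dimension, whereas `torch.hstack` joins 1-D tensors end to end. Comparing the output shapes makes the difference visible. This sketch uses a fresh `x` so it does not depend on the earlier cells.

```python
import torch

x = torch.arange(1., 10.)  # shape: [9]

print(torch.stack([x, x, x, x], dim=0).shape)   # torch.Size([4, 9])
print(torch.vstack([x, x, x, x]).shape)         # torch.Size([4, 9])

print(torch.stack([x, x, x, x], dim=1).shape)   # torch.Size([9, 4])
print(torch.column_stack([x, x, x, x]).shape)   # torch.Size([9, 4])

print(torch.hstack([x, x, x, x]).shape)         # torch.Size([36])
```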

In [None]:
# squeeze - removes all dimensions of size 1 (first, look at the tensor before squeezing)
x_reshaped, x_reshaped.shape

(tensor([[5.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [None]:
x_reshaped.squeeze(), x_reshaped.squeeze().shape

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
x_reshaped.unsqueeze(dim=0), x_reshaped.unsqueeze(dim=0).shape

(tensor([[[5.],
          [2.],
          [3.],
          [4.],
          [5.],
          [6.],
          [7.],
          [8.],
          [9.]]]),
 torch.Size([1, 9, 1]))

In [None]:
# permute rearranges the dimensions of a tensor in a specified order and returns a view
# RECALL - a view shares memory with the original tensor
x_original = torch.rand(size=(224, 224, 3)) # height, width, color channels

# permute the original tensor to rearrange the axis order
# to color channels, height, width
x_permuted = x_original.permute(2, 0, 1) # old dims move as 0->1, 1->2, 2->0

x_original.shape, x_permuted.shape

(torch.Size([224, 224, 3]), torch.Size([3, 224, 224]))
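Because `permute` returns a view, writing to the original tensor also changes the permuted one. A small check of that claim (the value `42.0` is arbitrary):

```python
import torch

x_original = torch.rand(size=(224, 224, 3))   # height, width, color channels
x_permuted = x_original.permute(2, 0, 1)      # color channels, height, width

# Write through the original tensor...
x_original[0, 0, 0] = 42.0

# ...and the same element shows up in the permuted view
print(x_original[0, 0, 0], x_permuted[0, 0, 0])

# Both tensors point at the same underlying storage
shares_memory = x_original.data_ptr() == x_permuted.data_ptr()
print(shares_memory)
```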

## Indexing - Data Selection

### Standard NumPy-like indexing & slicing

In [None]:
tensor = torch.ones(4, 4)
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)

In [None]:
# Create tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# element indexing
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
x[0][0]

tensor([1, 2, 3])

In [None]:
x[0][0][0]

tensor(1)

In [None]:
# slicing
x[0][0:2]

tensor([[1, 2, 3],
        [4, 5, 6]])

In [None]:
x[0][1:3][1]

tensor([7, 8, 9])

In [None]:
x[0][1:3][1][2]

tensor(9)

In [None]:
x[0][:, 1] # use ":" plus a comma to select a whole column (all rows, column 1)

tensor([2, 5, 8])

In [None]:
x[:, 1, 1]

tensor([5])

In [None]:
x[0, 0, :]

tensor([1, 2, 3])
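The chained brackets used above can also be written as a single comma-separated index, which is often easier to read. The sketch below confirms that each pair selects the same values.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# Each pair of expressions selects the same data
print(torch.equal(x[0][0][0], x[0, 0, 0]))        # single element
print(torch.equal(x[0][1:3][1], x[0, 2]))         # last row: [7, 8, 9]
print(torch.equal(x[0][1:3][1][2], x[0, 2, 2]))   # last element: 9
print(torch.equal(x[0][:, 1], x[0, :, 1]))        # middle column: [2, 5, 8]
```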

### PyTorch tensors and NumPy

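* PyTorch is not built on NumPy, but it is designed to interoperate with it
* You can start with data in a NumPy array and convert it to a tensor
* A tensor made with `torch.from_numpy` and an array made with `Tensor.numpy()` **share the same memory** as their source (CPU tensors only), unless a later step such as a dtype cast makes a copy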
* `torch.from_numpy(ndarray)`
* `torch.Tensor.numpy()`
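The memory sharing is easy to observe by changing one side and printing the other. In this sketch `shared` and `copied` are hypothetical names, and the `float32` cast stands in for any step that forces a copy.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)

shared = torch.from_numpy(array)                      # float64, shares memory with array
copied = torch.from_numpy(array).type(torch.float32)  # the dtype cast makes a copy

array[0] = 100.0
print(shared[0], copied[0])  # shared sees the change, copied does not

# The same holds in the other direction
tensor = torch.ones(3)
numpy_view = tensor.numpy()   # shares memory with tensor
tensor.add_(1)                # in-place change to the tensor
print(numpy_view)
```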

In [None]:
array = np.arange(1.0, 8.0)
# NumPy defaults to float64, so cast to float32 after converting (the cast makes a copy)
tensor = torch.from_numpy(array).type(torch.float32)
tensor, tensor.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

In [None]:
tensor = torch.arange(0.,10.).reshape(2,5) # default datatype of PyTorch is float32
tensor.numpy(), tensor.numpy().dtype

(array([[0., 1., 2., 3., 4.],
        [5., 6., 7., 8., 9.]], dtype=float32),
 dtype('float32'))

## PyTorch Reproducibility

Set a random seed so that random operations give the same results on every run:


```
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
```

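NOTE - the seed only fixes the sequence of random numbers from the point where it is set, so call `torch.manual_seed` again before each random operation you want to reproduce (in a notebook, at the top of each such cell)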

In [None]:
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)

torch.rand(3,3)

tensor([[0.8823, 0.9150, 0.3829],
        [0.9593, 0.3904, 0.6009],
        [0.2566, 0.7936, 0.9408]])
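The effect of re-seeding can be seen by drawing three tensors: two in a row after one seed, and a third after setting the seed again. The names `a`, `b`, and `c` are just for this sketch.

```python
import torch

RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
a = torch.rand(3, 3)
b = torch.rand(3, 3)   # no re-seed: the generator has moved on, so b differs from a

torch.manual_seed(RANDOM_SEED)
c = torch.rand(3, 3)   # re-seeded: the sequence restarts, so c matches a

print(torch.equal(a, b), torch.equal(a, c))
```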

## Running PyTorch Tensors and Objects on GPUs

### Set up device-agnostic code: prefer CUDA, fall back to CPU
https://pytorch.org/docs/stable/notes/cuda.html#best-practices

In [None]:
# Check nvidia GPU access
!nvidia-smi

Fri Jul 26 00:45:43 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   48C    P8               9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

In [None]:
# Check PyTorch access to a GPU
torch.cuda.is_available()

True

In [None]:
# set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# count the number of devices
torch.cuda.device_count()

1

### Putting tensors and models on the GPU

In [None]:
# Create tensor, it defaults to the cpu
# NOTE - this is why we set up the device variable (cuda if available)
tensor = torch.tensor([1,2,3])
tensor, tensor.device

(tensor([1, 2, 3]), device(type='cpu'))

In [None]:
tensor_gpu = tensor.to(device) # to(device) method moves objects between devices
tensor_gpu, tensor_gpu.device

(tensor([1, 2, 3], device='cuda:0'), device(type='cuda', index=0))

In [None]:
# move tensors back to the cpu
tensor_cpu = tensor_gpu.cpu()
tensor_cpu, tensor_cpu.device

(tensor([1, 2, 3]), device(type='cpu'))

In [None]:
# NumPy only works with tensors on the CPU
tensor_cpu.numpy()

array([1, 2, 3])
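Putting the pieces together, one pattern that works whether or not a GPU is present is to move the tensor to whichever device is available, do the work there, and call `.cpu()` before converting to NumPy. This is a sketch of that pattern rather than a required recipe, and `tensor_on_device` is a name made up for the example.

```python
import torch

# Use the GPU when there is one, otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

tensor_on_device = torch.tensor([1, 2, 3]).to(device)
doubled = tensor_on_device * 2   # computation runs on whichever device was chosen

# .cpu() is harmless if the tensor is already on the CPU
array = doubled.cpu().numpy()
print(array, doubled.device)
```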