## PyTorch fundamentals
In Colab, enable the GPU first: Runtime -> Change runtime type -> GPU

In [None]:
!nvidia-smi

Tue Apr 11 06:30:40 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   46C    P8     9W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [None]:
import torch
print(torch.__version__)

2.0.0+cu118


In [None]:
!ls -l /usr/local | grep cuda

lrwxrwxrwx 1 root root   22 Feb  2 05:19 cuda -> /etc/alternatives/cuda
lrwxrwxrwx 1 root root   25 Feb  2 05:19 cuda-11 -> /etc/alternatives/cuda-11
drwxr-xr-x 1 root root 4096 Feb  2 05:35 cuda-11.8


## Tensors
PyTorch tensors - https://pytorch.org/docs/stable/tensors.html

In [None]:
# scalar
# torch.Tensor is the main tensor class; an int argument is read as a SIZE, so this
# allocates 7 uninitialized values of the default dtype (torch.get_default_dtype())
notscalar = torch.Tensor(7)
print(notscalar)
print(type(notscalar))
print(notscalar.ndim)
scalar = torch.tensor(7)  # torch.tensor is a function that builds a tensor from the given data
print(scalar)
print(type(scalar))
print(scalar.ndim)



tensor([ 1.9594e-25,  4.5850e-41,  1.9637e-25,  4.5850e-41,  4.6102e-04,
        -6.8768e+23,  1.9594e-25])
<class 'torch.Tensor'>
1
tensor(7)
<class 'torch.Tensor'>
0
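
The output above shows that `torch.Tensor(7)` and `torch.tensor(7)` do different things. A minimal sketch of the difference follows; the variable names are illustrative, and `torch.empty` is shown only as a more explicit way to ask for uninitialized memory.

```python
import torch

# torch.Tensor(7) reads 7 as a size: seven uninitialized values of the default dtype.
uninitialized = torch.Tensor(7)

# torch.empty(7) expresses the same intent more explicitly.
explicit_empty = torch.empty(7)

# torch.tensor(7) reads 7 as data: a 0-dimensional int64 tensor holding the value 7.
scalar = torch.tensor(7)

print(uninitialized.shape, uninitialized.dtype)
print(explicit_empty.shape)
print(scalar.shape, scalar.dtype, scalar.item())
```

The values printed for an uninitialized tensor are whatever happened to be in memory, so they change between runs.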


In [None]:

scalar.item()

7

In [None]:
#vector
vector = torch.tensor([5,5,5])
vector.ndim

1

In [None]:
#matrix
matrix = torch.tensor([[7,8,4],[9,10,11]])
print(matrix.ndim)
print(matrix.shape)

2
torch.Size([2, 3])


In [None]:
#tensor
tensor = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]]])
tensor

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
print(tensor.ndim)
print(tensor.shape)

3
torch.Size([1, 3, 3])


In [None]:
tensor[0][1][2]

tensor(6)

In [None]:
tensor_big = torch.tensor([[[[1,2,3,4],
                          [1,2,3,4],
                          [1,2,3,4]],
                          [[1,2,3,4],
                          [1,2,3,4],
                          [1,2,3,4]]]])
print(tensor_big.ndim)
print(tensor_big.shape)
# naming convention: scalars and vectors are lowercase, matrices and tensors uppercase, so these should be MATRIX, TENSOR, TENSOR_BIG

4
torch.Size([1, 2, 3, 4])


### Random tensors
Neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

In [None]:
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.7025, 0.4677, 0.3927, 0.9416],
        [0.6319, 0.2149, 0.2436, 0.6073],
        [0.3885, 0.4224, 0.9347, 0.3381]])

In [None]:
random_img_tensor = torch.rand(size=(224,224,3)) #height, width, color channels
random_img_tensor.ndim

3

In [None]:
#tensor of zeros and ones
zeros = torch.zeros(size=(4,4))
ones = torch.ones(3,3)
ones,zeros

(tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]]),
 tensor([[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]))

In [None]:
ones.dtype

torch.float32

In [None]:
one_ten = torch.arange(0,10,step=2)
one_ten

tensor([0, 2, 4, 6, 8])

In [None]:
ten_zeros = torch.zeros_like(one_ten)
ten_zeros

tensor([0, 0, 0, 0, 0])

The three most common errors in deep learning with PyTorch:
1) tensors do not have the right datatype
2) tensors do not have the right shape
3) tensors are not on the right device
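
Each of the three errors above maps to one tensor attribute. The sketch below prints all three at once; `describe` is a hypothetical helper written for this note, not part of PyTorch.

```python
import torch


def describe(t: torch.Tensor) -> str:
    """Hypothetical helper: report the attributes behind the three common errors."""
    return f"dtype={t.dtype}, shape={tuple(t.shape)}, device={t.device}"


sample = torch.rand(3, 4)
summary = describe(sample)
print(summary)
```

Printing this summary for both operands is usually the quickest first step when an operation fails.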

Precision in computing: https://en.wikipedia.org/wiki/Precision_(computer_science)

In [None]:
float_32_tensor = torch.tensor([3.5, 5.5, 7.3],
                               device=None,  # "cpu" or "cuda"; None uses the default device
                               requires_grad=False,  # whether to track gradients for operations on this tensor
                               dtype=None)  # datatype, e.g. torch.float32 or torch.float16; None infers it from the data
float_32_tensor.device

device(type='cpu')

In [None]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3.5000, 5.5000, 7.3008], dtype=torch.float16)

In [None]:
float_16_tensor * float_32_tensor

tensor([12.2500, 30.2500, 53.2957])

In [None]:
int_32_tensor = torch.tensor([3,6,9], dtype=torch.int32)

In [None]:
float_32_tensor * int_32_tensor
# use mytensor.shape, mytensor.dtype and mytensor.device to inspect a tensor

tensor([10.5000, 33.0000, 65.7000])
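
The mixed-dtype multiplications above succeed because PyTorch promotes the operands to a common dtype. As a small sketch, `torch.result_type` reports the dtype an operation would produce without running it; the short variable names are assumptions made for this example.

```python
import torch

f32 = torch.tensor([3.5, 5.5, 7.3])
f16 = f32.to(torch.float16)
i32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# Two floating point dtypes promote to the wider one.
float_pair = torch.result_type(f16, f32)
# A float and an integer promote to the float dtype.
float_int_pair = torch.result_type(f32, i32)
half_int_pair = torch.result_type(f16, i32)

print(float_pair, float_int_pair, half_int_pair)
```

Promotion does not restore precision that was already lost: the float16 value 7.3008 stays rounded even though the product is float32.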

## Manipulating tensors

operations:
* addition
* subtraction
* multiplication (element wise)
* division
* matrix multiplication

In [None]:
tensor = torch.tensor([1,2,3])
tensor = tensor + 10
tensor, tensor.dtype

(tensor([11, 12, 13]), torch.int64)

In [None]:
tensor = tensor * 10
tensor, tensor.dtype

(tensor([110, 120, 130]), torch.int64)

In [None]:
tensor = tensor / 10
tensor, tensor.dtype

(tensor([11., 12., 13.]), torch.float32)

In [None]:
tensor = tensor - 10
tensor, tensor.dtype

(tensor([1., 2., 3.]), torch.float32)

In [None]:
# torch also has built-in functions such as add(); prefer the Python operators for readability
tensor.add(5)

tensor([6., 7., 8.])

### Matrix multiplication in neural networks

Two ways to multiply tensors:
* element-wise
* matrix multiplication (dot product)

Two rules for matrix multiplication:
* inner dimensions must match: (3, 2) @ (3, 2) won't work, but (3, 2) @ (2, 3) will
* the result has the shape of the outer dimensions: (3, 2) @ (2, 3) -> (3, 3)
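
The two rules can be checked directly on random tensors. This is a minimal sketch; the names `a` and `b` are chosen only for illustration.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(2, 3)

# Inner dimensions match (2 and 2), so the product exists
# and takes the outer dimensions: (3, 2) @ (2, 3) -> (3, 3).
product = a @ b
print(product.shape)

# Swapping the operands gives (2, 3) @ (3, 2) -> (2, 2).
swapped = b @ a
print(swapped.shape)

# (3, 2) @ (3, 2) has mismatched inner dimensions (2 vs 3) and raises.
shape_error = None
try:
    a @ a
except RuntimeError as err:
    shape_error = err
    print("shape error:", err)
```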

In [None]:
#element-wise
tensor * tensor

tensor([1., 4., 9.])

In [None]:
len(tensor)

3

In [None]:
%%time
# matrix multiplication (for two 1-D tensors this is the dot product)
torch.matmul(tensor, tensor)

CPU times: user 1.37 ms, sys: 0 ns, total: 1.37 ms
Wall time: 4.29 ms


tensor(14.)

In [None]:
# shape errors
tensor1 = torch.rand(3,2)
tensor2 = torch.rand(3,2)
torch.matmul(tensor1, tensor2)

RuntimeError: ignored

In [None]:
tensor1 = torch.rand(3,2)
tensor2 = torch.rand(2,3)
torch.matmul(tensor1, tensor2)

tensor([[0.7301, 0.7944, 1.1824],
        [0.3356, 0.3389, 0.5431],
        [0.5085, 0.3576, 0.8201]])

In [None]:
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])
tensor_B = torch.tensor([[7,10,],
                         [8,11],
                         [9,12]])

In [None]:
# torch.matmul and tensor @ tensor are equivalent; torch.mm does the same for 2-D matrices only (no broadcasting)
torch.mm(tensor_A, tensor_B)

RuntimeError: ignored

In [None]:
# To fix tensor shape issues, we can manipulate shape of one of our tensors using transpose
tensor_B.T, tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [None]:
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

In [None]:
torch.mm(tensor_A.T, tensor_B)

tensor([[ 76, 103],
        [100, 136]])

In [None]:
test_tensor = torch.zeros(2,3,4)
test_tensor, test_tensor.mT

(tensor([[[0., 0., 0., 0.],
          [0., 0., 0., 0.],
          [0., 0., 0., 0.]],
 
         [[0., 0., 0., 0.],
          [0., 0., 0., 0.],
          [0., 0., 0., 0.]]]),
 tensor([[[0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.]],
 
         [[0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.],
          [0., 0., 0.]]]))
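
For tensors with more than two dimensions, `.mT` swaps only the last two dimensions and leaves the leading (batch) dimensions in place, as the shapes above show. A short sketch comparing it with `transpose` and `permute`; the name `batch` is illustrative.

```python
import torch

batch = torch.zeros(2, 3, 4)

# .mT transposes each matrix in the batch: (2, 3, 4) -> (2, 4, 3).
matrix_transposed = batch.mT
# transpose(-2, -1) is the explicit equivalent.
explicit = batch.transpose(-2, -1)
# permute can reverse every dimension: (2, 3, 4) -> (4, 3, 2).
reversed_dims = batch.permute(2, 1, 0)

print(matrix_transposed.shape, explicit.shape, reversed_dims.shape)
```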

### Tensor aggregation (min, max, mean, sum)

In [None]:
x = torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
torch.min(x), torch.max(x)

(tensor(0), tensor(90))

In [None]:
torch.mean(x)

RuntimeError: ignored

In [None]:
x.dtype

torch.int64

In [None]:
torch.mean(x.type(torch.float32))
# torch.mean needs a floating point (or complex) dtype, so cast the int64 tensor first

tensor(45.)
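
The failure above is a dtype issue rather than a problem with the data. A minimal sketch that reproduces the error and shows two equivalent casts:

```python
import torch

x = torch.arange(0, 100, 10)  # int64 by default

# mean() on an integer tensor raises a RuntimeError.
mean_error = None
try:
    x.mean()
except RuntimeError as err:
    mean_error = err
    print("dtype error:", err)

# Casting to a floating point dtype first makes it work.
mean_float = x.float().mean()          # same as x.type(torch.float32).mean()
mean_double = x.to(torch.float64).mean()
print(mean_float, mean_double)
```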

In [None]:
#you can use both of these
torch.max(x), x.max()

(tensor(90), tensor(90))

In [None]:
torch.sum(x), x.sum()

(tensor(450), tensor(450))

In [None]:
# find position in tensor that has min value 
x.argmin()

tensor(0)

### Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor in a certain shape that shares memory with the original
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all `1` dimensions from a tensor
* Unsqueeze - adds a `1` dimension to a target tensor
* Permute - returns a view of the input with dimensions permuted (swapped) in a certain way
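
The list mentions `vstack` and `hstack`, while the cells below use `torch.stack`. As a small sketch of how the three differ on 1-D inputs (the tensors `a` and `b` are made up for this example):

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# stack adds a NEW dimension: two (3,) tensors -> (2, 3).
stacked = torch.stack([a, b], dim=0)
# vstack puts the tensors on top of each other: also (2, 3) for 1-D inputs.
vertical = torch.vstack([a, b])
# hstack puts them side by side: 1-D inputs are concatenated into (6,).
horizontal = torch.hstack([a, b])

print(stacked.shape, vertical.shape, horizontal.shape)
```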

In [None]:
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
x_reshaped = x.reshape(9, 1)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [None]:
# change the view, it shares the memory with original tensor. (changing z changes x)
z = x.view(1,9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
z[0, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
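
The output above confirms that a view shares memory with its source. `reshape` returns a view when it can and silently copies when it cannot. The sketch below uses `data_ptr()` to tell the two cases apart; the variable names are illustrative.

```python
import torch

x = torch.arange(1., 10.)

# view never copies: both tensors start at the same memory address.
grid = x.view(3, 3)
shares_memory = grid.data_ptr() == x.data_ptr()

# A transposed tensor is not contiguous, so view() refuses to flatten it...
transposed = grid.T
view_error = None
try:
    transposed.view(9)
except RuntimeError as err:
    view_error = err

# ...while reshape() falls back to making a copy.
flattened = transposed.reshape(9)
is_copy = flattened.data_ptr() != x.data_ptr()

print(shares_memory, is_copy, flattened)
```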

In [None]:
# stack tensors on top of each other
x_stacked = torch.stack([x,x,x], dim=0)  # dim can be at most x.ndim, which is 1 here
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [None]:
# squeeze - removes all dimensions of size 1 from a target tensor, e.g. (1, 1, 9) becomes (9,)
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape, x_reshaped.ndim

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]), 2)

In [None]:
x_squeezed = x_reshaped.squeeze()
x_squeezed, x_squeezed.shape

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
# unsqueeze, adds a single dim to a target tensor at a specific dim
x_unsqueezed = x_squeezed.unsqueeze(dim=1)
x_unsqueezed, x_unsqueezed.shape

(tensor([[5.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [None]:
# torch permute - rearranges dimensions of target tensor in specified order
x_original = torch.rand(size=(224,224,3))
print(x_original.shape)
# permute - rearrange the axis (or dim) order
x_permuted = x_original.permute(2,0,1)  # moves axes (orig->permuted): 0->1, 1->2, 2->0; creates a VIEW that shares memory with x_original
print(x_permuted.shape)

torch.Size([224, 224, 3])
torch.Size([3, 224, 224])
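
Because `permute` returns a view, writing through the permuted tensor also changes the original. A minimal sketch on a small tensor, with sizes chosen only to keep the example short:

```python
import torch

hwc = torch.zeros(4, 5, 3)      # height, width, color channels
chw = hwc.permute(2, 0, 1)      # color channels, height, width

# Writing to the view is visible in the original tensor.
chw[0, 0, 0] = 1.0
original_value = hwc[0, 0, 0].item()

# The view is not contiguous; .contiguous() makes an independent copy.
was_contiguous = chw.is_contiguous()
chw_copy = chw.contiguous()
chw_copy[0, 0, 0] = 2.0

print(original_value, was_contiguous, hwc[0, 0, 0].item())
```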


In [None]:
# indexing (selecting data)
x = torch.arange(1,19).reshape(2,3,3)
x, x.shape

(tensor([[[ 1,  2,  3],
          [ 4,  5,  6],
          [ 7,  8,  9]],
 
         [[10, 11, 12],
          [13, 14, 15],
          [16, 17, 18]]]),
 torch.Size([2, 3, 3]))

In [None]:
x[0][0][1]  
# x[0, 0, 1]

tensor(2)

In [None]:
# use : to select all of target dimension
x[:, :, 2]

tensor([[ 3,  6,  9],
        [12, 15, 18]])

In [None]:
# PyTorch tensors & NumPy interoperate: NumPy array -> tensor with torch.from_numpy(ndarray), tensor -> NumPy array with torch.Tensor.numpy()
import numpy as np

array = np.arange(1.0, 9.0)
tensor = torch.from_numpy(array)
array, tensor


(array([1., 2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7., 8.], dtype=torch.float64))

In [None]:
 # tensor to numpy
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [None]:
numpy_tensor.dtype
# NumPy defaults to float64 and torch to float32; converting in either direction keeps the source dtype
# https://pytorch.org/tutorials/beginner/examples_tensor/polynomial_numpy.html

dtype('float32')
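
Two details of the conversion are easy to miss: `torch.from_numpy` keeps the NumPy dtype (float64 here) and shares memory with the array. A short sketch, with variable names chosen for this example:

```python
import numpy as np
import torch

array = np.arange(1.0, 9.0)            # float64

# from_numpy shares memory: changing the array changes the tensor.
shared = torch.from_numpy(array)
array[0] = 100.0
shared_first = shared[0].item()

# Casting to float32 creates an independent copy in torch's default dtype.
converted = torch.from_numpy(array).to(torch.float32)
array[1] = -1.0
converted_second = converted[1].item()

print(shared.dtype, shared_first, converted.dtype, converted_second)
```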

## Reproducibility (take random out of random)

In short, neural networks start learning from random values. To reproduce results between runs, you don't want different random values every time, so you fix the random seed.

**random_seed**

https://pytorch.org/docs/stable/notes/randomness.html

In [None]:
rand1 = torch.rand(3, 4)
rand2 = torch.rand(3, 4)
rand1 == rand2


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [None]:
#set the random seed
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
rand1 = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)  # reset the seed before the second call: every torch.rand call advances the generator state
rand2 = torch.rand(3, 4)
rand1 == rand2

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
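
An alternative to resetting the global seed is to pass an explicit `torch.Generator`. This is a sketch of that approach, not what the cell above uses; the generator names are illustrative.

```python
import torch

# Two generators with the same seed produce the same sequence.
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

rand_a = torch.rand(3, 4, generator=gen_a)
rand_b = torch.rand(3, 4, generator=gen_b)
same_draw = torch.equal(rand_a, rand_b)

# A second draw from gen_a continues the sequence, so it differs from the first.
rand_c = torch.rand(3, 4, generator=gen_a)
next_draw_differs = not torch.equal(rand_a, rand_c)

print(same_draw, next_draw_differs)
```

Using a local generator keeps the reproducible part of the code independent of any other code that draws from the global random state.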

In [None]:
## Running tensors and PyTorch objects on GPUs 
!nvidia-smi

Tue Apr 11 06:31:37 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   50C    P0    26W /  70W |    601MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [None]:
# check for gpu access via pytorch
import torch
torch.cuda.is_available()

True

In [None]:
# Set up device-agnostic code: use the GPU if available, otherwise the CPU
# https://pytorch.org/docs/stable/notes/cuda.html#best-practices
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [None]:
torch.cuda.device_count()

1
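
With a `device` string defined once, the same code runs with or without a GPU. A minimal sketch that creates a tensor directly on the device and moves a small model there; the `torch.nn.Linear` layer is only an illustrative stand-in for a model.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the target device instead of moving it afterwards.
inputs = torch.ones(2, 3, device=device)

# Modules are moved with .to(device) as well.
model = torch.nn.Linear(3, 1).to(device)
outputs = model(inputs)

print(outputs.shape, outputs.device)
```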

## Putting tensors (and models) on the GPU


In [None]:
# Create a tensor (default on CPU)
tensor = torch.tensor([1,2,3], device='cpu')
tensor, tensor.device

(tensor([1, 2, 3]), device(type='cpu'))

In [None]:
# Move tensor to GPU if available
tensor_gpu = tensor.to(device)
tensor_gpu

tensor([1, 2, 3], device='cuda:0')

In [None]:
# Move the tensor back to the CPU - a tensor on the GPU can't be converted to NumPy directly
tensor_back_cpu = tensor_gpu.cpu().numpy()
print(tensor_back_cpu)
tensor_back_cpu[0] = 10
print(tensor_back_cpu)

[1 2 3]
[10  2  3]


In [None]:
tensor_gpu

tensor([1, 2, 3], device='cuda:0')
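
`tensor_gpu` is unchanged above because `.cpu()` copies the data from the GPU into host memory. When the tensor already lives on the CPU, `.cpu()` returns the same object and `.numpy()` shares its memory, so the behaviour differs. The sketch below shows the CPU-only case and runs without a GPU; the variable names are illustrative.

```python
import torch

cpu_tensor = torch.tensor([1, 2, 3])

# On a CPU tensor, .cpu() is a no-op and .numpy() shares memory with the tensor.
as_numpy = cpu_tensor.cpu().numpy()
as_numpy[0] = 10
first_after_write = cpu_tensor[0].item()

# clone() first to get a NumPy array that is independent of the tensor.
independent = cpu_tensor.clone().numpy()
independent[1] = 20
second_after_write = cpu_tensor[1].item()

print(cpu_tensor, first_after_write, second_after_write)
```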