<a href="https://colab.research.google.com/github/nmermigas/PyTorch/blob/main/00_pytorch_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Welcome to the PyTorch Basics!

In [2]:
!nvidia-smi

Sat Oct  7 18:33:33 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   50C    P8    10W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [3]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [4]:
print(torch.__version__)

2.0.1+cu118


### Creating Tensors

In [5]:
#scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [6]:
scalar.ndim

0

In [7]:
# Get tensor back as python int
scalar.item()

7

In [8]:
# Vector
vector = torch.tensor([7,7])
vector.ndim # ndim shows the number of dimensions

1

In [9]:
vector.shape # a single dimension holding 2 elements

torch.Size([2])

In [10]:
# MATRIX
MATRIX = torch.tensor([[7,8],[9,10]])
MATRIX.ndim, MATRIX.shape

(2, torch.Size([2, 2]))

In [11]:
# TENSOR
TENSOR = torch.tensor([[[1,2,3],[3,6,9],[2,4,5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [12]:
TENSOR.ndim

3

In [13]:
TENSOR.shape # one 3 by 3 matrix inside the outer dimension

torch.Size([1, 3, 3])

In [14]:
TENSOR[0]

tensor([[1, 2, 3],
        [3, 6, 9],
        [2, 4, 5]])

In [15]:
TENSOR_EXAMPLE = torch.tensor([[[1,2,3],[14,432,234],[123,6543,12312]],[[1,2,3],[2,3,4],[123,435,809]]])
TENSOR_EXAMPLE.ndim

3

In [16]:
TENSOR_EXAMPLE.shape

torch.Size([2, 3, 3])

### Random Tensors

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update`

Torch random tensors: https://pytorch.org/docs/stable/generated/torch.rand.html

In [17]:
# create a random tensor of size (3,4)

random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.0744, 0.7130, 0.5112, 0.2759],
        [0.2288, 0.4344, 0.0311, 0.4217],
        [0.9968, 0.1269, 0.6114, 0.7251]])

In [18]:
random_tensor.ndim

2

In [19]:
# Create a random tensor with similar shape to an image tensor
random_image_tensor = torch.rand(size=(3,224,224)) # colour channels, height, width
random_image_tensor.shape,random_image_tensor.ndim

(torch.Size([3, 224, 224]), 3)

### Zeros and ones

In [20]:
# Tensor full of zeros
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [21]:
# Tensor full of ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [22]:
ones.dtype #data type of the tensor

torch.float32

### Creating a range of tensors and tensors-like


In [23]:
# Use torch.arange() instead of torch.range(), which is deprecated and prints a warning
one_to_ten = torch.arange(start=1,end=11,step = 1)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [24]:
# Creating tensors-like
ten_zeros = torch.zeros_like(input=one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
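
The `*_like` functions copy the shape and, by default, the dtype of their input. A small sketch of that behaviour follows; the names `ten_ones` and `float_zeros` are illustrative and not part of the notebook.

```python
import torch

one_to_ten = torch.arange(start=1, end=11, step=1)   # int64, shape (10,)

# Same shape and dtype as the input
ten_zeros = torch.zeros_like(one_to_ten)
ten_ones = torch.ones_like(one_to_ten)

# The dtype can be overridden while the shape is still copied
float_zeros = torch.zeros_like(one_to_ten, dtype=torch.float32)

print(ten_zeros.shape, ten_zeros.dtype)
print(ten_ones.shape, ten_ones.dtype)
print(float_zeros.shape, float_zeros.dtype)
```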

### Tensor data types

**Note:** Tensor datatypes are one of the 3 big sources of errors you might run into with PyTorch and deep learning:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device (e.g. one tensor on the CPU and the other on the GPU)



In [25]:
# Float 32 tensor
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype = None, # what datatype is the tensor, eg. float32 is the default (single precision)
                               device = None, # what device is the tensor on
                               requires_grad = False) # whether or not to track gradients
float_32_tensor, float_32_tensor.dtype

(tensor([3., 6., 9.]), torch.float32)

In [26]:

float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [27]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])

In [28]:
int_32_tensor = torch.tensor([3,6,9],dtype=torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [29]:
int_32_tensor * float_32_tensor


tensor([ 9., 36., 81.])
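
The two multiplications above succeed because PyTorch promotes mixed dtypes to a common dtype before computing. The sketch below previews the promoted dtype with `torch.promote_types` and `torch.result_type`; the variable names are illustrative.

```python
import torch

float_32 = torch.tensor([3.0, 6.0, 9.0])                 # float32 by default
float_16 = float_32.type(torch.float16)
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# Ask PyTorch which dtype a mixed operation would produce
promoted_half = torch.promote_types(torch.float16, torch.float32)
promoted_int = torch.result_type(int_32, float_32)

print(promoted_half)                 # torch.float32
print(promoted_int)                  # torch.float32
print((float_16 * float_32).dtype)   # torch.float32
```

Not every operation promotes this way, so checking `tensor.dtype` remains the safest habit.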

### Getting information from tensors

1. Tensors not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
2. Tensors not the right shape - to get the shape of a tensor, use `tensor.shape`
3. Tensors not on the right device (e.g. one tensor on the CPU and the other on the GPU) - to get the device of a tensor, use `tensor.device`

In [30]:
# Create a tensor
some_tensor = torch.rand(3,4)
some_tensor

tensor([[0.4243, 0.3778, 0.5719, 0.9608],
        [0.2399, 0.5890, 0.7403, 0.9596],
        [0.3005, 0.6332, 0.7544, 0.4178]])

In [31]:
# Find out details about some tensor
print(f"Datatype: {some_tensor.dtype}")
print(f"Shape: {some_tensor.shape}")
print(f"Device: {some_tensor.device}")



Datatype: torch.float32
Shape: torch.Size([3, 4])
Device: cpu


### Manipulating Tensors (Tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [32]:
# Create a tensor and add 10 to it
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [33]:
# multiply tensor by 10
tensor = tensor * 10
tensor

tensor([10, 20, 30])

In [34]:
# subtract 10
tensor - 10

tensor([ 0, 10, 20])

In [35]:
# divide by 10
tensor / 10

tensor([1., 2., 3.])

In [36]:
# Pre-built PyTorch functions
torch.mul(tensor,10)
torch.add(tensor,10)

tensor([20, 30, 40])

### Matrix multiplication

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
2. Matrix multiplication (dot product)

Matrix multiplication needs to satisfy two main rules:

1. The **inner dimensions** must match:
* `(3,2) @ (3,2)` won't work
* `(2,3) @ (3,2)` will work
* `(3,2) @ (2,3)` will work

2. The resulting matrix has the shape of the **outer dimensions**:
* `(2,3) @ (3,2)` -> (2,2)
* `(3,2) @ (2,3)` -> (3,3)
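
The two rules can be checked directly. This is a minimal sketch using random matrices; `a`, `b` and the other names are illustrative.

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match, so the result takes the outer dimensions
ab_shape = (a @ b).shape   # (2, 3) @ (3, 2) -> (2, 2)
ba_shape = (b @ a).shape   # (3, 2) @ (2, 3) -> (3, 3)

# Inner dimensions differ (2 and 3), so PyTorch raises a RuntimeError
try:
    b @ b                  # (3, 2) @ (3, 2)
    mismatch_failed = False
except RuntimeError as err:
    mismatch_failed = True
    print(f"Shape error: {err}")

print(ab_shape, ba_shape, mismatch_failed)
```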



In [37]:
# Element wise multiplication
print(tensor,"*",tensor)
print(f"Equals:  {tensor*tensor}")

tensor([10, 20, 30]) * tensor([10, 20, 30])
Equals:  tensor([100, 400, 900])


In [38]:
# Matrix multiplication
torch.matmul(tensor,tensor)

tensor(1400)
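
For two 1-dimensional tensors, `torch.matmul` computes the dot product: multiply matching elements, then add the results. A hand-written loop, shown here only for comparison, gives the same value.

```python
import torch

tensor = torch.tensor([10, 20, 30])

# Multiply matching elements and add them up by hand
manual_total = 0
for i in range(len(tensor)):
    manual_total += tensor[i] * tensor[i]

matmul_total = torch.matmul(tensor, tensor)

print(manual_total, matmul_total)   # both are tensor(1400)
```

The built-in version is vectorised, so it is generally much faster than the Python loop on large tensors.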

In [39]:
torch.matmul(torch.rand(10,10),torch.rand(10,11)).shape


torch.Size([10, 11])

### Shape Errors

In [40]:
#shapes for matrix multiplication
tensor_A = torch.tensor ([[1,2],
                          [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])


In [42]:
# torch.matmul(tensor_A,tensor_B) # raises a RuntimeError: both tensors are (3,2), so the inner dimensions (2 and 3) don't match

To fix the tensor shape issue, we can manipulate the shape of one of the tensors using a **transpose**.

A **transpose** switches the axes (dimensions) of a tensor.

In [43]:
tensor_B,tensor_B.T

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 tensor([[ 7,  8,  9],
         [10, 11, 12]]))

In [44]:
torch.matmul(tensor_A,tensor_B.T) # works since tensor_B is transposed


tensor([[ 27,  30,  33],
        [ 61,  68,  75],
        [ 95, 106, 117]])

### Finding the min, max, mean, sum etc (tensor aggregation)




In [45]:
# Create a tensor

x = torch.arange(1,100,10)
x, x.dtype

(tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91]), torch.int64)

In [46]:
# Find the min
torch.min(x),x.min()

(tensor(1), tensor(1))

In [47]:
# Find the max
torch.max(x),x.max()

(tensor(91), tensor(91))

In [48]:
# Find the mean - note: torch.mean() requires a floating point (or complex) dtype, so the int64 tensor is converted to float32 first
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(46.), tensor(46.))
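
To see why the conversion is needed, the sketch below calls `torch.mean` on the integer tensor directly and catches the resulting error. The flag `int_mean_failed` is an illustrative name.

```python
import torch

x = torch.arange(1, 100, 10)   # int64 tensor

# torch.mean() cannot infer an output dtype for integer input
try:
    torch.mean(x)
    int_mean_failed = False
except RuntimeError as err:
    int_mean_failed = True
    print(f"mean() on int64 failed: {err}")

# Converting to a floating point dtype first makes it work
mean_value = x.float().mean()
print(mean_value)   # tensor(46.)
```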

In [49]:
# Find the sum
torch.sum(x), x.sum()

(tensor(460), tensor(460))

In [50]:
torch.argmax(x)

tensor(9)

### Finding the positional min, max

In [51]:
# Find the position in the tensor that has the min value: argmin() -> returns the index
x.argmin()

tensor(0)

In [52]:
# Find the position in the tensor that has the max value: argmax() -> returns the index
x.argmax()

tensor(9)

In [53]:
x[9]

tensor(91)

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape; the view shares memory with the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way

In [54]:
# Let's do it.

x = torch.arange(1.,10.)
x,x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [55]:
# Add an extra dimension - note: it has to be compatible with the original shape
x_reshaped = x.reshape(1,9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [56]:
# Change the view
z = x.view(1,9) # z shares the same memory with x
z,z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [57]:
# Changing z changes x as well
z[:,0] = 5
z,x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))

In [58]:
# Stack tensors along a new dimension (dim=1 places each copy of x as a column; dim=0 would stack them as rows)
x_stacked = torch.stack([x,x,x,x,x], dim = 1)
x_stacked

tensor([[5., 5., 5., 5., 5.],
        [2., 2., 2., 2., 2.],
        [3., 3., 3., 3., 3.],
        [4., 4., 4., 4., 4.],
        [5., 5., 5., 5., 5.],
        [6., 6., 6., 6., 6.],
        [7., 7., 7., 7., 7.],
        [8., 8., 8., 8., 8.],
        [9., 9., 9., 9., 9.]])
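
The `dim` argument decides where the new dimension goes. The sketch below compares `dim=0` and `dim=1` with `torch.vstack` and `torch.hstack` on a short tensor; all names are illustrative.

```python
import torch

x = torch.arange(1., 4.)   # tensor([1., 2., 3.])

rows = torch.stack([x, x], dim=0)   # new leading dimension  -> shape (2, 3)
cols = torch.stack([x, x], dim=1)   # new trailing dimension -> shape (3, 2)
v = torch.vstack([x, x])            # stacks 1-D tensors as rows -> shape (2, 3)
h = torch.hstack([x, x])            # joins 1-D tensors end to end -> shape (6,)

print(rows.shape, cols.shape, v.shape, h.shape)
```

Note that for 1-D inputs `torch.hstack` concatenates rather than creating a new dimension, so it is not equivalent to `torch.stack(..., dim=1)`.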

In [59]:
# Squeeze tensor - removes all single dimensions from a target tensor
print(f"Previous tensor: {x_reshaped}")
print(f"Previous shape: {x_reshaped.shape}")

# Remove the size-1 dimension at dim=0 from x_reshaped
x_squeezed = torch.squeeze(x_reshaped, dim = 0)
print(f"\nNew tensor: {x_squeezed}")
print(f"New shape: {x_squeezed.shape}")


Previous tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
Previous shape: torch.Size([1, 9])

New tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
New shape: torch.Size([9])


In [60]:
# torch.unsqueeze() - adds a single dimension to a target tensor at a specific dimension

print(f"Previous tensor: {x_squeezed}")
print(f"Previous shape: {x_squeezed.shape}")

# Add an extra dimension
x_unsqueezed = torch.unsqueeze(x_squeezed, dim = 1)
print(f"\nNew tensor: {x_unsqueezed}")
print(f"New shape: {x_unsqueezed.shape}")


Previous tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape: torch.Size([9])

New tensor: tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
New shape: torch.Size([9, 1])


In [61]:
# torch.permute - rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size = (224,224,3))  # height, width, colour channels (a common image layout)

# Permute the original tensor to rearrange the axis order

x_permuted = x_original.permute(2,0,1) # axis order becomes colour channels, height, width

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")


Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
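
Because `permute` returns a view, the permuted tensor shares memory with the original. A short sketch of that behaviour, with illustrative names and an arbitrary value of `0.5`:

```python
import torch

image = torch.rand(size=(224, 224, 3))   # height, width, colour channels
permuted = image.permute(2, 0, 1)        # colour channels, height, width

# Writing to the original also changes the view
image[0, 0, 0] = 0.5
print(image[0, 0, 0], permuted[0, 0, 0])

# Both tensors point at the same underlying storage
shares_memory = image.data_ptr() == permuted.data_ptr()
print(shares_memory)   # True
```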


In [62]:
x_original[0,0,0] = 0.7213789

In [63]:
x_original

tensor([[[0.7214, 0.6365, 0.3745],
         [0.8372, 0.2497, 0.4920],
         [0.3825, 0.3576, 0.5462],
         ...,
         [0.1634, 0.5564, 0.6888],
         [0.5777, 0.6530, 0.0650],
         [0.9768, 0.0215, 0.1043]],

        [[0.1472, 0.4400, 0.5275],
         [0.9992, 0.6879, 0.1740],
         [0.1196, 0.5950, 0.5503],
         ...,
         [0.1982, 0.6584, 0.2651],
         [0.3005, 0.8263, 0.6104],
         [0.1435, 0.9766, 0.5516]],

        [[0.6177, 0.2155, 0.5409],
         [0.0312, 0.3061, 0.3603],
         [0.0217, 0.8836, 0.3987],
         ...,
         [0.6873, 0.4196, 0.3090],
         [0.2780, 0.8380, 0.3220],
         [0.6617, 0.2362, 0.8286]],

        ...,

        [[0.8105, 0.7371, 0.3818],
         [0.6456, 0.9412, 0.7762],
         [0.3824, 0.8748, 0.7011],
         ...,
         [0.4622, 0.7350, 0.0059],
         [0.2651, 0.1506, 0.5560],
         [0.6899, 0.2706, 0.7733]],

        [[0.2382, 0.6177, 0.0655],
         [0.7107, 0.7161, 0.4662],
         [0.

## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [64]:
import torch
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [65]:
# indexing
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [66]:
# indexing on middle bracket (dim=1)
x[0][0]

tensor([1, 2, 3])

In [67]:
# indexing on the inner bracket (last dimension)

x[0][0][0]

tensor(1)

In [68]:
x

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [69]:
x[:,:,2]

tensor([[3, 6, 9]])
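
Comma-separated indices and `:` (meaning "everything along this dimension") can be combined to pick rows, columns or single values. A few illustrative selections on the same tensor:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

first_row = x[0, 0, :]     # tensor([1, 2, 3])
middle_col = x[0, :, 1]    # tensor([2, 5, 8])
last_value = x[0, 2, 2]    # tensor(9)

# Chained brackets and comma-separated indices select the same data
same_as_chained = torch.equal(x[0][0], x[0, 0])

print(first_row, middle_col, last_value, same_as_chained)
```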

## PyTorch tensors & NumPy

NumPy is a popular Python library for scientific numerical computing.

PyTorch has built-in functionality to interact with it.

* Data in NumPy, want it in a PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor, want it in NumPy -> `torch.Tensor.numpy()`


In [70]:
# NumPy array to tensor

import torch
import numpy as np

array = np.arange(1.0,8.0)
tensor = torch.from_numpy(array) # warning: when converting from numpy -> pytorch, pytorch reflects numpy's default datatype of float64
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [71]:
array.dtype

dtype('float64')

In [72]:
# Change the value of array, what will this do to `tensor`? (note: `array + 1` creates a new array rather than modifying the original in place)

array = array + 1

In [73]:
array,tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [74]:
# Tensor to NumPy array

tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))

In [75]:
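# Change the tensor, what happens to `numpy_tensor`?
tensor = tensor + 1
tensor, numpy_tensor

# -> numpy_tensor is unchanged because `tensor + 1` creates a new tensor; an in-place operation such as tensor.add_(1) would change both, since they share memory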

(tensor([2., 2., 2., 2., 2., 2., 2.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
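
Rebinding a name (`array = array + 1`, `tensor = tensor + 1`) creates a new object, which is why the outputs above differ. On the CPU, `torch.from_numpy()` and `Tensor.numpy()` both share the underlying memory, so in-place changes show up on both sides. A minimal sketch with illustrative names:

```python
import torch
import numpy as np

array = np.arange(1.0, 4.0)
tensor = torch.from_numpy(array)

# In-place update: the tensor sees the change because memory is shared
array += 1
print(array, tensor)      # both hold 2., 3., 4.

# Rebinding: `array` now names a new array, the tensor keeps the old data
array = array + 1
print(array, tensor)      # array holds 3., 4., 5.; tensor still 2., 3., 4.

# The same applies in the other direction
ones = torch.ones(3)
ones_np = ones.numpy()
ones.add_(1)              # in-place add on the tensor
print(ones_np)            # [2. 2. 2.]
```

If an independent copy is wanted, `tensor.clone()` or `array.copy()` can be used before converting.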

## Reproducibility (trying to take the random out of random)

How a neural network learns:

`start with random numbers -> tensor operations -> update random numbers to try and make them better representations of the data -> again -> again -> again...`

To make the randomness reproducible, set a **random seed**.

In [76]:
import torch

# Create 2 random tensors
random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)

tensor([[0.9926, 0.9293, 0.9243, 0.9646],
        [0.9581, 0.8844, 0.0611, 0.4210],
        [0.0160, 0.8569, 0.6905, 0.5133]])
tensor([[0.3713, 0.4181, 0.2246, 0.4252],
        [0.8584, 0.4916, 0.4222, 0.6188],
        [0.2993, 0.9982, 0.6064, 0.5001]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [77]:
# Let's make some random but reproducible tensors
import torch

# Set random seed
RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)


tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
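
The seed is reset before each `torch.rand` call above because every call advances the random number generator. The sketch below shows what happens when the seed is set only once; the names are illustrative.

```python
import torch

torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)        # generator has advanced, so values differ

torch.manual_seed(42)            # resetting the seed restarts the sequence
first_again = torch.rand(3, 4)

print(torch.equal(first, second))        # False
print(torch.equal(first, first_again))   # True
```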


### Check for GPU access with PyTorch


In [78]:
import torch
torch.cuda.is_available()

True

In [79]:
# Set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [80]:
# Count number of devices
torch.cuda.device_count()

1

## Putting tensors and models on the GPU

We want tensors and models on the GPU because it generally performs the large numerical computations in deep learning faster than the CPU.

In [81]:
# Create a tensor (on the CPU by default)
tensor = torch.tensor([1,2,3])

# Tensor not on GPU
print(tensor,tensor.device)

tensor([1, 2, 3]) cpu


In [82]:
# Move tensor to GPU (if available)

tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')

### Moving tensors back to the CPU


In [84]:
# If a tensor is on the GPU, it can't be converted to NumPy directly
# tensor_on_gpu.numpy()

In [85]:
# To fix the GPU tensor with NumPy issue, first move the tensor back to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()

In [86]:
tensor_back_on_cpu

array([1, 2, 3])
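
Putting the pieces together, the following sketch runs the same way with or without a GPU: it creates a tensor directly on the chosen device, computes on it, and moves the result back to the CPU for NumPy. The names `tensor_on_device`, `doubled` and `result` are illustrative.

```python
import torch

# Pick the GPU when available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the chosen device
tensor_on_device = torch.tensor([1, 2, 3], device=device)
doubled = tensor_on_device * 2   # computed on that device

# .cpu() is a no-op when the tensor is already on the CPU
result = doubled.cpu().numpy()
print(result)   # [2 4 6]
```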