
In [4]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.0.0+cu118


## Introduction to Tensors
### Creating tensors
PyTorch tensors are created using `torch.tensor()`. See the documentation at: https://pytorch.org/docs/stable/tensors.html

In [5]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [6]:
# A scalar has 0 dimensions
scalar.ndim

0

In [7]:
# We can get the tensor back, in this case as a Python int
scalar.item()

7

In [8]:
# A vector as a tensor
vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [9]:
# A vector has 1 dimension
vector.ndim

1

In [10]:
# A vector with more than one element cannot be retrieved with item()
vector.item()

RuntimeError: ignored

In [11]:
# We can get the shape of a vector
vector.shape

torch.Size([2])

In [12]:
# Creating a MATRIX
MATRIX = torch.tensor([[7, 8],
                      [9, 10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [13]:
# Number of dimensions of a MATRIX
MATRIX.ndim

2

In [14]:
# The shape of a MATRIX
MATRIX.shape

torch.Size([2, 2])

In [15]:
# Accessing the first row of a matrix
MATRIX[0]

tensor([7, 8])

In [16]:
# Accessing the second row of a matrix
MATRIX[1]

tensor([ 9, 10])

In [17]:
# A 3-dimensional tensor that holds a single 3x3 matrix
TENSOR = torch.tensor([[[1, 2, 3],
                      [3, 6, 9],
                      [2, 4, 5]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [18]:
# Tensor dimensions
TENSOR.ndim

3

In [19]:
# The shape is (1, 3, 3), which we can read as: one matrix of three rows by three columns
TENSOR.shape

torch.Size([1, 3, 3])
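
As a quick recap of the cells above, the sketch below puts the scalar, vector, matrix and 3-dimensional tensor side by side and prints `ndim` and `shape` for each. The dictionary names (`examples`, `summary`) are only illustrative and are not part of the original notebook.

```python
import torch

# The same four objects created earlier in this section
examples = {
    "scalar": torch.tensor(7),
    "vector": torch.tensor([7, 7]),
    "matrix": torch.tensor([[7, 8], [9, 10]]),
    "tensor": torch.tensor([[[1, 2, 3], [3, 6, 9], [2, 4, 5]]]),
}

# ndim counts the bracket pairs; shape gives the size of each dimension
summary = {name: (t.ndim, tuple(t.shape)) for name, t in examples.items()}
for name, (ndim, shape) in summary.items():
    print(f"{name}: ndim={ndim}, shape={shape}")
```

A handy rule of thumb: `ndim` equals the number of opening square brackets at the start of the printed tensor.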

### Random tensors
Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers etc`
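
To make the loop above a bit more concrete, here is a minimal sketch, not how a real network is trained in practice. It assumes toy data following `y = 2 * x`, a single random weight `w`, a hand-written gradient of the mean squared error, and an arbitrary learning rate of `0.01`. All of these names and values are made up for illustration.

```python
import torch

torch.manual_seed(0)

# Toy data (assumption for this sketch): the "right" weight is 2.0
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

# Start with a random number
w = torch.rand(1)
initial_error = ((w * x - y) ** 2).mean().item()

for _ in range(100):
    pred = w * x                          # look at the data
    grad = (2 * (pred - y) * x).mean()    # gradient of the mean squared error w.r.t. w
    w = w - 0.01 * grad                   # update the random number

final_error = ((w * x - y) ** 2).mean().item()
print(w, initial_error, final_error)
```

The weight starts out random and ends up close to 2.0, which is the idea behind "update random numbers to better represent the data".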

In [20]:
# Create a random tensor of size (3, 4) < https://pytorch.org/docs/stable/generated/torch.rand.html >
random_tensor = torch.rand(3, 4)
random_tensor

tensor([[0.0602, 0.9414, 0.4724, 0.4859],
        [0.3746, 0.7270, 0.4094, 0.9204],
        [0.9879, 0.7352, 0.4818, 0.1163]])

In [21]:
# Dimensions of this random tensor
random_tensor.ndim

2

In [22]:
# Create a random tensor, with similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(244, 244, 3)) # height, width, color channel RGB
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([244, 244, 3]), 3)

## Zeros and ones tensors

In [23]:
# Creating a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [24]:
# Create a tensor of all ones
ones = torch.ones(size=(3, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [25]:
# As we can see we are using floats by default
ones.type()

'torch.FloatTensor'

In [26]:
ones.dtype

torch.float32

## Creating a range and tensors-like

In [27]:
# torch.range() is deprecated; use torch.arange(start, end, step) instead
torch.range(0, 10)


tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [28]:
# Using torch.arange()
one_to_ten = torch.arange(start=1, end=10, step=1)
one_to_ten

tensor([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [29]:
# Creating tensors like another tensor ( same shape and dtype as the input )
ten_zeroes = torch.zeros_like(input=one_to_ten)
ten_zeroes

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0])

### Tensor datatypes
See the documentation at: https://pytorch.org/docs/stable/tensors.html

**Note:** Tensor datatypes are one of the 3 big sources of errors you will run into with PyTorch & deep learning:
1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device
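
The three issues listed above can each be addressed with a single call. The sketch below is one possible way to do it, using `Tensor.to()` for dtype and device and a transpose for shape. The variable names are illustrative only, and the device falls back to CPU when no GPU is available.

```python
import torch

a = torch.tensor([1, 2, 3])   # int64 by default

# 1. Wrong datatype: mean needs a floating point tensor, so cast first
mean_a = a.to(torch.float32).mean()

# 2. Wrong shape: (3, 2) @ (3, 2) fails, (3, 2) @ (2, 3) works
m = torch.rand(3, 2)
n = torch.rand(3, 2)
product = torch.matmul(m, n.T)

# 3. Wrong device: move the tensor to whichever device is in use
device = "cuda" if torch.cuda.is_available() else "cpu"
a_on_device = a.to(device)

print(mean_a, product.shape, a_on_device.device)
```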

In [30]:
# Creating a float32 tensor
float_32_tensor = torch.tensor([3., 6., 9.],
                               dtype=None, # What data type is the tensor
                               device=None, # What device the tensor is on
                               requires_grad=False) # Whether or not to track gradients for this tensor's operations
float_32_tensor

tensor([3., 6., 9.])

In [31]:
# As we can see, the default data type in torch is float32
float_32_tensor.dtype

torch.float32

In [32]:
# Let us create a float16 tensor, for example
float_16_tensor = torch.tensor([5, 5, 7],
                               dtype=torch.float16)
float_16_tensor

tensor([5., 5., 7.], dtype=torch.float16)

In [33]:
# A different way of creating a 16 bit tensor from a 32 bit tensor
float_16_tensor_b = float_32_tensor.type(torch.float16)
float_16_tensor_b

tensor([3., 6., 9.], dtype=torch.float16)

In [34]:
# Experiment of multiplying 2 tensors of different data types
tensor_32x16 = float_32_tensor * float_16_tensor
tensor_32x16

tensor([15., 30., 63.])

In [35]:
# What data type does the result have? ( My guess before running the experiment: it gets promoted to the wider type )
tensor_32x16.dtype

torch.float32

In [36]:
# Let us run an experiment with operations between int and float tensors
int_32_tensor = torch.tensor([3, 6, 9],
                             dtype=torch.int32)
int_32_tensor, int_32_tensor * float_32_tensor

(tensor([3, 6, 9], dtype=torch.int32), tensor([ 9., 36., 81.]))
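
The two experiments above are examples of type promotion. As a hedged sketch, `torch.promote_types()` can be used to ask which dtype two dtypes promote to without creating any tensors. The `pairs` and `promoted` names are only for illustration.

```python
import torch

# Pairs of dtypes to compare (the first two match the experiments above)
pairs = [
    (torch.float16, torch.float32),
    (torch.int32, torch.float32),
    (torch.int32, torch.int64),
]
promoted = {pair: torch.promote_types(*pair) for pair in pairs}
for (left, right), result in promoted.items():
    print(f"{left} with {right} -> {result}")

# The dtype of an actual operation agrees with the promotion rule
f16 = torch.tensor([5, 5, 7], dtype=torch.float16)
f32 = torch.tensor([3., 6., 9.])
result_dtype = (f16 * f32).dtype
```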

### Getting information from tensors

1. Tensors not the right datatype - `tensor.dtype` <br>
2. Tensors not the right shape - `tensor.shape`<br>
3. Tensors not on the right device - `tensor.device`<br>

In [37]:
# Create a random tensor to experiment with
some_tensor = torch.rand(size=(3,4))
some_tensor

tensor([[0.6014, 0.8093, 0.0483, 0.2673],
        [0.5562, 0.4122, 0.0941, 0.2338],
        [0.7472, 0.9166, 0.4952, 0.5022]])

In [38]:
# Let us see the default attributes of a random tensor
print(f"The actual tensor: {some_tensor}")
print(f"Datatype of the tensor: {some_tensor.dtype}")
print(f"Shape of the tensor: {some_tensor.shape}")
print(f"Device the tensor is on: {some_tensor.device}")

The actual tensor: tensor([[0.6014, 0.8093, 0.0483, 0.2673],
        [0.5562, 0.4122, 0.0941, 0.2338],
        [0.7472, 0.9166, 0.4952, 0.5022]])
Datatype of the tensor: torch.float32
Shape of the tensor: torch.Size([3, 4])
Device the tensor is on: cpu


### Manipulating Tensors (tensor operations)

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [39]:
# Create a tensor and add 10 to each element
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [40]:
# Multiplication
tensor * 10

tensor([10, 20, 30])

In [41]:
# Subtraction
tensor - 10

tensor([-9, -8, -7])

In [42]:
# Try out PyTorch built-in functions
torch.mul(tensor, 10)

tensor([10, 20, 30])

In [43]:
torch.add(tensor, -10)

tensor([-9, -8, -7])

### Matrix multiplication

Two main ways of performing multiplication with tensors:
1. Element-wise multiplication
2. Matrix multiplication ( the dot product )

In [44]:
# Element-wise multiplication
print(f"{tensor} * {tensor} = {tensor*tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3]) = tensor([1, 4, 9])


In [45]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)

In [46]:
# Matrix multiplication on actual matrices ( not vectors )
MATRIX_A = torch.tensor([[1, 2],
                         [3, 4]])
MATRIX_B = torch.tensor([[5, 6],
                         [7, 8]])
torch.matmul(MATRIX_A, MATRIX_B)

tensor([[19, 22],
        [43, 50]])

### One of the most common errors in deep learning is the shape error
* The inner dimensions must match when multiplying 2 matrices.
* The resulting matrix has the shape of the **outer dimensions** ( number of rows from MATRIX_A and number of columns from MATRIX_B ).
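
The two rules above can be written down as a small check. `matmul_output_shape` is a hypothetical helper made up for this sketch (it is not a PyTorch function) and it only handles 2-D tensors.

```python
import torch

def matmul_output_shape(a, b):
    """Return the shape of a @ b for 2-D tensors, or None if the inner dimensions differ."""
    if a.ndim != 2 or b.ndim != 2:
        raise ValueError("this sketch only handles 2-D tensors")
    if a.shape[1] != b.shape[0]:       # the inner dimensions must match
        return None
    return (a.shape[0], b.shape[1])    # the outer dimensions give the result shape

ok = matmul_output_shape(torch.rand(3, 10), torch.rand(10, 3))
bad = matmul_output_shape(torch.rand(3, 11), torch.rand(10, 3))
print(ok, bad)
```

Checking shapes like this before calling `torch.matmul()` can make shape errors easier to track down.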

In [47]:
# Experiment: multiplying matrices with matching inner dimensions
torch.matmul(torch.rand(3, 10), torch.rand(10, 3))

tensor([[3.5992, 3.2946, 2.8493],
        [2.6046, 2.2982, 2.2701],
        [3.5502, 3.1472, 3.1487]])

In [48]:
# Experiment: multiplying matrices with mismatched inner dimensions
torch.matmul(torch.rand(3, 11), torch.rand(10, 3))

RuntimeError: ignored

In [49]:
# Shapes for matrix multiplication
tensor_A = torch.tensor([[1, 2],
                         [2, 4],
                         [5, 6]])
tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

torch.matmul(tensor_A, tensor_B) # for 2-D tensors, torch.mm() gives the same result as torch.matmul()

RuntimeError: ignored

To fix our tensor shape issues, we can manipulate the shape of one of our tensors using a **transpose**.<br>
A **transpose** switches the axes or dimensions of a given tensor.

In [50]:
tensor_B, tensor_B.shape

(tensor([[ 7, 10],
         [ 8, 11],
         [ 9, 12]]),
 torch.Size([3, 2]))

In [51]:
# The transpose of B
tensor_B.T , tensor_B.T.shape

(tensor([[ 7,  8,  9],
         [10, 11, 12]]),
 torch.Size([2, 3]))

In [52]:
# As we can imagine, now we can go forward and multiply tensor_A with the transpose of tensor_B
torch.matmul(tensor_A, tensor_B.T)

tensor([[ 27,  30,  33],
        [ 54,  60,  66],
        [ 95, 106, 117]])
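
Transposing `tensor_B` is not the only option. As a sketch, transposing `tensor_A` instead also makes the inner dimensions match, but gives a result of a different shape. For 2-D tensors, `.T` is the same as `transpose(0, 1)`.

```python
import torch

tensor_A = torch.tensor([[1, 2],
                         [2, 4],
                         [5, 6]])
tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

a_bt = torch.matmul(tensor_A, tensor_B.T)   # (3, 2) @ (2, 3) -> (3, 3)
at_b = torch.matmul(tensor_A.T, tensor_B)   # (2, 3) @ (3, 2) -> (2, 2)

# .T and transpose(0, 1) agree for a 2-D tensor
same_as_transpose = torch.equal(tensor_B.T, tensor_B.transpose(0, 1))

print(a_bt.shape, at_b.shape, same_as_transpose)
```

Which side to transpose depends on what the rows and columns mean in your data.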

## Finding the min, max, mean, sum etc ( tensor aggregation )

In [53]:
# Create a demo tensor
x = torch.arange(0, 100, 10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [54]:
# Find the min
torch.min(x), x.min()

(tensor(0), tensor(0))

In [55]:
# Find the max 
torch.max(x), x.max()

(tensor(90), tensor(90))

In [56]:
# Find the average ( mean ) - we have to convert the tensor first, because it was created as int64 and the mean functions don't work on int64.
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()

(tensor(45.), tensor(45.))

In [57]:
# Find the sum
torch.sum(x), x.sum()

(tensor(450), tensor(450))

In [58]:
# Find the index of the min and max
torch.argmin(x), torch.argmax(x)

(tensor(0), tensor(9))
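
The sketch below shows the dtype issue with the mean explicitly, and uses the positions returned by `argmin` / `argmax` to look the values back up. The flag name `mean_failed` is only for illustration.

```python
import torch

x = torch.arange(0, 100, 10)   # int64 tensor

# torch.mean() on an integer tensor raises a RuntimeError
try:
    torch.mean(x)
    mean_failed = False
except RuntimeError:
    mean_failed = True

# Casting to a float dtype first makes it work
mean_value = x.to(torch.float32).mean()

# argmin / argmax return positions, which can be used as indexes
smallest = x[torch.argmin(x)]
largest = x[torch.argmax(x)]

print(mean_failed, mean_value, smallest, largest)
```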

## Reshaping, stacking, squeezing and unsqueezing tensors

* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor with a certain shape, but keeps the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other ( vstack ) or side by side ( hstack )
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with its dimensions permuted ( swapped ) in a certain way
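
As a compact preview of the operations listed above, the following sketch applies each one and records the resulting shape in a comment. The variable names are illustrative.

```python
import torch

x = torch.arange(1., 10.)                          # shape (9,)

reshaped = x.reshape(3, 3)                         # (3, 3)
viewed = x.view(1, 9)                              # (1, 9), same memory as x
v_stacked = torch.vstack([x, x])                   # (2, 9), on top of each other
h_stacked = torch.hstack([x, x])                   # (18,), side by side
squeezed = viewed.squeeze()                        # (9,), size-1 dimension removed
unsqueezed = x.unsqueeze(dim=1)                    # (9, 1), size-1 dimension added
permuted = torch.rand(2, 3, 4).permute(2, 0, 1)    # (4, 2, 3)

for name, t in [("reshaped", reshaped), ("viewed", viewed),
                ("v_stacked", v_stacked), ("h_stacked", h_stacked),
                ("squeezed", squeezed), ("unsqueezed", unsqueezed),
                ("permuted", permuted)]:
    print(name, tuple(t.shape))
```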

In [59]:
# Create a new tensor
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [60]:
# Try to reshape to (1, 7) - this is an error, because the new shape has to hold the same number of elements, in this case 9
x_reshaped = x.reshape(1, 7)
x_reshaped, x_reshaped.shape

RuntimeError: ignored

In [61]:
# Add an extra dimension to our tensor: shape (1, 9)
x_reshaped = x.reshape(1, 9)
x_reshaped, x_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [62]:
# Add an extra dimension to our tensor: shape (9, 1)
x_reshaped = x.reshape(9, 1)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [63]:
# Change view of the tensor
z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [64]:
# Changing z will change x, because using tensor_b = tensor_a.view() means that tensor_b points to the same memory address
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
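
One way to check the memory sharing directly is `Tensor.data_ptr()`, which returns the address of the first element. The sketch below also shows a case where `view()` and `reshape()` differ: on a non-contiguous tensor (here a transposed one) `view()` raises an error, while `reshape()` falls back to making a copy. The variable names are illustrative.

```python
import torch

x = torch.arange(1., 10.)
z = x.view(1, 9)
shares_memory = z.data_ptr() == x.data_ptr()   # same underlying storage

# A transposed tensor is not contiguous in memory
t = torch.arange(6).reshape(2, 3).T
try:
    t.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() still works here, but it has to copy the data
flat = t.reshape(6)
reshape_copied = flat.data_ptr() != t.data_ptr()

print(shares_memory, t.is_contiguous(), view_failed, reshape_copied)
```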

In [65]:
# Stack tensors on top of each other ( dim=0 )
x_stacked = torch.stack([x, x, x, x])
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])

In [66]:
# Stack tensors side by side ( dim=1 )
x_stacked = torch.stack([x, x, x, x], dim=1)
x_stacked

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])

In [67]:
# Torch squeeze - removes all dimensions of size 1 from a target tensor
x_reshaped = x.reshape(1, 9)
x_reshaped, x_reshaped.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [68]:
x_reshaped.squeeze(), x_reshaped.squeeze().shape

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [75]:
# Torch unsqueeze - adds a dimension of size 1 to a target tensor at the given dim
x_rereshaped = x_reshaped.unsqueeze(dim=0)
x_rereshaped, x_rereshaped.shape, x_reshaped.shape

(tensor([[[5., 2., 3., 4., 5., 6., 7., 8., 9.]]]),
 torch.Size([1, 1, 9]),
 torch.Size([1, 9]))

In [80]:
# torch.permute - rearranges the dimensions of a target tensor in a specific order
x_original = torch.rand(size=(224, 224, 3)) # (height, width, color)
# Let us say we need to permute this tensor to be used in a function func() that accepts input (color, height, width)
x_permuted = x_original.permute(2, 0, 1)
x_original.shape, x_permuted.shape

(torch.Size([224, 224, 3]), torch.Size([3, 224, 224]))

In [82]:
# They point to the same memory address
x_original[0,0,0] = 1832
x_original[0, 0, 0], x_permuted[0, 0, 0]

(tensor(1832.), tensor(1832.))

## Selecting data from tensors, using indexes

Indexing with PyTorch is similar to NumPy.
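
To illustrate the similarity, the sketch below applies the same index expressions to a NumPy array and to a PyTorch tensor holding the same values. The names `a` and `t` are only for this example.

```python
import numpy as np
import torch

a = np.arange(1, 10).reshape(1, 3, 3)
t = torch.arange(1, 10).reshape(1, 3, 3)

# The same index expression selects the same elements in both libraries
column_np, column_t = a[:, :, 1], t[:, :, 1]
corner_np, corner_t = a[0, 2, 2], t[0, 2, 2]

print(column_np, column_t)
print(corner_np, corner_t)
```

The main visible difference is the return type: NumPy gives back arrays or NumPy scalars, while PyTorch gives back tensors, even for a single element.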

In [83]:
# Create a tensor
x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [85]:
# Indexing on dim=0 returns the single item it holds ( the 3x3 matrix )
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [86]:
# Index on the middle bracket (dim=1)
x[0][0]

tensor([1, 2, 3])

In [91]:
# Index on the last bracket (dim=2)
x[0][0][0]

tensor(1)

In [92]:
# x[0] - the tensor element ( a matrix )  / x[0][0] - the matrix's first row / x[0][0][0] - the row's first element


In [94]:
# We can also use ":" to select "all" of a target dimension
x[:, 0]

tensor([[1, 2, 3]])

In [95]:
# Get all values of the 0th and 1st dimensions, but only index 1 of the 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [100]:
# Get all values of the 0th dimension, but only index 1 of the 1st and 2nd dimensions
x[:, 1, 1], x[0][1][1], x[:, 1, 1].dtype, x[0][1][1].dtype

(tensor([5]), tensor(5), torch.int64, torch.int64)

In [101]:
# Get index 0 of the 0th and 1st dimensions and all values of the 2nd dimension
x[0, 0, :]

tensor([1, 2, 3])

In [108]:
# Index on x to return 9
x[0, 2, 2]


tensor(9)

In [112]:
# Index on x to return 3, 6, 9
x[0, :, 2]

tensor([3, 6, 9])

## PyTorch tensors and NumPy
NumPy is a popular Python library for scientific and numerical computing.
Because of this, PyTorch has functionality to interact with it.
* Data in NumPy, want it in a PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`
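
Two details are worth keeping in mind with these conversions: the dtype is carried over from NumPy, and the data may be shared rather than copied. The sketch below shows one way to get a float32 tensor and one way to get an independent copy. The variable names are illustrative.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)                                  # NumPy default: float64

shared = torch.from_numpy(array)                             # float64, shares memory with array
as_float32 = torch.from_numpy(array).type(torch.float32)     # converted copy in float32
independent = torch.from_numpy(array).clone()                # float64 copy with its own memory

array[0] = 5.0   # only the tensor that shares memory sees this change

print(shared.dtype, as_float32.dtype)
print(shared[0], as_float32[0], independent[0])
```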

In [113]:
# NumPy array to tensor ( although torch tensors are float32 by default, NumPy arrays are float64 by default, and torch.from_numpy() keeps the NumPy dtype )
import numpy as np
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array, tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [114]:
# Let's find out if the tensor points at the array's memory address or if it gets its own copy of the data
array[0] = 5.0
array, tensor

(array([5., 2., 3., 4., 5., 6., 7.]),
 tensor([5., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [115]:
# As we can see, the tensor shares memory with the array, so changing an element of the array in place also changes the tensor.
# `array = array + 1` creates a new array and rebinds the name, so the tensor keeps pointing at the original data
array = array + 1
array, tensor

(array([6., 3., 4., 5., 6., 7., 8.]),
 tensor([5., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))
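
The difference between changing the array in place and creating a new array can be seen with the two forms of addition. This is a small sketch; `after_in_place` and `after_rebind` are snapshot names made up for this example.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)

# In-place addition writes into the shared memory, so the tensor changes too
array += 1
after_in_place = tensor.clone()

# This creates a brand new array and rebinds the name; the tensor is untouched
array = array + 1
after_rebind = tensor.clone()

print(array)
print(after_in_place)
print(after_rebind)
```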

In [116]:
# Tensor to numpy
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor, tensor.dtype, numpy_tensor.dtype

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32),
 torch.float32,
 dtype('float32'))

In [118]:
tensor[0] = 1832.0
tensor, numpy_tensor

(tensor([1.8320e+03, 1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00, 1.0000e+00,
         1.0000e+00]),
 array([1.832e+03, 1.000e+00, 1.000e+00, 1.000e+00, 1.000e+00, 1.000e+00,
        1.000e+00], dtype=float32))

## Reproducibility ( trying to take the random out of random )
In short, how a neural network learns:
`start with random numbers -> tensor operations -> update random numbers to try to make them a better representation of the data -> loop...`
To reduce the randomness in neural networks and PyTorch, we use the concept of a **random seed**.
Essentially, what the random seed does is "flavour" the randomness.

In [121]:
# Create 2 random tensors
random_tensor_A = torch.rand(3, 4)
random_tensor_B = torch.rand(3, 4)
random_tensor_A, random_tensor_B

(tensor([[0.6854, 0.4982, 0.7767, 0.6539],
         [0.0872, 0.6317, 0.7357, 0.3607],
         [0.2578, 0.6386, 0.9205, 0.7282]]),
 tensor([[0.0579, 0.5902, 0.2940, 0.6790],
         [0.0391, 0.8707, 0.9776, 0.6793],
         [0.0720, 0.8254, 0.0113, 0.3327]]))

In [122]:
random_tensor_A == random_tensor_B

tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [127]:
# Let's make some random, but reproducible tensors


# Set the random seed
RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)

random_tensor_C, random_tensor_D

(tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]),
 tensor([[0.8823, 0.9150, 0.3829, 0.9593],
         [0.3904, 0.6009, 0.2566, 0.7936],
         [0.9408, 0.1332, 0.9346, 0.5936]]))

In [128]:
random_tensor_C == random_tensor_D

tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
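
An alternative to re-seeding the global random state before every call is to pass an explicit `torch.Generator` to `torch.rand()`. This is a sketch of that option, not something the notebook above uses. The names `gen_a` and `gen_b` are illustrative.

```python
import torch

# Two independent generators seeded with the same value
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

a = torch.rand(3, 4, generator=gen_a)
b = torch.rand(3, 4, generator=gen_b)   # same seed, same first draw
c = torch.rand(3, 4, generator=gen_a)   # second draw from gen_a, so different numbers

print(torch.equal(a, b), torch.equal(a, c))
```

Using a separate generator keeps the reproducible part of the code isolated from any other code that draws random numbers.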