<a href="https://colab.research.google.com/github/pkro/pytorch_for_deep_learning/blob/main/00_pytorch_fundamentals_video.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/

In [2]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)


2.0.1+cu118


In [3]:
!nvidia-smi # works only if connected to a runtime with GPU, not needed for now

/bin/bash: nvidia-smi: command not found


## Introduction to Tensors

### Creating tensors

PyTorch tensors are created with `torch.tensor()`; the resulting objects are instances of [torch.Tensor](https://pytorch.org/docs/stable/tensors.html).

In [4]:
# scalar

scalar = torch.tensor(7)
scalar

tensor(7)

In [5]:
print(scalar.ndim) # 0 -> a scalar is a 0-dimensional tensor
print(scalar.item()) # 7, get the value back as a Python int
print(scalar.shape) # torch.Size([]), a scalar has no dimensions

0
7
torch.Size([])


In [6]:
vector = torch.tensor([3,4]) # vector: magnitude and direction, indicated by x/y coordinates that represent a vector from [0,0] to [x,y]
vector

tensor([3, 4])

In [7]:
print(vector.ndim) # 1-dimensional tensor
print(vector.shape) # torch.Size([2])
# print(vector.item()) # error as a vector can't be converted to a scalar

1
torch.Size([2])


In [8]:
# matrices and tensors are usually written in all uppercase

MATRIX = torch.tensor([
                      [7,8],
                      [9,10]
                       ])

MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [9]:
print(MATRIX.ndim) # 2-dimensional tensor
print(MATRIX.shape) # torch.Size([2,2]) 2 by 2

2
torch.Size([2, 2])


In [10]:
TENSOR = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]]])
TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [11]:
TENSOR[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [12]:
TENSOR[0, 1] # same as TENSOR[0][1]

tensor([4, 5, 6])

In [13]:
print(TENSOR.ndim) # 3-dimensional tensor
print(TENSOR.shape) # torch.Size([1, 3, 3]) 1 * 3 * 3

3
torch.Size([1, 3, 3])


In [14]:
TENSOR2 = torch.tensor([[[1,2,3],
                        [4,5,6],
                        [7,8,9]],
                        [[11,12,13],
                        [14,15,16],
                        [17,18,19]]])
TENSOR2

tensor([[[ 1,  2,  3],
         [ 4,  5,  6],
         [ 7,  8,  9]],

        [[11, 12, 13],
         [14, 15, 16],
         [17, 18, 19]]])

In [15]:
print(TENSOR2.ndim) # 3
print(TENSOR2.shape) # [2,3,3]


3
torch.Size([2, 3, 3])
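The relationship between nesting depth, `ndim`, and `shape` shown in the cells above can be summarized in one loop. This is a minimal sketch; the `examples` dictionary and its labels are illustrative names, not part of the original notebook.

```python
import torch

# illustrative examples: each level of square brackets adds one dimension
examples = {
    "scalar": torch.tensor(7),
    "vector": torch.tensor([3, 4]),
    "matrix": torch.tensor([[7, 8], [9, 10]]),
    "tensor": torch.tensor([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]),
}

for name, t in examples.items():
    # ndim is always the number of entries in shape
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```
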


### Random tensors

Why random tensors?

Random tensors are important because many neural networks start learning with tensors of random numbers and adjust those numbers during the learning / training process to better represent the data.

Flow:

- start with random numbers
- look at data
- update random numbers
- look at data
- update random numbers
- etc
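The flow above can be sketched with a single random weight. This is not the course's method, only a minimal illustration under stated assumptions: the toy data `y = 2 * x`, the learning rate `lr`, and the hand-written gradient of the mean squared error are all chosen here for demonstration.

```python
import torch

torch.manual_seed(0)

# toy data (assumption): the "right" weight is 2.0
x = torch.arange(0, 10, dtype=torch.float32)
y = 2.0 * x

weight = torch.rand(1)  # start with a random number
lr = 0.01               # illustrative learning rate

for step in range(200):
    prediction = weight * x              # look at data
    error = prediction - y
    gradient = 2 * (error * x).mean()    # gradient of the mean squared error
    weight = weight - lr * gradient      # update the random number

print(weight)  # close to 2.0 after the updates
```
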

In [16]:
# create a random tensor of size (3,4)

random_tensor = torch.rand(3,4) # number of items per dimension, torch.rand(1,3,4) would create a 3-dimensional tensor
random_tensor

tensor([[0.8627, 0.2958, 0.9004, 0.8551],
        [0.2754, 0.2031, 0.2747, 0.2703],
        [0.0209, 0.2265, 0.9170, 0.3863]])

In [17]:
random_tensor.ndim, random_tensor.shape

(2, torch.Size([3, 4]))

In [18]:
# create a random tensor with a similar shape to an image tensor
random_image_size_tensor = torch.rand(size=(224,224,3)) # height, width, color channels (R,G,B); sometimes color channels are at the beginning (e.g. 3,224,224)

In [19]:
random_image_size_tensor.ndim, random_image_size_tensor.shape

(3, torch.Size([224, 224, 3]))
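Since some code expects the color channels first, the axes of an image-like tensor can be reordered with `permute`. A minimal sketch; the names `image_hwc` and `image_chw` are illustrative.

```python
import torch

image_hwc = torch.rand(size=(224, 224, 3))  # height, width, color channels
image_chw = image_hwc.permute(2, 0, 1)      # color channels, height, width

print(image_hwc.shape)
print(image_chw.shape)

# permute only reorders the axes, the values stay the same
print(image_chw[0, 5, 7] == image_hwc[5, 7, 0])
```
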

### zeroes and ones

In [20]:
zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [21]:
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [22]:
ones.dtype # floating-point tensors default to float32 unless a dtype is given

torch.float32

### Creating a range of tensors and tensors-like

In [23]:
torch.range(0,10) # deprecated, produces range 0-10
torch.arange(0,10) # works like python range, creates range 0-9 (10 is exclusive)


tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [24]:
one_to_ten = torch.arange(1,11)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [25]:
zero_to_1000_step_50 = torch.arange(0,1001,50)
zero_to_1000_step_50

tensor([   0,   50,  100,  150,  200,  250,  300,  350,  400,  450,  500,  550,
         600,  650,  700,  750,  800,  850,  900,  950, 1000])

In [26]:
# same as
torch.arange(start=0, end=1001, step=50)

tensor([   0,   50,  100,  150,  200,  250,  300,  350,  400,  450,  500,  550,
         600,  650,  700,  750,  800,  850,  900,  950, 1000])

In [27]:
# tensors like
# returns a tensor with the same shape and dtype as the input tensor, with all values set to zero
ten_zeros = torch.zeros_like(one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
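The `*_like` functions copy the shape and dtype of their input unless a dtype is passed explicitly. A short sketch, with illustrative variable names:

```python
import torch

one_to_ten = torch.arange(1, 11)  # int64, shape (10,)

zeros = torch.zeros_like(one_to_ten)  # same shape and dtype as the input
ones = torch.ones_like(one_to_ten)
float_zeros = torch.zeros_like(one_to_ten, dtype=torch.float32)  # same shape, dtype overridden

print(zeros.dtype, ones.dtype, float_zeros.dtype)
```
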

## Tensor datatypes

Tensor datatypes are one of the three big sources of errors when working with PyTorch and deep learning:

1. Tensor is not the right datatype
2. Tensor is not the right shape
3. Tensor is not on the right device
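As an illustration of the first kind of error, the sketch below multiplies an int64 vector with a float32 vector using `torch.matmul`, which raises a `RuntimeError`, and then fixes it by converting the dtype. The variable names are illustrative; shape and device errors surface in a similar way.

```python
import torch

int_vector = torch.tensor([1, 2, 3])          # int64
float_vector = torch.tensor([1.0, 2.0, 3.0])  # float32

dtype_error = None
try:
    torch.matmul(int_vector, float_vector)  # dtypes differ
except RuntimeError as error:
    dtype_error = error
    print(f"dtype mismatch: {error}")

# fix: convert one tensor so both share the same dtype
result = torch.matmul(int_vector.to(torch.float32), float_vector)
print(result)
```
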

In [28]:
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype=None, # specify datatype (defaults to float32 if floats are given)
                               device="cpu", # what device is the tensor on; "cuda" for gpu
                               requires_grad=False) # whether to track gradients in tensor operations or not
float_32_tensor.dtype

torch.float32

In [29]:
int64_tensor = torch.tensor([3,6,9]) # note we use ints, so the tensor will default to int64
int64_tensor.dtype

torch.int64

In [30]:
# convert tensor
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [31]:
# multiplication between tensors of different datatypes is possible,
# the output is promoted to the higher-precision datatype (float32 here)
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])

In [32]:
(int64_tensor * float_32_tensor).dtype # float32: multiplying an int tensor by a float tensor promotes the result to the float dtype

torch.float32

In [33]:
(torch.tensor([1,2,3], dtype=torch.int32) * torch.tensor([4,5,6], dtype=torch.float32)).dtype # float32, same as above

torch.float32
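The result dtype of mixed-type operations can be checked without performing the operation, using `torch.promote_types` and `torch.result_type`. A small sketch with illustrative tensors:

```python
import torch

# promotion between two dtypes
print(torch.promote_types(torch.float16, torch.float32))  # torch.float32
print(torch.promote_types(torch.int64, torch.float32))    # torch.float32
print(torch.promote_types(torch.int32, torch.int64))      # torch.int64

# promotion for two concrete tensors
int_tensor = torch.tensor([1, 2, 3])
half_tensor = torch.tensor([4.0, 5.0, 6.0], dtype=torch.float16)
promoted = torch.result_type(int_tensor, half_tensor)
print(promoted, (int_tensor * half_tensor).dtype)
```
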

### getting information from tensors

1. datatype (tensor.dtype)
2. shape (tensor.shape, tensor.size())
3. device (tensor.device)


In [34]:
some_tensor = torch.rand(3,4)

def tensor_info(tensor):
  print(f"shape: {tensor.shape}, dimensions: {tensor.ndim}") # no str(tensor.shape) needed in formatted strings / string literals
  print(f"dtype: {tensor.dtype}")
  print(f"device: {tensor.device}")

tensor_info(some_tensor)


shape: torch.Size([3, 4]), dimensions: 2
dtype: torch.float32
device: cpu


### manipulating tensors (tensor operations)

- addition
- subtraction
- division
- multiplication (element-wise)
- matrix multiplication

In [35]:
tensor = torch.tensor([1,2,3])
tensor + 10 # adds 10 to each element

tensor([11, 12, 13])

In [36]:
tensor * 10 # multiplies each element by 10

tensor([10, 20, 30])

In [37]:
print(tensor - 10) # a notebook only echoes the last expression of a cell, and a trailing semicolon suppresses that echo; print() always shows the value

tensor([-9, -8, -7])


In [38]:
# using PyTorch built-in functions
torch.mul(tensor, 10), torch.add(tensor, 5), torch.sub(tensor, 20)

(tensor([10, 20, 30]), tensor([6, 7, 8]), tensor([-19, -18, -17]))

In general, the course recommends using the standard Python operators instead of the torch functions for better readability.

For more complex operations such as matrix multiplication, the built-in torch implementations (`torch.matmul` or the `@` operator) are much faster than hand-written Python loops.
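As a quick check that the operators and the torch functions are interchangeable, the sketch below compares their results. The variable names are illustrative.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# each operator gives the same result as its torch function
same_mul = torch.equal(tensor * 10, torch.mul(tensor, 10))
same_add = torch.equal(tensor + 5, torch.add(tensor, 5))
same_sub = torch.equal(tensor - 20, torch.sub(tensor, 20))

# the @ operator is matrix multiplication, like torch.matmul
matrix = torch.rand(3, 3)
same_matmul = torch.allclose(matrix @ matrix, torch.matmul(matrix, matrix))

print(same_mul, same_add, same_sub, same_matmul)
```
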


### Matrix multiplication

[how to multiply matrices](https://www.mathsisfun.com/algebra/matrix-multiplying.html)

[visualization](http://matrixmultiplication.xyz/)

Two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication / scalar multiplication
  - multiply by a single number: multiply each matrix element by that number
2. Matrix multiplication / dot product (most common operation in neural networks)
  - symbol: **&middot;** (just a fat dot), e.g. "a **&middot;** b"
  - pytorch function: `torch.matmul(t1, t2)` or `torch.mm(...)`, python operator: `@`, e.g. `t1 @ t2`
  - **inner dimensions** must match:
    - (3, **2**) @ (**3**, 2) will NOT work
    - (2, **3**) @ (**3**, 2) WILL work
    - (3, **2**) @ (**2**, 3) WILL work
  - the resulting matrix has the shape of the **outer dimensions**  
    - (**2**, 3) @ (3, **2**) -> resulting shape: (2, 2)
    - (**3**, 2) @ (2, **3**) -> resulting shape: (3, 3)
    - torch.rand(7,10) @ torch.rand(10,2) -> shape: (7,2)
  - order matters: `A @ B` is generally not equal to `B @ A`
  - each element of the result is the dot product of a row of the first matrix with a column of the second matrix
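The two shape rules above can be written as a small check. `matmul_shape` is a hypothetical helper for 2-dimensional shapes, not a PyTorch function:

```python
import torch

def matmul_shape(shape_a, shape_b):
    """Hypothetical helper: result shape of a 2-D matrix multiplication,
    or None if the inner dimensions do not match."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    if cols_a != rows_b:       # inner dimensions must match
        return None
    return (rows_a, cols_b)    # result takes the outer dimensions

print(matmul_shape((3, 2), (3, 2)))    # None
print(matmul_shape((2, 3), (3, 2)))    # (2, 2)
print(matmul_shape((3, 2), (2, 3)))    # (3, 3)
print(matmul_shape((7, 10), (10, 2)))  # (7, 2)
```

The prediction can be compared against PyTorch itself, for example `tuple((torch.rand(7, 10) @ torch.rand(10, 2)).shape)`.
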



In [39]:
(torch.rand(7,10) @ torch.rand(10,2)).shape

torch.Size([7, 2])

In [40]:
# element-wise multiplication
torch.tensor([[1,2,3], [4,5,6]]) * torch.tensor([[1,2,3], [4,5,6]])

tensor([[ 1,  4,  9],
        [16, 25, 36]])

In [41]:
# matrix multiplication with vectors
# tensor from before is [1,2,3]
torch.matmul(tensor, tensor) # 1*1 + 2*2 + 3*3

tensor(14)

In [42]:
# dot product / matrix multiplication
torch.matmul(torch.tensor(
    [ [1,2,3],
      [4,5,6]
    ]),torch.tensor(
        [
          [7,8],
          [9,10],
          [11,12]
        ]))

tensor([[ 58,  64],
        [139, 154]])
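To see where the values in this result come from, the same product can be computed by hand, one row and one column at a time. This is a slow sketch for illustration only; the names `A`, `B`, and `manual` are illustrative.

```python
import torch

A = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
B = torch.tensor([[7, 8],
                  [9, 10],
                  [11, 12]])

# result has the outer dimensions: rows of A by columns of B
manual = torch.zeros(A.shape[0], B.shape[1], dtype=A.dtype)
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        # dot product of row i of A with column j of B
        manual[i, j] = (A[i, :] * B[:, j]).sum()

print(manual)  # e.g. manual[0, 0] = 1*7 + 2*9 + 3*11 = 58
print(torch.equal(manual, A @ B))
```
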

### One of the most common errors in deep learning: shape errors



In [56]:
# shapes for matrix multiplication
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [5,6]])

tensor_B = torch.tensor([[7,10],
                         [8,11],
                         [9,12]])

tensor_info(tensor_A)
tensor_info(tensor_B)

try:
  torch.matmul(tensor_A, tensor_B) # error
except RuntimeError:
  print("can't multiply tensors: incompatible shapes")



shape: torch.Size([3, 2]), dimensions: 2
dtype: torch.int64
device: cpu
shape: torch.Size([3, 2]), dimensions: 2
dtype: torch.int64
device: cpu
can't multiply tensors: incompatible shapes


To fix the error, we can transpose one of our tensors so that the inner dimensions match.

Transpose switches the axes or dimensions of a given tensor.

In [58]:
tensor_info(tensor_B)
print(tensor_B)
tensor_info(tensor_B.T) # transposed
print(tensor_B.T)


shape: torch.Size([3, 2]), dimensions: 2
dtype: torch.int64
device: cpu
tensor([[ 7, 10],
        [ 8, 11],
        [ 9, 12]])
shape: torch.Size([2, 3]), dimensions: 2
dtype: torch.int64
device: cpu
tensor([[ 7,  8,  9],
        [10, 11, 12]])
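With one tensor transposed, the inner dimensions match and the multiplication works. The sketch below shows both options; which tensor is transposed determines the shape of the result.

```python
import torch

tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]])
tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]])

# (3, 2) @ (2, 3) -> (3, 3)
result_a_bt = torch.matmul(tensor_A, tensor_B.T)
print(result_a_bt.shape)
print(result_a_bt)

# (2, 3) @ (3, 2) -> (2, 2)
result_at_b = torch.matmul(tensor_A.T, tensor_B)
print(result_at_b.shape)
print(result_at_b)
```
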
