## 00. PyTorch fundamentals

Resource notebook: https://www.learnpytorch.io/00_pytorch_fundamentals/ 

In [None]:
# Check whether an NVIDIA GPU (and its CUDA driver) is available.
# A GPU is a processor originally specialized in handling graphics. It is now also widely used for machine learning because it is faster than a CPU for parallel numeric work.
# A TPU is a processor that is specialized in handling tensor operations (TPUs do not use CUDA).
# CUDA is a parallel computing platform and application programming interface (API) model created by NVIDIA.
!nvidia-smi

In [None]:
# Install PyTorch with GPU support (cu121): https://pytorch.org/get-started/locally/#start-locally
%pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

In [None]:
%pip install pandas
%pip install numpy
%pip install matplotlib

In [2]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)
torch.cuda.is_available()

2.1.2+cu121


True

In [None]:
print("I am excited to learn PyTorch!")

## Introduction to Tensors



### Creating tensors

In the context of PyTorch and other machine learning libraries, a tensor is a generalization of vectors and matrices to potentially higher dimensions. It's the primary data structure used by neural networks.

In more detail:

- A 0-dimensional tensor is a single number (or scalar).
- A 1-dimensional tensor is a vector.
- A 2-dimensional tensor is a matrix.
- Tensors may also have more than two dimensions.

Tensors in PyTorch are similar to NumPy's ndarrays, and they can be used on a GPU as well. The tensor is one of the fundamental concepts in deep learning and PyTorch, so understanding what tensors are and how to manipulate them is crucial for building neural networks.
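
To make the comparison with NumPy concrete, here is a minimal sketch that converts an ndarray to a tensor and back. The variable names (`array`, `tensor_from_array`, `back_to_numpy`) are illustrative and not part of the original notebook.

```python
import numpy as np
import torch

# A NumPy array and the tensor built from it have the same shape and values
array = np.array([[1.0, 2.0], [3.0, 4.0]])
tensor_from_array = torch.from_numpy(array)

print(tensor_from_array.ndim)   # number of dimensions, like ndarray.ndim
print(tensor_from_array.shape)  # torch.Size([2, 2])
print(tensor_from_array.dtype)  # torch.float64, inherited from NumPy's default float64

# Convert back to NumPy (this works for tensors that live on the CPU)
back_to_numpy = tensor_from_array.numpy()
```

Note that the dtype follows NumPy's default (float64) rather than PyTorch's default (float32), which matters once datatypes are discussed later on.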

![](./img/scalar-vector-matrix-tensor.webp)

In [9]:
# scalar
scalar = torch.tensor(7)
scalar


# See how the above printed out tensor(7)?
# That means although scalar is a single number, it's of type torch.Tensor.

tensor(7)

In [11]:
# Dimension of a scalar (tensor)
scalar.ndim

0

In [13]:
# Get the actual value for a tensor
scalar.item()

7

In [16]:
# Vector

vector = torch.tensor([7,7])

# Vector dimension (number of opening square brackets on one side of the tensor)
vector.ndim

1

In [18]:
# Shape of the vector (number of elements along each dimension)
vector.shape

torch.Size([2])

In [20]:
# Matrix
MATRIX = torch.tensor([[7,8],[9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [21]:
MATRIX.ndim

2

In [22]:
MATRIX[1]

tensor([ 9, 10])

In [24]:
MATRIX.shape

torch.Size([2, 2])

In [35]:
# Tensor
TENSOR = torch.tensor([[[1,2,3],[3,6,9],[2,4,4]]])
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 4]]])

In [36]:
TENSOR.ndim

3

In [37]:
TENSOR.shape, TENSOR.size()
# shape and size() return the same result, but size() is a method and shape is an attribute

(torch.Size([1, 3, 3]), torch.Size([1, 3, 3]))
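
A short sketch of how `shape` and `size()` can be used in practice. The tensor `example` repeats the values of `TENSOR` so the block runs on its own; the other names are illustrative.

```python
import torch

# Same values as TENSOR in the cell above
example = torch.tensor([[[1, 2, 3], [3, 6, 9], [2, 4, 4]]])

same = example.shape == example.size()  # both are torch.Size([1, 3, 3])
rows = example.size(1)                  # size() accepts a dimension index
rows_via_shape = example.shape[1]       # shape can be indexed like a tuple
total_elements = example.numel()        # 1 * 3 * 3 = 9 elements

print(same, rows, rows_via_shape, total_elements)
```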

Some tensor visualisation:

![](./img/tensor_shape1.png)
![](./img/tensor_shape2.png)
![](./img/tensor_shape3.png)
![](./img/tensor_shape4.png)

In [35]:
TENSOR2 = torch.tensor([[[1,2,3,4,5],[1,2,3,4,5]]])
TENSOR2.shape

torch.Size([1, 2, 5])

### Random tensors

Why random tensors?

Random tensors are important because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

__Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers__
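
The loop above can be sketched with a single random weight. This is a toy illustration, not how the notebook trains models: the data (`x`, `y`), the learning rate of 0.01, and the hand-written gradient of the mean squared error are all assumptions chosen to keep the example small.

```python
import torch

torch.manual_seed(0)

# Toy data that follows y = 2 * x
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

# Start with a random number
w = torch.rand(1)

for _ in range(200):
    pred = w * x                        # look at the data
    grad = (2 * (pred - y) * x).mean()  # gradient of the mean squared error w.r.t. w
    w = w - 0.01 * grad                 # update the random number

print(w)  # close to 2.0
```

After enough updates the random starting value ends up close to 2, the value that best represents this data.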

In [37]:
# Create a random tensor of shape (3,4)
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.5768, 0.7574, 0.3424, 0.9172],
        [0.3021, 0.6128, 0.7562, 0.4897],
        [0.5138, 0.8452, 0.4766, 0.9069]])

In [38]:
random_tensor.ndim

2

In [3]:
random_tensor = torch.rand(4,5,2)
random_tensor

tensor([[[0.6591, 0.4218],
         [0.4137, 0.4684],
         [0.4316, 0.0725],
         [0.7943, 0.7937],
         [0.3437, 0.9042]],

        [[0.5238, 0.4086],
         [0.5206, 0.3143],
         [0.2686, 0.4987],
         [0.7505, 0.9732],
         [0.9177, 0.2931]],

        [[0.3707, 0.1161],
         [0.4171, 0.4594],
         [0.6483, 0.3749],
         [0.5519, 0.4703],
         [0.8743, 0.5569]],

        [[0.0224, 0.9984],
         [0.1157, 0.2822],
         [0.6692, 0.2695],
         [0.0932, 0.9936],
         [0.7787, 0.6376]]])

In [9]:
# Create a random tensor with a shape similar to an image tensor (height, width, color channels (R,G,B))
random_image_size_tensor = torch.rand(size=(224,224,3))
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

In [10]:
random_image_size_tensor.dtype

torch.float32

### Zeros and ones

In [4]:
# Create a tensor of all zeros, useful for masking (zeroing out) other tensors
zeros = torch.zeros(5,2)
zeros

tensor([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]])

In [6]:
ones = torch.ones(3,4)
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [7]:
ones.dtype

torch.float32

## Creating tensors in a range and tensors like other tensors

In [16]:
# Use torch.arange (torch.range is deprecated)
one_to_ten = torch.arange(1,11)
one_to_ten

tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [17]:
one_to_ten = torch.arange(start=0, end=1000, step=77)
one_to_ten

tensor([  0,  77, 154, 231, 308, 385, 462, 539, 616, 693, 770, 847, 924])

In [18]:
# Create a tensor "like" another tensor: same shape as the input tensor
ten_zeros = torch.zeros_like(one_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [23]:
ten_rand = torch.rand_like(one_to_ten, dtype=torch.float32)
ten_rand

tensor([0.4786, 0.4460, 0.1637, 0.0523, 0.4237, 0.5206, 0.6270, 0.3820, 0.8643,
        0.9325, 0.0144, 0.1777, 0.7528])

In [21]:
ten_ones = torch.ones_like(one_to_ten)
ten_ones

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
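
The `*_like` functions copy more than the shape. The sketch below suggests why the `rand_like` cell above passes `dtype=torch.float32`: the source tensor holds integers, and the "like" tensor inherits that dtype unless told otherwise. The names `source`, `zeros_same_dtype` and `zeros_as_float` are illustrative.

```python
import torch

# Integer arguments to arange give an int64 tensor
source = torch.arange(start=0, end=1000, step=77)

zeros_same_dtype = torch.zeros_like(source)                     # inherits int64
zeros_as_float = torch.zeros_like(source, dtype=torch.float32)  # explicit override

print(source.shape, zeros_same_dtype.shape)
print(zeros_same_dtype.dtype, zeros_as_float.dtype)
```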

## Tensor datatype

The datatype determines the precision of a tensor, that is, how much detail each number is stored with.

A torch.Tensor is a multi-dimensional matrix containing elements of a single data type. [Data types](https://pytorch.org/docs/stable/tensors.html#data-types)

**Note:** Tensor datatypes are one of the 3 biggest sources of errors you'll run into with PyTorch and deep learning:

1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device

In [25]:
# Float 32 tensor
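float_32_tensor = torch.tensor([1.0,2.0,3.0],
                               dtype=None, # None falls back to the default, torch.float32 for float data
                               device=None, # None falls back to the current default device
                               requires_grad=False) # if True, track gradients for backpropagation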
float_32_tensor.dtype

torch.float32

In [27]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([1., 2., 3.], dtype=torch.float16)

In [30]:
# Here datatype difference is not a problem
float_16_tensor, float_32_tensor, float_16_tensor * float_32_tensor

(tensor([1., 2., 3.], dtype=torch.float16),
 tensor([1., 2., 3.]),
 tensor([1., 4., 9.]))

In [31]:
# Here datatype difference is not a problem
int_32_tensor = torch.tensor([3,6,9], dtype=torch.int32)

int_32_tensor, float_32_tensor, int_32_tensor * float_32_tensor

(tensor([3, 6, 9], dtype=torch.int32),
 tensor([1., 2., 3.]),
 tensor([ 3., 12., 27.]))
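
The multiplications above work because element-wise arithmetic promotes mixed datatypes to a common type. The sketch below shows that promotion and one case where the datatype does cause an error. The variable names are illustrative; `torch.result_type` and `torch.mean` are standard PyTorch functions.

```python
import torch

float_16 = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float16)
float_32 = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# Element-wise arithmetic promotes mixed dtypes to a common type
promoted_floats = torch.result_type(float_16, float_32)  # torch.float32
promoted_mixed = (int_32 * float_32).dtype               # torch.float32

# Some operations reject a dtype instead of promoting it
try:
    torch.mean(int_32)  # mean() needs a floating point (or complex) input
    mean_failed = False
except RuntimeError:
    mean_failed = True

# Cast first, then the operation works
mean_after_cast = torch.mean(int_32.type(torch.float32))

print(promoted_floats, promoted_mixed, mean_failed, mean_after_cast)
```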

## Getting information from tensors

Once you've created tensors (or someone else or a PyTorch module has created them for you), you might want to get some information from them.

We've seen these before but three of the most common attributes you'll want to find out about tensors are:

- shape - what shape is the tensor? (some operations require specific shape rules)
- dtype - what datatype are the elements within the tensor stored in?
- device - what device is the tensor stored on? (usually GPU or CPU)

Let's create a random tensor and find out details about it.


In [38]:
# Create a tensor
some_tensor = torch.rand(3, 4)

# Find out details about it
print(some_tensor)
print(f"Shape of tensor: {some_tensor.shape}")
print(f"Datatype of tensor: {some_tensor.dtype}")
print(f"Device tensor is stored on: {some_tensor.device}") # will default to CPU

tensor([[0.5552, 0.1641, 0.7541, 0.9946],
        [0.4853, 0.7021, 0.4579, 0.1000],
        [0.4666, 0.9677, 0.6146, 0.4124]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


In [40]:
# To change the default device in PyTorch (set it before creating the tensor):
torch.set_default_device('cuda')
some_tensor = torch.rand(3, 4)
print(f"Device tensor is stored on: {some_tensor.device}")

Device tensor is stored on: cuda:0
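
Changing the default device affects every tensor created afterwards. A common alternative, sketched here as one possible pattern rather than the notebook's own approach, is to pick a device once and move tensors to it explicitly. The names `device`, `cpu_tensor` and `moved_tensor` are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

cpu_tensor = torch.rand(3, 4, device="cpu")  # created explicitly on the CPU
moved_tensor = cpu_tensor.to(device)         # a tensor on the chosen device

print(cpu_tensor.device, moved_tensor.device)
```

Written this way, the same code runs on machines with and without a CUDA-capable GPU.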


## Basic operations

In [42]:
# Create a tensor of values and add a number to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13], device='cuda:0')

In [43]:
# Multiply it by 10
tensor * 10

tensor([10, 20, 30], device='cuda:0')

In [44]:
# Tensors don't change unless reassigned
tensor

tensor([1, 2, 3], device='cuda:0')

In [45]:
# Subtract and reassign
tensor = tensor - 10
tensor

tensor([-9, -8, -7], device='cuda:0')

In [46]:
# Add and reassign
tensor = tensor + 10
tensor

tensor([1, 2, 3], device='cuda:0')

In [47]:
# Can also use torch functions
torch.multiply(tensor, 10)

tensor([10, 20, 30], device='cuda:0')

In [48]:
# Original tensor is still unchanged 
tensor

tensor([1, 2, 3], device='cuda:0')

In [49]:
# Element-wise multiplication (each element multiplies its equivalent, index 0->0, 1->1, 2->2)
print(tensor, "*", tensor)
print("Equals:", tensor * tensor)

tensor([1, 2, 3], device='cuda:0') * tensor([1, 2, 3], device='cuda:0')
Equals: tensor([1, 4, 9], device='cuda:0')


## Matrix multiplication (is all you need)

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is [matrix multiplication](https://www.mathsisfun.com/algebra/matrix-multiplying.html).

PyTorch implements matrix multiplication functionality in the torch.matmul() function.

The two main rules for matrix multiplication to remember are:

1. The **inner dimensions** must match:
   - (3, 2) @ (3, 2) won't work
   - (2, 3) @ (3, 2) will work
   - (3, 2) @ (2, 3) will work
2. The resulting matrix has the shape of the **outer dimensions**:
   - (2, 3) @ (3, 2) -> (2, 2)
   - (3, 2) @ (2, 3) -> (3, 3)

**Note:** "@" in Python is the symbol for matrix multiplication.

**Resource:** You can see all of the rules for matrix multiplication using torch.matmul() in the PyTorch documentation.
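
The two shape rules can be checked directly. This is a minimal sketch with random values; the names `a`, `b`, `mismatch_raised`, `outer_3x3` and `outer_2x2` are illustrative.

```python
import torch

a = torch.rand(3, 2)
b = torch.rand(3, 2)

# (3, 2) @ (3, 2): the inner dimensions (2 and 3) differ, so this fails
try:
    torch.matmul(a, b)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

# Transposing b gives (3, 2) @ (2, 3) -> (3, 3)
outer_3x3 = torch.matmul(a, b.T)

# Transposing a gives (2, 3) @ (3, 2) -> (2, 2)
outer_2x2 = torch.matmul(a.T, b)

print(mismatch_raised, outer_3x3.shape, outer_2x2.shape)
```

Transposing one operand is a common way to make the inner dimensions line up when both tensors start with the same shape.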

Let's create a tensor and perform element-wise multiplication and matrix multiplication on it.

In [50]:
import torch
tensor = torch.tensor([1, 2, 3])
tensor.shape

torch.Size([3])


The difference between element-wise multiplication and matrix multiplication is the addition of values.

For our tensor variable with values [1, 2, 3]:

| Operation | Calculation | Code |
|-----------|-------------|------|
| Element-wise multiplication | [1 * 1, 2 * 2, 3 * 3] = [1, 4, 9] | tensor * tensor |
| Matrix multiplication | [1 * 1 + 2 * 2 + 3 * 3] = [14] | tensor.matmul(tensor) |
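
The relationship in the table can be verified in a few lines: for two vectors, summing the element-wise products gives the same value as the matrix multiplication. The variable names are illustrative, and float values are used so the sketch also runs on a GPU.

```python
import torch

values = torch.tensor([1.0, 2.0, 3.0])

element_wise = values * values              # [1 * 1, 2 * 2, 3 * 3]
summed = element_wise.sum()                 # 1 + 4 + 9
dot_product = torch.matmul(values, values)  # multiplies and adds in one step

print(element_wise, summed, dot_product)
```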


In [59]:
# Element-wise multiplication
tensor = torch.tensor([1., 2., 3.])
tensor * tensor

tensor([1., 4., 9.], device='cuda:0')

In [60]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14., device='cuda:0')

In [61]:
# Can also use the "@" symbol for matrix multiplication, though not recommended
tensor @ tensor

tensor(14., device='cuda:0')

You can do matrix multiplication by hand but it's not recommended.

The built-in torch.matmul() function is faster.

In [62]:
%%time
# Matrix multiplication by hand 
# (avoid for loops for tensor operations wherever possible, they are computationally expensive)
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
value

CPU times: total: 0 ns
Wall time: 11.2 ms


tensor(14., device='cuda:0')

In [63]:
%%time
torch.matmul(tensor, tensor)

CPU times: total: 0 ns
Wall time: 0 ns


tensor(14., device='cuda:0')
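
With only three elements, the timings above are too small to compare reliably. The sketch below repeats the comparison on a 1,000-element vector using `time.perf_counter`. The vector size and variable names are assumptions, and the measured times will vary by machine, so no specific numbers are claimed here.

```python
import time
import torch

# A longer vector makes the difference easier to measure
vector_values = torch.arange(1000, dtype=torch.float64)

# Multiply and add element by element in a Python loop
start = time.perf_counter()
loop_value = torch.tensor(0.0, dtype=torch.float64)
for i in range(len(vector_values)):
    loop_value += vector_values[i] * vector_values[i]
loop_seconds = time.perf_counter() - start

# Same calculation with the built-in function
start = time.perf_counter()
matmul_value = torch.matmul(vector_values, vector_values)
matmul_seconds = time.perf_counter() - start

print(f"loop:   {loop_seconds:.6f} s")
print(f"matmul: {matmul_seconds:.6f} s")
```

Both approaches produce the same value; only the time taken differs.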