
Importing PyTorch

In [None]:
import torch
torch.__version__

'2.5.1+cu121'

Creating Tensors

In [None]:
#Scalar: a single number; in tensor-speak it's a zero-dimensional tensor.
scalar= torch.tensor(7)
scalar

tensor(7)

In [None]:
scalar.ndim

0

In [None]:
scalar.item()

7

In [None]:
#Vector
vector= torch.tensor([7,7])
vector

tensor([7, 7])

In [None]:
vector.ndim

1

In [None]:
vector.shape

torch.Size([2])

In [None]:
#Matrix
matrix= torch.tensor([[7,7],
                      [1,2]])
matrix

tensor([[7, 7],
        [1, 2]])

In [None]:
matrix.ndim

2

In [None]:
matrix.shape

torch.Size([2, 2])

In [None]:
#Tensor
TENSOR= torch.tensor([[[1,2,3],
                     [4,5,6],
                     [7,8,9]]])

TENSOR

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
TENSOR.ndim

3

In [None]:
TENSOR.shape

torch.Size([1, 3, 3])

In [None]:
#create a random tensor of (3,4)
random_tensor= torch.rand(size=(3,4))
random_tensor, random_tensor.dtype

(tensor([[0.4514, 0.0490, 0.7729, 0.7249],
         [0.7287, 0.1474, 0.9179, 0.0292],
         [0.9660, 0.9198, 0.0401, 0.0668]]),
 torch.float32)

In [None]:
#create a random tensor of size (224,224,3) ([height, width, color_channel])
random_image_size_tensors= torch.rand(size=(224,224,3))
random_image_size_tensors.shape, random_image_size_tensors.ndim

(torch.Size([224, 224, 3]), 3)

Zeros and ones

Sometimes you'll just want to fill tensors with zeros or ones.

This happens a lot with masking (like masking some of the values in one tensor with zeros to let a model know not to learn them).

In [None]:
#create a tensor with zeros
zeros= torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [None]:
#with ones
ones= torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
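As a rough illustration of the masking idea mentioned above, the sketch below multiplies a tensor by a mask made of ones and zeros. The names `values`, `mask` and `masked` are made up for this example, and zeroing the last column is an arbitrary choice.

```python
import torch

# Hypothetical data: a 2x3 tensor of values we want to partially hide
values = torch.tensor([[1., 2., 3.],
                       [4., 5., 6.]])

# Start from a mask of ones (keep everything), then zero out the last column
mask = torch.ones_like(values)
mask[:, -1] = 0

# Element-wise multiplication keeps values where mask is 1 and zeroes the rest
masked = values * mask
print(masked)
```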

In [None]:
#create a range of values 0 to 10 (torch.range is deprecated, so use torch.arange; the end value is exclusive)
zero_to_ten= torch.arange(0., 11.)
zero_to_ten

tensor([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [None]:
#can also create a tensor of zeros similar to another tensor
ten_zeros= torch.zeros_like(input=zero_to_ten)
ten_zeros

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [None]:
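#default datatype for tensors is float32
float_32_tensor= torch.tensor([3.0,6.0,9.0],
                              dtype=None, # None falls back to the default dtype (torch.float32 for float values)
                              device=None, # None falls back to the current default device (usually the CPU)
                              requires_grad=False) # whether to track gradients for this tensor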
float_32_tensor.dtype

torch.float32
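To see what the `dtype` argument does when it is not left as `None`, here is a small sketch that requests float16 explicitly and then mixes it with a float32 tensor. The name `float_16_tensor` is just for illustration, and the snippet assumes PyTorch's default dtype has not been changed.

```python
import torch

# Explicitly request a lower-precision datatype instead of the float32 default
float_16_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
float_32_tensor = torch.tensor([3.0, 6.0, 9.0])  # dtype=None -> float32

# Mixing the two promotes the result to the wider type
product = float_16_tensor * float_32_tensor
print(float_16_tensor.dtype, float_32_tensor.dtype, product.dtype)
```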

Manipulating Tensors:
1. Addition
2. Subtraction
3. Multiplication
4. Division
5. Matrix Multiplication

In [None]:
# Create a tensor of values and add a number to it
tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [None]:
# Multiply it by 10
tensor * 10

tensor([10, 20, 30])

In [None]:
# Tensors don't change unless reassigned
tensor

tensor([1, 2, 3])

In [None]:
# Subtract and reassign
tensor = tensor - 10
tensor

tensor([-9, -8, -7])

In [None]:
# Add and reassign
tensor = tensor + 10
tensor

tensor([1, 2, 3])

In [None]:
# Can also use torch functions
torch.multiply(tensor, 10)

tensor([10, 20, 30])

In [None]:
# Original tensor is still unchanged
tensor

tensor([1, 2, 3])

In [None]:
# Element-wise multiplication (each element multiplies its equivalent, index 0->0, 1->1, 2->2)
print(tensor, "*", tensor)
print("Equals:", tensor * tensor)

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


Matrix multiplication (is all you need)

One of the most common operations in machine learning and deep learning algorithms (like neural networks) is matrix multiplication.

PyTorch implements matrix multiplication in the torch.matmul() function (also available as the @ operator).

The main two rules for matrix multiplication to remember are:

1. The inner dimensions must match:
   (3, 2) @ (3, 2) won't work
   (2, 3) @ (3, 2) will work
   (3, 2) @ (2, 3) will work
2. The resulting matrix has the shape of the outer dimensions:
   (2, 3) @ (3, 2) -> (2, 2)
   (3, 2) @ (2, 3) -> (3, 3)
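The two rules above can be checked directly on random tensors. This is a minimal sketch: `a`, `b` and `mismatch_raised` are illustrative names, and `@` is Python's matrix multiplication operator, which for tensors behaves like torch.matmul().

```python
import torch

a = torch.rand(2, 3)
b = torch.rand(3, 2)

# Inner dimensions match (3 and 3), so the result takes the outer dimensions
print((a @ b).shape)  # (2, 3) @ (3, 2) -> (2, 2)
print((b @ a).shape)  # (3, 2) @ (2, 3) -> (3, 3)

# Inner dimensions differ (2 and 3), so PyTorch raises a RuntimeError
try:
    b @ b
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"Shape error: {err}")
```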

In [None]:
import torch
tensor= torch.tensor([1,2,3])
tensor.shape

torch.Size([3])

Operation	Calculation	Code
Element-wise multiplication	[1*1, 2*2, 3*3] = [1, 4, 9]	tensor * tensor
Matrix multiplication	[1*1 + 2*2 + 3*3] = [14]	tensor.matmul(tensor)

In [None]:
# Element-wise matrix multiplication
tensor * tensor

tensor([1, 4, 9])

In [None]:
# Matrix multiplication
torch.matmul(tensor, tensor)

tensor(14)
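For two 1-D tensors, torch.matmul() computes the dot product shown in the table above. The sketch below does the same calculation by hand with a Python loop so the two results can be compared; `manual_result` is a name invented for this example.

```python
import torch

tensor = torch.tensor([1, 2, 3])

# Multiply matching elements and accumulate the sum by hand
manual_result = 0
for i in range(len(tensor)):
    manual_result += tensor[i] * tensor[i]

print(manual_result)                 # 1*1 + 2*2 + 3*3
print(torch.matmul(tensor, tensor))  # same value, computed by PyTorch
```

On large tensors the built-in version is usually much faster than a Python loop, so the loop is only useful for understanding what happens.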

In [None]:
# Shapes need to be compatible: the inner dimensions must match
tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [5, 6]], dtype=torch.float32)

tensor_B = torch.tensor([[7, 10],
                         [8, 11],
                         [9, 12]], dtype=torch.float32)

torch.matmul(tensor_A, tensor_B) # (this will error)


RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

In [None]:
print(tensor_A)
print(tensor_B)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7., 10.],
        [ 8., 11.],
        [ 9., 12.]])


In [None]:
print(tensor_A)
print(tensor_B.T)

tensor([[1., 2.],
        [3., 4.],
        [5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])


In [None]:
torch.matmul(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

In [None]:
# torch.mm is a shorthand for matmul on 2-D matrices (it does not broadcast)
torch.mm(tensor_A, tensor_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])
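torch.mm() and torch.matmul() give the same result for 2-D matrices, but they differ once a batch dimension is involved. A small sketch of that difference, using made-up tensors `batch` and `single`:

```python
import torch

# A batch of 10 matrices, each (3, 2), and a single (2, 4) matrix
batch = torch.rand(10, 3, 2)
single = torch.rand(2, 4)

# torch.matmul broadcasts the single matrix across the batch
batched_result = torch.matmul(batch, single)
print(batched_result.shape)

# torch.mm only accepts 2-D inputs, so the same call fails
try:
    torch.mm(batch, single)
    mm_failed = False
except RuntimeError as err:
    mm_failed = True
    print(f"torch.mm error: {err}")
```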

Neural networks are full of matrix multiplications and dot products.

The torch.nn.Linear() module (we'll see this in action later on), also known as a feed-forward layer or fully connected layer, implements a matrix multiplication between an input x and a weights matrix A.

$$ y = x\cdot{A^T} + b $$

Where:

x is the input to the layer (deep learning is a stack of layers like torch.nn.Linear() and others on top of each other).
A is the weights matrix created by the layer. It starts out as random numbers that get adjusted as the neural network learns to better represent patterns in the data (notice the "T": the weights matrix gets transposed).
Note: You might also often see W or another letter like X used to denote the weights matrix.
b is the bias term used to slightly offset the weights and inputs.
y is the output (a manipulation of the input in the hope of discovering patterns in it).

This is a linear function (you may have seen something like $y = mx+b$ in high school or elsewhere), and can be used to draw a straight line!

Let's play around with a linear layer.

Try changing the values of in_features and out_features below and see what happens.

Do you notice anything to do with the shapes?

In [None]:
# Since the linear layer starts with a random weights matrix, let's make it reproducible (more on this later)
torch.manual_seed(42)
# The layer uses matrix multiplication internally
linear= torch.nn.Linear(in_features=2,  # in_features must match the last (inner) dimension of the input
                        out_features=6) # out_features sets the last (outer) dimension of the output
x= tensor_A
output= linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output:\n{output}\n\nOutput shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output:
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)

Output shape: torch.Size([3, 6])
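The formula $y = x\cdot{A^T} + b$ can be checked against the layer itself by reading its `weight` and `bias` attributes. This is a sketch for illustration; `manual_output` and `layer_output` are names introduced here, and the small `atol` only allows for floating point rounding.

```python
import torch

torch.manual_seed(42)
linear = torch.nn.Linear(in_features=2, out_features=6)

x = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])

# linear.weight has shape (out_features, in_features), hence the transpose
manual_output = x @ linear.weight.T + linear.bias
layer_output = linear(x)

print(linear.weight.shape, linear.bias.shape)
print(torch.allclose(manual_output, layer_output, atol=1e-6))
```

The weight shape also explains the output shape: (3, 2) @ (2, 6) gives (3, 6).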


Finding the min, max, mean, sum, etc (aggregation)

In [None]:
x= torch.arange(0,100,10)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [None]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
# print(f"Mean: {x.mean()}") # this will error: mean() requires a float (or complex) dtype
print(f"Mean: {x.type(torch.float32).mean()}") # convert to float32 first
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


In [None]:
torch.max(x), torch.min(x), torch.mean(x.type(torch.float32)), torch.sum(x)

(tensor(90), tensor(0), tensor(45.), tensor(450))
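The reason the mean needs a type conversion is that torch.arange() with integer arguments returns an integer tensor. A short sketch of the failure and the fix, with `mean_failed` and `mean_value` as illustrative names:

```python
import torch

x = torch.arange(0, 100, 10)  # integer arguments give an int64 tensor
print(x.dtype)

# Calling mean() directly on an integer tensor raises a RuntimeError
try:
    x.mean()
    mean_failed = False
except RuntimeError as err:
    mean_failed = True
    print(f"mean() error: {err}")

# Converting to a floating point dtype first makes it work
mean_value = x.float().mean()
print(mean_value)
```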

Positional min/max


In [None]:
tensor=torch.arange(1,100,10)
print(f'tensor: {tensor}')

print(f'Index of the max value: {tensor.argmax()}')
print(f'Index of the min value: {tensor.argmin()}')

tensor: tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])
Index of the max value: 9
Index of the min value: 0
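The positions returned by argmax() and argmin() can be used as indices to get the values back. A minimal sketch, where `max_index` and `min_index` are names chosen for this example:

```python
import torch

tensor = torch.arange(1, 100, 10)

max_index = tensor.argmax()
min_index = tensor.argmin()

# Indexing with the returned position gives back the max/min values themselves
print(tensor[max_index], tensor.max())
print(tensor[min_index], tensor.min())
```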


Change tensor datatype

In [None]:
# Create a tensor and check its datatype
tensor = torch.arange(10., 100., 10.)
tensor.dtype

torch.float32

In [None]:
# Create a float16 tensor
tensor_float16 = tensor.type(torch.float16)
tensor_float16

tensor([10., 20., 30., 40., 50., 60., 70., 80., 90.], dtype=torch.float16)

In [None]:
# Create an int8 tensor
tensor_int8 = tensor.type(torch.int8)
tensor_int8

tensor([10, 20, 30, 40, 50, 60, 70, 80, 90], dtype=torch.int8)
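One practical reason to change datatypes is memory: lower-precision types use fewer bytes per element. The sketch below uses Tensor.to(), another way to convert datatypes, together with element_size() to show the difference.

```python
import torch

tensor = torch.arange(10., 100., 10.)

# Tensor.to() is another common way to convert datatypes
tensor_float16 = tensor.to(torch.float16)
tensor_int8 = tensor.to(torch.int8)

# element_size() reports the number of bytes used per element
for name, t in [("float32", tensor), ("float16", tensor_float16), ("int8", tensor_int8)]:
    print(f"{name}: {t.element_size()} byte(s) per element")
```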

Reshaping, stacking, squeezing and unsqueezing

Why do any of these?

Because deep learning models (neural networks) are all about manipulating tensors in some way. And because of the rules of matrix multiplication, if you've got shape mismatches, you'll run into errors. These methods help you make sure the right elements of your tensors are mixing with the right elements of other tensors.

torch.reshape(input, shape)	:Reshapes input to shape (if compatible), can also use torch.Tensor.reshape().
Tensor.view(shape)	:Returns a view of the original tensor in a different shape but shares the same data as the original tensor.
torch.stack(tensors, dim=0)	:Concatenates a sequence of tensors along a new dimension (dim), all tensors must be same size.
torch.squeeze(input)	:Squeezes input to remove all the dimensions of size 1.
torch.unsqueeze(input, dim)	:Returns input with a dimension of size 1 added at dim.
torch.permute(input, dims)	:Returns a view of the original input with its dimensions permuted (rearranged) to dims.

In [None]:
#create tensor
tensor= torch.arange(1.,8.)
tensor, tensor.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

In [None]:
#add extra dimension
tensor_reshaped= tensor.reshape(1,7)
tensor_reshaped, tensor_reshaped.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [None]:
# Change view (keeps same data as original but changes view)
z= tensor.view(1,7)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [None]:
#z is a view of tensor, so changing z would also change tensor (they share the same memory)
z,tensor

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), tensor([1., 2., 3., 4., 5., 6., 7.]))
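A view shares memory with the tensor it was created from, so an in-place write through one is visible through the other. A minimal sketch, using separate illustrative names (`original`, `z_view`) so the notebook's own `tensor` is left untouched:

```python
import torch

original = torch.arange(1., 8.)
z_view = original.view(1, 7)

# Overwrite the first column of the view in place
z_view[:, 0] = 5

# Both show a 5 in the first position because they share the same memory
print(z_view)
print(original)
```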

In [None]:
# Stack tensors on top of each other
tensor_stacked = torch.stack([tensor, tensor,tensor, tensor], dim=0) # try changing dim to dim=1 and see what happens
tensor_stacked

tensor([[1., 2., 3., 4., 5., 6., 7.],
        [1., 2., 3., 4., 5., 6., 7.],
        [1., 2., 3., 4., 5., 6., 7.],
        [1., 2., 3., 4., 5., 6., 7.]])
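To see what changing `dim` does, this sketch stacks the same four tensors along dim=0 and dim=1 and compares the shapes. The names `stacked_dim0` and `stacked_dim1` are illustrative.

```python
import torch

tensor = torch.arange(1., 8.)

# dim=0 places each tensor in its own row, dim=1 places each in its own column
stacked_dim0 = torch.stack([tensor, tensor, tensor, tensor], dim=0)
stacked_dim1 = torch.stack([tensor, tensor, tensor, tensor], dim=1)

print(stacked_dim0.shape)
print(stacked_dim1.shape)
```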

In [None]:
# How about removing all single dimensions from a tensor?

# To do so you can use torch.squeeze() (I remember this as squeezing the tensor to only have dimensions over 1).
print(f'previous tensor: {tensor_reshaped}')
print(f'previous tensor shape: {tensor_reshaped.shape}')

#remove extra dimension from tensor_reshaped
tensor_squeezed= tensor_reshaped.squeeze()
print(f'new tensor: {tensor_squeezed}')
print(f'new tensor shape: {tensor_squeezed.shape}')

previous tensor: tensor([[1., 2., 3., 4., 5., 6., 7.]])
previous tensor shape: torch.Size([1, 7])
new tensor: tensor([1., 2., 3., 4., 5., 6., 7.])
new tensor shape: torch.Size([7])
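torch.unsqueeze() does the opposite of squeeze: it adds a dimension of size 1 at the given position. A short sketch, where `row` and `column` are names picked for this example:

```python
import torch

tensor_squeezed = torch.arange(1., 8.)  # shape: [7]

# Add a size-1 dimension at the front (row vector) or at the end (column vector)
row = tensor_squeezed.unsqueeze(dim=0)
column = tensor_squeezed.unsqueeze(dim=1)

print(f"row shape: {row.shape}")
print(f"column shape: {column.shape}")
```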


In [None]:
# You can also rearrange the order of axes values with torch.permute(input, dims), where the input gets turned into a view with new dims.
# Create tensor with specific shape
x_original = torch.rand(size=(224, 224, 3))

# Permute the original tensor to rearrange the axis order
x_permuted = x_original.permute(2, 0, 1) # shifts axis 0->1, 1->2, 2->0

print(f"Previous shape: {x_original.shape}")
print(f"New shape: {x_permuted.shape}")

Previous shape: torch.Size([224, 224, 3])
New shape: torch.Size([3, 224, 224])
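Because permute() returns a view, the permuted tensor shares data with the original. The sketch below writes one value into the original and reads it back through the permuted view; the index (5, 7, 2) and the value 10.0 are arbitrary choices.

```python
import torch

x_original = torch.rand(size=(224, 224, 3))
x_permuted = x_original.permute(2, 0, 1)

# Write to the original at [height=5, width=7, channel=2]
x_original[5, 7, 2] = 10.0

# The same element sits at [channel=2, height=5, width=7] in the permuted view
print(x_original[5, 7, 2], x_permuted[2, 5, 7])
```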


Indexing (selecting data from tensors)

Sometimes you'll want to select specific data from tensors (for example, only the first column or second row).

To do so, you can use indexing.

In [None]:
#create a tensor
import torch
x=torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
#Let's index bracket by bracket
print(f'first square bracket: \n{x[0]}')
print(f'second square bracket: \n{x[0][0]}')
print(f'third square bracket: \n{x[0][0][0]}')

first square bracket: 
tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
second square bracket: 
tensor([1, 2, 3])
third square bracket: 
1


In [None]:
# Get all values of 0th dimension and the 0 index of 1st dimension
x[:, 0]

tensor([[1, 2, 3]])

In [None]:
# Get all values of 0th & 1st dimensions but only index 1 of 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0 dimension but only the 1 index value of the 1st and 2nd dimension
x[:, 1, 1]

tensor([5])

In [None]:
# Get index 0 of 0th and 1st dimension and all values of 2nd dimension
x[0, 0, :] # same as x[0][0]

tensor([1, 2, 3])
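A few more indexing patterns on the same tensor, as a sketch to practise with: a single element, a whole column, and a slice. The variable names `last_value`, `last_column` and `sub_block` are made up for this example.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

# A single element: index 0 of dim 0, index 2 of dim 1, index 2 of dim 2
last_value = x[0, 2, 2]

# All of dims 0 and 1, but only index 2 of dim 2 (the last column)
last_column = x[:, :, 2]

# Slices also work: rows 1 onwards and the first two columns
sub_block = x[0, 1:, :2]

print(last_value)
print(last_column)
print(sub_block)
```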

PyTorch tensors & NumPy
Since NumPy is a popular Python numerical computing library, PyTorch has functionality to interact with it nicely.

The two main methods you'll want to use for NumPy to PyTorch (and back again) are:

torch.from_numpy(ndarray) - NumPy array -> PyTorch tensor.
torch.Tensor.numpy() - PyTorch tensor -> NumPy array.

In [9]:
#NumPy array to tensor
import torch
import numpy as np
array=np.arange(1.0,8.0)
tensor=torch.from_numpy(array)
array,tensor

(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [10]:
#array + 1 creates a new array, so the tensor keeps its original values
array= array+1
array, tensor

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [11]:
#tensor to numpy array
tensor=torch.ones(7)
numpy_tensor= tensor.numpy()
tensor, numpy_tensor

(tensor([1., 1., 1., 1., 1., 1., 1.]),
 array([1., 1., 1., 1., 1., 1., 1.], dtype=float32))
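Both conversions share memory between the array and the (CPU) tensor, so in-place changes on one side show up on the other. The cells above do not trigger this because `array = array + 1` builds a new array. A minimal sketch of the in-place case, where `ones` and `numpy_ones` are illustrative names:

```python
import numpy as np
import torch

# NumPy -> PyTorch: the tensor is backed by the array's memory
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)
array += 1  # in-place update, unlike `array = array + 1`
print(tensor)

# PyTorch -> NumPy: the array is backed by the tensor's memory
ones = torch.ones(7)
numpy_ones = ones.numpy()
ones.add_(1)  # in-place update
print(numpy_ones)
```

If an independent copy is needed, calling `.clone()` on the tensor or `.copy()` on the array breaks the link.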

In [12]:
import torch

#create two random tensors
random_ten_A= torch.rand(3,4)
random_ten_B= torch.rand(3,4)

print(f"Tensor A: \n{random_ten_A}\n")
print(f"Tensor B: \n{random_ten_B}\n")
print(f"Does Tensor A equal Tensor B? (anywhere)")
random_ten_A == random_ten_B

Tensor A: 
tensor([[0.4646, 0.0475, 0.2455, 0.0386],
        [0.2045, 0.6840, 0.1833, 0.1169],
        [0.2927, 0.4652, 0.9718, 0.4574]])

Tensor B: 
tensor([[5.7625e-01, 7.7968e-02, 9.1419e-01, 9.1799e-01],
        [8.7461e-01, 9.5308e-01, 1.9453e-01, 2.8062e-04],
        [9.6137e-02, 8.9255e-01, 2.3932e-02, 1.5606e-01]])

Does Tensor A equal Tensor B? (anywhere)


tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])

In [16]:
import torch

#Set the random seed
RANDOM_SEED=42
torch.manual_seed(seed=RANDOM_SEED)
random_ten_C= torch.rand(3,4)

#the seed has to be reset before each torch.rand() call that should reproduce the same values
#without this, tensor D would differ from tensor C
torch.random.manual_seed(seed=RANDOM_SEED) # if we comment out this line, the comparison below returns False
random_ten_D= torch.rand(3,4)

print(f"Tensor C:\n{random_ten_C}\n")
print(f"Tensor D:\n{random_ten_D}\n")
print(f"Does Tensor C equal Tensor D? (anywhere)")
random_ten_C == random_ten_D

Tensor C:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D:
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Does Tensor C equal Tensor D? (anywhere)


tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
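An alternative to resetting the global seed is to give each call its own torch.Generator seeded with the same value. This is a sketch of that option rather than what the cells above do; `gen_a`, `gen_b`, `random_ten_E` and `random_ten_F` are names invented here.

```python
import torch

RANDOM_SEED = 42

# Two independent generators seeded with the same value
gen_a = torch.Generator().manual_seed(RANDOM_SEED)
gen_b = torch.Generator().manual_seed(RANDOM_SEED)

random_ten_E = torch.rand(3, 4, generator=gen_a)
random_ten_F = torch.rand(3, 4, generator=gen_b)

# torch.equal returns a single bool: True only if every element matches
print(torch.equal(random_ten_E, random_ten_F))
```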