<a href="https://colab.research.google.com/github/sayanarajasekhar/PyTorch/blob/main/00_pytorch_fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 00. PyTorch Fundamentals

### What is PyTorch?
[PyTorch](https://pytorch.org/) is an open source machine learning and deep learning framework.

### What can PyTorch be used for?
PyTorch allows you to manipulate and process data and write machine learning algorithms using Python code.



In [None]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.6.0+cu124


## Introduction to tensors

Tensors are the fundamental building blocks of machine learning.

Their job is to represent data in a numerical way.

For example, you could represent an image as a tensor with shape ```[3, 224, 224]```, which would mean ```[color_channels, height, width]```.
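As a small illustration of that shape convention, the sketch below builds a placeholder tensor filled with random values (not a real image; the name `image` is only for illustration) and reads back its shape.

```python
import torch

# A stand-in for an RGB image: 3 colour channels, 224 pixels high, 224 wide.
# The values are random; only the shape matters here.
image = torch.rand(3, 224, 224)

print(image.shape)    # torch.Size([3, 224, 224])
print(image.ndim)     # 3
print(image.numel())  # 150528, i.e. 3 * 224 * 224
```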

## Creating tensors
> Documentation on [tensors](https://docs.pytorch.org/docs/stable/tensors.html)

Let's create a **scalar**.

### Scalar
A scalar is a single number; in tensor-speak, it's a zero-dimensional tensor.

In [None]:
# Scalar
scalar = torch.tensor(7)
scalar

tensor(7)

This means that although ```scalar``` is a single number, it's of type ```torch.Tensor```.

Let's check the dimensions of a tensor using the ```ndim``` attribute.

In [None]:
scalar.ndim

0

What if we want to retrieve the number from the tensor?

As in, turn it from a ```torch.Tensor``` into a Python integer?

To do this, we can use the ```item()``` method.


In [None]:
# Get the python number within a tensor (Only works with one-element tensors)
scalar.item()

7
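The cell comment above notes that `item()` only works with one-element tensors. A minimal sketch of what that means in practice follows. The variable names are illustrative, and the exact exception type is assumed to differ between PyTorch versions, so both `RuntimeError` and `ValueError` are caught.

```python
import torch

one_element = torch.tensor([7])      # one element, but 1-dimensional
two_elements = torch.tensor([7, 7])

# item() works on any tensor holding exactly one element
value = one_element.item()
print(value, type(value))

# With more than one element, item() raises an exception
try:
    two_elements.item()
    failed = False
except (RuntimeError, ValueError) as error:
    failed = True
    print(f"item() failed: {error}")

# tolist() converts a tensor of any size to plain Python values
print(two_elements.tolist())
```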

### Vector

A vector is a single dimension tensor but can contain many numbers.

As in, you can have a vector ```[3, 2]``` to describe ```[bedrooms, bathrooms]``` in a house. Or you could have ```[3, 2, 2]``` to describe ```[bedrooms, bathrooms, car_parks]``` in a house.

The important point here is that a vector is flexible in what it can represent.

In [None]:
# Vector
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [None]:
# Check number of dimensions of vector
vector.ndim

1
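It is easy to mix up `ndim` (how many dimensions) with `shape` (how many elements along each dimension). A short sketch using the house example from above; the name `house` is only illustrative.

```python
import torch

# Hypothetical house described as [bedrooms, bathrooms, car_parks]
house = torch.tensor([3, 2, 2])

print(house.ndim)       # 1: a vector has a single dimension
print(house.shape)      # torch.Size([3]): three elements along that dimension
print(house[0].item())  # 3: the number of bedrooms as a Python int
```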

## Reshaping, View, Stacking, Squeezing and Unsqueezing, Permute tensors

* Reshaping - Reshape an input tensor to a defined shape. (The new shape must be compatible with the number of elements in the original tensor.)
* View - Return a view of an input tensor with a certain shape while sharing the same memory as the original tensor (a tensor reference). (The view's shape must be compatible with the number of elements in the original tensor.)
* Stacking - Combine multiple tensors. `stack` concatenates a sequence of tensors along a new dimension; `vstack` (vertical stack) and `hstack` (horizontal stack) are related helpers.
* Squeeze - Remove all dimensions of size `1` from a tensor.
* Unsqueeze - Add a dimension of size `1` to a target tensor.
* Permute - Return a view of the input with its dimensions permuted (swapped) in a certain way.

In [None]:
# Let's create a tensor
import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

### Reshape

In [None]:
# Add an extra dimension
x_reshaped = x.reshape(1, 7) # Fails: we are trying to squeeze a 9-element tensor into 7 elements
x_reshaped, x_reshaped.shape

RuntimeError: shape '[1, 7]' is invalid for input of size 9

In [None]:
x_reshaped = x.reshape(1, 9) # 1 * 9 = 9 matches the number of elements in x
x_reshaped, x_reshaped.shape # Observe the change in tensor shape from [9] -> [1, 9]

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
x_reshaped = x.reshape(2, 9) # Fails: this needs 2 * 9 = 18 elements but x has only 9
x_reshaped, x_reshaped.shape

RuntimeError: shape '[2, 9]' is invalid for input of size 9

In [None]:
x_reshaped = x.reshape(9, 1)
x_reshaped, x_reshaped.shape

(tensor([[1.],
         [2.],
         [3.],
         [4.],
         [5.],
         [6.],
         [7.],
         [8.],
         [9.]]),
 torch.Size([9, 1]))

In [None]:
x = torch.arange(1., 11.) # now x has 10 elements
x_reshaped = x.reshape(5, 2) # this will work since 5 * 2 = 10 which is equal to the elements in original tensor
x_reshaped, x_reshaped.shape

(tensor([[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.],
         [ 7.,  8.],
         [ 9., 10.]]),
 torch.Size([5, 2]))
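The element-count rule above also lets PyTorch work out one dimension for you: passing `-1` asks `reshape` to infer that size from the others. A brief sketch (variable names `a` and `b` are illustrative):

```python
import torch

x = torch.arange(1., 11.)   # 10 elements

a = x.reshape(5, -1)        # -1 is inferred as 2, since 5 * 2 = 10
b = x.reshape(-1, 5)        # -1 is inferred as 2, since 2 * 5 = 10

print(a.shape)  # torch.Size([5, 2])
print(b.shape)  # torch.Size([2, 5])
```

Only one dimension can be `-1` in a single call, and the remaining sizes must still divide the element count evenly.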

### View

In [None]:
# Change the view
x = torch.arange(1., 10.)
z = x.view(3, 3) # A view's shape must also match the number of elements in the original tensor
x, x.shape, z, z.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]),
 torch.Size([9]),
 tensor([[1., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]),
 torch.Size([3, 3]))

In [None]:
z[0][0] = 5
x, z # Since z is a view of x, changes made to the view are reflected in the original tensor as well

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 tensor([[5., 2., 3.],
         [4., 5., 6.],
         [7., 8., 9.]]))
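One way to confirm that a view shares memory with its source is to compare `data_ptr()`, which returns the address of a tensor's first element. If you want an independent copy instead, `clone()` is one option. A small sketch, with `z_copy` as an illustrative name:

```python
import torch

x = torch.arange(1., 10.)
z = x.view(3, 3)

# The view and the original point at the same underlying data
print(x.data_ptr() == z.data_ptr())  # True

# clone() allocates new memory, so edits to the copy leave x untouched
z_copy = x.view(3, 3).clone()
z_copy[0, 0] = 100.
print(x[0].item())  # 1.0
```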

### Stack

In [None]:
# Stack tensors on top of each other
y = torch.stack([x, x, x, x], dim=0)
z = torch.vstack([x, x, x, x])
y, z

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.],
         [5., 2., 3., 4., 5., 6., 7., 8., 9.]]))

In [None]:
y = torch.stack([x, x, x, x], dim=1)
y

tensor([[5., 5., 5., 5.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.],
        [4., 4., 4., 4.],
        [5., 5., 5., 5.],
        [6., 6., 6., 6.],
        [7., 7., 7., 7.],
        [8., 8., 8., 8.],
        [9., 9., 9., 9.]])

In [None]:
y = torch.cat([x, x, x, x], dim=0)
z = torch.hstack([x, x, x, x])
y, z

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.,
         5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.,
         5., 2., 3., 4., 5., 6., 7., 8., 9., 5., 2., 3., 4., 5., 6., 7., 8., 9.]))
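The difference between `stack` and `cat` is easiest to see in the resulting shapes: `stack` creates a new dimension, while `cat` extends an existing one. The sketch below uses a fresh 9-element tensor; the result names are illustrative.

```python
import torch

x = torch.arange(1., 10.)  # shape [9]

stacked_rows = torch.stack([x, x, x, x], dim=0)  # new dimension at position 0
stacked_cols = torch.stack([x, x, x, x], dim=1)  # new dimension at position 1
concatenated = torch.cat([x, x, x, x], dim=0)    # no new dimension, just longer

print(stacked_rows.shape)  # torch.Size([4, 9])
print(stacked_cols.shape)  # torch.Size([9, 4])
print(concatenated.shape)  # torch.Size([36])
```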

### Squeeze

In [None]:
# Squeeze
x_reshaped, x_reshaped.shape

(tensor([[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.],
         [ 7.,  8.],
         [ 9., 10.]]),
 torch.Size([5, 2]))

In [None]:
z = x_reshaped.squeeze() # Squeeze has no effect since there is no dimension of size 1
z, z.shape

(tensor([[ 1.,  2.],
         [ 3.,  4.],
         [ 5.,  6.],
         [ 7.,  8.],
         [ 9., 10.]]),
 torch.Size([5, 2]))

In [None]:
x_reshaped_1 = x.reshape(1, 9)
x_reshaped_1, x_reshaped_1.shape

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [None]:
z = x_reshaped_1.squeeze() # Squeeze removes the dimension of size 1
z, z.shape # Observe the shape after squeeze: all size-1 dimensions are removed, e.g. (2, 1) -> (2,), (2, 1, 2, 1, 2) -> (2, 2, 2)

(tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [None]:
print(f"Original tensor: {x_reshaped_1}")
print(f"Original tensor shape: {x_reshaped_1.shape}")
print()
print(f"Squeezed tensor: {z}")
print(f"Squeezed tensor shape: {z.shape}")

Original tensor: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
Original tensor shape: torch.Size([1, 9])

Squeezed tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Squeezed tensor shape: torch.Size([9])


In [None]:
a = torch.zeros([2, 1, 2, 1, 2])
a, a.shape

(tensor([[[[[0., 0.]],
 
           [[0., 0.]]]],
 
 
 
         [[[[0., 0.]],
 
           [[0., 0.]]]]]),
 torch.Size([2, 1, 2, 1, 2]))

In [None]:
b = a.squeeze()
b, b.shape

(tensor([[[0., 0.],
          [0., 0.]],
 
         [[0., 0.],
          [0., 0.]]]),
 torch.Size([2, 2, 2]))
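Calling `squeeze()` with no arguments removes every size-1 dimension. If you only want to remove one of them, `squeeze` also accepts a `dim` argument. A short sketch on the same shape as above:

```python
import torch

a = torch.zeros([2, 1, 2, 1, 2])

only_dim_1 = a.squeeze(dim=1)  # removes just the size-1 dimension at position 1
only_dim_3 = a.squeeze(dim=3)  # removes just the size-1 dimension at position 3
unchanged = a.squeeze(dim=0)   # dimension 0 has size 2, so nothing is removed

print(only_dim_1.shape)  # torch.Size([2, 2, 1, 2])
print(only_dim_3.shape)  # torch.Size([2, 1, 2, 2])
print(unchanged.shape)   # torch.Size([2, 1, 2, 1, 2])
```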

### Unsqueeze

Adds a dimension of size 1 to a target tensor at a specific position.

In [None]:
print(f"Previous squeezed tensor: {z}")
print(f"Previous squeezed tensor shape: {z.shape}")

Previous squeezed tensor: tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous squeezed tensor shape: torch.Size([9])


In [None]:
# Add an extra dimension with unsqueeze
z_unsqueezed = z.unsqueeze(dim=0)

print(f"Unsqueezed tensor at dim 0: {z_unsqueezed}")
print(f"Unsqueezed tensor shape: {z_unsqueezed.shape}")

Unsqueezed tensor at dim 0: tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
Unsqueezed tensor shape: torch.Size([1, 9])


In [None]:
# Add an extra dimension at 1 with unsqueeze
z_unsqueezed_1 = z.unsqueeze(dim=1)

print(f"Unsqueezed tensor at dim 1: {z_unsqueezed_1}")
print(f"Unsqueezed tensor shape: {z_unsqueezed_1.shape}")

Unsqueezed tensor at dim 1: tensor([[5.],
        [2.],
        [3.],
        [4.],
        [5.],
        [6.],
        [7.],
        [8.],
        [9.]])
Unsqueezed tensor shape: torch.Size([9, 1])


In [None]:
# Add an extra dimension at 2 with unsqueeze
z_unsqueezed_2 = z.unsqueeze(dim=2)
z_unsqueezed_2, z_unsqueezed_2.shape

IndexError: Dimension out of range (expected to be in range of [-2, 1], but got 2)
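The error message gives the valid range for a 1-dimensional tensor: `[-2, 1]`. Negative values count from the end, so `-1` inserts the new dimension last and `-2` inserts it first. The sketch below illustrates this with a fresh tensor; indexing with `None` is shown as an alternative spelling.

```python
import torch

z = torch.arange(1., 10.)  # shape [9]

last = z.unsqueeze(dim=-1)    # same as dim=1 for a 1-D tensor
first = z.unsqueeze(dim=-2)   # same as dim=0 for a 1-D tensor
via_none = z[None, :]         # indexing with None also adds a size-1 dimension

print(last.shape)      # torch.Size([9, 1])
print(first.shape)     # torch.Size([1, 9])
print(via_none.shape)  # torch.Size([1, 9])
```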

### Permute

Rearranges the dimensions of a target tensor in a specified order. Returns a view of the original tensor with its dimensions permuted, so changing values in the permuted tensor also changes the original.
`The original tensor's shape remains the same`

Reshape - Returns a tensor with the requested shape. When possible, the result is a view that shares memory with the original tensor; otherwise it is a copy. `The original tensor's shape remains the same`

View - Returns a reference to the original tensor with a different (or the same) shape. Changing values through the view changes the original tensor's values. `The original tensor's shape remains the same`


A common place `permute` is used is with image data.



In [None]:
x = torch.randn(3, 3)
x

tensor([[-1.1568, -0.5642, -1.2367],
        [-0.2231,  1.9764,  0.1938],
        [-1.6929, -1.2609,  0.7786]])

In [None]:
z = torch.permute(x, (1, 0))
z

tensor([[-1.1568, -0.2231, -1.6929],
        [-0.5642,  1.9764, -1.2609],
        [-1.2367,  0.1938,  0.7786]])

In [None]:
z[0][0] = 9.9999
x, z

(tensor([[ 9.9999, -0.5642, -1.2367],
         [-0.2231,  1.9764,  0.1938],
         [-1.6929, -1.2609,  0.7786]]),
 tensor([[ 9.9999, -0.2231, -1.6929],
         [-0.5642,  1.9764, -1.2609],
         [-1.2367,  0.1938,  0.7786]]))

In [None]:
x = torch.randn(2, 3, 5)
print(x.shape)
z = torch.permute(x, (2, 0 , 1))
print(z.shape)
# Shape (2, 3, 5) permuted with (2, 0, 1):
# dimension 2 of the original moves to position 0,
# dimension 0 of the original moves to position 1,
# dimension 1 of the original moves to position 2

torch.Size([2, 3, 5])
torch.Size([5, 2, 3])


In [None]:
x_image = torch.rand(size=(224, 220, 3)) # [height, width, colour_channel]

# Permute the original image tensor to rearrange the axis (or dim order)
# Change original [height, width, colour_channel] -> [colour_channel, height, width]

x_permuted_image = torch.permute(x_image, (2, 0 , 1)) # [colour_channel, height, width]
print(f"Shape of the image tensor: {x_image.shape}")
print(f"\nShape of the permuted image tensor: {x_permuted_image.shape}")

Shape of the image tensor: torch.Size([224, 220, 3])

Shape of the permuted image tensor: torch.Size([3, 224, 220])
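Because `permute` returns a view, the result usually is not laid out contiguously in memory. That matters if you later call `view`, which needs a compatible layout. The sketch below uses a small stand-in tensor (the sizes and names are illustrative) to show one way to check and handle this.

```python
import torch

x_image = torch.rand(4, 5, 3)          # small stand-in for [height, width, colour_channel]
x_permuted = x_image.permute(2, 0, 1)  # [colour_channel, height, width]

# The permuted tensor still points at the original data
shares_memory = x_permuted.data_ptr() == x_image.data_ptr()
print(shares_memory)                # True
print(x_permuted.is_contiguous())   # False

# view() cannot flatten this non-contiguous layout
try:
    x_permuted.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() copies when it has to; contiguous() makes the copy explicit
flat = x_permuted.reshape(-1)
x_contig = x_permuted.contiguous()
print(flat.shape, x_contig.is_contiguous())
```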


## Indexing (selecting data from tensors)

Indexing with PyTorch is similar to indexing with NumPy.

In [None]:
# Create a tensor
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [None]:
# Let's index on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [None]:
x[0][1]

tensor([4, 5, 6])

In [None]:
x[0][1][2]

tensor(6)

In [None]:
z = x.reshape(3, 3, 1)
z

tensor([[[1],
         [2],
         [3]],

        [[4],
         [5],
         [6]],

        [[7],
         [8],
         [9]]])

In [None]:
z[0]

tensor([[1],
        [2],
        [3]])

In [None]:
z[0][1]

tensor([2])

In [None]:
z[0][1][0]

tensor(2)

In [None]:
y = x.reshape(3, 1, 3)
y

tensor([[[1, 2, 3]],

        [[4, 5, 6]],

        [[7, 8, 9]]])

In [None]:
y[2]

tensor([[7, 8, 9]])

In [None]:
y[2][0][1]

tensor(8)

### You can also use `:` to select `all` of a target dimension

In [None]:
y[1, 0, :]

tensor([4, 5, 6])

In [None]:
x

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

In [None]:
# Get all values of 0th and 1st dimension but only index 1 of 2nd dimension
x[:, :, 1]

tensor([[2, 5, 8]])

In [None]:
# Get all values of the 0th dimension but only index 1 of the 1st and 2nd dimensions
x[:, 1, 1], x[0][1][1] # Same value, but x[:, 1, 1] keeps the 0th dimension

(tensor([5]), tensor(5))

In [None]:
# Get index 0 of the 0th and 1st dimensions and all values of the 2nd dimension
x[0, 0, :], x[0][0] # x[0, 0, :] is equivalent to x[0][0]

(tensor([1, 2, 3]), tensor([1, 2, 3]))

In [None]:
# Index on x to return 9
# tensor([[[1, 2, 3],
#         [4, 5, 6],
#         [7, 8, 9]]])

x[:, 2, 2], x[0][2][2]

(tensor([9]), tensor(9))

In [None]:
# Index on x to return 3, 6, 9
# tensor([[[1, 2, 3],
#         [4, 5, 6],
#         [7, 8, 9]]])

x[:, :, 2]

tensor([[3, 6, 9]])
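A useful rule of thumb for the examples above: an integer index removes that dimension from the result, while `:` (or any slice) keeps it. The sketch below compares the resulting shapes; the variable names are illustrative.

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

kept = x[:, 1, 1]       # slice on dim 0 keeps it
dropped = x[0, 1, 1]    # integer on every dim leaves a zero-dimensional tensor
column = x[0, :, 2]     # last column of the 3x3 block
block = x[0, 1:, :2]    # slices can also select ranges

print(kept, kept.shape)        # tensor([5]) torch.Size([1])
print(dropped, dropped.shape)  # tensor(5) torch.Size([])
print(column)                  # tensor([3, 6, 9])
print(block)                   # tensor([[4, 5], [7, 8]])
```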

## PyTorch Tensors & NumPy

NumPy is a popular Python library for scientific numerical computing.

Because of this, PyTorch has functionality to interact with NumPy.

* Data in a NumPy array -> PyTorch tensor: `torch.from_numpy(ndarray)`

* PyTorch tensor -> NumPy array: `torch.Tensor.numpy()`


In [None]:
# Numpy array to PyTorch tensor
import numpy as np
import torch

array = np.arange(1.0, 8.0)
print(f"Numpy array: {array}")
print(f"Numpy array type: {array.dtype}")

# Warning: when converting from NumPy -> PyTorch,
# the tensor keeps NumPy's default datatype, float64
tensor = torch.from_numpy(array)
print(f"\nTensor: {tensor}")
print(f"Tensor type: {tensor.dtype}")

# Convert the tensor to float32, since PyTorch's default datatype is float32
tensor_32 = torch.from_numpy(array).type(torch.float32)
print(f"\nTensor: {tensor_32}")
print(f"Tensor type: {tensor_32.dtype}")

Numpy array: [1. 2. 3. 4. 5. 6. 7.]
Numpy array type: float64

Tensor: tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64)
Tensor type: torch.float64

Tensor: tensor([1., 2., 3., 4., 5., 6., 7.])
Tensor type: torch.float32


In [None]:
# PyTorch tensor to NumPy array
tensor = torch.randn(10)
print(f"Tensor: {tensor}")
print(f"Tensor type: {tensor.dtype}")

array = tensor.numpy()
print(f"\nNumpy array: {array}")
print(f"Numpy array type: {array.dtype}")

Tensor: tensor([-1.9311,  1.1102, -1.4196,  0.9916, -0.6720,  0.3499,  2.2228, -0.2855,
        -0.3005,  0.7290])
Tensor type: torch.float32

Numpy array: [-1.9311318   1.1102197  -1.4195913   0.99158025 -0.6720314   0.3498882
  2.2227669  -0.28549585 -0.30052128  0.72898155]
Numpy array type: float32
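One detail worth knowing about these conversions: for CPU data, `torch.from_numpy()` and `Tensor.numpy()` share memory with their source rather than copying it, so in-place changes show up on both sides. A minimal sketch (names are illustrative) that contrasts this with `torch.tensor()`, which copies:

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
shared = torch.from_numpy(array)  # shares memory with the NumPy array
copied = torch.tensor(array)      # makes an independent copy

array += 1                        # in-place change to the NumPy array

print(shared[0].item())  # 2.0: the shared tensor sees the change
print(copied[0].item())  # 1.0: the copy does not

# The same sharing applies in the other direction
tensor = torch.ones(3)
as_numpy = tensor.numpy()
tensor.add_(1)                    # in-place change to the tensor
print(as_numpy)                   # [2. 2. 2.]
```

Note that `array = array + 1` would create a new array instead of modifying the original, so the tensor would not change in that case.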


## Reproducibility (trying to take the random out of random)

In short, how a neural network learns:

`start with random numbers -> perform tensor operations -> update random numbers to try and make them better representations of the data -> again -> again -> again ...`

To reduce the randomness in neural networks and PyTorch, there is the concept of a **random seed**.

Essentially, what the random seed does is "flavour" the randomness.



In [None]:
import torch

# Create two random tensors
random_tensor_a = torch.rand(3, 4) # creates new random numbers each time the cell is executed
random_tensor_b = torch.rand(3, 4)

print(f"Tensor A: \n{random_tensor_a}")
print(f"\nTensor B: \n{random_tensor_b}")
print(f"\nTensor A == Tensor B: \n{random_tensor_a == random_tensor_b}")


Tensor A: 
tensor([[0.0103, 0.6773, 0.0410, 0.3374],
        [0.3344, 0.5789, 0.1009, 0.2901],
        [0.4782, 0.5421, 0.3393, 0.3245]])

Tensor B: 
tensor([[0.0217, 0.0630, 0.8580, 0.6216],
        [0.8575, 0.7905, 0.9536, 0.3787],
        [0.5137, 0.7516, 0.0324, 0.5101]])

Tensor A == Tensor B: 
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [None]:
# Let's make some random but reproducible tensors

# Set the random seed
RANDOM_SEED = 42 # Can set to any number
torch.manual_seed(RANDOM_SEED)
# Create tensor with random seed
random_tensor_c = torch.rand(3, 4)

torch.manual_seed(RANDOM_SEED)
random_tensor_d = torch.rand(3, 4)

print(f"Tensor C: \n{random_tensor_c}")
print(f"\nTensor D: \n{random_tensor_d}")
print(f"\nTensor C == Tensor D: \n{random_tensor_c == random_tensor_d}")

Tensor C: 
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor D: 
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])

Tensor C == Tensor D: 
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
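The cell above calls `torch.manual_seed()` before each `torch.rand()` call. That matters: the seed sets the starting point of the random sequence, and each draw moves along it. The sketch below shows what happens without reseeding, plus one alternative that avoids touching the global seed by giving each call its own `torch.Generator`. The names `gen_a` and `gen_b` are illustrative.

```python
import torch

# Seeding once: the second draw continues the sequence, so it differs
torch.manual_seed(42)
first = torch.rand(3, 4)
second = torch.rand(3, 4)
print(torch.equal(first, second))  # False

# Two separate generators with the same seed produce the same numbers
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)
from_a = torch.rand(3, 4, generator=gen_a)
from_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(from_a, from_b))  # True
```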


## Running tensors and PyTorch objects on GPUs (Making faster computations)

GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working behind the scenes.

1. Getting a GPU - use Colab Pro, your own GPU, or cloud computing (GCP, AWS, Azure)

In [None]:
!nvidia-smi

Sat Aug 23 14:49:51 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   44C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

### Check for GPU access with PyTorch

In [None]:
import torch
torch.cuda.is_available()

True

### Setup device agnostic code

In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

### Count number of devices

In [None]:
torch.cuda.device_count()

1

## Putting tensors and models on the GPU

The reason we want our tensors/models on the GPU is that using a GPU results in faster computations.

In [1]:
# Create a tensor (default on CPU)

import torch

tensor = torch.tensor([1, 2, 3], device = 'cpu')
tensor1 = torch.tensor([4, 5, 6]) # default device is CPU

print(tensor, tensor.device)
print(tensor1, tensor1.device)

tensor([1, 2, 3]) cpu
tensor([4, 5, 6]) cpu


In [5]:
# Move tensor to GPU (if available) - Changed the run setting in colab to use GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu1 = tensor1.to(device)

print(tensor_on_gpu.device, tensor_on_gpu1.device)

cuda
cuda:0 cuda:0
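Two details about `.to(device)` are easy to miss: it returns a tensor on the target device instead of moving the original in place, and you can skip the move entirely by creating the tensor on the device. The following sketch runs on either CPU or GPU; the variable names are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

tensor = torch.tensor([1, 2, 3])      # created on the CPU
tensor_on_device = tensor.to(device)  # result lives on `device`

# The original variable still refers to a CPU tensor
print(tensor.device.type)             # cpu
print(tensor_on_device.device.type)   # matches `device`

# Creating the tensor directly on the target device avoids a separate move
created = torch.zeros(3, device=device)
print(created.device.type)
```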


## Moving tensors back to CPU

If a tensor is on the GPU, you can't convert it to NumPy directly.

In [6]:
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

To fix the GPU tensor with NumPy issue, we first move it to the CPU.

In [7]:
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])

In [8]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
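If you convert tensors to NumPy often, the `.cpu().numpy()` pattern can be wrapped in a small helper. The function below, `to_numpy`, is a hypothetical convenience and not part of PyTorch. It also calls `detach()`, on the assumption that some tensors may be tracking gradients, which would otherwise block the conversion.

```python
import numpy as np
import torch

def to_numpy(tensor: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: return a NumPy array for a tensor on any device."""
    # detach() drops gradient tracking; cpu() returns the tensor as-is if it
    # is already on the CPU, and copies it to host memory otherwise
    return tensor.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_device = torch.tensor([1, 2, 3], device=device)

array = to_numpy(tensor_on_device)
print(array, type(array))
```

As in the cells above, the tensor on the device is left unchanged; only the returned array lives in host memory.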