<a href="https://colab.research.google.com/github/harshit7271/Deep_learning_with_PyTorch/blob/main/Pytorch_basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 00. PyTorch Fundamentals

Resource Notebook : (https://www.learnpytorch.io/00_pytorch_fundamentals/)

In [236]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)


2.8.0+cu126


In [237]:
# Tensors

## Creating tensors
PyTorch tensors are created using `torch.tensor()` (https://pytorch.org/docs/stable/tensors.html)

In [238]:
# scalar
scalar = torch.tensor(7)
scalar

tensor(7)

In [239]:
scalar.ndim

0

In [240]:
# get tensor back as Python int

scalar.item()

7

In [241]:
# vector

vector = torch.tensor([7,7])
vector

tensor([7, 7])

In [242]:
vector.ndim

1

In [243]:
vector.shape

torch.Size([2])

In [244]:
# MATRIX

MATRIX = torch.tensor([[7,8],
                       [9,10]])
MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [245]:
MATRIX.ndim

2

In [246]:
MATRIX[0]

tensor([7, 8])

In [247]:
MATRIX[1]

tensor([ 9, 10])

In [248]:
MATRIX.shape

torch.Size([2, 2])

In [249]:
# TENSOR
TENSOR = torch.tensor([[[[1,2,3,4],
                        [3,6,9,12],
                        [2,4,6,8],
                        [7,3,4,5]]]])
TENSOR


tensor([[[[ 1,  2,  3,  4],
          [ 3,  6,  9, 12],
          [ 2,  4,  6,  8],
          [ 7,  3,  4,  5]]]])

In [250]:
TENSOR.ndim

4

In [251]:
TENSOR.shape

torch.Size([1, 1, 4, 4])

In [252]:
TENSOR[0]

tensor([[[ 1,  2,  3,  4],
         [ 3,  6,  9, 12],
         [ 2,  4,  6,  8],
         [ 7,  3,  4,  5]]])

### Random Tensors

Random tensors matter because many neural networks learn by starting with tensors full of random numbers and then adjusting those numbers to better represent the data.

`Start with random numbers -> look at data -> update random numbers -> look at data -> update random numbers`

In [253]:
# creating a random tensor of size (3,4)
random_tensor = torch.rand(3,4)
random_tensor

tensor([[0.9668, 0.6890, 0.5180, 0.6022],
        [0.2996, 0.3727, 0.3733, 0.1649],
        [0.9698, 0.4311, 0.0252, 0.3100]])

In [254]:
random_tensor.ndim

2

In [255]:
# create a random tensor with similar shape to an image tensor

random_image_size_tensor = torch.rand(size=(224,224,3)) # height, width, color channels (R,G,B)
random_image_size_tensor.shape, random_image_size_tensor.ndim

(torch.Size([224, 224, 3]), 3)

# Zeros and Ones

In [256]:
# create a tensor of all zeroes

zeros = torch.zeros(size=(3,4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [257]:
zeros*random_tensor

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [258]:
# create a tensor of all ones

ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [259]:
ones.dtype

torch.float32

### Creating a range of tensors and tensors-like

In [260]:
# Usage of torch.arange(): even numbers from 0 to 20 (end is exclusive)

one_to_six = torch.arange(start =0, end =21, step = 2)
one_to_six

tensor([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20])

In [261]:
# create a tensor of zeros with the same shape and dtype as another tensor

six_zeros= torch.zeros_like(input = one_to_six)
six_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

## **Tensor Datatypes**
**Note**: Wrong tensor datatypes are one of the 3 big errors you'll run into with PyTorch & deep learning:
   1. Tensor not the right datatype
   2. Tensor not the right shape
   3. Tensor not on the right device

In [262]:
# Float 32 tensor

float_32_tensor = torch.tensor([3.0, 6.0, 9.0],
                               dtype = None, # datatype of the tensor (e.g. float32 or float16); None infers it from the data
                               device = None, # device the tensor lives on (e.g. "cpu" or "cuda")
                               requires_grad = False) # whether to track gradients for operations on this tensor
float_32_tensor

tensor([3., 6., 9.])

In [263]:
float_32_tensor.dtype

torch.float32

In [264]:
float_16_tensor = float_32_tensor.type(torch.float16)
float_16_tensor

tensor([3., 6., 9.], dtype=torch.float16)

In [265]:
float_16_tensor * float_32_tensor

tensor([ 9., 36., 81.])

In [266]:
int_32_tensor = torch.tensor([3, 6, 9], dtype = torch.int32)
int_32_tensor

tensor([3, 6, 9], dtype=torch.int32)

In [267]:
int_32_tensor * float_16_tensor

tensor([ 9., 36., 81.], dtype=torch.float16)
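
The mixed-dtype multiplications above work because element-wise operations promote their inputs to a common dtype. A small sketch of how to check which dtype PyTorch will pick, using `torch.promote_types` and `torch.result_type` (the variable names here are only for illustration):

```python
import torch

float_16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
float_32 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float32)
int_32 = torch.tensor([3, 6, 9], dtype=torch.int32)

# dtype that an element-wise op between two dtypes would produce
print(torch.promote_types(torch.float16, torch.float32))  # torch.float32
print(torch.promote_types(torch.int32, torch.float16))    # torch.float16

# the same check, but starting from actual tensors
print(torch.result_type(int_32, float_16))                # torch.float16
print((float_16 * float_32).dtype)                        # torch.float32
```

Not every operation promotes like this, so checking `tensor.dtype` before an operation is still a good habit.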

### Getting Information from Tensors (tensor attributes) :
   1. Tensor not the right datatype - to get the datatype of a tensor, use `tensor.dtype`
   2. Tensor not the right shape - to get the shape of a tensor, use `tensor.shape` or `tensor.size()`
   3. Tensor not on the right device - to get the device of a tensor, use `tensor.device`

In [268]:
# Create a tensor

dummy_tensor = torch.rand(3,4)
dummy_tensor = dummy_tensor.type(torch.float64)    # cast to float64, just to showcase a dtype change
dummy_tensor

tensor([[0.1774, 0.8051, 0.3108, 0.1113],
        [0.1339, 0.2704, 0.9033, 0.5143],
        [0.0263, 0.6051, 0.8486, 0.5398]], dtype=torch.float64)

In [269]:
# Find out details about dummy tensor

print(dummy_tensor)

print(f"Datatype of tensor : {dummy_tensor.dtype}")
print(f"Shape of tensor : {dummy_tensor.shape}")
print(f"Device of tensor : {dummy_tensor.device}")

tensor([[0.1774, 0.8051, 0.3108, 0.1113],
        [0.1339, 0.2704, 0.9033, 0.5143],
        [0.0263, 0.6051, 0.8486, 0.5398]], dtype=torch.float64)
Datatype of tensor : torch.float64
Shape of tensor : torch.Size([3, 4])
Device of tensor : cpu


# Manipulating Tensors (tensor operations)

Tensor Operations include :
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [270]:
# Create a tensor and add 10 to it

tensor = torch.tensor([1, 2, 3])
tensor + 10

tensor([11, 12, 13])

In [271]:
# Multiply tensor by 10

tensor*10

tensor([10, 20, 30])

In [272]:
tensor

tensor([1, 2, 3])

In [273]:
# Subtract 10

tensor - 10

tensor([-9, -8, -7])

In [274]:
# Try out PyTorch built-in functions

torch.mul(tensor, 10)

tensor([10, 20, 30])

In [275]:
torch.add(tensor, 10)

tensor([11, 12, 13])

# Matrix Multiplication in PyTorch

The two main ways of performing multiplication in neural networks and deep learning are:
1. Element-wise multiplication
2. Matrix multiplication

Matrix multiplication needs to satisfy two main rules:
1. The **inner dimensions** must match
 * `(3,2) @ (3,2)` won't work
 * `(2,3) @ (3,2)` will work
 * `(3,2) @ (2,3)` will work
2. The resulting matrix has the shape of the **outer dimensions**
 * `(2,3) @ (3,2)` -> `(2,2)`
 * `(3,2) @ (2,3)` -> `(3,3)`
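
The two rules can be written down as a tiny shape check. This is only an illustrative sketch in plain Python: `matmul_shape` is a hypothetical helper, not a PyTorch function, and it only handles 2-D shapes.

```python
def matmul_shape(shape_a, shape_b):
    """Hypothetical helper: result shape of a 2-D matrix multiplication."""
    rows_a, cols_a = shape_a
    rows_b, cols_b = shape_b
    # Rule 1: the inner dimensions must match
    if cols_a != rows_b:
        raise ValueError(f"inner dimensions differ: {cols_a} != {rows_b}")
    # Rule 2: the result takes the outer dimensions
    return (rows_a, cols_b)

print(matmul_shape((2, 3), (3, 2)))  # (2, 2)
print(matmul_shape((3, 2), (2, 3)))  # (3, 3)

try:
    matmul_shape((3, 2), (3, 2))
except ValueError as err:
    print(f"won't work: {err}")
```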

In [276]:
torch.matmul(torch.rand(3,2), torch.rand(2,3))   # this works, but (3,2) @ (3,2) would not, due to rule 1

tensor([[0.8141, 0.5014, 0.5434],
        [0.4805, 0.2604, 0.4450],
        [0.7043, 0.3723, 0.6851]])

In [277]:
# element wise multiplication

print(tensor, "*", tensor)
print(f"Equals : {tensor * tensor}")

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals : tensor([1, 4, 9])


In [278]:
# Matrix multiplication (for two 1-D tensors this is the dot product)

torch.matmul(tensor, tensor)

tensor(14)

In [279]:
# matrix multiplication by hand

1*1 + 2*2 + 3*3

14

In [280]:
%%time
# compare the speed of a Python for loop with torch.matmul
value = 0
for i in range(len(tensor)):
  value += tensor[i] * tensor[i]
print(value)

tensor(14)
CPU times: user 867 µs, sys: 0 ns, total: 867 µs
Wall time: 3.81 ms


In [281]:
%%time
torch.matmul(tensor, tensor)

CPU times: user 100 µs, sys: 0 ns, total: 100 µs
Wall time: 104 µs


tensor(14)
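
With only three elements the gap is small and noisy. Here is a rough sketch of the same comparison on a larger tensor, using `time.perf_counter` instead of `%%time` so it also runs outside a notebook. The size of 10,000 is an arbitrary choice, and the timings will vary from machine to machine.

```python
import time
import torch

big = torch.arange(10_000, dtype=torch.float64)

# Python for loop: one small tensor operation per element
start = time.perf_counter()
loop_value = torch.tensor(0.0, dtype=torch.float64)
for i in range(len(big)):
    loop_value += big[i] * big[i]
loop_seconds = time.perf_counter() - start

# torch.matmul: one vectorized call
start = time.perf_counter()
matmul_value = torch.matmul(big, big)
matmul_seconds = time.perf_counter() - start

print(f"loop   : {loop_value.item():.0f} in {loop_seconds:.4f} s")
print(f"matmul : {matmul_value.item():.0f} in {matmul_seconds:.6f} s")
```

Both approaches compute the same sum of squares; only the time taken differs.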

# Shape errors are among the most common errors in Deep Learning

In [282]:
tensor_A = torch.tensor([[1,2],
                         [3,4],
                         [6,7]])
tensor_B =  torch.tensor([
    [8,6,6],
    [4,5,6]
])

In [283]:
torch.matmul(tensor_A, tensor_B)

tensor([[16, 16, 18],
        [40, 38, 42],
        [76, 71, 78]])

In [284]:
# this is just the matrix multiplication most of us studied in class 10 or 11
# rule : (m x n) @ (n x p) = (m x p)
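
When the inner dimensions do not match, one common fix is to transpose one of the tensors. A minimal sketch, where `tensor_C` is a hypothetical `(3, 2)` tensor chosen so that its transpose equals `tensor_B` from above:

```python
import torch

tensor_A = torch.tensor([[1, 2],
                         [3, 4],
                         [6, 7]])        # shape (3, 2)
tensor_C = torch.tensor([[8, 4],
                         [6, 5],
                         [6, 6]])        # shape (3, 2), hypothetical example

try:
    torch.matmul(tensor_A, tensor_C)     # (3, 2) @ (3, 2): inner dimensions 2 and 3 differ
except RuntimeError as err:
    print(f"Shape error: {err}")

# Transposing tensor_C gives shape (2, 3), so the inner dimensions match
result = torch.matmul(tensor_A, tensor_C.T)
print(result.shape)  # torch.Size([3, 3])
```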

# Finding the min, max, mean, sum, etc (tensor aggregation)

In [285]:
x = torch.arange(1, 100, 10)
x

tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])

In [286]:
torch.max(x), x.max()

(tensor(91), tensor(91))

In [287]:
torch.min(x), x.min()

(tensor(1), tensor(1))

In [288]:
torch.mean(x.type(torch.float32)), x.type(torch.float32).mean()  # torch.mean() needs a floating point (or complex) dtype and x is int64, so cast it first

(tensor(46.), tensor(46.))
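
To see why the cast is needed, here is a short sketch of what happens without it, along with the `dtype` keyword of `torch.mean` as an alternative to casting by hand:

```python
import torch

x = torch.arange(1, 100, 10)   # dtype is torch.int64

try:
    x.mean()                   # integer tensors are rejected
except RuntimeError as err:
    print(f"Error: {err}")

# Two equivalent ways to get a floating point mean
mean_cast = x.type(torch.float32).mean()
mean_kwarg = torch.mean(x, dtype=torch.float32)
print(mean_cast, mean_kwarg)   # tensor(46.) tensor(46.)
```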

In [289]:
x.sum(), torch.sum(x)

(tensor(460), tensor(460))

# Finding the positional min and max

In [290]:
x

tensor([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])

In [291]:
# find the position of the minimum value with argmin() -> returns the index where the min value occurs

x.argmin()

tensor(0)

In [292]:
x[0]

tensor(1)

In [293]:
# find the position of the maximum value with argmax() -> returns the index where the max value occurs

x.argmax()

tensor(9)

In [294]:
x[9]

tensor(91)

# **Reshaping, stacking, squeezing and unsqueezing tensors**
* Reshaping - reshapes an input tensor to a defined shape
* View - returns a view of an input tensor in a certain shape, sharing the same memory as the original tensor
* Stacking - combines multiple tensors on top of each other (vstack) or side by side (hstack)
* Squeeze - removes all dimensions of size `1` from a tensor
* Unsqueeze - adds a dimension of size `1` to a target tensor
* Permute - returns a view of the input with its dimensions permuted (swapped) in a certain way

In [295]:
# Let's create a tensor

import torch
x = torch.arange(1., 10.)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7., 8., 9.]), torch.Size([9]))

In [296]:
# Add an extra dimension by reshaping

x_reshaped = x.reshape(1, 9)
x_reshaped, x_reshaped.shape


(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [298]:
# Change the view

z = x.view(1, 9)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7., 8., 9.]]), torch.Size([1, 9]))

In [299]:
# Changing z changes x (because a view of a tensor shares the same memory as the original tensor)

z[:,0] = 5
z,x

(tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]]),
 tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.]))
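
The shared memory can be checked directly with `data_ptr()`. The sketch below also shows one case where `view` and `reshape` differ: on a non-contiguous tensor (here a transposed one, chosen only for illustration) `view` fails, while `reshape` falls back to making a copy.

```python
import torch

x = torch.arange(1., 10.)
z = x.view(1, 9)
print(z.data_ptr() == x.data_ptr())   # True: same underlying memory

# A transposed tensor is non-contiguous, so it cannot be flattened as a view
t = torch.arange(6).reshape(2, 3).T   # shape (3, 2)
print(t.is_contiguous())              # False

try:
    t.view(6)
except RuntimeError as err:
    print(f"view failed: {err}")

flat = t.reshape(6)                   # reshape falls back to copying the data
flat[0] = 100
print(t[0, 0])                        # tensor(0): the original is untouched
```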

In [300]:
# Stack tensors on top of each other

x_stacked = torch.stack([x,x,x,x], dim = 0)   # for these 1-D tensors dim can be 0 or 1; try changing it yourself to see what happens
x_stacked

tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.],
        [5., 2., 3., 4., 5., 6., 7., 8., 9.]])
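
For reference, here is a quick sketch of how the resulting shape changes with `dim`, and how `torch.vstack` and `torch.hstack` behave on the same 1-D input:

```python
import torch

x = torch.arange(1., 10.)                      # shape (9,)

print(torch.stack([x, x, x, x], dim=0).shape)  # torch.Size([4, 9]): each x is a row
print(torch.stack([x, x, x, x], dim=1).shape)  # torch.Size([9, 4]): each x is a column
print(torch.vstack([x, x, x, x]).shape)        # torch.Size([4, 9])
print(torch.hstack([x, x, x, x]).shape)        # torch.Size([36]): 1-D inputs are joined end to end
```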

In [301]:
# squeezing : torch.squeeze() - removes all dimensions of size 1 from a target tensor

print(f"Previous tensor : {x_reshaped}")
print(f"Previous shape : {x_reshaped.shape}")

# Remove extra dimensions from x_reshaped
x_squeezed = x_reshaped.squeeze()
print(f"\nNew tensor : {x_squeezed}")
print(f"New shape : {x_squeezed.shape}")

Previous tensor : tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
Previous shape : torch.Size([1, 9])

New tensor : tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
New shape : torch.Size([9])


In [302]:
# torch.unsqueeze() - adds a dimension of size 1 to a target tensor at a specific dim

print(f"Previous tensor : {x_squeezed}")
print(f"Previous shape : {x_squeezed.shape}")

# Add an extra dimension with unsqueeze

x_unsqueezed = x_squeezed.unsqueeze(dim = 0)
print(f"\nNew tensor : {x_unsqueezed}")
print(f"New shape : {x_unsqueezed.shape}")


Previous tensor : tensor([5., 2., 3., 4., 5., 6., 7., 8., 9.])
Previous shape : torch.Size([9])

New tensor : tensor([[5., 2., 3., 4., 5., 6., 7., 8., 9.]])
New shape : torch.Size([1, 9])


In [303]:
# torch.permute -  rearranges the dimensions of a target tensor in a specified order
x_original = torch.rand(size =(224, 224, 3)) # [height, width, colour_channels]

# permute the original tensor to rearrange the axis(or dim) order
x_permuted = x_original.permute(2, 0, 1) # moves axis 2 -> 0, 0 -> 1, 1 -> 2

print(f"Previous shape : {x_original.shape}")
print(f"New shape : {x_permuted.shape}")    # [colour_channels, height, width]



Previous shape : torch.Size([224, 224, 3])
New shape : torch.Size([3, 224, 224])
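
Because `permute` returns a view, the permuted tensor shares its data with the original. A small sketch that writes to the original and reads the value back through the permuted tensor:

```python
import torch

x_original = torch.rand(size=(224, 224, 3))   # [height, width, colour_channels]
x_permuted = x_original.permute(2, 0, 1)      # [colour_channels, height, width]

# Writing to the original is visible through the permuted view
x_original[0, 0, 0] = 0.5
print(x_permuted[0, 0, 0])                    # tensor(0.5000)
print(x_original.data_ptr() == x_permuted.data_ptr())  # True
```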


# **INDEXING - Selecting data from tensors**

Indexing in PyTorch is similar to indexing in NumPy

In [304]:
# create a tensor

import torch
x = torch.arange(1,10).reshape(1,3,3)
x, x.shape

(tensor([[[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]]),
 torch.Size([1, 3, 3]))

In [305]:
# Let's index on our new tensor
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [306]:
x[0][0]          # selects the whole row 0

tensor([1, 2, 3])

In [307]:
x[0][0][0]

tensor(1)

In [308]:
x[0][0][1]        # selects the second element (index 1) of the first row (index 0)

tensor(2)

In [309]:
x[0][1][0]

tensor(4)

In [310]:
x[0][1]       # selects the whole row at 1 index

tensor([4, 5, 6])

In [311]:
x[0]

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [312]:
x[0][2][2]   # will give tensor(9)

tensor(9)

In [313]:
# we can also use ':' to select 'all' of the target dimension
x[:,0]

tensor([[1, 2, 3]])

In [314]:
x[0,0,1]

tensor(2)

In [315]:
x[:,1]

tensor([[4, 5, 6]])

In [316]:
# Get all values of the 0th dimension but only the index 1 value of the 1st and 2nd dimensions
x[:,1,1]

tensor([5])

In [324]:
# Index on x to get 7
x[:,2,0]

# Index on x to return 3, 6, 9
x[:,:,2]

tensor([[3, 6, 9]])

In [321]:
# Get index 0 of the 0th and 1st dimension and all values of 2nd dimension
x[0,0,:]

tensor([1, 2, 3])
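
As a short recap of the cells above, this sketch puts chained indexing, comma indexing and `:` side by side on the same tensor:

```python
import torch

x = torch.arange(1, 10).reshape(1, 3, 3)

print(x[0][2][0], x[0, 2, 0])   # tensor(7) tensor(7): chained and comma indexing agree
print(x[:, 2, 0])               # tensor([7]): ':' keeps dimension 0
print(x[0, :, 2])               # tensor([3, 6, 9]): last column as a 1-D tensor
print(x[:, :, 2])               # tensor([[3, 6, 9]]): same values, dimension 0 kept
```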

# **PyTorch tensors and NumPy**
NumPy is a popular Python library for scientific and numerical computing.

Because of this, PyTorch has functionality to interact with it:
* Data in NumPy, want in PyTorch tensor -> `torch.from_numpy(ndarray)`
* PyTorch tensor -> NumPy -> `torch.Tensor.numpy()`

In [329]:
# Numpy array to Tensor
import torch
import numpy as np


array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array)   # to change the dtype use `.type(torch.float32)`
array, tensor


(array([1., 2., 3., 4., 5., 6., 7.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))

In [330]:
# Note: when converting from NumPy to PyTorch, the tensor keeps NumPy's dtype (float64 here), whereas PyTorch's own default is float32

In [331]:
torch.arange(1.0,8.0).dtype

torch.float32

In [332]:
np.arange(1.0,8.0).dtype

dtype('float64')

In [335]:
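# Let's see what happens if I change the array

array = array + 1
array, tensor            # tensor is unchanged: `array + 1` creates a new array (an in-place `array += 1` would change the tensor too, since from_numpy shares memory)

(array([2., 3., 4., 5., 6., 7., 8.]),
 tensor([1., 2., 3., 4., 5., 6., 7.], dtype=torch.float64))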

In [337]:
# Tensor to numpy array

tensor = torch.arange(1.0, 5.0)
numpy_tensor = tensor.numpy()
tensor, numpy_tensor

(tensor([1., 2., 3., 4.]), array([1., 2., 3., 4.], dtype=float32))

In [338]:
# Note: when converting from PyTorch to NumPy, the array keeps PyTorch's dtype (float32 here), whereas NumPy's own default is float64

In [340]:
tensor = tensor + 1
tensor, numpy_tensor   # numpy_tensor is unchanged: `tensor + 1` creates a new tensor (an in-place `tensor += 1` would change the array too)

(tensor([2., 3., 4., 5.]), array([1., 2., 3., 4.], dtype=float32))
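
Both cells above reassign the variable (`array = array + 1`, `tensor = tensor + 1`), which creates a new object, so nothing appears to change. `torch.from_numpy()` and `Tensor.numpy()` do share memory with their source, though, so an in-place update is visible on both sides. A minimal sketch (`shared`, `copied` and `numpy_view` are illustrative names):

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
shared = torch.from_numpy(array)   # shares memory with `array`
copied = torch.tensor(array)       # makes an independent copy

array += 1                         # in-place update of the NumPy array
print(shared)                      # values 2. to 8.: the change is visible
print(copied)                      # values 1. to 7.: the copy is unaffected

tensor = torch.arange(1.0, 5.0)
numpy_view = tensor.numpy()        # also shares memory
tensor += 1                        # in-place update of the tensor
print(numpy_view)                  # [2. 3. 4. 5.]
```

If an independent copy is what you want, `torch.tensor(array)` is the safer choice.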

# **Reproducibility (trying to take the random out of random)**
How a neural network learns, in short:
* `start with random numbers -> tensor operations -> update random numbers to try to make them better representations of the data -> again -> again....`

To reduce the randomness in neural networks and PyTorch, there is the concept of a **random seed**.
* Essentially, what the random seed does is "flavour" the randomness.

In [343]:
import torch

# let's create two random tensors

random_tensor_A = torch.rand(3,4)
random_tensor_B = torch.rand(3,4)

print(random_tensor_A)
print(random_tensor_B)
print(random_tensor_A == random_tensor_B)


tensor([[0.0809, 0.9836, 0.6149, 0.7452],
        [0.8540, 0.1501, 0.2034, 0.5094],
        [0.6863, 0.6070, 0.5209, 0.9204]])
tensor([[0.2024, 0.6196, 0.3982, 0.9279],
        [0.5592, 0.7141, 0.4373, 0.0333],
        [0.4556, 0.3575, 0.7345, 0.7725]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [False, False, False, False]])


In [346]:
# Let's make some random but reproducible tensors

RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3,4)

torch.manual_seed(RANDOM_SEED)   # reseed before each call: every torch.rand() advances the generator state, so without this the second tensor would differ
random_tensor_D = torch.rand(3,4)

print(random_tensor_C)
print(random_tensor_D)
print(random_tensor_C == random_tensor_D)

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
tensor([[True, True, True, True],
        [True, True, True, True],
        [True, True, True, True]])
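
A short sketch of what happens if the seed is set only once, plus an alternative that avoids touching the global seed by passing a dedicated `torch.Generator` to `torch.rand`. The generator names are illustrative.

```python
import torch

RANDOM_SEED = 42

# Without reseeding, the second call continues from the advanced generator state
torch.manual_seed(RANDOM_SEED)
first = torch.rand(3, 4)
second = torch.rand(3, 4)
print(torch.equal(first, second))   # False

# Separate generators with the same seed leave the global state alone
gen_a = torch.Generator().manual_seed(RANDOM_SEED)
gen_b = torch.Generator().manual_seed(RANDOM_SEED)
tensor_a = torch.rand(3, 4, generator=gen_a)
tensor_b = torch.rand(3, 4, generator=gen_b)
print(torch.equal(tensor_a, tensor_b))  # True
```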


# Running tensors and PyTorch objects on GPUs (and making computation faster)

GPUs = faster computation on numbers, thanks to CUDA + NVIDIA hardware + PyTorch working behind the scenes to make everything better

In [1]:
!nvidia-smi

Wed Oct 15 13:21:03 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   46C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

# 1. **Check for GPU access with PyTorch**

In [2]:
# Check for GPU access with PyTorch

import torch
torch.cuda.is_available()

True

# 2. **Since PyTorch can run on the GPU or CPU, it's best practice to set up device-agnostic code** : http://pytorch.org/docs/notes/cuda.html#best-practices

* e.g. run on GPU if available, else default to CPU


In [3]:
# Setup device agnostic code

device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [5]:
# Count number of devices

torch.cuda.device_count()

1

# 3. **Putting tensors (and models) on the GPU**

* The reason we want our tensors/models on the GPU is because using a GPU results in faster computations.

In [8]:
# Create a tensor (default on the CPU)
tensor = torch.tensor([1,2,3])
tensor, tensor.device

(tensor([1, 2, 3]), device(type='cpu'))

In [9]:
# Move tensor to GPU (if available)
tensor_on_gpu = tensor.to(device)
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')
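
Instead of creating a tensor on the CPU and moving it afterwards, most creation functions accept a `device` argument. A small device-agnostic sketch that also runs on a machine without a GPU:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the target device instead of moving it afterwards
ones_on_device = torch.ones(2, 3, device=device)
print(ones_on_device.device)

# .to(device) returns the same tensor object when it is already on that device
cpu_tensor = torch.tensor([1, 2, 3])
print(cpu_tensor.to("cpu") is cpu_tensor)   # True
```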

# **4. Moving tensors back to the CPU**

In [10]:
# if a tensor is on the GPU, it can't be converted to NumPy directly
tensor_on_gpu.numpy()

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

In [11]:
# To fix the GPU tensor with NumPy issue, first copy it to the CPU
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
tensor_back_on_cpu

array([1, 2, 3])
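
The same pattern can be wrapped in a small function that works whichever device the tensor is on. `to_numpy` is a hypothetical helper, not part of PyTorch; it also calls `detach()`, since tensors that track gradients cannot be converted to NumPy directly either.

```python
import torch

def to_numpy(tensor):
    """Hypothetical helper: convert any tensor to a NumPy array."""
    # detach() drops gradient tracking, cpu() is a no-op for CPU tensors
    return tensor.detach().cpu().numpy()

device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_device = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

print(to_numpy(tensor_on_device))   # [1. 2. 3.]
```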

In [12]:
tensor_on_gpu

tensor([1, 2, 3], device='cuda:0')