This is a notebook for learning PyTorch.

Based on the course here: https://www.youtube.com/watch?v=V_xro1bcAuA&list=PLD2CAWzRz8e_7BGTAjc8ordnegtwW13et

# 00 PyTorch Fundamentals

## Check version

In [1]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.6.0+cu126


In [2]:
!nvidia-smi

Tue Mar 18 20:38:02 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 572.42                 Driver Version: 572.42         CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 4070      WDDM  |   00000000:01:00.0  On |                  N/A |
| 40%   30C    P5             27W /  200W |    1176MiB /  12282MiB |     48%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

## Introduction to Tensor

### Creating tensors

In [3]:
# Scalar
scalar = torch.tensor(7)
display(scalar)

tensor(7)

In [4]:
print(scalar.ndim)

0


In [5]:
# Get tensor back as Python int
print(scalar.item())

7


### Vector

As in NumPy, the number of dimensions equals the nesting depth of the square brackets `[]`.

In [6]:
vector = torch.tensor([7, 7])
print(vector)
print(vector.ndim)

tensor([7, 7])
1


### Matrix

In [7]:
matrix = torch.tensor([[7, 8],
                       [9, 10]])
print(matrix)
print(matrix.ndim)
print(matrix[0])
print(matrix[1])
print(matrix.shape)

tensor([[ 7,  8],
        [ 9, 10]])
2
tensor([7, 8])
tensor([ 9, 10])
torch.Size([2, 2])


### Tensor

In [8]:
tensor = torch.tensor([[[1, 2, 3, 4],
                        [3, 6, 9, 6],
                        [2, 4, 5, 1]],
                       [[2, 3, 4, 2],
                        [2, 3, 6, 4],
                        [1, 1, 1, 9]]])
print(tensor)
print(tensor.ndim)
print(tensor.shape)


tensor([[[1, 2, 3, 4],
         [3, 6, 9, 6],
         [2, 4, 5, 1]],

        [[2, 3, 4, 2],
         [2, 3, 6, 4],
         [1, 1, 1, 9]]])
3
torch.Size([2, 3, 4])
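To connect bracket nesting to the reported shape, here is a small sketch. The name `nested` is introduced only for illustration; the values are the same as in the cell above.

```python
import torch

# Illustrative tensor: 2 blocks, each holding 3 rows of 4 values.
nested = torch.tensor([[[1, 2, 3, 4],
                        [3, 6, 9, 6],
                        [2, 4, 5, 1]],
                       [[2, 3, 4, 2],
                        [2, 3, 6, 4],
                        [1, 1, 1, 9]]])

# ndim counts the levels of nesting; shape gives the length of each level.
print(nested.ndim)         # 3
print(nested.shape)        # torch.Size([2, 3, 4])

# Each index removes one level of nesting.
print(nested[0].shape)     # torch.Size([3, 4])
print(nested[0][1].shape)  # torch.Size([4])
print(nested[0][1][2])     # tensor(9)
```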


### Create random tensor in PyTorch

`torch.rand(3, 4)` generates a 3-by-4 tensor of values drawn uniformly from the interval [0, 1).

In [9]:
random_tensor = torch.rand(3, 4)
print(random_tensor)

tensor([[0.3320, 0.1545, 0.8181, 0.4580],
        [0.5511, 0.0500, 0.4060, 0.2323],
        [0.8576, 0.1687, 0.6128, 0.9154]])


In [10]:
print(random_tensor.ndim)

2
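Random tensors differ on every run. If repeatable values are needed, one option is to seed the generator first. A minimal sketch, assuming the arbitrary seed 42 and the illustrative names `first` and `second`:

```python
import torch

# Seeding the global generator makes torch.rand repeatable (42 is arbitrary).
torch.manual_seed(42)
first = torch.rand(3, 4)

torch.manual_seed(42)
second = torch.rand(3, 4)

print(torch.equal(first, second))  # True: same seed, same values

# Every value lies in the half-open interval [0, 1).
print((first >= 0).all().item(), (first < 1).all().item())  # True True
```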


### Image Tensor

Create a random tensor with similar shape to an image tensor

In [11]:
random_image_size_tensor = torch.rand(size = (224, 224, 3))
# height, width, color channels
print(random_image_size_tensor.shape, random_image_size_tensor.ndim)

torch.Size([224, 224, 3]) 3
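The tensor above uses height, width, channels order. Many PyTorch layers (for example `torch.nn.Conv2d`) expect channels first, so a reordering step is often needed. A hedged sketch with the illustrative names `image_hwc` and `image_chw`:

```python
import torch

# Illustrative image-shaped tensor in (height, width, channels) order.
image_hwc = torch.rand(size=(224, 224, 3))

# permute returns a view with the axes reordered: (H, W, C) -> (C, H, W).
image_chw = image_hwc.permute(2, 0, 1)

print(image_hwc.shape)  # torch.Size([224, 224, 3])
print(image_chw.shape)  # torch.Size([3, 224, 224])

# The same pixel is reachable through either layout.
print(image_hwc[10, 20, 1] == image_chw[1, 10, 20])  # tensor(True)
```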


### Create a tensor of all zeros or ones

In [12]:
zeros = torch.zeros(size = (3, 4))
ones = torch.ones(size = (3, 4))
print(zeros)
print(ones)

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])


### Default Data types

In [13]:
print(ones.dtype)

torch.float32
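The default depends on how the tensor is created. A short sketch of the inference rules, assuming the global default dtype has not been changed from `torch.float32`:

```python
import torch

# Factory functions such as zeros/ones/rand use the global default dtype.
print(torch.get_default_dtype())           # torch.float32 unless it was changed

# torch.tensor infers the dtype from the Python values it is given.
int_tensor = torch.tensor([1, 2, 3])       # Python ints   -> torch.int64
float_tensor = torch.tensor([1.0, 2, 3])   # any float     -> default float dtype
print(int_tensor.dtype, float_tensor.dtype)

# An explicit dtype overrides the default.
double_zeros = torch.zeros(3, dtype=torch.float64)
print(double_zeros.dtype)                  # torch.float64
```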


### Create a range of tensors and tensors-like

In [14]:
one_to_ten = torch.arange(0, 10)  # Despite the name, this holds 0 through 9: `end` is exclusive
print(one_to_ten)

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


In [15]:
with_step = torch.arange(start = 1, end = 11, step = 2)
print(with_step) # `end` is exclusive: values start at 1 and stop before 11

tensor([1, 3, 5, 7, 9])
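A small sketch of how `start`, `end`, and `step` interact. The names `full`, `short`, and `stepped` are illustrative:

```python
import torch

full = torch.arange(0, 10)      # start=0, end=10 (exclusive)
short = torch.arange(10)        # start defaults to 0
stepped = torch.arange(start=1, end=11, step=2)

print(torch.equal(full, short))  # True
print(stepped.tolist())          # [1, 3, 5, 7, 9]
print(len(stepped))              # 5: the value 11 itself is never included
```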


### Creating tensors like another tensor

In [16]:
ten_zeros = torch.zeros_like(input = one_to_ten)
# Create a tensor with the same shape but filled with zero
# Does not affect original tensor
print(ten_zeros)
print(one_to_ten)

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
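The `*_like` functions copy the shape and, by default, the dtype of their input. A minimal sketch with illustrative names:

```python
import torch

base = torch.arange(0, 10)             # int64 tensor with shape [10]

zeros_like_base = torch.zeros_like(base)
ones_like_base = torch.ones_like(base)

# Shape and dtype are inherited from `base`.
print(zeros_like_base.shape, zeros_like_base.dtype)  # torch.Size([10]) torch.int64
print(ones_like_base.tolist())                        # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

# The dtype can be overridden while keeping the shape.
float_zeros = torch.zeros_like(base, dtype=torch.float32)
print(float_zeros.dtype)                              # torch.float32
```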


## Tensor datatypes

The three most important parameters when creating a tensor:

1. `dtype`: the datatype of the tensor (e.g. `torch.float64`, `torch.float32`). With `None`, the datatype is inferred from the data.
2. `device`: where the tensor is stored, e.g. `"cuda"`, `"cpu"`, or `None`.
    An error will generally occur if an operation is performed on two tensors that are not on the same device.
3. `requires_grad`: determines whether operations on this tensor are tracked so that gradients can be computed.

Three common sources of errors when working with tensors:

1. Tensors not the right datatype
2. Tensors not the right shape
3. Tensors not on the right device
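A minimal sketch of setting all three parameters explicitly. It assumes nothing about the machine: the `device` string falls back to `"cpu"` when no CUDA GPU is available. The names `device` and `example` are illustrative.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

example = torch.tensor([3.0, 6.0, 9.0],
                       dtype=torch.float32,   # explicit datatype
                       device=device,         # where the data is stored
                       requires_grad=False)   # do not track gradients

print(example.dtype)          # torch.float32
print(example.device.type)    # "cuda" or "cpu", depending on the machine
print(example.requires_grad)  # False
```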

In [17]:
float_32_tensor = torch.tensor([3.0, 6.0, 9.0], dtype = None, device = "cuda", requires_grad = False)
print(float_32_tensor)
print(float_32_tensor.dtype)

tensor([3., 6., 9.], device='cuda:0')
torch.float32


In [18]:
float_16_tensor = float_32_tensor.type(torch.float16)
print(float_16_tensor)
float_16_tensor = float_32_tensor.type(torch.half)
print(float_16_tensor)

tensor([3., 6., 9.], device='cuda:0', dtype=torch.float16)
tensor([3., 6., 9.], device='cuda:0', dtype=torch.float16)
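Conversion returns a new tensor and leaves the original untouched. The sketch below runs on the CPU and shows two alternatives to `.type()`; the variable names are illustrative.

```python
import torch

f32 = torch.tensor([3.0, 6.0, 9.0])   # float32 by default

# Three equivalent ways to get a float16 copy.
f16_a = f32.type(torch.float16)
f16_b = f32.to(torch.float16)
f16_c = f32.half()
print(f16_a.dtype, f16_b.dtype, f16_c.dtype)

# The source tensor keeps its dtype.
print(f32.dtype)                       # torch.float32

# Mixing float16 and float32 promotes the result to the wider type.
print((f16_a * f32).dtype)             # torch.float32
```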


## Getting tensor attributes

Check Data Type

In [19]:
tensor.dtype

torch.int64

Check Data Shape

In [20]:
tensor.shape

torch.Size([2, 3, 4])

Check Device

In [21]:
print(tensor.device)
print(float_32_tensor.device)

cpu
cuda:0


Check Information

In [22]:
some_tensor = torch.rand(3, 4)
def tensor_info(ts):
    print(ts)
    print(f"Datatype of tensor: {ts.dtype}")
    print(f"Shape of tensor: {ts.shape}")
    print(f"Device tensor is on: {ts.device}")
    
tensor_info(some_tensor)
print('\n')
tensor_info(float_32_tensor)

tensor([[0.1081, 0.8390, 0.5762, 0.9815],
        [0.2298, 0.1721, 0.5286, 0.8098],
        [0.7269, 0.2397, 0.8617, 0.8585]])
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3, 4])
Device tensor is on: cpu


tensor([3., 6., 9.], device='cuda:0')
Datatype of tensor: torch.float32
Shape of tensor: torch.Size([3])
Device tensor is on: cuda:0


## Manipulate Tensors

Basic mathematical operations:
1. addition
2. subtraction
3. multiplication (element-wise, e.g. tensor by scalar)
4. division
5. matrix multiplication (matrix by matrix)

In [23]:
tensor = torch.tensor([1, 2, 3])
print(tensor + 10)
print(tensor * 10)
print(tensor - 10)
print(tensor / 2)


tensor([11, 12, 13])
tensor([10, 20, 30])
tensor([-9, -8, -7])
tensor([0.5000, 1.0000, 1.5000])
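Two details of the output above are worth a quick check: the operators return new tensors, and `/` on an integer tensor produces floats. A small sketch with illustrative names:

```python
import torch

values = torch.tensor([1, 2, 3])

added = values + 10         # a new tensor; `values` is not modified
print(values)               # tensor([1, 2, 3])
print(added)                # tensor([11, 12, 13])

# True division promotes an integer tensor to the default float dtype.
print((values / 2).dtype)   # torch.float32 with the default settings

# Floor division keeps the integer dtype.
floored = torch.div(values, 2, rounding_mode="floor")
print(floored)              # tensor([0, 1, 1])
```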


Built-in functions

In [24]:
print(torch.mul(tensor, 10), torch.add(tensor, 10))
print(torch.subtract(tensor, 10), torch.divide(tensor, 10))

tensor([10, 20, 30]) tensor([11, 12, 13])
tensor([-9, -8, -7]) tensor([0.1000, 0.2000, 0.3000])


### Matrix multiplication

In [25]:
random_tensor = torch.arange(start = 1, end = 10000000, step = 1)

Element-wise multiplication

In [26]:
%%time
a = random_tensor * random_tensor
# Element-wise: [1 * 1, 2 * 2, 3 * 3, ...]

CPU times: total: 156 ms
Wall time: 7 ms


In [27]:
%%time
b = torch.mul(random_tensor, random_tensor)

CPU times: total: 125 ms
Wall time: 10 ms


Matrix multiplication

In [28]:
random_tensor = torch.arange(start = 1, end = 1000000, step = 1)

In [29]:
%%time
print(torch.matmul(random_tensor, random_tensor))
# For two 1-D tensors, matmul is the dot product (row by column)

tensor(333332833333500000)
CPU times: total: 0 ns
Wall time: 1 ms


In [30]:
%%time
answer = 0
for i in range(len(random_tensor)):
    answer += random_tensor[i] * random_tensor[i]

print(answer)

tensor(333332833333500000)
CPU times: total: 6.19 s
Wall time: 6.37 s


In [31]:
%%time
print(random_tensor @ random_tensor)

tensor(333332833333500000)
CPU times: total: 0 ns
Wall time: 2 ms
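On a tensor small enough to check by hand, the difference between element-wise and matrix multiplication looks like this (names are illustrative):

```python
import torch

small = torch.tensor([1, 2, 3])

# Element-wise: each entry is multiplied by its counterpart.
elementwise = small * small
print(elementwise)            # tensor([1, 4, 9])

# Matrix multiplication of two 1-D tensors is the dot product:
# 1*1 + 2*2 + 3*3 = 14
dot = torch.matmul(small, small)
print(dot)                    # tensor(14)

# The `@` operator is shorthand for torch.matmul.
print(small @ small)          # tensor(14)
```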


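Shapes: the inner dimensions must match, so `torch_B` with shape (2, 3) is transposed to (3, 2) before multiplying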

In [32]:
torch_A = torch.rand(2, 3)
torch_B = torch.rand(2, 3)
print(torch_A)
print(torch_A.T)
product = torch_A @ torch_B.T
print(product)

tensor([[0.1428, 0.9433, 0.1434],
        [0.2099, 0.2637, 0.9171]])
tensor([[0.1428, 0.2099],
        [0.9433, 0.2637],
        [0.1434, 0.9171]])
tensor([[0.9049, 0.8902],
        [0.6637, 0.9983]])
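The shape rule for matrix multiplication is (n, k) @ (k, m) -> (n, m). A sketch of what happens when the inner dimensions do and do not match, with illustrative names:

```python
import torch

mat_a = torch.rand(2, 3)
mat_b = torch.rand(2, 3)

# (2, 3) @ (2, 3): inner dimensions 3 and 2 differ, so this fails.
try:
    mat_a @ mat_b
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(f"Shape error: {err}")

# Transposing one operand makes the inner dimensions agree.
print((mat_a @ mat_b.T).shape)  # (2, 3) @ (3, 2) -> torch.Size([2, 2])
print((mat_a.T @ mat_b).shape)  # (3, 2) @ (2, 3) -> torch.Size([3, 3])
```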


## Finding the min, max, mean, sum, etc

In [35]:
random_tensor = torch.arange(0, 100, 10)
print(random_tensor)

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])


In [42]:
print(torch.min(random_tensor))
print(torch.max(random_tensor))

# torch.mean needs a floating-point (or complex) dtype, so cast the int64 tensor to float32 first
print(torch.mean(random_tensor.type(torch.float32)))
print(torch.sum(random_tensor))

print(random_tensor.min())
print(random_tensor.max())
print(random_tensor.type(torch.float32).mean())
print(random_tensor.sum())


tensor(0)
tensor(90)
tensor(45.)
tensor(450)
tensor(0)
tensor(90)
tensor(45.)
tensor(450)
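To see why the cast is needed, this sketch calls `mean` on the integer tensor directly and catches the resulting error. The names are illustrative.

```python
import torch

int_values = torch.arange(0, 100, 10)   # dtype is torch.int64

# mean() is not defined for integer tensors and raises a RuntimeError.
try:
    int_values.mean()
    mean_failed = False
except RuntimeError as err:
    mean_failed = True
    print(f"mean() failed: {err}")

# Casting to a floating-point dtype first makes it work.
mean_value = int_values.float().mean()
print(mean_value)                        # tensor(45.)

# min, max and sum work on integer tensors without a cast.
print(int_values.min(), int_values.max(), int_values.sum())
```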


Find the index of the min and max

In [46]:
# Get the index of the smallest/largest element in the tensor
print(random_tensor.argmin())
print(random_tensor.argmax())

tensor(0)
tensor(9)
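The returned index can be used to look the value up again. A short sketch with illustrative names:

```python
import torch

values = torch.arange(0, 100, 10)

min_index = values.argmin()
max_index = values.argmax()

# Indexing with the result recovers the min and max values.
print(values[min_index])   # tensor(0)
print(values[max_index])   # tensor(90)

# .item() converts the 0-dimensional index tensor to a plain Python int.
print(max_index.item())    # 9
```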


### Reshape, stacking, squeezing and unsqueezing tensors