### 00. PyTorch Fundamentals

Timestamp: 1:14:00 - https://youtu.be/Z_ikDlimN6A?t=4440&si=0MbmtYJsTSlTceYB

In [1]:
!nvidia-smi

Mon Jun  2 14:19:05 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 576.02                 Driver Version: 576.02         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce GTX 1650      WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   55C    P8              5W /   50W |       0MiB /   4096MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [2]:
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
print(torch.__version__)

2.7.0+cu126


## Introduction to Tensors

### Creating tensors

PyTorch tensors are created using `torch.tensor()`.



In [3]:
# scalar
s = torch.tensor(7)
s

tensor(7)

In [4]:
s.ndim # Number of dimensions (scalar has zero)

0

In [5]:
# Get the value back as a Python int
s.item()

7

In [6]:
# vector
v = torch.tensor([7,7])
v

tensor([7, 7])

In [7]:
v.ndim # Vector has one dimension

1

In [8]:
v.shape

torch.Size([2])

In [9]:
# matrix
X = torch.tensor([
    [5, 6],
    [7, 8]
])

In [10]:
X.ndim # Matrices have two dimensions

2

In [11]:
X.shape # return the size of each dimension of the tensor

torch.Size([2, 2])

In [12]:
X.size() # method form; returns the same torch.Size as .shape

torch.Size([2, 2])

In [13]:
# Tensor
T = torch.tensor([[
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]])

In [14]:
T.ndim # Three dimensions (same as number of opening square brackets '[' )

3

In [15]:
T.size()

torch.Size([1, 3, 3])

In [16]:
# indexing & slicing tensors
T[0] # shows the first (and only) element along dimension 0 of tensor T

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])

In [17]:
# index again to reach into the second dimension
T[0][0]

tensor([1, 2, 3])

In [18]:
# or
T[0][1]

tensor([4, 5, 6])
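
The chained indexing above can also be written with a single pair of brackets. A minimal sketch, reusing the same values as `T`; the names `row`, `element`, and `column` are only for illustration:

```python
import torch

T = torch.tensor([[
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]])

row = T[0, 1]         # same values as T[0][1]
element = T[0, 1, 2]  # a single element, returned as a 0-dim tensor
column = T[0, :, 0]   # ":" keeps every index along that dimension

print(row)      # tensor([4, 5, 6])
print(element)  # tensor(6)
print(column)   # tensor([1, 4, 7])
```

Comma-separated indices select along one dimension per position, so `T[0, 1]` and `T[0][1]` return the same row.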

### Random tensors

Why random tensors?

Random tensors matter because many neural networks start with tensors full of random numbers and then adjust those numbers as they 'learn', so that the numbers better represent the data.

`Start with random numbers -> Look at data -> Update random numbers -> Look at data -> Update random numbers -> ....`
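
The loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how PyTorch training is normally written: the data `y = 2 * x`, the step size `lr`, the seed, and the hand-written gradient are all made up for the example.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the run is repeatable

x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x  # toy data: the number to discover is 2.0

w = torch.rand(1)  # start with a random number
lr = 0.01          # hypothetical step size

for step in range(200):
    pred = w * x                        # look at the data
    grad = (2 * (pred - y) * x).mean()  # gradient of the mean squared error w.r.t. w
    w = w - lr * grad                   # update the random number

print(w)  # ends up very close to 2.0
```

Whatever random value `w` starts from, each pass nudges it toward the value that fits the data.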

PyTorch docs for `torch.rand` - https://docs.pytorch.org/docs/stable/generated/torch.rand.html

In [19]:
# Create a random tensor of size (3, 4)
rt = torch.rand(size=(3, 4))
rt

tensor([[0.9579, 0.4746, 0.2607, 0.7167],
        [0.0911, 0.7781, 0.1477, 0.6523],
        [0.7048, 0.2786, 0.7484, 0.1239]])

In [20]:
# Create a random tensor with similar shape to an image tensor
t = torch.rand(size=(3, 224, 224)) # color channels, height, width

In [21]:
t.shape, t.ndim

(torch.Size([3, 224, 224]), 3)

### Zeros and ones

In [22]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(4, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [23]:
# Create a tensor of all ones
ones = torch.ones(size=(4, 4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [24]:
ones.dtype # float32 is the default dtype

torch.float32

### Creating a range and tensors-like

In [25]:
# Use torch.range
torch.range(1,5) # Note the warning- torch.range() is deprecated.



tensor([1., 2., 3., 4., 5.])

In [26]:
torch.__version__ # torch version = '2.7.0+cu126'

'2.7.0+cu126'

In [27]:
# Instead, use torch.arange()
t = torch.arange(0,5)
t

tensor([0, 1, 2, 3, 4])

In [28]:
t = torch.arange(start=5, end=25, step=5) # Note end is not inclusive
t

tensor([ 5, 10, 15, 20])
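
Because `end` is exclusive in `torch.arange()`, getting the same values as `torch.range(1, 5)` means passing `6` as the end. A short sketch; `torch.linspace()` is not used elsewhere in these notes and is shown only as an option when both endpoints should be included:

```python
import torch

a = torch.arange(1, 6)             # end is exclusive, so 6 is needed to include 5
b = torch.arange(0, 1, 0.25)       # a float step produces a float tensor
c = torch.linspace(0, 1, steps=5)  # both ends included, 5 evenly spaced points

print(a)  # tensor([1, 2, 3, 4, 5])
print(b)  # tensor([0.0000, 0.2500, 0.5000, 0.7500])
print(c)  # tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
```

Note that `torch.arange(1, 6)` returns integers, whereas the deprecated `torch.range(1, 5)` call above returned floats.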

In [29]:
# Creating tensors-like (e.g. a tensor of zeros with the same shape as another tensor)
t1 = torch.arange(start=0, end=10)
t1


tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [30]:
t2 = torch.zeros_like(t1)
t2

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [31]:
t1.size() == t2.size() # Checking t1 and t2 are same size/shape

True

In [32]:
t3 = torch.ones_like(t2)
t3

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
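
The `*_like` functions copy more than the shape: the new tensor also takes the dtype of the source, which is why the outputs above are integers rather than floats. A minimal sketch; the variable names are only illustrative:

```python
import torch

src = torch.arange(0, 10)  # int64 tensor of shape (10,)

zeros_int = torch.zeros_like(src)                         # inherits int64 from src
zeros_float = torch.zeros_like(src, dtype=torch.float32)  # dtype overridden explicitly

print(zeros_int.dtype)    # torch.int64
print(zeros_float.dtype)  # torch.float32
print(zeros_int.shape == src.shape)  # True
```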

### Tensor dtypes

**Note:** Tensor dtype mismatches are one of the three most common errors you'll run into when using PyTorch for deep learning:

1. Tensors not the right dtype
1. Tensors not the right shape
1. Tensors not on the right device (see the `torch.device` class)
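
As a rough illustration of the first item, the sketch below multiplies a float tensor by an integer tensor. The tensors `a` and `b` are made up for the example, and the exact error message may differ between PyTorch versions.

```python
import torch

a = torch.rand(2, 3)                     # float32 by default
b = torch.ones(3, 2, dtype=torch.int64)  # integer tensor

# Inspect the three attributes before combining tensors
print(a.dtype, b.dtype)    # torch.float32 torch.int64
print(a.shape, b.shape)    # torch.Size([2, 3]) torch.Size([3, 2])
print(a.device, b.device)  # cpu cpu

dtype_error = None
try:
    a @ b  # the shapes are compatible, the dtypes are not
except RuntimeError as err:
    dtype_error = err
    print(f"dtype mismatch: {err}")

fixed = a @ b.to(a.dtype)  # cast b to match a, then multiply
print(fixed.shape)         # torch.Size([2, 2])
```

Printing `dtype`, `shape`, and `device` is usually the quickest first check when an operation fails.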

In [33]:
t_float32 = torch.tensor([3.0, 6.0, 9.0], dtype=None)
t_float32.dtype # default precision

torch.float32

In [34]:
t_float16 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
t_float16.dtype # 'half'

torch.float16

In [35]:
t_float64 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float64)
t_float64.dtype # 'double'

torch.float64

In [36]:
# Other parameters available when creating tensors (device, requires_grad)

t = torch.tensor(data=[2, 4, 6, 8, 10], # array_like : data to make the tensor
                 dtype=torch.float32, # torch.dtype : dtype of the tensor
                 device=None, # torch.device : e.g. "cuda:0" or "cpu" - what device the tensor is on
                 requires_grad=False) # bool : whether to track gradients for this tensor's operations
t.dtype

torch.float32

In [37]:
# Convert between dtypes

t = t_float64.type(torch.float16)
t.dtype

torch.float16

In [38]:
t2 = t_float64.type_as(t)
t2.dtype

torch.float16
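
`Tensor.to()` is another way to convert between dtypes, alongside `.type()` and `.type_as()`. A small sketch with illustrative variable names; none of these calls modify the original tensor, they return a converted copy:

```python
import torch

t64 = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float64)

t16 = t64.to(torch.float16)  # same result as t64.type(torch.float16)
t32 = t16.float()            # shorthand for converting to torch.float32

print(t64.dtype, t16.dtype, t32.dtype)  # torch.float64 torch.float16 torch.float32
```

Lower-precision dtypes use less memory but store fewer significant digits, so converting down can lose precision.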

### Getting information from tensors

* dtype - to get the dtype of a tensor object `t`, use `t.dtype`
* shape - to get the shape of a tensor object `t`, use `t.shape` or `t.size()`
* device - to get the device the tensor `t` is on, use `t.device`


In [39]:
t = torch.rand(3, 4)
t

tensor([[0.4761, 0.6153, 0.4220, 0.5698],
        [0.8659, 0.3824, 0.5756, 0.4927],
        [0.4071, 0.6443, 0.1410, 0.2044]])

In [40]:
# Find out details about tensor (tensor attributes)
print(f"Tensor t: {t}")
print("")
print("Tensor Attributes:")
print(f"dtype of tensor t: {t.dtype}")
print(f"shape of tensor t: {t.shape}")
print(f"device of tensor t: {t.device}")

Tensor t: tensor([[0.4761, 0.6153, 0.4220, 0.5698],
        [0.8659, 0.3824, 0.5756, 0.4927],
        [0.4071, 0.6443, 0.1410, 0.2044]])

Tensor Attributes:
dtype of tensor t: torch.float32
shape of tensor t: torch.Size([3, 4])
device of tensor t: cpu


### Manipulating tensors

Tensor operations include:
* Addition
* Subtraction
* Multiplication (element-wise)
* Division
* Matrix multiplication

In [41]:
## Addition

t = torch.tensor([1,2,3])
print(f"Tensor t: {t}")
print(f"+10:  {t + 10}")
print(f"+100: {t + 100}")
print(f"+10:  {t.add(10)}")
print(f"+100: {t.add(100)}")

Tensor t: tensor([1, 2, 3])
+10:  tensor([11, 12, 13])
+100: tensor([101, 102, 103])
+10:  tensor([11, 12, 13])
+100: tensor([101, 102, 103])


In [42]:
# Subtraction
t = torch.tensor([4, 6, 8])
print(f"Tensor t: {t}")
print(f"-2: {t - 2}")
print(f"-4: {t - 4}")

print(f"-2: {t.sub(2)}")
print(f"-4: {t.sub(4)}")

Tensor t: tensor([4, 6, 8])
-2: tensor([2, 4, 6])
-4: tensor([0, 2, 4])
-2: tensor([2, 4, 6])
-4: tensor([0, 2, 4])


In [43]:
# Multiplication (element-wise)
t = torch.tensor([10, 20, 30])
print(f"Tensor t: {t}")
print(f"multiply by 5: {t * 5}")
print(f"multiply by 5: {t.mul(5)}")

Tensor t: tensor([10, 20, 30])
multiply by 5: tensor([ 50, 100, 150])
multiply by 5: tensor([ 50, 100, 150])


### Matrix multiplication

There are two main ways of performing multiplication in neural networks and deep learning:

1. Element-wise multiplication
1. Matrix multiplication (the most common in neural networks)

More information on multiplying matrices: https://www.mathsisfun.com/algebra/matrix-multiplying.html

There are two main rules that matrix multiplication needs to satisfy:

1. The inner dimensions must match:
* `(3, 2) @ (2, 3)` - works! 2 = 2
* `(2, 3) @ (2, 3)` - won't work! 3 != 2
* `(2, 3) @ (3, 2)` - works! 3 = 3
1. The resulting matrix has the shape of the outer dimensions:
* `(3, 2) @ (2, 3)` - result is a tensor of shape (3, 3)
* `(2, 3) @ (3, 2)` - result is a tensor of shape (2, 2)
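
The shape examples above can be checked directly. A minimal sketch using random matrices; `A` and `B` are placeholder names and the exact wording of the error message may vary by PyTorch version:

```python
import torch

A = torch.rand(3, 2)
B = torch.rand(2, 3)

print((A @ B).shape)  # torch.Size([3, 3]) - inner dims 2 and 2 match
print((B @ A).shape)  # torch.Size([2, 2]) - inner dims 3 and 3 match

shape_error = None
try:
    B @ B  # (2, 3) @ (2, 3): inner dimensions 3 and 2 differ
except RuntimeError as err:
    shape_error = err
    print(f"shape mismatch: {err}")
```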

In [44]:
## Element wise multiplication
t = torch.tensor([1,2,3])
print(f"Tensor t: {t}")
print(f"t * t: {t*t}") # element wise
print(f"t * t: {t.multiply(t)}") # element wise
print(f"t * t: {t.mul(t)}") # element wise


Tensor t: tensor([1, 2, 3])
t * t: tensor([1, 4, 9])
t * t: tensor([1, 4, 9])
t * t: tensor([1, 4, 9])


In [45]:
## Matrix multiplication
t = torch.tensor([1,2,3])
print(f"Tensor t: {t}")
print(f"t x t: {t@t}") # dot product
print(f"t x t: {t.matmul(t)}") # dot product
print(f"t x t: {torch.matmul(t, t)}") # dot product



Tensor t: tensor([1, 2, 3])
t x t: 14
t x t: 14
t x t: 14


In [46]:
## Matrix multiplication by hand
result = 1*1 + 2*2 + 3*3
result

14
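
For 1-D tensors, matrix multiplication reduces to the dot product computed above. For two matrices, each entry of the result is the dot product of a row of the first with a column of the second. A by-hand sketch with made-up 2x2 matrices, compared against `torch.matmul()`:

```python
import torch

A = torch.tensor([[1, 2],
                  [3, 4]])
B = torch.tensor([[5, 6],
                  [7, 8]])

rows, inner, cols = A.shape[0], A.shape[1], B.shape[1]
result = torch.zeros(rows, cols, dtype=A.dtype)

for i in range(rows):           # each row of A
    for j in range(cols):       # each column of B
        for k in range(inner):  # walk the shared inner dimension
            result[i, j] += A[i, k] * B[k, j]

print(result.tolist())                       # [[19, 22], [43, 50]]
print(torch.equal(result, torch.matmul(A, B)))  # True
```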

In [47]:
%%timeit
## Matrix multiplication by hand 2
result = 0
for elem in t:
    result += elem * elem
result

29.9 μs ± 1.16 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)


In [48]:
%%timeit
t.matmul(t) # vectorized version of matrix multiplication much faster than 'by hand' above.

2.35 μs ± 121 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
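
The `%%timeit` magic only works inside IPython or Jupyter. Outside a notebook, a similar comparison can be sketched with Python's standard `timeit` module. The helper `dot_loop` and the repeat count are assumptions for the example, and the timings depend on the machine, so no expected numbers are shown.

```python
import timeit

import torch

t = torch.tensor([1, 2, 3])

def dot_loop(t):
    """Dot product of t with itself using a plain Python loop."""
    total = 0
    for elem in t:
        total += elem * elem
    return total

# Total seconds for 1,000 calls of each version
loop_s = timeit.timeit(lambda: dot_loop(t), number=1000)
vec_s = timeit.timeit(lambda: t.matmul(t), number=1000)

print(f"loop:   {loop_s / 1000 * 1e6:.2f} μs per call")
print(f"matmul: {vec_s / 1000 * 1e6:.2f} μs per call")
```

Both versions return the same value; only the time taken to compute it differs.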


### One of the most common errors in deep learning: shape errors