# L1.1 - Introduction to ANN

Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed, i.e. they learn directly from data rather than from explicitly programmed rules.
## Data and process
### Focus: Structured data
* Data in Tables (rows and columns)
* Predict a target from features

### Method: manual feature engineering
* Human experts design features
* model learns from these engineered features
* success depends on feature quality

## Models
### Linear models
* core idea: assumes linear relationships
* eg: linear/logistic regression
* traits: fast, interpretable but simple
### Tree Based models
* core idea: learns if-then-else rules
* traits: capture non-linearity
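
The two model families above can be contrasted in a few lines. This is a minimal sketch with made-up numbers: the feature matrix `X`, the weights `w`, the bias `b`, and the threshold rule are illustrative assumptions, not values from a trained model.

```python
import torch

# Hypothetical data: 3 samples, 2 features each
X = torch.tensor([[1.0, 2.0],
                  [3.0, 0.5],
                  [0.0, 4.0]])

# Linear model: the prediction is a weighted sum of the features plus a bias
w = torch.tensor([0.5, -1.0])  # illustrative weights
b = 0.25                       # illustrative bias
linear_pred = X @ w + b

# Tree-style model: the prediction follows an if-then-else rule,
# here "if feature 0 > 0.5 then predict 1.0 else predict 0.0"
tree_pred = torch.where(X[:, 0] > 0.5, torch.tensor(1.0), torch.tensor(0.0))

print(linear_pred)
print(tree_pred)
```

The linear prediction changes smoothly with every feature, while the rule-based prediction jumps at the threshold, which is how tree models capture non-linearity.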

## Limitations
### Feature engineering Bottleneck
Time consuming, requires domain expertise and is often suboptimal
### Inability to handle high dimension data
struggles with images, audio, or raw text
### Limited representation learning
learn shallow patterns, not deep hierarchical features
## New paradigm
Classical ML needs manual feature extraction; DL learns the features automatically.

## Deep Learning
* Models learn directly from raw data
* Handles images, text and audio
* no manual feature extraction

## Factors for Deep learning
* Big data
* computational power
* Better algorithms

# L1.2 - Introduction to PyTorch and Tensors

## Advantages of PyTorch
* dynamic computational graphs allow for flexible model architectures
* pythonic interface that integrates seamlessly with the Python ecosystem
* Extensive debugging capabilities with standard python debugging tools

## Performance Optimization
* Efficient GPU acceleration through CUDA integration
* optimized tensor operations for numerical computations
* Support for distributed training across multiple devices

## Theoretical foundation
tensors are used to represent

## Notation
* Scalar: lowercase italic
* vector: lowercase bold letters
* matrices: Uppercase bold letters
* Tensors: Uppercase bold letters with rank notation

# L1.3 - PyTorch environment

On a local computer

In [None]:
import torch
print(f"PyTorch version: {torch.__version__}")

PyTorch version: 2.8.0+cpu


In [None]:
# Additional libraries
# numpy for numerical operations
import numpy as np

# matplotlib
import matplotlib.pyplot as plt
import matplotlib

print(f"numpy version {np.__version__}")
print(f"matplotlib version {matplotlib.__version__}")

numpy version 2.2.3
matplotlib version 3.10.1


In [None]:
# Check for GPU availability
print(f"CUDA available: {torch.cuda.is_available()}")

CUDA available: False


On Colab

In [None]:
import torch
print(f"PyTorch version: {torch.__version__}")

PyTorch version: 2.8.0+cu126


In [None]:
# Check for GPU availability
print(f"CUDA available: {torch.cuda.is_available()}")

CUDA available: True


# L1.4 - Tensor creation methods

Tensor creation forms the foundation of any deep learning workflow, providing mechanisms to:
1. initialize data structures for inputs, parameters and outputs
2. control numerical precision through data type specification
3. optimize memory usage through appropriate tensor sizing
4. ensure model reproducibility through deterministic initialization
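
Points 2 and 3 can be made concrete with a small sketch. It stores the same three values at three precisions and reports the memory each one needs; the `values` list and the `sizes` dictionary are illustrative names.

```python
import torch

values = [1.0, 2.0, 3.0]

# Same values stored at three different precisions
sizes = {}
for dtype in (torch.float16, torch.float32, torch.float64):
    t = torch.tensor(values, dtype=dtype)
    sizes[dtype] = t.numel() * t.element_size()  # elements x bytes per element
    print(f"{dtype}: {t.element_size()} bytes per element, {sizes[dtype]} bytes total")
```

Halving the precision halves the memory footprint, at the cost of a smaller numeric range and fewer significant digits.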

In [None]:
# scalar
scalar = torch.tensor(7)
print(scalar)

# vector
vector = torch.tensor([1,2,3,4])
print(vector)

# matrix
matrix = torch.tensor([
    [1,2],
    [3,4]
])
print(matrix)

# 3D tensor
tensor = torch.tensor([
    [[1,2],[3,4]],
    [[5,6],[7,8]]
])
print(tensor)

tensor(7)
tensor([1, 2, 3, 4])
tensor([[1, 2],
        [3, 4]])
tensor([[[1, 2],
         [3, 4]],

        [[5, 6],
         [7, 8]]])


In [None]:
tensor.shape

torch.Size([2, 2, 2])

In [None]:
tensor.dim()

3

In [None]:
# Data type specification
float_tensor = torch.tensor([1.0,2.0,3.0], dtype=torch.float32)

print(float_tensor.element_size())

4


In [None]:
# Numpy to Tensor
np_array = np.array([[1,2,3],[4,5,6]])

tensor_from_numpy = torch.from_numpy(np_array)

print(tensor_from_numpy)

tensor([[1, 2, 3],
        [4, 5, 6]])


In [None]:
# memory is shared
np_array[0, 0] = 100
print(tensor_from_numpy)

tensor([[100,   2,   3],
        [  4,   5,   6]])
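
When the shared memory shown above is not wanted, the data can be copied instead. A minimal sketch of two ways to get an independent tensor (the variable names `shared`, `copied`, and `cloned` are illustrative):

```python
import numpy as np
import torch

np_array = np.array([[1, 2, 3], [4, 5, 6]])

shared = torch.from_numpy(np_array)          # shares memory with np_array
copied = torch.tensor(np_array)              # torch.tensor copies the data
cloned = torch.from_numpy(np_array).clone()  # clone() also makes a copy

np_array[0, 0] = 100  # modify the NumPy array in place

print(shared[0, 0].item())  # 100, follows the array
print(copied[0, 0].item())  # 1, unaffected
print(cloned[0, 0].item())  # 1, unaffected
```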


In [None]:
# Create empty, ones, zeros tensors

# zeros tensor
zeros_tensor = torch.zeros(3,3)
print(zeros_tensor)

ones_tensor = torch.ones(3,3)
print(ones_tensor)

# torch.empty does not initialize the values:
# it only allocates memory and shows whatever happens to be stored there
empty_tensor = torch.empty(2,2)
print(empty_tensor)

filled_tensor = torch.full((3,2), 42)
print(filled_tensor)


tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[0., 0.],
        [0., 0.]])
tensor([[42, 42],
        [42, 42],
        [42, 42]])


In [None]:
# Identity matrices
I = torch.eye(3)
I

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])

In [None]:
# non square identity
I2 = torch.eye(2,4)
I2

tensor([[1., 0., 0., 0.],
        [0., 1., 0., 0.]])

In [None]:
# Creating sequential tensors

# 1. linear spacing
# tensor with 5 values evenly spaced between 0 and 1 (both ends included)
linear_tensor = torch.linspace(0,1,5)
linear_tensor

tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])

In [None]:
# 2. logarithmic spacing: 4 values from 10^1 to 10^4
log_tensor = torch.logspace(1, 4, 4)
log_tensor

tensor([   10.,   100.,  1000., 10000.])

In [None]:
# 3. range of values from 0 to 8 (the end value 9 is excluded)
range_tensor = torch.arange(0,9)
range_tensor

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [None]:
# 4. Step tensor
step_tensor = torch.arange(0, 10, 1.75)
step_tensor

tensor([0.0000, 1.7500, 3.5000, 5.2500, 7.0000, 8.7500])

In [None]:
# Diagonal matrices
diag = torch.diag(torch.tensor([1,2,3]))
diag

tensor([[1, 0, 0],
        [0, 2, 0],
        [0, 0, 3]])

In [None]:
mat = torch.tensor([
    [1,2,3],
    [4,5,6],
    [7,8,9]
])
torch.diag(mat)

tensor([1, 5, 9])

# L1.5 - Tensor Manipulation Methods

In [None]:
import torch
# Random Tensor
rand_tensor = torch.rand(size=(3,4))
rand_tensor

tensor([[0.3497, 0.9385, 0.8205, 0.3868],
        [0.3590, 0.1507, 0.6423, 0.8876],
        [0.0165, 0.6045, 0.7368, 0.3323]])

In [None]:
print(f"Min value: {torch.min(rand_tensor).item():.6f}")

print(f"Max value: {torch.max(rand_tensor).item():.6f}")

print(f"mean value: {torch.mean(rand_tensor).item():.6f}")

print(f"sigma value: {torch.std(rand_tensor).item():.6f}")

Min value: 0.016495
Max value: 0.938481
mean value: 0.518757
sigma value: 0.296565


In [None]:
# Theoretical std of a uniform distribution on [0, 1): 1/sqrt(12)
theoretical_std = 1.0/(12 ** 0.5)
print(f"Theoretical std: {theoretical_std:.6f}")

Theoretical std: 0.288675
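
`torch.rand` samples from a uniform distribution on [0, 1), whose standard deviation is 1/sqrt(12). With only 12 samples the measured value (0.2966 above) is a noisy estimate. The sketch below checks that a much larger sample gets closer; the seed and sample size are arbitrary choices.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only to make the run repeatable

theoretical_std = 1.0 / (12 ** 0.5)

small = torch.rand(3, 4)       # 12 samples, same shape as the tensor above
large = torch.rand(1_000_000)  # many more samples

small_err = abs(small.std().item() - theoretical_std)
large_err = abs(large.std().item() - theoretical_std)

print(f"error with 12 samples:        {small_err:.6f}")
print(f"error with 1,000,000 samples: {large_err:.6f}")
```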


In [None]:
# Computer vision
random_image_size_tensor = torch.rand(size=(224,224,3))
print(f"tensor: {random_image_size_tensor}")


tensor: tensor([[[0.7207, 0.5879, 0.5509],
         [0.0262, 0.2247, 0.2531],
         [0.9825, 0.4592, 0.8357],
         ...,
         [0.3068, 0.3260, 0.9011],
         [0.2741, 0.4767, 0.2637],
         [0.1384, 0.5522, 0.8968]],

        [[0.2918, 0.5903, 0.5991],
         [0.1710, 0.9614, 0.9425],
         [0.8386, 0.0910, 0.0216],
         ...,
         [0.1384, 0.9036, 0.6075],
         [0.1949, 0.0901, 0.8175],
         [0.5470, 0.7143, 0.2584]],

        [[0.9564, 0.7055, 0.2621],
         [0.4984, 0.6837, 0.8362],
         [0.7403, 0.2765, 0.1719],
         ...,
         [0.3088, 0.7172, 0.4323],
         [0.0612, 0.1075, 0.0787],
         [0.0738, 0.4626, 0.7423]],

        ...,

        [[0.0538, 0.8010, 0.5307],
         [0.6675, 0.3210, 0.6036],
         [0.4021, 0.3645, 0.1297],
         ...,
         [0.3662, 0.7464, 0.5394],
         [0.3449, 0.6532, 0.2733],
         [0.1032, 0.8769, 0.9875]],

        [[0.2987, 0.5662, 0.3114],
         [0.5840, 0.7161, 0.3661],
    

In [None]:
# bytes per element
random_image_size_tensor.element_size()

4

In [None]:
# total memory
random_image_size_tensor.numel() * random_image_size_tensor.element_size()

602112

Standard image formats in deep learning
* ImageNet standard: 224x224x3
* CIFAR-10: 32x32x3
* MNIST: 28x28x1
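
The memory calculation from the previous cell can be repeated for each of these formats. A minimal sketch, assuming one image stored as `float32` in (height, width, channels) layout; the dictionary names are illustrative.

```python
import torch

# (height, width, channels) for each format listed above
image_shapes = {
    "ImageNet": (224, 224, 3),
    "CIFAR-10": (32, 32, 3),
    "MNIST": (28, 28, 1),
}

bytes_per_image = {}
for name, shape in image_shapes.items():
    image = torch.zeros(shape, dtype=torch.float32)
    bytes_per_image[name] = image.numel() * image.element_size()
    print(f"{name}: {shape} -> {bytes_per_image[name]} bytes")
```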

In [None]:
import torch
zero_to_ten = torch.arange(start=0, end=10, step=1)
zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
# build a tensor of zeros with the same shape and dtype as the input tensor
ten_zeros = torch.zeros_like(input=zero_to_ten)
ten_zeros

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [None]:
float_32_tensor = torch.tensor([3.0,6.0,9.0],
                               dtype=None,
                               device=None,
                               requires_grad=False)
float_32_tensor.dtype

torch.float32

In [None]:
# addition
tensor = torch.tensor([1,2,3])
tensor + 10

tensor([11, 12, 13])

In [None]:
tensor * 10

tensor([10, 20, 30])

In [None]:
# Built-in functions
torch.multiply(tensor, 10)

tensor([10, 20, 30])

In [None]:
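# Matrix multiplication
A = torch.tensor([0, 1, 2])        # 1-D tensor, shape (3,)
B = torch.tensor([[0], [1], [2]])  # 2-D tensor, shape (3, 1)
# Notation 1: torch.matmul (inner dimensions must be compatible; 1-D inputs are allowed)
torch.matmul(A, B)
# Notation 2: torch.mm accepts only 2-D tensors, so it raises the error below because A is 1-D
torch.mm(A, B)
# Notation 3: the @ operator, equivalent to torch.matmul
A @ B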

RuntimeError: self must be a matrix
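
The error comes from `torch.mm`, which requires both operands to be 2-D, while `A` is 1-D. A minimal sketch of two ways around it, using the same `A` and `B` (the names `A_2d`, `out_matmul`, `out_at`, and `out_mm` are illustrative):

```python
import torch

A = torch.tensor([0, 1, 2])        # shape (3,)
B = torch.tensor([[0], [1], [2]])  # shape (3, 1)

# torch.matmul and the @ operator accept a 1-D left operand
out_matmul = torch.matmul(A, B)    # shape (1,)
out_at = A @ B                     # same result as torch.matmul

# torch.mm needs two 2-D operands, so give A a leading dimension first
A_2d = A.unsqueeze(0)              # shape (1, 3)
out_mm = torch.mm(A_2d, B)         # shape (1, 1)

print(out_matmul, out_matmul.shape)
print(out_mm, out_mm.shape)
```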

# L1.6 - Reshaping methods

In [None]:
import torch
x = torch.arange(1., 8)
x, x.shape

(tensor([1., 2., 3., 4., 5., 6., 7.]), torch.Size([7]))

In [None]:
y = x.reshape(1,7)
y, y.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [None]:
z = x.view(1,7)
z, z.shape

(tensor([[1., 2., 3., 4., 5., 6., 7.]]), torch.Size([1, 7]))

In [None]:
z[:, 0] = 5
z, x

(tensor([[5., 2., 3., 4., 5., 6., 7.]]), tensor([5., 2., 3., 4., 5., 6., 7.]))

In [None]:
y[:,1] = 69
y, x

(tensor([[ 5., 69.,  3.,  4.,  5.,  6.,  7.]]),
 tensor([ 5., 69.,  3.,  4.,  5.,  6.,  7.]))
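
In the cells above both `view` and `reshape` share memory with `x`, because `x` is contiguous. They differ on non-contiguous tensors: `view` refuses, while `reshape` falls back to copying. A minimal sketch of that case, using a transposed tensor as the non-contiguous input (all names are illustrative):

```python
import torch

base = torch.arange(6).reshape(2, 3)
transposed = base.t()  # a non-contiguous view of base

print(transposed.is_contiguous())  # False

# view() cannot reinterpret non-contiguous memory
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() copies the data when a view is not possible
flat = transposed.reshape(6)
flat[0] = 99

print(view_failed)        # True
print(base[0, 0].item())  # 0, unchanged because flat is a copy
```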

In [None]:
xs0 = torch.stack([x, x, x, x], dim=0)
xs0

tensor([[ 5., 69.,  3.,  4.,  5.,  6.,  7.],
        [ 5., 69.,  3.,  4.,  5.,  6.,  7.],
        [ 5., 69.,  3.,  4.,  5.,  6.,  7.],
        [ 5., 69.,  3.,  4.,  5.,  6.,  7.]])

In [None]:
xs1 = torch.stack([x, x, x, x], dim=1)
xs1

tensor([[ 5.,  5.,  5.,  5.],
        [69., 69., 69., 69.],
        [ 3.,  3.,  3.,  3.],
        [ 4.,  4.,  4.,  4.],
        [ 5.,  5.,  5.,  5.],
        [ 6.,  6.,  6.,  6.],
        [ 7.,  7.,  7.,  7.]])

In [None]:
xs0.shape, xs1.shape

(torch.Size([4, 7]), torch.Size([7, 4]))

In [None]:
# squeeze
x_squeeze = y.squeeze()
y.shape, x_squeeze.shape, x_squeeze

(torch.Size([1, 7]),
 torch.Size([7]),
 tensor([ 5., 69.,  3.,  4.,  5.,  6.,  7.]))

In [None]:
z = x_squeeze.unsqueeze(dim=1)
z, z.shape

(tensor([[ 5.],
         [69.],
         [ 3.],
         [ 4.],
         [ 5.],
         [ 6.],
         [ 7.]]),
 torch.Size([7, 1]))

In [None]:
x = torch.rand(size=[224,224,3])
y = x.permute(2,0,1)
x.shape, y.shape

(torch.Size([224, 224, 3]), torch.Size([3, 224, 224]))
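
`permute` only reorders the dimensions and returns a view of the same data. The sketch below shows that sharing, then adds a batch dimension to the channels-first tensor. The names `image_hwc`, `image_chw`, and `batch` are illustrative.

```python
import torch

image_hwc = torch.rand(224, 224, 3)     # height, width, channels
image_chw = image_hwc.permute(2, 0, 1)  # channels, height, width

# permute returns a view: both tensors read the same memory
image_hwc[0, 0, 0] = 123.0
print(image_chw[0, 0, 0].item())        # 123.0

# The permuted view is no longer contiguous in memory
print(image_chw.is_contiguous())        # False

# Add a leading batch dimension of size 1
batch = image_chw.unsqueeze(0)
print(batch.shape)                      # torch.Size([1, 3, 224, 224])

# contiguous() makes an independent copy laid out in the new order
image_chw_copy = image_chw.contiguous()
```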

In [None]:
x = torch.arange(1,10).reshape(1,3,3)
x

tensor([[[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]])

# L1.7 - PyTorch and NumPy Integration

In [None]:
import numpy as np
import torch
# From numpy to tensor
x = np.arange(1,8)
t = torch.from_numpy(x)
t

tensor([1, 2, 3, 4, 5, 6, 7])

In [None]:
# From tensor to numpy
y = t.numpy()
y

array([1, 2, 3, 4, 5, 6, 7])
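
Two details are easy to miss in this round trip: the dtype is carried over from NumPy, and `.numpy()` on a CPU tensor shares memory with that tensor. A minimal sketch using a float array (the variable names are illustrative):

```python
import numpy as np
import torch

array_f64 = np.arange(1.0, 8.0)             # NumPy floats default to float64
tensor_f64 = torch.from_numpy(array_f64)    # the dtype is carried over
tensor_f32 = tensor_f64.to(torch.float32)   # explicit conversion, makes a copy

print(array_f64.dtype, tensor_f64.dtype, tensor_f32.dtype)

# Going back: the NumPy array shares memory with the CPU tensor
back = tensor_f32.numpy()
tensor_f32[0] = -1.0
print(back[0])  # -1.0
```

Converting to `float32` is a common step because that is the default dtype PyTorch uses for floating-point tensors.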

# L1.8 - Reproducibility in PyTorch

In [None]:
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)

random_tensor_a = torch.rand(3,4)

random_tensor_a

tensor([[0.8823, 0.9150, 0.3829, 0.9593],
        [0.3904, 0.6009, 0.2566, 0.7936],
        [0.9408, 0.1332, 0.9346, 0.5936]])
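
The seed fixes the starting state of the random number generator, and every call to `torch.rand` advances that state. A minimal sketch of the consequence: the tensors match only when the seed is set again before the second draw (the names `first`, `second`, and `third` are illustrative).

```python
import torch

RANDOM_SEED = 42

torch.manual_seed(RANDOM_SEED)
first = torch.rand(3, 4)

second = torch.rand(3, 4)          # generator state has advanced

torch.manual_seed(RANDOM_SEED)     # reset the generator
third = torch.rand(3, 4)

print(torch.equal(first, second))  # False
print(torch.equal(first, third))   # True
```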

# L1.9 - Hardware Acceleration in PyTorch

In [1]:
!nvidia-smi

Wed Oct 15 15:49:57 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   68C    P8             11W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [2]:
import torch

cuda_available = torch.cuda.is_available()

cuda_available

True

In [4]:
torch.cuda.device_count()

1

In [7]:
torch.cuda.get_device_name(0)

'Tesla T4'

In [9]:
# memory allocated, reserved memory, total memory
torch.cuda.memory_allocated(), torch.cuda.memory_reserved(0), torch.cuda.get_device_properties(0).total_memory

(0, 0, 15828320256)

In [10]:
torch.version.cuda

'12.6'

In [11]:
torch.backends.cudnn.version()

91002

In [13]:
device = "cuda"

In [14]:
x = torch.tensor([1,2,3])
x, x.device

(tensor([1, 2, 3]), device(type='cpu'))

In [16]:
y = x.to(device)
y, y.device

(tensor([1, 2, 3], device='cuda:0'), device(type='cuda', index=0))

In [17]:
z = y.cpu().numpy()
z

array([1, 2, 3])

In [None]:
y = y.to("cpu")
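
Hard-coding `device = "cuda"` fails on a machine without a GPU, such as the local setup in L1.3. A minimal sketch of a device-agnostic version of the cells above, which falls back to the CPU when CUDA is not available:

```python
import torch

# Pick the GPU when one is present, otherwise stay on the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.tensor([1, 2, 3])
y = x.to(device)       # lives on the selected device
z = y.cpu().numpy()    # NumPy needs the data on the CPU

print(y.device, z)
```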

# L1.10 - Common PyTorch Errors

In [1]:
# Shape mismatch in matrix multiplication
# e.g. torch.mm (2-D inputs only) used where torch.matmul was needed

In [2]:
# Incompatible tensor datatypes
# e.g.
# t1 => dtype = torch.float16
# t2 => dtype = torch.float32
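
A minimal sketch of this error and one way to fix it. It uses a `float32`/`float64` pair rather than the `float16`/`float32` pair noted above, as an assumption, to keep the example on dtypes that are well supported on the CPU. The names `dtype_error` and `result` are illustrative.

```python
import torch

t1 = torch.ones(2, 2, dtype=torch.float32)
t2 = torch.ones(2, 2, dtype=torch.float64)

# Matrix multiplication does not promote dtypes automatically
try:
    t1 @ t2
    dtype_error = None
except RuntimeError as err:
    dtype_error = str(err)

# Convert one operand so both share the same dtype
result = t1 @ t2.to(t1.dtype)

print(dtype_error)
print(result.dtype)
```

Element-wise operations such as `t1 + t2` do promote to the wider dtype, so this error mostly shows up in matrix multiplication and in layers built on it.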

In [6]:
# Different devices
import torch
a = torch.arange(1.0,10.0, device='cpu')
b = torch.arange(1.0,10.0, device='cuda')

a.to('cuda')@b

tensor(285., device='cuda:0')

# L1.11 - Linear Regression: Data Creation and Loading