
In [36]:
import torch
torch.__version__

'2.9.0+cu126'

### Scalar
A 0-dimension tensor, or simply a single number.

In [37]:
scalar = torch.tensor(7)
scalar

tensor(7)

Although the variable `scalar` holds a single number, it is of type `torch.Tensor`.

In [38]:
# Dimension of tensor
scalar.ndim

0

In [39]:
# Retrieve the number from within the tensor
scalar.item()

7

### Vector
A one-dimensional tensor that can contain many numbers. A vector is often described as a quantity with both magnitude and direction.

In [40]:
vector = torch.tensor([7, 7])
vector

tensor([7, 7])

In [41]:
# Dimension
vector.ndim

1

In [42]:
# Shape of vector
vector.shape

torch.Size([2])

Shape tells how the elements inside the tensor are arranged.

### Matrix
A 2-dimensional array of numbers.

In [43]:
MATRIX = torch.tensor(
    [
        [7, 8],
        [9, 10]
    ]
)

MATRIX

tensor([[ 7,  8],
        [ 9, 10]])

In [44]:
# Dimension of matrix
MATRIX.ndim

2

In [45]:
# Shape of matrix
MATRIX.shape

torch.Size([2, 2])

### Tensor
An n-dimensional array of numbers.

In [46]:
TENSOR = torch.tensor(
    [
        [
          [1, 2, 3],
          [3, 6, 9],
          [2, 4, 5]
        ]
    ]
)
TENSOR

tensor([[[1, 2, 3],
         [3, 6, 9],
         [2, 4, 5]]])

In [47]:
# Dimension
TENSOR.ndim

3

In [48]:
# Shape
TENSOR.shape

torch.Size([1, 3, 3])
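
As a quick recap of the four cases above, the sketch below checks that `ndim` always equals the number of entries in `shape`, which in turn matches the nesting depth of the square brackets. The `examples` dictionary is only an illustrative name and is not part of the notebook.

```python
import torch

# Illustrative examples of each kind of tensor described above.
examples = {
    "scalar": torch.tensor(7),
    "vector": torch.tensor([7, 7]),
    "matrix": torch.tensor([[7, 8], [9, 10]]),
    "tensor": torch.tensor([[[1, 2, 3], [3, 6, 9], [2, 4, 5]]]),
}

for name, t in examples.items():
    # ndim is the number of dimensions; shape lists the size of each one.
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
    assert t.ndim == len(t.shape)
```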

## Random Tensors

An ML model often starts out with large tensors of random numbers and adjusts these numbers as it works through data to better represent it:

1. Start with random numbers
2. Look at data
3. Update random numbers
4. Look at data
5. Update random numbers ...
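
The loop above can be sketched with a single random number instead of a large tensor. This is a toy illustration only: the data, the learning rate of `0.01` and the hand-written gradient are assumptions made for the example, and real models would normally rely on autograd and an optimizer instead.

```python
import torch

torch.manual_seed(0)  # only so the sketch is repeatable

# Made-up data: the pattern to discover is y = 3 * x.
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 3.0 * x

# 1. Start with a random number.
w = torch.rand(1)
initial_error = ((w * x - y) ** 2).mean().item()

for _ in range(100):
    # 2. Look at data: how far off is the current guess?
    error = w * x - y
    # 3. Update the number in the direction that reduces the squared error.
    gradient = (2 * error * x).mean()
    w = w - 0.01 * gradient

final_error = ((w * x - y) ** 2).mean().item()
print(f"w = {w.item():.4f}, error: {initial_error:.4f} -> {final_error:.6f}")
```

After enough repetitions the random starting value drifts towards 3, the number that actually describes the data.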

In [49]:
# Create a random tensor of size (3, 4)
random_tensor = torch.rand(
    size = (3, 4)
)
random_tensor

tensor([[0.9470, 0.6319, 0.3612, 0.3986],
        [0.2593, 0.4064, 0.4180, 0.7873],
        [0.2726, 0.0894, 0.4914, 0.0545]])

In [50]:
random_tensor.dtype

torch.float32

In [51]:
# Create a random tensor in the common image shape ([height, width, color_channels])
random_img_size_tensor = torch.rand(size=(224, 224, 3))
random_img_size_tensor.shape

torch.Size([224, 224, 3])

In [52]:
random_img_size_tensor.ndim

3

## Zeros & Ones

Mostly used for masking, where some of the values in a tensor are set to zero to let a model know not to learn from them.

In [53]:
# Create a tensor of all zeros
zeros = torch.zeros(size=(3, 4))
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [54]:
zeros.dtype

torch.float32

In [55]:
# Create a tensor of all ones
ones = torch.ones(size=(3,4))
ones

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

In [56]:
ones.dtype

torch.float32
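
A minimal sketch of the masking idea, assuming a made-up `values` tensor whose last column should be ignored. The names `values`, `mask` and `masked_values` are illustrative.

```python
import torch

values = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

# Start from all ones (keep everything), then zero out the last column.
mask = torch.ones(size=(2, 3))
mask[:, 2] = 0.0

# Element-wise multiplication keeps values where mask is 1, zeros elsewhere.
masked_values = values * mask
print(masked_values)
```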

## Range in tensors
Use `torch.arange()` to create a tensor holding a range of numbers, such as 1 to 10 or 0 to 100.

In [57]:
# Create a range of values from 0 up to (but not including) 10
zero_to_ten = torch.arange(
    start=0,
    end=10,
    step=1
)

zero_to_ten

tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
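
Because `end` is exclusive, the output above stops at 9. The short sketch below (variable names are illustrative) shows how to include the upper bound and how `step` changes the spacing.

```python
import torch

# end is exclusive, so end=11 is needed to include 10.
one_to_ten = torch.arange(start=1, end=11, step=1)

# A larger step skips values: 0, 25, 50, 75, 100.
zero_to_hundred = torch.arange(start=0, end=101, step=25)

print(one_to_ten)
print(zero_to_hundred)
```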

## Tensor with the same shape as another

In [58]:
# Create a zeros tensor similar to another tensor
zeros_tnsr = torch.zeros_like(
    input = zero_to_ten
)
zeros_tnsr

tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

A tensor filled with zeros with the same shape as previous input (zero_to_ten).

In [59]:
# Create a ones tensor similar to another tensor
ones_tnsr = torch.ones_like(
    input = zero_to_ten
)
ones_tnsr

tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
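
Note that the outputs above are integers rather than `0.` and `1.`: the `*_like` functions copy the datatype of the input as well as its shape. A small sketch with illustrative variable names:

```python
import torch

int_range = torch.arange(start=0, end=5)   # integer tensor (torch.int64)
float_rand = torch.rand(size=(2, 3))       # float tensor (torch.float32)

zeros_from_int = torch.zeros_like(input=int_range)
ones_from_float = torch.ones_like(input=float_rand)

# Shape and dtype both follow the input tensor.
print(zeros_from_int.shape, zeros_from_int.dtype)
print(ones_from_float.shape, ones_from_float.dtype)
```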

# Tensor Datatypes

PyTorch offers many different tensor datatypes. Some are specific to the CPU and some are better suited to the GPU.

- Anything mentioning `torch.cuda` -> the tensor is being used on a GPU (Nvidia GPUs use a computing toolkit called CUDA)

- Default type : **32-bit floating point** -> `torch.float32` or `torch.float`

- **16-bit floating point** -> `torch.float16` or `torch.half`

- **64-bit floating point** -> `torch.float64` or `torch.double`

There are also 8-bit, 16-bit, 32-bit and 64-bit integers, plus more!

## Reason for Datatypes : Precision

- Precision is the amount of detail used to describe a number.
- The higher the precision, the more detail, and hence the more data, used to express a number.
- The more detail there is to calculate, the more compute is needed. <br>
So, lower-precision datatypes are generally faster to compute with, but can cost some performance on evaluation metrics like accuracy.
<br>

Resources:
- https://docs.pytorch.org/docs/stable/tensors.html#data-types
- https://en.wikipedia.org/wiki/Precision_(computer_science)
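
To make the precision trade-off concrete, the sketch below stores the same number, `0.1`, in three floating-point datatypes and compares memory use and rounding error. The choice of `0.1` and the variable names are illustrative assumptions.

```python
import torch

dtypes = (torch.float16, torch.float32, torch.float64)

# element_size() is the number of bytes used per element.
sizes = [torch.tensor(0.1, dtype=d).element_size() for d in dtypes]

# .item() converts to a Python float, exposing how far the stored
# value is from the 0.1 that was requested.
errors = [abs(torch.tensor(0.1, dtype=d).item() - 0.1) for d in dtypes]

for d, size, err in zip(dtypes, sizes, errors):
    print(f"{d}: {size} bytes per element, rounding error = {err:.2e}")
```

Fewer bytes per element means less memory and less data to move, at the cost of a larger rounding error.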


In [60]:
# Create tensors with default datatype
float32_tnsr = torch.tensor(
    [3.0, 6.0, 9.0],
    dtype = None, # defaults to None, so the dtype is inferred from the data (torch.float32 for Python floats)
    device = None, # defaults to None, which uses the current default device (CPU here)
    requires_grad = False # if True, operations performed on the tensor are recorded for gradient computation
)
float32_tnsr.shape

torch.Size([3])

In [61]:
float32_tnsr.dtype, float32_tnsr.device

(torch.float32, device(type='cpu'))

The most common issues when using PyTorch are:
- shape issues (tensor shapes don't match up),
- datatype issues, and
- device issues.
<br>

PyTorch generally expects the tensors in an operation to share the same format. If one tensor is `dtype = torch.float32` and the other is `dtype = torch.float16`, some operations (matrix multiplication, for example) raise an error, while element-wise arithmetic typically promotes the result to the wider type.
<br>

Also, if one tensor is on the CPU and another is on the GPU, the operation raises an error, because PyTorch needs the tensors in a calculation to be on the same device.
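
How strict PyTorch is about mixed datatypes depends on the operation, so it is worth checking rather than assuming. The sketch below is one way to probe this on the CPU: element-wise arithmetic promotes to the wider type, while the mixed-dtype `torch.matmul` call is wrapped in `try`/`except` because it is expected to be rejected. The variable names are illustrative.

```python
import torch

a32 = torch.ones(size=(2, 2), dtype=torch.float32)
a16 = torch.ones(size=(2, 2), dtype=torch.float16)

# Element-wise arithmetic promotes to the wider dtype.
promoted = a32 * a16
print(promoted.dtype)

# Matrix multiplication may refuse mixed dtypes, so catch the error to inspect it.
try:
    torch.matmul(a32, a16)
    print("matmul accepted mixed dtypes")
except RuntimeError as err:
    print(f"matmul failed: {err}")

# Converting one operand explicitly makes the intent clear either way.
fixed = torch.matmul(a32, a16.to(torch.float32))
print(fixed.dtype)
```

The same `.to()` method also accepts a device, which is the usual fix for device mismatches.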

In [62]:
float16_tnsr = torch.tensor(
    [3.0, 6.0, 9.0],
    dtype=torch.float16
)
float16_tnsr.dtype

torch.float16

# Getting information from tensors

The three most common attributes to check on a tensor:
- shape -> what shape is the tensor?
  - Some operations require specific shape rules.
- dtype -> what datatype are the elements within the tensor stored in?
- device -> what device is the tensor stored on? (GPU/CPU)

In [63]:
# Create a random tensor and find out its details
tnsr = torch.rand(size=(3, 4))
print(tnsr)
print(f"Shape of tensor: {tnsr.shape}")
print(f"Datatype of tensor: {tnsr.dtype}")
print(f"Device tensor is stored on: {tnsr.device}")

tensor([[0.1426, 0.8419, 0.6978, 0.6939],
        [0.4898, 0.7671, 0.4878, 0.2557],
        [0.1329, 0.8066, 0.1911, 0.0597]])
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


***Note**: When you run into issues in PyTorch, the cause is very often one of the three attributes above. So when error messages show up, sing yourself a little song called "what, what, where":*

- what shape are my tensors?
- what datatype are they and
- where are they stored? <br>

"what shape, what datatype, where where where"

# Manipulating Tensors (Tensor Ops)

### Basic Operations:
- Addition (+)
- Subtraction (-)
- Multiplication (*)
- Division (/)

In [64]:
# Create a tensor & add number to it
tnsr = torch.tensor([1, 2, 3])
tnsr + 10

tensor([11, 12, 13])

In [65]:
# Multiply tensor by 10
tnsr * 10

tensor([10, 20, 30])

In [66]:
tnsr

tensor([1, 2, 3])

The values inside a tensor don't change unless the result of the operation is reassigned to the variable.

In [67]:
# Subtract and reassign
tnsr = tnsr - 10
tnsr

tensor([-9, -8, -7])

In [68]:
# Add & reassign
tnsr = tnsr + 10
tnsr

tensor([1, 2, 3])

In [69]:
# PyTorch Built-in functions
torch.multiply(tnsr, 10)

tensor([10, 20, 30])

In [70]:
torch.add(tnsr, 10)

tensor([11, 12, 13])

In [71]:
tnsr

tensor([1, 2, 3])

Original tensor is still unchanged.
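
Reassignment is not the only way to change a tensor. PyTorch also provides in-place variants of many operations, marked by a trailing underscore. A small sketch contrasting the two (variable names are illustrative):

```python
import torch

t = torch.tensor([1, 2, 3])

# Out-of-place: returns a new tensor, t is untouched.
result = t.add(10)
t_before = t.clone()

# In-place (note the trailing underscore): t itself is modified.
t.add_(10)

print(result, t_before, t)
```

Out-of-place operations are the safer default, since in-place changes affect every reference to the same tensor.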

In [72]:
# Element-wise multiplication
print(tnsr, '*', tnsr)
print('Equals:', tnsr * tnsr)

tensor([1, 2, 3]) * tensor([1, 2, 3])
Equals: tensor([1, 4, 9])


## Matrix Multiplication

2 rules for matrix multiplication in PyTorch are:
- Inner dimensions must match.
- Resulting matrix has the shape of the outer dimensions.
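
The two rules can be checked directly on random tensors. In this sketch the shapes `(3, 2)` and `(2, 4)` are arbitrary choices for illustration.

```python
import torch

a = torch.rand(size=(3, 2))
b = torch.rand(size=(2, 4))

# Inner dimensions match: (3, 2) @ (2, 4) -> 2 == 2
# Result takes the outer dimensions: (3, 4)
product = torch.matmul(a, b)
print(product.shape)

# Swapping the order breaks the first rule: (2, 4) @ (3, 2) -> 4 != 3
try:
    torch.matmul(b, a)
    swapped_ok = True
except RuntimeError as err:
    swapped_ok = False
    print(f"Swapped order failed: {err}")
```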

In [73]:
tnsr

tensor([1, 2, 3])

In [74]:
tnsr.shape

torch.Size([3])

The difference between element-wise multiplication and matrix multiplication is that matrix multiplication also adds up the element-wise products.

In [75]:
# Element-wise matrix multiplication
tnsr * tnsr

tensor([1, 4, 9])

In [76]:
# Matrix multiplication
torch.matmul(tnsr, tnsr)

tensor(14)

In [77]:
tnsr @ tnsr

tensor(14)

In Python, matrix multiplication can also be performed using the `@` operator.
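
For 1-D tensors, the result of `@` is the sum of the element-wise products, which is where the `14` comes from. A short sketch with illustrative names:

```python
import torch

t = torch.tensor([1, 2, 3])

element_wise = t * t          # [1*1, 2*2, 3*3]
summed = element_wise.sum()   # 1 + 4 + 9
matrix_product = t @ t        # same value for 1-D tensors

print(element_wise, summed, matrix_product)
```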

In [78]:
%%time
# Matrix multiplication by hand
value = 0
for i in range(len(tnsr)):
  value += tnsr[i] * tnsr[i]
value

CPU times: user 604 µs, sys: 59 µs, total: 663 µs
Wall time: 2.21 ms


tensor(14)

In [79]:
%%time
torch.matmul(tnsr, tnsr)

CPU times: user 713 µs, sys: 0 ns, total: 713 µs
Wall time: 654 µs


tensor(14)
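
With only three elements the timings above are dominated by overhead, so the difference is hard to see. The sketch below repeats the comparison on a 1000-element tensor using `time.perf_counter()` instead of `%%time`, so it also runs outside a notebook. The size and variable names are arbitrary, and the measured times will vary from machine to machine.

```python
import time
import torch

big = torch.arange(start=0, end=1000)  # 1000 integers: 0..999

# Multiply and add one element at a time in a Python loop.
start = time.perf_counter()
loop_value = 0
for i in range(len(big)):
    loop_value += big[i] * big[i]
loop_seconds = time.perf_counter() - start

# Let PyTorch do the same computation in a single call.
start = time.perf_counter()
matmul_value = torch.matmul(big, big)
matmul_seconds = time.perf_counter() - start

print(f"Loop:   {loop_value} in {loop_seconds:.6f} s")
print(f"matmul: {matmul_value} in {matmul_seconds:.6f} s")
```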

# Most common errors in Deep Learning

## Shape Errors

Shape mismatch is the most common error while performing operations on matrices.

In [80]:
tnsr_A = torch.tensor(
    [
        [1, 2],
        [3, 4],
        [5, 6]
    ],
    dtype=torch.float32
)

tnsr_B = torch.tensor(
    [
        [7, 10],
        [8, 11],
        [9, 12]
    ],
    dtype = torch.float32
)

In [81]:
tnsr_A.shape, tnsr_B.shape

(torch.Size([3, 2]), torch.Size([3, 2]))

In [82]:
torch.matmul(tnsr_A, tnsr_B)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x2)

To multiply tensors with mismatched shapes, we can transpose one of them so that the inner dimensions match.

In [83]:
tnsr_A, tnsr_B

(tensor([[1., 2.],
         [3., 4.],
         [5., 6.]]),
 tensor([[ 7., 10.],
         [ 8., 11.],
         [ 9., 12.]]))

In [84]:
tnsr_B.T

tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])

In [87]:
print(f"New shapes: tensor_A = {tnsr_A.shape} (same as above), \
tensor_B = {tnsr_B.T.shape}\n")
print(f"Multiplying: {tnsr_A.shape} * {tnsr_B.T.shape} <- inner dimension \
match\n")
print("Output: \n")
output = torch.matmul(tnsr_A, tnsr_B.T)
print(output)
print(f"\n Output shape: {output.shape}")

New shapes: tensor_A = torch.Size([3, 2]) (same as above), tensor_B = torch.Size([2, 3])

Multiplying: torch.Size([3, 2]) * torch.Size([2, 3]) <- inner dimension match

Output: 

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

 Output shape: torch.Size([3, 3])


In [88]:
# torch.mm <- shorter alternative to matmul for 2-D matrices (no broadcasting)
torch.mm(tnsr_A, tnsr_B.T)

tensor([[ 27.,  30.,  33.],
        [ 61.,  68.,  75.],
        [ 95., 106., 117.]])

***Note**: Such matrix multiplication is also referred to as the **dot product** of 2 matrices.*

`torch.nn.Linear()` <br>

- This module is also known as a **Feed-Forward Layer** or **Fully Connected Layer**.
- It implements a matrix multiplication between an input `x` and a weight matrix `W`:

$$y=x\cdot{W^T} + b$$

- `x` is the input to the layer.
- `W` is the weight matrix created by the layer. It starts out as random numbers that get adjusted as the neural network learns to better represent patterns in the data.
- `b` is the bias term, used to slightly offset the product of the weights and inputs.
- `y` is the output.

This is a linear function, the same form as the equation of a straight line, $y = mx + b$.
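
The formula can be checked against the layer itself. The sketch below assumes a fresh layer with an arbitrary seed and compares the layer output with the formula written out by hand using the layer's `weight` and `bias` attributes. The names `layer`, `by_hand` and `by_layer` are illustrative.

```python
import torch

torch.manual_seed(0)  # arbitrary seed so the sketch is repeatable
layer = torch.nn.Linear(in_features=2, out_features=6)

x = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # shape (3, 2)

# layer.weight has shape (out_features, in_features) = (6, 2),
# which is why it is transposed in the formula.
by_hand = x @ layer.weight.T + layer.bias
by_layer = layer(x)

print(layer.weight.shape, by_layer.shape)
print(torch.allclose(by_hand, by_layer, atol=1e-6))
```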

In [89]:
# Starting linear layers with random weights matrix
torch.manual_seed(42) # for reproducibility
linear = torch.nn.Linear(
    in_features=2, # matches inner dimension of input
    out_features=6 # describes outer value
)

x = tnsr_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output: \n{output}\n Output shape: {output.shape}")

Input shape: torch.Size([3, 2])

Output: 
tensor([[2.2368, 1.2292, 0.4714, 0.3864, 0.1309, 0.9838],
        [4.4919, 2.1970, 0.4469, 0.5285, 0.3401, 2.4777],
        [6.7469, 3.1648, 0.4224, 0.6705, 0.5493, 3.9716]],
       grad_fn=<AddmmBackward0>)
 Output shape: torch.Size([3, 6])


In [90]:
# Changing `in_features` from 2 to 3
linear = torch.nn.Linear(
    in_features=3, # no longer matches the inner dimension of the input (2)
    out_features=6 # describes outer value
)

x = tnsr_A
output = linear(x)
print(f"Input shape: {x.shape}\n")
print(f"Output: \n{output}\n Output shape: {output.shape}")

RuntimeError: mat1 and mat2 shapes cannot be multiplied (3x2 and 3x6)

In [93]:
linear = torch.nn.Linear(
    in_features=3, # matches inner dimension of input
    out_features=6 # describes outer value
)

x = tnsr_A
output = linear(x.T)
print(f"Input shape: {x.T.shape}\n")
print(f"Output: \n{output}\n Output shape: {output.shape}")

Input shape: torch.Size([2, 3])

Output: 
tensor([[ 1.5145, -0.8616,  1.4147, -2.7991, -1.5690, -1.1473],
        [ 2.0582, -1.4256,  1.7337, -3.1335, -1.4399, -1.8924]],
       grad_fn=<AddmmBackward0>)
 Output shape: torch.Size([2, 6])


***Note: Matrix Multiplications is all you need.***
https://marksaroufim.substack.com/p/working-class-deep-learner

## Aggregation of Tensor
Finding min, max, mean, sum, etc.

In [94]:
x = torch.arange(
    start=0,
    end=100,
    step=10
)
x

tensor([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90])

In [95]:
print(f"Minimum: {x.min()}")
print(f"Maximum: {x.max()}")
print(f"Mean: {x.type(torch.float32).mean()}")
print(f"Sum: {x.sum()}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


***Note**: Methods such as `torch.mean()` require a floating-point (or complex) datatype. Calling them on an integer tensor such as `torch.int64` raises an error, which is why `x` is converted with `.type(torch.float32)` first.*
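
A short sketch of the failure and two ways around it. The second fix uses the `dtype` argument of `torch.mean()`, which is not used elsewhere in this notebook; the variable names are illustrative.

```python
import torch

x = torch.arange(start=0, end=100, step=10)  # dtype is torch.int64

# Calling mean() directly on an integer tensor is rejected.
try:
    x.mean()
    int_mean_failed = False
except RuntimeError as err:
    int_mean_failed = True
    print(f"mean() on {x.dtype} failed: {err}")

# Two ways to get a floating-point mean.
mean_via_cast = x.type(torch.float32).mean()
mean_via_dtype = torch.mean(x, dtype=torch.float32)

print(mean_via_cast, mean_via_dtype)
```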

In [96]:
# Using torch methods
print(f"Minimum: {torch.min(x)}")
print(f"Maximum: {torch.max(x)}")
print(f"Mean: {torch.mean(x.type(torch.float32))}")
print(f"Sum: {torch.sum(x)}")

Minimum: 0
Maximum: 90
Mean: 45.0
Sum: 450


## Positional min/max
Finding the index of a tensor where the max or min occurs

In [97]:
tnsr = torch.arange(
    start=10,
    end=100,
    step=10
)
print(f"Tensor: {tnsr}")

Tensor: tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])


In [98]:
print(f"Index with max value: {tnsr.argmax()}")
print(f"Index with min value: {tnsr.argmin()}")

Index with max value: 8
Index with min value: 0
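
The returned index is itself a tensor and can be used to look up the value. For tensors with more than one dimension, the `dim` argument selects the axis to search along. A sketch, with a made-up `scores` matrix for illustration:

```python
import torch

t = torch.arange(start=10, end=100, step=10)

# The returned index can be used to look up the value itself.
max_index = t.argmax()
max_value = t[max_index]

# For a 2-D tensor, dim selects the axis to search along.
scores = torch.tensor([[1, 9, 3],
                       [7, 2, 8]])
per_row = scores.argmax(dim=1)   # index of the max in each row
flat = scores.argmax()           # index into the flattened tensor

print(max_index, max_value, per_row, flat)
```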


## Change tensor datatype
