# Introduction to PyTorch 🔥

<img src="Assets/misc-pytorch-course-launch-cover-white-text-black-background.jpg" alt="Alt Text" width="1024" height="512">

PyTorch is an open-source machine learning framework originally developed by Facebook's AI Research lab (FAIR).
It provides two high-level features:
1. **Tensor computation** (like NumPy) with strong GPU acceleration.
2. **Deep Neural Networks** built on a tape-based autograd system.

In simpler words, PyTorch is a library that helps us do fast numerical computations, especially those involved in building and training deep neural networks.

It has become popular because:
- It is relatively easy to learn and use.
- It uses dynamic computation graphs: the graph of operations is built on the fly as the code runs, which makes debugging and prototyping much easier than in older static-graph frameworks.
- A large community and a wealth of tutorials and resources are available online.
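
To make the dynamic-graph point concrete, here is a minimal sketch. The helper name `branchy_loss` and the input values are made up for illustration; the only point is that an ordinary Python `if` decides which operations autograd records on each call.

```python
import torch


def branchy_loss(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: plain Python control flow picks the branch at run time,
    # and autograd records only the operations that actually execute.
    if x.sum() > 0:
        return (x ** 2).sum()
    return (-x).sum()


x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = branchy_loss(x)   # the sum is positive, so the squared branch runs
loss.backward()          # the gradient of sum(x^2) is 2 * x
print(x.grad)            # tensor([2., 4., 6.])
```

Because the graph is rebuilt on every call, you can step through `branchy_loss` with a regular debugger or add `print` statements inside it.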

## Importing PyTorch

Once you've installed PyTorch, you can import it in Python like this:

In [20]:
import torch

print("PyTorch version:", torch.__version__)

PyTorch version: 2.4.1+cu118


# What is a Tensor?

A **tensor** in PyTorch is the fundamental data structure used to store and manipulate data. It is similar to a NumPy array but with additional capabilities, such as GPU acceleration and automatic differentiation, which are essential for deep learning.

Key Properties of PyTorch Tensors:
- *__Multidimensional__*: Tensors can have any number of dimensions (scalars, vectors, matrices, or higher-dimensional arrays).
- *__Flexible Data Types__*: PyTorch tensors support various data types such as float32, int64, bool, etc.
- *__Supports GPU Acceleration__*: Tensors can be moved between the CPU and GPU for fast computations.
- *__Autograd Support__*: PyTorch tensors can track operations for automatic differentiation (if `requires_grad=True` is set; this is only allowed for floating-point and complex tensors).
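
The properties above can be sketched in a few lines. The variable names and values below are illustrative only, not part of any PyTorch API.

```python
import torch

# Multidimensional: the number of dimensions is available as .ndim
scalar = torch.tensor(3.0)           # 0 dimensions
vector = torch.tensor([1.0, 2.0])    # 1 dimension
matrix = torch.zeros(2, 3)           # 2 dimensions
print(scalar.ndim, vector.ndim, matrix.ndim)

# Flexible data types: the dtype is inferred from the Python values
flags = torch.tensor([True, False])  # torch.bool
counts = torch.tensor([1, 2, 3])     # torch.int64
print(flags.dtype, counts.dtype)

# Autograd support: operations on w are tracked because requires_grad=True
w = torch.tensor([1.0, 2.0], requires_grad=True)
loss = (3 * w).sum()
loss.backward()
print(w.grad)                        # d(loss)/dw is 3 for every element
```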

## Creating Tensors

We can create PyTorch tensors in multiple ways:
1. Directly from Python lists (or nested lists).
2. Using built-in functions like `torch.zeros()`, `torch.ones()`, `torch.rand()`, etc.
3. From NumPy arrays, using `torch.from_numpy()`.

Let's look at some examples.

In [21]:
# I have this python list as my data
data_list = [1.0, 2.0, 3.0]

# I will create a tensor by just calling torch.tensor and give it the data
tensor_from_list = torch.tensor(data_list)

print("Tensor from Python list:", tensor_from_list)

Tensor from Python list: tensor([1., 2., 3.])


In [22]:
# Printing the type of this variable 
tensor_from_list.type()

'torch.FloatTensor'

In [23]:
# Here I am creating a tensor of zeros with dim 2 x 3
zeros_tensor = torch.zeros((2, 3))

print("Tensor of zeros:", zeros_tensor)

Tensor of zeros: tensor([[0., 0., 0.],
        [0., 0., 0.]])


In [24]:
# The same thing but with ones
ones_tensor = torch.ones((2, 3))

print("Tensor of ones:", ones_tensor)

Tensor of ones: tensor([[1., 1., 1.],
        [1., 1., 1.]])


In [25]:
# I am creating a tensor with random values
rand_tensor = torch.rand((2, 3))

print("Random tensor:", rand_tensor)

Random tensor: tensor([[0.7762, 0.4683, 0.8690],
        [0.4742, 0.9484, 0.6742]])


In [26]:
import numpy as np

# I can also convert NumPy arrays into PyTorch tensors (torch.from_numpy shares memory with the array rather than copying it)
np_array = np.array([[1, 2], 
                     [3, 4]])

tensor_from_numpy = torch.from_numpy(np_array)

print("Tensor from NumPy array:", tensor_from_numpy)

Tensor from NumPy array: tensor([[1, 2],
        [3, 4]])
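
One detail worth knowing about `torch.from_numpy()`: the resulting tensor shares memory with the NumPy array, whereas `torch.tensor()` makes a copy. The short sketch below (with made-up variable names) shows the difference.

```python
import numpy as np
import torch

source_array = np.array([1, 2, 3])

shared_tensor = torch.from_numpy(source_array)  # shares memory with source_array
copied_tensor = torch.tensor(source_array)      # independent copy of the data

# Modify the NumPy array in place
source_array[0] = 100

print("Shared tensor:", shared_tensor)  # the first element is now 100
print("Copied tensor:", copied_tensor)  # still holds the original values
```

If you need a tensor that is safe from later changes to the array, copying is the simpler choice.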


# Basic Tensor Properties

Some useful properties and attributes of tensors include:
- `.shape` or `.size()` to get the shape of the tensor.
- `.dtype` to get the data type (e.g., float32, int64, etc.).
- `.device` to see whether the tensor is on CPU or GPU.

Let's explore these:

In [27]:
# Here I am creating a random tensor with values between 0 and 1
example_tensor1 = torch.rand((3, 4), dtype=torch.float64) # dtype: float64

print("Example tensor:\n", example_tensor1)
print("Shape of tensor:", example_tensor1.shape)
print("Size of tensor:", example_tensor1.size())
print("Data type of tensor:", example_tensor1.dtype)
print("Device tensor is on:", example_tensor1.device)

Example tensor:
 tensor([[0.9361, 0.5924, 0.1402, 0.9144],
        [0.4014, 0.9394, 0.1785, 0.3076],
        [0.5186, 0.5787, 0.2095, 0.1230]], dtype=torch.float64)
Shape of tensor: torch.Size([3, 4])
Size of tensor: torch.Size([3, 4])
Data type of tensor: torch.float64
Device tensor is on: cpu


In [28]:
# To create a random integer tensor I need torch.randint, giving it low (inclusive), high (exclusive) and the shape
example_tensor2 = torch.randint(0, 100, (3, 4), dtype=torch.int32) # dtype: int32

print("Example tensor:\n", example_tensor2)
print("Shape of tensor:", example_tensor2.shape)
print("Size of tensor:", example_tensor2.size())
print("Data type of tensor:", example_tensor2.dtype)
print("Device tensor is on:", example_tensor2.device)

Example tensor:
 tensor([[93, 44, 15,  1],
        [29, 63, 71, 78],
        [ 1, 90, 46, 37]], dtype=torch.int32)
Shape of tensor: torch.Size([3, 4])
Size of tensor: torch.Size([3, 4])
Data type of tensor: torch.int32
Device tensor is on: cpu


# Basic Tensor Operations

We can perform many operations on tensors. Some common operations include:
- Element-wise addition, subtraction, multiplication, division
- Matrix multiplication
- Summation, mean, max, min, etc.
- Reshaping and transposing
- Slicing and indexing

Remember: If you're familiar with NumPy, many of these operations are quite similar, but note that the function names and syntax might differ slightly in PyTorch.

## Element-wise Operations

Element-wise operations apply to each element of the tensor individually. 
For example, adding two tensors of the same shape will add the corresponding elements.

In [29]:
a = torch.tensor([1, 2, 3], dtype=torch.float32)
b = torch.tensor([4, 5, 6], dtype=torch.float32)

In [30]:
# Element-wise addition
add_result = a + b

print("Element-wise addition:", add_result)

Element-wise addition: tensor([5., 7., 9.])


In [31]:
# Element-wise subtraction
sub_result = b - a

print("Element-wise subtraction:", sub_result)

Element-wise subtraction: tensor([3., 3., 3.])


In [32]:
# Element-wise multiplication
mul_result = a * b

print("Element-wise multiplication:", mul_result)

Element-wise multiplication: tensor([ 4., 10., 18.])


Be careful not to divide by zero. **With floating-point tensors PyTorch will not raise an error; it returns `inf` (or `nan` for 0/0), which can silently corrupt later computations.**

In [33]:
# Element-wise division
div_result = b / a

print("Element-wise division:", div_result)

Element-wise division: tensor([4.0000, 2.5000, 2.0000])
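
As a small illustration of the division-by-zero warning, the sketch below divides floating-point tensors where some denominators are zero. The values are arbitrary; `torch.isfinite` is one way to spot the problematic entries afterwards.

```python
import torch

numerator = torch.tensor([1.0, -1.0, 0.0, 2.0])
denominator = torch.tensor([0.0, 0.0, 0.0, 4.0])

# No exception is raised for floating-point tensors:
# 1/0 gives inf, -1/0 gives -inf, 0/0 gives nan, and 2/4 gives 0.5
result = numerator / denominator
print("Result:", result)

# Flag which entries are safe to use in later computations
finite_mask = torch.isfinite(result)
print("Finite entries:", finite_mask)
```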


## Matrix Multiplication

For matrix multiplication (which reduces to the dot product when both operands are 1D vectors), 
we can use:
- `torch.matmul(tensor1, tensor2)` 
- Or the `@` operator

When dealing with 2D matrices, be mindful of their shapes. 
For matrix multiplication, the inner dimensions must match: an (m, n) matrix times an (n, p) matrix gives an (m, p) result.
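
The shape rule can be checked directly. In this sketch the tensors are filled with ones because only the shapes matter; the names are illustrative.

```python
import torch

left = torch.ones(2, 3)
right = torch.ones(3, 4)

# (2, 3) @ (3, 4): the inner dimensions (3 and 3) match, so the result is (2, 4)
product = left @ right
print("Product shape:", product.shape)

# (3, 4) @ (2, 3): the inner dimensions (4 and 2) differ, so PyTorch raises an error
try:
    right @ left
except RuntimeError as err:
    print("Shape mismatch:", err)

# For two 1D vectors, matmul computes the dot product: 1*4 + 2*5 + 3*6 = 32
v = torch.tensor([1.0, 2.0, 3.0])
w = torch.tensor([4.0, 5.0, 6.0])
dot = torch.matmul(v, w)
print("Dot product:", dot)
```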


In [34]:
mat1 = torch.tensor([[1, 2], 
                     [3, 4]], dtype=torch.float32)

mat2 = torch.tensor([[5, 6], 
                     [7, 8]], dtype=torch.float32)

# Matrix multiplication with torch.matmul
matmul_result = torch.matmul(mat1, mat2)
print("Matrix multiplication result (using torch.matmul):\n", matmul_result)

# Matrix multiplication using @ operator
matmul_result_2 = mat1 @ mat2
print("\nMatrix multiplication result (using @ operator):\n", matmul_result_2)

Matrix multiplication result (using torch.matmul):
 tensor([[19., 22.],
        [43., 50.]])

Matrix multiplication result (using @ operator):
 tensor([[19., 22.],
        [43., 50.]])


# Tensor Aggregations

PyTorch provides many functions to aggregate or reduce a tensor:
- `torch.sum()`
- `torch.mean()`
- `torch.max()`
- `torch.min()`
- `torch.argmax()` / `torch.argmin()`
- etc.

Called without a `dim` argument, these functions reduce the whole tensor to a single value (or, for `argmax`/`argmin`, to the index of that value in the flattened tensor).

In [35]:
test_tensor = torch.tensor([3, 1, 4, 2, 5], dtype=torch.float32)

print("Sum:", torch.sum(test_tensor))
print("Mean:", torch.mean(test_tensor))
print("Max:", torch.max(test_tensor))
print("Min:", torch.min(test_tensor))
print("Argmax (index of the max value):", torch.argmax(test_tensor))
print("Argmin (index of the min value):", torch.argmin(test_tensor))

Sum: tensor(15.)
Mean: tensor(3.)
Max: tensor(5.)
Min: tensor(1.)
Argmax (index of the max value): tensor(4)
Argmin (index of the min value): tensor(1)
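
The same functions also accept a `dim` argument to reduce along one dimension instead of the whole tensor. The following is a minimal sketch with made-up data.

```python
import torch

scores = torch.tensor([[1.0, 2.0, 3.0],
                       [4.0, 5.0, 6.0]])

# dim=0 collapses the rows, leaving one value per column
col_sums = scores.sum(dim=0)
print("Column sums:", col_sums)      # tensor([5., 7., 9.])

# dim=1 collapses the columns, leaving one value per row
row_means = scores.mean(dim=1)
print("Row means:", row_means)       # tensor([2., 5.])

# With dim, torch.max returns both the max values and their indices
row_max = scores.max(dim=1)
print("Row max values:", row_max.values)
print("Row max indices:", row_max.indices)
```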


# Reshaping and Slicing Tensors

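- **Reshaping**: Changing the shape of a tensor without changing its data. Common methods:
  - `tensor.view(new_shape)`: always returns a view that shares memory with the original, so it only works when the tensor's memory layout allows it (for example, a contiguous tensor).
  - `tensor.reshape(new_shape)`: returns a view when possible and falls back to a copy otherwise.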
- **Slicing**: Selecting specific ranges, rows, columns, or sub-tensors from a tensor.
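
The practical difference between `view` and `reshape` shows up with non-contiguous tensors, such as the result of a transpose. This is a minimal sketch with illustrative names and values.

```python
import torch

base = torch.arange(6)        # tensor([0, 1, 2, 3, 4, 5])
as_view = base.view(2, 3)     # a view: shares memory with base
as_view[0, 0] = 100
print("base[0] after editing the view:", base[0])  # the change is visible in base

transposed = as_view.t()      # shape (3, 2), no longer contiguous in memory
view_failed = False
try:
    transposed.view(6)        # view cannot reinterpret this memory layout
except RuntimeError as err:
    view_failed = True
    print("view failed:", err)

flat = transposed.reshape(6)  # reshape succeeds by copying the data
print("Flattened with reshape:", flat)
```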

### Reshaping

In [36]:
x_1 = torch.arange(1, 13) # arange creates a 1D tensor of the integers 1 to 12 (the end value 13 is excluded)
print("Original tensor:\n", x_1)
print("Shape of original tensor:", x_1.shape)

x_reshaped = x_1.reshape(3, 4)  # or x_1.view(3, 4)
print("\nReshaped tensor (3x4):\n", x_reshaped)
print("Shape of reshaped tensor:", x_reshaped.shape)

Original tensor:
 tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
Shape of original tensor: torch.Size([12])

Reshaped tensor (3x4):
 tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])
Shape of reshaped tensor: torch.Size([3, 4])


#### What happens if the tensor has 11 elements instead of 12?

In [37]:
x_2 = torch.arange(1, 12)
print("Original tensor:\n", x_2)
print("Shape of original tensor:", x_2.shape)

try:
    x_reshaped_2 = x_2.reshape(3, 4)  
    print("\nReshaped tensor (3x4):\n", x_reshaped_2)
    print("Shape of reshaped tensor:", x_reshaped_2.shape)
except Exception as e:
    print("Error: ", e)

Original tensor:
 tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Shape of original tensor: torch.Size([11])
Error:  shape '[3, 4]' is invalid for input of size 11


To check whether a reshape is valid, multiply the dimensions of the target shape; the product must equal the number of elements in the tensor.
* new shape -> (3, 4) = 3 * 4 = 12
* `x_1` has 12 elements, so it can be reshaped to (3, 4).
* `x_2` has 11 elements, so it can ***not*** be reshaped to (3, 4).
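
The element-count check can also be done in code with `.numel()`. The sketch below additionally uses `-1` as one of the dimensions, which asks PyTorch to infer that dimension from the element count. Variable names are illustrative.

```python
import torch

x = torch.arange(1, 13)   # 12 elements
rows, cols = 3, 4

# The reshape is valid only if the element counts agree
can_reshape = x.numel() == rows * cols
print("Can reshape to (3, 4):", can_reshape)

# -1 lets PyTorch work out the missing dimension: 12 / 3 = 4
inferred = x.reshape(3, -1)
print("Inferred shape:", inferred.shape)

# With 11 elements the same check fails, so reshape(3, 4) would raise an error
y = torch.arange(1, 12)
print("Can reshape 11 elements to (3, 4):", y.numel() == rows * cols)
```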

### Slicing Example

In [38]:
x_sliced = x_reshaped[1:, 1:]  # Rows from index 1 to end, columns from index 1 to end
print("Sliced tensor (from reshaped):\n", x_sliced)

Sliced tensor (from reshaped):
 tensor([[ 6,  7,  8],
        [10, 11, 12]])
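
A few more indexing patterns follow the same NumPy-style syntax. This sketch rebuilds the same 3x4 tensor under a new, illustrative name so that it runs on its own.

```python
import torch

grid = torch.arange(1, 13).reshape(3, 4)

first_row = grid[0]       # a single index selects a whole row
last_col = grid[:, -1]    # ":" keeps every row, -1 picks the last column
element = grid[1, 2]      # row 1, column 2 (both zero-based)

print("First row:", first_row)      # tensor([1, 2, 3, 4])
print("Last column:", last_col)     # tensor([ 4,  8, 12])
print("Element at [1, 2]:", element)  # tensor(7)
```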


# Moving Tensors to GPU

To harness the power of GPUs for faster computations, 
you can move your tensor to the GPU (if available).

- Check for GPU with `torch.cuda.is_available()`.
- Create or move a tensor to the GPU using `.to('cuda')` or `tensor.cuda()`.

In [39]:
# Check if GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using device:", device)

# Create a tensor and move it to GPU if available
gpu_tensor = torch.tensor([1.0, 2.0, 3.0], device=device)
print("Tensor on device:", gpu_tensor.device)

# Alternatively, move an existing CPU tensor to GPU
cpu_tensor = torch.tensor([4.0, 5.0, 6.0])
cpu_tensor_to_gpu = cpu_tensor.to(device)
print("Tensor moved to GPU:", cpu_tensor_to_gpu.device)

Using device: cuda
Tensor on device: cuda:0
Tensor moved to GPU: cuda:0


In [40]:
try:
    print(cpu_tensor + cpu_tensor_to_gpu)
except Exception as e:
    print("Error: ", e)

Error:  Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
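
The fix for this error is to put both operands on the same device before combining them. The sketch below is device-agnostic, so it also runs on a machine without a GPU; the variable names are illustrative.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

cpu_values = torch.tensor([4.0, 5.0, 6.0])
device_values = cpu_values.to(device)

# Move the CPU tensor to wherever the other operand lives, then add
total = cpu_values.to(device_values.device) + device_values
print("Sum computed on:", total.device)

# Bring the result back to the CPU, e.g. before converting it to NumPy
total_on_cpu = total.cpu()
print("Result on CPU:", total_on_cpu)
```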


# Question

How can a vector be added to a matrix? In linear algebra, addition is only defined for operands of the same shape.
- In PyTorch (and NumPy) we can do it anyway, thanks to broadcasting.

## What is broadcasting?
Broadcasting is a mechanism that allows PyTorch (and other array libraries like NumPy) to perform arithmetic operations on tensors of different shapes in a consistent way. Instead of throwing an error whenever tensor shapes don't match exactly, PyTorch will "stretch" or broadcast certain dimensions so that the two tensors become compatible for element-wise operations.
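
The compatibility rule is as follows: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal, when one of them is 1, or when one of them is missing. As a quick sketch, `torch.broadcast_shapes` reports the resulting shape without allocating any data; the shapes below are chosen only for illustration.

```python
import torch

# (4, 3) with (3,): the missing leading dimension is treated as 1
shape_a = torch.broadcast_shapes((4, 3), (3,))
# (4, 3) with (4, 1): the size-1 dimension is stretched to 3
shape_b = torch.broadcast_shapes((4, 3), (4, 1))
# (4, 1) with (1, 3): both tensors are stretched
shape_c = torch.broadcast_shapes((4, 1), (1, 3))
print(shape_a, shape_b, shape_c)

# (4, 3) with (4,): the trailing dimensions are 3 and 4, neither is 1, so this fails
try:
    torch.zeros(4, 3) + torch.zeros(4)
except RuntimeError as err:
    print("Not broadcastable:", err)
```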

In [41]:
tensor_2d = torch.tensor([
    [0, 0, 0],
    [10, 10, 10],
    [20, 20, 20],
    [30, 30, 30]
])

# 1D Tensor (shape: [3])
tensor_1d = torch.tensor([1, 2, 3])

# The 1D tensor can be broadcast to shape [4, 3] for element-wise addition.
result_2 = tensor_2d + tensor_1d

print("2D Tensor:\n", tensor_2d)
print("1D Tensor:", tensor_1d)
print("Result of 2D + 1D:\n", result_2)
print("Shape of result:", result_2.shape)

2D Tensor:
 tensor([[ 0,  0,  0],
        [10, 10, 10],
        [20, 20, 20],
        [30, 30, 30]])
1D Tensor: tensor([1, 2, 3])
Result of 2D + 1D:
 tensor([[ 1,  2,  3],
        [11, 12, 13],
        [21, 22, 23],
        [31, 32, 33]])
Shape of result: torch.Size([4, 3])


In [None]:
# 2D tensor of shape (4, 3)
tensor_2d = torch.tensor([
    [ 0,  0,  0],
    [10, 10, 10],
    [20, 20, 20],
    [30, 30, 30]
])

print("Original 2D tensor:\n", tensor_2d)
print("Shape of original:", tensor_2d.shape, "\n")

# Row-wise Broadcasting
# We'll create a (4,1) tensor holding one value per row; each value is stretched across the 3 columns of its row.
tensor_row = torch.tensor([[1],
                           [2],
                           [3],
                           [4]])  # shape: (4, 1)

print("Row-tensor:\n", tensor_row)
print("Shape of row-tensor:", tensor_row.shape)

res_row_broadcast = tensor_2d + tensor_row
print("\nResult of adding (4,3) + (4,1):\n", res_row_broadcast)
print("Shape of result:", res_row_broadcast.shape, "\n")

# Column-wise Broadcasting
# We'll create a (1,3) tensor holding one value per column; it is repeated for each of the 4 rows of the 2D tensor.
tensor_col = torch.tensor([[10, 20, 30]])  # shape: (1, 3)

print("Column-tensor:\n", tensor_col)
print("Shape of column-tensor:", tensor_col.shape)

res_col_broadcast = tensor_2d + tensor_col
print("\nResult of adding (4,3) + (1,3):\n", res_col_broadcast)
print("Shape of result:", res_col_broadcast.shape)

Original 2D tensor:
 tensor([[ 0,  0,  0],
        [10, 10, 10],
        [20, 20, 20],
        [30, 30, 30]])
Shape of original: torch.Size([4, 3]) 

Row-tensor:
 tensor([[1],
        [2],
        [3],
        [4]])
Shape of row-tensor: torch.Size([4, 1])

Result of adding (4,3) + (4,1):
 tensor([[ 1,  1,  1],
        [12, 12, 12],
        [23, 23, 23],
        [34, 34, 34]])
Shape of result: torch.Size([4, 3]) 

Column-tensor:
 tensor([[10, 20, 30]])
Shape of column-tensor: torch.Size([1, 3])

Result of adding (4,3) + (1,3):
 tensor([[10, 20, 30],
        [20, 30, 40],
        [30, 40, 50],
        [40, 50, 60]])
Shape of result: torch.Size([4, 3])


You can find more information about PyTorch and broadcasting in the documentation: https://pytorch.org/docs/stable/torch.html