# PyTorch Zero-to-Mastery Tutorial
## About
Hello everyone. My name is Meshkat. I developed this Jupyter notebook for AI enthusiasts who want to understand the PyTorch library: how to use it, when to use it, and what to learn in order to understand all aspects of it.

## 1. Installation
You can install PyTorch with CUDA support. In simple terms, CUDA is NVIDIA's platform for running computations on your NVIDIA GPU. It shortens training time and keeps your data in VRAM (your GPU's memory). We cover more of the details later.

**Note**: Use [this link](https://pytorch.org/get-started/locally/) to find the install command that matches your system configuration and your CUDA version.


In [11]:
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Looking in indexes: https://download.pytorch.org/whl/cu118
Collecting torchaudio
  Using cached https://download.pytorch.org/whl/cu118/torchaudio-2.3.1%2Bcu118-cp310-cp310-win_amd64.whl (4.0 MB)
Installing collected packages: torchaudio
Successfully installed torchaudio-2.3.1+cu118





### Or you can use torch on CPU by running the command below:

In [5]:
!pip install torch torchvision






## 2. Check Installation

### Check what version of PyTorch you're using by:

In [3]:
!pip show torch

Name: torch
Version: 2.3.1+cu118
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: h:\documents\work\pytorch-zero-to-mastery\venv\lib\site-packages
Requires: filelock, fsspec, jinja2, mkl, networkx, sympy, typing-extensions
Required-by: torchaudio, torchvision


### Check whether you have CUDA or not:

In [4]:
import torch

# Check if CUDA is available
if torch.cuda.is_available():
    print("CUDA is available. PyTorch is running on GPU.")
else:
    print("CUDA is not available. PyTorch is running on CPU.")

CUDA is available. PyTorch is running on GPU.
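
Note that even when CUDA is available, tensors are created on the CPU unless you ask otherwise. A common pattern, sketched below, is to pick a device once and move tensors to it. The variable names `device` and `x` are only illustrative.

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A new tensor lives on the CPU by default; .to(device) returns a tensor on the chosen device.
x = torch.ones(2, 2)
x = x.to(device)
print(f"x lives on: {x.device}")
```

Writing code against `device` instead of hard-coding `"cuda"` lets the same notebook run on machines with and without a GPU.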


## 3. Tensors
Tensors are the building blocks of PyTorch. Your data is converted into tensors before any calculation is performed.

In [5]:
import torch

# Creating a tensor from a list
sample_list = [1, 2, 3, 4]
tensor_from_list = torch.tensor(sample_list)
print(f'tensor_from_list: {tensor_from_list}, type: {type(tensor_from_list)}')

# With specific data type
tensor = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float32)
print(f'data type of float32: {tensor}')

# Creating a tensor with random values
# for example: with size of 3*3
random_tensor = torch.randn(3, 3)
print(f'random_tensor: {random_tensor}')

# Creating a tensor filled with zeros
zeros_tensor = torch.zeros(3, 3)
print(f'zeros_tensor: {zeros_tensor}')

# Creating a tensor filled with ones
ones_tensor = torch.ones(3, 3)
print(f'ones_tensor: {ones_tensor}')

# Identity matrix
tensor = torch.eye(3)
print(f'eye tensor: {tensor}')

tensor_from_list: tensor([1, 2, 3, 4]), type: <class 'torch.Tensor'>
data type of float32: tensor([1., 2., 3.])
random_tensor: tensor([[-0.6662, -0.4889, -1.0792],
        [-1.1445, -0.1601, -2.1224],
        [-0.1507,  0.6502, -1.8212]])
zeros_tensor: tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])
ones_tensor: tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
eye tensor: tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
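
Every tensor created above carries three attributes that are worth checking early: its shape, its data type, and the device it lives on. The short sketch below inspects them. The expected values in the comments assume PyTorch's default settings (default dtype `float32`, default device CPU).

```python
import torch

t = torch.zeros(2, 3)
print(t.shape)   # torch.Size([2, 3])
print(t.dtype)   # torch.float32 (the default floating-point dtype)
print(t.device)  # cpu (unless the default device was changed)

# Integer Python values produce an int64 tensor.
ints = torch.tensor([1, 2, 3])
print(ints.dtype)  # torch.int64

# .to(dtype) returns a converted copy; the original tensor keeps its dtype.
as_float = ints.to(torch.float32)
print(as_float.dtype)  # torch.float32
```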




### 3.1. Tensor Operations:
You can perform all the usual arithmetic on tensors. The operators (+, -, *, /, **) are overloaded and are equivalent to calling the matching function `torch.<func>(<input_1>, <input_2>)`, for example `torch.add(a, b)` for `a + b`.

In [6]:
#Example Tensors of a and b:
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# Addition
c = a + b
print(f'add: {torch.add(a, b)}')


# Subtraction
c = a - b
print(f'sub: {torch.sub(a, b)}')


# Multiplication
c = a * b
print(f'mult: {torch.mul(a, b)}')


# Division
c = a / b
print(f'division: {torch.div(a, b)}')

# Exponentiation
c = a ** 2
print(f'power: {torch.pow(a, 2)}')

# Tensor addition
a = torch.tensor([1, 2])
b = torch.tensor([3, 4])
print(a + b)

add: tensor([5, 7, 9])
sub: tensor([-3, -3, -3])
mult: tensor([ 4, 10, 18])
division: tensor([0.2500, 0.4000, 0.5000])
power: tensor([1, 4, 9])
tensor([4, 6])
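
As a quick check of the claim that operators and functions are interchangeable, the sketch below compares both forms with `torch.equal`. It also shows one detail that is easy to miss: `/` performs true division, so integer inputs give a floating-point result. Variable names are illustrative.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# The operator form and the function form produce identical tensors.
same_add = torch.equal(a + b, torch.add(a, b))
same_mul = torch.equal(a * b, torch.mul(a, b))
print(same_add, same_mul)  # True True

# "/" is true division: two integer tensors give a float tensor.
quotient = a / b
print(quotient.dtype)  # torch.float32 with the default dtype settings

# For integer (floor) division, pass rounding_mode="floor".
floor_q = torch.div(b, a, rounding_mode="floor")
print(floor_q)  # tensor([4, 2, 2])
```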


### 3.2. Matrix Operations

In [7]:
# Matrix multiplication
a = torch.randn(2, 3)
b = torch.randn(3, 2)
c = torch.matmul(a, b)

# Element-wise multiplication
c = a * b.T  # Transpose b to match dimensions

# Matrix transpose
c = a.t()

# Matrix inverse
a = torch.randn(3, 3)
a_inv = torch.inverse(a)

# Determinant
det = torch.det(a)

# Eigenvalues and eigenvectors
L_complex, V_complex = torch.linalg.eig(a)
print(f'eigvals = {L_complex}')
print(f'eigvecs = {V_complex}')

eigvals = tensor([-0.5211+0.0000j,  1.0233+0.8075j,  1.0233-0.8075j])
eigvecs = tensor([[-0.6197+0.0000j,  0.7467+0.0000j,  0.7467-0.0000j],
        [-0.5336+0.0000j, -0.1745-0.5960j, -0.1745+0.5960j],
        [ 0.5755+0.0000j,  0.2332+0.0484j,  0.2332-0.0484j]])
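
One way to convince yourself that these results are right is to verify the defining identities numerically. The sketch below uses a small fixed matrix (chosen here only for illustration) and checks that `A @ A_inv` is the identity and that each eigenvector column satisfies `A v = lambda v`. It uses `torch.linalg.inv`, the `torch.linalg` counterpart of `torch.inverse`.

```python
import torch

a = torch.tensor([[2.0, 1.0],
                  [1.0, 3.0]])

# "@" is shorthand for torch.matmul.
a_inv = torch.linalg.inv(a)
identity_ok = torch.allclose(a @ a_inv, torch.eye(2), atol=1e-6)
print(identity_ok)  # True

# torch.linalg.eig always returns complex tensors, even for real eigenvalues.
eigvals, eigvecs = torch.linalg.eig(a)

# Cast A to the same complex dtype, then check A V == V * eigvals.
# Broadcasting scales column j of eigvecs by eigvals[j].
a_c = a.to(eigvecs.dtype)
eig_ok = torch.allclose(a_c @ eigvecs, eigvecs * eigvals, atol=1e-5)
print(eig_ok)  # True
```

Comparisons use `torch.allclose` rather than exact equality because floating-point results carry small rounding errors.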


### 3.3. Advanced Operations

#### 3.3.1. Reshaping and Slicing

In [27]:
a = torch.randn(4, 4)
print(f'a = {a}')

# Reshape (or view)
b = a.view(2, 8)
print(f'view = {b}')
b = a.reshape(2, 8)
print(f'reshape = {b}')

# Squeeze and unsqueeze
b = a.unsqueeze(0)  # Add a dimension
print(f'unsqueeze = {b}')
c = b.squeeze(0)    # Remove a dimension
print(f'squeeze = {c}')

# Transpose and permute
b = a.transpose(0, 1)  # Swap dimensions
print(f'transpose = {b}')

c = a.permute(1, 0)    # Permute dimensions
print(f'permute = {c}')

# Indexing and slicing
b = a[0, :]
print(f'tensor \'a\' with first row and all columns: {b}')
c = a[:, 1]
print(f'tensor \'a\' with all rows and second column: {c}')

d = a[0:2, 1:3]
print(f'tensor \'a\' with rows number of 0 and 1, and column number of 1 and 2: {d}') # 2 by 2 matrix

a = tensor([[ 1.6423, -0.1596, -0.4974,  0.4396],
        [-0.7581,  1.0783,  0.8008,  1.6806],
        [ 1.2791,  1.2964,  0.6105,  1.3347],
        [-0.2316,  0.0418, -0.2516,  0.8599]])
view = tensor([[ 1.6423, -0.1596, -0.4974,  0.4396, -0.7581,  1.0783,  0.8008,  1.6806],
        [ 1.2791,  1.2964,  0.6105,  1.3347, -0.2316,  0.0418, -0.2516,  0.8599]])
reshape = tensor([[ 1.6423, -0.1596, -0.4974,  0.4396, -0.7581,  1.0783,  0.8008,  1.6806],
        [ 1.2791,  1.2964,  0.6105,  1.3347, -0.2316,  0.0418, -0.2516,  0.8599]])
unsqueeze = tensor([[[ 1.6423, -0.1596, -0.4974,  0.4396],
         [-0.7581,  1.0783,  0.8008,  1.6806],
         [ 1.2791,  1.2964,  0.6105,  1.3347],
         [-0.2316,  0.0418, -0.2516,  0.8599]]])
squeeze = tensor([[ 1.6423, -0.1596, -0.4974,  0.4396],
        [-0.7581,  1.0783,  0.8008,  1.6806],
        [ 1.2791,  1.2964,  0.6105,  1.3347],
        [-0.2316,  0.0418, -0.2516,  0.8599]])
transpose = tensor([[ 1.6423, -0.7581,  1.2791, -0.2316],
        [
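
The cell above uses `view` and `reshape` interchangeably, and on a freshly created tensor they do give the same result. They differ on non-contiguous tensors, such as the result of a transpose. The sketch below, with illustrative variable names, shows that `view` refuses to work there while `reshape` falls back to copying, and that a view shares memory with its source.

```python
import torch

a = torch.arange(6).reshape(2, 3)   # tensor([[0, 1, 2], [3, 4, 5]])
t = a.t()                           # shape (3, 2), same storage, not contiguous
print(t.is_contiguous())            # False

# view() needs a compatible memory layout and raises on this tensor.
try:
    t.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)  # True

# reshape() copies the data when a view is not possible.
flat = t.reshape(6)
print(flat)  # tensor([0, 3, 1, 4, 2, 5])

# A view shares memory with the original: writing through it changes `a`.
v = a.view(3, 2)
v[0, 0] = 100
print(a[0, 0])  # tensor(100)
```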

#### Note in using random:
If you don't want `torch.randn()` or `torch.rand()` to generate different values on every run, you can set a seed so that you get the same random values each time (useful for testing).

To do so, just use the code below:
```
torch.manual_seed(42)
```

In [16]:
torch.manual_seed(42)
torch.randn(4, 4)
# your output will always be:
# tensor([[ 1.9269,  1.4873,  0.9007, -2.1055],
#         [ 0.6784, -1.2345, -0.0431, -1.6047],
#         [-0.7521,  1.6487, -0.3925, -1.4036],
#         [-0.7279, -0.5594, -0.7688,  0.7624]])

tensor([[ 1.9269,  1.4873,  0.9007, -2.1055],
        [ 0.6784, -1.2345, -0.0431, -1.6047],
        [-0.7521,  1.6487, -0.3925, -1.4036],
        [-0.7279, -0.5594, -0.7688,  0.7624]])
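
To see the effect of seeding directly, the sketch below draws twice after resetting the same seed and compares the results. It also shows `torch.Generator`, which gives you a separate random stream so that seeding one part of your code does not affect the global state. The names `gen`, `g1`, and `g2` are illustrative.

```python
import torch

# Resetting the global seed reproduces the same draw.
torch.manual_seed(42)
first = torch.randn(2, 2)
torch.manual_seed(42)
second = torch.randn(2, 2)
print(torch.equal(first, second))  # True

# A dedicated generator keeps its own state, independent of the global seed.
gen = torch.Generator().manual_seed(0)
g1 = torch.rand(3, generator=gen)
gen.manual_seed(0)
g2 = torch.rand(3, generator=gen)
print(torch.equal(g1, g2))  # True
```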

#### 3.3.2. Reductions and Aggregations

In [20]:
a = torch.tensor([[1, 2], [3, 4]])
print(f'a = {a}')
# Sum
total = torch.sum(a)
print(f'torch.sum: {total}')

# Mean
mean = torch.mean(a.float())
print(f'torch.mean: {mean}')

# Standard deviation
std = torch.std(a.float())
print(f'torch.std: {std}')

# Min and max
min_val, min_idx = torch.min(a, dim=0)
max_val, max_idx = torch.max(a, dim=1)
print(f'min value: {min_val}, \n'
      f'min index: {min_idx},\n'
      f'max value: {max_val}, \n'
      f'max index: {max_idx}')


# Argmin and argmax
min_idx = torch.argmin(a)
max_idx = torch.argmax(a)
print(f'min_idx = {min_idx}, max_idx = {max_idx}')


a = tensor([[1, 2],
        [3, 4]])
torch.sum: 10
torch.mean: 2.5
torch.std: 1.29099440574646
min value: tensor([1, 2]), 
min index: tensor([0, 0]),
max value: tensor([2, 4]), 
max index: tensor([1, 1])
min_idx = 0, max_idx = 3
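
Most reductions accept a `dim` argument, as `torch.min` and `torch.max` do above. The sketch below shows how `dim` and `keepdim` change the shape of the result, and how to turn the flat index returned by `torch.argmax` back into a row and column. Variable names are illustrative.

```python
import torch

a = torch.tensor([[1, 2], [3, 4]])

# dim=0 collapses the rows, leaving one value per column.
col_sums = a.sum(dim=0)
print(col_sums)  # tensor([4, 6])

# keepdim=True keeps the reduced dimension with size 1.
row_sums = a.sum(dim=1, keepdim=True)
print(row_sums.shape)  # torch.Size([2, 1])

# mean() needs a floating-point tensor, hence the .float() call.
row_means = a.float().mean(dim=1)
print(row_means)  # tensor([1.5000, 3.5000])

# argmax without dim indexes the flattened tensor; divmod recovers (row, col).
flat_idx = torch.argmax(a)
row, col = divmod(flat_idx.item(), a.shape[1])
print(row, col)  # 1 1
```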


#### 3.3.3. Comparison Operations

In [25]:
a = torch.tensor([1, 2, 3])
b = torch.tensor([3, 2, 1])

# Element-wise comparisons
c = a > b
print(f'is \'a\' greater than b: {c}')
p = a >= b
print(f'is \'a\' greater or equal than b: {p}')
d = a == b
print(f'd is the boolean if a equals b or not: {d}')

# Boolean reductions
any_true = torch.any(c)
all_true = torch.all(c)
print(f'c = {c}')
print(f"is any of value of 'c' equals true? : {any_true}")
print(f"are all of the values of 'c' equals true? : {all_true}")


is 'a' greater than b: tensor([False, False,  True])
is 'a' greater or equal than b: tensor([False,  True,  True])
d is the boolean if a equals b or not: tensor([False,  True, False])
c = tensor([False, False,  True])
is any of value of 'c' equals true? : True
are all of the values of 'c' equals true? : False
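
Boolean tensors like `c` are most useful as masks. The sketch below reuses the same `a` and `b` to select elements, count matches, and choose between two tensors element by element with `torch.where`.

```python
import torch

a = torch.tensor([1, 2, 3])
b = torch.tensor([3, 2, 1])
mask = a > b  # tensor([False, False,  True])

# Indexing with a boolean mask keeps only the elements where the mask is True.
selected = a[mask]
print(selected)  # tensor([3])

# Summing a boolean mask counts the True entries.
count = mask.sum()
print(count)  # tensor(1)

# torch.where picks from `a` where the mask is True and from `b` elsewhere.
larger = torch.where(mask, a, b)
print(larger)  # tensor([3, 2, 3])
```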


#### 3.3.4. Broadcasting

In [28]:
a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = torch.tensor([1, 2, 3])

# Broadcast b to match the shape of a
c = a + b
print(f'c = {c}')


c = tensor([[2, 4, 6],
        [5, 7, 9]])
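
Broadcasting compares shapes from the last dimension backwards. Two dimensions are compatible when they are equal or when one of them is 1, and missing leading dimensions are treated as 1. The sketch below illustrates the rule with a column and a row, and shows what happens when shapes are incompatible. The tensors are made up for illustration.

```python
import torch

col = torch.tensor([[10], [20]])   # shape (2, 1)
row = torch.tensor([1, 2, 3])      # shape (3,)

# (2, 1) and (3,) broadcast to (2, 3).
print(torch.broadcast_shapes(col.shape, row.shape))  # torch.Size([2, 3])
result = col + row
print(result)
# tensor([[11, 12, 13],
#         [21, 22, 23]])

# (2, 3) and (2,) do not broadcast: the last dimensions are 3 and 2.
try:
    torch.ones(2, 3) + torch.ones(2)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True
print(mismatch_raised)  # True
```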


### 3.4. Random Tensor Operations
See [this note](#note-in-using-random) on seeding for reproducible results.

In [34]:
# Random tensors
random_tensor = torch.rand(3, 3)  # Uniformly distributed between [0, 1)
print(f'random tensor created with torch.rand: {random_tensor}')
random_tensor = torch.randn(3, 3)  # Standard normal distribution
print(f'random tensor created with torch.randn: {random_tensor}')

# Set the seed for reproducibility
torch.manual_seed(42)

# Random integers
random_int = torch.randint(low=0, high=10, size=(3, 3))
print(f'random integer generation: {random_int}')

# Random choice
choices = torch.multinomial(torch.tensor([0.1, 0.2, 0.3, 0.4]), 3)
print(f'choices = {choices}')

random tensor created with torch.rand: tensor([[0.9346, 0.5936, 0.8694],
        [0.5677, 0.7411, 0.4294],
        [0.8854, 0.5739, 0.2666]])
random tensor created with torch.randn: tensor([[ 1.2211,  0.1511, -0.3319],
        [-0.4785, -0.2631, -0.1786],
        [-1.1859, -0.8860, -0.7150]])
random integer generation: tensor([[2, 7, 6],
        [4, 6, 5],
        [0, 4, 0]])
choices = tensor([2, 3, 0])
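
Two details of the calls above are worth checking for yourself: the `high` bound of `torch.randint` is exclusive, and `torch.multinomial` samples without replacement unless told otherwise. The sketch below checks both, and adds `torch.randperm`, which is handy for shuffling indices. The sample sizes are arbitrary.

```python
import torch

torch.manual_seed(0)

# high is exclusive: values fall in {0, ..., 9}.
ints = torch.randint(low=0, high=10, size=(1000,))
print(ints.min().item() >= 0, ints.max().item() <= 9)  # True True

# Drawing more samples than there are categories requires replacement=True.
probs = torch.tensor([0.1, 0.2, 0.3, 0.4])
samples = torch.multinomial(probs, num_samples=10, replacement=True)
print(samples.shape)  # torch.Size([10])

# randperm returns a random ordering of 0..n-1.
perm = torch.randperm(5)
print(sorted(perm.tolist()))  # [0, 1, 2, 3, 4]
```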


### 3.5. Gradient Operations (Autograd)
This step is crucial for training, and more explanation will be given later. In simple terms, for now: when a tensor `x` is created with `requires_grad=True`, PyTorch records the operations applied to it. Calling `z.backward()` then computes the gradient of the output `z` with respect to `x` (dz/dx) and stores it in `x.grad`, where it can later be used to update parameters based on a loss function.

In other words, Autograd is PyTorch’s automatic differentiation engine that powers neural network training.

In [52]:
# Create a tensor with gradient tracking
x = torch.tensor(2.0, requires_grad=True)
print(f'x = {x}')
# Perform operations
y = x ** 2
print(f'y = {y}')
z = y * 3
print(f'z = {z}')

# Compute gradients
z.backward()

print(x.grad, x)  # dz/dx = 6x => 6*2 = 12

x = 2.0
y = 4.0
z = 12.0
tensor(12.) tensor(2., requires_grad=True)
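
For the function used in this cell, z = 3x², the derivative is dz/dx = 6x, so at x = 2 the gradient is 12. The sketch below confirms that value and shows two behaviours that often surprise newcomers: gradients accumulate across `backward()` calls until you reset them, and `torch.no_grad()` switches tracking off.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

z = 3 * x ** 2
z.backward()
first_grad = x.grad.clone()
print(first_grad)  # tensor(12.)

# A second backward pass ADDS to x.grad instead of replacing it.
z2 = 3 * x ** 2
z2.backward()
accumulated = x.grad.clone()
print(accumulated)  # tensor(24.)

# Reset the stored gradient in place before the next computation.
x.grad.zero_()
print(x.grad)  # tensor(0.)

# Inside no_grad(), results are not tracked for differentiation.
with torch.no_grad():
    w = x * 2
print(w.requires_grad)  # False
```

This accumulation is the reason training loops usually clear gradients at each step.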


### 3.6. Saving and Loading Tensors

In [51]:
# Save tensor to a file
a = torch.randn(3, 3)
torch.save(a, 'tensor.pt')

# Load tensor from a file
tensor_from_file = torch.load('tensor.pt')
tensor_from_file

tensor([[-2.4661,  0.3623,  0.3765],
        [-0.1808,  0.3930,  0.4327],
        [-1.3627,  1.3564,  0.6688]])
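
`torch.save` is not limited to a single tensor. The sketch below saves a dictionary of tensors and loads it back, then checks that the round trip preserved the values. The dictionary keys and the file name `tensors.pt` are made up for illustration, and a temporary directory is used so that no file is left behind. `weights_only=True` restricts loading to tensors and plain containers, and `map_location="cpu"` loads onto the CPU even if the tensors were saved from a GPU.

```python
import os
import tempfile

import torch

tensors = {"weights": torch.randn(3, 3), "bias": torch.zeros(3)}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tensors.pt")
    torch.save(tensors, path)
    loaded = torch.load(path, weights_only=True, map_location="cpu")

# The loaded tensors match the originals exactly.
print(torch.equal(loaded["weights"], tensors["weights"]))  # True
print(torch.equal(loaded["bias"], tensors["bias"]))        # True
```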