# PyTorch Crash Course

## 5 Chapters
1. Introduction to Tensors  
  - Tensor Operations: Create, Numpy, GPU Support
2. Autograd: Automatic Differentiation in PyTorch  
  - Linear Regression Example
3. Training Loop with: Model, Loss, Optimizer  
  - A Typical PyTorch Workflow / training pipeline
4. Neural Network  
  - Also: GPU/MPS, DataLoader, Transforms & Evaluation
5. Convolutional Neural Network (CNN)  
  - Also: Save & Load Model

# 1. Introduction to Tensors

Everything in PyTorch is built on tensors. The tensor is PyTorch's basic data structure; it behaves much like a NumPy array, with added support for GPUs and automatic differentiation.

In essence, a tensor is a multi-dimensional array whose elements all share the same data type.
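To make the "same data type" point concrete, here is a minimal sketch (the variable names are illustrative and not part of the original notebook). It builds tensors with different numbers of dimensions and shows that mixing Python ints and floats still gives one dtype for all elements.

```python
import torch

scalar = torch.tensor(3.0)                # 0-dimensional tensor
vector = torch.tensor([1, 2, 3])          # 1-dimensional tensor
matrix = torch.tensor([[1, 2], [3, 4]])   # 2-dimensional tensor
print(scalar.ndim, vector.ndim, matrix.ndim)

# Mixing ints and floats: every element is stored as a float
mixed = torch.tensor([1, 2.5])
print(mixed.dtype, vector.dtype)
```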

In [16]:
import torch

x = torch.empty(5)
print(f'empty(5): {x} with shape {x.shape}')
x = torch.empty(2,3)
print(f'empty(2,3): {x} with shape: {x.shape}')
x = torch.ones(1)
print(f'ones(1): {x} with shape: {x.shape}')
x = torch.ones(1,2)
print(f'ones(1,2): {x} with shape: {x.shape}')
x = torch.zeros(2)
print(f'zeros(2): {x} with shape: {x.shape}')
x = torch.zeros(2,2)
print(f'zeros(2, 2): {x} with shape: {x.shape}')
x = torch.rand(5)
print(f'rand(5): {x} with shape: {x.shape}')
x = torch.rand(5,5)
print(f'rand(5,5): {x} with shape: {x.shape}')

empty(5): tensor([0., 0., 0., 0., 0.]) with shape torch.Size([5])
empty(2,3): tensor([[0., 0., 0.],
        [0., 0., 0.]]) with shape: torch.Size([2, 3])
ones(1): tensor([1.]) with shape: torch.Size([1])
ones(1,2): tensor([[1., 1.]]) with shape: torch.Size([1, 2])
zeros(2): tensor([0., 0.]) with shape: torch.Size([2])
zeros(2, 2): tensor([[0., 0.],
        [0., 0.]]) with shape: torch.Size([2, 2])
rand(5): tensor([0.6733, 0.4980, 0.4894, 0.5151, 0.1446]) with shape: torch.Size([5])
rand(5,5): tensor([[0.8415, 0.5387, 0.5452, 0.6928, 0.3202],
        [0.7853, 0.2013, 0.9171, 0.4358, 0.1147],
        [0.9142, 0.1029, 0.1442, 0.8652, 0.0256],
        [0.5981, 0.9496, 0.9197, 0.6694, 0.3403],
        [0.4247, 0.6171, 0.0515, 0.8784, 0.1009]]) with shape: torch.Size([5, 5])


We can check the size of a tensor with the `.size()` method or the `.shape` attribute. Both return the same `torch.Size` object.

In [17]:
print(f'Size: {x.size()}') # specific dimension .size(0)
print(f'Shape: {x.shape}') # specific dimension .shape[0]

Size: torch.Size([5, 5])
Shape: torch.Size([5, 5])
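The comments in the cell above mention reading a single dimension. A short sketch of that, assuming a 5x5 tensor like the one used so far (`numel()` is added here as a related call):

```python
import torch

x = torch.rand(5, 5)
rows = x.size(0)     # size of dimension 0 via the method
cols = x.shape[1]    # size of dimension 1 via the attribute
total = x.numel()    # total number of elements in the tensor
print(rows, cols, total)
```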


In [18]:
# check the data type of a tensor using:
print(f'The tensor {x} has dtype: {x.dtype}')

# To create a tensor with a different dtype, pass `dtype` at creation
x = torch.rand(5,5, dtype=torch.float16)
print(f'The tensor {x} has dtype: {x.dtype}')


The tensor tensor([[0.8415, 0.5387, 0.5452, 0.6928, 0.3202],
        [0.7853, 0.2013, 0.9171, 0.4358, 0.1147],
        [0.9142, 0.1029, 0.1442, 0.8652, 0.0256],
        [0.5981, 0.9496, 0.9197, 0.6694, 0.3403],
        [0.4247, 0.6171, 0.0515, 0.8784, 0.1009]]) has dtype: torch.float32
The tensor tensor([[0.4570, 0.9282, 0.7173, 0.4727, 0.1353],
        [0.1509, 0.6665, 0.3784, 0.6333, 0.7949],
        [0.0498, 0.8657, 0.3813, 0.8369, 0.3643],
        [0.0381, 0.1670, 0.9619, 0.4648, 0.5718],
        [0.0898, 0.2207, 0.0068, 0.7817, 0.5752]], dtype=torch.float16) has dtype: torch.float16
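Setting the dtype at creation is the most direct route, but an existing tensor can also be converted afterwards. The sketch below uses `.to()`, which returns a converted copy and leaves the original unchanged (variable names are illustrative):

```python
import torch

x = torch.rand(2, 2)               # float32 by default
x_double = x.to(torch.float64)     # converted copy, x itself is unchanged
x_int = x.to(torch.int32)          # float -> int drops the fractional part
print(x.dtype, x_double.dtype, x_int.dtype)
```

Because `torch.rand` samples from the interval [0, 1), the integer version here contains only zeros.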


We can also construct a tensor from a Python list or a NumPy array.

In [20]:
import numpy as np

np_array = np.array([1,2,3])
x = torch.tensor(np_array)
x

tensor([1, 2, 3])

In [21]:
py_list = [1,2,3]
x = torch.tensor(py_list)
x

tensor([1, 2, 3])

Another important thing to know is that a tensor has a `requires_grad` argument, which defaults to `False`. If we set it to `True`, PyTorch will track the operations on that tensor so it can compute gradients for it.

Put simply, it tells PyTorch that it will need to calculate gradients for this tensor. We need this later in the optimization step, and we set it for all variables in our model that we want to optimize.

In [22]:
x = torch.tensor([5.5, 3], requires_grad=True)
print(x)

tensor([5.5000, 3.0000], requires_grad=True)
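As a small preview of what `requires_grad=True` enables, the sketch below computes a scalar from `x` and calls `backward()`. The function `y = sum(x^2)` is chosen only for illustration; its gradient is `2 * x`.

```python
import torch

x = torch.tensor([5.5, 3.0], requires_grad=True)
y = (x ** 2).sum()   # scalar function of x
y.backward()         # fills x.grad with dy/dx
print(x.grad)        # equals 2 * x for this function
```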


## 1.2 Operations on tensors

This is similar to NumPy arrays. Unless stated otherwise, all operations are element-wise.

In [57]:
x = torch.ones(2,2)
y = torch.rand(2,2)
x, y

(tensor([[1., 1.],
         [1., 1.]]),
 tensor([[0.9356, 0.8502],
         [0.8692, 0.1311]]))

In [58]:
# Elementwise addition
z = x + y
z

tensor([[1.9356, 1.8502],
        [1.8692, 1.1311]])

In [59]:
# Elementwise subtraction, multiplication and division
z = x - y
print(f'subtraction: {z}')
z = x * y
print(f'multiplication: {z}')
z = x / y
print(f'division: {z}')

subtraction: tensor([[0.0644, 0.1498],
        [0.1308, 0.8689]])
multiplication: tensor([[0.9356, 0.8502],
        [0.8692, 0.1311]])
division: tensor([[1.0688, 1.1762],
        [1.1505, 7.6269]])
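To show what "unless stated otherwise" covers, the sketch below contrasts element-wise `*` with the matrix product `@`, and also shows the functional and in-place forms of addition. The input values are made up so the results are easy to check by hand.

```python
import torch

x = torch.tensor([[1., 2.], [3., 4.]])
y = torch.tensor([[10., 20.], [30., 40.]])

elementwise = x * y          # multiplies matching entries
matmul = x @ y               # matrix product, same as torch.matmul(x, y)
same_sum = torch.add(x, y)   # functional form of x + y

y.add_(x)                    # trailing underscore: modifies y in place
print(elementwise, matmul, y, sep='\n')
```

Methods ending in an underscore, such as `add_`, change the tensor they are called on instead of returning a new one.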


Indexing and slicing on torch tensors:

In [60]:
x = torch.rand(5,3,2)
x

tensor([[[0.3224, 0.0053],
         [0.7883, 0.4331],
         [0.2507, 0.8969]],

        [[0.2828, 0.5628],
         [0.0871, 0.4265],
         [0.2292, 0.5341]],

        [[0.4714, 0.1922],
         [0.1939, 0.3335],
         [0.3064, 0.5884]],

        [[0.9674, 0.1169],
         [0.5641, 0.1002],
         [0.3880, 0.6121]],

        [[0.6654, 0.0183],
         [0.9934, 0.4897],
         [0.2130, 0.9235]]])

In [61]:
x[:,1,:]

tensor([[0.7883, 0.4331],
        [0.0871, 0.4265],
        [0.1939, 0.3335],
        [0.5641, 0.1002],
        [0.9934, 0.4897]])
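A few more indexing patterns, sketched on a tensor filled with `0..29` so that each selected value can be traced (this tensor is an illustrative stand-in for the random one above):

```python
import torch

x = torch.arange(30).reshape(5, 3, 2)   # values 0..29

first_block = x[0]         # shape (3, 2): first entry along dimension 0
last_col = x[:, :, -1]     # shape (5, 3): last entry along dimension 2
middle_rows = x[:, 1, :]   # shape (5, 2): same pattern as the cell above
single = x[1, 1, 1]        # 0-dimensional tensor holding one value
print(first_block.shape, last_col.shape, middle_rows.shape, single)
```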

`.item()` converts a single-element tensor into a Python scalar. If the tensor has more than one element, it raises an error.

In [62]:
# Without .item(), indexing returns a 0-dimensional tensor
# (an f-string formats it like a plain number):
print(f'element: {x[1,1,1]} of tensor {x}')
x[1,1,1].item(), type(x[1,1,1].item())

element: 0.4265291690826416 of tensor tensor([[[0.3224, 0.0053],
         [0.7883, 0.4331],
         [0.2507, 0.8969]],

        [[0.2828, 0.5628],
         [0.0871, 0.4265],
         [0.2292, 0.5341]],

        [[0.4714, 0.1922],
         [0.1939, 0.3335],
         [0.3064, 0.5884]],

        [[0.9674, 0.1169],
         [0.5641, 0.1002],
         [0.3880, 0.6121]],

        [[0.6654, 0.0183],
         [0.9934, 0.4897],
         [0.2130, 0.9235]]])


(0.4265291690826416, float)

In [63]:
x[1,1].item()

RuntimeError: a Tensor with 2 elements cannot be converted to Scalar
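For tensors with more than one element, `.tolist()` is the usual way to get plain Python values. A minimal sketch with made-up values; the `except` clause catches both `RuntimeError` and `ValueError` on the assumption that the exception type may differ between PyTorch versions:

```python
import torch

x = torch.tensor([[1.5, 2.5], [3.5, 4.5]])

row = x[1].tolist()   # Python list of floats
print(row)

try:
    x[1].item()       # two elements, so this cannot become a scalar
    item_failed = False
except (RuntimeError, ValueError) as err:
    item_failed = True
    print(f'item() failed: {err}')
```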

Reshaping a tensor:
- Can be done with either `.view()` or `.reshape()`

The main difference between `Tensor.view()` and `Tensor.reshape()` is how they handle memory and storage:

`view()`:
- Creates a view of the original tensor (shares the same memory)
- Does not copy data - just changes how the tensor is interpreted
- Requires the new shape to be compatible with the tensor's current size and strides (a contiguous tensor always qualifies)
- Raises a `RuntimeError` when that is not the case

`reshape()`:
- Can return either a view OR a copy depending on the tensor's memory layout
- Works with both contiguous and non-contiguous tensors
- May create a copy if the tensor is not contiguous
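The difference shows up on a non-contiguous tensor. The sketch below uses a transpose as an illustrative case: `view()` fails on it, while `reshape()` succeeds by copying the data.

```python
import torch

x = torch.arange(6).reshape(2, 3)
t = x.t()                       # transpose: same data, different strides
print(t.is_contiguous())

try:
    t.view(6)                   # layout is incompatible with a flat view
    view_failed = False
except RuntimeError:
    view_failed = True

flat = t.reshape(6)                   # works; has to copy in this case
flat_view = t.contiguous().view(6)    # explicit copy first, then a view
print(view_failed, flat)
```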

In [64]:
x = torch.rand(4,4)
y = x.view(16)
y, x.view(16).shape

(tensor([0.7304, 0.0771, 0.7171, 0.3782, 0.4496, 0.0919, 0.6032, 0.2691, 0.3198,
         0.6993, 0.6584, 0.2737, 0.7963, 0.5940, 0.6395, 0.0698]),
 torch.Size([16]))

In [65]:
z = x.view(-1, 8) # -1 lets PyTorch infer that dimension from the total number of elements (16 / 8 = 2)
z

tensor([[0.7304, 0.0771, 0.7171, 0.3782, 0.4496, 0.0919, 0.6032, 0.2691],
        [0.3198, 0.6993, 0.6584, 0.2737, 0.7963, 0.5940, 0.6395, 0.0698]])

In [66]:
print(f"x storage address: {x.storage().data_ptr()}")
print(f"z storage address: {z.storage().data_ptr()}")

x storage address: 5409951936
z storage address: 5409951936
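Because a view shares storage with its source, writing through the view also changes the original. A small sketch with illustrative values:

```python
import torch

x = torch.zeros(2, 2)
v = x.view(4)
v[0] = 42.0    # write through the view
print(x)       # the original tensor sees the change

same_memory = x.data_ptr() == v.data_ptr()
print(same_memory)
```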


We can convert a tensor to a NumPy array and vice versa.

If a tensor is on the CPU, converting it to NumPy shares the same memory location, so a change to one affects both the tensor and the NumPy array.

In [70]:
x = torch.rand(5,2)
np_x = x.numpy()

In [71]:
x.add_(1)
print(x)
print(np_x)

tensor([[1.4628, 1.5563],
        [1.7423, 1.0034],
        [1.6913, 1.0113],
        [1.0346, 1.1374],
        [1.0591, 1.0742]])
[[1.462799  1.5562607]
 [1.7423112 1.003419 ]
 [1.691334  1.0113358]
 [1.0345505 1.137443 ]
 [1.0590763 1.0742233]]


In [72]:
x == np_x

tensor([[True, True],
        [True, True],
        [True, True],
        [True, True],
        [True, True]])
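When the sharing is not wanted, one option is to clone the tensor before converting it. A minimal sketch (variable names are illustrative). Tensors on a GPU would first need `.cpu()`, which is not shown here.

```python
import torch

x = torch.ones(3)
shared = x.numpy()                # shares memory with x
independent = x.clone().numpy()   # clone first to get a separate buffer

x.add_(1)                         # in-place update of the tensor
print(shared)                     # follows the change
print(independent)                # keeps the old values
```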

Going the other way, a tensor created with `torch.from_numpy()` still shares memory with the NumPy array. To get an independent tensor, create it with `torch.tensor()`, which copies the data.

In [73]:
# Method 1: torch.from_numpy() - SHARES memory
tensor1 = torch.from_numpy(np_array)

# Method 2: torch.tensor() - COPIES data (creates new memory)
tensor2 = torch.tensor(np_array)
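A quick way to check the difference between the two methods is to modify the NumPy array afterwards and see which tensor follows. The names `shared` and `copied` are illustrative:

```python
import numpy as np
import torch

np_array = np.array([1, 2, 3])
shared = torch.from_numpy(np_array)   # shares memory with np_array
copied = torch.tensor(np_array)       # independent copy of the data

np_array[0] = 100                     # modify the NumPy array in place
print(shared)                         # reflects the change
print(copied)                         # unchanged
```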