
# PyTorch Crash Course

## 1. Tensors

Everything in PyTorch is based on Tensor operations. A Tensor is a multi-dimensional matrix containing elements of a single data type:


In [None]:
import torch

# torch.empty(size): uninitialized memory, so the values are arbitrary
x = torch.empty(1) # 1-element tensor (1-D), not a true 0-D scalar
print("empty(1):", x)
x = torch.empty(3) # vector
print("empty(3):",x)
x = torch.empty(2, 3) # matrix
print("empty(2,3):",x)
x = torch.empty(2, 2, 3) # tensor, 3 dimensions
#x = torch.empty(2,2,2,3) # tensor, 4 dimensions
print("empty(2, 2, 3):",x)

# torch.rand(size): uniform random numbers in [0, 1)
x = torch.rand(5, 3)
print("rand(5,3):", x)

# torch.zeros(size), fill with 0
# torch.ones(size), fill with 1
x = torch.zeros(5, 3)
print("zeros(5,3):", x)

empty(1): tensor([0.])
empty(3): tensor([ 3.3631e-44,  1.3563e-19, -3.8383e+14])
empty(2,3): tensor([[4.1288e+29, 3.1116e-41, 4.6171e+29],
        [3.1116e-41, 8.9683e-44, 0.0000e+00]])
empty(2, 2, 3): tensor([[[ 0.0000e+00,  0.0000e+00,  0.0000e+00],
         [ 0.0000e+00,  0.0000e+00,  0.0000e+00]],

        [[ 9.1084e-44,  0.0000e+00, -5.8022e+14],
         [ 4.3108e-41,  4.1224e+29,  3.1116e-41]]])
rand(5,3): tensor([[0.6477, 0.2010, 0.7509],
        [0.5010, 0.9053, 0.5927],
        [0.8208, 0.4227, 0.9219],
        [0.1728, 0.0976, 0.3212],
        [0.1968, 0.9312, 0.7002]])
zeros(5,3): tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])


In [None]:
# check size
print("size", x.size())  # x.size(0)
print("shape", x.shape)  # x.shape[0]

size torch.Size([5, 3])
shape torch.Size([5, 3])


**.shape** is an alias for **.size()**, added to match NumPy more closely. See [this discussion](https://stackoverflow.com/questions/63263292/what-is-the-difference-between-tensor-size-and-tensor-shape-in-pytorch) for details.
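As a small sketch of that equivalence (the variable names `rows` and `cols` are only illustrative), both spellings return a `torch.Size`, which is a subclass of Python's `tuple` and can be unpacked like one:

```python
import torch

x = torch.zeros(5, 3)

# .size() and .shape describe the same dimensions
same = x.size() == x.shape
print(same)

# torch.Size is a tuple subclass, so it unpacks like a tuple
rows, cols = x.shape
print(rows, cols)
print(isinstance(x.shape, tuple))
```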


In [None]:
# get the length of the first axis
print("size(0)", x.size(0))
print("shape[0]",x.shape[0])

size(0) 5
shape[0] 5


In [None]:
# check data type
print(x.dtype)

# specify the dtype explicitly (float32 is the default)
x = torch.zeros(5, 3, dtype=torch.float16)
print(x)

# check type
print(x.dtype)

torch.float32
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], dtype=torch.float16)
torch.float16


In [None]:
# construct from data
x = torch.tensor([5.5, 3])
print(x, x.dtype)

tensor([5.5000, 3.0000]) torch.float32
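The dtype is inferred from the data when none is given. The sketch below assumes PyTorch's default floating-point dtype has not been changed (for example via `torch.set_default_dtype`); the variable names are illustrative:

```python
import torch

ints = torch.tensor([1, 2, 3])      # only Python ints -> torch.int64
floats = torch.tensor([5.5, 3])     # a float in the data -> torch.float32
as_half = floats.to(torch.float16)  # explicit cast, returns a new tensor

print(ints.dtype, floats.dtype, as_half.dtype)
```

Note that `.to(dtype)` leaves `floats` itself unchanged.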


In [None]:
# requires_grad argument
# This tells PyTorch to track operations on this tensor so that gradients
# can be computed for it later, during the optimization steps,
# i.e. this is a model variable that we want to optimize
x = torch.tensor([5.5, 3], requires_grad=True)
print(x)

tensor([5.5000, 3.0000], requires_grad=True)
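To make the effect of `requires_grad=True` concrete, here is a minimal sketch. The function `y = sum(x**2)` is an arbitrary choice for illustration, not something from this notebook:

```python
import torch

x = torch.tensor([5.5, 3.0], requires_grad=True)

# Build a scalar from x; autograd records the operations
y = (x ** 2).sum()

# Compute dy/dx and store it in x.grad
y.backward()

# For y = sum(x^2) the gradient is 2 * x, i.e. 11 and 6
print(x.grad)
```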


#### Operations with Tensors

In [None]:
# Operations
x = torch.ones(2, 2)
y = torch.rand(2, 2)

# elementwise addition
z = x + y
# torch.add(x,y)

# in-place addition: every method with a trailing underscore is an in-place operation,
# i.e. it modifies the tensor it is called on
# y.add_(x)

print(x)
print(y)
print(z)

tensor([[1., 1.],
        [1., 1.]])
tensor([[0.8550, 0.8337],
        [0.8552, 0.0211]])
tensor([[1.8550, 1.8337],
        [1.8552, 1.0211]])
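The difference between `add` and `add_` can be sketched as follows, using small fixed tensors rather than random ones so the result is easy to check:

```python
import torch

x = torch.ones(2, 2)
y = torch.zeros(2, 2)

z = y.add(x)           # out-of-place: returns a new tensor, y stays all zeros
y_unchanged = torch.equal(y, torch.zeros(2, 2))

ptr_before = y.data_ptr()
y.add_(x)              # in-place: y itself now holds y + x
same_memory = y.data_ptr() == ptr_before

print(y_unchanged, same_memory)
print(y)
```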


In [None]:
# subtraction
z = x - y
z = torch.sub(x, y)

# multiplication
z = x * y
z = torch.mul(x,y)

# division
z = x / y
z = torch.div(x,y)

In [None]:
# Slicing
x = torch.rand(5,3)
print(x)
print("x[:, 0]", x[:, 0]) # all rows, column 0
print("x[1, :]", x[1, :]) # row 1, all columns
print("x[1, 1]", x[1,1]) # element at 1, 1

# .item() returns the value as a Python number; it only works on one-element tensors
print("x[1,1].item()", x[1,1].item())

tensor([[0.2355, 0.6315, 0.5294],
        [0.7079, 0.1320, 0.9746],
        [0.4767, 0.1814, 0.3997],
        [0.1706, 0.0584, 0.0201],
        [0.8826, 0.2055, 0.7359]])
x[:, 0] tensor([0.2355, 0.7079, 0.4767, 0.1706, 0.8826])
x[1, :] tensor([0.7079, 0.1320, 0.9746])
x[1, 1] tensor(0.1320)
x[1,1].item() 0.13199615478515625


##### **x[0,0] versus x[0:1,0:1]**

<u>.item() works for both of them</u>

*x[0, 0]:*
This expression accesses a single element of the tensor x: the element at the first row and first column.
The result is a 0-dimensional tensor (shape `torch.Size([])`) holding that one value.

*x[0:1, 0:1]:*
This expression takes a 1x1 slice of x. The notation 0:1 is a range from index 0 (inclusive) to 1 (exclusive), so it selects only the first row and, for the columns, only the first column.
The result is still a 2-dimensional tensor, with shape (1, 1), covering a portion of the original tensor.

In [None]:
print("x[0:1,0:1]: ", x[0:1,0:1], ",  x[0:1,0:1].shape: ", x[0:1,0:1].shape, ",  x[0:1,0:1].item(): ", x[0:1,0:1].item() )
print("x[0,0]: ", x[0,0], ", x[0,0].shape: ", x[0,0].shape, ", x[0,0].item(): ", x[0,0].item())

x[0:1,0:1]:  tensor([[0.2355]]) ,  x[0:1,0:1].shape:  torch.Size([1, 1]) ,  x[0:1,0:1].item():  0.23549479246139526
x[0,0]:  tensor(0.2355) , x[0,0].shape:  torch.Size([]) , x[0,0].item():  0.23549479246139526


##### **torch.rand() versus torch.randn()**

*torch.rand():*
* This function generates random numbers from a uniform distribution between 0 (inclusive) and 1 (exclusive).
* In a uniform distribution, all values have an equal probability of being sampled. It results in a flat or rectangular-shaped distribution.

*torch.randn():*
* This function generates random numbers from a standard normal distribution (mean = 0, standard deviation = 1).
* In a normal distribution, data around the mean occur more frequently, thus following a bell shaped curve. It's also known as a Gaussian distribution.
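A rough empirical check of these two distributions, as a sketch. The seed and the sample size of 100,000 are arbitrary choices made here for reproducibility:

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

u = torch.rand(100_000)   # uniform on [0, 1)
n = torch.randn(100_000)  # standard normal

# Uniform samples stay inside [0, 1) and average close to 0.5
print(u.min().item() >= 0.0, u.max().item() < 1.0, u.mean().item())

# Normal samples can be negative; mean is near 0 and std is near 1
print(n.mean().item(), n.std().item())
```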

In [None]:
torch.rand(2,2),torch.randn(2,2)

(tensor([[0.3450, 0.4698],
         [0.8379, 0.1016]]),
 tensor([[ 0.5095, -2.9783],
         [-1.0667,  0.2969]]))

In [None]:
# Reshape with Tensor.view()
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # -1 means this size is inferred from the other dimensions
# here PyTorch works out that -1 must be 2, since 2 * 8 = 16
print(x.size(), y.size(), z.size())

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
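Two properties of `view` are worth knowing, sketched below under the assumption of a plain 4x4 tensor: a view shares storage with the original tensor, and `view` needs a compatible memory layout, so it fails on a transposed (non-contiguous) tensor where `reshape` still works.

```python
import torch

x = torch.zeros(4, 4)
y = x.view(16)

# A view shares storage with x, so writing through y changes x
y[0] = 1.0
print(x[0, 0].item())

# Transposing makes the tensor non-contiguous
t = x.t()
print(t.is_contiguous())

view_failed = False
try:
    t.view(16)            # incompatible layout for a flat view
except RuntimeError:
    view_failed = True
print(view_failed)

# reshape falls back to copying when a view is not possible
flat = t.reshape(16)
print(flat.shape)
```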


#### NumPy

Converting a Torch Tensor to a NumPy array and vice versa is straightforward.

In [None]:
a = torch.ones(5)
print(a)

# torch to numpy with .numpy()
b = a.numpy()
print(b)
print(type(b))

tensor([1., 1., 1., 1., 1.])
[1. 1. 1. 1. 1.]
<class 'numpy.ndarray'>


In [None]:
# Careful: If the Tensor is on the CPU (not the GPU),
# both objects will share the same memory location, so changing one
# will also change the other
a.add_(1)
print(a)
print(b)

tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


In [None]:
# numpy to torch: torch.from_numpy(a) shares memory with a, torch.tensor(a) makes a copy
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
c = torch.tensor(a)
print(a)
print(b)
print(c)

# again be careful when modifying: b shares memory with a, c is an independent copy
a += 1
print(a)
print(b)
print(c)

[1. 1. 1. 1. 1.]
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
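The sharing behaviour shown in the output above can also be checked directly. This is a minimal sketch with illustrative variable names (`shared`, `copied`):

```python
import numpy as np
import torch

a = np.ones(3)
shared = torch.from_numpy(a)  # uses the same memory as a
copied = torch.tensor(a)      # independent copy of the data

a[0] = 10.0                   # modify the NumPy array in place

print(shared[0].item())       # follows the change to a
print(copied[0].item())       # keeps the original value
print(np.shares_memory(a, shared.numpy()))
```

If a tensor must not be affected by later changes to the array, `torch.tensor(a)` (or `torch.from_numpy(a).clone()`) is the safer choice.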


#### GPU Support

By default, all tensors are created on the CPU. We can also move them to the GPU (if one is available) or create them directly on the GPU.

In [None]:
torch.cuda.is_available()

True

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.rand(2,2).to(device)  # move tensors to GPU device
#x = x.to("cpu")
#x = x.to("cuda")

x = torch.rand(2,2, device=device)  # or directly create them on the selected device
x

tensor([[0.6252, 0.9534],
        [0.5378, 0.5673]], device='cuda:0')
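One practical consequence, sketched below: `.numpy()` only works on CPU tensors, so a tensor that may live on the GPU is moved back with `.cpu()` first. The code falls back to the CPU when no GPU is present, in which case `.cpu()` simply returns the tensor as it is.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.rand(2, 2, device=device)

# Move to the CPU before converting; calling .numpy() on a CUDA tensor raises an error
arr = x.cpu().numpy()

print(x.device)
print(type(arr), arr.shape)
```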