# PyTorch

PyTorch is a deep learning framework used for research and development in machine learning and artificial intelligence.

A tensor is the fundamental data structure in PyTorch, similar to an array or a matrix. Tensors are the building blocks of neural networks and represent data as multi-dimensional arrays.

### Types of Tensors
![Types of tensors](./Tensor.PNG)

In [11]:
import torch
import torch.nn as nn
import numpy as np

In [13]:
scalar = torch.tensor(42.0)  # Creates a scalar (0-D) tensor with the value 42.0
vector = torch.tensor([1, 2, 3, 4, 5])  # Creates a 1-D tensor with 5 elements
matrix = torch.tensor([[1, 2, 3], [4, 5, 6]])  # Creates a 2-D tensor with 2 rows and 3 columns
four_dim_tensor = torch.randn(32, 3, 64, 64)  # Create a 4-D tensor with shape (batch_size, channels, height, width)
four_dim_tensor[0]

tensor([[[ 1.3729,  0.5235,  0.0917,  ...,  0.5436,  1.0157, -0.4076],
         [-0.8359,  0.4857, -0.3840,  ...,  1.2926,  0.0419,  0.0745],
         [-1.0659,  1.0442, -0.0694,  ...,  0.5912, -0.3699,  2.1489],
         ...,
         [ 1.2287, -0.8064, -1.9017,  ...,  2.2413,  1.2596,  0.3129],
         [-0.4460,  0.7702, -0.4863,  ..., -1.1460,  1.3075, -0.5704],
         [ 1.1359, -0.1912, -0.3292,  ..., -1.0832,  0.7019, -2.3536]],

        [[-0.3411,  1.1427,  0.0199,  ..., -0.4445, -0.0500, -0.7401],
         [ 0.5400, -1.0509,  0.3330,  ..., -0.4529,  0.2833, -1.2745],
         [ 1.3937,  1.5880,  1.5692,  ..., -1.0110, -0.2030,  0.1606],
         ...,
         [ 0.9348,  0.2876, -0.7562,  ...,  0.1915, -1.4093,  0.6099],
         [ 0.1194, -0.4032, -0.5989,  ..., -0.1392, -0.2586, -0.2671],
         [ 0.8928, -1.5127,  0.3421,  ..., -0.6807, -0.1817,  1.4142]],

        [[ 0.1711, -0.1057, -1.1131,  ..., -2.4121,  0.7486, -0.1343],
         [ 0.1748,  0.4051, -0.2572,  ...,  0

`torch.tensor` accepts several arguments at creation time:
* `data`: the initial values, for example a (nested) Python list, a NumPy array, or a scalar.
* `dtype`: the element type, for example `torch.float32`. If omitted, it is inferred from `data`.
* `device`: the device (CPU or GPU) on which the tensor is placed. If omitted, the tensor is created on the CPU by default.
* `requires_grad`: if set to `True`, operations on the tensor are tracked for automatic differentiation (autograd) so that gradients can be computed during backpropagation. This is what gradient-based optimization of deep learning models relies on.

In [15]:
tensor = torch.tensor(
    data=[[1, 2, 3], [4, 5, 6]],
    dtype=torch.float32,
    device='cpu',
    requires_grad=False,
)
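
The values passed for these arguments can be read back from the tensor as attributes. The following is a minimal sketch of that check; the variable name `t` is only illustrative.

```python
import torch

# Create a tensor with every creation argument spelled out
t = torch.tensor([[1, 2, 3], [4, 5, 6]],
                 dtype=torch.float32,
                 device="cpu",
                 requires_grad=False)

print(t.dtype)          # torch.float32
print(t.device)         # cpu
print(t.requires_grad)  # False
print(t.shape)          # torch.Size([2, 3])
```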

In [45]:
tensor = torch.tensor(data=[[1, 2, 3], [3, 2, 3]])
tensor.type(torch.float32)  # returns a new float32 tensor; `tensor` itself is not modified
tensor.numel()  # total elements in tensor

6

In [46]:
reshaped_tensor = torch.reshape(tensor, (3, 2))
reshaped_tensor

tensor([[1, 2],
        [3, 3],
        [2, 3]])

In [47]:
reshaped_tensor = torch.reshape(tensor, (-1, 2))  # -1 lets PyTorch infer that dimension from the remaining ones
reshaped_tensor

tensor([[1, 2],
        [3, 3],
        [2, 3]])

In [48]:
x = torch.randn(32, 3, 64, 64)
x_flattened = x.view(x.size(0), -1)
x_flattened

tensor([[ 1.2488, -0.4536,  0.2434,  ..., -0.5588,  0.4220, -0.0659],
        [-0.6298,  0.5623, -1.7265,  ..., -1.4979, -1.2487, -0.0790],
        [-0.4422,  1.1159, -0.8890,  ...,  0.8135,  0.0231,  1.6736],
        ...,
        [ 1.4524, -0.4520,  0.1751,  ...,  1.8863,  1.6410,  0.8342],
        [ 1.2307,  0.7091, -1.0273,  ..., -0.4876, -2.5519, -1.0745],
        [ 0.9052, -1.2224,  0.6602,  ...,  0.3539,  0.4184,  0.8794]])
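
Flattening everything except the batch dimension is a common step before a fully connected layer. As a sketch, the resulting shape can be checked and compared with `torch.flatten`, which should give the same result here; the variable names are illustrative.

```python
import torch

x = torch.randn(32, 3, 64, 64)  # (batch_size, channels, height, width)

# Keep the batch dimension and let -1 infer 3 * 64 * 64 = 12288
flat_view = x.view(x.size(0), -1)

# Built-in helper that flattens all dimensions from index 1 onward
flat_fn = torch.flatten(x, start_dim=1)

print(flat_view.shape)                  # torch.Size([32, 12288])
print(torch.equal(flat_view, flat_fn))  # True
```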

In [61]:
expanded_tensor = torch.unsqueeze(tensor, dim=0)  # Returns a new tensor with a dimension of size 1 inserted at position 0
expanded_tensor.size(), tensor.size()

(torch.Size([1, 2, 3]), torch.Size([2, 3]))
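
The counterpart of `unsqueeze` is `squeeze`, which removes a dimension of size one. A short sketch, using an illustrative tensor `t` with the same values as above:

```python
import torch

t = torch.tensor([[1, 2, 3], [3, 2, 3]])  # shape (2, 3)

expanded = t.unsqueeze(0)        # insert a size-1 dimension at the front
restored = expanded.squeeze(0)   # remove it again
trailing = t.unsqueeze(-1)       # negative indices count from the end

print(expanded.shape)  # torch.Size([1, 2, 3])
print(restored.shape)  # torch.Size([2, 3])
print(trailing.shape)  # torch.Size([2, 3, 1])
```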

### Permute function
The permute() function rearranges the dimensions of a tensor, which gives you the flexibility to change the shape and orientation of the data.

In [63]:
permuted_tensor = tensor.permute(1, 0)  # Swap dimensions 0 and 1
permuted_tensor.shape, tensor.shape

(torch.Size([3, 2]), torch.Size([2, 3]))

In [66]:
tensor, permuted_tensor

(tensor([[1, 2, 3],
         [3, 2, 3]]),
 tensor([[1, 3],
         [2, 2],
         [3, 3]]))
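
permute() is most useful with more than two dimensions. The sketch below assumes an image batch laid out as (batch, channels, height, width) and moves the channel dimension to the end; the variable names are illustrative.

```python
import torch

images = torch.randn(32, 3, 64, 64)          # (batch, channels, height, width)
channels_last = images.permute(0, 2, 3, 1)   # (batch, height, width, channels)

print(channels_last.shape)            # torch.Size([32, 64, 64, 3])
print(channels_last.is_contiguous())  # False

# The first channel of the first image is the same data in both layouts
print(torch.equal(channels_last[0, :, :, 0], images[0, 0]))  # True
```

Because the permuted tensor is no longer contiguous in memory, calling `.contiguous()` or using `reshape` instead of `view` may be needed before changing its shape further.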

In [68]:
# Transposing a Tensor (Swapping Rows and Columns)
transposed_tensor_1 = tensor.t()
transposed_tensor_2 = torch.transpose(tensor, 0, 1)  # Swap axes 0 and 1

print(f'The original tensor shape is: {tensor.shape},\n'
      f'The transposed tensor using .t shape is: {transposed_tensor_1.shape},\n'
      f'The transposed tensor using .transpose shape is: {transposed_tensor_2.shape}')

The original tensor shape is: torch.Size([2, 3]),
The transposed tensor using .t shape is: torch.Size([3, 2]),
The transposed tensor using .transpose shape is: torch.Size([3, 2])


Addition and subtraction between two tensors of the same shape

In [79]:
tensor_a = torch.tensor([[4, 5, 7], [8, 9, 0]])
tensor_b = torch.tensor([[5, 4, 3], [9, 8, 7]])

tensor_a + tensor_b

tensor([[ 9,  9, 10],
        [17, 17,  7]])

In [80]:
tensor_a - tensor_b

tensor([[-1,  1,  4],
        [-1,  1, -7]])

Element-wise multiplication between two tensors of the same shape

In [81]:
tensor_a * tensor_b

tensor([[20, 20, 21],
        [72, 72,  0]])

Matrix multiplication between two tensors whose inner dimensions match (the number of columns in the first tensor equals the number of rows in the second tensor). Each element of the result is the dot product of a row of the first tensor with a column of the second.

In [82]:
tensor_c = torch.tensor([[5, 4, 3], [9, 8, 7], [1, 1, 1]])
matmu = torch.matmul(tensor_a, tensor_c)
matmu

tensor([[ 72,  63,  54],
        [121, 104,  87]])
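
The shape rule can be checked directly. In this sketch, `a` and `c` are illustrative names that repeat the values of tensor_a and tensor_c; it also shows the `@` operator and what happens when the inner dimensions do not match.

```python
import torch

a = torch.tensor([[4, 5, 7], [8, 9, 0]])             # shape (2, 3)
c = torch.tensor([[5, 4, 3], [9, 8, 7], [1, 1, 1]])  # shape (3, 3)

product = a @ c  # the @ operator is equivalent to torch.matmul(a, c)
print(product.shape)  # torch.Size([2, 3])

mismatch_raised = False
try:
    torch.matmul(a, a)  # (2, 3) x (2, 3): inner dimensions 3 and 2 differ
except RuntimeError:
    mismatch_raised = True
print(mismatch_raised)  # True
```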

In [84]:
div = tensor_a / tensor_b
div

tensor([[0.8000, 1.2500, 2.3333],
        [0.8889, 1.1250, 0.0000]])

In [88]:
result_exp = tensor_a ** tensor_b
result_exp

tensor([[     1024,       625,       343],
        [134217728,  43046721,         0]])

In [90]:
result_sqrt = torch.sqrt(tensor_a)
result_sqrt

tensor([[2.0000, 2.2361, 2.6458],
        [2.8284, 3.0000, 0.0000]])

In [92]:
result_log = torch.log(tensor_a)  # natural logarithm (base e); log(0) evaluates to -inf
result_log

tensor([[1.3863, 1.6094, 1.9459],
        [2.0794, 2.1972,   -inf]])

In [96]:
tensor_a = tensor_a.type(torch.float32)  # torch.mean needs a floating-point dtype
total_sum = torch.sum(tensor_a)  # sum of all elements

# Mean over dim=1: one value per row
mean_along_rows = torch.mean(tensor_a, dim=1)

# Maximum over dim=0: one value (and its index) per column
max_along_columns = torch.max(tensor_a, dim=0)

# Minimum over dim=1: one value (and its index) per row
min_along_rows = torch.min(tensor_a, dim=1)

total_sum, mean_along_rows, max_along_columns, min_along_rows

(tensor(33.),
 tensor([5.3333, 5.6667]),
 torch.return_types.max(
 values=tensor([8., 9., 7.]),
 indices=tensor([1, 1, 0])),
 torch.return_types.min(
 values=tensor([4., 0.]),
 indices=tensor([0, 2])))
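
When a `dim` is given, `torch.max` and `torch.min` return both the values and their indices, and the pair can be unpacked. A small sketch, using an illustrative tensor `t` with the same values as tensor_a; `keepdim=True` is shown as one way to keep the reduced dimension with size 1.

```python
import torch

t = torch.tensor([[4., 5., 7.], [8., 9., 0.]])

# Unpack the (values, indices) pair returned by a reduction over a dimension
values, indices = torch.max(t, dim=0)
print(values)   # tensor([8., 9., 7.])
print(indices)  # tensor([1, 1, 0])

# keepdim=True keeps the reduced dimension with size 1
row_means = torch.mean(t, dim=1, keepdim=True)
print(row_means.shape)  # torch.Size([2, 1])
```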

### Broadcasting in PyTorch

The key idea behind broadcasting is that, when two tensors of different shapes are combined, the smaller one is implicitly expanded ("broadcast") to match the shape of the larger one.

In [98]:
scalar = 2
result_broadcast = tensor_a + scalar
print(f'broadcast result is: {result_broadcast} and of shape {result_broadcast.shape}')

broadcast result is: tensor([[ 6.,  7.,  9.],
        [10., 11.,  2.]]) and of shape torch.Size([2, 3])
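
Broadcasting also applies between tensors, not only between a tensor and a scalar. The sketch below assumes a 2x3 matrix combined with a row vector of shape (3,) and a column vector of shape (2, 1); all names and values are illustrative.

```python
import torch

matrix = torch.tensor([[4., 5., 7.], [8., 9., 0.]])  # shape (2, 3)
row = torch.tensor([10., 20., 30.])                   # shape (3,)
col = torch.tensor([[100.], [200.]])                  # shape (2, 1)

# The row vector is repeated for every row of the matrix
print(matrix + row)
# tensor([[14., 25., 37.],
#         [18., 29., 30.]])

# The column vector is repeated for every column of the matrix
print(matrix + col)
# tensor([[104., 105., 107.],
#         [208., 209., 200.]])

# Two smaller tensors can broadcast against each other as well
print((row + col).shape)  # torch.Size([2, 3])
```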


Suppose we have two 2x2 tensors, tensor_a and tensor_b, and want to concatenate them along dimension 0 to create a new tensor with a shape of 4x2.

In [102]:
tensor_a, tensor_b = torch.tensor([[2, 2], [2, 2]]), torch.tensor([[4, 4], [4, 4]])
concatenated_tensor = torch.cat((tensor_a, tensor_b), dim=0)
print(f'concatenated tensor is: {concatenated_tensor} and of shape {concatenated_tensor.shape}')

concatenated tensor is: tensor([[2, 2],
        [2, 2],
        [4, 4],
        [4, 4]]) and of shape torch.Size([4, 2])
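
For comparison, here is a sketch of concatenating the same two tensors along dimension 1, and of `torch.stack`, which joins them along a new dimension instead of an existing one. The names `a` and `b` are illustrative.

```python
import torch

a = torch.tensor([[2, 2], [2, 2]])
b = torch.tensor([[4, 4], [4, 4]])

# Concatenate along dimension 1: columns are appended, rows stay at 2
side_by_side = torch.cat((a, b), dim=1)
print(side_by_side.shape)  # torch.Size([2, 4])

# Stack along a new leading dimension: the result has one more dimension
stacked = torch.stack((a, b), dim=0)
print(stacked.shape)  # torch.Size([2, 2, 2])
```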


### AutoGrad and Gradients

Autograd is PyTorch's automatic differentiation engine: it computes the gradients (derivatives) of tensors automatically. It is an essential component for training deep learning models through backpropagation.

1. **Gradient Calculation** - In deep learning, we often need to compute gradients of a loss function with respect to model parameters. Autograd simplifies this process. When you perform operations on tensors that require gradients, PyTorch automatically tracks these operations and constructs a computation graph.

2. **Computation Graph** - A computation graph is a directed acyclic graph (DAG) that represents the sequence of operations applied to tensors. Each operation in the graph is a node, and tensors flowing through these nodes are edges. The graph allows PyTorch to trace how input tensors influence the output tensors, which is crucial for gradient calculation.

3. **Dynamic Computation Graph** - PyTorch uses a dynamic computation graph, which means the graph is built on the fly as operations are executed. This dynamic nature allows flexibility and is well-suited for models with varying architectures or inputs of different shapes.

4. **Gradients** - Once you have a computation graph, you can compute gradients by backpropagating through the graph. Gradients represent how a small change in each input tensor would affect the final output. The gradients are computed using the chain rule of calculus, and they indicate the direction and magnitude of parameter updates during optimization.
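
The four points above can be sketched in a few lines. This is a minimal illustration under simple assumptions: the tensor `x` and the function y = x0² + x1² are chosen only for the example, so the expected gradient is 2x.

```python
import torch

# requires_grad=True tells autograd to track operations on x
x = torch.tensor([2.0, 3.0], requires_grad=True)

# The graph is built on the fly while this expression is evaluated
y = (x ** 2).sum()
print(y.grad_fn is not None)  # True: y remembers how it was computed

# Backpropagate through the graph; dy/dx = 2 * x
y.backward()
print(x.grad)  # tensor([4., 6.])

# Inside torch.no_grad() operations are not tracked
with torch.no_grad():
    z = x * 2
print(z.requires_grad)  # False
```

In a training loop, the gradients stored in `.grad` are what an optimizer would use to update the parameters.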