<h1 align="center">PyTorch</h1>

1. Tensor Computation
2. GPU Acceleration : uses the GPU to speed up calculations
3. Dynamic Computation Graph : the execution flow can change at runtime
4. Automatic Differentiation
5. Distributed Training
6. Interoperability with other libraries



<h3 align="center">Core PyTorch Modules</h3>

- `torch` : the core module, providing multidimensional arrays (tensors) and mathematical operations on them.
- `torch.autograd` : automatic differentiation engine that records operations on tensors to compute gradients for optimization.
- `torch.nn` : provides the neural network library, including layers, activations, loss functions and utilities to build deep learning models.
- `torch.optim` : contains optimization algorithms (optimizers) such as SGD, Adam, and RMSprop.
- `torch.utils.data` : used for data handling, including the `Dataset` and `DataLoader` classes for managing and loading datasets efficiently.
- `torch.jit` : supports JIT compilation and TorchScript for optimizing models and enabling deployment without Python dependencies.
- `torch.distributed` : tools for distributed training across multiple GPUs and machines, facilitating parallel computation.
- `torch.cuda` : interfaces with NVIDIA CUDA to enable GPU acceleration for tensor computations and model training.
- `torch.backends` : contains settings that control backend libraries such as cuDNN and MKL, along with other performance tuning options.
- `torch.multiprocessing` : used for parallelism and multiprocessing, similar to Python's `multiprocessing` module but with support for CUDA tensors.
- `torch.quantization` : tools for model quantization to reduce model size and improve inference speed, especially on edge devices.
- `torch.onnx` : supports exporting PyTorch models to the ONNX (Open Neural Network Exchange) format for interoperability with other frameworks and deployment.
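To show how several of these modules fit together, here is a minimal sketch of a training loop. The synthetic data (`y = 2x + 1`), the learning rate, and the batch size are assumptions chosen for illustration only; they are not taken from the notes above.

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic data for y = 2x + 1 (made up for illustration)
features = torch.linspace(-1, 1, steps=64).unsqueeze(1)  # shape (64, 1)
targets = 2 * features + 1

# torch.utils.data: wrap the tensors and iterate over them in mini-batches
loader = DataLoader(TensorDataset(features, targets), batch_size=16, shuffle=True)

model = nn.Linear(in_features=1, out_features=1)   # torch.nn: a single linear layer
loss_fn = nn.MSELoss()                             # torch.nn: loss function
optimizer = optim.SGD(model.parameters(), lr=0.1)  # torch.optim: optimizer


def mean_loss():
    """Loss over the full dataset, computed without tracking gradients."""
    with torch.no_grad():
        return loss_fn(model(features), targets).item()


initial_loss = mean_loss()
for epoch in range(20):
    for batch_x, batch_y in loader:
        optimizer.zero_grad()                    # clear gradients from the previous step
        loss = loss_fn(model(batch_x), batch_y)  # forward pass
        loss.backward()                          # torch.autograd: compute gradients
        optimizer.step()                         # update the parameters
final_loss = mean_loss()

print("loss before training:", initial_loss)
print("loss after training: ", final_loss)
```

The loss after training should be lower than the loss before training, because the linear layer can fit this data exactly.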



In [1]:
import torch

In [2]:
torch.__version__

'2.5.1'

In [3]:
# For CUDA devices
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
else:
    print("Nvidia GPU is not available")

# For Apple M-series Mac GPUs (MPS backend)
if torch.backends.mps.is_available():
    device = torch.device("mps")
    print(device)
else:
    print("MPS is not available")


Nvidia GPU is not available
mps


In [10]:
# A tensor can be thought of as an n-dimensional array

# zero-dimensional tensor : Scalar
zero_d_tensor = torch.tensor(0)
zero_d_tensor.ndim, zero_d_tensor

(0, tensor(0))

In [11]:
# one-dimensional tensor : Vector

one_d_tensor = torch.tensor([1, 5, 10, 13])
one_d_tensor, one_d_tensor.ndim


(tensor([ 1,  5, 10, 13]), 1)

In [12]:
# two-dimensional tensor: Matrix

# a grayscale image can be represented as a 2D tensor where each entry is a pixel intensity (0-255)
two_d_tensor = torch.tensor([
    [0, 255,128],
    [34, 90, 180]
])

two_d_tensor, two_d_tensor.ndim

(tensor([[  0, 255, 128],
         [ 34,  90, 180]]),
 2)

In [13]:
# three-dimensional tensor: RGB Image
three_d_tensor = torch.tensor([
    [
        [0, 255,128],
        [0, 255,128],
        [0, 255,128]
    ]
], dtype=torch.uint8)

three_d_tensor, three_d_tensor.ndim

(tensor([[[  0, 255, 128],
          [  0, 255, 128],
          [  0, 255, 128]]], dtype=torch.uint8),
 3)

In [14]:
# four-dimensional : Batches of RGB images

four_d_tensor = torch.tensor([[
    [
        [0, 255,128],
        [0, 255,128],
        [0, 255,128]
    ]
]], dtype=torch.uint8)
four_d_tensor, four_d_tensor.ndim

(tensor([[[[  0, 255, 128],
           [  0, 255, 128],
           [  0, 255, 128]]]], dtype=torch.uint8),
 4)

Tensors are useful because:

- Mathematical operations : they support linear algebra operations
- Real-world data can be represented as tensors
- They can run on a GPU, where operations are parallelized

Tensors are used in DL for:

- Data storage
- Weights and biases
- Matrix operations
- Training process :
    - Forward pass : $z = x^T w + b$ ; gradients are computed during the backward pass
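The forward and backward pass mentioned above can be sketched with autograd. The values of `x`, `w`, and `b` are arbitrary numbers picked for illustration, and the sketch assumes the forward pass is the linear expression xᵀw + b with no activation.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])                       # input features
w = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)  # weights (tracked by autograd)
b = torch.tensor(0.1, requires_grad=True)               # bias (tracked by autograd)

z = torch.dot(x, w) + b  # forward pass: x^T w + b
z.backward()             # backward pass: fills in w.grad and b.grad

print("z     :", z.item())
print("dz/dw :", w.grad)  # equals x, since z is linear in w
print("dz/db :", b.grad)  # equals 1
```

Because `z` is linear in `w` and `b`, the gradient with respect to `w` is simply `x` and the gradient with respect to `b` is 1, which makes the result easy to check by hand.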


In [19]:
# check type of tensor

one_d_tensor.dtype

torch.int64

In [18]:
# size of tensor

one_d_tensor.size()

torch.Size([4])

In [31]:
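# torch.empty allocates memory without initializing it, so the values are arbitrary (they happen to be zeros here)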

empty_tensor = torch.empty(size=(2,3))
print(empty_tensor)


tensor([[0., 0., 0.],
        [0., 0., 0.]])


In [36]:
# zeros tensor : all the entries of tensor are 0

zero_tensor = torch.zeros(size=(5,5))
print(zero_tensor)

zero_tensor.dtype, zero_tensor.size()

tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])


(torch.float32, torch.Size([5, 5]))

In [37]:
# ones tensor : all the entries of tensor are 1

ones_tensor = torch.ones(size=(5,5))
print(ones_tensor)

tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])


By default, factory functions such as `torch.zeros`, `torch.ones` and `torch.rand` create tensors with data type `float32`.
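The default data type applies to floating-point values; integers are inferred differently. The short sketch below shows what dtype PyTorch picks in a few common cases and how to request or convert to another one. The variable names are illustrative.

```python
import torch

int_tensor = torch.tensor([1, 2, 3])          # Python ints are inferred as int64
float_tensor = torch.tensor([1.0, 2.0, 3.0])  # Python floats use the default float dtype
zeros = torch.zeros(2, 2)                     # factory functions use the default float dtype

as_float = int_tensor.to(torch.float32)         # convert an existing tensor
as_double = torch.ones(2, dtype=torch.float64)  # request a dtype at creation time

print(torch.get_default_dtype())
print(int_tensor.dtype, float_tensor.dtype, zeros.dtype)
print(as_float.dtype, as_double.dtype)
```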

In [50]:
# rand tensor: entries of the tensor are random numbers in the interval [0, 1)

random_tensor = torch.rand(size=(5,5))
print(random_tensor)

random_tensor.dtype, random_tensor.size()
print("rand(): samples values uniformly at random from the interval [0, 1).")

tensor([[0.1182, 0.3628, 0.3302, 0.2807, 0.8269],
        [0.7934, 0.1577, 0.6951, 0.9363, 0.3530],
        [0.7394, 0.1078, 0.1731, 0.1845, 0.5970],
        [0.0264, 0.5525, 0.9674, 0.8975, 0.6987],
        [0.2962, 0.7868, 0.6388, 0.3797, 0.8793]])
rand(): samples values uniformly at random from the interval [0, 1).


Linearly Spaced Tensor : a one-dimensional tensor of size `steps` whose values are evenly spaced from `start` to `end`, inclusive.

In [62]:
linearly_spaced_tensor = torch.linspace(start=0, end=10, steps=5)
linearly_spaced_tensor

tensor([ 0.0000,  2.5000,  5.0000,  7.5000, 10.0000])

Seeding

In [83]:
# with the same seed, the generated entries are the same on every run
torch.manual_seed(0)

rand_t_seed_0 = torch.rand(size=(5,5))
print(rand_t_seed_0)


tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
        [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
        [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
        [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
        [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]])


In [85]:
torch.manual_seed(0)

rand_t_seed_0 = torch.rand(size=(5,5))
print(rand_t_seed_0)

tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
        [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
        [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
        [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
        [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]])
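`torch.manual_seed` sets the seed of the global random number generator. As a possible alternative, the sketch below uses separate `torch.Generator` objects, which keeps the seeding local to the calls that receive the generator. The names `gen_a` and `gen_b` are illustrative.

```python
import torch

# Two independent generators seeded with the same value
gen_a = torch.Generator().manual_seed(0)
gen_b = torch.Generator().manual_seed(0)

first = torch.rand(2, 3, generator=gen_a)
second = torch.rand(2, 3, generator=gen_b)  # same seed, same position in the stream
third = torch.rand(2, 3, generator=gen_a)   # gen_a has advanced, so new values

print(torch.equal(first, second))  # identical draws
print(torch.equal(first, third))   # different draws
```

Re-seeding resets the stream, which is why calling `torch.manual_seed(0)` before each `torch.rand` call above reproduces the same tensor.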


Creating Custom Tensors

In [87]:
arr = [[12, 3, 45], [4,66, 3]]

arr_to_tensor = torch.tensor(arr, dtype=torch.float32)
print(arr_to_tensor)
type(arr_to_tensor)

tensor([[12.,  3., 45.],
        [ 4., 66.,  3.]])


torch.Tensor

Identity Matrix

In [116]:
# identity matrix: ones on the main diagonal, zeros elsewhere

default_diag_tensor = torch.eye(3,3)
print(default_diag_tensor)


tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])


In [118]:
# get the diagonal values
x = torch.rand(3,3)
print(x)

diagonal_values = torch.diag(x)
print(diagonal_values)

tensor([[0.8579, 0.6870, 0.0051],
        [0.1757, 0.7497, 0.6047],
        [0.1100, 0.2121, 0.9704]])
tensor([0.8579, 0.7497, 0.9704])


In [127]:
# homogeneous tensor: every entry has the same fill value

homogenous_tensor = torch.full((3,3), 5)
print(homogenous_tensor)

homogenous_tensor_2 = torch.full((3,3), 10.0)
print(homogenous_tensor_2)

tensor([[5, 5, 5],
        [5, 5, 5],
        [5, 5, 5]])
tensor([[10., 10., 10.],
        [10., 10., 10.],
        [10., 10., 10.]])


Creating a tensor with range

In [132]:
range_tensor = torch.arange(0, 10, 2, dtype=torch.float32)
print(range_tensor)

tensor([0., 2., 4., 6., 8.])
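`torch.arange` and `torch.linspace` both produce evenly spaced values, but they are parameterized differently. The sketch below contrasts the two on the same interval: `arange` takes a step size and excludes the end value, while `linspace` takes the number of points and includes it.

```python
import torch

# arange: fixed step, end value excluded
by_step = torch.arange(0, 10, 2)

# linspace: fixed number of points, end value included
by_count = torch.linspace(0, 10, steps=6)

print(by_step, by_step.dtype)    # integer arguments give an int64 tensor
print(by_count, by_count.dtype)  # linspace returns a floating-point tensor
```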


Tensor Shapes

In [144]:
x1 = torch.rand(5,5)
print(x1)
x1.shape

tensor([[0.0780, 0.3986, 0.7742, 0.7703, 0.0178],
        [0.8119, 0.1087, 0.3943, 0.2973, 0.4037],
        [0.4018, 0.0513, 0.0683, 0.4218, 0.5065],
        [0.2729, 0.6883, 0.0500, 0.4663, 0.9397],
        [0.2961, 0.9515, 0.6811, 0.0488, 0.8163]])


torch.Size([5, 5])

In [149]:
# create tensors with the same shape as x1
same_shape_as_x1 = torch.empty(x1.shape)
print(same_shape_as_x1)

print(torch.full_like(x1, 5))
print(torch.zeros_like(x1))
print(torch.ones_like(x1))

tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
tensor([[5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5.],
        [5., 5., 5., 5., 5.]])
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])


Copying Tensor

In [160]:
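# assignment does not copy: x2 is another name for the same tensor object

x2 = x1
print(x2)

# a tensor's hash is based on object identity, so this checks whether both names refer to the same object
print(hash(x1) == hash(x2))

# torch.tensor(x1) copies the data into a new tensor, but PyTorch emits a UserWarning
# and recommends x1.clone().detach() instead
x3 = torch.tensor(x1)
print(x3)

print(hash(x1) == hash(x3))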

tensor([[0.0780, 0.3986, 0.7742, 0.7703, 0.0178],
        [0.8119, 0.1087, 0.3943, 0.2973, 0.4037],
        [0.4018, 0.0513, 0.0683, 0.4218, 0.5065],
        [0.2729, 0.6883, 0.0500, 0.4663, 0.9397],
        [0.2961, 0.9515, 0.6811, 0.0488, 0.8163]])
True
tensor([[0.0780, 0.3986, 0.7742, 0.7703, 0.0178],
        [0.8119, 0.1087, 0.3943, 0.2973, 0.4037],
        [0.4018, 0.0513, 0.0683, 0.4218, 0.5065],
        [0.2729, 0.6883, 0.0500, 0.4663, 0.9397],
        [0.2961, 0.9515, 0.6811, 0.0488, 0.8163]])
False




In [161]:
# deep copy : clone() copies the data, detach() removes the copy from the autograd graph

x2 = x1.clone().detach()
print(x2)

print(hash(x1) == hash(x2))

tensor([[0.0780, 0.3986, 0.7742, 0.7703, 0.0178],
        [0.8119, 0.1087, 0.3943, 0.2973, 0.4037],
        [0.4018, 0.0513, 0.0683, 0.4218, 0.5065],
        [0.2729, 0.6883, 0.0500, 0.4663, 0.9397],
        [0.2961, 0.9515, 0.6811, 0.0488, 0.8163]])
False
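The practical difference between an alias and a copy shows up when one of them is modified in place. The following sketch uses small illustrative tensors and checks whether two tensors share memory by comparing `data_ptr()`.

```python
import torch

original = torch.zeros(3)
alias = original                         # plain assignment: same object, same memory
true_copy = original.clone().detach()    # new tensor with its own memory

alias[0] = 7.0  # in-place write through the alias

print(original)   # the write is visible through the original name
print(true_copy)  # the copy is unaffected

print(alias is original)                            # same Python object
print(alias.data_ptr() == original.data_ptr())      # same underlying memory
print(true_copy.data_ptr() == original.data_ptr())  # different memory
```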


Tensor Mathematical Operations

In [9]:
x = torch.randint(0, 10, (1,5))
y = torch.randint(10, 20, (1,5))
print("X ", x)
print("Y ", y)
print( x + y )

X  tensor([[3, 2, 2, 7, 9]])
Y  tensor([[12, 14, 12, 18, 12]])
tensor([[15, 16, 14, 25, 21]])


In [10]:
print( x - y )

tensor([[ -9, -12, -10, -11,  -3]])


In [11]:
print( x * y )

tensor([[ 36,  28,  24, 126, 108]])


In [13]:
print( x / y )

tensor([[0.2500, 0.1429, 0.1667, 0.3889, 0.7500]])


In [18]:
# remainder modulo 2: 1 for odd entries, 0 for even entries
print(x % 2)

tensor([[1, 0, 0, 1, 1]])
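Note that `/` on integer tensors performs true division and returns a floating-point tensor, as the output of `x / y` above shows. A small sketch with illustrative values compares it with floor division, which keeps the integer dtype.

```python
import torch

x = torch.tensor([7, 9, 10])
y = torch.tensor([2, 4, 5])

true_div = x / y    # true division: result is float32
floor_div = x // y  # floor division: result stays int64

# torch.div with rounding_mode="floor" is the functional equivalent of //
same_floor = torch.div(x, y, rounding_mode="floor")

print(true_div, true_div.dtype)
print(floor_div, floor_div.dtype)
print(torch.equal(floor_div, same_floor))
```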


Transpose, Matrix Multiplication, Determinant, Inverse of matrix

In [27]:
# transpose
print("Before transpose: ", x)
transpose_x = x.T
print(transpose_x)

print(x.shape, transpose_x.shape)

Before transpose:  tensor([[3, 2, 2, 7, 9]])
tensor([[3],
        [2],
        [2],
        [7],
        [9]])
torch.Size([1, 5]) torch.Size([5, 1])


In [43]:
torch.manual_seed(123)

matrix = torch.randint(0, 5, (3,3))
print(matrix)

v = torch.tensor([-1, 0, 1])

res = torch.matmul(matrix, v)
print("Res: ", res)

tensor([[2, 4, 2],
        [0, 0, 2],
        [1, 2, 4]])
Res:  tensor([0, 2, 3])


In [94]:
matrix = torch.randn(3,3)
print(matrix)

print("Determinant of Square Matrix: ", torch.det(matrix))

tensor([[ 0.6117,  0.2539,  0.8062],
        [-0.2239, -0.9913,  1.1192],
        [-0.1277, -0.9501,  1.4064]])
Determinant of Square Matrix:  tensor(-0.0893)


In [98]:
matrix = torch.randn(3,3)
print(matrix)

print("Inverse Matrix:\n", torch.inverse(matrix))

tensor([[ 0.5589, -0.0543,  0.5147],
        [ 0.3752,  0.4385, -0.9105],
        [-0.8118,  0.7256,  0.0306]])
Inverse Matrix:
 tensor([[ 1.0204,  0.5678, -0.2669],
        [ 1.1015,  0.6584,  1.0627],
        [ 0.9510, -0.5472,  0.4018]])
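One way to sanity-check an inverse is to multiply it with the original matrix and compare the product with the identity. The sketch below uses a fixed, well-conditioned matrix chosen for illustration (its determinant is 18) and `float64` to keep rounding error small.

```python
import torch

matrix = torch.tensor([[2.0, 1.0, 0.0],
                       [1.0, 3.0, 1.0],
                       [0.0, 1.0, 4.0]], dtype=torch.float64)

determinant = torch.det(matrix)      # non-zero, so the matrix is invertible
inverse = torch.linalg.inv(matrix)   # same operation as torch.inverse
identity = torch.eye(3, dtype=torch.float64)

print("Determinant:", determinant.item())
print("A @ A^-1 is the identity:", torch.allclose(matrix @ inverse, identity))
print("A^-1 @ A is the identity:", torch.allclose(inverse @ matrix, identity))
```

Comparing with `torch.allclose` rather than exact equality is deliberate, since floating-point arithmetic rarely reproduces the identity matrix exactly.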


In [6]:
# absolute

t = torch.randint(-5, 5, (1,10))
print(t)
print(torch.abs(t))


tensor([[-2, -4,  0, -4, -3,  2,  0,  0,  3,  2]])
tensor([[2, 4, 0, 4, 3, 2, 0, 0, 3, 2]])


In [11]:
# round

t = torch.rand((3,3))
print(t)

t_round = torch.round(t)
print(t_round)

tensor([[0.4344, 0.3962, 0.9176],
        [0.9703, 0.5368, 0.7929],
        [0.6427, 0.8530, 0.1912]])
tensor([[0., 0., 1.],
        [1., 1., 1.],
        [1., 1., 0.]])


In [14]:
# ceil : round each entry up to the nearest integer

t = torch.tensor([-1.23, 32, 43.11, 6.01, 23.0, 54.4 ])
t_ceil = torch.ceil(t)  # returns a new tensor; t itself is unchanged
print(t_ceil)

tensor([-1., 32., 44.,  7., 23., 55.])


In [21]:
# clamp : restrict the values to a given range

t = torch.tensor([124,34,56,57,65,423,3,35])
t_clamp = torch.clamp(t, 50, 100) # val < 50 -> 50, val > 100 -> 100, rest will remain same
print(t_clamp)

tensor([100,  50,  56,  57,  65, 100,  50,  50])
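The rounding functions differ mainly on negative numbers and on values that lie exactly halfway between two integers. The sketch below compares them on a few illustrative values; note that `torch.round` rounds halves to the nearest even integer.

```python
import torch

values = torch.tensor([-1.5, -0.5, 0.5, 1.5, 2.5])

floored = torch.floor(values)    # round down, toward negative infinity
ceiled = torch.ceil(values)      # round up, toward positive infinity
rounded = torch.round(values)    # nearest integer, halves go to the even neighbor
truncated = torch.trunc(values)  # drop the fractional part, toward zero

# clamp also accepts a single bound: here only a lower limit of 0
non_negative = torch.clamp(values, min=0.0)

for name, result in [("floor", floored), ("ceil", ceiled), ("round", rounded),
                     ("trunc", truncated), ("clamp", non_negative)]:
    print(f"{name:>5}: {result}")
```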


Calculating Sum

In [55]:
torch.manual_seed(1)

example_tensor = torch.randint(0, 10,(3,3))
print(example_tensor)

tensor([[5, 9, 4],
        [8, 3, 3],
        [1, 1, 9]])


In [56]:
# column-wise sum
torch.sum(example_tensor, dim=0)

tensor([14, 13, 16])

In [57]:
# row-wise sum
torch.sum(example_tensor, dim=1)

tensor([18, 14, 11])

In [58]:
# sort the columns of the example tensor from lowest to highest column sum

col_sums_sorted_asc, col_indices = torch.sort(torch.sum(example_tensor, dim=0))
print(col_sums_sorted_asc)
print(col_indices)

sorted_tensor_by_col_sum = example_tensor[:, col_indices]

print("Original tensor:\n", example_tensor)
print("Sorted tensor:\n", sorted_tensor_by_col_sum)

tensor([13, 14, 16])
tensor([1, 0, 2])
Original tensor:
 tensor([[5, 9, 4],
        [8, 3, 3],
        [1, 1, 9]])
Sorted tensor:
 tensor([[9, 5, 4],
        [3, 8, 3],
        [1, 1, 9]])


In [60]:
# sort the rows of the example tensor from lowest to highest row sum

rs = torch.sum(example_tensor, dim=1)
print("Row Sum:\n", rs)
_ , row_indices = torch.sort(rs)
sorted_tensor_by_row_sum = example_tensor[row_indices]

print("Original tensor:\n", example_tensor)
print("Sorted tensor:\n", sorted_tensor_by_row_sum)

Row Sum:
 tensor([18, 14, 11])
Original tensor:
 tensor([[5, 9, 4],
        [8, 3, 3],
        [1, 1, 9]])
Sorted tensor:
 tensor([[1, 1, 9],
        [8, 3, 3],
        [5, 9, 4]])


In [65]:
# argmax(), argmin() : return the index of the max and min value
# without dim, the index refers to the flattened tensor; with dim=0, one index is returned per column

print(torch.argmax(example_tensor))
print(torch.argmax(example_tensor, dim=0))

print(torch.argmin(example_tensor))
print(torch.argmin(example_tensor, dim=0))

tensor(1)
tensor([1, 0, 2])
tensor(6)
tensor([2, 2, 1])
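When `argmax` or `argmin` is called without `dim`, the result is an index into the flattened tensor. A minimal sketch of converting that flat index back to a (row, column) position follows; the tensor is an illustrative one without repeated extreme values, so the position is unambiguous.

```python
import torch

example = torch.tensor([[5, 9, 4],
                        [8, 3, 3],
                        [1, 0, 7]])

n_cols = example.shape[1]

flat_max = torch.argmax(example).item()    # index into the flattened tensor
max_row, max_col = divmod(flat_max, n_cols)  # row = index // n_cols, col = index % n_cols

flat_min = torch.argmin(example).item()
min_row, min_col = divmod(flat_min, n_cols)

print("max at", (max_row, max_col), "value", example[max_row, max_col].item())
print("min at", (min_row, min_col), "value", example[min_row, min_col].item())
```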


Mean, Median, Mode, Variance, Standard Deviation

In [77]:
# mean : returns a zero-dimensional (scalar) tensor

v = torch.randn((1,5))
print(v)

print("Mean: ", torch.mean(v))

type(torch.mean(v)), torch.mean(v).ndim

tensor([[ 0.0476, -1.1322, -0.0179,  0.1280, -0.5552]])
Mean:  tensor(-0.3059)


(torch.Tensor, 0)

In [92]:
# median : middle value of the sorted entries (the lower of the two middle values when the count is even)
v = torch.randint(0, 10, (1,5))
print(v)

print("Median: ", torch.median(v))

tensor([[1, 5, 9, 1, 2]])
Median:  tensor(2)


In [86]:
# mode : most frequent value in tensor
t = torch.randint(5, 10, (1,10))
print(t)

value, index = torch.mode(t)
value, index

tensor([[5, 6, 8, 7, 6, 5, 7, 8, 6, 6]])


(tensor([6]), tensor([1]))

In [89]:
# variance and standard deviation (unbiased estimates by default)
t = torch.rand(1,10)
print(t)
print("Variance: ", torch.var(t))
print("Standard Deviation: ", torch.std(t))

tensor([[0.1033, 0.0893, 0.4562, 0.7100, 0.4855, 0.2465, 0.5114, 0.0300, 0.1466,
         0.1672]])
Variance:  tensor(0.0524)
Standard Deviation:  tensor(0.2289)
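By default `torch.var` and `torch.std` divide by `n - 1` (the unbiased estimate). The sketch below, on a small illustrative tensor, compares that with the population variance, which divides by `n`. It assumes a recent PyTorch version where the `correction` keyword is available.

```python
import torch

data = torch.tensor([1.0, 2.0, 3.0, 4.0])
n = data.numel()
squared_deviations = (data - data.mean()) ** 2

sample_var = torch.var(data)                    # divides by n - 1 (default)
population_var = torch.var(data, correction=0)  # divides by n

# The same quantities computed by hand
manual_sample_var = squared_deviations.sum() / (n - 1)
manual_population_var = squared_deviations.sum() / n

print("sample variance    :", sample_var.item(), manual_sample_var.item())
print("population variance:", population_var.item(), manual_population_var.item())
print("std is sqrt of var :", torch.allclose(torch.std(data), sample_var.sqrt()))
```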


Comparison
- '>'
- '<'
- '>='
- '<='

In [110]:
torch.manual_seed(2)

x = torch.rand((1, 5))
y = torch.rand((1, 5))
print(x)
print(y)
print("--------------")
print(x > y)
print( x < y)
print( x >= y)
print( x <= y)

# mask
print( x[ x > y ])
print( x[ x < y ])

tensor([[0.6147, 0.3810, 0.6371, 0.4745, 0.7136]])
tensor([[0.6190, 0.4425, 0.0958, 0.6142, 0.0573]])
--------------
tensor([[False, False,  True, False,  True]])
tensor([[ True,  True, False,  True, False]])
tensor([[False, False,  True, False,  True]])
tensor([[ True,  True, False,  True, False]])
tensor([0.6371, 0.7136])
tensor([0.6147, 0.3810, 0.4745])
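A boolean mask produced by a comparison can do more than select entries. As a small sketch with illustrative values, `torch.where` uses the mask to choose element-wise between two tensors, and summing the mask counts how many entries satisfy the condition.

```python
import torch

x = torch.tensor([0.6, 0.3, 0.7])
y = torch.tensor([0.5, 0.4, 0.1])

mask = x > y  # boolean tensor, True where x is larger

# Pick from x where the mask is True, otherwise from y (element-wise maximum here)
larger = torch.where(mask, x, y)

# True counts as 1 when summed
count = mask.sum().item()

print(mask)
print(larger)
print("entries where x > y:", count)
```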


Special Tensor Functions<br>

log(), exp(), sigmoid(), softmax(), relu()

In [120]:
torch.manual_seed(0)

input_tensor = torch.rand((5,5))
print(input_tensor)

tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
        [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
        [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
        [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
        [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]])


In [121]:
print(torch.log(input_tensor))

tensor([[-0.7007, -0.2637, -2.4250, -2.0247, -1.1795],
        [-0.4556, -0.7132, -0.1093, -0.7861, -0.4584],
        [-1.0530, -0.9120, -3.8020, -1.7787, -1.2246],
        [-0.6568, -0.3600, -0.2231, -1.8262, -1.2649],
        [-0.3833, -0.0886, -0.9236, -0.1345, -0.8689]])


In [122]:
print(torch.exp(input_tensor))

tensor([[1.6426, 2.1559, 1.0925, 1.1411, 1.3599],
        [1.8853, 1.6325, 2.4509, 1.5772, 1.8819],
        [1.4175, 1.4944, 1.0226, 1.1840, 1.3416],
        [1.6795, 2.0091, 2.2256, 1.1747, 1.3261],
        [1.9771, 2.4973, 1.4875, 2.3969, 1.5211]])


In [123]:
print(torch.sigmoid(input_tensor))

tensor([[0.6216, 0.6831, 0.5221, 0.5330, 0.5763],
        [0.6534, 0.6201, 0.7102, 0.6120, 0.6530],
        [0.5863, 0.5991, 0.5056, 0.5421, 0.5729],
        [0.6268, 0.6677, 0.6900, 0.5402, 0.5701],
        [0.6641, 0.7141, 0.5980, 0.7056, 0.6033]])


In [125]:
print(torch.softmax(input_tensor, dim=0))

tensor([[0.1910, 0.2202, 0.1320, 0.1527, 0.1830],
        [0.2192, 0.1668, 0.2960, 0.2110, 0.2533],
        [0.1648, 0.1527, 0.1235, 0.1584, 0.1806],
        [0.1953, 0.2052, 0.2688, 0.1572, 0.1785],
        [0.2298, 0.2551, 0.1797, 0.3207, 0.2047]])


In [126]:
print(torch.relu(input_tensor))

tensor([[0.4963, 0.7682, 0.0885, 0.1320, 0.3074],
        [0.6341, 0.4901, 0.8964, 0.4556, 0.6323],
        [0.3489, 0.4017, 0.0223, 0.1689, 0.2939],
        [0.5185, 0.6977, 0.8000, 0.1610, 0.2823],
        [0.6816, 0.9152, 0.3971, 0.8742, 0.4194]])
