# Key PyTorch Components
## torch.nn
Contains the building blocks for neural networks in PyTorch, such as layers, activation functions, and loss functions. You combine these modules to define a model; when the model runs, autograd records the operations in a computational graph so that gradients can be computed.

## torch.nn.Parameter
A subclass of torch.Tensor that is automatically registered as a parameter when assigned as an attribute of an nn.Module, so it shows up in module.parameters(). If requires_grad=True (the default), gradients are automatically computed for it during backpropagation.
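
As a minimal sketch of this registration behavior, the example below compares an nn.Parameter attribute with a plain tensor attribute. The `Scale` module and its attribute names are hypothetical, chosen only for illustration.

```python
import torch
import torch.nn as nn

class Scale(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # Assigned as an nn.Parameter, so it is registered automatically
        self.weight = nn.Parameter(torch.ones(3))
        # A plain tensor attribute is not registered as a parameter
        self.offset = torch.zeros(3)

    def forward(self, x):
        return x * self.weight + self.offset

module = Scale()
param_names = [name for name, _ in module.named_parameters()]
print(param_names)                  # only 'weight' is listed
print(module.weight.requires_grad)  # parameters track gradients by default
```

Only `weight` would be handed to an optimizer through `module.parameters()`; `offset` stays fixed during training.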

## torch.nn.Module
The base class for all neural network modules in PyTorch. It provides a way to encapsulate parameters, helpers for moving them to GPUs, exporting, loading, and more. All custom models should subclass nn.Module and implement the forward() method.

## torch.optim
Contains optimization algorithms (optimizers) used to update the parameters of a model based on the gradients computed during backpropagation. Optimizers adjust the weights to minimize the loss function.
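
To make the update concrete, here is a small sketch of a single SGD step on a hand-made tensor. The tensor `w`, the loss, and the learning rate are assumptions picked so the arithmetic is easy to follow.

```python
import torch

# A single trainable tensor standing in for a model's weights
w = torch.tensor([1.0, 2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()  # d(loss)/dw = 2 * w = [2.0, 4.0]
loss.backward()        # fills w.grad
optimizer.step()       # w <- w - lr * w.grad

print(w.grad)  # tensor([2., 4.])
print(w)       # values are now approximately [0.8, 1.6]
```

Plain SGD subtracts the learning rate times the gradient from each parameter; other optimizers in torch.optim use the same `step()` interface with different update rules.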

## forward()
All subclasses of nn.Module must implement the forward() method, which defines how the model processes input data and produces output. This is where the actual computation happens.
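
The sketch below shows the smallest possible case: a module with no parameters whose forward() doubles its input. The `Doubler` class is hypothetical and exists only to show that calling the module runs forward().

```python
import torch
import torch.nn as nn

class Doubler(nn.Module):  # hypothetical module, for illustration only
    def forward(self, x):
        # The computation the module performs on its input
        return 2 * x

doubler = Doubler()
x = torch.tensor([1.0, 2.0, 3.0])

# Call the module itself rather than doubler.forward(x);
# nn.Module.__call__ runs forward() along with any registered hooks.
out = doubler(x)
print(out)  # tensor([2., 4., 6.])
```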

# Tensors

Tensors are multi-dimensional arrays that store data.

They generalize scalars, vectors, and matrices to higher dimensions and are the foundation of machine learning frameworks like PyTorch and TensorFlow.

Scalar (0D tensor): A single number, e.g., 5.

Vector (1D tensor): A list of numbers, e.g., [1, 2, 3].

Matrix (2D tensor): A table of numbers, e.g., [[1, 2], [3, 4]].

Higher-dimensional tensors (3D, 4D, etc.): Extend matrices to more dimensions, e.g., a batch of images (4D).
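
The cases above can be checked with `ndim` and `shape`. This is a short sketch; the image batch layout (batch, channels, height, width) and its sizes are assumptions used for illustration.

```python
import torch

scalar = torch.tensor(5)                 # 0D tensor
vector = torch.tensor([1, 2, 3])         # 1D tensor
matrix = torch.tensor([[1, 2], [3, 4]])  # 2D tensor
# Assumed layout for a batch of images: (batch, channels, height, width)
images = torch.zeros(8, 3, 32, 32)       # 4D tensor

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("images", images)]:
    print(name, t.ndim, tuple(t.shape))
```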

Combining tensors is necessary for batch processing, organization, and efficient computation.

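# View

view() returns a tensor with a new shape that shares the same underlying data, so it can flatten a matrix or rearrange its dimensions without copying.

It only works when the tensor's memory layout is compatible with the new shape, which in practice usually means the tensor is contiguous in memory.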

In [33]:
import torch

tensor_random=torch.rand(4,4)
print(tensor_random)

print(tensor_random.view(16))
print(tensor_random.view(2,8))

tensor([[0.4456, 0.8219, 0.2376, 0.3831],
        [0.8598, 0.6074, 0.0059, 0.1892],
        [0.8744, 0.6607, 0.1270, 0.2809],
        [0.9162, 0.5762, 0.1097, 0.4148]])
tensor([0.4456, 0.8219, 0.2376, 0.3831, 0.8598, 0.6074, 0.0059, 0.1892, 0.8744,
        0.6607, 0.1270, 0.2809, 0.9162, 0.5762, 0.1097, 0.4148])
tensor([[0.4456, 0.8219, 0.2376, 0.3831, 0.8598, 0.6074, 0.0059, 0.1892],
        [0.8744, 0.6607, 0.1270, 0.2809, 0.9162, 0.5762, 0.1097, 0.4148]])


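# Reshape

reshape() is more flexible than view().

It works on both contiguous and non-contiguous tensors: it returns a view when possible and copies the data otherwise.

If the tensor is contiguous and you want a result that is guaranteed to share memory, use view().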
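
The difference only shows up on a non-contiguous tensor. The sketch below uses a transposed tensor as an assumed example of one: view() raises an error, while reshape() succeeds by copying.

```python
import torch

x = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
xt = x.t()                         # transpose shares storage but is non-contiguous
print(xt.is_contiguous())          # False

try:
    xt.view(6)                     # incompatible memory layout for a flat view
    view_failed = False
except RuntimeError:
    view_failed = True
print(view_failed)                 # True

flat = xt.reshape(6)               # works; copies the data here
flat_via_view = xt.contiguous().view(6)  # explicit copy, then view
print(flat)                        # tensor([0, 3, 1, 4, 2, 5])
```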


In [34]:
#Reshape
tensor_random=torch.rand(4,4)
print(tensor_random)

print(tensor_random.reshape(16))
print(tensor_random.reshape(2,8))


tensor([[0.9698, 0.3375, 0.9843, 0.4118],
        [0.5258, 0.3957, 0.1894, 0.1267],
        [0.2521, 0.8703, 0.6007, 0.9123],
        [0.1984, 0.6291, 0.9960, 0.1445]])
tensor([0.9698, 0.3375, 0.9843, 0.4118, 0.5258, 0.3957, 0.1894, 0.1267, 0.2521,
        0.8703, 0.6007, 0.9123, 0.1984, 0.6291, 0.9960, 0.1445])
tensor([[0.9698, 0.3375, 0.9843, 0.4118, 0.5258, 0.3957, 0.1894, 0.1267],
        [0.2521, 0.8703, 0.6007, 0.9123, 0.1984, 0.6291, 0.9960, 0.1445]])


# Stack

torch.stack() combines tensors of the same shape along a new dimension, for example to build a batch from individual samples.

In [35]:
random_stack=torch.stack([tensor_random,tensor_random,tensor_random,tensor_random])
print(random_stack)

tensor([[[0.9698, 0.3375, 0.9843, 0.4118],
         [0.5258, 0.3957, 0.1894, 0.1267],
         [0.2521, 0.8703, 0.6007, 0.9123],
         [0.1984, 0.6291, 0.9960, 0.1445]],

        [[0.9698, 0.3375, 0.9843, 0.4118],
         [0.5258, 0.3957, 0.1894, 0.1267],
         [0.2521, 0.8703, 0.6007, 0.9123],
         [0.1984, 0.6291, 0.9960, 0.1445]],

        [[0.9698, 0.3375, 0.9843, 0.4118],
         [0.5258, 0.3957, 0.1894, 0.1267],
         [0.2521, 0.8703, 0.6007, 0.9123],
         [0.1984, 0.6291, 0.9960, 0.1445]],

        [[0.9698, 0.3375, 0.9843, 0.4118],
         [0.5258, 0.3957, 0.1894, 0.1267],
         [0.2521, 0.8703, 0.6007, 0.9123],
         [0.1984, 0.6291, 0.9960, 0.1445]]])


In [36]:
random_stack = torch.randn(2, 3)
print(random_stack)
print(random_stack.shape)


random_stack = torch.tensor([[ 0.5174, -1.6801, -1.7602],
                             [ 1.2056, -0.1794,  0.9064]])
random_stack

tensor([[ 1.8234,  0.8921, -0.5843],
        [-0.6360, -0.3497, -0.8274]])
torch.Size([2, 3])


tensor([[ 0.5174, -1.6801, -1.7602],
        [ 1.2056, -0.1794,  0.9064]])

In [37]:
# By default, tensors are stacked along dimension 0
print("Dimension 0 or default")
random_stack_dim_default = torch.stack((random_stack, random_stack))
print(random_stack_dim_default)
print(random_stack_dim_default.shape)

Dimension 0 or default
tensor([[[ 0.5174, -1.6801, -1.7602],
         [ 1.2056, -0.1794,  0.9064]],

        [[ 0.5174, -1.6801, -1.7602],
         [ 1.2056, -0.1794,  0.9064]]])
torch.Size([2, 2, 3])
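
For comparison, the sketch below stacks along other dimensions and contrasts the result with torch.cat, which joins along an existing dimension instead of adding a new one. The tensors `a` and `b` are placeholders chosen for illustration.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

stacked_dim0 = torch.stack((a, b), dim=0)  # new axis in front
stacked_dim1 = torch.stack((a, b), dim=1)  # new axis in the middle
stacked_dim2 = torch.stack((a, b), dim=2)  # new axis at the end
concatenated = torch.cat((a, b), dim=0)    # no new axis; rows are appended

print(stacked_dim0.shape)  # torch.Size([2, 2, 3])
print(stacked_dim1.shape)  # torch.Size([2, 2, 3])
print(stacked_dim2.shape)  # torch.Size([2, 3, 2])
print(concatenated.shape)  # torch.Size([4, 3])
```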


# PyTorch Intermediate

In [38]:
import torch
import torch.nn as nn

In [39]:
# Input data
X= torch.tensor([[1.0],[4.0],[7.0]])

# Output data (Target)
Y= torch.tensor([[2.0],[8.0],[11.0]])

# Defining the model class with nn.Module

Class name: LinearRegressionModel, which inherits from nn.Module.

__init__ is the constructor, where the layers and parameters of the module are defined.

super().__init__() initializes the base class nn.Module.

self.linear holds an nn.Linear layer, which applies the linear transformation y = w·x + b.
in_features = 1 is the number of input features per sample, and out_features is the number of outputs we want per sample.

forward() defines the forward pass, where the computation happens.

self.linear(x) applies the linear transformation to the input; the model then passes the result through nn.ReLU before returning it.





In [74]:
class LinearRegressionModel(nn.Module): # inherits from nn.Module
  def __init__(self):
    super(LinearRegressionModel, self).__init__()
    # Define the model's layers
    self.linear= nn.Linear(in_features=1, out_features=1)
    self.relu = nn.ReLU()

  def forward(self, x):
    return self.relu(self.linear(x))


In [75]:
model = LinearRegressionModel()

# Loss Function

The loss function guides the optimization process by providing feedback to the model about its performance, enabling it to adjust parameters (via backpropagation) and improve predictions.
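
As a small illustration of what the loss value means, the sketch below compares nn.MSELoss with the mean of squared differences computed by hand. The prediction values are made up for the example; the targets match the Y tensor used in this notebook.

```python
import torch
import torch.nn as nn

# Made-up predictions for illustration; targets match Y above
y_pred = torch.tensor([[2.5], [7.0], [11.5]])
y_true = torch.tensor([[2.0], [8.0], [11.0]])

criterion = nn.MSELoss()
loss = criterion(y_pred, y_true)

# Mean squared error by hand: (0.25 + 1.0 + 0.25) / 3 = 0.5
manual = ((y_pred - y_true) ** 2).mean()

print(loss.item(), manual.item())
```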

In [76]:
# Define the loss function

criterion= nn.MSELoss()

# Optimizer: updates the parameters to reduce the loss, here with stochastic gradient descent (SGD)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)


In [77]:
num_epoch = 10

for epoch in range(num_epoch):
  y_pred = model(X)

  # Compute loss
  loss = criterion(y_pred, Y)

  optimizer.zero_grad() # Clearing the old gradients

  loss.backward()  # Backward pass : compute gradients

  optimizer.step() # Update the parameters

  print(f"Epoch {epoch+1}/{num_epoch}, loss:{loss.item() : .4f}")


Epoch 1/10, loss: 19.6374
Epoch 2/10, loss: 6.1917
Epoch 3/10, loss: 2.1939
Epoch 4/10, loss: 1.0053
Epoch 5/10, loss: 0.6518
Epoch 6/10, loss: 0.5467
Epoch 7/10, loss: 0.5155
Epoch 8/10, loss: 0.5061
Epoch 9/10, loss: 0.5034
Epoch 10/10, loss: 0.5025


In [78]:
print(f"Model weights {model.linear.weight.data}")
print(f"Model bias {model.linear.bias.data}")

Model weights tensor([[1.5142]])
Model bias tensor([0.9111])


# Test the model

#### torch.no_grad()
During inference or evaluation, we don’t need gradients since we’re not training the model or updating weights. Wrapping the code in torch.no_grad() disables gradient tracking, which saves memory and speeds up computation.
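
The effect is visible on the `requires_grad` flag of the result. This is a minimal sketch with placeholder tensors `w` and `x`, not the model from this notebook.

```python
import torch

w = torch.tensor([2.0], requires_grad=True)
x = torch.tensor([3.0])

tracked = w * x        # recorded by autograd
with torch.no_grad():
    untracked = w * x  # same value, but no graph is built

print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
```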

In [79]:
with torch.no_grad():    # disable gradient tracking; we only compute predictions
  predicted = model(X)
  print("Input value: ", X[:, 0])
  print("Input value: ", X.squeeze()) # squeeze() drops the size-1 dimension
  print("Input value: ", X[:,0].numpy()) # numpy() converts the tensor to a NumPy array
  print("=="*20)
  print("Predicted value: ", predicted.squeeze().numpy())
  print("Actual value: ", Y.squeeze().numpy())

Input value:  tensor([1., 4., 7.])
Input value:  tensor([1., 4., 7.])
Input value:  [1. 4. 7.]
========================================
Predicted value:  [ 2.4252748  6.9678636 11.510452 ]
Actual value:  [ 2.  8. 11.]


* Training Loop: Iterates over the dataset multiple times (epochs) to train the model.
* y_pred = model(X): Performs the forward pass by calling the model on the input data. This invokes the forward() method.
* loss = criterion(y_pred, Y): Computes the loss between the predicted values and actual values.
* optimizer.zero_grad(): Clears old gradients before computing new ones.
* loss.backward(): Performs backpropagation to compute gradients of the loss w.r.t. model parameters.
* optimizer.step(): Updates the model parameters using the computed gradients.
* Progress Printing: After every epoch, prints the current loss to monitor training progress.
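
The reason for clearing gradients in each iteration is that backward() adds to the existing .grad values instead of overwriting them. The sketch below demonstrates this on a single placeholder tensor `w`, zeroing the gradient directly rather than through an optimizer.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)

(w * 3).sum().backward()
first = w.grad.clone()        # gradient of 3 * w is 3

(w * 3).sum().backward()      # no zeroing in between: gradients accumulate
accumulated = w.grad.clone()  # 3 + 3 = 6

w.grad.zero_()                # what optimizer.zero_grad() achieves for each parameter
(w * 3).sum().backward()
after_reset = w.grad.clone()  # back to 3

print(first, accumulated, after_reset)
```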