
#Code Flow
*1. Load the dataset*

*2. Basic Preprocessing*

*3. Training Process*

    *a. Create the model*
    *b. Forward pass*
    *c. Loss computation*
    *d. Backpropagation*
    *e. Parameter update*
*4. Model Evaluation*

#Improvements
*1. Building the neural networks using nn module*

*2. Using built-in activation functions*

*3. Using built-in loss function*

*4. Using built-in Optimizer*

#What is nn Module ?

*The nn module is a core PyTorch library that provides a wide array of classes and functions for building neural networks efficiently.*

*It abstracts much of the complexity of creating and training neural networks by offering prebuilt layers, activation functions, loss functions, and other utilities, so we can focus on designing and experimenting with different model architectures.*

**Key components of torch.nn:**

**1. Modules (Layers):**

*nn.Module: The base class for all neural network modules. Our custom models and layers should subclass it.*

*Common layers: Includes layers like nn.Linear(Fully connected layer), nn.Conv2d(Convolutional Layer), nn.LSTM(recurrent layer), and many others*

**2. Activation functions:**

*Functions like nn.ReLU, nn.Sigmoid and nn.Tanh introduce non-linearities into the model, allowing it to learn complex patterns*

**3. Loss Functions:**

*Provides loss functions such as nn.MSELoss, nn.CrossEntropyLoss and nn.NLLLoss to quantify the difference between the model's predictions and the actual targets*

**4. Container Modules:**

*nn.Sequential : A sequential container to stack the layers in order*

**5. Regularization and Dropout:**

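*nn.Dropout helps prevent overfitting by randomly zeroing activations during training. nn.BatchNorm2d normalizes activations, which stabilizes training and can also have a mild regularizing effect. Both can improve the model's ability to generalize to new data.*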
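
*As a rough illustration of how these components fit together, the sketch below combines a container, two layers, an activation, dropout, and a loss function. The layer sizes, the dropout rate, the seed, and the random data are arbitrary choices for demonstration and are not taken from this notebook.*

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # Arbitrary seed so the sketch is repeatable

# Container module: stacks layers in the order they are listed
net = nn.Sequential(
    nn.Linear(4, 8),    # Fully connected layer: 4 inputs -> 8 outputs
    nn.ReLU(),          # Built-in activation function
    nn.Dropout(p=0.5),  # Randomly zeroes activations, active in training mode only
    nn.Linear(8, 1),    # Output layer
)

loss_fn = nn.MSELoss()  # Built-in loss function

x = torch.rand(16, 4)       # Fake batch: 16 samples, 4 features
target = torch.rand(16, 1)  # Fake regression targets

net.eval()  # Evaluation mode disables dropout, so the forward pass is deterministic
pred = net(x)
loss = loss_fn(pred, target)

print(pred.shape)  # torch.Size([16, 1])
print(loss.item())
```

*Calling net.train() would switch dropout back on, which is the mode used while fitting the model.*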

In [1]:
#Create model class
import torch
import torch.nn as nn

In [2]:
import torch
import torch.nn as nn

class Model(nn.Module):  # We have to inherit our model from nn.Module
    # Build a constructor
    def __init__(self, num_features):  # We have to specify the number of input features
        # Invoke the base class constructor
        super().__init__()
        # Specify the layers
        self.linear = nn.Linear(num_features, 1)
        self.sigmoid = nn.Sigmoid()  # You can also use this in the forward method

    def forward(self, input_features):  # Self is used because forward is a method
        z = self.linear(input_features)  # No need to define weights and biases, just pass the input_features to the Linear layer
        out = self.sigmoid(z)  # Use the sigmoid layer defined in __init__, instead of creating a new one
        return out


In [3]:
#Creating a fake dataset
features = torch.rand(10, 5)

#Create the model
model = Model(features.shape[1]) #We have to pass the number of features, this will create the neural network

#Perform the forward pass
model(features) #This will automatically trigger the forward method in the model

tensor([[0.6126],
        [0.6123],
        [0.5152],
        [0.5415],
        [0.6522],
        [0.6158],
        [0.4797],
        [0.5212],
        [0.5896],
        [0.5598]], grad_fn=<SigmoidBackward0>)

In [4]:
#We can see the model weights and biases
#(a notebook cell displays only its last expression, so only the bias appears below)
model.linear.weight
model.linear.bias

Parameter containing:
tensor([-0.0761], requires_grad=True)

#Visualizing the model

In [5]:
!pip install torchinfo

Collecting torchinfo
  Downloading torchinfo-1.8.0-py3-none-any.whl.metadata (21 kB)
Downloading torchinfo-1.8.0-py3-none-any.whl (23 kB)
Installing collected packages: torchinfo
Successfully installed torchinfo-1.8.0


In [6]:
from torchinfo import summary

summary(model, input_size = (10, 5))

Layer (type:depth-idx)                   Output Shape              Param #
Model                                    [10, 1]                   --
├─Linear: 1-1                            [10, 1]                   6
├─Sigmoid: 1-2                           [10, 1]                   --
Total params: 6
Trainable params: 6
Non-trainable params: 0
Total mult-adds (M): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00

*Let us create a neural network with 5 input neurons, one hidden layer of 3 neurons, and 1 output neuron*

In [7]:
class Model_1(nn.Module):

  def __init__(self, num_features):

    super().__init__()
    self.linear1 = nn.Linear(num_features, 3) #First hidden layer, takes 5 input features and outputs 3
    self.ReLU = nn.ReLU() #Applying ReLU to the outputs of first hidden layer

    #Output layer
    self.linear2 = nn.Linear(3, 1) #Takes 3 inputs from the previous layer and outputs a single value
    self.sigmoid = nn.Sigmoid() #Using Sigmoid AF at the output layer

  def forward(self, features):
    out = self.linear1(features)
    out = self.ReLU(out)

    out = self.linear2(out)
    out = self.sigmoid(out)

    return out

In [8]:
#Create a fake dataset
features = torch.rand(10,5)
#Create an instance of the model
model = Model_1(features.shape[1])

#Perform the forward pass
model(features)

tensor([[0.6086],
        [0.6087],
        [0.6074],
        [0.6087],
        [0.6087],
        [0.6020],
        [0.6087],
        [0.6087],
        [0.6087],
        [0.6086]], grad_fn=<SigmoidBackward0>)

In [9]:
#Seeing the weights and biases (only the last expression, linear2.bias, is displayed below)
model.linear1.weight
model.linear2.weight

model.linear1.bias
model.linear2.bias

Parameter containing:
tensor([0.4417], requires_grad=True)


In [11]:
from torchinfo import summary

summary(model, input_size = (10,5))

Layer (type:depth-idx)                   Output Shape              Param #
Model_1                                  [10, 1]                   --
├─Linear: 1-1                            [10, 3]                   18
├─ReLU: 1-2                              [10, 3]                   --
├─Linear: 1-3                            [10, 1]                   4
├─Sigmoid: 1-4                           [10, 1]                   --
Total params: 22
Trainable params: 22
Non-trainable params: 0
Total mult-adds (M): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
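
*The parameter counts reported in the summaries can be checked by hand: a Linear layer with n inputs and m outputs holds n x m weights plus m biases. The sketch below rebuilds the two architectures with plain nn layers and counts their parameters. The helper name count_parameters is a hypothetical name introduced here for illustration.*

```python
import torch.nn as nn

def count_parameters(module: nn.Module) -> int:
    """Sum the number of elements across all trainable parameters."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Single-neuron model: 5 weights + 1 bias
single_layer = nn.Linear(5, 1)

# Two-layer model matching the 5 -> 3 -> 1 architecture
two_layer = nn.Sequential(
    nn.Linear(5, 3),  # 5*3 weights + 3 biases = 18
    nn.ReLU(),        # No parameters
    nn.Linear(3, 1),  # 3*1 weights + 1 bias = 4
    nn.Sigmoid(),     # No parameters
)

print(count_parameters(single_layer))  # 6
print(count_parameters(two_layer))     # 22
```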

*Writing out the forward pass layer by layer becomes cumbersome for a deep neural network, so we can use a container module, nn.Sequential, to chain the layers.*

In [12]:
import torch
import torch.nn as nn

In [13]:
class Model_2(nn.Module):

  def __init__(self, num_features):

    super().__init__()
    self.network = nn.Sequential( #We have to define the structure of the network
        nn.Linear(num_features, 3),
        nn.ReLU(),
        nn.Linear(3,1),
        nn.Sigmoid()
    )

  def forward(self, features):
    out = self.network(features)
    return out

In [14]:
#Define a fake dataset
features = torch.rand(10, 5) #We need 5 features, so number of columns should be 5

#Create the model instance
model = Model_2(features.shape[1])

model(features)

tensor([[0.5794],
        [0.5781],
        [0.5812],
        [0.5899],
        [0.5683],
        [0.5852],
        [0.5748],
        [0.5742],
        [0.5908],
        [0.5806]], grad_fn=<SigmoidBackward0>)
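
*A Sequential container still gives access to its individual layers. The sketch below, which assumes the same 5 -> 3 -> 1 layout on random input, indexes into the container and confirms that applying the layers one at a time matches calling the container directly.*

```python
import torch
import torch.nn as nn

network = nn.Sequential(
    nn.Linear(5, 3),
    nn.ReLU(),
    nn.Linear(3, 1),
    nn.Sigmoid(),
)

# Layers can be accessed by position
first_linear = network[0]
print(first_linear.weight.shape)  # torch.Size([3, 5])
print(len(network))               # 4

# Applying the layers in order reproduces the container's forward pass
x = torch.rand(10, 5)
manual = x
for layer in network:
    manual = layer(manual)

print(torch.allclose(manual, network(x)))  # True
```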

In [15]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import LabelEncoder

In [16]:
#Breast cancer dataset
df = pd.read_csv('https://raw.githubusercontent.com/gscdit/Breast-Cancer-Detection/refs/heads/master/data.csv')
df.sample(5)

Unnamed: 0,id,diagnosis,radius_mean,texture_mean,perimeter_mean,area_mean,smoothness_mean,compactness_mean,concavity_mean,concave points_mean,...,texture_worst,perimeter_worst,area_worst,smoothness_worst,compactness_worst,concavity_worst,concave points_worst,symmetry_worst,fractal_dimension_worst,Unnamed: 32
557,925236,B,9.423,27.88,59.26,271.3,0.08123,0.04971,0.0,0.0,...,34.24,66.5,330.6,0.1073,0.07158,0.0,0.0,0.2475,0.06969,
242,883852,B,11.3,18.19,73.93,389.4,0.09592,0.1325,0.1548,0.02854,...,27.96,87.16,472.9,0.1347,0.4848,0.7436,0.1218,0.3308,0.1297,
220,8812816,B,13.65,13.16,87.88,568.9,0.09646,0.08711,0.03888,0.02563,...,16.35,99.71,706.2,0.1311,0.2474,0.1759,0.08056,0.238,0.08718,
313,893988,B,11.54,10.72,73.73,409.1,0.08597,0.05969,0.01367,0.008907,...,12.87,81.23,467.8,0.1092,0.1626,0.08324,0.04715,0.339,0.07434,
131,8670,M,15.46,19.48,101.7,748.9,0.1092,0.1223,0.1466,0.08087,...,26.0,124.9,1156.0,0.1546,0.2394,0.3791,0.1514,0.2837,0.08019,


In [17]:
df.shape #569 rows and 33 columns

(569, 33)

In [18]:
df.isna().sum()

Unnamed: 0,0
id,0
diagnosis,0
radius_mean,0
texture_mean,0
perimeter_mean,0
area_mean,0
smoothness_mean,0
compactness_mean,0
concavity_mean,0
concave points_mean,0


In [19]:
df.columns

Index(['id', 'diagnosis', 'radius_mean', 'texture_mean', 'perimeter_mean',
       'area_mean', 'smoothness_mean', 'compactness_mean', 'concavity_mean',
       'concave points_mean', 'symmetry_mean', 'fractal_dimension_mean',
       'radius_se', 'texture_se', 'perimeter_se', 'area_se', 'smoothness_se',
       'compactness_se', 'concavity_se', 'concave points_se', 'symmetry_se',
       'fractal_dimension_se', 'radius_worst', 'texture_worst',
       'perimeter_worst', 'area_worst', 'smoothness_worst',
       'compactness_worst', 'concavity_worst', 'concave points_worst',
       'symmetry_worst', 'fractal_dimension_worst', 'Unnamed: 32'],
      dtype='object')

In [20]:
#The 'id' and 'Unnamed: 32' columns are not required
df = df.drop(columns=['id', 'Unnamed: 32']) #Passing columns= (rather than labels=) means no axis argument is needed

#Train test split

In [21]:
X_train, X_test, y_train, y_test = train_test_split(df.iloc[:, 1:], df.iloc[:, 0], test_size=0.2)

#Scaling

In [22]:
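scaler = StandardScaler() #Initializing the standard scaler
X_train = scaler.fit_transform(X_train) #Learn the mean and standard deviation from the training data only
X_test = scaler.transform(X_test) #Reuse the training statistics; refitting on the test set would scale it differently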
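
*Scaling is usually fitted on the training split only, and the same statistics are then applied to the test split. The sketch below illustrates this on synthetic data. The array names, sizes, and distribution parameters are made up for the example.*

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)  # Arbitrary seed
X_train_demo = rng.normal(loc=10.0, scale=2.0, size=(100, 3))
X_test_demo = rng.normal(loc=10.0, scale=2.0, size=(20, 3))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train_demo)  # Learns mean_ and scale_ from training data
X_test_scaled = scaler.transform(X_test_demo)        # Applies the training mean_ and scale_

# The training split is standardized exactly; the test split only approximately,
# because it is scaled with statistics it did not contribute to
print(X_train_scaled.mean(axis=0).round(6))
print(X_test_scaled.mean(axis=0).round(3))

# transform() is just (x - training mean) / training std
manual = (X_test_demo - scaler.mean_) / scaler.scale_
print(np.allclose(manual, X_test_scaled))  # True
```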

In [23]:
X_train

array([[-0.5585858 , -0.34406657, -0.59336318, ..., -0.86433584,
        -1.05833725, -0.56588278],
       [ 0.6708427 ,  0.32722077,  0.68036591, ...,  0.68502353,
         0.34921471, -0.11890953],
       [ 1.47342805,  0.46755959,  1.41108647, ...,  1.0058599 ,
        -0.53773584, -0.99278564],
       ...,
       [ 0.43637957,  0.15413623,  0.48346515, ...,  1.52293601,
         0.48418544,  0.96055259],
       [ 1.65679025,  0.91196585,  1.6429918 , ...,  1.0619284 ,
        -0.49756598, -0.4432906 ],
       [-0.73293018,  1.21135533, -0.74913355, ..., -1.03051662,
         0.57577273, -0.90816449]])

In [24]:
y_train

Unnamed: 0,diagnosis
96,B
328,M
70,M
338,B
313,B
...,...
247,B
14,M
392,M
389,M


*The target labels are strings, M (malignant) and B (benign), so we have to encode them in numeric form*

In [25]:
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_train) #Learns the mapping B -> 0, M -> 1
y_test = encoder.transform(y_test) #Reuse the same mapping for the test labels

In [26]:
y_train

array([0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1,
       0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1,
       1, 1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0,
       0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1,
       1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0,
       0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0,
       0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0,
       0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
       1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0,
       1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1,
       0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 0,
       0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0,
       1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0,
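
*LabelEncoder assigns integer codes in sorted order of the class names, so B maps to 0 and M maps to 1. The sketch below shows the mapping on a few hand-written labels (not rows from the dataset) and how to invert it.*

```python
from sklearn.preprocessing import LabelEncoder

labels_train = ["B", "M", "M", "B"]  # Made-up training labels
labels_test = ["M", "B"]             # Made-up test labels

encoder = LabelEncoder()
y_train_demo = encoder.fit_transform(labels_train)  # Learn the classes and encode
y_test_demo = encoder.transform(labels_test)        # Encode with the learned mapping

print(encoder.classes_)  # ['B' 'M']
print(y_train_demo)      # [0 1 1 0]
print(y_test_demo)       # [1 0]

# Codes can be mapped back to the original labels
print(encoder.inverse_transform([0, 1]))  # ['B' 'M']
```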

*X_train, y_train, X_test and y_test are NumPy arrays, so we need to convert them into PyTorch tensors*

In [27]:
X_train_tensor = torch.from_numpy(X_train)
X_test_tensor = torch.from_numpy(X_test)
y_train_tensor = torch.from_numpy(y_train)
y_test_tensor = torch.from_numpy(y_test)
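
*torch.from_numpy keeps the NumPy dtype, so scaled features arrive as float64 and encoded labels as integers, while nn.Linear and nn.BCELoss work with float32 by default. The sketch below, on small made-up arrays, shows the dtypes and one way to convert them. It also shows that from_numpy shares memory with the source array.*

```python
import numpy as np
import torch

features_np = np.random.rand(4, 3)  # NumPy floats default to float64
labels_np = np.array([0, 1, 1, 0])  # Integer labels (int64 on most platforms)

features_t = torch.from_numpy(features_np)
labels_t = torch.from_numpy(labels_np)
print(features_t.dtype)  # torch.float64
print(labels_t.dtype)    # An integer dtype, typically torch.int64

# Convert to float32 and give the labels a column shape to match a (N, 1) model output
features_f32 = features_t.to(torch.float32)
labels_f32 = labels_t.to(torch.float32).view(-1, 1)
print(features_f32.dtype, labels_f32.shape)  # torch.float32 torch.Size([4, 1])

# from_numpy shares memory: editing the array changes the tensor too
features_np[0, 0] = 42.0
print(features_t[0, 0].item())  # 42.0
```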

#Defining the Model

In [28]:
import math

In [29]:
# class MySimpleNN():

#     def __init__(self, X):
#         # Define weights and biases
#         # Since we have 30 features, and we have only 1 neuron, We need 30 weights and 1 bias...As each neuron will have a bias
#         self.weights = torch.rand(X.shape[1], 1, dtype=torch.float64, requires_grad=True) # requires_grad is set to True, because we are calculating gradients of Loss functions wrt to Weights
#         self.bias = torch.rand(1, dtype=torch.float64, requires_grad=True)

#         # We need 30 weights, each one for each feature ..So weight vector dimension would be (30, 1) vector. 30 is X.shape[1]
#         # We need only 1 bias, i.e; for the single neuron, Since it is a scalar, We kept it as a single number

#     def forward_pass(self, X):  # Step 1: Z = W.X+b......Step 2: Sigmoid(Z)
#         # Calculate Z
#         Z = torch.matmul(X, self.weights) + self.bias
#         # Calculate y_pred
#         y_pred = torch.sigmoid(Z)

#         return y_pred

#     def loss_function(self, y_pred, y):
#         # Clamp predictions to avoid log(0)
#         epsilon = 1e-7
#         y_pred = torch.clamp(y_pred, epsilon, 1 - epsilon)

#         # Calculate loss
#         loss = -(y * torch.log(y_pred) + (1 - y) * torch.log(1 - y_pred)).mean()
#         return loss

In [30]:
class MySimpleNN(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        # Step 1: Define the structure of the network
        self.linear = nn.Linear(num_features, 1)  # Only 1 output neuron
        self.sigmoid = nn.Sigmoid()

    # Define the forward pass
    def forward(self, features):
        out = self.linear(features)  # Linear transformation
        out = self.sigmoid(out)  # Sigmoid activation
        return out

#Defining parameters

In [31]:
learning_rate = 0.1
epochs = 5000

In [32]:
loss_function = nn.BCELoss()
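
*nn.BCELoss computes the same binary cross-entropy that the hand-written loss_function in the commented-out class computed. The sketch below compares the two on three made-up predictions, chosen away from 0 and 1 so no clamping is needed.*

```python
import torch
import torch.nn as nn

y_pred = torch.tensor([[0.9], [0.2], [0.6]])  # Made-up predicted probabilities
y_true = torch.tensor([[1.0], [0.0], [1.0]])  # Made-up binary targets

# Built-in loss (mean reduction by default)
builtin = nn.BCELoss()(y_pred, y_true)

# Manual binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p))
manual = -(y_true * torch.log(y_pred) + (1 - y_true) * torch.log(1 - y_pred)).mean()

print(builtin.item(), manual.item())
print(torch.allclose(builtin, manual))  # True
```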

#Training Pipeline

In [33]:
# # Ensuring they are tensors without requiring gradient computation
# X_train_tensor = X_train_tensor.clone().detach().type(torch.float32)
# y_train_tensor = y_train_tensor.clone().detach().type(torch.float32)

# # Initialize the model
# model = MySimpleNN(X_train_tensor.shape[1])  # Pass the number of features

# # Training loop
# for epoch in range(epochs):
#     # Forward pass
#     y_pred = model(X_train_tensor)

#     # Loss computation
#     loss = loss_function(y_pred, y_train_tensor.view(-1,1))

#     # Backward pass
#     loss.backward()

#     # Parameter update
#     with torch.no_grad():
#         # Update weights and biases
#         model.linear.weight -= learning_rate * model.linear.weight.grad
#         model.linear.bias -= learning_rate * model.linear.bias.grad

#     # Zero gradients to avoid accumulation
#     model.linear.weight.grad.zero_()
#     model.linear.bias.grad.zero_()

#     # Print loss in each epoch
#     if (epoch + 1) % 100 == 0:  # Print every 100 epochs for clarity
#         print(f'Epoch : {epoch + 1}. Loss: {loss.item()}')

*torch.optim is a module in PyTorch that provides a variety of optimization algorithms used to update the parameters of our model during training.*

*It includes common optimizers such as SGD, RMSprop, Adam, and many more.*

*It handles weight updates efficiently and supports additional features such as learning rate scheduling and weight decay (regularization)*

*The model.parameters() method in PyTorch returns an iterator over all trainable parameters (weights and biases) in a model. These parameters are instances of torch.nn.Parameter and include:*

*Weights : The weight matrices of layers like nn.Linear, nn.Conv2d, etc.*

*Biases : The bias terms of layers (if they exist)*

*Autograd computes the gradients of these parameters during loss.backward(); the optimizer then uses those gradients to update the parameters during training*
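
*To make the role of the optimizer concrete, the sketch below checks that one step of plain torch.optim.SGD (no momentum, no weight decay) matches the manual rule parameter = parameter - learning_rate x gradient. The layer size, data, seed, and MSE loss are arbitrary choices for the example.*

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # Arbitrary seed
lr = 0.1
layer = nn.Linear(3, 1)
optimizer = torch.optim.SGD(layer.parameters(), lr=lr)

x = torch.rand(8, 3)
y = torch.rand(8, 1)

loss = nn.MSELoss()(layer(x), y)
optimizer.zero_grad()  # Clear any old gradients
loss.backward()        # Autograd fills .grad on each parameter

# What the manual update rule would produce (computed before stepping)
expected_weight = layer.weight.detach() - lr * layer.weight.grad
expected_bias = layer.bias.detach() - lr * layer.bias.grad

optimizer.step()  # The optimizer applies the update in place

print(torch.allclose(layer.weight, expected_weight))  # True
print(torch.allclose(layer.bias, expected_bias))      # True
```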


In [41]:
# Ensuring they are tensors without requiring gradient computation
X_train_tensor = X_train_tensor.clone().detach().type(torch.float32)
y_train_tensor = y_train_tensor.clone().detach().type(torch.float32)

# Initialize the model
model = MySimpleNN(X_train_tensor.shape[1])  # Pass the number of features

#Set an optimizer
optimizer = torch.optim.SGD(model.parameters(), lr = learning_rate)

# Training loop
for epoch in range(epochs):
    # Forward pass
    y_pred = model(X_train_tensor)

    # Loss computation
    loss = loss_function(y_pred, y_train_tensor.view(-1,1))

    #Previously we had to zero each gradient manually; the optimizer now does it in one call
    optimizer.zero_grad() #Clear old gradients before the backward pass so they do not accumulate

    # Backward pass
    loss.backward()

    #Instead of doing parameter update manually, We will use torch.optim
    #Parameter update
    optimizer.step()

    #Print loss in each epoch
    print(f"Epoch: {epoch + 1}, Loss: {loss.item()}")


Epoch: 1, Loss: 0.7578815817832947
Epoch: 2, Loss: 0.563694417476654
Epoch: 3, Loss: 0.4633199870586395
Epoch: 4, Loss: 0.4030325710773468
Epoch: 5, Loss: 0.36238381266593933
Epoch: 6, Loss: 0.33280307054519653
Epoch: 7, Loss: 0.31010767817497253
Epoch: 8, Loss: 0.29201021790504456
Epoch: 9, Loss: 0.27715057134628296
Epoch: 10, Loss: 0.2646670937538147
Epoch: 11, Loss: 0.2539858818054199
Epoch: 12, Loss: 0.24470899999141693
Epoch: 13, Loss: 0.23655134439468384
Epoch: 14, Loss: 0.22930264472961426
Epoch: 15, Loss: 0.2228042036294937
Epoch: 16, Loss: 0.21693381667137146
Epoch: 17, Loss: 0.2115955799818039
Epoch: 18, Loss: 0.20671306550502777
Epoch: 19, Loss: 0.2022245079278946
Epoch: 20, Loss: 0.198079451918602
Epoch: 21, Loss: 0.19423611462116241
Epoch: 22, Loss: 0.1906595528125763
Epoch: 23, Loss: 0.18732035160064697
Epoch: 24, Loss: 0.1841934472322464
Epoch: 25, Loss: 0.1812574416399002
Epoch: 26, Loss: 0.17849375307559967
Epoch: 27, Loss: 0.17588642239570618
Epoch: 28, Loss: 0.173421

#What is model.parameters() ?

*model.parameters() in PyTorch returns an iterator over all the trainable parameters (weights and biases) in a model. These parameters are instances of torch.nn.Parameter and include:*

*Weights: The weight matrices of layers such as nn.Conv2d, nn.Linear, etc.*

*Biases: The bias terms of layers (if they exist)*

*The optimizer updates these parameters during training, using the gradients computed by loss.backward().*

In [34]:
model.parameters()

<generator object Module.parameters at 0x7b8c0039f290>
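
*Because model.parameters() returns a generator, it has to be iterated to see what it contains. The sketch below uses named_parameters() on a stand-in model with 30 inputs and 1 output. The name model_demo and the use of nn.Sequential are assumptions for the example, so the parameter names differ from those of MySimpleNN.*

```python
import torch.nn as nn

# Stand-in for a single-neuron classifier with 30 input features
model_demo = nn.Sequential(nn.Linear(30, 1), nn.Sigmoid())

# named_parameters() yields (name, parameter) pairs
for name, param in model_demo.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)
# 0.weight (1, 30) True
# 0.bias (1,) True

# parameters() yields the same tensors without names
params = list(model_demo.parameters())
print(len(params))                                       # 2
print(all(isinstance(p, nn.Parameter) for p in params))  # True
```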


In [35]:
model.linear.weight

Parameter containing:
tensor([[ 0.1314,  0.1314, -0.0352, -0.0512,  0.1187, -0.1324,  0.1794, -0.1105,
          0.0558, -0.1486, -0.1308, -0.0586, -0.0630,  0.1524,  0.1525,  0.0010,
          0.1780, -0.0457, -0.1528, -0.0278, -0.0225,  0.1776, -0.1804, -0.0768,
         -0.0864,  0.0577,  0.0301,  0.0161, -0.0501, -0.0651]],
       requires_grad=True)

In [36]:
model.linear.bias

Parameter containing:
tensor([0.0276], requires_grad=True)

#Model Evaluation

In [42]:
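# Device setup: Ensure both model and data are on the same device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move the model to the device (either GPU or CPU) and switch to evaluation mode
model.to(device)
model.eval()

# Convert X_test_tensor and y_test_tensor to float32 and move them to the same device as the model
X_test_tensor = X_test_tensor.to(device).type(torch.float32)
# Reshape the labels to (N, 1) so they line up with the model output;
# comparing (N, 1) with (N,) would broadcast to an (N, N) matrix
y_test_tensor = y_test_tensor.to(device).type(torch.float32).view(-1, 1)

# Model evaluation
with torch.no_grad():
    # Forward pass on the test data
    y_pred = model(X_test_tensor)

    # Apply thresholding to turn probabilities into class labels (0.5 is the usual default)
    y_pred = (y_pred > 0.3856090123673645).float()

    # Accuracy calculation
    accuracy = (y_pred == y_test_tensor).float().mean()

    print(f'Accuracy: {accuracy.item()}')


Accuracy: 0.5369344353675842

*Note: the accuracy shown above was recorded while the labels still had shape (N,), so the comparison was broadcast to an (N, N) matrix. With the labels reshaped to (N, 1), rerunning the cell reports the true test accuracy.*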
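
*Tensor shapes matter in an accuracy computation like this one: comparing predictions of shape (N, 1) with labels of shape (N,) broadcasts to an (N, N) matrix and averages over every pair, which can report a value near 0.5 even for a good model. The sketch below shows the difference on four made-up predictions and labels.*

```python
import torch

y_pred = torch.tensor([[1.0], [0.0], [1.0], [0.0]])  # Shape (4, 1), like a model output
y_true = torch.tensor([1.0, 0.0, 0.0, 0.0])          # Shape (4,), like encoded labels

# Mismatched shapes: every prediction is compared with every label
broadcast = (y_pred == y_true)
print(broadcast.shape)                  # torch.Size([4, 4])
print(broadcast.float().mean().item())  # 0.5

# Matching shapes: each prediction is compared with its own label
aligned = (y_pred == y_true.view(-1, 1))
print(aligned.shape)                    # torch.Size([4, 1])
print(aligned.float().mean().item())    # 0.75
```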
