# **PyTorch Training Pipeline Using `nn.Module`**


> **NOTE:**  
This notebook reworks the breast cancer detection training pipeline from  
`../04 PyTorch Training Pipeline/pytorch_training_pipeline.ipynb` to use PyTorch's `nn.Module`.

## Import Required Libraries
This cell imports PyTorch, NumPy, Pandas, and scikit-learn utilities for data processing.

In [1]:
# Import PyTorch for tensor operations and neural networks
import torch

# Import NumPy for numerical operations
import numpy as np

# Import Pandas for data manipulation
import pandas as pd

# Import train_test_split for splitting data
from sklearn.model_selection import train_test_split
# Import StandardScaler for feature scaling
from sklearn.preprocessing import StandardScaler
# Import LabelEncoder for encoding categorical labels
from sklearn.preprocessing import LabelEncoder

## Load Dataset
This cell loads the breast cancer dataset from a remote CSV file.

In [2]:
df = pd.read_csv(
    'https://raw.githubusercontent.com/gscdit/Breast-Cancer-Detection/refs/heads/master/data.csv')

df.head()

|  | id | diagnosis | radius_mean | texture_mean | perimeter_mean | area_mean | smoothness_mean | compactness_mean | concavity_mean | concave points_mean | ... | texture_worst | perimeter_worst | area_worst | smoothness_worst | compactness_worst | concavity_worst | concave points_worst | symmetry_worst | fractal_dimension_worst | Unnamed: 32 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 842302 | M | 17.99 | 10.38 | 122.8 | 1001.0 | 0.1184 | 0.2776 | 0.3001 | 0.1471 | ... | 17.33 | 184.6 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.1189 | NaN |
| 1 | 842517 | M | 20.57 | 17.77 | 132.9 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | ... | 23.41 | 158.8 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.186 | 0.275 | 0.08902 | NaN |
| 2 | 84300903 | M | 19.69 | 21.25 | 130.0 | 1203.0 | 0.1096 | 0.1599 | 0.1974 | 0.1279 | ... | 25.53 | 152.5 | 1709.0 | 0.1444 | 0.4245 | 0.4504 | 0.243 | 0.3613 | 0.08758 | NaN |
| 3 | 84348301 | M | 11.42 | 20.38 | 77.58 | 386.1 | 0.1425 | 0.2839 | 0.2414 | 0.1052 | ... | 26.5 | 98.87 | 567.7 | 0.2098 | 0.8663 | 0.6869 | 0.2575 | 0.6638 | 0.173 | NaN |
| 4 | 84358402 | M | 20.29 | 14.34 | 135.1 | 1297.0 | 0.1003 | 0.1328 | 0.198 | 0.1043 | ... | 16.67 | 152.2 | 1575.0 | 0.1374 | 0.205 | 0.4 | 0.1625 | 0.2364 | 0.07678 | NaN |


## View Dataset Shape
This cell displays the shape of the loaded DataFrame.

In [3]:
# Shape of the DataFrame (rows, columns)
df.shape

(569, 33)

## Drop Unnecessary Columns
This cell removes the 'id' and 'Unnamed: 32' columns from the DataFrame.

In [4]:
# Drop unnecessary columns: 'id' and 'Unnamed: 32'
df.drop(columns=['id', 'Unnamed: 32'], inplace=True)
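
A column named `Unnamed: 32` is what pandas typically produces when each CSV row ends with a trailing comma, and in the rows shown above it appears to hold only missing values. As a minimal sketch, assuming a toy frame with made-up values (`toy_df`, `empty_cols`, and `cleaned` are illustrative names, not part of the notebook), such columns can be detected rather than hard-coded:

```python
import numpy as np
import pandas as pd

# Toy frame (hypothetical values) mimicking an empty trailing column
toy_df = pd.DataFrame({
    "id": [1, 2, 3],
    "diagnosis": ["M", "B", "B"],
    "radius_mean": [17.99, 11.42, 12.05],
    "Unnamed: 32": [np.nan, np.nan, np.nan],
})

# Names of the columns in which every value is missing
empty_cols = toy_df.columns[toy_df.isna().all()].tolist()

# Drop those columns plus the identifier, returning a new frame
cleaned = toy_df.drop(columns=empty_cols + ["id"])

print(empty_cols)
print(cleaned.shape)
```

Returning a new frame instead of using `inplace=True` keeps the original available for inspection; either style works for this pipeline.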

## View Updated DataFrame
This cell displays the first few rows of the updated DataFrame.

In [5]:
# Display the first few rows after dropping columns
df.head()

|  | diagnosis | radius_mean | texture_mean | perimeter_mean | area_mean | smoothness_mean | compactness_mean | concavity_mean | concave points_mean | symmetry_mean | ... | radius_worst | texture_worst | perimeter_worst | area_worst | smoothness_worst | compactness_worst | concavity_worst | concave points_worst | symmetry_worst | fractal_dimension_worst |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | M | 17.99 | 10.38 | 122.8 | 1001.0 | 0.1184 | 0.2776 | 0.3001 | 0.1471 | 0.2419 | ... | 25.38 | 17.33 | 184.6 | 2019.0 | 0.1622 | 0.6656 | 0.7119 | 0.2654 | 0.4601 | 0.1189 |
| 1 | M | 20.57 | 17.77 | 132.9 | 1326.0 | 0.08474 | 0.07864 | 0.0869 | 0.07017 | 0.1812 | ... | 24.99 | 23.41 | 158.8 | 1956.0 | 0.1238 | 0.1866 | 0.2416 | 0.186 | 0.275 | 0.08902 |
| 2 | M | 19.69 | 21.25 | 130.0 | 1203.0 | 0.1096 | 0.1599 | 0.1974 | 0.1279 | 0.2069 | ... | 23.57 | 25.53 | 152.5 | 1709.0 | 0.1444 | 0.4245 | 0.4504 | 0.243 | 0.3613 | 0.08758 |
| 3 | M | 11.42 | 20.38 | 77.58 | 386.1 | 0.1425 | 0.2839 | 0.2414 | 0.1052 | 0.2597 | ... | 14.91 | 26.5 | 98.87 | 567.7 | 0.2098 | 0.8663 | 0.6869 | 0.2575 | 0.6638 | 0.173 |
| 4 | M | 20.29 | 14.34 | 135.1 | 1297.0 | 0.1003 | 0.1328 | 0.198 | 0.1043 | 0.1809 | ... | 22.54 | 16.67 | 152.2 | 1575.0 | 0.1374 | 0.205 | 0.4 | 0.1625 | 0.2364 | 0.07678 |


## View Updated Shape
This cell displays the shape of the DataFrame after dropping columns.

In [6]:
# Shape of the DataFrame after dropping columns
df.shape

(569, 31)

# **Train Test Split**


## Split Data into Train and Test Sets
This cell splits the data into training and testing sets.

In [7]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    df.iloc[:, 1:],  # Features (all columns except first)
    df.iloc[:, 0],   # Labels (first column)
    test_size=0.2    # 20% for testing
)
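
The split above draws a different random partition on every run, so downstream numbers will vary between executions. As a hedged sketch on synthetic stand-in data (the seed `42`, the arrays `X_demo` and `y_demo`, and the use of `stratify` are assumptions for illustration, not settings taken from this notebook), a repeatable, class-balanced split could look like this:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 samples, 4 features, 30% positive labels
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 4))
y_demo = np.array([1] * 30 + [0] * 70)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo,
    y_demo,
    test_size=0.2,     # 20% for testing, as in the cell above
    random_state=42,   # arbitrary seed so the split is repeatable
    stratify=y_demo,   # keep the class ratio the same in both splits
)

print(X_tr.shape, X_te.shape)
print(y_tr.mean(), y_te.mean())
```

Stratifying matters most when one class is much rarer than the other; with a fixed seed, the same rows land in the test set on every run.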

# **Scaling**


## Scale Features
This cell standardizes the features using StandardScaler.

In [8]:
# Initialize the scaler
scaler = StandardScaler()

# Fit scaler on training data and transform
X_train = scaler.fit_transform(X_train)

# Transform test data using the same scaler
X_test = scaler.transform(X_test)
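
The scaler is fitted on the training data only, so the test set is standardized with the training mean and standard deviation instead of its own. A small sketch on synthetic data (all names ending in `_demo` are illustrative) makes that visible:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in features with a non-zero mean and non-unit spread
rng = np.random.default_rng(0)
train_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
test_demo = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

scaler_demo = StandardScaler()
train_scaled = scaler_demo.fit_transform(train_demo)  # learns mean/std, then scales
test_scaled = scaler_demo.transform(test_demo)        # reuses the training statistics

# Equivalent manual computation using the training statistics
manual_test = (test_demo - train_demo.mean(axis=0)) / train_demo.std(axis=0)

print(train_scaled.mean(axis=0).round(6))
print(train_scaled.std(axis=0).round(6))
print(np.allclose(test_scaled, manual_test))
```

The scaled test set will generally not have exactly zero mean or unit variance, which is expected: it is measured on the training set's scale.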

## View Scaled Training Features
This cell displays the scaled training features.

In [9]:
# View the scaled training features
X_train

array([[-1.25798336,  2.07135858, -1.26487489, ..., -1.028949  ,
        -0.93093938, -0.6616548 ],
       [ 0.01703474,  0.66445368,  0.16497857, ...,  0.94115381,
        -0.10447307,  1.53406445],
       [ 0.4510231 , -0.32619341,  0.46049808, ...,  0.97908709,
         1.39359589,  1.0851951 ],
       ...,
       [-0.6949331 , -1.22614746, -0.74792154, ..., -0.5094148 ,
         0.301649  , -0.94112941],
       [-0.43113625, -0.46339571, -0.45939901, ..., -0.91727342,
        -0.35099074, -0.83425576],
       [-1.04949875, -0.1587601 , -1.05167279, ..., -0.77676855,
        -1.10160547, -0.32446842]], shape=(455, 30))

## View Training Labels
This cell displays the training labels.

In [10]:
# View the training labels
y_train

459    B
62     M
11     M
450    B
124    B
      ..
100    M
133    B
281    B
325    B
303    B
Name: diagnosis, Length: 455, dtype: object

# **Label Encoding**


## Encode Labels
This cell encodes the categorical labels into numeric values.

In [11]:
# Initialize label encoder
encoder = LabelEncoder()

# Fit encoder on training labels and transform
y_train = encoder.fit_transform(y_train)

# Transform test labels using the same encoder
y_test = encoder.transform(y_test)
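
`LabelEncoder` assigns integers in sorted order of the class names, so for the labels `B` and `M` the mapping is `B` to 0 and `M` to 1. A short sketch with a hypothetical four-element label array shows how to inspect and reverse the mapping:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels using the same two classes as the dataset
labels_demo = np.array(["B", "M", "M", "B"])

encoder_demo = LabelEncoder()
encoded = encoder_demo.fit_transform(labels_demo)

# classes_ lists the original labels; the position is the encoded value
print(list(encoder_demo.classes_))
print(encoded.tolist())

# inverse_transform maps integers back to the original labels
print(encoder_demo.inverse_transform([0, 1]).tolist())
```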

## View Encoded Training Labels
This cell displays the encoded training labels.

In [12]:
# View the encoded training labels
y_train

array([0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0,
       0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0,
       0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1,
       0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1,
       1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1,
       1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1,
       1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1,
       0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0,
       0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1,
       0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1,
       0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0,

# **NumPy arrays to PyTorch tensors**


## Convert Data to PyTorch Tensors
This cell converts the NumPy arrays to PyTorch tensors for training.

In [13]:
# Convert NumPy arrays to PyTorch tensors (float32 type)
X_train_tensor = torch.from_numpy(X_train.astype(np.float32))
X_test_tensor = torch.from_numpy(X_test.astype(np.float32))

y_train_tensor = torch.from_numpy(y_train.astype(np.float32))
y_test_tensor = torch.from_numpy(y_test.astype(np.float32))
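
The cast to `float32` is there because NumPy defaults to `float64` while PyTorch layers create `float32` parameters by default, and mixing the two in a matrix multiply raises a dtype error. A minimal sketch with a made-up 2x2 array (`features_demo` and `layer` are illustrative names):

```python
import numpy as np
import torch

# NumPy floating-point arrays are float64 unless told otherwise
features_demo = np.array([[0.5, -1.2], [1.5, 0.3]])

tensor64 = torch.from_numpy(features_demo)                     # keeps float64
tensor32 = torch.from_numpy(features_demo.astype(np.float32))  # cast first

print(tensor64.dtype, tensor32.dtype)

# A freshly created layer holds float32 parameters by default
layer = torch.nn.Linear(2, 1)
print(layer.weight.dtype)

# The float32 tensor passes through the layer without a dtype mismatch
out = layer(tensor32)
print(out.shape)
```

Note that `torch.from_numpy` shares memory with its source array; here `astype` has already produced a fresh copy, so the tensors are independent of the original data.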

## View Training Feature Tensor
This cell displays the training feature tensor.

In [14]:
# View the training feature tensor
X_train_tensor

tensor([[-1.2580,  2.0714, -1.2649,  ..., -1.0289, -0.9309, -0.6617],
        [ 0.0170,  0.6645,  0.1650,  ...,  0.9412, -0.1045,  1.5341],
        [ 0.4510, -0.3262,  0.4605,  ...,  0.9791,  1.3936,  1.0852],
        ...,
        [-0.6949, -1.2261, -0.7479,  ..., -0.5094,  0.3016, -0.9411],
        [-0.4311, -0.4634, -0.4594,  ..., -0.9173, -0.3510, -0.8343],
        [-1.0495, -0.1588, -1.0517,  ..., -0.7768, -1.1016, -0.3245]])

## View Shape of Training Feature Tensor
This cell displays the shape of the training feature tensor.

In [15]:
# View the shape of the training feature tensor
X_train_tensor.shape

torch.Size([455, 30])

## View Training Label Tensor
This cell displays the training label tensor.

In [16]:
# View the training label tensor
y_train_tensor

tensor([0., 1., 1., 0., 0., 1., 1., 0., 1., 0., 0., 0., 1., 1., 1., 1., 0., 0.,
        1., 0., 0., 0., 0., 1., 1., 0., 0., 1., 1., 1., 0., 0., 1., 1., 0., 1.,
        0., 1., 0., 1., 0., 0., 1., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 0.,
        0., 1., 0., 0., 0., 0., 1., 0., 1., 1., 0., 1., 0., 0., 0., 0., 1., 0.,
        1., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 1., 1., 0., 0., 0., 0., 0., 0.,
        0., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1.,
        0., 1., 1., 0., 1., 1., 1., 1., 0., 1., 1., 1., 0., 0., 0., 0., 1., 0.,
        1., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0.,
        1., 1., 0., 0., 0., 0., 0., 0., 0., 1., 1., 0., 0., 1., 1., 0., 0., 0.,
        1., 1., 1., 1., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 1., 1.,
        0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0.,
        0., 1., 0., 0., 0., 1., 1., 0., 

## View Shape of Training Label Tensor
This cell displays the shape of the training label tensor.

In [17]:
# View the shape of the training label tensor
y_train_tensor.shape

torch.Size([455])

# **Defining The Model**


## Define Neural Network Model
This cell defines a simple neural network model using PyTorch's nn.Module.

In [18]:
# Import PyTorch neural network module
import torch.nn as nn

### **`Manual Implementation Replaced By The nn.Module Version Below:`**

```python
class MySimpleNN():
    def __init__(self, X):
        # Weights: one per input feature, tracked by autograd
        self.weights = torch.rand(
            X.shape[1],
            1,
            dtype=torch.float64,
            requires_grad=True
        )

        # Bias: a single value, tracked by autograd
        self.bias = torch.zeros(
            1,
            dtype=torch.float64,
            requires_grad=True
        )

    def forward(self, X):
        # Linear combination followed by sigmoid
        z = torch.matmul(X, self.weights) + self.bias
        y_pred = torch.sigmoid(z)
        return y_pred

    def loss_function(self, y_pred, y):
        # Clamp predictions to avoid log(0)
        epsilon = 1e-7
        y_pred = torch.clamp(y_pred, epsilon, 1 - epsilon)

        # Binary cross-entropy against the labels passed in as `y`
        loss = -(y * torch.log(y_pred) +
                 (1 - y) * torch.log(1 - y_pred)).mean()

        return loss
```

### Building the neural network using nn.Module

In [19]:
# Define a simple neural network class
class MySimpleNN(nn.Module):

    def __init__(self, num_features):
        # Call parent constructor
        super().__init__()
        
        # Linear layer: input features to 1 output
        self.linear = nn.Linear(num_features, 1)
        
        # Sigmoid activation for output
        self.sigmoid = nn.Sigmoid()



    def forward(self, features):
        # Pass input through linear layer
        out = self.linear(features)
        
        # Apply sigmoid activation
        out = self.sigmoid(out)
        
        # Return output
        return out

## Instantiate Model
This cell creates an instance of the neural network model.

In [20]:
# Create model instance with number of features as input size
model = MySimpleNN(X_train_tensor.shape[1])
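
Subclassing `nn.Module` registers the layer's weight and bias automatically and makes the instance callable. The sketch below uses a stand-in class (`DemoNN`, a hypothetical name mirroring the structure above) with random input to show that calling the model gives the same result as `forward`, and that 30 input features yield 31 trainable values:

```python
import torch
import torch.nn as nn

class DemoNN(nn.Module):
    """Stand-in with the same structure as the model defined above."""

    def __init__(self, num_features):
        super().__init__()
        self.linear = nn.Linear(num_features, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, features):
        return self.sigmoid(self.linear(features))

torch.manual_seed(0)           # arbitrary seed for repeatable random values
demo_model = DemoNN(30)
batch = torch.randn(4, 30)     # 4 made-up samples with 30 features each

out_call = demo_model(batch)             # preferred: goes through __call__
out_forward = demo_model.forward(batch)  # same numbers, but bypasses hooks

# 30 weights + 1 bias
n_params = sum(p.numel() for p in demo_model.parameters())

print(out_call.shape, n_params)
print(torch.equal(out_call, out_forward))
```

Calling the instance is the usual convention because `__call__` also runs any registered hooks, which a direct `forward` call skips.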

### Important Parameters


## Set Training Parameters
This cell sets the learning rate and number of epochs for training.

In [21]:
# Set learning rate for optimizer
learning_rate = 0.1

# Set number of epochs for training
epochs = 50

## View Initial Model Weights
This cell displays the initial weights of the model.

In [22]:
# View initial weights of the linear layer
model.linear.weight

Parameter containing:
tensor([[-0.0242,  0.0016, -0.0082,  0.0261, -0.1366, -0.0451, -0.1034,  0.0187,
         -0.1379, -0.0198, -0.1067,  0.1534, -0.0256,  0.0784,  0.0186,  0.0758,
         -0.1711, -0.1268, -0.1229, -0.0741, -0.1673,  0.0717, -0.0817, -0.0126,
          0.0393,  0.1763,  0.1222,  0.1159,  0.1089,  0.1061]],
       requires_grad=True)

## View Initial Model Bias
This cell displays the initial bias of the model.

In [23]:
# View initial bias of the linear layer
model.linear.bias

Parameter containing:
tensor([0.1079], requires_grad=True)

## Define Loss Function

This cell uses PyTorch's built-in binary cross-entropy loss, `nn.BCELoss`, for training.

In [24]:
# Define binary cross entropy loss function
loss_function = nn.BCELoss()
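
`nn.BCELoss` computes the mean binary cross-entropy, $-\frac{1}{N}\sum_i \left[y_i \log p_i + (1 - y_i)\log(1 - p_i)\right]$, over probabilities that have already passed through a sigmoid. As a quick check on three made-up predictions (`probs` and `targets` are illustrative values), the built-in loss agrees with the formula written out by hand:

```python
import torch
import torch.nn as nn

# Made-up probabilities and targets, both shaped [3, 1]
probs = torch.tensor([[0.9], [0.2], [0.7]])
targets = torch.tensor([[1.0], [0.0], [0.0]])

# Built-in loss (mean reduction by default)
builtin = nn.BCELoss()(probs, targets)

# The same quantity written out explicitly
manual = -(targets * torch.log(probs)
           + (1 - targets) * torch.log(1 - probs)).mean()

print(builtin.item(), manual.item())
print(torch.allclose(builtin, manual))
```

When a model outputs raw scores instead of probabilities, `nn.BCEWithLogitsLoss` combines the sigmoid and the loss in one numerically stabler step; that would be a design change from this notebook, which keeps the sigmoid inside the model.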

# **Training Pipeline**


## Define Optimizer

This cell uses PyTorch's built-in SGD optimizer, `torch.optim.SGD`, to update the model parameters.

In [25]:
# Define SGD optimizer with model parameters and learning rate
optimizer = torch.optim.SGD(
    model.parameters(), 
    lr=learning_rate
)
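
With no momentum or weight decay, `torch.optim.SGD` applies the plain update `param = param - lr * grad` to every parameter it was given. The sketch below checks this on a tiny stand-in layer with a made-up loss (`layer`, `opt`, and `x` are illustrative names):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for repeatable random values
layer = nn.Linear(3, 1)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)

# Made-up input and a simple scalar loss
x = torch.randn(5, 3)
loss = layer(x).pow(2).mean()

opt.zero_grad()
loss.backward()

# What a hand-written update would produce
expected_weight = layer.weight.detach().clone() - 0.1 * layer.weight.grad
expected_bias = layer.bias.detach().clone() - 0.1 * layer.bias.grad

# Let the optimizer apply the update
opt.step()

print(torch.allclose(layer.weight, expected_weight))
print(torch.allclose(layer.bias, expected_bias))
```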

## Training Loop
This cell runs the training loop for the specified number of epochs.

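### **`Manual Training Loop Replaced By The Code Below:`**

```python
for epoch in range(epochs):

    # Forward pass
    y_pred = model.forward(X_train_tensor)

    # Loss from the hand-written loss function
    loss = model.loss_function(y_pred, y_train_tensor)

    # Backward pass
    loss.backward()

    # Manual parameter update, outside of autograd tracking
    with torch.no_grad():
        model.weights -= learning_rate * model.weights.grad
        model.bias -= learning_rate * model.bias.grad

    # Manually reset gradients for the next iteration
    model.weights.grad.zero_()
    model.bias.grad.zero_()

    print(f'Epoch: {epoch + 1}, Loss: {loss.item()}')
```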

In [26]:
# Training loop for specified number of epochs
for epoch in range(epochs):

    # Forward pass: compute predictions
    # There is no need to call the `forward` method explicitly;
    # calling the model invokes it internally through nn.Module
    y_pred = model(X_train_tensor)

    # Compute loss between predictions and true labels
    # Shape of y_pred is [455, 1]
    # Shape of y_train_tensor is [455], so it is reshaped to [455, 1]
    loss = loss_function(y_pred, y_train_tensor.view(-1, 1))

    # Zero gradients before backward pass
    optimizer.zero_grad()

    # Backward pass: compute gradients
    loss.backward()

    # Update model parameters
    optimizer.step()

    # Print loss for current epoch
    print(f'Epoch: {epoch + 1}, Loss: {loss.item()}')

Epoch: 1, Loss: 0.7764944434165955
Epoch: 2, Loss: 0.5662938356399536
Epoch: 3, Loss: 0.4589614272117615
Epoch: 4, Loss: 0.3962050676345825
Epoch: 5, Loss: 0.3543602228164673
Epoch: 6, Loss: 0.32402941584587097
Epoch: 7, Loss: 0.30079174041748047
Epoch: 8, Loss: 0.28227415680885315
Epoch: 9, Loss: 0.2670782208442688
Epoch: 10, Loss: 0.2543216347694397
Epoch: 11, Loss: 0.24341754615306854
Epoch: 12, Loss: 0.2339591234922409
Epoch: 13, Loss: 0.2256544679403305
Epoch: 14, Loss: 0.21828800439834595
Epoch: 15, Loss: 0.2116968035697937
Epoch: 16, Loss: 0.20575496554374695
Epoch: 17, Loss: 0.20036354660987854
Epoch: 18, Loss: 0.19544357061386108
Epoch: 19, Loss: 0.1909310519695282
Epoch: 20, Loss: 0.18677350878715515
Epoch: 21, Loss: 0.18292754888534546
Epoch: 22, Loss: 0.17935676872730255
Epoch: 23, Loss: 0.1760304719209671
Epoch: 24, Loss: 0.1729225367307663
Epoch: 25, Loss: 0.17001056671142578
Epoch: 26, Loss: 0.16727523505687714
Epoch: 27, Loss: 0.16469977796077728
Epoch: 28, Loss: 0.1622
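
The loop calls `optimizer.zero_grad()` on every iteration because PyTorch adds new gradients to whatever is already stored in `.grad` instead of overwriting it. A minimal sketch with a two-element made-up tensor (`w` and `x` are illustrative) shows the accumulation and the reset:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0])

# First backward pass: d(sum(w * x)) / dw = x
(w * x).sum().backward()
first = w.grad.clone()

# Second backward pass without resetting: gradients add up
(w * x).sum().backward()
accumulated = w.grad.clone()

# Reset, then run backward again: back to the single-pass gradient
w.grad.zero_()
(w * x).sum().backward()
after_reset = w.grad.clone()

print(first, accumulated, after_reset)
```

Skipping the reset in a training loop would make each step use the sum of all past gradients, which usually destabilizes training.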

### Viewing Trained Weights

Display the weights after training.

In [27]:
# View the trained weights of the linear layer after training
model.linear.weight

Parameter containing:
tensor([[ 0.3185,  0.2222,  0.3329,  0.3564,  0.0358,  0.1197,  0.1633,  0.3686,
         -0.0157, -0.1345,  0.1893,  0.1602,  0.2330,  0.3369,  0.0430,  0.0563,
         -0.1640,  0.0202, -0.1199, -0.1661,  0.2081,  0.3117,  0.2794,  0.3364,
          0.2486,  0.3168,  0.3328,  0.4408,  0.2784,  0.1228]],
       requires_grad=True)

### Viewing Trained Bias

Display the bias after training.

In [28]:
# View the trained bias of the linear layer after training
model.linear.bias

Parameter containing:
tensor([-0.1419], requires_grad=True)

# **Evaluation**


## Model Evaluation
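This cell evaluates the trained model on the test set and prints the accuracy. The test labels are reshaped to `[114, 1]` so that they line up element by element with the predictions. Comparing a `[114, 1]` tensor against a `[114]` tensor broadcasts to `[114, 114]`, which is how the value `0.5646352767944336` recorded for the original version of this cell arose.

In [29]:
# Evaluate model on test data
with torch.no_grad():
    # Forward pass on test set
    y_pred = model(X_test_tensor)

    # Convert probabilities to binary predictions
    y_pred = (y_pred > 0.5).float()

    # Calculate accuracy; labels are reshaped to [114, 1] to match y_pred
    accuracy = (y_pred == y_test_tensor.view(-1, 1)).float().mean()

    # Print accuracy
    print(f'Accuracy: {accuracy.item()}')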
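
Comparing a column of predictions with a 1-D label tensor is an easy way to get a misleading accuracy, because `==` broadcasts the two shapes into a full matrix instead of comparing element by element. A small sketch with three made-up predictions (`preds` and `labels` are illustrative values):

```python
import torch

preds = torch.tensor([[1.0], [0.0], [1.0]])  # shape [3, 1], like model output
labels = torch.tensor([1.0, 0.0, 0.0])       # shape [3], like the label tensor

# Mismatched shapes: every prediction is compared with every label
broadcast_cmp = preds == labels

# Matching shapes: each prediction is compared with its own label
aligned_cmp = preds == labels.view(-1, 1)

print(broadcast_cmp.shape, aligned_cmp.shape)
print(broadcast_cmp.float().mean().item())  # fraction of matches in the 3x3 grid
print(aligned_cmp.float().mean().item())    # actual accuracy: 2 of 3 correct
```

Checking `.shape` on both operands before computing a metric is a cheap guard against this class of error.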
