In [None]:
"""
# Explanation of the Full Code

This project is all about predicting whether a patient has breast cancer or not based on different medical measurements.
You created a simple neural network from scratch using PyTorch to do this prediction.

Below is the full theory, step by step:

⭐ 1. Dataset Loading

The project starts by loading a breast cancer dataset.
This dataset contains:

Different measurements taken from a breast cell

A "diagnosis" label showing if the sample is Malignant (cancer) or Benign (no cancer)

The goal is to teach a model to use the measurements and predict whether the patient has cancer.

⭐ 2. Removing Unnecessary Columns

The dataset contains an ID column, which is only a record identifier and carries no diagnostic information.
So this column is removed before training, so the model cannot pick up meaningless patterns from it.

⭐ 3. Separating Inputs and Output

The dataset has:

One target column → diagnosis (cancer or not)

Many feature columns → measurements (like radius, texture, smoothness, etc.)

So the data is split into:

X (features) → all measurements

Y (label) → diagnosis

This tells the model what to learn from.
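
Steps 2 and 3 can be sketched on a tiny made-up table. The column names mirror the real dataset, but the three rows and their values are illustrative assumptions, not data from the project:

```python
import pandas as pd

# Tiny made-up table with the same layout as the real dataset (values are illustrative).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "diagnosis": ["M", "B", "B"],
    "radius_mean": [17.99, 11.42, 12.05],
    "texture_mean": [10.38, 20.38, 14.63],
})

df = df.drop(columns=["id"])        # the identifier carries no diagnostic signal
X = df.drop(columns=["diagnosis"])  # every remaining measurement column
y = df["diagnosis"]                 # the target column

print(list(X.columns))
print(y.tolist())
```

Selecting columns by name, as here, is one option; the notebook code selects them by position with iloc, which gives the same result as long as diagnosis stays the first column.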

⭐ 4. Splitting Data into Training and Testing

The data is divided into:

Training data (80%) → model learns from this

Testing data (20%) → model is tested to see how well it learned

This is important to check if the model can work on new unseen data.
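
A minimal sketch of an 80/20 split on made-up data. The random_state and stratify arguments are not used in the notebook code; they are shown here as an assumption about what a reproducible, class-balanced split could look like:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))      # 100 made-up samples, 5 features
y = np.array([0] * 60 + [1] * 40)  # made-up labels with a 60/40 class balance

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,    # 20% of the rows go to the test set
    random_state=42,  # fixed seed so the split is repeatable
    stratify=y,       # keep the class balance the same in both parts
)

print(X_train.shape, X_test.shape)
print(y_train.mean(), y_test.mean())
```

Without a fixed seed, every run produces a different split, so the loss values and the final accuracy will vary from run to run.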

⭐ 5. Scaling the Data

Different measurements have different numerical ranges.
Some values are small, some very large.

To prevent imbalance, all values are scaled to a similar range using StandardScaler.
This helps the neural network learn better and faster.
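
As a rough illustration, StandardScaler subtracts each column's mean and divides by its standard deviation. The sketch below uses made-up numbers and checks the result against that formula; note that the statistics are learned from the training data only and then reused for the test data:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values: the second column is 100 times larger than the first.
X_train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test = np.array([[2.0, 400.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean and std from the training data
X_test_scaled = scaler.transform(X_test)        # reuses the training statistics

# The same computation by hand: (x - mean) / std, column by column.
manual = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)

print(np.allclose(X_train_scaled, manual))
print(X_train_scaled.mean(axis=0), X_train_scaled.std(axis=0))
```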

⭐ 6. Converting Labels (M/B) into Numbers

The diagnosis column contains text labels:

M → Malignant (cancer)

B → Benign (no cancer)

A neural network cannot understand text, so these are converted into numbers:

1 → cancer

0 → no cancer

This is called Label Encoding.
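
A small sketch of why M becomes 1 and B becomes 0: LabelEncoder sorts the class names and numbers them in that order. The four labels below are made up for illustration:

```python
from sklearn.preprocessing import LabelEncoder

encoder = LabelEncoder()
encoded = encoder.fit_transform(["M", "B", "B", "M"])

print(list(encoder.classes_))  # class names in sorted order; position = assigned number
print(encoded.tolist())        # the labels as integers
```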

⭐ 7. Converting Data into Tensors

To work with PyTorch, all the data must be converted to tensors, which are multi-dimensional arrays similar to NumPy arrays.
PyTorch records the operations performed on tensors so that it can compute gradients automatically.
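
A minimal sketch of the conversion with made-up arrays. The explicit cast of the labels to float64 is an assumption added here for clarity; the notebook code leaves the labels as integers and relies on automatic type promotion:

```python
import numpy as np
import torch

features = np.array([[0.5, -1.2], [1.5, 0.3]])  # NumPy uses float64 by default
labels = np.array([1, 0])                        # integer labels

x_tensor = torch.from_numpy(features)                  # keeps float64, shares memory with the array
y_tensor = torch.from_numpy(labels).to(torch.float64)  # cast so the dtype matches the features

print(x_tensor.dtype, y_tensor.dtype)
print(x_tensor.shape, y_tensor.shape)
```

Because from_numpy keeps the float64 dtype, the weights and bias in the model are also created as float64 so that the matrix multiplication works without a dtype mismatch.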

⭐ 8. Building a Simple Neural Network

You created your own custom neural network class.

This neural network contains:

Weights (single layer)

Bias

A forward function using the formula:

Prediction = sigmoid(XW + b)

This is basically a logistic regression model acting as a neural network.
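
The forward formula can be sketched on made-up inputs. The zero weights are chosen only for illustration (the project initializes them randomly); with zero weights every score is 0 and the sigmoid returns 0.5 for each sample:

```python
import torch

torch.manual_seed(0)
X = torch.randn(4, 3, dtype=torch.float64)  # 4 made-up samples, 3 features
W = torch.zeros(3, dtype=torch.float64)     # zero weights, for illustration only
b = torch.zeros(1, dtype=torch.float64)

z = torch.matmul(X, W) + b     # linear score, one value per sample
prediction = torch.sigmoid(z)  # squashes each score into the range (0, 1)

print(prediction)
```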

⭐ 9. Loss Function (Binary Cross Entropy)

A loss function tells the model how wrong its predictions are.

Your project uses Binary Cross Entropy Loss, which is ideal for yes/no predictions.

The model tries to reduce this loss during training.
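
For reference, binary cross entropy is -mean(y * log(p) + (1 - y) * log(1 - p)), where y is the true label and p the predicted probability. The sketch below computes it by hand on three made-up predictions and compares the result with PyTorch's built-in function:

```python
import torch
import torch.nn.functional as F

y_true = torch.tensor([1.0, 0.0, 1.0], dtype=torch.float64)  # made-up labels
y_prob = torch.tensor([0.9, 0.2, 0.6], dtype=torch.float64)  # made-up predicted probabilities

# Hand-written binary cross entropy, averaged over the samples.
manual = -(y_true * torch.log(y_prob) + (1 - y_true) * torch.log(1 - y_prob)).mean()

# PyTorch's built-in version takes the probabilities first, then the targets.
builtin = F.binary_cross_entropy(y_prob, y_true)

print(manual.item(), builtin.item())
```

Confident correct predictions (0.9 for a true 1) contribute little to the loss, while uncertain ones (0.6 for a true 1) contribute more.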

⭐ 10. Training the Model

The model goes through several rounds called epochs.

Every epoch does 3 things:

Make a prediction (forward pass)

Compare prediction with actual result (loss calculation)

Compute gradients (backpropagation) and use them to update the weights and bias (gradient descent)

With each epoch, the model improves its ability to predict correctly.
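
The three steps can be sketched as a complete loop on made-up data. The synthetic features, the labelling rule, and the zero initialization are assumptions for illustration; the learning rate and epoch count match the notebook:

```python
import torch

torch.manual_seed(0)
X = torch.randn(200, 2, dtype=torch.float64)   # made-up, roughly standardized features
y = (X[:, 0] + X[:, 1] > 0).to(torch.float64)  # made-up labels from a simple rule

w = torch.zeros(2, dtype=torch.float64, requires_grad=True)
b = torch.zeros(1, dtype=torch.float64, requires_grad=True)
learning_rate = 0.1
losses = []

for epoch in range(20):
    p = torch.sigmoid(X @ w + b).clamp(1e-7, 1 - 1e-7)              # forward pass
    loss = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()  # loss calculation
    loss.backward()                                                 # backpropagation
    with torch.no_grad():                                           # gradient descent step
        w -= learning_rate * w.grad
        b -= learning_rate * b.grad
    w.grad.zero_()  # reset gradients so they do not accumulate across epochs
    b.grad.zero_()
    losses.append(loss.item())

print(losses[0], losses[-1])
```

The first loss equals log(2), about 0.693, because zero weights predict 0.5 for every sample; it then falls as the weights move toward the labelling rule.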

⭐ 11. Checking the Learned Weights

After training, you print the final weights and bias that the model learned.
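Because the features are standardized, the size of each weight gives a rough indication of how strongly that feature influences the prediction, although after only a few epochs the weights are still close to their random starting values.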

⭐ 12. Evaluating the Model

In the testing phase:

The model predicts cancer on new test data

Predicted probabilities above a threshold (0.9 in this code) are marked as cancer

The model’s performance is measured using accuracy

Accuracy is the fraction of test predictions that were correct.
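
A small sketch of how the threshold affects accuracy, using four made-up probabilities and labels. The 0.5 threshold is included only as a common point of comparison; the project itself uses 0.9:

```python
import torch

probabilities = torch.tensor([0.95, 0.70, 0.10, 0.92], dtype=torch.float64)  # made-up model outputs
y_true = torch.tensor([1, 1, 0, 1])                                          # made-up true labels

results = {}
for threshold in (0.5, 0.9):
    predicted = (probabilities > threshold).long()           # 1 = cancer, 0 = no cancer
    accuracy = (predicted == y_true).float().mean().item()   # fraction of correct predictions
    results[threshold] = accuracy
    print(threshold, accuracy)
```

A high threshold marks a sample as cancer only when the model is very confident, so a true cancer case with a probability of 0.70 is missed at 0.9 but caught at 0.5.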
"""


In [None]:
import pandas as pd
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder

In [2]:
df = pd.read_csv("breast-cancer.csv")
print(df.head())

         id diagnosis  radius_mean  texture_mean  perimeter_mean  area_mean  \
0    842302         M        17.99         10.38          122.80     1001.0   
1    842517         M        20.57         17.77          132.90     1326.0   
2  84300903         M        19.69         21.25          130.00     1203.0   
3  84348301         M        11.42         20.38           77.58      386.1   
4  84358402         M        20.29         14.34          135.10     1297.0   

   smoothness_mean  compactness_mean  concavity_mean  concave points_mean  \
0          0.11840           0.27760          0.3001              0.14710   
1          0.08474           0.07864          0.0869              0.07017   
2          0.10960           0.15990          0.1974              0.12790   
3          0.14250           0.28390          0.2414              0.10520   
4          0.10030           0.13280          0.1980              0.10430   

   ...  radius_worst  texture_worst  perimeter_worst  area_wor

In [3]:
print(f"The total number of Columns is {df.shape[1]} and the total number of Rows is {df.shape[0]}.")

The total number of Columns is 32 and the total number of Rows is 569.


In [4]:
df.drop(columns=['id'],inplace=True)

In [5]:
print(f"The total number of Columns is {df.shape[1]} and the total number of Rows is {df.shape[0]}.")

The total number of Columns is 31 and the total number of Rows is 569.


In [6]:
df['diagnosis'].unique()

array(['M', 'B'], dtype=object)

In [7]:
x = df.iloc[:,1:] # All columns except 'diagnosis'
y = df.iloc[:,0] # The 'diagnosis' column

x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2)

In [8]:
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

In [9]:
lb = LabelEncoder()
y_train = lb.fit_transform(y_train)
y_test = lb.transform(y_test)

In [10]:
#Convert all NumPy arrays to tensors.
x_train_tensor = torch.from_numpy(x_train)
x_test_tensor = torch.from_numpy(x_test)
y_train_tensor = torch.from_numpy(y_train)
y_test_tensor = torch.from_numpy(y_test)

In [12]:
print(x_test_tensor.shape)

torch.Size([114, 30])


In [13]:
class MyfirstNN():
    def __init__(self,x):
        self.weights = torch.rand(x.shape[1],dtype=torch.float64,requires_grad=True)
        self.bias = torch.zeros(1,dtype=torch.float64,requires_grad=True)   

    def forward(self,x):
        z = torch.matmul(x,self.weights) + self.bias
        y_predict = torch.sigmoid(z)
        return y_predict

    def loss_function(self,y_pred,y):
        #Clamp Predictions to avoid log(0)
        epsilon = 1e-7
        y_pred = torch.clamp(y_pred,epsilon,1-epsilon)

        #calculate loss
        loss = -(y*torch.log(y_pred) + (1 - y) * torch.log(1 - y_pred)).mean()
        return loss

In [32]:
learning_rate = 0.1
epochs = 20

In [33]:
#Create Model
model = MyfirstNN(x_train_tensor)

#Define Loop
for i in range(epochs): 
    
    #Forward Pass 
    y_predict = model.forward(x_train_tensor)
    
    #Loss Calculation
    loss = model.loss_function(y_predict,y_train_tensor)
    
    #Backward Pass
    loss.backward()
    
    #Parameter updates (gradient descent step)
    with torch.no_grad():
        model.weights -= learning_rate * model.weights.grad
        model.bias -= learning_rate * model.bias.grad

    #zero gradients
    model.weights.grad.zero_()
    model.bias.grad.zero_()

    #Print Loss in each epoch
    print(f"Epoch: {i + 1}, Loss : {loss.item()}.")

Epoch: 1, Loss : 0.36244193006545594.
Epoch: 2, Loss : 0.3542328119523354.
Epoch: 3, Loss : 0.3462386273861073.
Epoch: 4, Loss : 0.3384518864942635.
Epoch: 5, Loss : 0.330865713168754.
Epoch: 6, Loss : 0.32347391417644505.
Epoch: 7, Loss : 0.31627103018550917.
Epoch: 8, Loss : 0.30925236777613574.
Epoch: 9, Loss : 0.30241401143793684.
Epoch: 10, Loss : 0.29575281360683.
Epoch: 11, Loss : 0.28926636044298293.
Epoch: 12, Loss : 0.28295291053592103.
Epoch: 13, Loss : 0.2768113040825533.
Epoch: 14, Loss : 0.2708408415971616.
Epoch: 15, Loss : 0.26504113395214923.
Epoch: 16, Loss : 0.25886773082848485.
Epoch: 17, Loss : 0.25166672412954105.
Epoch: 18, Loss : 0.24468511974723803.
Epoch: 19, Loss : 0.23792304645124965.
Epoch: 20, Loss : 0.23137973601589623.


In [34]:
print(f"Model Weights:{model.weights}")

Model Weights:tensor([ 0.8670,  0.3738,  0.6128,  0.8362,  0.7457,  0.1934,  0.6758,  0.9187,
         0.4453,  0.7166,  0.8213,  0.3568,  0.1659,  0.9895,  0.1884,  0.6962,
        -0.0024, -0.0113,  0.1821,  0.4479,  0.2628,  0.9287,  0.9653,  0.7478,
         0.3160,  0.0264,  0.2231,  0.5387,  0.5703,  0.2861],
       dtype=torch.float64, requires_grad=True)


In [35]:
print(f"Model Bias:{model.bias}")

Model Bias:tensor([-0.0081], dtype=torch.float64, requires_grad=True)


In [49]:
#Evaluation of the Model

with torch.no_grad():
    y_pred = model.forward(x_test_tensor)
    y_pred = (y_pred>0.9).float()

    #Accuracy score of the ML NN Pipeline
    accuracy = (y_pred == y_test_tensor).float().mean()
    print(f"The Accuracy is : {accuracy}")

The Accuracy is : 0.8859649300575256
