In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

## **1. Load Dataset**

What this step does:
Loads the glass data into a table (DataFrame).

Why it is needed:
The data must be in memory as a table before it can be inspected, split, and fed to the model, which learns only from numeric values.

In [2]:
df = pd.read_csv("glass.csv")

## **2. Look at Data**

What this step does:
Shows number of rows, columns, and sample values.

Why it is needed:
To identify which columns are inputs and which column is the output, and to catch wrong or unexpected data before training.

In [6]:
df.shape

(214, 10)

In [7]:
df.columns

Index(['RI', 'Na', 'Mg', 'Al', 'Si', 'K', 'Ca', 'Ba', 'Fe', 'Type'], dtype='object')

In [8]:
df.head()

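|   | RI | Na | Mg | Al | Si | K | Ca | Ba | Fe | Type |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.52101 | 13.64 | 4.49 | 1.1 | 71.78 | 0.06 | 8.75 | 0.0 | 0.0 | 1 |
| 1 | 1.51761 | 13.89 | 3.6 | 1.36 | 72.73 | 0.48 | 7.83 | 0.0 | 0.0 | 1 |
| 2 | 1.51618 | 13.53 | 3.55 | 1.54 | 72.99 | 0.39 | 7.78 | 0.0 | 0.0 | 1 |
| 3 | 1.51766 | 13.21 | 3.69 | 1.29 | 72.61 | 0.57 | 8.22 | 0.0 | 0.0 | 1 |
| 4 | 1.51742 | 13.27 | 3.62 | 1.24 | 73.08 | 0.55 | 8.07 | 0.0 | 0.0 | 1 |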


1. The output column is `Type`.
2. All columns are numeric.

## **3. Make Binary Output**

What this step does:
Converts the multi-class problem into a binary problem (Type 1 vs others).

Why it is needed:
Logistic regression in this lab is used for binary classification.

In [10]:
df["y"] = (df["Type"] == 1).astype(int)
df = df.drop(columns=["Type"])
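After binarizing, it is worth checking how many samples fall into each class, since a strong imbalance affects how accuracy should be read later. The sketch below uses a small made-up table (`toy` is a hypothetical stand-in, not the glass data) to show the same conversion followed by a count.

```python
import pandas as pd

# Toy stand-in for the glass table; the values are invented for illustration.
toy = pd.DataFrame({"Type": [1, 1, 2, 3, 7, 2]})

# Same conversion as in the notebook: Type 1 becomes 1, everything else 0.
toy["y"] = (toy["Type"] == 1).astype(int)

# Count the rows in each binary class, ordered by label.
counts = toy["y"].value_counts().sort_index()
print(counts)
```

On the real data, the same `value_counts()` call on `df["y"]` would show the actual class balance.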

## **4. Separate Inputs and Output**

What this step does:
Splits data into features (X) and labels (y).

Why it is needed:
The model learns from inputs and compares predictions with correct outputs.

In [11]:
X = df.drop(columns=["y"]).values
y = df["y"].values

## **5. Train-Test Split**

What this step does:
Divides data into training and testing sets.

Why it is needed:
To evaluate the model on data it has not seen during training; accuracy measured on the training data alone is overly optimistic.

In [12]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
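One optional variation, not used in the cell above, is a stratified split, which keeps the class proportions the same in the training and test sets. The sketch below assumes synthetic data (`X_demo`, `y_demo` are hypothetical names) with 30% positive labels and passes `stratify=` to `train_test_split`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data for illustration: 100 samples, 30 positives, 70 negatives.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 3))
y_demo = np.array([1] * 30 + [0] * 70)

# stratify=y_demo asks for the same class ratio in both subsets.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Fraction of positive labels in each subset.
print(y_tr.mean(), y_te.mean())
```

With a small dataset such as this one, a stratified split can make the test score less dependent on the particular random seed.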

## **6. Feature Scaling**

What this step does:
Makes all features have similar numeric ranges.

Why it is needed:
Without scaling, features with large numeric ranges dominate the weighted sum, and gradient descent becomes slow or unstable.

In [13]:
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
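The scaler is fitted on the training set only and then reused on the test set, so no test-set statistics leak into training. The sketch below, on synthetic data with hypothetical names (`X_tr`, `X_te`, `scaler_demo`), checks what that implies: the scaled training features have mean 0 and standard deviation 1, while the scaled test features are only approximately so.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two synthetic features on very different scales, loosely like RI and Si.
rng = np.random.default_rng(0)
X_tr = rng.normal(loc=[1.5, 70.0], scale=[0.01, 2.0], size=(80, 2))
X_te = rng.normal(loc=[1.5, 70.0], scale=[0.01, 2.0], size=(20, 2))

scaler_demo = StandardScaler()
X_tr_s = scaler_demo.fit_transform(X_tr)  # learn mean/std from training data
X_te_s = scaler_demo.transform(X_te)      # reuse the training mean/std

print("train mean/std:", X_tr_s.mean(axis=0), X_tr_s.std(axis=0))
print("test  mean/std:", X_te_s.mean(axis=0), X_te_s.std(axis=0))
```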

## **7. Sigmoid Function**

What this step does:
Converts the score into a probability between 0 and 1.

Why it is needed:
A probability expresses how confident the model is, instead of forcing an immediate yes/no decision.

In [14]:
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
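The one-line sigmoid above is adequate for scaled features, but `np.exp(-z)` can overflow and emit a warning when `z` is a large negative number. A possible alternative, shown here as a sketch under the hypothetical name `stable_sigmoid`, evaluates a mathematically equivalent form on each side of zero so the exponent is never positive.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid for array input that avoids overflow in np.exp."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) is at most 1, so it cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use exp(z) / (1 + exp(z)); exp(z) is below 1 here.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

vals = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(vals)
```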

## **8. Forward Probability**

What this step does:
Calculates the model’s confidence for each sample.

Why it is needed:
Confidence is required before applying a decision threshold.

In [15]:
def predict_proba(X, w, b):
    z = X @ w + b
    return sigmoid(z)

## **9. Loss Function (Binary Cross Entropy)**

What this step does:
Measures how wrong the predicted probabilities are.

Why it is needed:
It penalizes confident wrong predictions far more than hesitant ones, so the model learns sensible confidence and not only correct labels.

In [16]:
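def loss(y, p):
    # Clip probabilities away from exactly 0 and 1 so np.log never receives 0.
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))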
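A small worked example can make the behaviour of binary cross entropy concrete. The sketch below uses invented labels and probabilities and a hypothetical helper `bce` that computes the same formula; it compares mostly correct, confident predictions against confidently wrong ones.

```python
import numpy as np

def bce(y, p):
    # Same binary cross entropy formula as above, on plain arrays.
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y_demo = np.array([1, 0])

# Predictions that lean the right way: high for the 1, low for the 0.
good = bce(y_demo, np.array([0.9, 0.2]))

# Predictions that lean the wrong way with high confidence.
bad = bce(y_demo, np.array([0.1, 0.8]))

print("good predictions loss:", good)
print("bad predictions loss: ", bad)
```

The second loss is much larger, because each term grows without bound as the probability assigned to the true class approaches 0.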

## **10. Weight Update**

What this step does:
Adjusts weights and bias to reduce error.

Why it is needed:
The weights and bias are the only values the model can change, so learning means moving them in the direction that lowers the loss.

In [17]:
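def update_weights(X, y, w, b, lr):
    p = predict_proba(X, w, b)
    # Difference between predicted probability and true label, per sample.
    error = p - y

    # Gradient of the mean cross entropy loss with respect to w and b,
    # followed by one gradient descent step of size lr.
    w = w - lr * (X.T @ error) / len(y)
    b = b - lr * np.mean(error)

    return w, b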
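The update uses `X.T @ error / len(y)` as the gradient of the mean cross entropy loss with respect to `w`. One way to gain confidence in that expression is a numerical gradient check. The sketch below is illustrative only: it uses random synthetic data and hypothetical names (`bce`, `X_demo`, `w_demo`) and compares the analytic gradient with central finite differences.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def bce(X, y, w, b):
    # Mean binary cross entropy of the logistic model on (X, y).
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = rng.integers(0, 2, size=20)
w_demo = rng.normal(size=3)
b_demo = 0.1

# Analytic gradient: the same expression used in update_weights.
error = sigmoid(X_demo @ w_demo + b_demo) - y_demo
grad_w = X_demo.T @ error / len(y_demo)

# Numerical gradient: perturb one weight at a time by +/- eps.
eps = 1e-6
num_grad = np.zeros_like(w_demo)
for j in range(len(w_demo)):
    step = np.zeros_like(w_demo)
    step[j] = eps
    loss_plus = bce(X_demo, y_demo, w_demo + step, b_demo)
    loss_minus = bce(X_demo, y_demo, w_demo - step, b_demo)
    num_grad[j] = (loss_plus - loss_minus) / (2 * eps)

max_diff = np.max(np.abs(grad_w - num_grad))
print("largest difference:", max_diff)
```

A very small difference indicates that the two gradients agree up to floating-point error.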

## **11. Training Loop**

What this step does:
Repeats updates over many epochs.

Why it is needed:
Repeated corrections improve the model gradually.

In [18]:
w = np.zeros(X_train.shape[1])
b = 0.0
lr = 0.1
epochs = 100

for _ in range(epochs):
    w, b = update_weights(X_train, y_train, w, b, lr)
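The loop above does not record anything, so there is no direct evidence that the loss is going down. A possible variation, sketched here on synthetic data with hypothetical names (`X_demo`, `y_demo`, `history`), stores the loss at every epoch so that convergence can be inspected or plotted.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Synthetic, roughly linearly separable data with some label noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
noise = rng.normal(scale=0.5, size=200)
y_demo = (X_demo[:, 0] + 0.5 * X_demo[:, 1] + noise > 0).astype(int)

w_demo = np.zeros(2)
b_demo = 0.0
lr = 0.1
history = []

for _ in range(100):
    p = sigmoid(X_demo @ w_demo + b_demo)

    # Record the loss before the update (clipped to keep the log finite).
    p_safe = np.clip(p, 1e-12, 1 - 1e-12)
    history.append(-np.mean(y_demo * np.log(p_safe)
                            + (1 - y_demo) * np.log(1 - p_safe)))

    # Same gradient descent step as in update_weights.
    error = p - y_demo
    w_demo = w_demo - lr * (X_demo.T @ error) / len(y_demo)
    b_demo = b_demo - lr * np.mean(error)

print("first loss:", history[0])
print("last loss: ", history[-1])
```

With zero initial weights every prediction starts at 0.5, so the first recorded loss equals ln 2; later values should be lower if the learning rate is reasonable.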

## **12. Probability → Decision**

What this step does:
Converts probability into class using a threshold.

Why it is needed:
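The right threshold depends on the application: a higher threshold makes positive predictions rarer but more reliable, which matters when a false positive is costly.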

In [19]:
def predict_label(p, threshold=0.5):
    return (p >= threshold).astype(int)

## **13. Test Model**

In [20]:
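# Predicted probabilities for the held-out test samples.
p_test = predict_proba(X_test, w, b)

# Class labels under two different decision thresholds.
y_pred_05 = predict_label(p_test, 0.5)
y_pred_07 = predict_label(p_test, 0.7)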
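The cell above produces predictions at two thresholds but does not score them. One way to compare them, sketched below with invented labels and probabilities and a hypothetical helper `summarize`, is to count true/false positives and negatives and compute accuracy for each threshold.

```python
import numpy as np

def predict_label(p, threshold=0.5):
    return (p >= threshold).astype(int)

def summarize(y_true, y_pred):
    # Confusion counts and accuracy for binary labels.
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": (tp + tn) / len(y_true)}

# Made-up values for illustration; not results from the glass data.
y_true = np.array([1, 1, 1, 0, 0, 0, 0, 1])
p_demo = np.array([0.9, 0.65, 0.55, 0.6, 0.3, 0.1, 0.45, 0.8])

res_05 = summarize(y_true, predict_label(p_demo, 0.5))
res_07 = summarize(y_true, predict_label(p_demo, 0.7))
print("threshold 0.5:", res_05)
print("threshold 0.7:", res_07)
```

In this toy case the higher threshold removes the false positive but introduces false negatives, which is the trade-off a threshold choice controls. On the notebook's data, the same helper could be applied to `y_test` with `y_pred_05` and `y_pred_07`.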

# Difference between Perceptron and Logistic Regression

- The perceptron outputs only a hard label, 0 or 1.
- Logistic regression outputs a probability between 0 and 1, which can then be turned into a label with a threshold.
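The difference can be seen by feeding the same scores through both output rules. The sketch below uses a few hand-picked scores (the array `scores` is an assumption for illustration) and compares a perceptron-style step function with the sigmoid.

```python
import numpy as np

# Hand-picked scores: two far from the boundary, two very close to it.
scores = np.array([-3.0, -0.1, 0.1, 3.0])

# Perceptron-style output: a hard step at zero.
step_out = (scores >= 0).astype(int)

# Logistic regression output: a smooth probability.
prob_out = 1 / (1 + np.exp(-scores))

for s, h, p in zip(scores, step_out, prob_out):
    print(f"score={s:+.1f}  step={h}  sigmoid={p:.3f}")
```

The step output treats a score of 0.1 and a score of 3.0 identically, while the sigmoid output keeps the information that the first is barely past the boundary.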

# Why does the sigmoid matter?

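- It keeps information near the boundary: samples close to the decision boundary get probabilities near 0.5 instead of a hard label.
- It avoids unstable decisions: a small change in the score changes the probability smoothly instead of flipping the output abruptly.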

# What problem still remains?

- Data noise and overlap still exist.
- Perfect separation is not always possible.