🔹 1. Initialization Function (__init__)
def __init__(self, lr=0.01, n_iters=1000):
    self.lr = lr
    self.n_iters = n_iters
    self.weights = None
    self.bias = None

Explanation:

lr (learning rate):
Determines how much to adjust the weights after each iteration.
A small value = slow learning; a large value = unstable learning.

n_iters:
Number of gradient descent iterations. Each iteration here uses the full dataset, so one iteration corresponds to one epoch.

weights and bias:
Model parameters that will be learned during training.
Initially set to None, and then later initialized as zeros inside fit().
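
To make the effect of lr concrete, here is a minimal sketch that runs gradient descent on a toy function f(w) = w**2 rather than on the logistic loss. The function, the starting point, and the helper name descend are assumptions chosen only for illustration.

```python
def descend(lr, n_iters=10, w=1.0):
    """Run gradient descent on f(w) = w**2, whose gradient is 2*w."""
    for _ in range(n_iters):
        w -= lr * 2 * w  # same update rule shape as in the class: w -= lr * gradient
    return w

small = descend(lr=0.01)  # moves slowly toward the minimum at w = 0
large = descend(lr=1.1)   # overshoots on every step, so |w| grows

print("lr=0.01 ->", small)
print("lr=1.1  ->", large)
```

With the small rate, w is still far from 0 after ten steps; with the large rate, w ends up further from the minimum than where it started.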

🔹 2. Training Function (fit())
def fit(self, X, y):
    n_samples, n_features = X.shape
    self.weights = np.zeros(n_features)
    self.bias = 0


X → input data (matrix of features).

y → actual labels (0 or 1).

n_samples → number of rows (data points).

n_features → number of columns (features).

At first, we set:

All weights = 0

Bias = 0
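
As a quick illustration of the initialization step, the sketch below unpacks the shape of a small made-up feature matrix (the values are assumptions, not taken from the class) and builds the zero-initialized parameters.

```python
import numpy as np

# Made-up data: 3 samples (rows), 2 features (columns)
X = np.array([[1, 2], [2, 3], [3, 4]])

n_samples, n_features = X.shape   # requires X to be a 2-D array
weights = np.zeros(n_features)    # one weight per feature
bias = 0

print(n_samples, n_features, weights)
```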

🔹 3. The Training Loop (Gradient Descent)
for _ in range(self.n_iters):
    linear_model = np.dot(X, self.weights) + self.bias
    y_predicted = self._sigmoid(linear_model)

Step 1: Linear Model

We calculate the raw prediction:

$$z = X \cdot w + b$$

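
A small worked example may help here. The numbers below are made up for illustration; the computation is the same np.dot call used in the loop.

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])      # 2 samples, 2 features (assumed values)
w = np.array([0.5, -0.25])      # assumed weights
b = 0.1                         # assumed bias

# Row 1: 1*0.5 + 2*(-0.25) + 0.1 = 0.1
# Row 2: 3*0.5 + 4*(-0.25) + 0.1 = 0.6
z = np.dot(X, w) + b
print(z)
```

Each entry of z is one raw score per sample, and it can be any real number.
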
Step 2: Apply Sigmoid Function

We pass z through a sigmoid function to convert it into probabilities between 0 and 1:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$


So y_predicted gives the probability that each sample belongs to class 1.
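
The sketch below shows what these probabilities look like in practice. It uses a standalone sigmoid function and made-up inputs (both are assumptions for illustration). Note that with all-zero weights and bias, as in the first iteration of fit(), every sample gets probability 0.5.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Negative scores map below 0.5, zero maps to 0.5, positive scores map above 0.5
probs = sigmoid(np.array([-2.0, 0.0, 2.0]))
print(probs)

# First iteration of training: weights and bias are zero, so z = 0 for every sample
X = np.array([[1.0, 2.0], [4.0, 5.0]])
first_pass = sigmoid(np.dot(X, np.zeros(2)) + 0)
print(first_pass)
```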

🔹 4. Compute Gradients
dw = (1 / n_samples) * np.dot(X.T, (y_predicted - y))
db = (1 / n_samples) * np.sum(y_predicted - y)


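The gradients are the partial derivatives of the average log loss (binary cross-entropy) with respect to each parameter.
They measure how the error changes when a weight or the bias changes. A gradient points in the direction of steepest increase, so the update step moves the parameters the opposite way.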

dw (gradient of weights):
Measures how much to change each weight.

db (gradient of bias):
Measures how much to change the bias.
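
One way to convince yourself that these formulas are the gradients of the average log loss is a finite-difference check. The sketch below is only an illustration: the data, the parameter values, and the helper names sigmoid and log_loss are assumptions, not part of the class.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def log_loss(w, b, X, y):
    """Average binary cross-entropy for parameters (w, b)."""
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Made-up data and parameters
X = np.array([[1.0, 2.0], [2.0, 3.0], [4.0, 5.0], [5.0, 6.0]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w = np.array([0.1, -0.2])
b = 0.05
n_samples = len(y)

# Analytic gradients, same formulas as in fit()
p = sigmoid(X @ w + b)
dw = (1 / n_samples) * np.dot(X.T, (p - y))
db = (1 / n_samples) * np.sum(p - y)

# Numerical gradients via central differences
eps = 1e-6
dw_numeric = np.zeros_like(w)
for j in range(len(w)):
    step = np.zeros_like(w)
    step[j] = eps
    dw_numeric[j] = (log_loss(w + step, b, X, y) - log_loss(w - step, b, X, y)) / (2 * eps)
db_numeric = (log_loss(w, b + eps, X, y) - log_loss(w, b - eps, X, y)) / (2 * eps)

print("dw analytic:", dw, "numeric:", dw_numeric)
print("db analytic:", db, "numeric:", db_numeric)
```

The analytic and numerical values should agree to several decimal places.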

🔹 5. Update Parameters
self.weights -= self.lr * dw
self.bias -= self.lr * db


This is the Gradient Descent update rule:

Move the weights and bias opposite to the gradient direction (to minimize error).

Multiply by the learning rate to control step size.

The loop repeats for n_iters iterations. With a suitable learning rate, each pass lowers the prediction error a little further, which is how the model "learns".
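
To watch the error shrink, the following sketch records the average log loss at each iteration. It is a standalone rewrite of the loop, not the class itself: the helper names sigmoid and log_loss, the clipping constant, and the choice of 100 iterations are assumptions. The data matches the six-point dataset used in the example at the end.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def log_loss(y, p):
    """Average binary cross-entropy; clipping avoids log(0)."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

X = np.array([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1], dtype=float)
n_samples = len(y)

weights, bias, lr = np.zeros(2), 0.0, 0.1
losses = []
for _ in range(100):
    p = sigmoid(np.dot(X, weights) + bias)
    losses.append(log_loss(y, p))          # loss before this update
    dw = (1 / n_samples) * np.dot(X.T, (p - y))
    db = (1 / n_samples) * np.sum(p - y)
    weights -= lr * dw
    bias -= lr * db

print("first loss:", losses[0])   # log(2), since every prediction starts at 0.5
print("last loss: ", losses[-1])
```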

🔹 6. Predict Function (predict())
def predict(self, X):
    linear_model = np.dot(X, self.weights) + self.bias
    y_predicted = self._sigmoid(linear_model)
    y_predicted_cls = [1 if i > 0.5 else 0 for i in y_predicted]
    return np.array(y_predicted_cls)


Here the model makes predictions on new data.

Compute linear model → np.dot(X, weights) + bias

Convert to probabilities with sigmoid.

Apply a threshold of 0.5:

If probability > 0.5 → predict 1

Else → predict 0
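
The thresholding step can also be written without a Python loop. The sketch below compares the list comprehension from predict() with a vectorized NumPy comparison on made-up probabilities; both give the same labels, and a probability of exactly 0.5 maps to class 0 in either version.

```python
import numpy as np

y_predicted = np.array([0.2, 0.5, 0.7, 0.9])  # made-up probabilities

# Version used in predict()
loop_labels = np.array([1 if p > 0.5 else 0 for p in y_predicted])

# Vectorized alternative: boolean mask converted to 0/1 integers
vector_labels = (y_predicted > 0.5).astype(int)

print(vector_labels)  # [0 0 1 1]
```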

🔹 7. Sigmoid Function
def _sigmoid(self, x):
    return 1 / (1 + np.exp(-x))


This function squashes any real number into the range (0, 1).
It’s what makes Logistic Regression suitable for binary classification.
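
One practical caveat: for very negative inputs, np.exp(-x) overflows and NumPy emits a RuntimeWarning, even though the final result is still 0. A possible workaround, shown here as a sketch under the assumption that you want to avoid the warning, only ever exponentiates non-positive numbers. The name stable_sigmoid is hypothetical and not part of the class.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never calls exp() on a positive argument."""
    x = np.asarray(x, dtype=float)
    e = np.exp(-np.abs(x))               # always in [0, 1], so no overflow
    # x >= 0: 1 / (1 + exp(-x));  x < 0: exp(x) / (1 + exp(x))  (same value)
    return np.where(x >= 0, 1 / (1 + e), e / (1 + e))

extremes = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(extremes)

# On moderate inputs it matches the plain formula
xs = np.linspace(-5, 5, 11)
matches = np.allclose(stable_sigmoid(xs), 1 / (1 + np.exp(-xs)))
print(matches)
```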

✅ Summary
Step	Description
1	Initialize weights & bias
2	Compute linear combination (X·w + b)
3	Apply sigmoid to get probabilities
4	Compute gradients
5	Update weights & bias (gradient descent)
6	Repeat steps 2–5 for n_iters iterations
7	Predict using learned parameters

In [1]:
import numpy as np

class LogisticRegression:
    def __init__(self, lr=0.01, n_iters=1000):
        # Initialize hyperparameters
        self.lr = lr
        self.n_iters = n_iters
        self.weights = None
        self.bias = None

    def fit(self, X, y):
        # Initialize parameters
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Gradient Descent
        for _ in range(self.n_iters):
            # Compute linear model
            linear_model = np.dot(X, self.weights) + self.bias
            # Apply sigmoid function to get probabilities
            y_predicted = self._sigmoid(linear_model)

            # Compute gradients
            dw = (1 / n_samples) * np.dot(X.T, (y_predicted - y))
            db = (1 / n_samples) * np.sum(y_predicted - y)

            # Update parameters
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict(self, X):
        # Predict probabilities
        linear_model = np.dot(X, self.weights) + self.bias
        y_predicted = self._sigmoid(linear_model)
        # Convert probabilities to 0 or 1 (threshold = 0.5)
        y_predicted_cls = [1 if i > 0.5 else 0 for i in y_predicted]
        return np.array(y_predicted_cls)

    def _sigmoid(self, x):
        # Sigmoid activation function
        return 1 / (1 + np.exp(-x))


🔍 Explanation

Dataset (X, y):
The example uses a small binary dataset of six points with two features each.
The first three points belong to class 0 and the last three to class 1.

Model Initialization:
lr=0.1 → the learning rate (how fast the model updates its weights).
n_iters=1000 → the number of training iterations.

Training (fit):
The model learns to find a decision boundary that separates class 0 and class 1.

Prediction:
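We run the model on two samples, [[2, 3], [5, 6]]. Both also appear in the training set (with labels 0 and 1), so this checks the fit rather than generalization to unseen data.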
The model returns class labels (0 or 1) based on what it has learned.


In [3]:
# Generate simple binary dataset
X = np.array([
    [1, 2],
    [2, 3],
    [3, 4],
    [4, 5],
    [5, 6],
    [6, 7]
])
y = np.array([0, 0, 0, 1, 1, 1])  # Binary labels

# Initialize and train the model
model = LogisticRegression(lr=0.1, n_iters=1000)
model.fit(X, y)

# Make predictions
predictions = model.predict(np.array([[2, 3], [5, 6]]))
print("Predictions:", predictions)


Predictions: [0 1]
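
To turn predictions like these into a single score, you can compute accuracy, the fraction of labels that match. The sketch below is an assumption-laden illustration: the accuracy helper is hypothetical, y_pred is copied from the printed output above, and y_true holds the training labels of [2, 3] and [5, 6].

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true labels."""
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

y_true = np.array([0, 1])  # training labels of [2, 3] and [5, 6]
y_pred = np.array([0, 1])  # predictions copied from the output above

score = accuracy(y_true, y_pred)
print("Accuracy:", score)  # Accuracy: 1.0
```

Because both samples come from the training data, this number describes how well the model fits what it has seen, not how it would behave on unseen points.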
