# Logistic Regression — Perceptron-Based Intuition (Pure Notes)

---

## Introduction

We have already studied **Linear Regression** in detail.  
Now we move towards **Logistic Regression**, which is one of the **most important and foundational machine learning algorithms**.

Logistic Regression is especially important because:

- It is closely related to the **Perceptron**, which is a **fundamental building block of Deep Learning**
- If you understand Logistic Regression properly, you automatically build a strong foundation for Neural Networks

Many explanations available online are **incomplete or confusing**.  
Logistic Regression is actually an **easy algorithm** if you already know:

- Gradient Descent  
- Regularization  

The confusion arises mainly because Logistic Regression is explained from **two different perspectives**.

---

## Two Perspectives of Logistic Regression

There are **two standard ways** to understand Logistic Regression:

1. **Geometric Perspective**
2. **Probabilistic Perspective**

In this series, the focus is on the **Probabilistic Perspective**, because:

- It gives the **complete picture**
- Geometry becomes obvious after probability is understood
- It answers *why* things work, not just *how*

---

## Requirement to Apply Logistic Regression

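Before applying Logistic Regression, check one **critical requirement**:

### Data should be **Linearly Separable** (or almost separable)

The model can still be fitted to other data, but a straight-line boundary only classifies well when the classes can be (nearly) separated by a line.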

Example dataset:

- $x$-axis → CGPA  
- $y$-axis → IQ  
- Green points → Placement happened  
- Blue points → Placement did not happen  

Each point represents a student.

![Linearly vs Not Linearly Separable Data](Linearly-vs-Not-linearly-separable-datasets.png)

Goal:  
Given a new student's CGPA and IQ, predict **placement = yes/no**

### Linear Separability

If a **single straight line** can separate the two classes, the data is **linearly separable**.

- Perfect separability → Ideal case  
- Almost separable → Acceptable (few misclassified points)

### Non-Linearly Separable Data

If data looks like concentric circles or complex shapes, **no straight line** can separate the classes.

In such cases:

- Logistic Regression **will not work well**
- You must use **non-linear models**

---

## Decision Boundary Representation

In Logistic Regression, the decision boundary is **not written as**:

$$
y = mx + c
$$

Instead, it is written in **general form**:

$$
ax + by + c = 0
$$

Or more generally:

$$
w_1 x_1 + w_2 x_2 + b = 0
$$

If we add more features:

$$
w_1 x_1 + w_2 x_2 + w_3 x_3 + b = 0
$$

In general, with $n$ features, this boundary is called a **hyperplane** (a line in 2D, a plane in 3D).

![Decision Boundary in 2D](decision_boundary_in_2d.png)
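
As a small illustration of how the two forms above relate, the sketch below converts the general form $w_1 x_1 + w_2 x_2 + b = 0$ into slope-intercept form. The function name `to_slope_intercept` is an assumption made for this example, and it only works when $w_2 \neq 0$ (a vertical line has no slope-intercept form, which is one reason the general form is more convenient).

```python
def to_slope_intercept(w1, w2, b):
    """Convert w1*x1 + w2*x2 + b = 0 into x2 = m*x1 + c (requires w2 != 0)."""
    if w2 == 0:
        raise ValueError("vertical line: no slope-intercept form")
    m = -w1 / w2   # slope
    c = -b / w2    # intercept on the x2 axis
    return m, c

# Example line: 2*x1 + 4*x2 - 8 = 0, i.e. x2 = -0.5*x1 + 2
m, c = to_slope_intercept(2, 4, -8)
print(m, c)
```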

---

## Objective of Logistic Regression

Find values of:

$$
w_1, w_2, \dots, w_n, b
$$

Such that:

- Points from different classes are separated correctly
- Misclassifications are minimized

---

## Perceptron Trick (Core Intuition)

The **Perceptron Trick** is a simple iterative method to find the decision boundary.

![Perceptron Decision Boundary Update](visualizing-decision-boundary-perceptron-6-1640590822.webp)

### Idea

1. Start with **random values** of $w$ and $b$
2. This gives a **random line**
3. Loop through data points
4. Ask each point:
   - Are you correctly classified?
   - If yes → do nothing
   - If no → move the line **towards that point**

---

## Positive and Negative Regions

Given a line:

$$
ax + by + c = 0
$$

For any point $(x, y)$:

- If $ax + by + c > 0$ → **Positive region**
- If $ax + by + c < 0$ → **Negative region**
- If $ax + by + c = 0$ → On the line

![Positive and Negative Regions](decision_boundary_in_2d.png)
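
The sign test above can be sketched in a few lines. The function name `region` and the example line $x + y - 5 = 0$ are illustrative choices, not part of the original notes.

```python
def region(a, b, c, x, y):
    """Report which side of the line a*x + b*y + c = 0 the point (x, y) lies on."""
    value = a * x + b * y + c
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "on the line"

# Line: x + y - 5 = 0
print(region(1, 1, -5, 4, 3))  # 4 + 3 - 5 = 2
print(region(1, 1, -5, 1, 1))  # 1 + 1 - 5 = -3
print(region(1, 1, -5, 2, 3))  # 2 + 3 - 5 = 0
```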

---

## Line Transformations

Changing coefficients causes different transformations:

### Change in $c$ (Bias)

- Shifts the line **parallel to itself** (up or down when $b \neq 0$)
- No rotation

### Change in $a$

- Rotates the line about its $y$-intercept $(0, -c/b)$

### Change in $b$

- Rotates the line about its $x$-intercept $(-c/a, 0)$

### Change in all $(a, b, c)$

- Combination of shift + rotation

![Pixel Space Transformation](pixelspace.jpeg)
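
One way to check how each coefficient moves the line is to track its slope and its two axis intercepts. The sketch below does this for a hypothetical line $x + 2y - 4 = 0$; the helper name `describe` is an assumption for this example, and it requires $a \neq 0$ and $b \neq 0$.

```python
def describe(a, b, c):
    """Return (slope, x_intercept, y_intercept) of a*x + b*y + c = 0 (a, b nonzero)."""
    return (-a / b, -c / a, -c / b)

base = describe(1.0, 2.0, -4.0)    # reference line
shift = describe(1.0, 2.0, -6.0)   # only c changed
rot_a = describe(2.0, 2.0, -4.0)   # only a changed
rot_b = describe(1.0, 4.0, -4.0)   # only b changed

print("base :", base)
print("c    :", shift)   # slope unchanged, both intercepts move
print("a    :", rot_a)   # y-intercept unchanged, slope changes
print("b    :", rot_b)   # x-intercept unchanged, slope changes
```

Changing only $c$ keeps the slope, changing only $a$ keeps the $y$-intercept, and changing only $b$ keeps the $x$-intercept.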

---

## Handling Misclassified Points

If a point is misclassified:

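- **Negative point in positive region**
  - Subtract the point's coordinates from the coefficients, so the line moves **towards** the point until the point ends up on the negative side
- **Positive point in negative region**
  - Add the point's coordinates to the coefficients, so the line moves **towards** the point until the point ends up on the positive side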

This movement is achieved by **updating weights**.
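
A minimal numeric sketch of this idea follows, assuming points are written in augmented form `[1, x1, x2]` and coefficients as `[w0, w1, w2]`. The helper names `score` and `nudge`, the step size `eta`, and all numbers are hypothetical.

```python
def score(w, x):
    """Signed value w0 + w1*x1 + w2*x2 for an augmented point x = [1, x1, x2]."""
    return sum(wi * xi for wi, xi in zip(w, x))

def nudge(w, x, direction, eta=0.1):
    """Add (direction=+1) or subtract (direction=-1) eta * x from the coefficients."""
    return [wi + direction * eta * xi for wi, xi in zip(w, x)]

w = [-1.0, 1.0, 1.0]           # line: x1 + x2 - 1 = 0

pos = [1.0, 0.2, 0.3]          # positive point lying in the negative region
w_pos = nudge(w, pos, +1)      # adding raises its score towards the positive side
print(score(w, pos), "->", score(w_pos, pos))

neg = [1.0, 1.0, 0.5]          # negative point lying in the positive region
w_neg = nudge(w, neg, -1)      # subtracting lowers its score towards the negative side
print(score(w, neg), "->", score(w_neg, neg))
```

A single small step may not be enough to fix the point; repeated updates keep pushing its score in the right direction.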

---

## Learning Rate

Large updates cause instability.  
Therefore, we use a **learning rate** $\eta$.

Instead of large jumps, we update weights slowly:

$$
\text{small step} = \eta \times x
$$

Typical values:

$$
\eta \in \{0.01,\; 0.1\}
$$
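
To get a feel for the effect of $\eta$, the sketch below counts how many repeated updates are needed before one misclassified positive point gets a non-negative score. The function name `updates_to_fix`, the starting weights, and the point are all made up for illustration.

```python
def updates_to_fix(w, x, eta, max_steps=10_000):
    """Count how many w += eta * x steps are needed before w.x >= 0."""
    w = list(w)
    steps = 0
    while sum(wi * xi for wi, xi in zip(w, x)) < 0 and steps < max_steps:
        w = [wi + eta * xi for wi, xi in zip(w, x)]
        steps += 1
    return steps

w = [-3.1, 0.5, 0.5]
x = [1.0, 1.0, 2.0]            # a positive point whose score starts below zero

for eta in (0.01, 0.1, 1.0):
    print(eta, updates_to_fix(w, x, eta))
```

Smaller values of $\eta$ need more updates for the same point, while a large value fixes it in one jump at the cost of moving the line a long way (which can disturb points that were already correct).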

---

## Vector Formulation

Rewrite the equation, renaming the bias $b$ as $w_0$:

$$
w_0 + w_1 x_1 + w_2 x_2 = 0
$$

Fold the bias into the vectors by prepending a constant $1$ to $\mathbf{x}$:

$$
\mathbf{x} = [1, x_1, x_2]
$$

$$
\mathbf{w} = [w_0, w_1, w_2]
$$

Decision function:

$$
\mathbf{w} \cdot \mathbf{x}
$$

Prediction rule:

- If $\mathbf{w} \cdot \mathbf{x} \ge 0$ → Class $1$
- Else → Class $0$
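
The vector form maps directly onto NumPy. The following is a minimal sketch: the function name `predict`, the weights, and the two (CGPA, IQ) rows are hypothetical values chosen only to show the mechanics.

```python
import numpy as np

def predict(X, w):
    """Classify rows of X (shape n x 2) with weights w = [w0, w1, w2]."""
    X_aug = np.insert(X, 0, 1, axis=1)   # prepend the constant 1 for the bias
    z = X_aug @ w                        # one dot product per row
    return (z >= 0).astype(int)          # class 1 if w.x >= 0, else class 0

X = np.array([[8.0, 120.0],              # hypothetical (CGPA, IQ) rows
              [5.0, 90.0]])
w = np.array([-13.0, 1.0, 0.05])         # hypothetical [w0, w1, w2]

print(predict(X, w))
```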

---

## Model Prediction

For a student with:

- CGPA = $x_1$
- IQ = $x_2$

Compute:

$$
z = w_0 + w_1 x_1 + w_2 x_2
$$

Prediction:

- If $z \ge 0$ → Placement = Yes
- If $z < 0$ → Placement = No

![Training Output Example](cell-10-output-1.png)

---

## Weight Update Rules

Let:

- $y \in \{+1, -1\}$ (the class labelled $0$ elsewhere in these notes corresponds to $-1$ here)
- $x$ = feature vector

### Update Formula

For a **misclassified** point:

$$
\mathbf{w}_{new} = \mathbf{w}_{old} + \eta \cdot y \cdot \mathbf{x}
$$
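
A short derivation, using only the definitions above, shows why this update helps the point that triggered it. Taking the dot product of the new weights with the same $\mathbf{x}$:

$$
\mathbf{w}_{new} \cdot \mathbf{x} = (\mathbf{w}_{old} + \eta \, y \, \mathbf{x}) \cdot \mathbf{x} = \mathbf{w}_{old} \cdot \mathbf{x} + \eta \, y \, \lVert \mathbf{x} \rVert^2
$$

Multiplying both sides by $y$ and using $y^2 = 1$:

$$
y \, (\mathbf{w}_{new} \cdot \mathbf{x}) = y \, (\mathbf{w}_{old} \cdot \mathbf{x}) + \eta \, \lVert \mathbf{x} \rVert^2
$$

Since $\eta \, \lVert \mathbf{x} \rVert^2 > 0$, the quantity $y \, (\mathbf{w} \cdot \mathbf{x})$ always increases for that point. A misclassified point has $y \, (\mathbf{w} \cdot \mathbf{x}) \le 0$, so each update pushes it towards the correct side, though one step is not always enough to cross the line.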

---

## Four Possible Cases

| Actual | Predicted | Update |
|------|----------|--------|
| +1 | +1 | No change |
| -1 | -1 | No change |
| +1 | -1 | $+ \eta x$ |
| -1 | +1 | $- \eta x$ |

Correctly classified points → **No update**
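
If the labels are coded as $0/1$ instead of $+1/-1$ (an assumption for this sketch), the four cases collapse into the single multiplier $(y - \hat{y})$, which is $0$ for a correct prediction and $\pm 1$ for a mistake. The function name `perceptron_update` and the numbers are illustrative.

```python
def perceptron_update(w, x, y, y_hat, eta=0.1):
    """One update with 0/1 labels: w + eta * (y - y_hat) * x."""
    return [wi + eta * (y - y_hat) * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0, 0.0]
x = [1.0, 2.0, 3.0]            # augmented point [1, x1, x2]

# (actual, predicted) pairs in the same order as the table rows
for y, y_hat in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    print(y, y_hat, perceptron_update(w, x, y, y_hat, eta=0.5))
```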

---

## Final Perceptron Algorithm

1. Initialize $\mathbf{w}$ randomly
2. Choose learning rate $\eta$
3. Loop for $N$ iterations:
   - Randomly select a data point
   - Compute prediction
   - If the point is misclassified, update weights using:
     $$
     \mathbf{w} = \mathbf{w} + \eta \cdot y \cdot \mathbf{x}
     $$
4. Stop when:
   - No misclassified points
   - Or max iterations reached
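
The steps above can be sketched as follows, assuming labels in $\{+1, -1\}$ and the rule "predict $+1$ when $\mathbf{w} \cdot \mathbf{x} \ge 0$". The function name `train_perceptron`, its parameters, the seed, and the tiny dataset are all assumptions made for this example; it is one possible reading of the algorithm, not a reference implementation.

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, max_iters=10_000, seed=0):
    """Perceptron with labels in {+1, -1}; returns (weights, converged)."""
    rng = np.random.default_rng(seed)
    X_aug = np.insert(X, 0, 1, axis=1)        # bias column of ones
    w = rng.normal(size=X_aug.shape[1])       # random starting line

    def predict_all(w):
        return np.where(X_aug @ w >= 0, 1, -1)

    for _ in range(max_iters):
        if np.all(predict_all(w) == y):       # stop: no misclassified points
            return w, True
        j = rng.integers(len(y))              # randomly select a data point
        y_hat = 1 if X_aug[j] @ w >= 0 else -1
        if y_hat != y[j]:                     # update only on a mistake
            w = w + eta * y[j] * X_aug[j]
    return w, bool(np.all(predict_all(w) == y))

# Small, clearly separable toy dataset
X = np.array([[2.0, 2.0], [3.0, 3.0], [3.0, 2.0],
              [-2.0, -2.0], [-3.0, -1.0], [-2.0, -3.0]])
y = np.array([1, 1, 1, -1, -1, -1])

w, converged = train_perceptron(X, y)
print(w, converged)
```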

---

## Convergence

Convergence means:

$$
\text{Number of misclassified points} = 0
$$

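At this point, the algorithm stops. This is guaranteed to happen only when the data is linearly separable; otherwise the loop ends when the iteration limit is reached.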
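
Reaching zero misclassified points is only possible when a separating line exists. As a sketch of the opposite situation, the example below runs the same update rule on an XOR-style dataset, which no straight line can separate, and records the number of mistakes at every step. The dataset, seed, and variable names are illustrative.

```python
import numpy as np

# XOR-style labels: no straight line separates the two classes
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1, 1, 1, -1])

rng = np.random.default_rng(0)
X_aug = np.insert(X, 0, 1, axis=1)            # bias column of ones
w = np.zeros(3)
eta = 0.1
mistakes_history = []

for _ in range(1000):
    preds = np.where(X_aug @ w >= 0, 1, -1)
    mistakes_history.append(int(np.sum(preds != y)))
    j = rng.integers(len(y))                  # random data point
    if preds[j] != y[j]:                      # update only on a mistake
        w = w + eta * y[j] * X_aug[j]

# The count never reaches zero, so only the iteration limit stops the loop
print(min(mistakes_history))
```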

---

## Key Takeaways

- Logistic Regression works best when the data is **linearly separable** (or almost separable)
- Perceptron provides **intuitive weight updates**
- Decision boundary is a **hyperplane**
- Learning rate controls stability
- Vector formulation simplifies implementation
- This forms the **foundation of Logistic Regression and Neural Networks**

---


In [39]:
from sklearn.datasets import make_classification
import numpy as np
X, y = make_classification(n_samples=100, n_features=2, n_informative=1,n_redundant=0,
                           n_classes=2, n_clusters_per_class=1, random_state=41,hypercube=False,class_sep=9)

In [40]:
import matplotlib.pyplot as plt
plt.figure(figsize=(10,6))
plt.scatter(X[:,0],X[:,1],c=y,cmap='winter',s=100)

In [41]:
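def perceptron(X,y):
    # Uses step(), defined in the next cell; run that cell before calling this function

    X = np.insert(X,0,1,axis=1)              # prepend a column of 1s so weights[0] acts as the bias
    weights = np.ones(X.shape[1])
    lr = 0.1

    for i in range(1000):
        j = np.random.randint(0,X.shape[0])  # pick a random row index
        y_hat = step(np.dot(X[j],weights))
        # labels are 0/1, so (y - y_hat) is 0 when the prediction is correct
        weights = weights + lr*(y[j]-y_hat)*X[j]

    return weights[0],weights[1:]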

In [42]:
def step(z):
    return 1 if z>0 else 0

In [43]:
intercept_,coef_ = perceptron(X,y)

In [44]:
print(coef_)
print(intercept_)

[1.444153   0.10583723]
0.9


In [45]:

m = -(coef_[0]/coef_[1])
b = -(intercept_/coef_[1])

In [46]:
x_input = np.linspace(-3,3,100)
y_input = m*x_input + b

In [55]:

plt.figure(figsize=(10,6))
plt.plot(x_input,y_input,color='red',linewidth=3)
plt.scatter(X[:,0],X[:,1],c=y,cmap='winter',s=100)
plt.ylim(-3,2)
plt.show()

In [48]:
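def perceptron(X,y):
    # Same training loop as before, but records the line (slope, intercept) after every step

    m = []
    b = []

    X = np.insert(X,0,1,axis=1)              # prepend a column of 1s so weights[0] acts as the bias
    weights = np.ones(X.shape[1])
    lr = 0.1

    for i in range(200):
        j = np.random.randint(0,X.shape[0])  # pick a random row index
        y_hat = step(np.dot(X[j],weights))
        weights = weights + lr*(y[j]-y_hat)*X[j]

        # slope and intercept of the current boundary, assuming weights[2] != 0
        m.append(-(weights[1]/weights[2]))
        b.append(-(weights[0]/weights[2]))

    return m,b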

In [49]:

m,b = perceptron(X,y)

In [50]:
%matplotlib notebook
from matplotlib.animation import FuncAnimation
import matplotlib.animation as animation
fig, ax = plt.subplots(figsize=(9,5))

x_i = np.arange(-3, 3, 0.1)
y_i = x_i*m[0] +b[0]
ax.scatter(X[:,0],X[:,1],c=y,cmap='winter',s=100)
line, = ax.plot(x_i, x_i*m[0] +b[0] , 'r-', linewidth=2)
plt.ylim(-3,3)
def update(i):
    label = 'epoch {0}'.format(i + 1)
    line.set_ydata(x_i*m[i] + b[i])
    ax.set_xlabel(label)
    return line, ax

anim = FuncAnimation(fig, update, repeat=True, frames=200, interval=100)

f = r"animation_line_plot.gif"
writergif = animation.PillowWriter(fps=2)
anim.save(f, writer=writergif)

In [51]:
from sklearn.linear_model import LogisticRegression
lor = LogisticRegression()
lor.fit(X,y)

In [52]:
m = -(lor.coef_[0][0]/lor.coef_[0][1])
b = -(lor.intercept_[0]/lor.coef_[0][1])

In [53]:
x_input1 = np.linspace(-3, 3, 100)
y_input1 = m * x_input1 + b

plt.figure(figsize=(10,6))
plt.plot(x_input, y_input, color='red', linewidth=3)
plt.plot(x_input1, y_input1, color='black', linewidth=3)
plt.scatter(X[:,0], X[:,1], c=y, cmap='winter', s=100)
plt.ylim(-3, 2)

plt.savefig("linear_plot.png", dpi=300, bbox_inches="tight")
plt.show()

