##### Introduction

There are two perspectives:
- Geometric
- Probabilistic

### Probabilistic Approach here

##### Requirement
- The dataset should be linearly separable, or nearly so
- If the classes can only be separated by a strongly non-linear boundary, logistic regression (a linear classifier) will not fit the data well

##### Perceptron Trick  

Suppose we have a linear decision boundary:

$$
A x_1 + B x_2 + C = 0
$$

or in vector form:

$$
w^T x + b = 0
$$

Where:  
- $$ w = [A, B] $$  
- $$ b = C $$  

In higher dimensions (for example, with three features):

$$
A x_1 + B x_2 + C x_3 + d = 0
$$

Here $$ w = [A, B, C] $$ and $$ b = d $$.

This represents a line in 2D, a plane in 3D, and a hyperplane in higher dimensions.
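As a minimal sketch of the vector form, the snippet below evaluates $$ w^T x + b $$ for a few points. The function name `decision_value` and the example boundary $$ x_1 + x_2 - 3 = 0 $$ are assumptions chosen for illustration; they are not from the notes.

```python
def decision_value(w, x, b):
    """Return w^T x + b for plain Python sequences w and x."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b


# Hypothetical 2D boundary: 1*x1 + 1*x2 - 3 = 0
w = [1.0, 1.0]
b = -3.0

on_line = decision_value(w, [1.0, 2.0], b)  # point lying on the boundary
above = decision_value(w, [3.0, 3.0], b)    # point on the positive side
below = decision_value(w, [0.0, 0.0], b)    # point on the negative side
```

The sign of the returned value tells which side of the boundary a point lies on; a value of zero means the point is on the boundary itself.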

---

##### Goal  

If the dataset is linearly separable, we want to find values of:

$$
A, B, C \quad (\text{or } w, b)
$$

Such that:  
- All positive points lie on one side  
- All negative points lie on the other side  

---

##### Step 1: Random Initialization  

Start with random values:

$$
w, b
$$

This gives an arbitrary starting line, which usually does not separate the classes yet.

---

##### Step 2: Pick a Training Example  

Select one data point $$ (x_i, y_i) $$

Where:

$$
y_i \in \{-1, +1\}
$$

---

##### Step 3: Check Classification  

Compute:

$$
y_i (w^T x_i + b)
$$

- If $$ > 0 $$ → correctly classified  
- If $$ \leq 0 $$ → misclassified  
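The check in Step 3 can be sketched as a small helper. The names `margin` and `is_misclassified`, and the example weights, are hypothetical; labels are assumed to be $$ -1 $$ or $$ +1 $$ as in Step 2.

```python
def margin(w, b, x, y):
    """Return y * (w^T x + b); positive means correctly classified."""
    score = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    return y * score


def is_misclassified(w, b, x, y):
    """A point counts as misclassified when its margin is <= 0."""
    return margin(w, b, x, y) <= 0


# Hypothetical boundary x1 + x2 - 3 = 0
w, b = [1.0, 1.0], -3.0

correct = is_misclassified(w, b, [3.0, 3.0], +1)      # margin is 3
wrong_side = is_misclassified(w, b, [3.0, 3.0], -1)   # margin is -3
on_boundary = is_misclassified(w, b, [1.0, 2.0], +1)  # margin is 0
```

Note that a point lying exactly on the boundary has margin 0 and is treated as misclassified, matching the $$ \leq 0 $$ condition.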

---

##### Step 4: Update Rule (Perceptron Trick)  

If misclassified:

$$
w = w + \eta y_i x_i
$$

$$
b = b + \eta y_i
$$

Where:  
- $$ \eta $$ = learning rate  

This shifts the decision boundary.
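A minimal sketch of a single update step, assuming labels in $$ \{-1, +1\} $$. The function name `perceptron_update` and the numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def perceptron_update(w, b, x, y, eta=1.0):
    """Return updated (w, b); parameters change only if (x, y) is misclassified."""
    score = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
    if y * score <= 0:
        # w <- w + eta * y * x,  b <- b + eta * y
        w = [w_j + eta * y * x_j for w_j, x_j in zip(w, x)]
        b = b + eta * y
    return w, b


# Positive point (1, 1) is on the negative side of x1 + x2 - 3 = 0 (score -1)
new_w, new_b = perceptron_update([1.0, 1.0], -3.0, [1.0, 1.0], +1)

# A correctly classified point leaves the parameters untouched
same_w, same_b = perceptron_update([1.0, 1.0], -3.0, [3.0, 3.0], +1)
```

After the update the score of the point (1, 1) becomes 2 + 2 - 2 = 2, so in this example one step is enough to fix it; in general several steps may be needed.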

---

##### Geometric Intuition  

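- If a positive point is wrongly classified → move the boundary toward it, shifting the point toward the positive side  
- If a negative point is wrongly classified → move the boundary so that the positive region retreats from it, shifting the point toward the negative side  

Updating with $$ \eta y_i x_i $$:

- If $$ y_i = +1 $$ → $$ w = w + \eta x_i $$  
- If $$ y_i = -1 $$ → $$ w = w - \eta x_i $$  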

The line rotates or shifts accordingly.

---

##### Iterative Process  

Repeat for many epochs:

1. Pick a point  
2. Check classification  
3. Update if necessary  
4. Move to next point  

Stop when either:
- No misclassifications remain, or
- The maximum number of iterations is reached
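The loop above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: weights start at zero instead of random values so the run is reproducible, points are visited in order, and the toy dataset is made up.

```python
def train_perceptron(points, labels, eta=1.0, max_epochs=100):
    """Cycle through the data, updating on mistakes, until an epoch has none."""
    w = [0.0] * len(points[0])  # assumption: zero init instead of random
    b = 0.0
    mistakes = 0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary)
                w = [w_j + eta * y * x_j for w_j, x_j in zip(w, x)]
                b += eta * y
                mistakes += 1
        if mistakes == 0:  # stopping rule 1: a full pass with no errors
            break
    return w, b, mistakes


# Hypothetical linearly separable toy data with labels in {-1, +1}
points = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [1.0, 0.0]]
labels = [1, 1, -1, -1]
w, b, mistakes = train_perceptron(points, labels)
margins = [y * (sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
           for x, y in zip(points, labels)]
```

If `mistakes` is 0 on return, every margin is strictly positive and the loop stopped early; otherwise the epoch limit was hit first.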

---

##### Important Insight  

The perceptron does not explicitly remember previous points.  

However, every weight update changes the boundary globally,  
which automatically affects all points.

---

##### Convergence Property  

- If the data is linearly separable → the perceptron is guaranteed to converge after a finite number of updates  
- If not separable → the algorithm never converges; the weights keep changing and may oscillate  
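To illustrate the non-separable case, the sketch below runs perceptron updates on the XOR pattern, which no straight line can separate. The helper names and the choice of 20 epochs are assumptions made for this example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def count_mistakes(w, b, points, labels):
    """Count points with y * (w^T x + b) <= 0."""
    return sum(1 for x, y in zip(points, labels) if y * (dot(w, x) + b) <= 0)


def mistakes_per_epoch(points, labels, eta=1.0, epochs=20):
    """Run perceptron updates and record the mistake count after each epoch."""
    w, b = [0.0] * len(points[0]), 0.0
    history = []
    for _ in range(epochs):
        for x, y in zip(points, labels):
            if y * (dot(w, x) + b) <= 0:
                w = [w_j + eta * y * x_j for w_j, x_j in zip(w, x)]
                b += eta * y
        history.append(count_mistakes(w, b, points, labels))
    return history


# XOR labels in {-1, +1}: not linearly separable
xor_points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
xor_labels = [-1, 1, 1, -1]
history = mistakes_per_epoch(xor_points, xor_labels)
```

Every entry of `history` is at least 1, because no line classifies all four XOR points correctly; this is why a maximum iteration count is needed as a stopping rule.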

---

##### Final Mental Model  

Perceptron Trick:

1. Start with a random line  
2. If a point is misclassified → push boundary to correct side  
3. Repeat  
4. Eventually converge (if separable)  

No probabilities.  
No explicit loss minimization.  
Pure geometric correction.

### How to label region 

[Desmos graphing calculator](https://desmos.com/calculator)

##### Understanding How A, B, C Affect the Line  

Consider the linear equation:

$$
Ax + By + C = 0
$$

This represents a straight line in 2D.

We can rewrite it in slope–intercept form (assuming $$ B \neq 0 $$):

$$
y = -\frac{A}{B}x - \frac{C}{B}
$$

From this form, we can clearly see how A, B, and C affect the line.

---

##### 1. Effect of Changing C  

Equation:

$$
Ax + By + C = 0
$$

If we change only **C**, while keeping A and B fixed:

- The slope $$ -\frac{A}{B} $$ remains the same.
- Only the intercept changes.

This means:

- The line shifts **parallel** to itself.
- The orientation does not change.
- The line moves up or down (or sideways depending on slope).

So, changing **C** shifts the line parallel to itself.

---

##### 2. Effect of Changing B  

From:

$$
y = -\frac{A}{B}x - \frac{C}{B}
$$

If B changes:

- The slope $$ -\frac{A}{B} $$ changes.
- The intercept $$ -\frac{C}{B} $$ also changes.

This causes:

- The line to rotate.
- The steepness (tilt) changes.

Geometrically, the line pivots about its x-intercept $$ \left(-\frac{C}{A}, 0\right) $$ (for $$ A \neq 0 $$), since that point does not depend on B.

So changing **B changes the slope**, causing rotation.

---

##### 3. Effect of Changing A  

Again from:

$$
y = -\frac{A}{B}x - \frac{C}{B}
$$

If A changes:

- The slope $$ -\frac{A}{B} $$ changes.
- The intercept $$ -\frac{C}{B} $$ stays the same (since C and B are constant), so the line pivots about the point $$ \left(0, -\frac{C}{B}\right) $$.

This also causes:

- Rotation of the line.
- Change in steepness.

So changing **A rotates the line**.

---

##### Clean Geometric Summary  

For the line:

$$
Ax + By + C = 0
$$

- Changing **C** → shifts line parallel  
- Changing **A** → changes slope (rotation)  
- Changing **B** → changes slope (rotation)  
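The summary can be checked numerically with a small sketch. The function name `slope_intercept` and the coefficient values are hypothetical; the formula is the slope–intercept form derived earlier and requires $$ B \neq 0 $$.

```python
def slope_intercept(A, B, C):
    """Return (slope, intercept) of Ax + By + C = 0, i.e. (-A/B, -C/B)."""
    if B == 0:
        raise ValueError("B must be non-zero for slope-intercept form")
    return -A / B, -C / B


base = slope_intercept(1.0, 2.0, 4.0)       # reference line
changed_c = slope_intercept(1.0, 2.0, 6.0)  # only C changed
changed_a = slope_intercept(3.0, 2.0, 4.0)  # only A changed
changed_b = slope_intercept(1.0, 4.0, 4.0)  # only B changed
```

Comparing the tuples: changing C keeps the slope and moves the intercept, changing A keeps the intercept and changes the slope, and changing B changes both.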

---

##### Important Insight for Perceptron  

In perceptron:

$$
w = [A, B], \quad b = C
$$

- Updating **b** moves boundary parallel  
- Updating **A or B** rotates the boundary  

That is why the update:

$$
w = w + \eta y x
$$

changes the orientation and position of the decision boundary.

[Desmos graphing calculator, for checking line transformations](https://desmos.com/calculator)

##### Perceptron Trick – Intuition Using Coefficients Directly  

Consider the decision boundary:

$$
Ax + By + C = 0
$$

We can represent it in vector form as:

$$
w = [A, B], \quad b = C
$$

For convenience, we often extend the input vector:

$$
x' = [x_1, x_2, 1]
$$

and parameter vector:

$$
\theta = [A, B, C]
$$

So the equation becomes:

$$
\theta^T x' = 0
$$
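A short sketch confirming that the extended form gives the same value as $$ w^T x + b $$. The numbers are made up for illustration, and `dot` is a hypothetical helper.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical coefficients and point
w = [2.0, -1.0]  # [A, B]
b = 0.5          # C
x = [3.0, 4.0]   # [x1, x2]

theta = w + [b]    # theta = [A, B, C]
x_aug = x + [1.0]  # x' = [x1, x2, 1]

separate_form = dot(w, x) + b      # w^T x + b
extended_form = dot(theta, x_aug)  # theta^T x'
```

Folding the bias into $$ \theta $$ this way means a single vector update handles both the weights and the bias.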

---

##### Case 1: Negative Point Lies on Positive Side  

Suppose a point belongs to class:

$$
y = -1
$$

But:

$$
\theta^T x' > 0
$$

This means it is wrongly classified (it lies on the positive side).

To correct this:

We **subtract the point coordinates** from the coefficient vector:

$$
\theta = \theta - \eta x'
$$

Why?

Because for a negative point, we want:

$$
\theta^T x' < 0
$$

Subtracting $$ \eta x' $$ lowers the score $$ \theta^T x' $$ by $$ \eta \|x'\|^2 $$,
which moves the boundary so that the point shifts toward the negative side.

---

##### Case 2: Positive Point Lies on Negative Side  

Suppose a point belongs to class:

$$
y = +1
$$

But:

$$
\theta^T x' < 0
$$

This means it is wrongly classified (it lies on the negative side).

To correct this:

We **add the point coordinates** to the coefficient vector:

$$
\theta = \theta + \eta x'
$$

Why?

Because for a positive point, we want:

$$
\theta^T x' > 0
$$

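Adding $$ \eta x' $$ raises the score $$ \theta^T x' $$ by $$ \eta \|x'\|^2 $$, which pulls the boundary toward the point so that it shifts toward the positive side.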

---

##### General Unified Update Rule  

Both cases combine into one equation:

$$
\theta = \theta + \eta y x'
$$

Where:

- If $$ y = +1 $$ → we add  
- If $$ y = -1 $$ → we subtract  
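
As a worked check of the unified rule, take the hypothetical values $$ \theta = [1, 1, -3] $$, a positive point $$ x' = [1, 1, 1] $$ with $$ y = +1 $$, and $$ \eta = 1 $$ (numbers chosen only for illustration):

$$
\begin{aligned}
\theta^T x' &= 1 + 1 - 3 = -1 < 0 \quad (\text{misclassified, since } y = +1) \\
\theta_{\text{new}} &= \theta + \eta y x' = [2, 2, -2] \\
\theta_{\text{new}}^T x' &= 2 + 2 - 2 = 2 > 0
\end{aligned}
$$

More generally, because $$ y^2 = 1 $$:

$$
y\,\theta_{\text{new}}^T x' = y\,(\theta + \eta y x')^T x' = y\,\theta^T x' + \eta \|x'\|^2
$$

So each update raises the quantity $$ y\,\theta^T x' $$ for the chosen point by $$ \eta \|x'\|^2 $$ (here $$ -1 + 3 = 2 $$), for either class.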

---

##### Geometric Meaning  

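- Positive misclassified point → pull the boundary toward it, shifting the point toward the positive side  
- Negative misclassified point → push the positive region away from it, shifting the point toward the negative side  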

Each update slightly rotates or shifts the line.

Over many updates, the boundary gradually transforms into the correct separating line (if data is linearly separable).

---

##### Key Insight  

The perceptron does not "solve" for A, B, C directly.  

Instead, it keeps transforming the line step by step:

1. Pick a point  
2. Check if correct  
3. If wrong → adjust coefficients  
4. Repeat until no mistakes remain  

This step-by-step transformation is what is called the **Perceptron Trick**.

# Algorithm

##### Converting the General Equation into Weight Form  

General equation of a line:

$$
Ax + By + C = 0
$$

We rewrite it as:

$$
w_0 + w_1 x_1 + w_2 x_2 = 0
$$

Where:

$$
w_0 = C, \quad w_1 = A, \quad w_2 = B
$$

---

##### Adding Bias as a Feature  

Let us add a new column $$ x_0 $$ in the dataset with all values equal to 1.

Dataset structure:

| x₀ | CGPA | IQ | Placed |
|----|------|----|--------|
| 1  | ...  | ...| ...    |

Now the equation becomes:

$$
w_0 x_0 + w_1 x_1 + w_2 x_2 = 0
$$

---

##### Summation Form  

We can write it compactly as:

$$
\sum_{i=0}^{2} w_i x_i = 0
$$

For example, for the first student (CGPA = 7.5, IQ = 134), the prediction works as follows.

If

$$
w_0 \cdot 1 + w_1 (7.5) + w_2 (134) > 0
$$

- then placement = 1 (placed)

If

$$
w_0 \cdot 1 + w_1 (7.5) + w_2 (134) < 0
$$

- then placement = 0 (not placed)
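A minimal sketch of this prediction rule for the first student. The weight values are hypothetical (the notes do not give any), the function name `predict_placement` is an assumption, and a score of exactly 0 is treated as "not placed" here, which the notes leave unspecified.

```python
def predict_placement(weights, cgpa, iq):
    """Return 1 (placed) if w0*1 + w1*cgpa + w2*iq > 0, else 0 (not placed)."""
    features = [1.0, cgpa, iq]  # x0 = 1 carries the bias weight w0
    score = sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


# Hypothetical weights [w0, w1, w2], for illustration only
placed = predict_placement([-50.0, 2.0, 0.3], 7.5, 134)      # score is about 5.2
not_placed = predict_placement([-60.0, 2.0, 0.3], 7.5, 134)  # score is about -4.8
```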

---

In matrix form, the left-hand side is the product:

$$
\begin{bmatrix}
w_0 & w_1 & w_2
\end{bmatrix}
\begin{bmatrix}
x_0 \\
x_1 \\
x_2
\end{bmatrix}
$$

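### Simplified Algorithm

Here $$ w_{\text{old}} $$ and $$ w_{\text{new}} $$ denote the full weight vector before and after an update, and $$ x_i $$ is the feature vector of the $$ i $$-th student (including $$ x_0 = 1 $$). $$ N $$ and $$ P $$ are the sets of negative and positive points.

If a negative point is predicted as positive:

$$
x_i \in N \quad \text{and} \quad w_{\text{old}}^T x_i > 0
$$

$$
w_{\text{new}} = w_{\text{old}} - \eta x_i
$$

If a positive point is predicted as negative:

$$
x_i \in P \quad \text{and} \quad w_{\text{old}}^T x_i < 0
$$

$$
w_{\text{new}} = w_{\text{old}} + \eta x_i
$$

With labels $$ y_i \in \{0, 1\} $$ (rather than $$ \pm 1 $$) and predictions $$ \hat{y}_i \in \{0, 1\} $$, both cases combine into:

$$
\boxed{
w_{\text{new}} = w_{\text{old}} + \eta (y_i - \hat{y}_i)\, x_i
}
$$

| $y_i$ | $\hat{y}_i$ | $y_i - \hat{y}_i$ | Update |
|-------|-------------|-------------------|--------|
| 1     | 1           | 0                 | none |
| 0     | 0           | 0                 | none |
| 1     | 0           | +1                | add $\eta x_i$ |
| 0     | 1           | -1                | subtract $\eta x_i$ |

$$
\boxed{
\begin{aligned}
&\text{for epoch in range(epochs):} \\
&\quad \text{select a random student } i \\
&\quad \hat{y}_i = 1 \text{ if } w_{\text{old}}^T x_i > 0 \text{ else } 0 \\
&\quad w_{\text{new}} = w_{\text{old}} + \eta (y_i - \hat{y}_i)\, x_i
\end{aligned}
}
$$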
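
The boxed procedure can be sketched in plain Python as below. This is a sketch under stated assumptions rather than a definitive implementation: labels are 0/1, weights start at zero, a score of exactly 0 is predicted as 0, and the function names, seed, and toy data are all hypothetical.

```python
import random


def step(z):
    """Step activation: 1 if the score is positive, otherwise 0."""
    return 1 if z > 0 else 0


def predict(weights, features):
    """Predict 0/1 for one row; weights[0] multiplies the constant x0 = 1."""
    x = [1.0] + list(features)
    return step(sum(w * x_j for w, x_j in zip(weights, x)))


def update_weights(weights, features, label, eta):
    """Apply one update: weights + eta * (y - y_hat) * x."""
    x = [1.0] + list(features)
    error = label - predict(weights, features)  # one of -1, 0, +1
    return [w + eta * error * x_j for w, x_j in zip(weights, x)]


def train(rows, labels, eta=1.0, epochs=2000, seed=0):
    """Pick a random row each iteration and apply the update rule."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        i = rng.randrange(len(rows))
        weights = update_weights(weights, rows[i], labels[i], eta)
    return weights


# Hypothetical linearly separable toy data with 0/1 labels
rows = [[2.0, 2.0], [3.0, 3.0], [0.0, 0.0], [1.0, 0.0]]
labels = [1, 1, 0, 0]
weights = train(rows, labels)
predictions = [predict(weights, r) for r in rows]
```

When the error term is 0 the weights are returned unchanged, so correctly classified rows cost nothing beyond the prediction itself.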