# 📝 KNN (K-Nearest Neighbors) – Exam Notes
## 🔹 Definition
- Supervised ML algorithm (classification + regression).
- Predicts the label/value of a new point by looking at its K nearest neighbors.
- Lazy learner → no real training phase (it only stores the training data); all the work happens at prediction time.
## 🔹 Steps of KNN
1. Choose K (an odd K avoids tied votes in binary classification).
2. Calculate the distance between the new point and every training point, using one of:
   - Euclidean: $d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$
   - Manhattan: $d(x, y) = \sum_{i} |x_i - y_i|$
   - Cosine similarity (for text/angles)
3. Select the K nearest neighbors.
4. Prediction:
   - Classification → majority vote
   - Regression → average value
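
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `knn_classify` and the toy points are made up for the example, and Euclidean distance is assumed as the metric.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Step 2: Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 3: keep the k closest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 4: majority vote over the neighbors' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two small clusters in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_classify(points, labels, (2.0, 2.0), k=3))  # query inside the "A" cluster
print(knn_classify(points, labels, (6.0, 7.0), k=3))  # query inside the "B" cluster
```

Note that nothing happens before the query arrives: the "model" is just the stored training data, which is what "lazy learner" means.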

## 🔹 When to Use?

- Dataset output (Y) categorical → Classification
- Dataset output (Y) continuous/numerical → Regression
## 🔹 Advantages
- ✅ Simple & intuitive
- ✅ No training cost (fast to set up)
- ✅ Works with non-linear decision boundaries
- ✅ Handles multi-class problems
## 🔹 Disadvantages
- ❌ Slow prediction (each query computes the distance to every training point)
- ❌ Memory hungry (stores all training data)
- ❌ Sensitive to irrelevant features & scaling
- ❌ High dimensions → distance becomes meaningless (curse of dimensionality)
## 🔹 Important Points
- Always normalize/standardize features before KNN, so that features with large numeric ranges do not dominate the distance.
- Choose the best K using cross-validation.
- Small K → noisy results, large K → smoother but may ignore local patterns.
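
A rough sketch of both points, using only the standard library: z-score standardization followed by leave-one-out cross-validation over a few values of K. The helper names (`standardize`, `knn_predict`, `loo_accuracy`) and the six data rows are hypothetical, chosen only to illustrate the idea.

```python
import math
import statistics
from collections import Counter

def standardize(rows):
    """Z-score each column: subtract its mean, divide by its standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]  # assumes no constant column
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stdevs))
        for row in rows
    ]

def knn_predict(points, labels, query, k):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(zip(points, labels), key=lambda pl: math.dist(pl[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(points, labels, k):
    """Leave-one-out cross-validation: predict each point from all the others."""
    hits = 0
    for i, (query, truth) in enumerate(zip(points, labels)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        hits += knn_predict(rest_points, rest_labels, query, k) == truth
    return hits / len(points)

# Hypothetical data: (weight in grams, color score); weight has the larger scale.
raw = [(150, 8), (160, 9), (155, 8), (200, 6), (210, 5), (205, 6)]
labels = ["apple", "apple", "apple", "orange", "orange", "orange"]

scaled = standardize(raw)
for k in (1, 3, 5):
    print(k, loo_accuracy(scaled, labels, k))
```

On this tiny dataset K = 5 fails completely: with one point held out, the five remaining neighbors always contain three points of the other class, so the vote ignores the local pattern. That is the "large K" risk from the last bullet in miniature.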
### 🔹 Shortcut Memory Trick
1. 👉 KNN = “Look at your K closest neighbors, follow majority (classification) or average (regression).”
2. ⚡ Just remember these points and you can easily tackle any exam question (definition, working, pros/cons, example).

# 🟢 What Is Euclidean Distance?

Euclidean distance is the straight-line (direct) distance between two points, as if you measured it with a ruler 📏.

📌 Formula (2D case):

If you have two points:

- Point A = (x₁, y₁)
- Point B = (x₂, y₂)

then the distance between them is:

$$
d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}
$$

📌 Formula (n-dimensional case):

If there are more features (weight, color, sweetness, size, …), say n of them:

$$
d = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
$$

In words → square the difference for each feature, add up all the squared differences, and take the square root of the sum.
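
The n-dimensional formula translates almost word for word into code. A small sketch (the function name `euclidean_distance` is just for illustration; Python's built-in `math.dist` computes the same quantity):

```python
import math

def euclidean_distance(x, y):
    """Straight-line distance between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("both points need the same number of features")
    # Square each per-feature difference, sum them, then take the square root.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

print(euclidean_distance((0, 0), (3, 4)))              # 2 features → 5.0
print(euclidean_distance((1, 2, 3, 4), (1, 2, 3, 4)))  # identical points → 0.0
```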

🟢 Example:

Suppose you have a fruits dataset:

- Apple = (weight = 150g, color = 8)
- Orange = (weight = 200g, color = 6)
- New fruit = (weight = 180g, color = 7)

Distance from Apple:

$$
\sqrt{(180 - 150)^2 + (7 - 8)^2} = \sqrt{900 + 1} \approx 30.02
$$

Distance from Orange:

$$
\sqrt{(180 - 200)^2 + (7 - 6)^2} = \sqrt{400 + 1} \approx 20.02
$$

👉 Since 20.02 < 30.02, the new fruit is closer to Orange → with K = 1, KNN labels it an Orange 🍊
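
The same fruit example can be checked in code. This sketch uses the three fruits from the example and Python's `math.dist` for the Euclidean distance; the dictionary layout is just one convenient way to hold the data.

```python
import math

# Training fruits from the example: (weight in grams, color score).
fruits = {"Apple": (150, 8), "Orange": (200, 6)}
new_fruit = (180, 7)

# Distance from the new fruit to each labelled fruit.
distances = {name: math.dist(features, new_fruit) for name, features in fruits.items()}
for name, d in distances.items():
    print(f"{name}: {d:.2f}")

# With K = 1 the prediction is simply the label of the closest fruit.
nearest = min(distances, key=distances.get)
print("Predicted:", nearest)
```

Almost all of each distance comes from the weight difference (900 and 400 versus 1 from color), because weight is measured on a much larger scale. This is why features are usually scaled before running KNN.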

🟢 How it Works in KNN

1. A new data point arrives.
2. Calculate its distance to every training point.
3. Look at the K nearest points.
4. Take the majority vote (classification) or the average (regression).
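
For the regression case, the only change is the last step: average the neighbors' values instead of voting. A minimal sketch, where the function name `knn_regress` and the house data are hypothetical:

```python
import math

def knn_regress(train_points, train_values, query, k=3):
    """Predict a numeric value as the mean of the k nearest neighbors' values."""
    # Pair every training point with its value and sort by distance to the query.
    ranked = sorted(
        zip(train_points, train_values),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_values = [value for _, value in ranked[:k]]
    return sum(nearest_values) / len(nearest_values)

# Hypothetical data: (size, rooms) → price in arbitrary units.
houses = [(50, 1), (60, 2), (80, 3), (120, 4)]
prices = [100.0, 130.0, 170.0, 260.0]

# The two closest houses to (62, 2) are (60, 2) and (50, 1).
print(knn_regress(houses, prices, (62, 2), k=2))  # (130 + 100) / 2 = 115.0
```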

⚡ Shortcut:
"Euclidean distance = straight-line distance in feature space"

# 📘 Manhattan Distance (L1 Norm)
🔹 Definition

Manhattan Distance (Taxicab Distance / L1 Norm) is the sum of absolute differences between the coordinates of two points.

It is called "Manhattan" because a taxi driver in New York travels along the grid of streets → straight left/right or up/down, never diagonally 🚖🗽.

🔹 Formula

For two points $P(x_1, y_1)$ and $Q(x_2, y_2)$:

$$
D = |x_2 - x_1| + |y_2 - y_1|
$$

General (n dimensions):

$$
D = \sum_{i=1}^{n} |x_i - y_i|
$$

🔹 Example

Point A = (2, 3), Point B = (5, 7)

$$
D = |5 - 2| + |7 - 3| = 3 + 4 = 7
$$

👉 The Euclidean distance here would be 5; the Manhattan distance is 7.
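
A short sketch of the same calculation in code, with the Euclidean value alongside for comparison. The function name `manhattan_distance` is only illustrative.

```python
import math

def manhattan_distance(x, y):
    """Sum of absolute per-coordinate differences (L1 norm of x - y)."""
    if len(x) != len(y):
        raise ValueError("both points need the same number of coordinates")
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

a, b = (2, 3), (5, 7)
print(manhattan_distance(a, b))  # 3 + 4 = 7
print(math.dist(a, b))           # Euclidean distance for comparison: 5.0
```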

🔹 Applications

1. Text mining / NLP → word frequency differences
2. High-dimensional / sparse data (e.g. TF-IDF, images)
3. Grid-based movement problems (pathfinding in games, robotics)
4. Sometimes used in KNN, clustering

🔹 Advantages

1. ✅ Simple to compute
2. ✅ Works well with high-dimensional sparse data
3. ✅ Better when movement is restricted to grid

🔹 Disadvantages

- ❌ Less intuitive than Euclidean for continuous geometric data
- ❌ Gives distances at least as large as Euclidean for the same pair of points (less compact)
- ❌ Sensitive to feature scaling

🔹 When to Use?

1. Use Euclidean → when similarity is geometric/continuous (height, weight, size)
2. Use Manhattan → when features are discrete, sparse, or grid-like (text data, pixel intensity, pathfinding)
#### ⚡ Shortcut to Remember:
- “Taxi on straight city streets → Manhattan distance”

# 📘 Minkowski Distance
🔹 Definition

Minkowski distance is a generalized distance metric that covers both Manhattan (L1) and Euclidean (L2), depending on the parameter p.
It defines the distance between two points in an n-dimensional space.

🔹 Formula

For two points $P(x_1, x_2, \ldots, x_n)$ and $Q(y_1, y_2, \ldots, y_n)$:

$$
D = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}
$$

🔹 Special Cases

- If $p = 1$ → Manhattan Distance: $D = \sum_{i} |x_i - y_i|$
- If $p = 2$ → Euclidean Distance: $D = \sqrt{\sum_{i} (x_i - y_i)^2}$
- If $p \to \infty$ → Chebyshev Distance: $D = \max_{i} |x_i - y_i|$

🔹 Example

Take the points $A(2, 3)$ and $B(5, 7)$.

For p = 1 (Manhattan):

$$
|5 - 2| + |7 - 3| = 7
$$

For p = 2 (Euclidean):

$$
\sqrt{(5 - 2)^2 + (7 - 3)^2} = \sqrt{25} = 5
$$

For p = 3 (Minkowski general):

$$
\left( |5 - 2|^3 + |7 - 3|^3 \right)^{1/3} = (27 + 64)^{1/3} = 91^{1/3} \approx 4.50
$$
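
The three cases above, plus the p → ∞ limit, can be reproduced with one small function. This is a sketch under the assumption p ≥ 1; the names `minkowski_distance` and `chebyshev_distance` are illustrative only.

```python
def minkowski_distance(x, y, p):
    """Minkowski distance of order p (p >= 1) between equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("both points need the same number of coordinates")
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1 / p)

def chebyshev_distance(x, y):
    """Limit of the Minkowski distance as p grows: the largest coordinate gap."""
    return max(abs(xi - yi) for xi, yi in zip(x, y))

a, b = (2, 3), (5, 7)
# p = 1 is Manhattan, p = 2 is Euclidean; a large p approaches Chebyshev.
for p in (1, 2, 3, 50):
    print(p, round(minkowski_distance(a, b, p), 3))
print("chebyshev", chebyshev_distance(a, b))
```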

🔹 Applications

1. KNN classifier (choosing distance metric)
2. Clustering algorithms (K-means, Hierarchical clustering)
3. Any ML/AI task where "distance" matters but you want flexibility

🔹 Advantages
1. ✅ Flexible (covers multiple distances in one formula)
2. ✅ You can tune p based on your dataset
3. ✅ Makes it easy to experiment with metrics

🔹 Disadvantages

1. ❌ Requires choosing the right p (trial & error)
2. ❌ Sensitive to feature scaling (needs normalization)
3. ❌ For high dimensions, can become computationally expensive

⚡ Shortcut to Remember:
👉 "Minkowski = the Swiss Army knife of distances 🗡️ … Manhattan and Euclidean are both its special cases."



# 📌 Hamming Distance (HD)
✅ Definition:

Measures the number of positions where two strings of equal length differ.

✅ Formula:
$$
HD(x, y) = \sum_{i=1}^{n} [x_i \neq y_i]
$$

(Each term is 1 where the symbols differ and 0 where they match, so the sum counts mismatches only.)

✅ Example:
x = 1011101  
y = 1001001  
HD = 2
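
A minimal sketch of the count in code. The function name `hamming_distance` is illustrative; it raises an error for unequal lengths, since the distance is only defined for sequences of the same length.

```python
def hamming_distance(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("Hamming distance needs sequences of equal length")
    # Each comparison is True (1) for a mismatch and False (0) for a match.
    return sum(xi != yi for xi, yi in zip(x, y))

print(hamming_distance("1011101", "1001001"))  # the example above → 2
print(hamming_distance("karolin", "kathrin"))  # also works on text → 3
```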

✅ When to Use:
- Binary / categorical / string data
- Error detection in communication
- DNA sequencing (genomics)
- Text comparison (spell check, plagiarism)

✅ Key Point:
- Works only if both sequences have the same length.
- Euclidean/Manhattan = numeric data
- Hamming = categorical/string data
⚡ Shortcut:
👉 Hamming = "Count the mismatches."
