## What is Distance?
Distance is simply a number that measures how different two pieces of data are from each other.

## Key Points:
1. Distance is always a **non-negative number** (zero or positive)
2. The smaller the distance, the more similar the items
3. The larger the distance, the more different the items

## Real-World Examples:

### Movie Ratings Example
Imagine three people rating the same five movies from 1 to 5 stars:
- Person A: [5,4,3,2,1]
- Person B: [5,4,3,2,2]
- Person C: [1,2,3,4,5]

The ratings of Person A and B are very similar, so they have a **small distance**.
The ratings of Person A and C are very different, so they have a **large distance**.


```
If d(x,y) is small → x and y are similar
If d(x,y) < d(x,z) → x is more similar to y than to z
```
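
The movie ratings comparison can be checked with a few lines of Python. This is a minimal sketch that assumes Euclidean (straight-line) distance; the text above does not name a specific distance, and the helper name `euclidean` is made up for illustration.

```python
import math

# Ratings from the example above (1-5 stars for the same five movies).
person_a = [5, 4, 3, 2, 1]
person_b = [5, 4, 3, 2, 2]
person_c = [1, 2, 3, 4, 5]

def euclidean(x, y):
    """Straight-line distance between two equal-length rating lists."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

d_ab = euclidean(person_a, person_b)
d_ac = euclidean(person_a, person_c)

print(f"d(A, B) = {d_ab:.2f}")  # 1.00
print(f"d(A, C) = {d_ac:.2f}")  # 6.32
print("A is more similar to B than to C:", d_ab < d_ac)  # True
```

Other distance choices would give different numbers, but for these ratings the ordering (A closer to B than to C) is what matters.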


## The Basic Idea
Think of similarity as the opposite of distance:
* When similarity is HIGH, distance is LOW
* When similarity is LOW, distance is HIGH

## Similarity Scores
* Usually ranges from 0 to 1 (like a percentage)
* 1 = Perfect match (100% similar)
* 0 = Completely different (0% similar)


* A similarity score can usually be converted to a distance, and back, using a simple formula


## Converting Between Similarity and Distance
If similarity is on a 0-1 scale, you can convert:
* Distance = 1 - Similarity
* Similarity = 1 - Distance

Example:
```
If similarity = 0.7
Then distance = 1 - 0.7 = 0.3

If distance = 0.4
Then similarity = 1 - 0.4 = 0.6
```
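
The two conversions can be written as small helper functions. This is a sketch only; the function names are illustrative, and the range check is an added assumption so that values outside the 0-1 scale are rejected rather than silently converted.

```python
def similarity_to_distance(similarity):
    """Convert a similarity on the 0-1 scale to a distance on the 0-1 scale."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    return 1.0 - similarity

def distance_to_similarity(distance):
    """Convert a distance on the 0-1 scale to a similarity on the 0-1 scale."""
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must be between 0 and 1")
    return 1.0 - distance

# Rounded to hide floating-point noise in 1 - 0.7.
print(round(similarity_to_distance(0.7), 2))  # 0.3
print(round(distance_to_similarity(0.4), 2))  # 0.6
```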


## Netflix Movie Recommendations

Comparing two viewers' taste in movies:
```
Viewer A and B:
- Similarity Score: 0.8 (very similar taste)
- Distance Score: 0.2 (very close)

Viewer A and C:
- Similarity Score: 0.2 (very different taste)
- Distance Score: 0.8 (far apart)
```


### When Values Aren't on a 0-1 Scale
For larger ranges, we can convert using:
1. Simple scaling:

   $$\text{Similarity} = 1 - \frac{\text{Distance}}{\text{Maximum Distance}}$$
2. Or using the fraction method:

   $$\text{Similarity} = \frac{1}{1 + \text{Distance}}$$
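
Both conversions for larger ranges can be sketched in Python. The function names and the sample numbers below are hypothetical, chosen only to show how the two formulas behave.

```python
def similarity_scaled(distance, max_distance):
    """Simple scaling: 1 - distance / max_distance (needs a known maximum)."""
    if max_distance <= 0:
        raise ValueError("max_distance must be positive")
    return 1.0 - distance / max_distance

def similarity_fraction(distance):
    """Fraction method: 1 / (1 + distance) (no maximum needed)."""
    return 1.0 / (1.0 + distance)

# Hypothetical values: a distance of 2 when the largest possible distance is 8.
print(similarity_scaled(2.0, 8.0))  # 0.75
print(similarity_fraction(0.0))     # 1.0
print(similarity_fraction(3.0))     # 0.25
```

One difference worth noting: simple scaling reaches exactly 0 at the maximum distance, while the fraction method only approaches 0 as the distance grows.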


### The 4 Rules of Distance Measurements

* To count as a proper distance measure (a metric), a function must follow these four basic rules.

* Rule 1: Non-Negativity 
  * Distance can never be negative
  * Example:
    - The distance between New York and Boston can be 215 miles, but never -215 miles

* Rule 2: Identity Rule
  * Only identical things have zero distance between them
  * Examples:
    - Distance between an apple and itself = 0
    - Distance between an apple and an orange ≠ 0
    - If the distance is 0, the objects MUST be identical

* Rule 3: Symmetry Rule
  * The distance from A to B is the same as B to A

* Rule 4: Triangle Inequality
  * "The direct path is never longer than a detour": d(A,C) ≤ d(A,B) + d(B,C)
  * Real-world example:


![image.png](attachment:image.png)
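
The four rules can also be tested numerically on a handful of points. The sketch below is illustrative only: `check_metric_rules` is a made-up helper, the sample points are arbitrary, and passing the check on a few points does not prove that a function is a metric in general. Squared Euclidean distance is included as a contrast because it breaks Rule 4.

```python
import math
from itertools import product

def check_metric_rules(points, d, tol=1e-12):
    """Return True if d obeys all four rules on every combination of the given points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                         # Rule 1: non-negativity
            return False
        if (d(x, y) == 0) != (x == y):          # Rule 2: zero only for identical items
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # Rule 3: symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:   # Rule 4: triangle inequality
            return False
    return True

def squared_euclidean(x, y):
    """Euclidean distance squared; fails the triangle inequality."""
    return math.dist(x, y) ** 2

points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 3.0)]

print(check_metric_rules(points, math.dist))          # True
print(check_metric_rules(points, squared_euclidean))  # False
```

The squared version fails on the first three points: going from (0, 0) to (2, 0) directly costs 4, but the detour through (1, 0) costs only 1 + 1 = 2.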


### What is a Metric Space?
A metric space is just a collection of objects together with a distance measure that follows the four rules above, so we can measure the distance between any two objects in a consistent way.


1. Physical World (3D Space)
  * Objects: Points in space
  * Distance: Regular measuring tape distance
  * Example: Finding nearest coffee shops to your location

2. Word Space
  * Objects: Words or documents
  * Distance: How different two words/documents are
  * Example: Finding similar documents in Google
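
The coffee shop example from the physical-world case can be sketched as a ranking by distance. The shop names and coordinates here are hypothetical, and the sketch assumes a flat map measured in miles.

```python
import math

# Hypothetical positions on a flat map, in miles.
you = (0.0, 0.0)
shops = {
    "Shop A": (1.0, 0.0),
    "Shop B": (3.0, 4.0),
    "Shop C": (0.5, 0.5),
}

# Rank the shops from nearest to farthest.
ranked = sorted(shops, key=lambda name: math.dist(you, shops[name]))
print(ranked)  # ['Shop C', 'Shop A', 'Shop B']
```

The same pattern applies to the word space case once a distance between documents has been chosen; only the objects and the distance function change.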




### Why Are Metric Spaces Useful?
* They help us answer questions like:
  * "What's the closest...?"
  * "Which items are most similar?"
  * "What group does this belong to?"

* Real-World Applications:
  * Netflix finding similar movies
  * Spotify suggesting similar songs
  * Amazon recommending similar products
  * Face ID matching your selfie


### How Triangle Inequality Speeds Up Cluster Search

* The Problem
  * Given: Multiple cluster centers (c₁, c₂, ..., cₖ)
  * Goal: Find the closest center to each data point x
  * Challenge: Doing it efficiently!


1. Naive Approach (Slow)

```
For each point x:
    Calculate distance to every cluster center
    Pick the smallest distance
Cost: n*k calculations (n = points, k = clusters)
```
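
The naive approach can be sketched in Python as follows. The function name, the sample points, and the sample centers are assumptions made for illustration; the counter is there only to make the n*k cost visible.

```python
import math

def assign_naive(points, centers):
    """Return (labels, number of distance calculations) using brute force."""
    labels = []
    n_calcs = 0
    for x in points:
        best_index, best_dist = 0, math.inf
        for j, c in enumerate(centers):
            dist = math.dist(x, c)  # one calculation per (point, center) pair
            n_calcs += 1
            if dist < best_dist:
                best_index, best_dist = j, dist
        labels.append(best_index)
    return labels, n_calcs

# Hypothetical data: 4 points, 3 cluster centers.
points = [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0), (10.0, 10.0)]
centers = [(0.5, 0.5), (9.5, 9.5), (20.0, 20.0)]

labels, n_calcs = assign_naive(points, centers)
print(labels)   # [0, 0, 1, 1]
print(n_calcs)  # 12, which is n * k = 4 * 3
```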




### How Triangle Inequality Speeds Up Cluster Search - Cont'd

2. Smart Approach (Fast)
  * Uses triangle inequality to skip unnecessary calculations!

#### The Trick:
If we know:
* Distance between point x and c₁: d(x,c₁)
* Distance between centers c₁ and c₂: d(c₁,c₂)
* And if: d(c₁,c₂) > 2d(x,c₁)

Then: c₂ CANNOT be closer to x than c₁

#### Why This Works:
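* If c₂ were closer, it would violate the triangle inequality:
  - The triangle inequality says d(c₁,c₂) ≤ d(x,c₁) + d(x,c₂)
  - Rearranged: d(x,c₂) ≥ d(c₁,c₂) - d(x,c₁)
  - Since d(c₁,c₂) > 2d(x,c₁), this gives d(x,c₂) > d(x,c₁)
* Think of it like a circle:
  - Center: point x
  - Radius: distance from x to the current best center
  - Any center closer to x would have to lie inside this circle
  - A center that is more than twice the radius away from the current best center lies outside the circle, so it can't be closer to x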

#### Real-World Example:
```
Finding nearest coffee shop:
- You're 1 mile from Starbucks (c₁)
- There's another coffee shop (c₂) 3 miles from Starbucks
- Since 3 miles > 2 × 1 mile, c₂ is at least 3 - 1 = 2 miles from you
- So c₂ can't possibly be closer to you than Starbucks!
```
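
The trick can be sketched in Python as a nearest-center search that skips any center ruled out by the test d(c_best, c_j) > 2·d(x, c_best). This is a simplified illustration, not a full clustering algorithm: the function name and sample data are assumptions, and the center-to-center distances are computed once up front, which adds k(k-1)/2 distance calculations that the counter below does not include.

```python
import math

def assign_with_pruning(points, centers):
    """Nearest-center search that skips centers ruled out by the triangle inequality.

    Returns (labels, number of point-to-center distance calculations).
    """
    k = len(centers)
    # Center-to-center distances, computed once and reused for every point.
    center_dist = [[math.dist(a, b) for b in centers] for a in centers]

    labels = []
    n_calcs = 0
    for x in points:
        best = 0
        best_dist = math.dist(x, centers[0])
        n_calcs += 1
        for j in range(1, k):
            # If d(c_best, c_j) > 2 * d(x, c_best), c_j cannot be closer than c_best.
            if center_dist[best][j] > 2 * best_dist:
                continue
            dist = math.dist(x, centers[j])
            n_calcs += 1
            if dist < best_dist:
                best, best_dist = j, dist
        labels.append(best)
    return labels, n_calcs

# Hypothetical data: 4 points, 3 cluster centers.
points = [(0.0, 0.0), (1.0, 1.0), (9.0, 9.0), (10.0, 10.0)]
centers = [(0.5, 0.5), (9.5, 9.5), (20.0, 20.0)]

labels, n_calcs = assign_with_pruning(points, centers)
print(labels)   # [0, 0, 1, 1]
print(n_calcs)  # 6, compared with 4 * 3 = 12 for the brute-force search
```

On data this small the saving is wiped out by the up-front center-to-center work; the payoff comes when there are many more points than centers, because the center distances are shared by every point.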

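#### Benefits:
* Fewer distance calculations needed
* Much faster for large datasets with many cluster centers
* Especially useful for high-dimensional data, where each distance calculation is expensive
* Key optimization in accelerated clustering algorithms such as Elkan's k-means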