[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/buildLittleWorlds/ml-math-with-densworld/blob/main/modules/02-linear-algebra/notebooks/03-dot-product-similarity.ipynb)

# Lesson 3: The Dot Product — Measuring Alignment

*"Distance tells you how far apart things are. The dot product tells you whether they're pointed the same way. A Marsh Hornet and a Stakdur may be distant in space, but they share the same predatory intent—their behavioral vectors point in a similar direction."*  
— Boffa Trent, *Natural Philosophy of the Quarry*

---

## The Core Insight

In the last two lessons, we measured how "different" creatures are using distance. But sometimes we care about something else: **are two creatures fundamentally similar in character, even if one is more extreme than the other?**

Consider:
- A young Stakdur and a mature Stakdur have different magnitudes (the mature one scores higher on every trait)
- But they **point the same direction** in behavioral space—they share the same predatory profile

The **dot product** captures this notion of alignment. It's the foundation of:
- Cosine similarity (used in document/text analysis)
- Recommender systems ("users who liked this also liked...")
- Word embeddings (word2vec, semantic similarity)
- Attention mechanisms in neural networks

---

## Learning Objectives

By the end of this lesson, you will:
1. Calculate and interpret the dot product between vectors
2. Understand cosine similarity and when to use it
3. Use the dot product to find similar creatures and manuscripts
4. Connect the dot product to projection and feature importance

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.spatial.distance import cosine

# Set random seed for reproducibility
np.random.seed(42)

# Nice plotting defaults
plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)

# Colab-ready data loading
BASE_URL = "https://raw.githubusercontent.com/buildLittleWorlds/ml-math-with-densworld/main/data/"

# Load our datasets
creature_vectors = pd.read_csv(BASE_URL + "creature_vectors.csv")
creature_similarity = pd.read_csv(BASE_URL + "creature_similarity.csv")
manuscripts = pd.read_csv(BASE_URL + "manuscript_features.csv")

print(f"Loaded {len(creature_vectors)} creatures with behavioral/habitat vectors")
print(f"Loaded {len(creature_similarity)} pairwise similarity calculations")
print(f"Loaded {len(manuscripts)} manuscript records")

## Part 1: The Dot Product Formula

The **dot product** of two vectors $\mathbf{a}$ and $\mathbf{b}$ is:

$$\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{n} a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$

Multiply corresponding elements, then sum. Simple!

But what does it *mean*?

In [None]:
# Simple dot product calculation
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Manual calculation
dot_manual = a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

# NumPy calculation
dot_numpy = np.dot(a, b)

print("Dot Product Calculation:")
print(f"  a = {a}")
print(f"  b = {b}")
print(f"")
print(f"  a · b = (1×4) + (2×5) + (3×6)")
print(f"        = 4 + 10 + 18")
print(f"        = {dot_manual}")
print(f"")
print(f"  np.dot(a, b) = {dot_numpy}")

## Part 2: Geometric Interpretation — Alignment

The dot product has a beautiful geometric meaning:

$$\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta)$$

Where $\theta$ is the angle between the vectors.

This means:
- **Positive dot product**: Vectors point in similar directions ($\theta < 90°$)
- **Zero dot product**: Vectors are perpendicular ($\theta = 90°$)
- **Negative dot product**: Vectors point in broadly opposing directions ($\theta > 90°$), and exactly opposite only at $\theta = 180°$
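The sign rules above follow from solving the geometric formula for $\theta$. A minimal sketch, assuming nonzero input vectors; the helper name `angle_between` is hypothetical and not part of the lesson's datasets or code:

```python
import numpy as np

def angle_between(a, b):
    """Angle in degrees between two nonzero vectors (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Rounding error can push the ratio slightly outside [-1, 1],
    # which would make arccos return NaN, so clip it first.
    cos_theta = np.clip(cos_theta, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))

# One pair per case: acute angle, right angle, obtuse angle
pairs = [([1, 0.5], [0.8, 0.6]), ([1, 0], [0, 1]), ([1, 0.3], [-0.8, -0.2])]
for a, b in pairs:
    dot = float(np.dot(a, b))
    print(f"a.b = {dot:+.2f}  ->  angle = {angle_between(a, b):.1f} degrees")
```

The clipping step is a design choice for numerical safety: without it, two nearly parallel vectors can occasionally produce a ratio a hair above 1 and a `nan` angle.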

In [None]:
# Visualize dot product as alignment
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# Case 1: Similar direction (positive dot product)
ax = axes[0]
v1 = np.array([1, 0.5])
v2 = np.array([0.8, 0.6])
dot = np.dot(v1, v2)
angle = np.arccos(dot / (np.linalg.norm(v1) * np.linalg.norm(v2))) * 180 / np.pi

ax.arrow(0, 0, v1[0], v1[1], head_width=0.05, head_length=0.03, fc='blue', ec='blue', linewidth=2)
ax.arrow(0, 0, v2[0], v2[1], head_width=0.05, head_length=0.03, fc='red', ec='red', linewidth=2)
ax.set_xlim(-0.2, 1.3)
ax.set_ylim(-0.2, 1.0)
ax.set_aspect('equal')
ax.set_title(f'Similar Direction\na·b = {dot:.2f}, θ = {angle:.0f}°', fontsize=12)
ax.axhline(0, color='black', linewidth=0.5)
ax.axvline(0, color='black', linewidth=0.5)

# Case 2: Perpendicular (zero dot product)
ax = axes[1]
v1 = np.array([1, 0])
v2 = np.array([0, 1])
dot = np.dot(v1, v2)

ax.arrow(0, 0, v1[0], v1[1], head_width=0.05, head_length=0.03, fc='blue', ec='blue', linewidth=2)
ax.arrow(0, 0, v2[0], v2[1], head_width=0.05, head_length=0.03, fc='red', ec='red', linewidth=2)
ax.set_xlim(-0.2, 1.3)
ax.set_ylim(-0.2, 1.3)
ax.set_aspect('equal')
ax.set_title(f'Perpendicular (Orthogonal)\na·b = {dot:.2f}, θ = 90°', fontsize=12)
ax.axhline(0, color='black', linewidth=0.5)
ax.axvline(0, color='black', linewidth=0.5)

# Case 3: Opposite direction (negative dot product)
ax = axes[2]
v1 = np.array([1, 0.3])
v2 = np.array([-0.8, -0.2])
dot = np.dot(v1, v2)
angle = np.arccos(dot / (np.linalg.norm(v1) * np.linalg.norm(v2))) * 180 / np.pi

ax.arrow(0, 0, v1[0], v1[1], head_width=0.05, head_length=0.03, fc='blue', ec='blue', linewidth=2)
ax.arrow(0, 0, v2[0], v2[1], head_width=0.05, head_length=0.03, fc='red', ec='red', linewidth=2)
ax.set_xlim(-1.0, 1.3)
ax.set_ylim(-0.5, 0.8)
ax.set_aspect('equal')
ax.set_title(f'Opposite Direction\na·b = {dot:.2f}, θ = {angle:.0f}°', fontsize=12)
ax.axhline(0, color='black', linewidth=0.5)
ax.axvline(0, color='black', linewidth=0.5)

plt.tight_layout()
plt.show()

print("Key insight: The dot product measures how much two vectors 'agree'")
print("  - Positive = pointing together")
print("  - Zero = unrelated (orthogonal)")
print("  - Negative = pointing apart")

## Part 3: Cosine Similarity — Ignoring Magnitude

The raw dot product depends on both **alignment** and **magnitude**. Sometimes we only care about alignment.

**Cosine similarity** normalizes by the vector lengths:

$$\text{cosine\_sim}(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\| \|\mathbf{b}\|} = \cos(\theta)$$

This gives a value between -1 and +1:
- **+1**: Identical direction (perfect similarity)
- **0**: Perpendicular (unrelated)
- **-1**: Opposite direction (perfect dissimilarity)
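One way to read the formula: cosine similarity is the plain dot product taken after both vectors are rescaled to length 1. The sketch below illustrates this with made-up trait scores; the names `unit`, `cosine_sim`, `young`, and `mature` are hypothetical, and both inputs are assumed to be nonzero.

```python
import numpy as np

def unit(v):
    """Scale a nonzero vector to length 1."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def cosine_sim(a, b):
    """Cosine similarity as the dot product of unit vectors."""
    return float(np.dot(unit(a), unit(b)))

young = np.array([2.0, 1.0, 3.0])   # hypothetical trait scores
mature = 4 * young                   # same profile, four times the magnitude

# The raw dot product grows with magnitude...
print("dot:   ", np.dot(young, young), np.dot(young, mature))
# ...while cosine similarity stays at 1 (up to rounding) for both pairs.
print("cosine:", cosine_sim(young, young), cosine_sim(young, mature))
```

Note that `scipy.spatial.distance.cosine`, imported in the setup cell, returns the cosine *distance*, which is 1 minus the cosine similarity, so the two should not be used interchangeably.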

In [None]:
def cosine_similarity(a, b):
    """Calculate cosine similarity between two vectors."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Compare dot product vs cosine similarity
print("Dot Product vs Cosine Similarity")
print("="*60)

# Same direction, different magnitudes
v1 = np.array([1, 1])
v2 = np.array([2, 2])  # Same direction, 2x magnitude
v3 = np.array([10, 10])  # Same direction, 10x magnitude

print("\nVectors pointing SAME DIRECTION, different magnitudes:")
print(f"  v1 = {v1}, ||v1|| = {np.linalg.norm(v1):.2f}")
print(f"  v2 = {v2}, ||v2|| = {np.linalg.norm(v2):.2f}")
print(f"  v3 = {v3}, ||v3|| = {np.linalg.norm(v3):.2f}")

print(f"\n  v1 · v2 = {np.dot(v1, v2):.2f}    cosine_sim = {cosine_similarity(v1, v2):.4f}")
print(f"  v1 · v3 = {np.dot(v1, v3):.2f}   cosine_sim = {cosine_similarity(v1, v3):.4f}")

print("\n  → Dot product changes with magnitude, cosine similarity stays 1.0!")

# Perpendicular vectors
p1 = np.array([1, 0])
p2 = np.array([0, 1])

print("\n\nPerpendicular vectors:")
print(f"  p1 = {p1}")
print(f"  p2 = {p2}")
print(f"  p1 · p2 = {np.dot(p1, p2):.2f}    cosine_sim = {cosine_similarity(p1, p2):.4f}")

## Part 4: Creature Similarity Using Dot Product and Cosine

Our `creature_similarity.csv` dataset contains pre-calculated dot products and cosine similarities for all creature pairs. Let's explore what they reveal.

*"Two creatures with high cosine similarity share the same 'personality type', regardless of how extreme each is. A timid Cave Bat and an apex Witch Creature have low cosine similarity—their personalities point in utterly different directions."*
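Before reading the table, it may help to see that dot product, cosine similarity, and Euclidean distance can rank the same candidates differently. The sketch below uses three made-up 2D vectors (labelled `A`, `B`, `C`), not values from `creature_similarity.csv`:

```python
import numpy as np

reference = np.array([2.0, 2.0])
# Hypothetical candidates chosen to make the three rankings disagree
candidates = {
    "A": np.array([10.0, 9.0]),  # large and almost aligned
    "B": np.array([2.0, 1.0]),   # close by, but tilted
    "C": np.array([3.0, 3.0]),   # exactly aligned, moderately close
}

def cosine_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

dots = {k: float(np.dot(reference, v)) for k, v in candidates.items()}
cosines = {k: cosine_sim(reference, v) for k, v in candidates.items()}
dists = {k: float(np.linalg.norm(reference - v)) for k, v in candidates.items()}

# Higher is "more similar" for dot and cosine; lower is "closer" for distance
by_dot = sorted(candidates, key=dots.get, reverse=True)
by_cosine = sorted(candidates, key=cosines.get, reverse=True)
by_distance = sorted(candidates, key=dists.get)

print("by dot product:  ", by_dot)
print("by cosine:       ", by_cosine)
print("by distance:     ", by_distance)
```

Here the dot product favors `A` because of its sheer magnitude, cosine favors `C` because it is perfectly aligned, and distance favors `B` because it sits nearest. Which measure is appropriate depends on whether magnitude should count.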

In [None]:
# Explore pre-calculated similarities
print("Pre-calculated Creature Similarities:")
print("="*90)
cols = ['creature_a_name', 'creature_b_name', 'dot_product_behavioral', 
        'cosine_sim_behavioral', 'dot_product_full', 'cosine_sim_full']
print(creature_similarity[cols].head(15).to_string(index=False))

In [None]:
# Find most similar pairs by cosine similarity
print("Most Similar Creature Pairs (Cosine Similarity on Full Features):")
print("="*70)
most_similar = creature_similarity.nlargest(10, 'cosine_sim_full')
for _, row in most_similar.iterrows():
    print(f"  {row['creature_a_name']:22} ↔ {row['creature_b_name']:22} cos_sim = {row['cosine_sim_full']:.4f}")

print("\n\nLeast Similar Creature Pairs (Lowest Cosine Similarity):")
print("="*70)
least_similar = creature_similarity.nsmallest(10, 'cosine_sim_full')
for _, row in least_similar.iterrows():
    print(f"  {row['creature_a_name']:22} ↔ {row['creature_b_name']:22} cos_sim = {row['cosine_sim_full']:.4f}")

In [None]:
# Visualize: Cosine similarity vs Euclidean distance
fig, ax = plt.subplots(figsize=(10, 8))

ax.scatter(creature_similarity['euclidean_dist_full'], 
           creature_similarity['cosine_sim_full'],
           alpha=0.6, c='steelblue', s=40)

ax.set_xlabel('Euclidean Distance (L2)', fontsize=12)
ax.set_ylabel('Cosine Similarity', fontsize=12)
ax.set_title('Distance vs Similarity: Different Perspectives on Creature Relationships', fontsize=13)

# Annotate some interesting points
extreme_pairs = creature_similarity[
    (creature_similarity['cosine_sim_full'] > 0.98) | 
    (creature_similarity['cosine_sim_full'] < 0.5)
].head(5)

for _, row in extreme_pairs.iterrows():
    label = f"{row['creature_a_name'][:8]}-{row['creature_b_name'][:8]}"
    ax.annotate(label, (row['euclidean_dist_full'], row['cosine_sim_full']), 
                fontsize=8, alpha=0.8)

ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("Notice: High cosine similarity doesn't always mean low distance!")
print("Distance measures 'how far apart', cosine measures 'pointing same way'.")

## Part 5: Behavioral vs Habitat Similarity

An interesting question: Do creatures that behave similarly also live in similar habitats?

We can compare `cosine_sim_behavioral` with `cosine_sim_habitat` to find creatures that are:
- Similar in behavior AND habitat (consistent ecological niche)
- Similar in behavior BUT different habitat (convergent evolution?)
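A comparison like this needs every pairwise cosine similarity at once, which can be computed by normalizing each row and multiplying the matrix by its own transpose. The sketch below is one possible variant that also guards against all-zero rows; the function name `cosine_sim_matrix_safe`, the `eps` threshold, and the toy matrix are assumptions for illustration.

```python
import numpy as np

def cosine_sim_matrix_safe(X, eps=1e-12):
    """Pairwise cosine similarity between the rows of X.

    Rows with (near-)zero length get similarity 0 to everything,
    including themselves. That is a convention chosen here, since
    the angle to a zero vector is undefined.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    safe_norms = np.where(norms < eps, 1.0, norms)  # avoid dividing by zero
    X_unit = X / safe_norms
    return X_unit @ X_unit.T

# Toy data: rows 0 and 1 are parallel, row 2 is orthogonal, row 3 is all zeros
X = np.array([
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0],
])
S = cosine_sim_matrix_safe(X)
print(np.round(S, 3))
```

Without the guard, a zero row would divide by zero and fill its row and column of the result with `nan`.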

In [None]:
# Calculate behavioral and habitat cosine similarities directly
behavioral_features = ['aggression', 'sociality', 'nocturnality', 'territoriality', 'hunting_strategy']
habitat_features = ['depth_preference', 'moisture_preference', 'light_tolerance', 'cave_affinity', 'surface_affinity']

# Get the feature matrices
X_behavioral = creature_vectors[behavioral_features].values
X_habitat = creature_vectors[habitat_features].values
names = creature_vectors['common_name'].values

# Calculate cosine similarity matrices
def cosine_sim_matrix(X):
    """Calculate pairwise cosine similarity matrix."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_normalized = X / norms
    return X_normalized @ X_normalized.T

behavioral_sim = cosine_sim_matrix(X_behavioral)
habitat_sim = cosine_sim_matrix(X_habitat)

# Find pairs with very different behavioral vs habitat similarity
print("Creatures Similar in Behavior BUT Different in Habitat:")
print("(High behavioral cosine, low habitat cosine)")
print("="*70)

pairs_analysis = []
for i in range(len(names)):
    for j in range(i+1, len(names)):
        b_sim = behavioral_sim[i, j]
        h_sim = habitat_sim[i, j]
        diff = b_sim - h_sim
        pairs_analysis.append({
            'creature_a': names[i],
            'creature_b': names[j],
            'behavioral_sim': b_sim,
            'habitat_sim': h_sim,
            'difference': diff
        })

pairs_df = pd.DataFrame(pairs_analysis)

# Show pairs with high behavioral but low habitat similarity
divergent = pairs_df[(pairs_df['behavioral_sim'] > 0.8) & (pairs_df['habitat_sim'] < 0.7)]
print(divergent.sort_values('difference', ascending=False).head(10).to_string(index=False))

In [None]:
# Scatter plot: behavioral similarity vs habitat similarity
fig, ax = plt.subplots(figsize=(10, 8))

ax.scatter(pairs_df['behavioral_sim'], pairs_df['habitat_sim'], 
           alpha=0.5, c='steelblue', s=30)

# Add diagonal reference line (pairs where both similarities are equal)
ax.plot([0, 1], [0, 1], 'r--', linewidth=2, label='Equal similarity (y = x)')

ax.set_xlabel('Behavioral Cosine Similarity', fontsize=12)
ax.set_ylabel('Habitat Cosine Similarity', fontsize=12)
ax.set_title('Do Similar Behaviors Mean Similar Habitats?', fontsize=13)
ax.legend()
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Calculate correlation
correlation = pairs_df['behavioral_sim'].corr(pairs_df['habitat_sim'])
print(f"\nCorrelation between behavioral and habitat similarity: {correlation:.3f}")
print("\nInterpretation: a moderate positive value means creatures with similar behavior")
print("tend to prefer similar habitats, with notable exceptions. Check the number above.")

## Part 6: Manuscript Similarity — Detecting Forgeries

The dot product and cosine similarity are heavily used in document analysis. In the Archives, manuscripts are represented as vectors of stylistic features. Forgeries often have suspicious similarity patterns.

*"Mink Pavar's forgeries show high cosine similarity to multiple schools—an impossibility for a genuine manuscript. A true Water School text points firmly in the Water School direction. A forgery hedges its bets."*
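The idea of "pointing firmly" versus "hedging" can be sketched with idealized school directions. The example below assumes one unit axis vector per school and uses two made-up alignment vectors (`focused` and `hedged`); none of these values come from `manuscript_features.csv`.

```python
import numpy as np

schools = ["stone", "water", "pebble"]
# Assumption: each school is an idealized unit vector along one axis
prototypes = np.eye(3)

def school_similarities(v):
    """Cosine similarity of a nonzero vector v to each school prototype."""
    v = np.asarray(v, dtype=float)
    # Prototypes have length 1, so only v's own length needs dividing out
    return prototypes @ v / np.linalg.norm(v)

focused = [0.9, 0.1, 0.05]   # hypothetical: strongly one school
hedged = [0.5, 0.5, 0.45]    # hypothetical: spread across all three

for name, v in [("focused", focused), ("hedged", hedged)]:
    sims = school_similarities(v)
    best = schools[int(np.argmax(sims))]
    print(f"{name:8} sims = {np.round(sims, 3)}  max = {sims.max():.3f} ({best})")
```

With axis-aligned prototypes, each similarity is just that component divided by the vector's length. For non-negative scores in three dimensions, the maximum can never fall below $1/\sqrt{3} \approx 0.577$, so "low" values here mean values near that floor rather than near zero.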

In [None]:
# Create school alignment vectors
school_features = ['school_alignment_stone', 'school_alignment_water', 'school_alignment_pebble']

print("Manuscript School Alignment Vectors:")
print("="*80)
print(manuscripts[['manuscript_id', 'attributed_author', 'is_forgery'] + school_features].head(15).to_string(index=False))

In [None]:
# Create prototype vectors for each school.
# These are idealized axis directions (pure alignment with a single school),
# not averages of the authentic manuscripts.
stone_prototype = np.array([1, 0, 0])
water_prototype = np.array([0, 1, 0])
pebble_prototype = np.array([0, 0, 1])

# Calculate each manuscript's cosine similarity to each school prototype
ms_vectors = manuscripts[school_features].values

manuscripts['sim_to_stone'] = [cosine_similarity(v, stone_prototype) for v in ms_vectors]
manuscripts['sim_to_water'] = [cosine_similarity(v, water_prototype) for v in ms_vectors]
manuscripts['sim_to_pebble'] = [cosine_similarity(v, pebble_prototype) for v in ms_vectors]

# Strongest single-school similarity: a low value means the manuscript's
# alignment is spread across schools (a "mixed signal")
manuscripts['max_school_sim'] = manuscripts[['sim_to_stone', 'sim_to_water', 'sim_to_pebble']].max(axis=1)

print("Manuscript Similarity to School Prototypes:")
print("="*90)
print(manuscripts[['manuscript_id', 'attributed_author', 'is_forgery', 
                   'sim_to_stone', 'sim_to_water', 'sim_to_pebble', 'max_school_sim']].head(15).to_string(index=False))

In [None]:
# Compare forgeries vs authentic: max school similarity
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Histogram of max school similarity
ax = axes[0]
authentic_ms = manuscripts[~manuscripts['is_forgery']]
forged_ms = manuscripts[manuscripts['is_forgery']]

ax.hist(authentic_ms['max_school_sim'], bins=20, alpha=0.7, label='Authentic', color='steelblue', density=True)
ax.hist(forged_ms['max_school_sim'], bins=20, alpha=0.7, label='Forged', color='crimson', density=True)
ax.set_xlabel('Maximum Cosine Similarity to Any School', fontsize=11)
ax.set_ylabel('Density', fontsize=11)
ax.set_title('Authentic Manuscripts Have Clearer School Alignment', fontsize=12)
ax.legend()

# Scatter: sim to Stone vs sim to Water
ax = axes[1]
ax.scatter(authentic_ms['sim_to_stone'], authentic_ms['sim_to_water'], 
           c='steelblue', s=40, alpha=0.6, label='Authentic')
ax.scatter(forged_ms['sim_to_stone'], forged_ms['sim_to_water'], 
           c='crimson', s=80, alpha=0.8, marker='X', label='Forged')
ax.set_xlabel('Similarity to Stone School', fontsize=11)
ax.set_ylabel('Similarity to Water School', fontsize=11)
ax.set_title('Forgeries Show Mixed School Signals', fontsize=12)
ax.legend()

plt.tight_layout()
plt.show()

print(f"\nAverage max school similarity:")
print(f"  Authentic manuscripts: {authentic_ms['max_school_sim'].mean():.3f}")
print(f"  Forged manuscripts:    {forged_ms['max_school_sim'].mean():.3f}")
print("\nA lower forged average means forgeries hedge: they don't point firmly toward any single school.")

## Part 7: The Dot Product as Projection

Another powerful interpretation: the dot product tells you **how much of one vector lies along another**.

The **projection** of $\mathbf{a}$ onto $\mathbf{b}$ is:

$$\text{proj}_{\mathbf{b}} \mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|^2} \mathbf{b}$$

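The coefficient $\frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|^2}$ scales $\mathbf{b}$ itself. The related **scalar projection** $\frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|}$ gives the signed "length" of $\mathbf{a}$ in the direction of $\mathbf{b}$; the two coincide when $\mathbf{b}$ is a unit vector.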

This is useful for:
- Measuring how much a creature exhibits a particular trait direction
- Feature importance in machine learning
- Decomposing complex vectors into simpler components
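The decomposition idea can be sketched as splitting $\mathbf{a}$ into a part along $\mathbf{b}$ and a leftover part perpendicular to it. The example below deliberately uses a $\mathbf{b}$ that is not unit length, so the difference between dividing by $\|\mathbf{b}\|$ and by $\|\mathbf{b}\|^2$ is visible; the helper name `project` and the vectors are assumptions for illustration.

```python
import numpy as np

def project(a, b):
    """Vector projection of a onto a nonzero vector b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return (np.dot(a, b) / np.dot(b, b)) * b

a = np.array([3.0, 2.0])
b = np.array([2.0, 2.0])   # deliberately not unit length

parallel = project(a, b)            # the part of a along b
perpendicular = a - parallel        # what is left over
scalar = float(np.dot(a, b) / np.linalg.norm(b))  # signed length along b

print("parallel part:      ", parallel)
print("perpendicular part: ", perpendicular)
print("scalar projection:  ", round(scalar, 4))
print("perp . b (should be ~0):", float(np.dot(perpendicular, b)))
```

The two parts add back up to `a`, and the leftover part has zero dot product with `b`, which is what makes the split a clean decomposition.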

In [None]:
# Visualize projection
fig, ax = plt.subplots(figsize=(10, 8))

# Define vectors
a = np.array([3, 2])
b = np.array([1, 0])  # Unit vector along x-axis

# Calculate projection
proj_scalar = np.dot(a, b) / np.linalg.norm(b)
proj_vector = (np.dot(a, b) / np.dot(b, b)) * b

# Draw vectors
ax.arrow(0, 0, a[0], a[1], head_width=0.1, head_length=0.08, fc='blue', ec='blue', 
         linewidth=2, label=f'a = {a}')
ax.arrow(0, 0, b[0]*3.5, b[1]*3.5, head_width=0.1, head_length=0.08, fc='green', ec='green', 
         linewidth=2, label='b direction')
ax.arrow(0, 0, proj_vector[0], proj_vector[1], head_width=0.1, head_length=0.08, 
         fc='red', ec='red', linewidth=3, label=f'projection = {proj_vector}')

# Draw dashed line showing projection
ax.plot([a[0], proj_vector[0]], [a[1], proj_vector[1]], 'k--', linewidth=1.5, alpha=0.7)

ax.set_xlim(-0.5, 4)
ax.set_ylim(-0.5, 3)
ax.set_aspect('equal')
ax.set_xlabel('x', fontsize=12)
ax.set_ylabel('y', fontsize=12)
ax.set_title('Projection: How Much of a Lies Along b?', fontsize=13)
ax.legend(fontsize=11)
ax.axhline(0, color='black', linewidth=0.5)
ax.axvline(0, color='black', linewidth=0.5)
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print(f"Vector a = {a}")
print(f"Projecting onto b = {b}")
print(f"")
print(f"Projection scalar (length along b): {proj_scalar:.2f}")
print(f"Projection vector: {proj_vector}")
print(f"\nInterpretation: The 'x-component' of a is {proj_scalar:.2f}")

In [None]:
# Application: Project creatures onto "aggression direction"
print("Projecting Creatures onto 'Pure Aggression' Direction:")
print("="*60)

# Define "pure aggression" direction: [1, 0, 0, 0, 0]
aggression_direction = np.array([1, 0, 0, 0, 0])

# Calculate projection of each creature onto aggression direction
for _, row in creature_vectors.iterrows():
    creature_vec = row[behavioral_features].values.astype(float)  # iterrows yields object dtype; cast for numeric math
    proj = np.dot(creature_vec, aggression_direction)  # b is unit vector, so just dot product
    print(f"  {row['common_name']:25} → aggression projection = {proj:.2f}")

print("\nThis is just the aggression value—projection onto a coordinate axis")
print("extracts that single feature!")

## Summary

| Concept | Key Insight | Densworld Example |
|---------|-------------|-------------------|
| **Dot Product** | Multiply and sum: $\sum a_i b_i$ | Measuring creature trait alignment |
| **Geometric Meaning** | $\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|\|\mathbf{b}\|\cos\theta$ | Angle between creature vectors |
| **Cosine Similarity** | Normalized to [-1, 1], ignores magnitude | Cave Bat vs Yeller Bat: same "type" |
| **Positive/Zero/Negative** | Same/orthogonal/opposite direction | Predators vs prey personalities |
| **Projection** | How much lies along a direction | Creature's "aggression component" |
| **Document Analysis** | Texts as vectors; similarity = alignment | Detecting mixed-school forgeries |

---

## Exercises

### Exercise 1: Find Your Creature's Match

Choose a creature and find the 3 creatures most similar to it by cosine similarity (using behavioral features). Then find the 3 most similar by Euclidean distance. Do the rankings differ? Why?

In [None]:
# Exercise 1: Your code here
# Hint: Use creature_similarity DataFrame or calculate directly



### Exercise 2: Orthogonal Creatures

Find pairs of creatures with cosine similarity near 0 (perpendicular behavioral vectors). What does it mean for two creatures to be "orthogonal" in this context?

In [None]:
# Exercise 2: Your code here
# Hint: Look for pairs where abs(cosine_sim) < 0.1



### Exercise 3: Forgery Detection Score

Create a "suspiciousness score" for manuscripts based on how evenly their school alignments are distributed. A genuine manuscript should point strongly toward ONE school. Calculate this score and see if it separates forgeries from authentic manuscripts.

In [None]:
# Exercise 3: Your code here
# Hint: High entropy = evenly distributed = suspicious



### Exercise 4: Custom Similarity Direction

Create a "predator profile" vector by averaging the behavioral vectors of known predators (Witch Creature, Stakdur, Maw Beast, Wharver). Then calculate how similar each creature is to this predator profile using cosine similarity. Which creatures score highest?

In [None]:
# Exercise 4: Your code here
# Hint: Average vectors, then calculate cosine similarity to all creatures



---

## Next Lesson

In **Lesson 4: Matrix Transformations**, we'll see how matrices act as *functions* that transform vectors. When you multiply a vector by a matrix, you're moving it to a new location in space—rotating, scaling, or projecting it. This is the foundation of neural network layers.

*"A matrix is a machine. Feed it a creature's vector, and it returns a transformed version—perhaps the creature as it would appear to different eyes, or the same traits weighted by their importance."*  
— Boffa Trent