# 01: Linear Algebra Intro

Welcome to your first deep dive! Today we’ll build intuition and hands-on experience with vectors, matrices, and their operations—the foundation of almost every ML algorithm.

> 💡 **Companion Reading**: This notebook pairs with [01_linear_algebra_intro.md](01_linear_algebra_intro.md) for deeper mathematical insights, analogies, and tutor guidance.

## 🎯 Objectives
- Understand what vectors and matrices are conceptually and computationally
- Learn how to compute dot products and matrix multiplication
- Visualize geometric interpretations of linear transformations
- Explore how these concepts relate to ML models
- Build intuition through hands-on coding and visualization


## 🎯 Understanding Vectors: The Foundation of Machine Learning

Before we dive into operations, let's build a deep understanding of **what vectors are** and **why they're absolutely central** to machine learning.

### What Are Vectors?

A **vector** is much more than just "a list of numbers." It's a mathematical object that represents:
- **Magnitude** (how big/long it is)
- **Direction** (which way it points)
- **Position in space** (when drawn from the origin, its tip marks a point in multi-dimensional space)

Think of vectors as **arrows in space** that encode information about data, relationships, and transformations.

### Why Vectors Are Central to Machine Learning

**Almost every piece of data in ML ends up as a vector (or a collection of vectors):**
- 📸 **Images**: Each pixel's color values → vector of intensities
- 📝 **Text**: Word frequencies or embeddings → vector representations  
- 👤 **User profiles**: Age, income, preferences → feature vectors
- 🎵 **Audio**: Frequency components over time → signal vectors
- 🧬 **DNA**: Nucleotide sequences → biological feature vectors

**Most ML algorithms work by:**
1. **Representing** data as vectors in high-dimensional spaces
2. **Measuring** relationships between vectors (similarity, distance)
3. **Transforming** vectors to find patterns and make predictions
4. **Learning** optimal vector transformations from data

Let's explore this with concrete examples!


In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Example 1: Representing different types of data as vectors
print("=== DATA AS VECTORS ===")

# A simple user profile as a vector [age, income_k, hours_online_per_day]
user1 = np.array([25, 45, 3])  # 25 years old, $45k income, 3 hours online
user2 = np.array([35, 75, 1])  # 35 years old, $75k income, 1 hour online
user3 = np.array([22, 35, 8])  # 22 years old, $35k income, 8 hours online

print("User profiles as vectors:")
print(f"User 1: {user1} (age, income_k, hours_online)")
print(f"User 2: {user2}")
print(f"User 3: {user3}")

# Which users are most similar? Cosine similarity (dot product divided by both lengths) gives one answer!
similarity_1_2 = np.dot(user1, user2) / (np.linalg.norm(user1) * np.linalg.norm(user2))
similarity_1_3 = np.dot(user1, user3) / (np.linalg.norm(user1) * np.linalg.norm(user3))

print(f"\nSimilarity between User 1 and User 2: {similarity_1_2:.3f}")
print(f"Similarity between User 1 and User 3: {similarity_1_3:.3f}")
print("Higher values mean more similar users!")

print("\n⚠️  IMPORTANT NOTE ABOUT THESE RESULTS:")
print("These similarity scores aren't very meaningful yet because our data")
print("has different scales: age (22-35), income in $k (35-75), hours (1-8).")
print("The income values dominate the calculation!")
print("We'll learn about 'normalization' below to fix this problem.")
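
The scale problem flagged above can be sketched with min-max scaling, which is one common normalization choice (an assumption here, since the notebook does not say which method it has in mind). Each feature column is rescaled to the range 0 to 1 using only these three users, so the numbers are illustrative rather than a recipe for real data. The helper name `cosine_similarity` is made up for this sketch.

```python
import numpy as np

# Same three users as above: [age, income_k, hours_online_per_day]
users = np.array([
    [25, 45, 3],
    [35, 75, 1],
    [22, 35, 8],
], dtype=float)

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Min-max scaling per feature (column): (x - min) / (max - min)
col_min = users.min(axis=0)
col_max = users.max(axis=0)
users_scaled = (users - col_min) / (col_max - col_min)

raw_12 = cosine_similarity(users[0], users[1])
raw_13 = cosine_similarity(users[0], users[2])
scaled_12 = cosine_similarity(users_scaled[0], users_scaled[1])
scaled_13 = cosine_similarity(users_scaled[0], users_scaled[2])

print(f"Raw:    sim(1,2) = {raw_12:.3f}, sim(1,3) = {raw_13:.3f}")
print(f"Scaled: sim(1,2) = {scaled_12:.3f}, sim(1,3) = {scaled_13:.3f}")
```

On the raw data both scores land near 0.99, so the users are hard to tell apart. After scaling, every feature contributes on the same footing and the two scores separate (roughly 0.77 versus 0.64). With only three users the column minimum and maximum come from the sample itself; on a real dataset these would typically be estimated from the training data.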


### Visual Understanding: Vectors as Arrows

Let's visualize vectors to build geometric intuition:


In [None]:
# Create a comprehensive vector visualization
fig = plt.figure(figsize=(12, 10))
ax1 = plt.subplot(2, 2, 1)
ax2 = plt.subplot(2, 2, 2)
ax3 = plt.subplot(2, 2, 3)
ax4 = plt.subplot(2, 2, 4, projection='3d')

# Plot 1: Basic vector representation
v1 = np.array([3, 4])
v2 = np.array([1, 3])
ax1.quiver(0, 0, v1[0], v1[1], angles='xy', scale_units='xy', scale=1,
           color='blue', width=0.01, label=f'v1 = {v1}')
ax1.quiver(0, 0, v2[0], v2[1], angles='xy', scale_units='xy', scale=1,
           color='red', width=0.01, label=f'v2 = {v2}')
ax1.set_xlim(-1, 5)
ax1.set_ylim(-1, 5)
ax1.grid(True, alpha=0.3)
ax1.legend()
ax1.set_title('Vectors as Arrows in Space')
ax1.set_xlabel('X dimension')
ax1.set_ylabel('Y dimension')

# Add magnitude annotations
ax1.annotate(f'|v1| = {np.linalg.norm(v1):.1f}', xy=(1.5, 2), fontsize=10)
ax1.annotate(f'|v2| = {np.linalg.norm(v2):.1f}', xy=(0.5, 1.5), fontsize=10)

# Plot 2: Angles between vectors

# Compute angle between vectors
dot_product = np.dot(v1, v2)
norm_v1 = np.linalg.norm(v1)
norm_v2 = np.linalg.norm(v2)
angle = np.arccos(dot_product / (norm_v1 * norm_v2))

# Determine direction using cross product
cross = v1[0] * v2[1] - v1[1] * v2[0]
if cross < 0:
    angle = -angle  # Clockwise

# Set radius
r = 0.5

# Compute angle of the first vector
theta1 = np.arctan2(v1[1], v1[0])

# Generate the arc
theta = np.linspace(theta1, theta1 + angle, 50)
x = r * np.cos(theta)
y = r * np.sin(theta)

ax2.plot(x, y, color='green', label='Arc Between Vectors')
ax2.text(-0.4, -0.4, f'θ = {np.degrees(angle):.1f}°', fontsize=12, color='green')

ax2.quiver(0, 0, v1[0], v1[1], angles='xy', scale_units='xy', scale=1,
           color='blue', width=0.01, label='v1')
ax2.quiver(0, 0, v2[0], v2[1], angles='xy', scale_units='xy', scale=1,
           color='red', width=0.01, label='v2')
ax2.set_xlim(-1, 5)
ax2.set_ylim(-1, 5)
ax2.grid(True, alpha=0.3)
ax2.legend()
ax2.set_title('Angle Between Vectors')
ax2.set_xlabel('X dimension')
ax2.set_ylabel('Y dimension')

# Plot 3: Different vector relationships
vectors = {
    'Same direction': ([2, 1], [4, 2], 'green'),
    'Perpendicular': ([1, 0], [0, 1], 'orange'),
    'Opposite': ([1, 1], [-1, -1], 'purple')
}

# Each entry already carries its own color, so no separate color list is needed
for i, (label, (v_a, v_b, color)) in enumerate(vectors.items()):
    ax3.quiver(0, 0, v_a[0], v_a[1], angles='xy', scale_units='xy', scale=1,
               color=color, width=0.008, alpha=0.7)
    ax3.quiver(0, 0, v_b[0], v_b[1], angles='xy', scale_units='xy', scale=1,
               color=color, width=0.008, alpha=0.7, linestyle='--')

    # Calculate and display dot product
    dot_prod = np.dot(v_a, v_b)
    ax3.text(-2, 2 - i * 0.5, f'{label}: dot = {dot_prod}', color=color, fontsize=10)

ax3.set_xlim(-2, 5)
ax3.set_ylim(-2, 3)
ax3.grid(True, alpha=0.3)
ax3.set_title('Vector Relationships & Dot Products')
ax3.set_xlabel('X dimension')
ax3.set_ylabel('Y dimension')

# Plot 4: ML Application - Document similarity (3D visualization)
# Simulate word frequency vectors for documents
doc1 = np.array([6, 0, 4])  # [freq_of_'machine learning', 'cooking', 'algorithm']
doc2 = np.array([8, 0, 5])  # Similar document about ML
doc3 = np.array([0, 4, 0])  # Document about cooking

# Calculate similarities
sim_1_2 = np.dot(doc1, doc2) / (np.linalg.norm(doc1) * np.linalg.norm(doc2))
sim_1_3 = np.dot(doc1, doc3) / (np.linalg.norm(doc1) * np.linalg.norm(doc3))

# Visualize in full 3D space (all three dimensions)
ax4.scatter(doc1[0], doc1[1], doc1[2], s=100, color='blue', label='ML Doc 1', alpha=0.8)
ax4.scatter(doc2[0], doc2[1], doc2[2], s=100, color='green', label='ML Doc 2', alpha=0.8)
ax4.scatter(doc3[0], doc3[1], doc3[2], s=100, color='red', label='Cooking Doc', alpha=0.8)

# Draw similarity lines in 3D
ax4.plot([doc1[0], doc2[0]], [doc1[1], doc2[1]], [doc1[2], doc2[2]], 
         'purple', alpha=0.6, linewidth=2, label=f'ML Similarity: {sim_1_2:.3f}')
ax4.plot([doc1[0], doc3[0]], [doc1[1], doc3[1]], [doc1[2], doc3[2]], 
         'yellow', alpha=0.6, linewidth=2, label=f'Cross-domain: {sim_1_3:.3f}')

# Add text annotations for document vectors
ax4.text(doc1[0], doc1[1], doc1[2]+0.2, f'[{doc1[0]},{doc1[1]},{doc1[2]}]', fontsize=8)
ax4.text(doc2[0], doc2[1], doc2[2]+0.2, f'[{doc2[0]},{doc2[1]},{doc2[2]}]', fontsize=8)
ax4.text(doc3[0], doc3[1], doc3[2]+0.2, f'[{doc3[0]},{doc3[1]},{doc3[2]}]', fontsize=8)

ax4.set_xlim(0, 9)
ax4.set_ylim(0, 5)
ax4.set_zlim(0, 6)  # doc1 and doc2 sit at z=4 and z=5, so the axis must extend past them
ax4.grid(True, alpha=0.3)
ax4.legend(loc='upper left', fontsize=9)
ax4.set_title('ML Application: Document Similarity (3D)', fontsize=11)
ax4.set_xlabel('Frequency of "machine learning"')
ax4.set_ylabel('Frequency of "cooking"')
ax4.set_zlabel('Frequency of "algorithm"')

plt.tight_layout()
plt.show()

print("\n=== KEY INSIGHTS ===")
print("• Vectors encode both magnitude (length) and direction")
print("• Angle between vectors measures their similarity/relationship")
print("• Small angles = high similarity, large angles = low similarity")
print("• 3D visualization shows ALL dimensions, not just projections")
print("• Document vectors: ML docs cluster together, cooking doc is distant")
print("• Algorithm frequency (3rd dimension) adds important context!")
print("• ML algorithms use these geometric relationships to find patterns!")
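
The "small angle = high similarity" insight can be checked directly on the three document vectors. This is a minimal sketch; the helper name `angle_degrees` is made up for illustration, and the clipping step is a defensive choice rather than something the notebook requires.

```python
import numpy as np

def angle_degrees(a, b):
    """Angle between vectors a and b, in degrees."""
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip guards against tiny floating-point overshoot outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Word-frequency vectors from the plot: ['machine learning', 'cooking', 'algorithm']
doc1 = np.array([6, 0, 4])
doc2 = np.array([8, 0, 5])
doc3 = np.array([0, 4, 0])

angle_ml = angle_degrees(doc1, doc2)      # two ML documents
angle_cross = angle_degrees(doc1, doc3)   # ML document vs cooking document

print(f"Angle between the ML docs: {angle_ml:.1f} degrees")
print(f"Angle between ML doc and cooking doc: {angle_cross:.1f} degrees")
```

The two ML documents are only a couple of degrees apart even though their lengths differ, while the cooking document shares no words with `doc1` and sits at exactly 90 degrees.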



### 🎯 Why Do Vectors Start at the Origin?

You might have noticed that in all our visualizations, vectors start at the origin (0,0). Let's explore **why** this is the case and **whether it's always necessary**.

#### The Mathematical Reasoning

**In our examples, vectors start at the origin (0,0) for practical reasons:**
- **Standard reference point**: Creates consistency for all vector operations
- **Teaching clarity**: A vector's components match the coordinates of its tip, so the arrow is easy to read off the plot
- **Simplified calculations**: Coordinates directly give vector components

#### Do Vectors Always Have to Start at (0,0)?

**No!** Vectors can start from any point. Let's demonstrate this with code:


In [None]:
# Demonstrate that vectors represent displacement, not absolute position
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))

# Same vector [3, 4] starting from different points
vector_displacement = np.array([3, 4])

# Starting points
start_points = [
    (0, 0),  # Origin
    (2, 1),  # Different starting point
    (5, 3),  # Another starting point
]

colors = ['blue', 'red', 'green']
labels = ['From origin (0,0)', 'From point (2,1)', 'From point (5,3)']

# Plot 1: Same vector from different starting points
for i, (start_x, start_y) in enumerate(start_points):
    end_x = start_x + vector_displacement[0]
    end_y = start_y + vector_displacement[1]

    ax1.quiver(start_x, start_y, vector_displacement[0], vector_displacement[1], 
               angles='xy', scale_units='xy', scale=1, 
               color=colors[i], width=0.008, label=labels[i])

    # Mark start and end points
    ax1.plot(start_x, start_y, 'o', color=colors[i], markersize=8, alpha=0.7)
    ax1.plot(end_x, end_y, 's', color=colors[i], markersize=8, alpha=0.7)

    # Add text annotations
    ax1.annotate(f'Start: ({start_x},{start_y})', 
                xy=(start_x, start_y), xytext=(5, 5), 
                textcoords='offset points', fontsize=9, color=colors[i])
    ax1.annotate(f'End: ({end_x},{end_y})', 
                xy=(end_x, end_y), xytext=(5, 5), 
                textcoords='offset points', fontsize=9, color=colors[i])

ax1.set_xlim(-1, 9)
ax1.set_ylim(-1, 8)
ax1.grid(True, alpha=0.3)
ax1.legend()
ax1.set_title('Same Vector [3,4] from Different Starting Points')
ax1.set_xlabel('X dimension')
ax1.set_ylabel('Y dimension')

# Plot 2: Demonstrate that all vectors have same magnitude and direction
ax2.text(0.1, 0.9, 'All vectors represent the same displacement:', 
         transform=ax2.transAxes, fontsize=12, weight='bold')

for i, (start_x, start_y) in enumerate(start_points):
    magnitude = np.linalg.norm(vector_displacement)
    direction = np.degrees(np.arctan2(vector_displacement[1], vector_displacement[0]))

    ax2.text(0.1, 0.8 - i*0.15, f'{labels[i]}:', 
             transform=ax2.transAxes, fontsize=11, color=colors[i], weight='bold')
    ax2.text(0.1, 0.75 - i*0.15, f'  • Magnitude: {magnitude:.1f}', 
             transform=ax2.transAxes, fontsize=10, color=colors[i])
    ax2.text(0.1, 0.7 - i*0.15, f'  • Direction: {direction:.1f}°', 
             transform=ax2.transAxes, fontsize=10, color=colors[i])

ax2.text(0.1, 0.3, 'Key Insight: The vector [3,4] represents the same\n"move 3 right, 4 up" regardless of starting point!', 
         transform=ax2.transAxes, fontsize=11, 
         bbox=dict(boxstyle="round,pad=0.3", facecolor="lightblue", alpha=0.7))

ax2.set_xlim(0, 1)
ax2.set_ylim(0, 1)
ax2.axis('off')

plt.tight_layout()
plt.show()

print("=== UNDERSTANDING VECTOR ORIGINS ===")
print("✓ Vectors represent DISPLACEMENT, not absolute position")
print("✓ The same vector [3,4] has identical magnitude and direction regardless of starting point")
print("✓ We use origin (0,0) in examples for:")
print("  • Teaching clarity")
print("  • Mathematical convenience") 
print("  • Standard convention")
print("✓ In real applications, vectors can start from any point!")


### Real-World Applications: Vectors Beyond the Origin

Examples where vectors DON'T start at (0,0):

🚗 **Physics - Car velocity:**
- A car at position (100, 50) moving with velocity [20, 10]
- The velocity vector starts from the car's current position

🎮 **Computer Graphics - Object movement:**
- A game character at (x, y) moving with direction vector [dx, dy]
- Movement vector starts from character's current location

🧠 **Machine Learning - Feature vectors:**
- User profile [age=25, income=50k] in feature space
- The 'origin' might represent average values, not (0,0)

💡 **Key Takeaway:**
- Vectors are about DIRECTION and MAGNITUDE, not starting position!
- We use (0,0) in tutorials for simplicity, but vectors work from any point.
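
The idea that the "origin" of a feature space might be the average user can be sketched with mean-centering. The profile values below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical user profiles: [age, income_k]
profiles = np.array([
    [25.0, 50.0],
    [35.0, 70.0],
    [30.0, 60.0],
])

# The "average user" becomes the new reference point
mean_profile = profiles.mean(axis=0)

# Each row is now a displacement from the average user
centered = profiles - mean_profile

print("Average profile:", mean_profile)
print("Centered profiles:\n", centered)
```

After centering, the first user reads as "5 years younger and $10k below average" instead of as an absolute position, and the user who matches the average maps to the zero vector.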


### 🔄 Vector Comparison: Same vs Different Starting Points

**Are vectors always compared from the same starting point?** No! The comparison approach depends on what you're trying to measure.

#### When Vectors Share the Same Starting Point
- **Purpose**: Compare directions and magnitudes directly
- **Examples**: User preferences, force analysis, data similarity

#### When Vectors Have Different Starting Points  
- **Purpose**: Compare relative movements or changes
- **Examples**: Navigation, stock changes, game movements

Let's explore this with concrete examples:


In [None]:
# Demonstrate vector comparison scenarios
print("=== VECTOR COMPARISON SCENARIOS ===")
print()

# Scenario 1: Same starting point - User profiles (building on our earlier example)
print("🎯 SAME STARTING POINT - User Profiles:")
# Let's use normalized versions of our user data for fair comparison
user_a_norm = np.array([0.5, 0.6, 0.3])  # [age_norm, income_norm, hours_norm]
user_b_norm = np.array([0.7, 0.9, 0.1])  # Normalized to 0-1 scale

similarity = np.dot(user_a_norm, user_b_norm) / (np.linalg.norm(user_a_norm) * np.linalg.norm(user_b_norm))
print(f"   User A (normalized): {user_a_norm} (age, income, hours_online)")
print(f"   User B (normalized): {user_b_norm}")
print(f"   Similarity score: {similarity:.3f}")
print("   → Both start from [0,0,0] baseline to compare profiles directly")
print()

# Scenario 2: Different starting points - Navigation
print("🗺️ DIFFERENT STARTING POINTS - Navigation:")
person_a_location = np.array([10, 20])
person_b_location = np.array([50, 60])
store_location = np.array([15, 23])

vector_a = store_location - person_a_location  # Direction to store from A
vector_b = store_location - person_b_location  # Direction to store from B

print(f"   Person A at {person_a_location} → Store: vector {vector_a}")
print(f"   Person B at {person_b_location} → Store: vector {vector_b}")
print("   → Same destination, different starting points = different vectors")
print()

# Scenario 3: Same starting point - User behavior changes over time
print("📈 SAME STARTING POINT - User Behavior Changes:")
user_a_changes = np.array([0.1, -0.05, 0.15])   # Changes in online hours (normalized)
user_b_changes = np.array([0.05, 0.1, -0.05])   # Changes in online hours (normalized)

correlation = np.corrcoef(user_a_changes, user_b_changes)[0,1]
print(f"   User A changes: {user_a_changes} (Month 1, 2, 3 online hours)")
print(f"   User B changes: {user_b_changes}")
print(f"   Correlation: {correlation:.3f}")
print("   → Both start from 0 baseline to compare behavior patterns")
print()

# Scenario 4: Different starting points - Game physics
print("🎮 DIFFERENT STARTING POINTS - Game Physics:")
player1_pos = np.array([100, 200])
player2_pos = np.array([300, 400])
movement = np.array([10, 0])  # Both move right

print(f"   Player 1 at {player1_pos} moves by {movement}")
print(f"   Player 2 at {player2_pos} moves by {movement}")
print("   → Same movement vector, different starting positions")


In [None]:
# Visualize the comparison scenarios
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(14, 10))

# Plot 1: Same starting point - User profiles (normalized)
categories = ['Age', 'Income', 'Hours Online']
x_pos = np.arange(len(categories))
width = 0.35

ax1.bar(x_pos - width/2, user_a_norm, width, label='User A', alpha=0.8, color='blue')
ax1.bar(x_pos + width/2, user_b_norm, width, label='User B', alpha=0.8, color='red')
ax1.set_xlabel('User Profile Features (Normalized)')
ax1.set_ylabel('Normalized Value (0-1)')
ax1.set_title('Same Starting Point: User Profiles')
ax1.set_xticks(x_pos)
ax1.set_xticklabels(categories)
ax1.legend()
ax1.grid(True, alpha=0.3)

# Plot 2: Different starting points - Navigation
ax2.scatter(*person_a_location, s=100, color='blue', label='Person A', marker='o')
ax2.scatter(*person_b_location, s=100, color='red', label='Person B', marker='o')
ax2.scatter(*store_location, s=150, color='green', label='Store', marker='s')

# Draw vectors to store
ax2.quiver(person_a_location[0], person_a_location[1], vector_a[0], vector_a[1], 
           angles='xy', scale_units='xy', scale=1, color='blue', width=0.005)
ax2.quiver(person_b_location[0], person_b_location[1], vector_b[0], vector_b[1], 
           angles='xy', scale_units='xy', scale=1, color='red', width=0.005)

ax2.set_xlabel('X Position')
ax2.set_ylabel('Y Position')
ax2.set_title('Different Starting Points: Navigation')
ax2.legend()
ax2.grid(True, alpha=0.3)

# Plot 3: Same starting point - User behavior changes
months = ['Month 1', 'Month 2', 'Month 3']
ax3.plot(months, user_a_changes, 'o-', label='User A', linewidth=2, markersize=8)
ax3.plot(months, user_b_changes, 's-', label='User B', linewidth=2, markersize=8)
ax3.axhline(y=0, color='gray', linestyle='--', alpha=0.5)
ax3.set_xlabel('Time Period')
ax3.set_ylabel('Change in Online Hours (Normalized)')
ax3.set_title('Same Starting Point: User Behavior Changes')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Plot 4: Different starting points - Game physics
ax4.scatter(*player1_pos, s=100, color='blue', label='Player 1', marker='o')
ax4.scatter(*player2_pos, s=100, color='red', label='Player 2', marker='o')

# Show movement vectors
ax4.quiver(player1_pos[0], player1_pos[1], movement[0], movement[1], 
           angles='xy', scale_units='xy', scale=1, color='blue', width=0.005)
ax4.quiver(player2_pos[0], player2_pos[1], movement[0], movement[1], 
           angles='xy', scale_units='xy', scale=1, color='red', width=0.005)

ax4.set_xlabel('X Position')
ax4.set_ylabel('Y Position')
ax4.set_title('Different Starting Points: Game Movement')
ax4.legend()
ax4.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n=== KEY INSIGHTS ===")
print("✓ Same starting point → Compare directions/magnitudes directly")
print("✓ Different starting points → Compare relative movements/changes")
print("✓ Choice depends on what you want to measure!")
print("✓ ML often uses same starting point for data similarity")
print("✓ Physics/navigation often use different starting points for movements")
print()
print("🎯 BUILDING ON OUR USER PROFILE THEME:")
print("✓ User profiles [age, income, hours_online] are our consistent example")
print("✓ Normalization is crucial for fair comparisons between features")
print("✓ Without normalization, income values dominate the similarity calculation")
print("✓ All subsequent examples build on this user profile foundation")


## 📌 Vectors and Basic Operations
Let's continue with our user profile theme and explore basic vector operations.

In [None]:
import numpy as np

# Define two user profiles (simplified to 2D for visualization)
user_profile_1 = np.array([0.4, 0.6])  # [age_normalized, income_normalized]
user_profile_2 = np.array([0.8, 0.3])  # Different user profile

# Vector addition and scalar multiplication
print("User Profile 1 + User Profile 2 =", user_profile_1 + user_profile_2)
print("2 * User Profile 1 =", 2 * user_profile_1)
print("\nNote: Vector addition might represent combining user characteristics")
print("Scalar multiplication might represent scaling user features")

## 🔢 Dot Product
The dot product tells us how aligned two vectors are. Mathematically: **a** · **b** = ||a|| ||b|| cos(θ)

**Breaking down this notation:**
- **a** · **b**: The dot product of vectors **a** and **b** (read as "a dot b")
- **||a||**: The magnitude (length) of vector **a** (double pipes mean "length of")
- **||b||**: The magnitude (length) of vector **b**
- **cos(θ)**: Cosine of the angle θ (theta) between the vectors
- **θ**: Greek letter representing the angle between vectors **a** and **b**

This means:
- When vectors point in the same direction (θ = 0°): cos(0°) = 1, maximum dot product
- When vectors are perpendicular (θ = 90°): cos(90°) = 0, dot product = 0
- When vectors point in opposite directions (θ = 180°): cos(180°) = -1, negative dot product
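
The formula above is the geometric form. In code, the dot product is usually computed from components as the sum of element-wise products, and the two forms agree. The sketch below checks this on two example vectors picked for easy arithmetic (they are not from the notebook), measuring the angle independently with `np.arctan2`.

```python
import numpy as np

a = np.array([3.0, 0.0])
b = np.array([2.0, 2.0])

# Component form: sum of element-wise products (what np.dot computes)
algebraic = np.sum(a * b)

# Geometric form: ||a|| ||b|| cos(theta), with theta taken from each
# vector's direction rather than from the dot product itself
theta = np.arctan2(b[1], b[0]) - np.arctan2(a[1], a[0])
geometric = np.linalg.norm(a) * np.linalg.norm(b) * np.cos(theta)

print(f"Component form: {algebraic:.3f}")
print(f"Geometric form: {geometric:.3f}")
print(f"Angle between a and b: {np.degrees(theta):.1f} degrees")
```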


In [None]:
# Compute dot product between user profiles
dot = np.dot(user_profile_1, user_profile_2)
print("Dot product of user profiles:", dot)

# Let's explore the geometric meaning
import matplotlib.pyplot as plt

# Visualize the user profile vectors
plt.figure(figsize=(8, 6))
plt.quiver(0, 0, user_profile_1[0], user_profile_1[1], angles='xy', scale_units='xy', scale=1, 
           color='blue', label='User 1 Profile', width=0.005)
plt.quiver(0, 0, user_profile_2[0], user_profile_2[1], angles='xy', scale_units='xy', scale=1, 
           color='red', label='User 2 Profile', width=0.005)

# Calculate angle between user profile vectors
angle = np.arccos(dot / (np.linalg.norm(user_profile_1) * np.linalg.norm(user_profile_2)))
print(f"Angle between user profiles: {np.degrees(angle):.1f} degrees")
print("Smaller angles mean more similar users!")

plt.xlim(-0.1, 1.0)
plt.ylim(-0.1, 1.0)
plt.grid(True)
plt.legend()
plt.xlabel('Age (normalized)')
plt.ylabel('Income (normalized)')
plt.title(f'User Profile Vectors (dot product = {dot:.3f})')
plt.show()

# Special case: dot product with itself gives squared magnitude
profile1_squared = np.dot(user_profile_1, user_profile_1)
profile1_magnitude_squared = np.linalg.norm(user_profile_1)**2
print(f"User Profile 1 · User Profile 1 = {profile1_squared:.3f}")
print(f"||User Profile 1||² = {profile1_magnitude_squared:.3f}")
print("They're equal! This is always true.")


**Quiz:**
What is the geometric meaning of a dot product?
- A. The angle between vectors
- B. The projection of one vector onto another  
- C. A measure of similarity/alignment
- D. All of the above

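> **Answer**: D. All of the above! The dot product encodes the angle between vectors (through cos θ), is the key ingredient in computing projections, and serves as a measure of alignment. Keep in mind that it also grows with the vectors' lengths, so for a pure similarity score it is usually divided by both magnitudes (cosine similarity).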


## 📐 Vector Projection: Hands-On Practice

Now let's explore **vector projection** - one of the most important concepts in linear algebra and ML! 

Vector projection answers: "How much of vector **a** lies in the direction of vector **b**?"

Think of it as the "shadow" that vector **a** casts onto vector **b** when light shines perpendicular to **b**.


In [None]:
# Let's compute and visualize vector projections
def project_vector(a, b):
    """Project vector a onto vector b"""
    # Scalar projection (length of the shadow)
    scalar_proj = np.dot(a, b) / np.linalg.norm(b)

    # Vector projection (the actual shadow vector)
    vector_proj = (np.dot(a, b) / np.dot(b, b)) * b

    return scalar_proj, vector_proj

# Example: Project one user profile onto another
user_a = np.array([0.6, 0.8])  # User A: [age_norm, income_norm]
user_b = np.array([1.0, 0.0])  # Reference direction: pure age dimension

scalar_proj, vector_proj = project_vector(user_a, user_b)

print("=== USER PROFILE PROJECTION EXAMPLE ===")
print(f"User A profile: {user_a} (age_norm, income_norm)")
print(f"Reference direction: {user_b} (pure age dimension)")
print(f"Scalar projection (age component): {scalar_proj:.2f}")
print(f"Vector projection: {vector_proj}")

# Interpretation for ML
print(f"\nML Interpretation:")
print(f"User A's profile magnitude = {np.linalg.norm(user_a):.2f}")
print(f"User A's 'age component' when projected onto age axis = {scalar_proj:.2f}")
print("This tells us how much of User A's profile is explained by age alone!")
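
The "shadow" picture implies that whatever is left of **a** after removing its projection must be perpendicular to **b**. The sketch below checks this numerically; the helper name `decompose` and the second reference direction `[1, 1]` are additions for illustration.

```python
import numpy as np

def decompose(a, b):
    """Split a into a part along b and a part perpendicular to b."""
    along = (np.dot(a, b) / np.dot(b, b)) * b
    perpendicular = a - along
    return along, perpendicular

a = np.array([0.6, 0.8])

for b in (np.array([1.0, 0.0]), np.array([1.0, 1.0])):
    along, perpendicular = decompose(a, b)
    print(f"b = {b}")
    print(f"  along b:         {along}")
    print(f"  perpendicular:   {perpendicular}")
    # A zero dot product confirms the leftover part is at 90 degrees to b
    print(f"  perpendicular . b = {np.dot(perpendicular, b):.6f}")
```

The two parts always add back up to the original vector, which is why projection is often described as splitting a vector into "explained" and "unexplained" components.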


In [None]:
# Visualize the projection
plt.figure(figsize=(10, 8))

# Plot original vectors
plt.quiver(0, 0, user_a[0], user_a[1], angles='xy', scale_units='xy', scale=1, 
           color='blue', width=0.01, label=f'User A = {user_a}')
plt.quiver(0, 0, user_b[0], user_b[1], angles='xy', scale_units='xy', scale=1, 
           color='red', width=0.01, label=f'Age axis = {user_b}')

# Plot the projection
plt.quiver(0, 0, vector_proj[0], vector_proj[1], angles='xy', scale_units='xy', scale=1, 
           color='green', width=0.01, label='Age component of User A')

# Draw the "shadow" line (perpendicular from user_a to its projection)
plt.plot([user_a[0], vector_proj[0]], [user_a[1], vector_proj[1]], 
         'gray', linestyle='--', alpha=0.7, label='Perpendicular drop')

# Add annotations
plt.annotate(f'Profile magnitude = {np.linalg.norm(user_a):.2f}', 
             xy=(user_a[0]/2, user_a[1]/2 + 0.05), fontsize=12, color='blue')
plt.annotate(f'Age component = {scalar_proj:.2f}', 
             xy=(vector_proj[0]/2, -0.05), fontsize=12, color='green')

plt.xlim(-0.1, 1.1)
plt.ylim(-0.1, 0.9)
plt.grid(True, alpha=0.3)
plt.legend()
plt.title('User Profile Projection: Age Component of User A')
plt.xlabel('Age (normalized)')
plt.ylabel('Income (normalized)')
plt.gca().set_aspect('equal')
plt.show()


In [None]:
# Let's explore different user profile projection scenarios
user_scenarios = [
    ("Similar users", [0.6, 0.8], [0.8, 0.6]),  # Similar age/income profiles
    ("Age-focused vs Income-focused", [0.9, 0.1], [0.1, 0.9]),  # Nearly (not exactly) perpendicular: about 77° apart
    ("Opposite profiles", [0.8, 0.9], [-0.4, -0.45]),  # Opposite directions (scaled)
    ("Young high-earner onto age axis", [0.2, 0.9], [1.0, 0.0])  # Project onto pure age
]

print("=== USER PROFILE PROJECTION SCENARIOS ===")
for name, profile, reference_dir in user_scenarios:
    # Fresh names here keep user_a, scalar_proj, and vector_proj from the earlier cells intact
    s_proj, v_proj = project_vector(np.array(profile), np.array(reference_dir))
    dot_product = np.dot(profile, reference_dir)

    print(f"\n{name}:")
    print(f"  User profile = {profile} (age_norm, income_norm)")
    print(f"  Reference direction = {reference_dir}")
    print(f"  Dot product = {dot_product:.3f}")
    print(f"  Scalar projection = {s_proj:.3f}")
    print(f"  Vector projection = [{v_proj[0]:.3f}, {v_proj[1]:.3f}]")


In [None]:
# ML Application: Feature extraction using projection
print("\n=== ML APPLICATION: FEATURE EXTRACTION ===")

# Simulate some 2D data points (could be customer features, image features, etc.)
data_points = np.array([
    [2, 3],   # Customer 1: [feature_1, feature_2] (illustrative scores, e.g., for age and income)
    [1, 4],   # Customer 2
    [3, 2],   # Customer 3
    [4, 1],   # Customer 4
    [1, 1]    # Customer 5
])

# Define a "direction of interest" (could be discovered by PCA, domain knowledge, etc.)
direction = np.array([1, 1])  # Equal weight to age and income
direction = direction / np.linalg.norm(direction)  # Normalize

print(f"Data points (customers): \n{data_points}")
print(f"Direction of interest: {direction}")

# Project all data points onto this direction
projections = []
for point in data_points:
    scalar_proj, vector_proj = project_vector(point, direction)
    projections.append(scalar_proj)

projections = np.array(projections)
print(f"Projected values (1D features): {projections}")

# Visualize
plt.figure(figsize=(10, 6))

# Plot original 2D data
plt.subplot(1, 2, 1)
plt.scatter(data_points[:, 0], data_points[:, 1], c='blue', s=100)
plt.quiver(0, 0, direction[0]*3, direction[1]*3, angles='xy', scale_units='xy', scale=1, 
           color='red', width=0.01, label='Projection direction')
for i, point in enumerate(data_points):
    plt.annotate(f'C{i+1}', xy=point, xytext=(5, 5), textcoords='offset points')
plt.xlim(-0.5, 5)
plt.ylim(-0.5, 5)
plt.grid(True, alpha=0.3)
plt.legend()
plt.title('Original 2D Data')
plt.xlabel('Feature 1 (e.g., Age)')
plt.ylabel('Feature 2 (e.g., Income)')

# Plot projected 1D data
plt.subplot(1, 2, 2)
plt.scatter(projections, np.zeros_like(projections), c='green', s=100)
for i, proj in enumerate(projections):
    plt.annotate(f'C{i+1}', xy=(proj, 0), xytext=(0, 10), textcoords='offset points')
plt.ylim(-0.5, 0.5)
plt.grid(True, alpha=0.3)
plt.title('Projected 1D Data')
plt.xlabel('Projected Feature Value')
plt.ylabel('')

plt.tight_layout()
plt.show()

print("\n=== KEY INSIGHTS ===")
print("• Projection reduces dimensionality, keeping only the information along the chosen direction")
print("• The projection direction determines what aspects of data we emphasize")
print("• This is the foundation of techniques like PCA (Principal Component Analysis)")
print("• In ML, we often project high-dimensional data to lower dimensions for visualization and efficiency")
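
Because the direction above has unit length, each scalar projection is just a dot product, so a single matrix-vector product projects every customer at once. This sketch repeats the same data so that it runs on its own.

```python
import numpy as np

data_points = np.array([
    [2, 3],
    [1, 4],
    [3, 2],
    [4, 1],
    [1, 1],
])

direction = np.array([1.0, 1.0])
direction = direction / np.linalg.norm(direction)  # unit length

# One row per customer, one dot product per row
projections = data_points @ direction

print("Projected values:", np.round(projections, 3))
```

Customers 1 through 4 all land on the same projected value, because their two features always sum to 5. The differences between them lie perpendicular to the chosen direction, and projection discards exactly that part. Methods such as PCA pick the direction so that as little variation as possible is lost.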


## 🔄 Matrix Multiplication

Matrix multiplication combines transformations. Remember: **order matters!** AB ≠ BA in general.

**Notation:** ≠ means "not equal to"

When we multiply an m×n matrix A by an n×p matrix B, we get an m×p matrix C.
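
The shape rule is easiest to see with non-square matrices. In the sketch below the matrices are filled with consecutive integers purely for illustration. It also checks that each entry of the product is the dot product of a row of A with a column of B.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # 2x3 matrix
B = np.arange(12).reshape(3, 4)   # 3x4 matrix

C = A @ B                         # inner dimensions (3 and 3) match, result is 2x4
print("A shape:", A.shape, "B shape:", B.shape, "C shape:", C.shape)

# Entry C[i, j] is the dot product of row i of A with column j of B
i, j = 1, 2
entry = np.dot(A[i, :], B[:, j])
print(f"C[{i}, {j}] = {C[i, j]}, row-by-column dot product = {entry}")

# Reversing the order fails here: B is 3x4 and A is 2x3, so 4 != 2
mismatch_raised = False
try:
    B @ A
except ValueError as err:
    mismatch_raised = True
    print("B @ A is not defined:", err)
```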


In [None]:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Multiply matrices (the @ operator is equivalent to np.dot for 2D arrays)
C = A @ B
print("A @ B =\n", C)

# Let's verify that order matters
C_reverse = B @ A
print("\nB @ A =\n", C_reverse)
print(f"\nAre they equal? {np.array_equal(C, C_reverse)}")

# Let's understand what matrix multiplication does geometrically
# A matrix transforms vectors - let's see how
test_vector = np.array([1, 0])  # Unit vector along x-axis
transformed = A @ test_vector
print(f"\nOriginal vector: {test_vector}")
print(f"After transformation by A: {transformed}")

# The identity matrix does nothing (like multiplying by 1)
I = np.eye(2)  # 2x2 identity matrix
print(f"\nIdentity matrix:\n{I}")
print(f"I @ test_vector = {I @ test_vector}")  # Should be unchanged
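
The result of `A @ [1, 0]` above is the first column of `A`, and that is no accident: the columns of a matrix are where it sends the basis vectors. A short sketch, with the test vector `v` chosen arbitrarily for illustration:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
e1 = np.array([1, 0])   # unit vector along x
e2 = np.array([0, 1])   # unit vector along y

print("A @ e1 =", A @ e1, "(first column of A)")
print("A @ e2 =", A @ e2, "(second column of A)")

# Any vector is a mix of e1 and e2, so A @ v mixes the columns the same way
v = np.array([2, -1])
combo = v[0] * A[:, 0] + v[1] * A[:, 1]
print("A @ v =", A @ v)
print("2 * col1 - 1 * col2 =", combo)
```

Reading the columns is a quick way to predict what a matrix does before plotting anything.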

## 🧠 Visualizing Transformations

In [None]:
import matplotlib.pyplot as plt

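def plot_transform(A, title="Transformation"):
    """Draw an arrow from each grid point to where the 2x2 matrix A sends it."""
    # 5x5 grid of integer points, one point per row
    grid = np.array([[x, y] for x in range(-2, 3) for y in range(-2, 3)])
    # Points are stored as rows, so multiplying by A.T applies A to every point
    transformed = grid @ A.T

    plt.figure(figsize=(6, 6))
    # Arrow components are the displacement (new position minus old position)
    plt.quiver(grid[:, 0], grid[:, 1],
               transformed[:, 0] - grid[:, 0], transformed[:, 1] - grid[:, 1],
               angles='xy', scale_units='xy', scale=1, color='r')
    plt.scatter(grid[:, 0], grid[:, 1], color='blue')
    # quiver does not autoscale to arrow tips, so size the axes to fit both point sets
    limit = np.abs(np.vstack([grid, transformed])).max() + 1
    plt.xlim(-limit, limit)
    plt.ylim(-limit, limit)
    plt.title(title)
    plt.grid(True)
    plt.axhline(0, color='gray', lw=1)
    plt.axvline(0, color='gray', lw=1)
    plt.gca().set_aspect('equal')
    plt.show()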

plot_transform(np.array([[2, 0], [0, 1]]), title="Horizontal Stretch")

## ✅ Summary Quiz & Checklist

### Quiz Questions
1. **What does matrix multiplication represent geometrically?**
   > Matrix multiplication represents the composition of linear transformations. Each matrix transforms space in some way (stretch, rotate, reflect, etc.), and multiplying matrices combines these transformations.

2. **What happens when you dot a vector with itself?**
   > You get the squared magnitude (length) of the vector: **v** · **v** = ||**v**||²

3. **Which operations preserve direction?**
   > Scalar multiplication by positive numbers preserves direction. Matrix transformations may or may not preserve direction depending on the matrix.

4. **Why does AB ≠ BA in general?**
   > Because matrix multiplication represents composition of transformations, and the order of transformations matters. Rotating then stretching gives a different result than stretching then rotating.
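
The rotate-versus-stretch claim in question 4 can be checked with two small matrices. This is a sketch with an arbitrarily chosen 90° rotation and a horizontal stretch by 2. Note that matrices apply right to left, so `S @ R @ v` rotates first and stretches second.

```python
import numpy as np

theta = np.pi / 2
# Rotate 90 degrees counterclockwise
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
# Stretch the x direction by a factor of 2
S = np.array([[2.0, 0.0],
              [0.0, 1.0]])

v = np.array([1.0, 0.0])

rotate_then_stretch = S @ R @ v   # R acts first, then S
stretch_then_rotate = R @ S @ v   # S acts first, then R

print("Rotate, then stretch:", np.round(rotate_then_stretch, 3))
print("Stretch, then rotate:", np.round(stretch_then_rotate, 3))
print("S @ R equals R @ S?", np.allclose(S @ R, R @ S))
```

Rotating first turns the vector onto the y-axis, where the horizontal stretch no longer affects it. Stretching first doubles its length before the rotation, so the two orders end at different points.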

### Self-Assessment Checklist
Check off each item as you master it:

**Vector Fundamentals:**
- [ ] I understand what vectors are: mathematical objects with magnitude and direction
- [ ] I can explain why vectors are central to machine learning (data representation, similarity, transformations)
- [ ] I can give concrete examples of how different data types become vectors
- [ ] I can visualize vectors as arrows in space and interpret their geometric meaning

**Vector Relationships:**
- [ ] I understand what angles between vectors represent (similarity/alignment)
- [ ] I can explain why small angles mean high similarity and large angles mean low similarity
- [ ] I can compute and interpret dot products as measures of vector alignment
- [ ] I can connect vector similarity to real ML applications (recommendations, document analysis, etc.)

**Mathematical Operations:**
- [ ] I can compute a dot product and explain its geometric meaning
- [ ] I can multiply two matrices and describe the geometric effect
- [ ] I can visualize matrix transformations on a 2D grid
- [ ] I can explain why AB ≠ BA (order matters in matrix multiplication)
- [ ] I understand the role of the identity matrix

**ML Connections:**
- [ ] I can connect these concepts to machine learning applications
- [ ] I understand how vectors represent data points in feature spaces
- [ ] I can explain how ML algorithms use vector operations to find patterns

### 🔗 Next Steps
- Review the [companion theory file](01_linear_algebra_intro.md) for deeper mathematical insights
- Practice with different transformation matrices
- Think about how these concepts might apply to neural networks (hint: they're everywhere!)

### 💡 Key Takeaways
- **Vectors**: Quantities with direction and magnitude
- **Matrices**: Functions that transform space
- **Dot Product**: Measures alignment between vectors
- **Matrix Multiplication**: Combines transformations (order matters!)
- **Geometric Intuition**: Always try to visualize what's happening in space
