# 🧠 Day 2 — Matrix Multiplication, Dot Product & Embeddings
### Foundation of AI Math — How Vectors, Matrices & Embeddings Power Models
---
Today we'll see *why* the dot product, cosine similarity, and matrix multiplication are the **heart of AI**, and how every neural layer and embedding uses them.

## 1️⃣ Dot Product — How Two Things Align
The dot product measures **how much two vectors go in the same direction**. It’s the simplest mathematical form of “similarity.”

### Formula
$$A \cdot B = \sum_i (A_i \times B_i)$$

- If both vectors point the same way → large positive value
- If opposite → negative
- If unrelated → near zero
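
The three cases above can be checked with a minimal sketch. The vectors below are illustrative values chosen for this example, not data from the notebook:

```python
import numpy as np

base = np.array([1.0, 2.0])          # reference vector (illustrative values)
same = np.array([2.0, 4.0])          # same direction, twice as long
opposite = np.array([-1.0, -2.0])    # exactly reversed
orthogonal = np.array([2.0, -1.0])   # perpendicular to base

dot_same = np.dot(base, same)              # 1*2 + 2*4 = 10  (positive)
dot_opposite = np.dot(base, opposite)      # -1 - 4   = -5  (negative)
dot_orthogonal = np.dot(base, orthogonal)  # 2 - 2    = 0   (zero)

print(dot_same, dot_opposite, dot_orthogonal)
```

Note that `dot_same` grows with the length of `same`, so the magnitude of a dot product reflects vector length as well as direction.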

In [1]:
import numpy as np

A = np.array([5, 1, 2])
B = np.array([4, 2, 1])

dot = np.dot(A, B)
print('Dot Product:', dot)

Dot Product: 24


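👉 The positive result `24` suggests the two vectors point in a broadly similar direction.
If these were user ratings, it would suggest both users like similar movies. The raw number has no fixed scale, though, so on its own it does not say how strong the alignment is.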

## 2️⃣ Cosine Similarity — Scale-Free Comparison
Dot product depends on vector *length*. If one user rates everything high, their scores inflate.
Cosine similarity fixes that by normalizing both vectors.

$$\cos(\theta) = \frac{A \cdot B}{\|A\|\|B\|}$$

In [2]:
cos_sim = dot / (np.linalg.norm(A) * np.linalg.norm(B))
print('Cosine Similarity:', cos_sim)

Cosine Similarity: 0.956182887467515


- `1` → same direction (identical taste)
- `0` → no relation
- `-1` → opposite
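
To see the "scale-free" property concretely, here is a small sketch that doubles every entry of `A` and compares both measures. The helper name `cosine` is introduced here for illustration only:

```python
import numpy as np

def cosine(u, v):
    """Cosine of the angle between two 1-D vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

A = np.array([5, 1, 2])
B = np.array([4, 2, 1])
A_scaled = 2 * A  # same rating pattern, every score doubled

raw = np.dot(A, B)                # 24
raw_scaled = np.dot(A_scaled, B)  # 48: the dot product doubles
cos = cosine(A, B)
cos_scaled = cosine(A_scaled, B)  # unchanged: the factor 2 cancels out

print(raw, raw_scaled)
print(np.isclose(cos, cos_scaled))
```

The scaling factor appears once in the numerator and once in the norm of `A_scaled`, which is why it cancels.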

## 3️⃣ Matrix Multiplication — Many Dot Products at Once
Each cell in a matrix product is one dot product. This lets us compare *all users or all items* together.

In [4]:
ratings = np.array([
    [5, 3, 0],
    [4, 0, 2],
    [1, 1, 4]
])

sim_matrix = np.dot(ratings, ratings.T)
print('User Raw Similarity Matrix:\n', sim_matrix)

User Raw Similarity Matrix:
 [[34 20  8]
 [20 20 12]
 [ 8 12 18]]


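👉 Each off-diagonal element is the raw (unnormalized) similarity between two users.
Each diagonal element is a user's dot product with themselves, i.e. their squared vector length. It is not guaranteed to be the largest value in its row: the second row has `20` both on and off the diagonal.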
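
As a sanity check on "each cell is one dot product", the sketch below rebuilds the matrix cell by cell with explicit loops and compares it with the single matrix product. The variable names `manual`, `fast`, and `squared_norms` are introduced for this example:

```python
import numpy as np

ratings = np.array([
    [5, 3, 0],
    [4, 0, 2],
    [1, 1, 4]
])

n_users = ratings.shape[0]
manual = np.zeros((n_users, n_users), dtype=ratings.dtype)
for i in range(n_users):
    for j in range(n_users):
        # cell (i, j) is the dot product of user i's row and user j's row
        manual[i, j] = np.dot(ratings[i], ratings[j])

fast = ratings @ ratings.T                   # all nine dot products in one call
squared_norms = (ratings ** 2).sum(axis=1)   # each user's squared vector length

print(np.array_equal(manual, fast))
print(np.array_equal(np.diag(fast), squared_norms))
```

The loop version is only there to make the structure visible; in practice the single `@` call is the idiomatic form.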

## 4️⃣ Normalized Similarity — Cosine for All Users

In [5]:
from sklearn.metrics.pairwise import cosine_similarity
cosine_sim_matrix = cosine_similarity(ratings)
print('Cosine Similarity Matrix:\n', cosine_sim_matrix)

Cosine Similarity Matrix:
 [[1.         0.76696499 0.32338083]
 [0.76696499 1.         0.63245553]
 [0.32338083 0.63245553 1.        ]]


👉 Because ratings are non-negative, every value here falls between 0 and 1 (in general, cosine similarity ranges from -1 to 1). Normalized scores are easier to compare across users.
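
The same matrix can be sketched with NumPy alone: scale each user's row to unit length, then take one matrix product. This is an illustration of the idea, not a claim about how `cosine_similarity` is implemented internally; `unit_rows` and `cosine_manual` are names introduced here:

```python
import numpy as np

ratings = np.array([
    [5, 3, 0],
    [4, 0, 2],
    [1, 1, 4]
])

# Length of each user's rating vector, kept as a column so it broadcasts per row
norms = np.linalg.norm(ratings, axis=1, keepdims=True)
unit_rows = ratings / norms              # every row now has length 1

# Dot products of unit vectors are cosine similarities
cosine_manual = unit_rows @ unit_rows.T
print(np.round(cosine_manual, 4))
```

Once every row has length 1, the denominator of the cosine formula is 1, so the plain dot product already is the cosine.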

## 5️⃣ Neural Networks: Dot Product Everywhere
Every neuron computes:
$$\text{output} = \text{input} \cdot \text{weights} + \text{bias}$$
So, deep learning is really millions of dot products and matrix multiplications.

In [6]:
X = np.array([[1, 2], [3, 4]])  # inputs
W = np.array([[0.2, 0.8], [0.6, 0.4]])  # weights
b = np.array([0.5, -0.5])  # biases

Y = np.dot(X, W) + b
print('Neural Layer Output:\n', Y)

Neural Layer Output:
 [[1.9 1.1]
 [3.5 3.5]]


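👉 Each row in `Y` holds the outputs of both neurons for one input, and each column of `W` holds one neuron's weights. These are the values before any activation function is applied.
This is what happens in every dense layer in deep learning.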
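
To make the link between the per-neuron formula and the matrix form explicit, this sketch computes the same layer one neuron at a time and compares it with the single matrix product. `Y_loop` and `Y_matmul` are names introduced for this example:

```python
import numpy as np

X = np.array([[1, 2], [3, 4]])          # two input samples, two features each
W = np.array([[0.2, 0.8], [0.6, 0.4]])  # column j holds the weights of neuron j
b = np.array([0.5, -0.5])               # one bias per neuron

n_samples, n_neurons = X.shape[0], W.shape[1]
Y_loop = np.zeros((n_samples, n_neurons))
for s in range(n_samples):
    for j in range(n_neurons):
        # one neuron, one sample: dot product of input and weights, plus bias
        Y_loop[s, j] = np.dot(X[s], W[:, j]) + b[j]

Y_matmul = X @ W + b  # the same four values in one matrix product
print(np.allclose(Y_loop, Y_matmul))
```

The bias `b` is added to every row through NumPy broadcasting, which is why the matrix form needs no explicit loop over samples.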

## 6️⃣ Embeddings — Vectors With Meaning
Embeddings map words, users, or items into numeric space.
Similar entities are close together (high cosine similarity).

In [7]:
texts = ['dog', 'cat', 'apple']
embeddings = np.array([
    [0.8, 0.2, 0.1],
    [0.7, 0.3, 0.2],
    [0.1, 0.8, 0.9]
])

query = np.array([[0.75, 0.25, 0.15]])

sims = cosine_similarity(query, embeddings)
for t, s in zip(texts, sims[0]):
    print(f'{t}: {s:.2f}')

dog: 0.99
cat: 0.99
apple: 0.42


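👉 The query vector lies close to both 'dog' and 'cat', so they score far higher than 'apple'. The vectors here are hand-picked toy values; real embeddings are learned from data.
This is how search and recommendation systems measure semantic similarity.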
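
A search system typically goes one step further and ranks items by score. The following is a minimal sketch of that ranking step using the same toy vectors and NumPy only; `scores`, `ranking`, and `ranked_texts` are names introduced for this example:

```python
import numpy as np

texts = ['dog', 'cat', 'apple']
embeddings = np.array([
    [0.8, 0.2, 0.1],
    [0.7, 0.3, 0.2],
    [0.1, 0.8, 0.9]
])
query = np.array([0.75, 0.25, 0.15])

# Normalize rows and query so that a dot product equals cosine similarity
unit_embeddings = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
unit_query = query / np.linalg.norm(query)

scores = unit_embeddings @ unit_query   # one cosine score per text
ranking = np.argsort(-scores)           # indices sorted from best to worst match
ranked_texts = [texts[i] for i in ranking]

print(ranked_texts)
```

With only three items a full sort is fine; how large collections are searched efficiently is outside the scope of this sketch.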

## ✅ Reflection
- What does dot product actually *mean* in similarity terms?
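    It measures how strongly two vectors point in the same direction, scaled by their lengths: large and positive when they align, negative when they oppose, and zero when they are perpendicular.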

- Why does normalization (cosine) change the result?
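    It removes the bias towards large values. If one user rates everything high (say 9 or 10) and another uses smaller numbers (say 1–5), the raw dot product can be large even when their rating patterns differ. Cosine similarity divides out the vector lengths, so only the pattern (the direction) is compared.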

- Why is matrix multiplication the backbone of neural networks?
    Every neuron in a neural network computes:
        output = input · weights + bias
    This is a dot product again.
    Matrix multiplication lets a layer compute all neurons’ outputs for all samples at once.

- What does an embedding capture that raw numbers can’t?
    An embedding captures the context and relationships between items by placing similar items close together in a conceptual "space," something a single raw number can't do. This lets a program work with relationships like "king is to queen as man is to woman" just by doing math on their "locations."
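
The "king is to queen" arithmetic can be illustrated with a toy sketch. The vectors below are hand-picked for this example on two made-up axes (royalty and gender); they are not learned embeddings, and `vocab`, `cosine`, and `best` are hypothetical names:

```python
import numpy as np

# Hand-picked toy vectors, NOT learned embeddings.
# Assumed axes: [royalty, gender], with gender +1 = male and -1 = female.
vocab = {
    'king':  np.array([1.0,  1.0]),
    'queen': np.array([1.0, -1.0]),
    'man':   np.array([0.0,  1.0]),
    'woman': np.array([0.0, -1.0]),
}

def cosine(u, v):
    """Cosine of the angle between two 1-D vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# "king" minus "man" plus "woman": keep royalty, flip gender
target = vocab['king'] - vocab['man'] + vocab['woman']

# Pick the vocabulary word whose vector points most nearly the same way
best = max(vocab, key=lambda word: cosine(target, vocab[word]))
print(best)
```

Learned embeddings have hundreds of dimensions with no hand-assigned meaning, so the result there is approximate rather than exact as it is in this toy setup.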
