<div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 30px; border-radius: 15px; box-shadow: 0 10px 20px rgba(0,0,0,0.2);">
    <h1 style="color: white; border-bottom: 2px solid rgba(255,255,255,0.3); padding-bottom: 15px; margin-top: 0;">🧠 The Perfect Math Roadmap for AI Beginners</h1>
    <h2 style="color: #f1c40f; margin-top: 10px;">From Zero to Hero: Master Linear Algebra, Calculus & Probability</h2>
    <p style="font-size: 1.1em; margin-top: 20px;">
        <b>Welcome!</b> This notebook is designed to take you from "I'm bad at math" to "I understand how AI works". 
        We use <b style="color: #f1c40f;">visualizations</b>, <b style="color: #f1c40f;">analogies</b>, and <b style="color: #f1c40f;">Python code</b> 
        to build your intuition. No boring proofs—just what you need to build intelligent systems.
    </p>
    <hr style="border-color: rgba(255,255,255,0.2); margin: 20px 0;">
    <div style="display: flex; gap: 20px; align-items: center;">
        <div>
            <b>👤 Author:</b> Tassawar Abbas<br>
            <b>📧 Email:</b> <a href="mailto:abbas829@gmail.com" style="color: #f1c40f; text-decoration: none;">abbas829@gmail.com</a>
        </div>
        <div style="border-left: 1px solid rgba(255,255,255,0.3); padding-left: 20px;">
            <b>🚀 Goal:</b> Understand the "Why" and "How" of AI Math<br>
            <b>🛠️ Level:</b> Beginner to Intermediate
        </div>
    </div>
</div>


# 🛠️ 0. Setting Up Your Toolkit
Before we start, we need our tools. In AI, these are:
*   **NumPy**: The Swiss Army knife for numbers (vectors & matrices).
*   **Matplotlib/Seaborn**: Seeing is believing (visualizations).
*   **SciPy**: Advanced math functions.


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

# Make plots look professional
# (the 'seaborn-v0_8-*' style names need Matplotlib 3.6 or newer)
plt.style.use('seaborn-v0_8-darkgrid')
plt.rcParams.update({
    'font.size': 12,
    'figure.figsize': (10, 6),
    'axes.titlesize': 16,
    'axes.labelsize': 14,
    'lines.linewidth': 2.5,
    'grid.alpha': 0.7
})
sns.set_palette("viridis")

print("✅ Toolkit Ready! Let's do some math.")


# 📐 Part 1: Linear Algebra (The Language of Data)
## "Think in Arrows and Grids"

Why do we need Linear Algebra?
> **Because computers don't understand images, text, or sound. They only understand lists of numbers.**

*   An **Image** is a grid of numbers (pixels).
*   A **Word** is a list of numbers (vector embedding).
*   **Linear Algebra** provides the rules for manipulating these lists.


### 1.1 Vectors: Data Points in Space
A **vector** is just a list of numbers. In geometry, it's an arrow pointing from the origin (0,0) to a point.

**Analogy**: Think of a vector as a character's stats in a video game: `[Strength, Speed, Intelligence]`.


In [None]:
# Creating a vector (game character stats)
# [Strength, Speed]
hero = np.array([3, 5])
monster = np.array([4, 2])

print(f"Hero Stats: {hero}")
print(f"Monster Stats: {monster}")

# Visualization
plt.figure(figsize=(8, 8))
plt.grid(True, linestyle='--', alpha=0.6)
plt.axhline(0, color='black', linewidth=1)
plt.axvline(0, color='black', linewidth=1)

# Plot vectors as arrows
plt.quiver(0, 0, hero[0], hero[1], angles='xy', scale_units='xy', scale=1, color='#3498db', label='Hero')
plt.quiver(0, 0, monster[0], monster[1], angles='xy', scale_units='xy', scale=1, color='#e74c3c', label='Monster')

plt.xlim(-1, 6)
plt.ylim(-1, 6)
plt.xlabel('Strength')
plt.ylabel('Speed')
plt.title('Vectors as Arrows: Reviewing Character Stats')
plt.legend()
plt.show()


### 1.2 The Dot Product: Assessing Similarity
The **Dot Product** is one of the most important operations in AI.
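It measures how strongly two vectors point in the same direction, which makes it a quick gauge of **how similar** they are. Keep in mind that it also grows with the vectors' lengths, not just their alignment.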

Formula: $a \cdot b = \sum_i a_i \times b_i$

> **Use Case in AI**: Many **Recommendation Systems** rely on this idea. If your "User Vector" points in a similar direction to a "Movie Vector", their dot product is high, and a service like Netflix can rank that movie higher for you.


In [None]:
# Let's see if our Hero and Monster are "aligned" (similar stats)

# Calculate dot product
dot_prod = np.dot(hero, monster)
print(f"Dot Product (Similarity Score): {dot_prod}")

# Let's compare with a Villain who points in a different direction
villain = np.array([-2, 4])  # Negative strength, high speed

dot_prod_villain = np.dot(hero, villain)
print(f"Dot Product with Villain: {dot_prod_villain}")

# Conclusion
print("\nAnalysis:")
print("High positive dot product = Similar direction (Alignment)")
print("Zero dot product = Orthogonal (Unrelated)")
print("Negative dot product = Opposite direction")
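
The raw dot product mixes two things: direction and length. A common way to isolate direction is cosine similarity, which divides the dot product by both vector lengths. The sketch below is illustrative only; `cosine_similarity` and `big_monster` are names introduced here, not part of the original notebook.

```python
import numpy as np

def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

hero = np.array([3, 5])
monster = np.array([4, 2])
big_monster = 10 * monster  # hypothetical: same direction, 10x longer

# The dot product changes when a vector gets longer...
print("dot(hero, monster):    ", np.dot(hero, monster))
print("dot(hero, big_monster):", np.dot(hero, big_monster))

# ...but cosine similarity only cares about direction
print("cos(hero, monster):    ", round(cosine_similarity(hero, monster), 3))
print("cos(hero, big_monster):", round(cosine_similarity(hero, big_monster), 3))
```

Cosine similarity always lands between -1 (opposite) and 1 (same direction), which makes scores easier to compare across vectors of different lengths.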


### 1.3 Matrices: Transformations
If a vector is data, a **Matrix** is a machine that *transforms* that data.
It can rotate, scale, or shear vectors.

In Neural Networks, **each layer is built around a matrix** of weights. When data passes through a layer, the matrix transforms it (usually followed by a bias and a non-linear activation) to extract features.


In [None]:
# Define a rotation matrix (rotates 90 degrees counter-clockwise)
theta = np.radians(90)
rotation_matrix = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)]
])

print("Rotation Matrix:\n", rotation_matrix.round(2))

# Apply transformation (Matrix-Vector Multiplication)
# New Vector = Matrix @ Old Vector
hero_rotated = rotation_matrix @ hero

print(f"Original Hero: {hero}")
print(f"Rotated Hero: {hero_rotated.round(2)}")

# Visualizing the transformation
plt.figure(figsize=(8, 8))
plt.grid(True, linestyle='--', alpha=0.6)
plt.axhline(0, color='black', linewidth=1)
plt.axvline(0, color='black', linewidth=1)

# Original
plt.quiver(0, 0, hero[0], hero[1], angles='xy', scale_units='xy', scale=1, color='#3498db', label='Original')
# Rotated
plt.quiver(0, 0, hero_rotated[0], hero_rotated[1], angles='xy', scale_units='xy', scale=1, color='#2ecc71', label='Transformed (Rotated)')

plt.xlim(-6, 6)
plt.ylim(-6, 6)
plt.title('Matrices Transform Vectors (Here: 90° Rotation)')
plt.legend()
plt.show()
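
To connect the rotation example back to neural networks, here is a minimal sketch of a single layer: a matrix multiplication, plus a bias, followed by a ReLU activation. The weights `W` and bias `b` are made-up numbers chosen for easy arithmetic; a real network would learn them from data.

```python
import numpy as np

def relu(z):
    """ReLU activation: keep positive values, replace negatives with 0."""
    return np.maximum(0, z)

# Hypothetical layer mapping 2 inputs to 3 outputs
W = np.array([[ 1.0, -1.0],
              [ 0.5,  0.5],
              [-1.0,  2.0]])
b = np.array([0.0, 1.0, -10.0])

x = np.array([3.0, 5.0])  # the hero vector from earlier

z = W @ x + b   # linear part: [-2.  5. -3.]
a = relu(z)     # after activation: [0. 5. 0.]

print("Before activation:", z)
print("After activation: ", a)
```

Note that the output has three numbers even though the input had two: a matrix can change the dimension of the data, not just rotate or stretch it.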


# 📉 Part 2: Calculus (The Engine of Learning)
## "How Neural Networks Learn from Mistakes"

Calculus in AI is largely about one thing: **Optimization**.
We want to minimize the *error* (loss) of our model. To do that, we need to know "which way is down?"

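*   **Derivative**: The slope of a function at a point. Its sign tells you whether the function goes up or down as the input increases, and its size tells you how steeply.
*   **Gradient**: The multi-dimensional version of the derivative: a vector of slopes, one per input, that points in the direction of steepest ascent (so its negative points downhill).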


### 2.1 The Derivative: Sensitivity
Think of the derivative as **Sensitivity**.
If I change the input $x$ slightly, how much does the output $y$ change?

*   High derivative = Big change (Sensitive)
*   Zero derivative = Locally flat (a minimum, a maximum, or a plateau)
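
The "nudge the input and watch the output" idea can be tried directly. The sketch below estimates a derivative numerically with a central difference; `numerical_derivative` and the step size `h` are illustrative choices, not something the notebook defines elsewhere.

```python
def numerical_derivative(f, x, h=1e-5):
    """Estimate the slope of f at x by nudging x a little in both directions."""
    return (f(x + h) - f(x - h)) / (2 * h)

def f(x):
    return x**2  # exact derivative is 2x

for x in [-2.0, 0.0, 2.0]:
    slope = numerical_derivative(f, x)
    print(f"x = {x:+.1f}  ->  estimated slope = {slope:+.4f}  (exact: {2 * x:+.4f})")
```

At `x = 0` the estimated slope is essentially zero, which matches the flat bottom of the parabola.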


In [None]:
# Let's visualize a loss function (Error curve)
def loss_function(w):
    return w**2  # Simple parabola

w_values = np.linspace(-3, 3, 100)
loss_values = loss_function(w_values)

plt.figure(figsize=(10, 6))
plt.plot(w_values, loss_values, label='Loss Function $L(w) = w^2$')

# Pick a point
current_w = 2.0
current_loss = loss_function(current_w)
derivative = 2 * current_w  # d/dw (w^2) = 2w

# Visualize the slope (tangent line)
plt.plot(current_w, current_loss, 'ro', markersize=10, label='Current Weight')
plt.arrow(current_w, current_loss, -1, -derivative, head_width=0.2, color='red', label='Negative Gradient (Downhill)')

plt.title(f'Gradient at w={current_w} is {derivative}. To minimize loss, go LEFT!')
plt.xlabel('Weight (w)')
plt.ylabel('Error / Loss')
plt.legend()
plt.show()


### 2.2 Gradient Descent: Walking Down the Hill
This is how models learn.
1.  Start at a random place (random weights).
2.  Look around and find the steepest way down (compute Gradient).
3.  Take a small step in that direction (Update weights).
4.  Repeat until you hit the bottom (Minimum error).

> **Analogy**: You are lost on a mountain in dense fog. You can only feel the slope under your feet. To get to the village at the bottom, you simply feel which way goes down and take a step.
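
The four steps above can be sketched in a few lines for the simple loss $L(w) = w^2$. This is a minimal illustration, not a production optimizer: the starting point `w = 2.0` is fixed rather than random so the run is reproducible, and the `learning_rate` and number of steps are arbitrary choices.

```python
def loss(w):
    return w**2

def grad(w):
    return 2 * w  # derivative of w^2

w = 2.0              # step 1: starting weight (fixed here instead of random)
learning_rate = 0.1  # how big each step is (assumed value)
history = [w]

for step in range(50):
    g = grad(w)                  # step 2: which way is uphill?
    w = w - learning_rate * g    # step 3: take a small step the other way
    history.append(w)            # step 4: repeat

print(f"Weight after first step: {history[1]:.2f}")
print(f"Final weight: {w:.6f}, final loss: {loss(w):.10f}")
```

Try a `learning_rate` of `1.1`: each step then overshoots the minimum by more than it gained, and the weight moves further away instead of settling down.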


# 🎲 Part 3: Probability (Processing Uncertainty)
## "Predicting the Future in a Chaotic World"

The real world is messy. Data is noisy. Probability helps us build models that say "I am 90% sure this is a cat" rather than just "Cat".

*   **Expectation**: The "average" outcome.
*   **Variance**: How "spread out" or uncertain the data is.
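
As a small worked example, the sketch below computes both quantities for a fair six-sided die, first from the definitions and then from simulated rolls. The die and the sample size are illustrative choices.

```python
import numpy as np

outcomes = np.arange(1, 7)   # faces 1..6
probs = np.full(6, 1 / 6)    # fair die: each face equally likely

# Expectation: probability-weighted average of the outcomes (3.5 for a fair die)
expectation = np.sum(probs * outcomes)

# Variance: probability-weighted average squared distance from the expectation
variance = np.sum(probs * (outcomes - expectation) ** 2)

print(f"Expectation: {expectation:.4f}")
print(f"Variance:    {variance:.4f}")

# Simulated rolls should land close to the theoretical values
rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=100_000)
print(f"Simulated mean:     {rolls.mean():.4f}")
print(f"Simulated variance: {rolls.var():.4f}")
```

The simulated numbers will not match the theory exactly, but they get closer as the number of rolls grows.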


### 3.1 Normal Distribution (The Bell Curve)
Many quantities in nature (heights, measurement errors, noise) approximately follow a **Normal Distribution**.
It's defined by:
*   **Mean ($\mu$)**: The center.
*   **Standard Deviation ($\sigma$)**: The width.


In [None]:
# Generate data from a normal distribution
# Mean 0, Std Dev 1
data = np.random.randn(1000)

plt.figure(figsize=(10, 6))
sns.histplot(data, kde=True, color='purple', bins=30)
plt.title('The Normal Distribution (Bell Curve)')
plt.xlabel('Value')
plt.ylabel('Frequency')

# Mark the mean
plt.axvline(np.mean(data), color='red', linestyle='--', label=f'Mean: {np.mean(data):.2f}')
plt.legend()
plt.show()
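
To make the "width" role of $\sigma$ concrete, the sketch below uses `scipy.stats.norm.cdf` to ask how much probability falls within 1, 2, and 3 standard deviations of the mean for a standard normal distribution. The variable names are illustrative.

```python
from scipy import stats

within = {}
for k in (1, 2, 3):
    # Probability of landing between -k and +k standard deviations
    within[k] = stats.norm.cdf(k) - stats.norm.cdf(-k)
    print(f"Within {k} standard deviation(s): {within[k]:.4f}")
```

This is the well-known "68-95-99.7" rule of thumb: values more than 3 standard deviations from the mean are rare under a normal distribution.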


### 3.2 Bayes' Theorem: Updating Beliefs
This is the holy grail of reasoning. It tells us how to update our probability when we get new evidence.

$$ P(A|B) = \frac{P(B|A) \times P(A)}{P(B)} $$

*   **Prior $P(A)$**: What I believed *before* seeing data.
*   **Likelihood $P(B|A)$**: How likely the data is *if* my belief is true.
*   **Posterior $P(A|B)$**: What I believe *after* seeing data.

> **Example**: 
> *   Prior: "It probably won't rain" (It's summer).
> *   Evidence: "I see dark clouds".
> *   Posterior: "Okay, it might rain now" (Updated belief).
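
Here is the rain example with numbers plugged in. All three probabilities are invented purely for illustration; only the formula comes from Bayes' theorem.

```python
# Made-up numbers for illustration
p_rain = 0.10                 # Prior: P(rain) on a summer day
p_clouds_given_rain = 0.90    # Likelihood: P(dark clouds | rain)
p_clouds_given_no_rain = 0.20 # P(dark clouds | no rain)

# Total probability of seeing dark clouds, P(B)
p_clouds = (p_clouds_given_rain * p_rain
            + p_clouds_given_no_rain * (1 - p_rain))

# Bayes' theorem: P(rain | dark clouds)
posterior = p_clouds_given_rain * p_rain / p_clouds

print(f"P(clouds)        = {p_clouds:.2f}")
print(f"P(rain | clouds) = {posterior:.3f}")
```

With these numbers the belief in rain rises from 10% to about 33%: the clouds are real evidence, but because rain was unlikely to begin with, the updated belief is still well below 50%.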


# 🎓 Conclusion: You are Ready!
You now possess the three pillars of AI Math:
1.  **Linear Algebra**: To represent and transform data.
2.  **Calculus**: To optimize models so they learn from data.
3.  **Probability**: To handle uncertainty and reason about the world.

### What's Next?
*   Try changing the values in the code cells above.
*   Build a simple neural network using PyTorch or TensorFlow.
*   Remember: Math is a tool, not a barrier. You got this! 🚀
