# Understanding the Bias-Variance Tradeoff

## ⚖️ What is Bias and What is Variance?

![Dartboard showing different combinations of bias and variance as clusters of darts](images/bias_variance_dartboard.png)

*Think of it like throwing darts at a target!*

## 🎯 Understanding Bias

- **Bias:** The systematic error in your model's predictions: how far they are, on average, from the true values

- **High Bias:** The model makes overly strong assumptions and misses important patterns

- **Example:** Using a straight line to fit a curved relationship

- **Real-world:** Predicting house prices using only size (ignoring location, age, etc.)
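To make the straight-line example concrete, here is a minimal sketch. It assumes a hypothetical curved relationship, `y = x²` plus noise; the data and the names `X_curve`, `y_curve`, `line`, and `curve` are illustrative only. The straight line underfits even the data it was trained on, which is the signature of high bias.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical curved data (assumption for illustration): y = x^2 plus noise
rng = np.random.default_rng(0)
X_curve = np.linspace(-3, 3, 200).reshape(-1, 1)
y_curve = X_curve.ravel() ** 2 + rng.normal(0, 0.5, 200)

# A straight line cannot bend, so it misses the curve even on its own training data
line = LinearRegression().fit(X_curve, y_curve)
line_r2 = line.score(X_curve, y_curve)

# Adding a squared feature lets the same linear model follow the curve
curve = make_pipeline(PolynomialFeatures(degree=2), LinearRegression()).fit(X_curve, y_curve)
curve_r2 = curve.score(X_curve, y_curve)

print(f"Straight line R^2 (training data): {line_r2:.2f}")
print(f"Quadratic fit R^2 (training data): {curve_r2:.2f}")
```

More data would not help the straight line here: the error comes from the model's assumption, not from a shortage of examples.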

## 📊 Understanding Variance

- **Variance:** How much your model's predictions change when it is trained on a different sample of data

- **High Variance:** The model is too sensitive to the quirks and noise of its particular training set

- **Example:** A very deep decision tree that memorizes training examples

- **Real-world:** A model whose predictions change drastically when you add just one new data point
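One way to see variance directly is to retrain the same kind of model on many different training sets and check how much its predictions move. The sketch below assumes a hypothetical data-generating process (`y = 2x + sin(x)` plus noise); `prediction_spread` and `X_query` are illustrative names, not part of any library.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_query = np.linspace(0, 10, 50).reshape(-1, 1)  # fixed points where predictions are compared

def prediction_spread(make_model, n_rounds=50, n_samples=100):
    """Average standard deviation of predictions at X_query across fresh training sets."""
    preds = []
    for _ in range(n_rounds):
        # Hypothetical process (assumption): y = 2x + sin(x) + Gaussian noise
        X_train = rng.uniform(0, 10, size=(n_samples, 1))
        y_train = 2 * X_train.ravel() + np.sin(X_train.ravel()) + rng.normal(0, 0.5, n_samples)
        preds.append(make_model().fit(X_train, y_train).predict(X_query))
    # Spread across training sets at each query point, then averaged over query points
    return np.std(preds, axis=0).mean()

linear_spread = prediction_spread(LinearRegression)
tree_spread = prediction_spread(lambda: DecisionTreeRegressor(random_state=0))

print(f"Linear regression prediction spread: {linear_spread:.3f}")
print(f"Unrestricted tree prediction spread: {tree_spread:.3f}")
```

The unrestricted tree tracks the noise in each individual training set, so its predictions at the same query points vary far more than the linear model's.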

## 🎨 Visual Representation

![Four quadrants showing Low Bias/Low Variance, Low Bias/High Variance, High Bias/Low Variance, and High Bias/High Variance as scatter plots](images/bias_variance_comparison.png)

*The sweet spot: Low Bias + Low Variance = Good Model!*
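For squared-error loss, the picture corresponds to a standard decomposition of the expected prediction error. As a sketch, assume the data follow $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and let $\hat{f}$ be the fitted model, with expectations taken over training sets and noise. This notation is introduced here for illustration.

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
+ \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}}
+ \underbrace{\sigma^2}_{\text{Irreducible noise}}
$$

The noise term is the same for every model, so the only parts you can influence are bias and variance.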

## 💻 Code Example: Observing Bias & Variance

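```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Generate synthetic data: a linear trend plus a sine wave plus Gaussian noise
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 * X.ravel() + np.sin(X.ravel()) + np.random.normal(0, 0.5, 100)

# Hold out a test set: overfitting only shows up on data the model has not seen
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# High Bias Model (Linear Regression for non-linear data)
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)
print("Linear Model (High Bias) - train R^2:", linear_model.score(X_train, y_train))
print("Linear Model (High Bias) - test R^2:", linear_model.score(X_test, y_test))

# High Variance Model (Deep Decision Tree)
deep_tree = DecisionTreeRegressor(max_depth=10, random_state=42)
deep_tree.fit(X_train, y_train)
print("Deep Tree (High Variance) - train R^2:", deep_tree.score(X_train, y_train))
print("Deep Tree (High Variance) - test R^2:", deep_tree.score(X_test, y_test))
```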

## 🎯 Key Takeaway

- **The Goal:** Find the right balance between bias and variance

- Too simple → High Bias (underfitting)

- Too complex → High Variance (overfitting)

- Just right → Good generalization
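A common way to look for the balance is to sweep a complexity setting and compare training scores with validation scores. The sketch below is one possible approach rather than a prescribed recipe: it uses 5-fold cross-validation over tree depth on hypothetical synthetic data, and `X_demo`, `y_demo`, and `results` are illustrative names.

```python
import numpy as np
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data (assumption): y = 2x + sin(x) + Gaussian noise
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(200, 1))
y_demo = 2 * X_demo.ravel() + np.sin(X_demo.ravel()) + rng.normal(0, 0.5, 200)

results = {}
for depth in [1, 3, 5, None]:  # None lets the tree grow until every leaf is pure
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    scores = cross_validate(tree, X_demo, y_demo, cv=5, return_train_score=True)
    results[depth] = (scores["train_score"].mean(), scores["test_score"].mean())
    train_r2, val_r2 = results[depth]
    print(f"max_depth={depth}: train R^2={train_r2:.3f}, validation R^2={val_r2:.3f}")
```

A low score on both training and validation data points to underfitting. A near-perfect training score with a noticeably lower validation score points to overfitting.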

*Question: Can you think of a real-world example where high bias would be problematic?*