# 📘 Day 2: Advanced Classification Algorithms

**🎯 Goal:** Master advanced classifiers: SVM, KNN, and Naive Bayes

**⏱️ Time:** 60-90 minutes

**🌟 Why This Matters for AI:**
- **SVM** powered image classification in computer vision before deep learning took over
- **KNN** is used in recommendation systems and anomaly detection
- **Naive Bayes** is the engine behind spam filters and text classification
- These algorithms complement Transformers in hybrid AI systems (2024-2025)
- Foundation for understanding how multimodal AI classifies different data types
- Critical for Agentic AI systems that need fast, interpretable decisions

---

## 📚 Quick Recap: Day 1

Yesterday we learned:
- **Logistic Regression** → Linear decision boundaries
- **Decision Trees** → Rule-based decisions
- **Random Forests** → Ensemble of trees

Today we'll explore:
- **Support Vector Machines (SVM)** → Find the best boundary
- **K-Nearest Neighbors (KNN)** → Vote by neighbors
- **Naive Bayes** → Probability-based classification

Let's dive in! 🚀

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Advanced classifiers
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# For visualization
from sklearn.decomposition import PCA
from matplotlib.colors import ListedColormap

# Make plots beautiful
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("✅ Advanced AI libraries loaded!")
print("Let's build some powerful classifiers! 🎯")

## 1️⃣ Support Vector Machines (SVM)

**What it does:** Finds the BEST line/boundary to separate categories

**How it works:**
- Imagine points on a graph: red dots vs blue dots
- Draw a line to separate them
- SVM finds the line with the **maximum margin** (widest gap) between classes

**Visual Analogy:**
```
Red dots: ●  ●  ●          |          ○  ○  ○ :Blue dots
              ●  ●         |         ○  ○
                ●          |          ○
                     [Max Margin]
                           ↑
                    Decision Boundary
```

**Best for:**
- High-dimensional data (many features)
- Clear separation between classes
- Works well with small datasets

**🎯 Real AI Use Cases (2024-2025):**
- **Image classification basics** (facial recognition, object detection)
- **Text categorization** in RAG document retrieval
- **Anomaly detection** for AI safety systems
- **Hybrid AI** combining SVM with Transformers for efficiency
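
To make the maximum-margin idea concrete, here is a minimal sketch that fits a linear SVM on a tiny made-up dataset and reads the margin off the fitted model. The toy clusters, the seed, and the variable names are assumptions for illustration only; `coef_`, `intercept_`, and `support_vectors_` are attributes scikit-learn exposes on a linear `SVC`. For a linear SVM with weight vector `w`, the margin width is `2 / ||w||`.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (an assumption for illustration): two well-separated 2D clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, size=(20, 2)),
               rng.normal(2, 0.5, size=(20, 2))])
y = np.array([0] * 20 + [1] * 20)

clf = SVC(kernel="linear", C=1.0).fit(X, y)

w = clf.coef_[0]        # normal vector of the separating line
b = clf.intercept_[0]   # offset of the line
margin_width = 2 / np.linalg.norm(w)

print(f"Boundary: {w[0]:.2f}*x1 + {w[1]:.2f}*x2 + {b:.2f} = 0")
print(f"Margin width: {margin_width:.2f}")
print(f"Support vectors: {len(clf.support_vectors_)} of {len(X)} points")
```

Only the support vectors (the points closest to the boundary) decide where the line goes. Nudging any other point slightly would leave the boundary unchanged.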

In [None]:
# Create a 2D dataset for visualization
np.random.seed(42)

# Class 1: Cluster in bottom-left
X_class1 = np.random.randn(100, 2) + np.array([-2, -2])
y_class1 = np.zeros(100)

# Class 2: Cluster in top-right
X_class2 = np.random.randn(100, 2) + np.array([2, 2])
y_class2 = np.ones(100)

# Combine
X_2d = np.vstack([X_class1, X_class2])
y_2d = np.hstack([y_class1, y_class2])

# Visualize
plt.figure(figsize=(10, 6))
plt.scatter(X_2d[y_2d == 0, 0], X_2d[y_2d == 0, 1], 
            c='red', marker='o', label='Class 0', alpha=0.6, s=50)
plt.scatter(X_2d[y_2d == 1, 0], X_2d[y_2d == 1, 1], 
            c='blue', marker='s', label='Class 1', alpha=0.6, s=50)
plt.xlabel('Feature 1', fontsize=12)
plt.ylabel('Feature 2', fontsize=12)
plt.title('🎯 SVM Dataset: Can we separate these classes?', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("📊 Dataset created with 2 clearly separable classes!")

In [None]:
# Split data
X_train_svm, X_test_svm, y_train_svm, y_test_svm = train_test_split(
    X_2d, y_2d, test_size=0.2, random_state=42
)

# Scale features (important for SVM!)
scaler = StandardScaler()
X_train_svm_scaled = scaler.fit_transform(X_train_svm)
X_test_svm_scaled = scaler.transform(X_test_svm)

# Train SVM with linear kernel
svm_linear = SVC(kernel='linear', random_state=42)
svm_linear.fit(X_train_svm_scaled, y_train_svm)

# Predictions
y_pred_svm = svm_linear.predict(X_test_svm_scaled)
accuracy_svm = accuracy_score(y_test_svm, y_pred_svm)

print("🎯 Support Vector Machine Results:")
print(f"Accuracy: {accuracy_svm:.2%}")
print("\n📊 Classification Report:")
print(classification_report(y_test_svm, y_pred_svm, target_names=['Class 0', 'Class 1']))

In [None]:
# Visualize decision boundary
def plot_decision_boundary(X, y, model, title):
    h = 0.02  # step size in mesh
    
    # Create mesh
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    
    # Predict on mesh
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    
    # Plot
    plt.figure(figsize=(10, 6))
    plt.contourf(xx, yy, Z, alpha=0.3, cmap=ListedColormap(['#FFAAAA', '#AAAAFF']))
    plt.scatter(X[y == 0, 0], X[y == 0, 1], c='red', marker='o', 
                label='Class 0', edgecolors='k', s=50)
    plt.scatter(X[y == 1, 0], X[y == 1, 1], c='blue', marker='s', 
                label='Class 1', edgecolors='k', s=50)
    plt.xlabel('Feature 1', fontsize=12)
    plt.ylabel('Feature 2', fontsize=12)
    plt.title(title, fontsize=14, fontweight='bold')
    plt.legend()
    plt.tight_layout()
    plt.show()

plot_decision_boundary(X_train_svm_scaled, y_train_svm, svm_linear, 
                       '🎯 SVM Decision Boundary (Linear Kernel)')

print("📊 The shaded regions show where SVM predicts each class!")
print("The line in the middle is the decision boundary with maximum margin.")

### 🔥 SVM Kernels: Handling Complex Patterns

What if data isn't linearly separable? Use **kernels**!

**Kernels implicitly map data into a higher-dimensional space, where a straight boundary can separate the classes:**
- **Linear:** Straight line (what we just did)
- **RBF (Radial Basis Function):** Curved boundaries (most popular)
- **Polynomial:** Curved boundaries whose complexity grows with the polynomial degree

Think of it like:
- Can't separate with a flat sheet of paper? → Bend the paper! (RBF kernel)
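
As a quick illustration of why the kernel choice matters, the sketch below scores three kernels on a ring-shaped toy dataset using 5-fold cross-validation. The dataset parameters, the `degree=2` setting for the polynomial kernel, and the `scores` dictionary are assumptions chosen for this example, not recommendations.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Two concentric rings: no straight line can split them
X, y = make_circles(n_samples=300, noise=0.1, factor=0.5, random_state=0)

kernels = {
    "linear": SVC(kernel="linear"),
    "poly": SVC(kernel="poly", degree=2),  # degree 2 is an illustrative choice
    "rbf": SVC(kernel="rbf"),
}

scores = {}
for name, clf in kernels.items():
    # Mean accuracy over 5 cross-validation folds
    scores[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name:6s} kernel: {scores[name]:.2%}")
```

On data like this, the linear kernel should hover near chance level while the RBF kernel should do far better, because only a curved boundary can wrap around the inner ring.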

In [None]:
# Create non-linearly separable data (circles)
from sklearn.datasets import make_circles

X_circles, y_circles = make_circles(n_samples=300, noise=0.1, factor=0.5, random_state=42)

# Visualize
plt.figure(figsize=(10, 6))
plt.scatter(X_circles[y_circles == 0, 0], X_circles[y_circles == 0, 1],
            c='red', marker='o', label='Class 0', alpha=0.6, s=50)
plt.scatter(X_circles[y_circles == 1, 0], X_circles[y_circles == 1, 1],
            c='blue', marker='s', label='Class 1', alpha=0.6, s=50)
plt.title('🔵 Non-Linear Dataset: Circles', fontsize=14, fontweight='bold')
plt.legend()
plt.tight_layout()
plt.show()

print("❌ A straight line can't separate these circles!")
print("✅ But SVM with RBF kernel can! Let's see...")

In [None]:
# Split and scale
X_train_circ, X_test_circ, y_train_circ, y_test_circ = train_test_split(
    X_circles, y_circles, test_size=0.2, random_state=42
)

scaler_circ = StandardScaler()
X_train_circ_scaled = scaler_circ.fit_transform(X_train_circ)
X_test_circ_scaled = scaler_circ.transform(X_test_circ)

# Train SVM with RBF kernel
svm_rbf = SVC(kernel='rbf', random_state=42)
svm_rbf.fit(X_train_circ_scaled, y_train_circ)

# Evaluate
y_pred_rbf = svm_rbf.predict(X_test_circ_scaled)
accuracy_rbf = accuracy_score(y_test_circ, y_pred_rbf)

print("🎯 SVM with RBF Kernel:")
print(f"Accuracy: {accuracy_rbf:.2%}")

# Visualize
plot_decision_boundary(X_train_circ_scaled, y_train_circ, svm_rbf,
                       '🎯 SVM with RBF Kernel: Curved Decision Boundary')

print("\n🌟 Amazing! SVM separated the circles with a curved boundary!")

## 2️⃣ K-Nearest Neighbors (KNN)

**What it does:** Classifies based on the majority vote of nearby neighbors

**How it works:**
1. Given a new point, find the K closest training points
2. Count their labels
3. Majority vote wins!

**Analogy:**
"Tell me who your friends are, and I'll tell you who you are!"
- If K=5, look at 5 nearest neighbors
- If 3 are "red" and 2 are "blue" → Predict "red"

**Best for:**
- Simple and intuitive
- No training required (lazy learning)
- Works well with small datasets

**🎯 Real AI Use Cases (2024-2025):**
- **Recommendation systems** (Netflix, Spotify: find similar users)
- **Anomaly detection** (fraud detection, cybersecurity)
- **Semantic search** in RAG systems (find similar documents)
- **Image similarity** in multimodal AI
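
The three steps of KNN (find the K closest points, count their labels, take the majority) fit in a few lines of NumPy. This is a minimal from-scratch sketch for intuition, assuming Euclidean distance; the function name `knn_predict` and the six toy points are made up for this example, and scikit-learn's `KNeighborsClassifier` is the practical choice for real work.

```python
from collections import Counter

import numpy as np

def knn_predict(X_train, y_train, x_new, k=5):
    """Classify x_new by majority vote among its k nearest training points."""
    # Step 1: Euclidean distance from x_new to every training point
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Step 2: indices of the k closest points
    nearest = np.argsort(distances)[:k]
    # Step 3: majority vote over their labels
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]

# Toy training set (hypothetical): three "red" points and three "blue" points
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [4.0, 4.0], [4.2, 3.9], [3.8, 4.1]])
y_train = np.array(["red", "red", "red", "blue", "blue", "blue"])

pred_a = knn_predict(X_train, y_train, np.array([1.1, 1.0]), k=3)
pred_b = knn_predict(X_train, y_train, np.array([3.9, 4.0]), k=3)
print(pred_a)  # red
print(pred_b)  # blue
```

Notice there is no `fit` step: all the work happens at prediction time, which is why KNN is called a lazy learner and why predictions get slow on large datasets.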

In [None]:
# Create dataset
np.random.seed(42)

# Generate data
X_knn = np.vstack([
    np.random.randn(50, 2) + [2, 2],   # Class 0
    np.random.randn(50, 2) + [-2, -2]  # Class 1
])
y_knn = np.array([0] * 50 + [1] * 50)

# Split
X_train_knn, X_test_knn, y_train_knn, y_test_knn = train_test_split(
    X_knn, y_knn, test_size=0.2, random_state=42
)

# Scale (KNN is distance-based, so scaling is important!)
scaler_knn = StandardScaler()
X_train_knn_scaled = scaler_knn.fit_transform(X_train_knn)
X_test_knn_scaled = scaler_knn.transform(X_test_knn)

# Try different values of K
k_values = [1, 3, 5, 10, 20]
results_knn = {}

print("🔍 Testing different K values:\n")
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train_knn_scaled, y_train_knn)
    accuracy = accuracy_score(y_test_knn, knn.predict(X_test_knn_scaled))
    results_knn[k] = accuracy
    print(f"K={k:2d} → Accuracy: {accuracy:.2%}")

# Visualize impact of K
plt.figure(figsize=(10, 6))
plt.plot(list(results_knn.keys()), list(results_knn.values()), 
         marker='o', linewidth=2, markersize=10)
plt.xlabel('K (Number of Neighbors)', fontsize=12)
plt.ylabel('Accuracy', fontsize=12)
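plt.title('🎯 KNN: Impact of K on Accuracy', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Pick the K with the highest test accuracy (ties go to the smallest K tried).
# Note: selecting K on the test set is fine for a demo, but it makes the reported accuracy optimistic.
best_k = max(results_knn, key=results_knn.get)
print(f"\n🏆 Best K value: {best_k} with {results_knn[best_k]:.2%} accuracy")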

In [None]:
# Visualize decision boundary with best K
knn_best = KNeighborsClassifier(n_neighbors=best_k)
knn_best.fit(X_train_knn_scaled, y_train_knn)

plot_decision_boundary(X_train_knn_scaled, y_train_knn, knn_best,
                       f'🎯 KNN Decision Boundary (K={best_k})')

print(f"📊 Each point is classified based on its {best_k} nearest neighbors!")
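
One common way to choose K without peeking at the test set is k-fold cross-validation. The sketch below is one possible setup, not the only one: the overlapping toy clusters, the seed, and the names `cv_scores` and `best_k_cv` are assumptions for illustration. Wrapping the scaler and the classifier in a pipeline means scaling is re-fit inside each fold.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data (an assumption): two overlapping clusters, so the choice of K matters
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(1, 1.5, size=(100, 2)),
               rng.normal(-1, 1.5, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

cv_scores = {}
for k in [1, 3, 5, 10, 20]:
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    # Mean accuracy across 5 folds of the training data
    cv_scores[k] = cross_val_score(model, X, y, cv=5).mean()
    print(f"K={k:2d} → CV accuracy: {cv_scores[k]:.2%}")

best_k_cv = max(cv_scores, key=cv_scores.get)
print(f"Best K by cross-validation: {best_k_cv}")
```

Very small K tends to follow noise in the data, while very large K blurs the boundary between classes, so the best value usually sits somewhere in between.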

## 3️⃣ Naive Bayes

**What it does:** Uses probability and Bayes' Theorem to classify

**How it works:**
- Calculates: "What's the probability this email is spam, given these words?"
- Uses Bayes' Theorem: P(Spam|Words) = P(Words|Spam) × P(Spam) / P(Words)

**Why "Naive"?**
- Assumes features are independent (naive assumption)
- Example: Assumes "free" and "money" appear independently
- In reality, they often appear together in spam!
- But it works surprisingly well anyway!

**Best for:**
- Text classification (spam, sentiment)
- Fast training and prediction
- Works well with high-dimensional data

**🎯 Real AI Use Cases (2024-2025):**
- **Email spam filtering** (the classic approach behind Bayesian spam filters)
- **Sentiment analysis** for social media monitoring
- **Document classification** in RAG retrieval
- **Real-time classification** in streaming AI systems
- **Text preprocessing** for Transformer models
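
Bayes' Theorem is easier to trust once you run the arithmetic by hand. The sketch below uses made-up probabilities (every number is an assumption for illustration, not measured from real mail) to compute P(Spam | "free"), then applies the naive independence assumption to combine two words.

```python
# Made-up numbers for illustration (assumptions, not real statistics)
p_spam = 0.40
p_ham = 1 - p_spam
p_free_given_spam, p_free_given_ham = 0.60, 0.05
p_money_given_spam, p_money_given_ham = 0.50, 0.02

# One word: P(Spam | "free") = P("free" | Spam) * P(Spam) / P("free")
p_free = p_free_given_spam * p_spam + p_free_given_ham * p_ham
p_spam_given_free = p_free_given_spam * p_spam / p_free
print(f'P(Spam | "free") = {p_spam_given_free:.1%}')  # 88.9%

# Two words: the naive assumption simply multiplies the per-word likelihoods
spam_score = p_spam * p_free_given_spam * p_money_given_spam
ham_score = p_ham * p_free_given_ham * p_money_given_ham
p_spam_given_both = spam_score / (spam_score + ham_score)
print(f'P(Spam | "free", "money") = {p_spam_given_both:.1%}')  # 99.5%
```

Multiplying per-word likelihoods is exactly where the "naive" part comes in: it treats "free" and "money" as independent given the class, even though real emails rarely behave that way.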

In [None]:
# Create spam detection dataset (similar to Day 1)
np.random.seed(42)

n_emails = 1000

spam_data = {
    'word_freq_free': np.concatenate([
        np.random.exponential(2, 400),   # Spam
        np.random.exponential(0.3, 600)  # Not spam
    ]),
    'word_freq_money': np.concatenate([
        np.random.exponential(1.8, 400),
        np.random.exponential(0.2, 600)
    ]),
    'word_freq_winner': np.concatenate([
        np.random.exponential(1.5, 400),
        np.random.exponential(0.1, 600)
    ]),
    'word_freq_click': np.concatenate([
        np.random.exponential(1.2, 400),
        np.random.exponential(0.15, 600)
    ]),
    'exclamation_marks': np.concatenate([
        np.random.poisson(4, 400),
        np.random.poisson(0.8, 600)
    ]),
    'is_spam': [1] * 400 + [0] * 600
}

spam_df = pd.DataFrame(spam_data)
spam_df = spam_df.sample(frac=1, random_state=42).reset_index(drop=True)

print("📧 Spam Detection Dataset:")
print(spam_df.head())
print(f"\nSpam: {spam_df['is_spam'].sum()}, Not Spam: {(spam_df['is_spam'] == 0).sum()}")

In [None]:
# Prepare data
X_spam = spam_df.drop('is_spam', axis=1)
y_spam = spam_df['is_spam']

X_train_spam, X_test_spam, y_train_spam, y_test_spam = train_test_split(
    X_spam, y_spam, test_size=0.2, random_state=42
)

# Train Gaussian Naive Bayes (for continuous features)
nb = GaussianNB()
nb.fit(X_train_spam, y_train_spam)

# Predictions
y_pred_nb = nb.predict(X_test_spam)
y_pred_proba_nb = nb.predict_proba(X_test_spam)[:, 1]

# Evaluate
accuracy_nb = accuracy_score(y_test_spam, y_pred_nb)

print("🎯 Naive Bayes Results:")
print(f"Accuracy: {accuracy_nb:.2%}")
print("\n📊 Classification Report:")
print(classification_report(y_test_spam, y_pred_nb, target_names=['Not Spam', 'Spam']))

# Show probability predictions
print("\n🔍 Sample Predictions with Probabilities:")
for i in range(5):
    actual = "Spam" if y_test_spam.iloc[i] == 1 else "Not Spam"
    predicted = "Spam" if y_pred_nb[i] == 1 else "Not Spam"
    prob_spam = y_pred_proba_nb[i]
    print(f"Email {i+1}: Actual={actual:8s} | Predicted={predicted:8s} | P(Spam)={prob_spam:.1%}")
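
Gaussian Naive Bayes suits continuous features. For raw word counts, the multinomial variant is the usual choice. Here is a minimal sketch on eight made-up emails; the texts, labels, and variable names are hypothetical, and a real filter would need far more data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hypothetical corpus: 1 = spam, 0 = not spam
emails = [
    "win free money now",
    "free prize click here",
    "claim your free money",
    "winner click to claim prize",
    "meeting agenda for monday",
    "lunch with the team tomorrow",
    "project report attached",
    "notes from the monday meeting",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# CountVectorizer turns text into word counts; MultinomialNB models those counts
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)

new_emails = ["free money prize", "monday project meeting"]
preds = model.predict(new_emails)
probs = model.predict_proba(new_emails)[:, 1]

for text, pred, prob in zip(new_emails, preds, probs):
    label = "Spam" if pred == 1 else "Not Spam"
    print(f"{text!r:28s} → {label:8s} | P(Spam)={prob:.1%}")
```

The pipeline keeps vectorizing and classifying together, so new text can be passed in as plain strings.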

## 🌟 Real AI Example: Image Classification Basics

Let's use these algorithms for **basic image classification** - a foundation for computer vision!

**Scenario:** Classify handwritten digits (0-9) - the "Hello World" of image AI

We'll use scikit-learn's **digits dataset**, a small MNIST-like collection of 8×8 pixel images.

**🎯 Why This Matters:**
- Foundation for OCR (Optical Character Recognition)
- Used in document processing for RAG systems
- Building block for multimodal AI (text + images)
- Understanding before jumping to deep learning

In [None]:
# Load digits dataset
from sklearn.datasets import load_digits

digits = load_digits()
X_digits = digits.data  # 8x8 images flattened to 64 features
y_digits = digits.target  # Labels 0-9

print("🖼️ Digits Dataset Loaded!")
print(f"Number of images: {len(X_digits)}")
print(f"Image size: 8x8 pixels = {X_digits.shape[1]} features")
print(f"Classes: {np.unique(y_digits)}")

# Visualize some digits
fig, axes = plt.subplots(2, 5, figsize=(12, 5))
fig.suptitle('🔢 Sample Handwritten Digits', fontsize=14, fontweight='bold')

for i, ax in enumerate(axes.flat):
    ax.imshow(digits.images[i], cmap='gray')
    ax.set_title(f'Label: {digits.target[i]}')
    ax.axis('off')

plt.tight_layout()
plt.show()

In [None]:
# Split data
X_train_dig, X_test_dig, y_train_dig, y_test_dig = train_test_split(
    X_digits, y_digits, test_size=0.2, random_state=42
)

# Scale features
scaler_dig = StandardScaler()
X_train_dig_scaled = scaler_dig.fit_transform(X_train_dig)
X_test_dig_scaled = scaler_dig.transform(X_test_dig)

# Train all classifiers
print("🚀 Training Advanced Classifiers on Digit Images...\n")

classifiers = {
    'SVM (Linear)': SVC(kernel='linear', random_state=42),
    'SVM (RBF)': SVC(kernel='rbf', random_state=42),
    'KNN (K=5)': KNeighborsClassifier(n_neighbors=5),
    'Naive Bayes': GaussianNB()
}

results_digits = {}

for name, clf in classifiers.items():
    # Train
    clf.fit(X_train_dig_scaled, y_train_dig)
    
    # Predict
    y_pred = clf.predict(X_test_dig_scaled)
    
    # Evaluate
    accuracy = accuracy_score(y_test_dig, y_pred)
    results_digits[name] = accuracy
    
    print(f"{name:20s} → Accuracy: {accuracy:.2%}")

print("\n🏆 Best Model:", max(results_digits, key=results_digits.get))

In [None]:
# Compare all models visually
plt.figure(figsize=(12, 6))
models = list(results_digits.keys())
accuracies = list(results_digits.values())

bars = plt.bar(models, accuracies, color=['#3498db', '#e74c3c', '#2ecc71', '#f39c12'])
plt.ylabel('Accuracy', fontsize=12)
plt.title('🎯 Image Classification: Algorithm Comparison', fontsize=14, fontweight='bold')
plt.xticks(rotation=15)
plt.ylim(min(0.8, min(accuracies) - 0.05), 1.05)  # keep every bar and its label visible

# Add accuracy labels
for bar in bars:
    height = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2., height,
             f'{height:.1%}',
             ha='center', va='bottom', fontsize=11, fontweight='bold')

plt.tight_layout()
plt.show()

print("\n📊 All algorithms perform well on this image classification task!")
print("SVM with RBF kernel often wins for image data.")
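
Accuracy alone hides which digits get mixed up. A confusion matrix shows this directly. The sketch below is self-contained and assumes the same split settings used for the digits comparison (`test_size=0.2`, `random_state=42`); the names `cm` and `errors` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on training data only, then reuse it for the test data
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf").fit(scaler.transform(X_train), y_train)

# Rows are true digits, columns are predicted digits
cm = confusion_matrix(y_test, clf.predict(scaler.transform(X_test)))
print(cm)

# Zero out the diagonal so only mistakes remain, then find the largest entry
errors = cm.copy()
np.fill_diagonal(errors, 0)
true_digit, predicted_digit = np.unravel_index(errors.argmax(), errors.shape)
print(f"Most common mistake: true {true_digit} predicted as "
      f"{predicted_digit} ({errors.max()} times)")
```

Each row sums to the number of test images of that digit, so a row with large off-diagonal values points to a digit the model struggles with.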

In [None]:
# Demo: Show some predictions
best_model = SVC(kernel='rbf', random_state=42)
best_model.fit(X_train_dig_scaled, y_train_dig)
predictions = best_model.predict(X_test_dig_scaled)

# Visualize predictions
fig, axes = plt.subplots(2, 5, figsize=(12, 5))
fig.suptitle('🎯 SVM Predictions on Test Images', fontsize=14, fontweight='bold')

for i, ax in enumerate(axes.flat):
    # Reshape back to 8x8 for visualization
    img = X_test_dig[i].reshape(8, 8)
    ax.imshow(img, cmap='gray')
    
    actual = y_test_dig.iloc[i] if isinstance(y_test_dig, pd.Series) else y_test_dig[i]
    predicted = predictions[i]
    
    color = 'green' if actual == predicted else 'red'
    ax.set_title(f'Pred: {predicted}, True: {actual}', color=color, fontsize=10)
    ax.axis('off')

plt.tight_layout()
plt.show()

print("✅ Green = Correct prediction")
print("❌ Red = Incorrect prediction")

## 🎯 YOUR TURN: Exercise 1

**Challenge:** Build a wine classifier!

**Scenario:** Classify wine as cultivar 0 or "other" (cultivars 1 and 2) based on chemical properties

**Your Task:**
1. Use the wine dataset below
2. Train SVM, KNN, and Naive Bayes
3. Compare their accuracy
4. Which works best for this problem?

Experiment and learn! 🍷

In [None]:
# Wine dataset (three cultivars; the target is the cultivar, not a quality score)
from sklearn.datasets import load_wine

wine = load_wine()
X_wine = wine.data
y_wine = (wine.target > 0).astype(int)  # Binary: 0 vs (1 or 2)

print("🍷 Wine Dataset:")
print(f"Samples: {len(X_wine)}")
print(f"Features: {wine.feature_names}")
print("Classes: 0 (Type 0), 1 (Type 1 or 2)")
print(f"\nClass distribution: {np.bincount(y_wine)}")

In [None]:
# YOUR CODE HERE!
# Hint: Follow the pattern from digits classification

# Step 1: Split data
# YOUR CODE

# Step 2: Scale features
# YOUR CODE

# Step 3: Train classifiers (SVM, KNN, Naive Bayes)
# YOUR CODE

# Step 4: Compare results
# YOUR CODE

<details>
<summary>📖 Click for Solution</summary>

```python
# Step 1: Split
X_train_wine, X_test_wine, y_train_wine, y_test_wine = train_test_split(
    X_wine, y_wine, test_size=0.2, random_state=42
)

# Step 2: Scale
scaler_wine = StandardScaler()
X_train_wine_scaled = scaler_wine.fit_transform(X_train_wine)
X_test_wine_scaled = scaler_wine.transform(X_test_wine)

# Step 3: Train
wine_classifiers = {
    'SVM': SVC(kernel='rbf', random_state=42),
    'KNN': KNeighborsClassifier(n_neighbors=5),
    'Naive Bayes': GaussianNB()
}

wine_results = {}
for name, clf in wine_classifiers.items():
    clf.fit(X_train_wine_scaled, y_train_wine)
    acc = accuracy_score(y_test_wine, clf.predict(X_test_wine_scaled))
    wine_results[name] = acc
    print(f"{name}: {acc:.2%}")
```
</details>
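
With only 178 samples, a single train/test split can be noisy. As an optional extension, this sketch scores an RBF SVM with and without scaling using 5-fold cross-validation. It assumes the same binary target as the exercise; the names `unscaled_acc` and `scaled_acc` are illustrative.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

wine = load_wine()
X, y = wine.data, (wine.target > 0).astype(int)  # same binary target as the exercise

# Without scaling: features with large ranges dominate the distance calculation
unscaled_acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()

# With scaling inside a pipeline: the scaler is re-fit on each training fold
scaled_model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scaled_acc = cross_val_score(scaled_model, X, y, cv=5).mean()

print(f"SVM without scaling: {unscaled_acc:.2%}")
print(f"SVM with scaling:    {scaled_acc:.2%}")
```

Putting the scaler inside the pipeline keeps test-fold statistics out of training, which is the same reason the notebook calls `fit_transform` on the training set and only `transform` on the test set.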

## 📊 Algorithm Comparison Summary

| Algorithm | Strengths | Weaknesses | Best For |
|-----------|-----------|------------|----------|
| **SVM** | Excellent for high dimensions, handles non-linear patterns with kernels | Slow on large datasets, needs feature scaling | Image classification, text categorization |
| **KNN** | Simple, no training needed, works for any data shape | Slow predictions, sensitive to scale | Recommendation systems, anomaly detection |
| **Naive Bayes** | Very fast, works with high dimensions | Assumes independence (rarely true) | Spam filtering, sentiment analysis |

**🎯 When to Use Each (2024-2025 AI Context):**

- **SVM**: Pre-processing for Transformers, hybrid AI systems, when you need high accuracy
- **KNN**: Similarity search in RAG systems, real-time recommendations
- **Naive Bayes**: Fast text classification, streaming data, resource-constrained systems

## 🎓 Key Takeaways

**Today you mastered:**

1. **Support Vector Machines (SVM)**
   - ✅ Finds optimal decision boundary
   - ✅ Kernels handle non-linear patterns
   - ✅ Great for images and high-dimensional data
   - **Use in:** Image classification, text categorization, RAG document filtering

2. **K-Nearest Neighbors (KNN)**
   - ✅ Simple and intuitive
   - ✅ No training phase
   - ❌ Slow predictions on large data
   - **Use in:** Recommendation systems, similarity search, anomaly detection

3. **Naive Bayes**
   - ✅ Super fast training and prediction
   - ✅ Works great for text
   - ✅ Provides probabilities
   - **Use in:** Spam filters, sentiment analysis, document classification

**🌟 Real-World Impact:**
- Naive Bayes is the classic algorithm behind Bayesian email spam filters
- Netflix recommendations use KNN-like approaches
- Early face recognition used SVM (before deep learning)
- Modern RAG systems combine these for document retrieval

## 🚀 Next Steps

**Practice Ideas:**
1. Try different SVM kernels (linear, RBF, polynomial) on the digits dataset
2. Experiment with different K values in KNN
3. Compare Naive Bayes with yesterday's Logistic Regression

**Coming Tomorrow:**
- **Day 3:** Regression Algorithms (predict continuous values!)
  - Linear Regression deep dive
  - Polynomial Regression
  - Ridge and Lasso regularization
  - Real AI examples: price prediction, forecasting

---

**🎉 Amazing Work!** You now know 6 powerful classification algorithms!

**💬 Pro Tip:** In real AI projects (2024-2025), you often:
1. Start with simple models (Logistic Regression, Naive Bayes)
2. Try ensemble methods (Random Forest)
3. Use SVM for complex patterns
4. Finally, deep learning for maximum accuracy

---

*These "traditional" algorithms are still essential in modern AI systems - they're faster, more interpretable, and on small or structured datasets often perform just as well as complex models!* 🌟