# 🎛️ Hyperparameter Importance Analysis

<div style="background-color: #e8f5e9; padding: 15px; border-radius: 5px; border-left: 5px solid #4caf50;">
<b>üìì Notebook Information</b><br>
<b>Level:</b> Intermediate-Advanced<br>
<b>Estimated Time:</b> 25 minutes<br>
<b>Prerequisites:</b> 01_tests_introduction.ipynb<br>
<b>Dataset:</b> Diabetes (sklearn)
</div>

---

## 🎯 Learning Objectives

By the end of this notebook, you will be able to:
- ✅ Understand hyperparameter importance vs feature importance
- ✅ Use Optuna for Bayesian optimization
- ✅ Analyze which hyperparameters matter most
- ✅ Test hyperparameter sensitivity
- ✅ Know which hyperparameters to tune (and which to ignore)
- ✅ Save time by focusing on important hyperparameters

---

## 📚 Table of Contents

1. [Introduction](#intro)
2. [Setup](#setup)
3. [Baseline Model](#baseline)
4. [Hyperparameter Tuning with Optuna](#optuna)
5. [Importance Analysis](#importance)
6. [Sensitivity Testing](#sensitivity)
7. [Conclusion](#conclusion)

<a id="intro"></a>
## 1. 📖 Introduction

### What is Hyperparameter Importance?

> **Hyperparameter Importance** tells you which hyperparameters have the biggest impact on model performance.

### Why Does This Matter?

**The Problem:**
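```python
from sklearn.ensemble import RandomForestRegressor

# RandomForest exposes close to 20 constructor parameters!
# Values shown are the scikit-learn defaults.
model = RandomForestRegressor(
    n_estimators=100,       # ← Important?
    max_depth=None,         # ← Important?
    min_samples_split=2,    # ← Important?
    min_samples_leaf=1,     # ← Important?
    max_features=1.0,       # ← Important?
    # ... plus a dozen more (criterion, bootstrap, ccp_alpha, ...)
)
```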

**The Reality:**
- 😵 **Too many to tune** - Grid search explodes combinatorially
- ⏱️ **Wastes time** - Tuning unimportant params doesn't help
- 💰 **Costs money** - Cloud compute isn't free

**The Solution:**
- 🎯 **Focus on what matters** - Tune important params, use defaults for rest
- ⚡ **Faster optimization** - Fewer dimensions = faster convergence
- 📊 **Better understanding** - Know your model's sensitivities
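
To make the combinatorial point concrete, here is a minimal sketch that counts grid-search configurations. The per-parameter value counts are assumptions chosen for illustration (six candidates for each integer hyperparameter, three for `max_features`), not numbers from a real experiment.

```python
from math import prod

# Hypothetical number of candidate values per hyperparameter (illustrative only).
grid_sizes = {
    "n_estimators": 6,
    "max_depth": 6,
    "min_samples_split": 6,
    "min_samples_leaf": 6,
    "max_features": 3,
}


def grid_size(sizes):
    """Number of configurations in a full grid: the product of per-parameter counts."""
    return prod(sizes.values())


full_grid = grid_size(grid_sizes)

# Keep only two hyperparameters and leave the rest at their defaults.
top_two = {name: grid_sizes[name] for name in ("max_depth", "n_estimators")}
reduced_grid = grid_size(top_two)

cv_folds = 5  # each configuration is fitted once per fold
print(f"Full grid:  {full_grid} configs, {full_grid * cv_folds} model fits")
print(f"Top-2 grid: {reduced_grid} configs, {reduced_grid * cv_folds} model fits")
```

Every extra hyperparameter multiplies the grid, so dropping the unimportant ones shrinks the search far more than trimming a few values from one range.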

### Feature Importance vs Hyperparameter Importance

| Aspect | Feature Importance | Hyperparameter Importance |
|--------|-------------------|---------------------------|
| **What** | Which input features matter? | Which hyperparameters matter? |
| **Impact** | Data → Predictions | Model configuration → Performance |
| **Usage** | Feature selection | Hyperparameter tuning |
| **Typical Result** | Often ~20% of features carry most of the importance | Often 2-3 params carry most of the importance |

### Real-world Example

**RandomForest hyperparameter importance (illustrative shares; actual values depend on the dataset and the search space):**
1. ⭐⭐⭐ `max_depth` - CRITICAL (50% importance)
2. ⭐⭐ `n_estimators` - Important (25% importance)
3. ⭐ `min_samples_split` - Moderate (15% importance)
4. Others - Minor (10% combined)

**Insight:** Focus on the top 3 and leave the rest at their defaults!
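
One way to turn a ranking like this into a decision is to count how many hyperparameters are needed to reach a chosen share of total importance. The sketch below uses the illustrative shares from the list above; the helper name `params_to_cover` and the 80% threshold are assumptions made for this example, not part of any library.

```python
def params_to_cover(importances, threshold=0.8):
    """Return the highest-ranked names whose cumulative importance reaches `threshold`."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, share in ranked:
        selected.append(name)
        cumulative += share
        if cumulative >= threshold:
            break
    return selected


# Illustrative shares from the list above (not measured values).
example = {
    "max_depth": 0.50,
    "n_estimators": 0.25,
    "min_samples_split": 0.15,
    "others": 0.10,
}

print(params_to_cover(example))       # params needed to cover at least 80%
print(params_to_cover(example, 0.5))  # a looser threshold keeps fewer params
```

The same helper would work on the dictionary returned by an importance analysis, since that is also a mapping from parameter name to share.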

**Let's learn how!** 🚀

<a id="setup"></a>
## 2. 🛠️ Setup

```python
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

# sklearn
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

# Optuna for Bayesian optimization
import optuna
optuna.logging.set_verbosity(optuna.logging.WARNING)

# DeepBridge
from deepbridge import DBDataset, Experiment

# Settings
warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette('Set2')
%matplotlib inline

RANDOM_STATE = 42
np.random.seed(RANDOM_STATE)

print("✅ Setup complete!")
print("🎛️ Topic: Hyperparameter Importance Analysis")
```

<a id="baseline"></a>
## 3. 📊 Baseline Model

### Load Data

```python
# Load diabetes dataset (regression)
diabetes = load_diabetes()
df = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
df['target'] = diabetes.target

print("📊 Diabetes Dataset:")
print(f"   Shape: {df.shape}")
print("   Task: Regression (predict disease progression)")
print(f"   Features: {len(diabetes.feature_names)}")

# Split
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=RANDOM_STATE
)

print(f"\n   Train: {X_train.shape}")
print(f"   Test: {X_test.shape}")
```

### Train Baseline Model (Default Hyperparameters)

```python
# Baseline: RandomForest with default params
model_baseline = RandomForestRegressor(random_state=RANDOM_STATE)
model_baseline.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred_baseline = model_baseline.predict(X_test)
r2_baseline = r2_score(y_test, y_pred_baseline)
rmse_baseline = np.sqrt(mean_squared_error(y_test, y_pred_baseline))

print("📊 BASELINE MODEL (Default Hyperparameters)")
print("=" * 60)
print(f"\n   R² Score: {r2_baseline:.4f}")
print(f"   RMSE: {rmse_baseline:.2f}")
print("\n   Default hyperparameters used:")
print(f"      n_estimators: {model_baseline.n_estimators}")
print(f"      max_depth: {model_baseline.max_depth}")
print(f"      min_samples_split: {model_baseline.min_samples_split}")
print(f"      min_samples_leaf: {model_baseline.min_samples_leaf}")
print(f"      max_features: {model_baseline.max_features}")
```

<a id="optuna"></a>
## 4. 🔍 Hyperparameter Tuning with Optuna

### Define Optimization Objective

```python
print("🔍 Setting up Bayesian Optimization with Optuna...\n")

def objective(trial):
    """
    Objective function for Optuna.
    Defines hyperparameter search space and returns metric to optimize.
    """
    # Define hyperparameters to tune
    params = {
        'n_estimators': trial.suggest_int('n_estimators', 50, 300),
        'max_depth': trial.suggest_int('max_depth', 2, 20),
        'min_samples_split': trial.suggest_int('min_samples_split', 2, 20),
        'min_samples_leaf': trial.suggest_int('min_samples_leaf', 1, 10),
        'max_features': trial.suggest_categorical('max_features', ['sqrt', 'log2', None]),
        'random_state': RANDOM_STATE
    }

    # Build the model with these hyperparameters
    model = RandomForestRegressor(**params)

    # Cross-validation score on the training split only
    scores = cross_val_score(model, X_train, y_train,
                             cv=5, scoring='r2', n_jobs=-1)

    return scores.mean()  # Return mean R²

print("✅ Objective function defined")
print("   Hyperparameters to tune: 5")
print("   Optimization metric: R² (cross-validation)")
```

### Run Optimization

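```python
print("🚀 Running Bayesian Optimization...\n")
print("   This will try 50 different hyperparameter combinations")
print("   Using the TPE sampler (model-guided after a few random startup trials)\n")

# Create study
study = optuna.create_study(
    direction='maximize',  # Maximize R²
    sampler=optuna.samplers.TPESampler(seed=RANDOM_STATE)
)

# Optimize
study.optimize(objective, n_trials=50, show_progress_bar=True)

# Score the default model with the same 5-fold CV so the comparison is
# like-for-like (r2_baseline is a test-set score, study.best_value a CV score).
cv_baseline = cross_val_score(
    RandomForestRegressor(random_state=RANDOM_STATE),
    X_train, y_train, cv=5, scoring='r2', n_jobs=-1
).mean()

print("\n✅ Optimization complete!")
print(f"   Trials run: {len(study.trials)}")
print(f"   Best CV R²: {study.best_value:.4f}")
print(f"   Baseline CV R²: {cv_baseline:.4f}")
print(f"   Improvement over baseline (CV): {study.best_value - cv_baseline:+.4f} R²")
```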

### Best Hyperparameters

```python
print("🏆 BEST HYPERPARAMETERS FOUND\n")
print("=" * 60)

for param, value in study.best_params.items():
    print(f"   {param}: {value}")

# Train final model with best params
model_tuned = RandomForestRegressor(**study.best_params, random_state=RANDOM_STATE)
model_tuned.fit(X_train, y_train)

# Evaluate on the same held-out test set as the baseline
y_pred_tuned = model_tuned.predict(X_test)
r2_tuned = r2_score(y_test, y_pred_tuned)
rmse_tuned = np.sqrt(mean_squared_error(y_test, y_pred_tuned))

print("\n📊 Tuned Model Performance:")
print(f"   R² Score: {r2_tuned:.4f} (baseline: {r2_baseline:.4f})")
print(f"   RMSE: {rmse_tuned:.2f} (baseline: {rmse_baseline:.2f})")
print(f"\n   Improvement: {r2_tuned - r2_baseline:+.4f} R² on the test set")
```

<a id="importance"></a>
## 5. 🎯 Hyperparameter Importance Analysis

### Calculate Importance

```python
print("🎯 Analyzing Hyperparameter Importance...\n")

# Get hyperparameter importance from Optuna (default evaluator: fANOVA).
# Importances are relative to the search space defined in the objective.
importance = optuna.importance.get_param_importances(study)

# Create DataFrame
importance_df = pd.DataFrame({
    'Hyperparameter': list(importance.keys()),
    'Importance': list(importance.values())
}).sort_values('Importance', ascending=False)

print("📊 HYPERPARAMETER IMPORTANCE RANKING\n")
print("=" * 60)
display(importance_df.style
        .format({'Importance': '{:.4f}'})
        .background_gradient(cmap='RdYlGn', subset=['Importance'])
)

print("\n💡 Key Insights:")
most_important = importance_df.iloc[0]['Hyperparameter']
most_importance_val = importance_df.iloc[0]['Importance']
print(f"   Most important: {most_important} ({most_importance_val:.1%} importance)")
print(f"   Top 2 params account for: {importance_df.head(2)['Importance'].sum():.1%} of importance")
print("   Low-ranked params contribute little here - defaults are a reasonable starting point")
```

### Visualize Importance

```python
# Bar chart
plt.figure(figsize=(10, 6))
plt.barh(importance_df['Hyperparameter'], importance_df['Importance'],
         color='steelblue', edgecolor='black', alpha=0.8)
plt.xlabel('Importance', fontsize=12, fontweight='bold')
plt.title('Hyperparameter Importance', fontsize=14, fontweight='bold')
plt.grid(axis='x', alpha=0.3)
plt.gca().invert_yaxis()
plt.tight_layout()
plt.show()

print("\n🎯 Tuning Strategy Based on Importance:")
print(f"   • Focus on: {', '.join(importance_df.head(2)['Hyperparameter'])}")
print("   • Use defaults for rest to save time!")
```

<a id="sensitivity"></a>
## 6. 📈 Sensitivity Testing

### Test Most Important Hyperparameter

```python
# Get most important hyperparameter
most_important_param = importance_df.iloc[0]['Hyperparameter']

print(f"📈 Testing sensitivity of: {most_important_param}\n")

# Values to test for each hyperparameter (all within the Optuna search space)
test_grid = {
    'n_estimators': [50, 100, 150, 200, 250, 300],
    'max_depth': [2, 5, 8, 10, 15, 20],
    'min_samples_split': [2, 5, 10, 15, 20],
    'min_samples_leaf': [1, 2, 3, 5, 8, 10],
    'max_features': ['sqrt', 'log2', None],
}
test_values = test_grid[most_important_param]

# Test each value, holding the other hyperparameters at their best values
r2_scores = []

for val in test_values:
    params = study.best_params.copy()
    params[most_important_param] = val

    model = RandomForestRegressor(**params, random_state=RANDOM_STATE)
    scores = cross_val_score(model, X_train, y_train, cv=5, scoring='r2')
    r2_scores.append(scores.mean())

# Categorical values (including None) are plotted by label, numeric ones by value
if most_important_param == 'max_features':
    x_values = [str(v) for v in test_values]
else:
    x_values = test_values

# Plot sensitivity
plt.figure(figsize=(10, 6))
plt.plot(x_values, r2_scores, 'o-', linewidth=2, markersize=8, color='steelblue')
plt.xlabel(most_important_param, fontsize=12, fontweight='bold')
plt.ylabel('R² Score (CV)', fontsize=12, fontweight='bold')
plt.title(f'Sensitivity Analysis: {most_important_param}', fontsize=14, fontweight='bold')
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

print("\n💡 Sensitivity Insights:")
best_idx = int(np.argmax(r2_scores))
swing = max(r2_scores) - min(r2_scores)
print(f"   Optimal {most_important_param}: {test_values[best_idx]}")
print(f"   R² range: {min(r2_scores):.4f} - {max(r2_scores):.4f}")
print(f"   Impact: {swing:.4f} R² swing across the tested values")
print("   → The larger the swing, the more this param is worth tuning")
```
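
A swing in mean CV score only matters if it is larger than the fold-to-fold noise. As a sketch of how to check that, scikit-learn's `validation_curve` returns one score per value and fold, so the spread can be reported next to the mean. The choice of `max_depth`, the four values tested, and the reduced `n_estimators=50` are assumptions made to keep the example fast.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import validation_curve

X, y = load_diabetes(return_X_y=True)

param_range = [2, 5, 10, 20]

# validation_curve refits the estimator for every value in param_range
# and returns one score per (value, fold) pair.
train_scores, val_scores = validation_curve(
    RandomForestRegressor(n_estimators=50, random_state=42),
    X, y,
    param_name="max_depth",
    param_range=param_range,
    cv=5,
    scoring="r2",
)

mean_val = val_scores.mean(axis=1)
std_val = val_scores.std(axis=1)

for depth, mean, std in zip(param_range, mean_val, std_val):
    print(f"max_depth={depth:>2}: CV R² = {mean:.4f} ± {std:.4f}")

swing = mean_val.max() - mean_val.min()
print(f"Swing across values: {swing:.4f} (compare with the fold-to-fold std above)")
```

If the swing is smaller than the typical standard deviation across folds, the apparent optimum may be noise rather than a real effect.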

<a id="conclusion"></a>
## Conclusion

### What You Learned

- ✅ **Hyperparameter vs Feature Importance** - Two different concepts
- ✅ **Bayesian Optimization** - Smart tuning with Optuna
- ✅ **Importance Analysis** - Which params matter most
- ✅ **Sensitivity Testing** - How much impact each param has
- ✅ **Efficient Tuning** - Focus on the top 2-3 params, leave the rest at defaults

### Key Takeaways

1. 🎯 **80/20 Rule** - Often 2-3 params account for most of the importance
2. ⚡ **Save Time** - Don't tune everything!
3. 📊 **Measure, Don't Guess** - Use importance analysis
4. 🔍 **Optuna > Grid Search** - Bayesian optimization usually needs far fewer trials than an exhaustive grid
5. 💡 **Model-Specific** - Different models, different important params
6. 🔄 **Context Matters** - Importance varies by dataset and search space

### Typical Importance Rankings

**RandomForest:**
1. `max_depth` ⭐⭐⭐
2. `n_estimators` ⭐⭐
3. `min_samples_split` ⭐

**GradientBoosting:**
1. `learning_rate` ⭐⭐⭐
2. `n_estimators` ⭐⭐
3. `max_depth` ⭐⭐

**Neural Networks:**
1. `learning_rate` ⭐⭐⭐
2. `batch_size` ⭐⭐
3. `hidden_layer_sizes` ⭐⭐

---

**Remember: Tune smart, not hard!** 🎯