# Ablation Studies - Understanding Feature Importance

This notebook demonstrates how to perform ablation studies to understand which features and components contribute most to the performance of carbon-efficient Kubernetes scheduling algorithms.

## What are Ablation Studies?

Ablation studies systematically remove or disable components/features to understand their individual contributions to overall performance. This helps:

- **Identify critical features**: Which features are most important?
- **Understand interactions**: How do features interact with each other?
- **Optimize complexity**: Can we achieve similar performance with fewer features?
- **Debug performance**: Which components might be causing issues?

## Study Design

We'll analyze the impact of removing different scheduling features:
1. **Carbon awareness**: Remove carbon efficiency considerations
2. **Energy optimization**: Remove energy consumption optimization
3. **Load balancing**: Remove load balancing features
4. **Resource prediction**: Remove predictive resource allocation
5. **Node affinity**: Remove node affinity rules
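
The design above is a leave-one-out ablation: each scenario disables exactly one feature. A minimal sketch of how the scenarios could be expressed as sets of enabled feature flags follows. The flag names and the `build_scenarios` helper are hypothetical, and treating `minimal_features` as "everything disabled" is an assumption, since the notebook does not define it.

```python
# Hypothetical feature flags; the names mirror the study design list above.
ALL_FEATURES = frozenset({
    "carbon_awareness",
    "energy_optimization",
    "load_balancing",
    "resource_prediction",
    "node_affinity",
})


def build_scenarios(features=ALL_FEATURES):
    """Return {scenario_name: enabled_features} for a leave-one-out ablation."""
    scenarios = {"full_features": set(features)}
    for feature in sorted(features):
        # Each ablation scenario enables everything except one feature
        scenarios[f"no_{feature}"] = set(features) - {feature}
    # Assumption: the minimal scenario disables every optional feature
    scenarios["minimal_features"] = set()
    return scenarios


scenarios = build_scenarios()
for name, enabled in scenarios.items():
    print(f"{name}: {sorted(enabled)}")
```

Keeping the baseline and the fully stripped-down scenario alongside the single-feature ablations gives an upper and a lower reference point for every comparison.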

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import yaml
import json
from datetime import datetime
from scipy import stats
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import LabelEncoder
import warnings
warnings.filterwarnings('ignore')

# Set up plotting
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")
%matplotlib inline

print("✅ Libraries imported successfully")

## 1. Load Ablation Study Data

Let's load the ablation study datasets that simulate different feature combinations.

In [None]:
# Load ablation study datasets
data_path = Path('../data')
ablation_datasets = {}

# Define ablation scenarios
ablation_scenarios = [
    'full_features',
    'no_carbon_awareness',
    'no_energy_optimization',
    'no_load_balancing',
    'no_resource_prediction',
    'no_node_affinity',
    'minimal_features'
]

for scenario in ablation_scenarios:
    try:
        path = data_path / 'synthetic' / f'ablation_{scenario}.csv'
        ablation_datasets[scenario] = pd.read_csv(path)
        print(f"✅ Loaded {scenario}: {len(ablation_datasets[scenario])} samples")
    except FileNotFoundError:
        print(f"❌ Could not load {scenario}")

print(f"\n📊 Loaded {len(ablation_datasets)} ablation scenarios")

In [None]:
# Combine all ablation data for analysis
# Start from an empty DataFrame so later cells can safely check
# `combined_ablation.empty` even when no dataset was loaded.
combined_ablation = pd.DataFrame()

if ablation_datasets:
    # Tag each dataset with its scenario name and concatenate once
    combined_ablation = pd.concat(
        [df.assign(ablation_scenario=scenario) for scenario, df in ablation_datasets.items()],
        ignore_index=True
    )
    
    print(f"📈 Combined ablation dataset: {len(combined_ablation)} samples")
    print(f"🔬 Scenarios: {combined_ablation['ablation_scenario'].unique()}")
    
    # Display sample data
    print("\n📋 Sample Data:")
    display(combined_ablation.head())
else:
    print("❌ No ablation datasets loaded; the analysis cells below will be skipped")

## 2. Performance Impact Analysis

Let's analyze how removing each feature affects key performance metrics.

In [None]:
# Calculate performance metrics by ablation scenario
if not combined_ablation.empty:
    performance_metrics = combined_ablation.groupby('ablation_scenario')[[
        'carbon_efficiency', 'energy_consumption', 'performance_score', 
        'response_time', 'throughput', 'resource_utilization'
    ]].agg(['mean', 'std', 'count']).round(3)
    
    print("📊 Performance Metrics by Ablation Scenario:")
    display(performance_metrics)

In [None]:
# Visualize performance impact
if not combined_ablation.empty:
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    
    metrics = ['carbon_efficiency', 'energy_consumption', 'performance_score', 
               'response_time', 'throughput', 'resource_utilization']
    titles = ['Carbon Efficiency', 'Energy Consumption (W)', 'Performance Score',
              'Response Time (ms)', 'Throughput (req/s)', 'Resource Utilization (%)']
    
    for i, (metric, title) in enumerate(zip(metrics, titles)):
        row, col = i // 3, i % 3
        
        # Box plot by ablation scenario
        combined_ablation.boxplot(column=metric, by='ablation_scenario', ax=axes[row, col])
        axes[row, col].set_title(f'{title} by Ablation Scenario')
        axes[row, col].set_xlabel('Ablation Scenario')
        axes[row, col].set_ylabel(title)
        axes[row, col].tick_params(axis='x', rotation=45)
    
    plt.suptitle('')  # Remove automatic title
    plt.tight_layout()
    plt.show()

## 3. Feature Importance Ranking

Let's rank the features by how much the key metrics degrade, relative to the `full_features` baseline, when each one is removed.
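
To make the arithmetic concrete, here is a small worked sketch of a weighted degradation score for a single ablation. The numbers are illustrative only and are not results from the study. The 0.4/0.3/0.3 weights match those used in this notebook. The sign convention (a larger positive score means removing the feature hurt more) is this sketch's own assumption.

```python
def relative_change(ablated, baseline):
    """Percentage change of an ablated metric relative to the baseline."""
    return (ablated - baseline) / baseline * 100


# Illustrative mean values only, not results from the study.
baseline = {"carbon_efficiency": 0.80, "energy_consumption": 200.0, "performance_score": 0.90}
ablated = {"carbon_efficiency": 0.60, "energy_consumption": 230.0, "performance_score": 0.81}

carbon = relative_change(ablated["carbon_efficiency"], baseline["carbon_efficiency"])
energy = relative_change(ablated["energy_consumption"], baseline["energy_consumption"])
perf = relative_change(ablated["performance_score"], baseline["performance_score"])

# Higher carbon efficiency and performance are better, lower energy is better,
# so energy enters with the opposite sign. The outer minus turns "net change"
# into "degradation": larger positive values mean the feature mattered more.
degradation = -(0.4 * carbon - 0.3 * energy + 0.3 * perf)

print(f"carbon: {carbon:.1f}%, energy: {energy:.1f}%, performance: {perf:.1f}%")
print(f"weighted degradation score: {degradation:.1f}")
```

Here carbon efficiency drops by about 25%, energy rises by 15%, and performance drops by about 10%, so all three terms push the degradation score upward.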

In [None]:
def calculate_feature_importance(df, baseline_scenario='full_features'):
    """Calculate feature importance based on performance degradation"""
    if baseline_scenario not in df['ablation_scenario'].values:
        print(f"❌ Baseline scenario '{baseline_scenario}' not found")
        return None
    
    # Get baseline performance
    baseline_data = df[df['ablation_scenario'] == baseline_scenario]
    baseline_metrics = baseline_data[[
        'carbon_efficiency', 'energy_consumption', 'performance_score'
    ]].mean()
    
    # Calculate importance for each ablation scenario
    importance_results = []
    
    for scenario in df['ablation_scenario'].unique():
        if scenario == baseline_scenario:
            continue
            
        scenario_data = df[df['ablation_scenario'] == scenario]
        scenario_metrics = scenario_data[[
            'carbon_efficiency', 'energy_consumption', 'performance_score'
        ]].mean()
        
        # Relative change vs. baseline in percent. Removing a useful feature shows up as
        # a drop in carbon efficiency / performance score or a rise in energy consumption.
        carbon_change = (scenario_metrics['carbon_efficiency'] - baseline_metrics['carbon_efficiency']) / baseline_metrics['carbon_efficiency'] * 100
        energy_change = (scenario_metrics['energy_consumption'] - baseline_metrics['energy_consumption']) / baseline_metrics['energy_consumption'] * 100
        perf_change = (scenario_metrics['performance_score'] - baseline_metrics['performance_score']) / baseline_metrics['performance_score'] * 100
        
        # Overall impact: weighted degradation, sign-flipped so that a larger positive
        # score means removing the feature hurt more (i.e. the feature is more important).
        # Without the flip, the most damaging ablation would be ranked last.
        overall_impact = -(carbon_change * 0.4 + (-energy_change) * 0.3 + perf_change * 0.3)
        
        importance_results.append({
            'Feature Removed': scenario.replace('no_', '').replace('_', ' ').title(),
            'Scenario': scenario,
            'Carbon Efficiency Change (%)': carbon_change,
            'Energy Consumption Change (%)': energy_change,
            'Performance Score Change (%)': perf_change,
            'Overall Impact Score': overall_impact,
            'Importance Rank': 0  # Will be filled later
        })
    
    # Convert to DataFrame and rank by overall impact
    importance_df = pd.DataFrame(importance_results)
    importance_df = importance_df.sort_values('Overall Impact Score', ascending=False)
    importance_df['Importance Rank'] = range(1, len(importance_df) + 1)
    
    return importance_df

if not combined_ablation.empty:
    feature_importance = calculate_feature_importance(combined_ablation)
    
    if feature_importance is not None:
        print("🏆 Feature Importance Ranking:")
        display(feature_importance.round(2))

In [None]:
# Visualize feature importance
if 'feature_importance' in locals() and feature_importance is not None:
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    
    # Overall impact score
    axes[0, 0].barh(feature_importance['Feature Removed'], feature_importance['Overall Impact Score'], color='skyblue')
    axes[0, 0].set_title('Overall Feature Impact Score')
    axes[0, 0].set_xlabel('Impact Score (Higher = More Important)')
    
    # Carbon efficiency impact
    axes[0, 1].barh(feature_importance['Feature Removed'], feature_importance['Carbon Efficiency Change (%)'], color='green')
    axes[0, 1].set_title('Carbon Efficiency Impact')
    axes[0, 1].set_xlabel('Change (%)')
    
    # Energy consumption impact
    axes[1, 0].barh(feature_importance['Feature Removed'], feature_importance['Energy Consumption Change (%)'], color='red')
    axes[1, 0].set_title('Energy Consumption Impact')
    axes[1, 0].set_xlabel('Change (%)')
    
    # Performance score impact
    axes[1, 1].barh(feature_importance['Feature Removed'], feature_importance['Performance Score Change (%)'], color='orange')
    axes[1, 1].set_title('Performance Score Impact')
    axes[1, 1].set_xlabel('Change (%)')
    
    plt.tight_layout()
    plt.show()

## 4. Statistical Significance Testing

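Let's test whether the differences between each ablation scenario and the `full_features` baseline are statistically significant, using a t-test and a Mann-Whitney U test, and estimate their size with Cohen's d.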
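
As a standalone illustration of the effect-size part, the sketch below computes Cohen's d with the pooled sample standard deviation and labels it using Cohen's conventional benchmarks (0.2, 0.5, 0.8). The helper names and the sample values are hypothetical, chosen only to show the calculation.

```python
import math
import statistics


def cohens_d(baseline, ablated):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(baseline), len(ablated)
    pooled_var = (
        (n_a - 1) * statistics.variance(baseline)
        + (n_b - 1) * statistics.variance(ablated)
    ) / (n_a + n_b - 2)
    return (statistics.mean(baseline) - statistics.mean(ablated)) / math.sqrt(pooled_var)


def effect_size_label(d):
    """Map |d| onto Cohen's conventional benchmarks."""
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"


# Illustrative samples only, not data from the study.
baseline_sample = [0.82, 0.79, 0.81, 0.80, 0.78]
ablated_sample = [0.70, 0.72, 0.69, 0.71, 0.73]

d = cohens_d(baseline_sample, ablated_sample)
print(f"Cohen's d = {d:.2f} ({effect_size_label(d)})")
```

Reporting the effect size next to the p-value matters because, with enough samples, even a practically irrelevant difference can reach p < 0.05.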

In [None]:
def test_ablation_significance(df, baseline_scenario='full_features', metric='carbon_efficiency'):
    """Test statistical significance of ablation studies"""
    if baseline_scenario not in df['ablation_scenario'].values:
        return None
    
    baseline_data = df[df['ablation_scenario'] == baseline_scenario][metric]
    results = []
    
    for scenario in df['ablation_scenario'].unique():
        if scenario == baseline_scenario:
            continue
            
        scenario_data = df[df['ablation_scenario'] == scenario][metric]
        
        # T-test
        t_stat, t_pval = stats.ttest_ind(baseline_data, scenario_data)
        
        # Mann-Whitney U test
        u_stat, u_pval = stats.mannwhitneyu(baseline_data, scenario_data, alternative='two-sided')
        
        # Effect size (Cohen's d)
        pooled_std = np.sqrt(((len(baseline_data) - 1) * baseline_data.var() + 
                             (len(scenario_data) - 1) * scenario_data.var()) / 
                            (len(baseline_data) + len(scenario_data) - 2))
        cohens_d = (baseline_data.mean() - scenario_data.mean()) / pooled_std
        
        results.append({
            'Scenario': scenario,
            'Feature Removed': scenario.replace('no_', '').replace('_', ' ').title(),
            'Baseline Mean': baseline_data.mean(),
            'Scenario Mean': scenario_data.mean(),
            'Mean Difference': baseline_data.mean() - scenario_data.mean(),
            'T-test p-value': t_pval,
            'Mann-Whitney p-value': u_pval,
            'Effect Size (Cohen\'s d)': cohens_d,
            'Significant (p<0.05)': t_pval < 0.05,
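            # Cohen's conventional benchmarks: 0.2 (small), 0.5 (medium), 0.8 (large)
            'Effect Size Category': 'Negligible' if abs(cohens_d) < 0.2 else 'Small' if abs(cohens_d) < 0.5 else 'Medium' if abs(cohens_d) < 0.8 else 'Large'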
        })
    
    return pd.DataFrame(results)

if not combined_ablation.empty:
    # Test significance for carbon efficiency
    carbon_significance = test_ablation_significance(combined_ablation, metric='carbon_efficiency')
    
    if carbon_significance is not None:
        print("🧪 Statistical Significance Tests - Carbon Efficiency:")
        display(carbon_significance.round(4))

In [None]:
# Test significance for energy consumption
if not combined_ablation.empty:
    energy_significance = test_ablation_significance(combined_ablation, metric='energy_consumption')
    
    if energy_significance is not None:
        print("🧪 Statistical Significance Tests - Energy Consumption:")
        display(energy_significance.round(4))

## 5. Machine Learning Feature Importance

Let's train a Random Forest regressor to predict carbon efficiency and use its built-in (impurity-based) feature importances as a second, model-based view of which inputs matter.

In [None]:
# Prepare data for ML analysis
if not combined_ablation.empty:
    # Create feature matrix
    ml_data = combined_ablation.copy()
    
    # Encode categorical variables
    categorical_cols = ['scheduler', 'node_type', 'workload_type', 'ablation_scenario']
    label_encoders = {}
    
    for col in categorical_cols:
        if col in ml_data.columns:
            le = LabelEncoder()
            ml_data[f'{col}_encoded'] = le.fit_transform(ml_data[col])
            label_encoders[col] = le
    
    # Select features for ML model
    feature_cols = [
        'cpu_utilization', 'memory_utilization', 'network_io', 'disk_io',
        'load_factor', 'scheduler_encoded', 'node_type_encoded', 
        'workload_type_encoded', 'ablation_scenario_encoded'
    ]
    
    # Filter available columns
    available_features = [col for col in feature_cols if col in ml_data.columns]
    
    if available_features:
        X = ml_data[available_features]
        y = ml_data['carbon_efficiency']
        
        # Split data
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        
        # Train Random Forest model
        rf_model = RandomForestRegressor(n_estimators=100, random_state=42)
        rf_model.fit(X_train, y_train)
        
        # Make predictions
        y_pred = rf_model.predict(X_test)
        
        # Calculate metrics
        mse = mean_squared_error(y_test, y_pred)
        r2 = r2_score(y_test, y_pred)
        
        print(f"🤖 Random Forest Model Performance:")
        print(f"   MSE: {mse:.4f}")
        print(f"   R²: {r2:.4f}")
        
        # Feature importance
        feature_importance_ml = pd.DataFrame({
            'Feature': available_features,
            'Importance': rf_model.feature_importances_
        }).sort_values('Importance', ascending=False)
        
        print("\n🎯 ML-Based Feature Importance:")
        display(feature_importance_ml.round(4))
        
        # Plot feature importance
        plt.figure(figsize=(10, 6))
        plt.barh(feature_importance_ml['Feature'], feature_importance_ml['Importance'], color='lightcoral')
        plt.title('Random Forest Feature Importance')
        plt.xlabel('Importance Score')
        plt.tight_layout()
        plt.show()
    else:
        print("❌ No suitable features found for ML analysis")

## 6. Interaction Effects Analysis

Let's analyze how the effect of each ablation varies across workload types.
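
One way to read such a table is as a difference of differences: compute the ablation effect within each workload type, then compare those effects across workloads. The sketch below does this for two hypothetical workload types (`batch` and `web`) with illustrative mean values. Neither the names nor the numbers come from the study data.

```python
# Illustrative mean carbon_efficiency values; not results from the study.
means = {
    ("batch", "full_features"): 0.80,
    ("batch", "no_carbon_awareness"): 0.60,
    ("web", "full_features"): 0.70,
    ("web", "no_carbon_awareness"): 0.65,
}


def ablation_effect(means, workload, scenario, baseline="full_features"):
    """Change in the metric when `scenario` replaces the baseline for one workload."""
    return means[(workload, scenario)] - means[(workload, baseline)]


batch_effect = ablation_effect(means, "batch", "no_carbon_awareness")
web_effect = ablation_effect(means, "web", "no_carbon_awareness")

# A non-zero difference means the ablation does not hurt all workloads equally
interaction = batch_effect - web_effect

print(f"batch effect: {batch_effect:+.2f}")
print(f"web effect:   {web_effect:+.2f}")
print(f"interaction:  {interaction:+.2f}")
```

In this made-up example, removing carbon awareness costs the batch workload far more than the web workload, which is the kind of pattern the heatmaps are meant to surface.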

In [None]:
# Analyze interaction effects between workload types and ablation scenarios
if not combined_ablation.empty:
    interaction_analysis = combined_ablation.groupby(['workload_type', 'ablation_scenario'])[[
        'carbon_efficiency', 'energy_consumption', 'performance_score'
    ]].mean().round(3)
    
    print("🔄 Interaction Effects - Workload Type vs Ablation Scenario:")
    display(interaction_analysis)
    
    # Visualize interaction effects
    fig, axes = plt.subplots(1, 3, figsize=(18, 6))
    
    metrics = ['carbon_efficiency', 'energy_consumption', 'performance_score']
    titles = ['Carbon Efficiency', 'Energy Consumption', 'Performance Score']
    
    for i, (metric, title) in enumerate(zip(metrics, titles)):
        # Create pivot table for heatmap
        pivot_data = combined_ablation.pivot_table(
            values=metric, 
            index='workload_type', 
            columns='ablation_scenario', 
            aggfunc='mean'
        )
        
        sns.heatmap(pivot_data, annot=True, fmt='.3f', cmap='RdYlBu_r', ax=axes[i])
        axes[i].set_title(f'{title} - Workload vs Ablation')
        axes[i].set_xlabel('Ablation Scenario')
        axes[i].set_ylabel('Workload Type')
        axes[i].tick_params(axis='x', rotation=45)
    
    plt.tight_layout()
    plt.show()

## 7. Recommendations and Insights

Based on our ablation study analysis, let's generate actionable recommendations.

In [None]:
def generate_ablation_recommendations(importance_df, significance_df):
    """Generate recommendations based on ablation study results"""
    recommendations = []
    
    if importance_df is not None and not importance_df.empty:
        # Most important feature
        most_important = importance_df.iloc[0]
        recommendations.append(
            f"🏆 Most Critical Feature: '{most_important['Feature Removed']}' "
            f"(Impact Score: {most_important['Overall Impact Score']:.2f})"
        )
        
        # Least important feature
        least_important = importance_df.iloc[-1]
        recommendations.append(
            f"🔧 Least Critical Feature: '{least_important['Feature Removed']}' "
            f"(Impact Score: {least_important['Overall Impact Score']:.2f}) - Consider for optimization"
        )
        
        # Features with high carbon impact
        high_carbon_impact = importance_df[
            importance_df['Carbon Efficiency Change (%)'] < -10
        ]
        if not high_carbon_impact.empty:
            features = "', '".join(high_carbon_impact['Feature Removed'].tolist())
            recommendations.append(
                f"🌱 High Carbon Impact Features: '{features}' - Essential for carbon efficiency"
            )
    
    if significance_df is not None and not significance_df.empty:
        # Statistically significant features
        significant_features = significance_df[
            significance_df['Significant (p<0.05)'] == True
        ]
        if not significant_features.empty:
            features = "', '".join(significant_features['Feature Removed'].tolist())
            recommendations.append(
                f"📊 Statistically Significant Features: '{features}' - Reliable performance impact"
            )
        
        # Large effect size features
        large_effect = significance_df[
            significance_df['Effect Size Category'] == 'Large'
        ]
        if not large_effect.empty:
            features = "', '".join(large_effect['Feature Removed'].tolist())
            recommendations.append(
                f"💪 Large Effect Size Features: '{features}' - Major performance contributors"
            )
    
    # General recommendations
    recommendations.extend([
        "🎯 Focus development efforts on the most critical features identified",
        "⚖️ Consider feature complexity vs. performance trade-offs for optimization",
        "🔄 Test feature interactions in different workload scenarios",
        "📈 Monitor feature performance in production environments",
        "🧪 Conduct regular ablation studies as the system evolves"
    ])
    
    return recommendations

# Generate recommendations
recommendations = generate_ablation_recommendations(
    feature_importance if 'feature_importance' in locals() else None,
    carbon_significance if 'carbon_significance' in locals() else None
)

print("💡 Ablation Study Recommendations:")
print("=" * 60)
for i, rec in enumerate(recommendations, 1):
    print(f"{i}. {rec}")

## 8. Export Ablation Study Results

Let's save our ablation study results for future reference and reporting.
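
One thing to watch when exporting grouped results: a table with a two-level index turns into a dictionary keyed by tuples when converted with `to_dict()`, and `json.dump` cannot encode tuple keys (its `default=` hook applies to values, not keys). The sketch below shows the failure and one possible workaround. The `stringify_keys` helper and the sample values are hypothetical.

```python
import json


def stringify_keys(obj, sep="|"):
    """Recursively join tuple dictionary keys into strings so json can encode them."""
    if isinstance(obj, dict):
        return {
            (sep.join(map(str, key)) if isinstance(key, tuple) else key): stringify_keys(value, sep)
            for key, value in obj.items()
        }
    return obj


# Illustrative values only; the tuple keys mimic a (workload_type, ablation_scenario) index.
nested = {
    "carbon_efficiency": {
        ("batch", "full_features"): 0.80,
        ("batch", "no_load_balancing"): 0.75,
    }
}

try:
    json.dumps(nested)
except TypeError as exc:
    print(f"Raw dict is not serializable: {exc}")

print(json.dumps(stringify_keys(nested), indent=2))
```

An alternative that avoids the helper entirely is to flatten the index into ordinary columns before converting, so each row becomes a plain record.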

In [None]:
# Create comprehensive results summary
ablation_results = {
    'study_info': {
        'analysis_date': datetime.now().isoformat(),
        'total_samples': len(combined_ablation) if not combined_ablation.empty else 0,
        'scenarios_tested': combined_ablation['ablation_scenario'].unique().tolist() if not combined_ablation.empty else [],
        'baseline_scenario': 'full_features'
    },
    'feature_importance': feature_importance.to_dict('records') if 'feature_importance' in locals() and feature_importance is not None else [],
    'statistical_significance': {
        'carbon_efficiency': carbon_significance.to_dict('records') if 'carbon_significance' in locals() and carbon_significance is not None else [],
        'energy_consumption': energy_significance.to_dict('records') if 'energy_significance' in locals() and energy_significance is not None else []
    },
    'ml_feature_importance': feature_importance_ml.to_dict('records') if 'feature_importance_ml' in locals() else [],
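    # reset_index() flattens the (workload_type, ablation_scenario) MultiIndex into columns;
    # a plain to_dict() would produce tuple keys, which json.dump cannot serialize
    'interaction_effects': interaction_analysis.reset_index().to_dict('records') if 'interaction_analysis' in locals() else [],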
    'recommendations': recommendations,
    'model_performance': {
        'mse': mse if 'mse' in locals() else None,
        'r2_score': r2 if 'r2' in locals() else None
    }
}

# Save results
results_path = Path('../results')
results_path.mkdir(parents=True, exist_ok=True)

results_file = results_path / 'ablation_study_results.json'
with open(results_file, 'w') as f:
    json.dump(ablation_results, f, indent=2, default=str)

# Report the path that was actually written
print(f"💾 Ablation study results saved to: {results_file}")

# Also save feature importance as CSV for easy access
if 'feature_importance' in locals() and feature_importance is not None:
    ranking_file = Path('../results') / 'feature_importance_ranking.csv'
    feature_importance.to_csv(ranking_file, index=False)
    print(f"📊 Feature importance ranking saved to: {ranking_file}")

print("\n✅ Ablation study analysis complete!")

## Summary

This ablation study analysis has provided insights into:

### Key Findings:
1. **Feature Importance**: Identified which features contribute most to carbon efficiency
2. **Statistical Significance**: Determined which features have reliable, measurable impact
3. **Interaction Effects**: Understood how features perform differently across workload types
4. **Optimization Opportunities**: Found features that could be simplified or removed

### Next Steps:
1. **Implementation**: Focus development on the most critical features
2. **Optimization**: Consider removing or simplifying low-impact features
3. **Testing**: Validate findings in production environments
4. **Monitoring**: Track feature performance over time

### Related Notebooks:
- **01_Getting_Started.ipynb**: Basic framework usage
- **03_Baseline_Comparison.ipynb**: Detailed baseline analysis
- **04_Statistical_Analysis.ipynb**: Advanced statistical methods

The results from this analysis should guide feature development priorities and system optimization efforts. 🚀