# Carbon-Kube Evaluation Framework Overview

Welcome to the Carbon-Kube evaluation framework, a toolkit for evaluating and comparing carbon-efficient Kubernetes scheduling algorithms.

## 🎯 Framework Objectives

- **Performance Evaluation**: Measure and compare scheduler performance across multiple metrics
- **Statistical Validation**: Provide rigorous statistical analysis of results
- **Reproducibility**: Ensure consistent and reproducible evaluation processes
- **Comprehensive Analysis**: Support various analysis types from basic comparisons to advanced statistical methods

## 📊 Key Features

### 1. Multi-Metric Evaluation
- **Carbon Efficiency**: Primary metric for environmental impact
- **Energy Consumption**: Energy used under each scheduler
- **Performance Metrics**: Response time, throughput, resource utilization
- **System Metrics**: CPU, memory, network utilization

### 2. Statistical Analysis
- **Hypothesis Testing**: t-tests, ANOVA, non-parametric tests
- **Effect Size Analysis**: Cohen's d, confidence intervals
- **Bootstrap Methods**: Robust statistical inference
- **Multiple Comparisons**: Proper handling of multiple scheduler comparisons
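
As a rough illustration of the effect-size and bootstrap items above, the sketch below computes Cohen's d with a pooled standard deviation and a percentile bootstrap confidence interval for a difference in means. The function names `cohens_d` and `bootstrap_ci_mean_diff` and the sample values are hypothetical, not part of the framework; the defaults of 1000 iterations and a 0.95 confidence level simply mirror the `analysis_settings` shown in the configuration example.

```python
import numpy as np


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / (n_a + n_b - 2)
    return float((a.mean() - b.mean()) / np.sqrt(pooled_var))


def bootstrap_ci_mean_diff(a, b, n_iterations=1000, confidence_level=0.95, seed=0):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample each group with replacement, one row per bootstrap iteration.
    means_a = rng.choice(a, size=(n_iterations, len(a)), replace=True).mean(axis=1)
    means_b = rng.choice(b, size=(n_iterations, len(b)), replace=True).mean(axis=1)
    alpha = 1.0 - confidence_level
    lower, upper = np.percentile(means_a - means_b, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)


# Made-up carbon_efficiency values, for illustration only.
carbon_aware = [0.89, 0.91, 0.87, 0.90, 0.88, 0.92]
default = [0.75, 0.78, 0.74, 0.77, 0.73, 0.76]

d = cohens_d(carbon_aware, default)
low, high = bootstrap_ci_mean_diff(carbon_aware, default)
print(f"Cohen's d = {d:.2f}, bootstrap CI for mean difference = [{low:.3f}, {high:.3f}]")
```

A fixed seed keeps the interval reproducible between runs, which fits the framework's reproducibility objective.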

### 3. Specialized Studies
- **Baseline Comparisons**: Compare against standard schedulers
- **Ablation Studies**: Understand component contributions
- **Scenario Analysis**: Performance across different workload types

## 📁 Framework Structure

```
evaluation/
├── data/                    # Dataset storage
│   ├── raw/                # Raw data files
│   ├── synthetic/          # Generated synthetic datasets
│   └── benchmarks/         # Benchmark datasets
├── notebooks/              # Analysis notebooks
│   ├── 00_Framework_Overview.ipynb
│   ├── 01_Getting_Started.ipynb
│   ├── 02_Ablation_Studies.ipynb
│   ├── 03_Baseline_Comparison.ipynb
│   └── 04_Statistical_Analysis.ipynb
├── results/                # Analysis results
├── scripts/                # Utility scripts
└── config/                 # Configuration files
```

## 🚀 Quick Start Guide

### Prerequisites

1. **Python Environment**: Python 3.8+ with the packages listed under Installation
2. **Data**: Scheduler performance data in CSV format
3. **Configuration**: Dataset configuration file (YAML)

### Installation

```bash
# Install required packages
pip install pandas numpy matplotlib seaborn scipy scikit-learn statsmodels jupyter

# Optional: For advanced Bayesian analysis
pip install pymc arviz
```

### Basic Usage

1. **Start with Getting Started**: `01_Getting_Started.ipynb`
2. **Run Baseline Comparison**: `03_Baseline_Comparison.ipynb`
3. **Perform Statistical Analysis**: `04_Statistical_Analysis.ipynb`
4. **Conduct Ablation Studies**: `02_Ablation_Studies.ipynb`

## 📚 Notebook Guide

### 1. Getting Started (`01_Getting_Started.ipynb`)
**Purpose**: Introduction to the framework and basic analysis

**What you'll learn**:
- Load and explore datasets
- Basic statistical analysis
- Data visualization techniques
- Performance comparison basics

**Best for**: New users, initial data exploration

### 2. Ablation Studies (`02_Ablation_Studies.ipynb`)
**Purpose**: Understand component contributions to scheduler performance

**What you'll learn**:
- Design ablation experiments
- Analyze feature importance
- Statistical significance testing
- Component interaction effects

**Best for**: Algorithm developers, feature analysis

### 3. Baseline Comparison (`03_Baseline_Comparison.ipynb`)
**Purpose**: Comprehensive comparison against standard schedulers

**What you'll learn**:
- Multi-scheduler comparison
- Trade-off analysis
- Scenario-based evaluation
- Decision framework development

**Best for**: Performance evaluation, scheduler selection

### 4. Statistical Analysis (`04_Statistical_Analysis.ipynb`)
**Purpose**: Advanced statistical methods and rigorous analysis

**What you'll learn**:
- Hypothesis testing
- Bootstrap methods
- Effect size analysis
- Confidence intervals

**Best for**: Research, publication-quality analysis

## 📊 Data Requirements

### Required Columns
Your dataset should include these essential columns:

- **scheduler**: Scheduler identifier (string)
- **carbon_efficiency**: Primary carbon efficiency metric (float)
- **energy_consumption**: Energy usage in one consistent unit across all rows, e.g. Wh or kWh (float)
- **performance_score**: Overall performance metric (float)

### Optional Columns
Additional columns for enhanced analysis:

- **response_time**: Request response time (float)
- **throughput**: Requests per second (float)
- **cpu_utilization**: CPU usage percentage (float)
- **memory_utilization**: Memory usage percentage (float)
- **workload_type**: Type of workload (string)
- **node_type**: Node configuration (string)
- **timestamp**: Time of measurement (datetime)

### Data Format Example

```csv
scheduler,carbon_efficiency,energy_consumption,performance_score,workload_type
kubernetes_default,0.75,120.5,0.82,web_service
carbon_aware_v1,0.89,98.2,0.85,web_service
energy_efficient,0.82,105.1,0.88,batch_job
```
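
A minimal sketch of how a file in this format might be checked before analysis, assuming pandas is installed. The helper `load_and_validate` is hypothetical and not a framework function; it only verifies that the required columns exist and that the three metric columns parse as numbers.

```python
import io

import pandas as pd

REQUIRED_COLUMNS = ["scheduler", "carbon_efficiency", "energy_consumption", "performance_score"]
NUMERIC_COLUMNS = REQUIRED_COLUMNS[1:]


def load_and_validate(source):
    """Read a CSV and check the required columns (hypothetical helper)."""
    df = pd.read_csv(source)
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    for col in NUMERIC_COLUMNS:
        # errors="raise" surfaces non-numeric entries instead of turning them into NaN.
        df[col] = pd.to_numeric(df[col], errors="raise")
    return df


# The sample rows from the format example, read from memory instead of a file.
sample_csv = io.StringIO(
    "scheduler,carbon_efficiency,energy_consumption,performance_score,workload_type\n"
    "kubernetes_default,0.75,120.5,0.82,web_service\n"
    "carbon_aware_v1,0.89,98.2,0.85,web_service\n"
    "energy_efficient,0.82,105.1,0.88,batch_job\n"
)
df = load_and_validate(sample_csv)
print(df.dtypes)
```

Passing a file path instead of the `io.StringIO` object works the same way, since `pd.read_csv` accepts both.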

## 🔧 Configuration

### Dataset Configuration (`dataset_config.yaml`)

```yaml
datasets:
  main_dataset:
    description: "Primary evaluation dataset"
    metrics:
      - carbon_efficiency
      - energy_consumption
      - performance_score
    
  baseline_comparison:
    schedulers:
      - kubernetes_default
      - carbon_aware_v1
      - energy_efficient
    
analysis_settings:
  significance_level: 0.05
  bootstrap_iterations: 1000
  confidence_level: 0.95
```

### Environment Variables

```bash
export CARBON_KUBE_DATA_PATH="/path/to/evaluation/data"
export CARBON_KUBE_RESULTS_PATH="/path/to/evaluation/results"
```
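
One way a notebook or script could pick up these variables is sketched below. The helper `resolve_dir` and the fallback paths are assumptions based on the directory layout shown earlier; the framework itself may resolve paths differently.

```python
import os
from pathlib import Path


def resolve_dir(env_var, default):
    """Return the directory named by env_var, or the default when it is unset (hypothetical helper)."""
    return Path(os.environ.get(env_var, default)).expanduser()


# Fallbacks are assumed from the evaluation/ directory tree, not defined by the framework.
data_dir = resolve_dir("CARBON_KUBE_DATA_PATH", "evaluation/data")
results_dir = resolve_dir("CARBON_KUBE_RESULTS_PATH", "evaluation/results")
print(f"data: {data_dir}\nresults: {results_dir}")
```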

## 📈 Analysis Workflow

### Recommended Analysis Sequence

1. **Data Preparation**
   - Load and validate datasets
   - Check data quality and completeness
   - Handle missing values and outliers

2. **Exploratory Analysis**
   - Basic descriptive statistics
   - Data distribution analysis
   - Correlation analysis

3. **Baseline Comparison**
   - Compare against standard schedulers
   - Identify best-performing configurations
   - Analyze trade-offs

4. **Statistical Validation**
   - Hypothesis testing
   - Effect size analysis
   - Confidence intervals

5. **Specialized Studies**
   - Ablation studies for feature importance
   - Scenario-based analysis
   - Sensitivity analysis

6. **Results Interpretation**
   - Generate recommendations
   - Create summary reports
   - Export findings
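
The trade-off step in the sequence above could be approached in several ways. The sketch below uses a Pareto front, which is a swapped-in technique chosen for illustration and not necessarily what the notebooks do. It assumes higher values are better for both `carbon_efficiency` and `performance_score`, and it reuses the three sample rows from the data format example even though they come from different workload types.

```python
def pareto_front(rows, maximize=("carbon_efficiency", "performance_score")):
    """Return the rows that no other row dominates on the given metrics (higher is better)."""

    def dominates(a, b):
        # a dominates b if it is at least as good everywhere and strictly better somewhere.
        return all(a[m] >= b[m] for m in maximize) and any(a[m] > b[m] for m in maximize)

    return [row for row in rows if not any(dominates(other, row) for other in rows)]


rows = [
    {"scheduler": "kubernetes_default", "carbon_efficiency": 0.75, "performance_score": 0.82},
    {"scheduler": "carbon_aware_v1", "carbon_efficiency": 0.89, "performance_score": 0.85},
    {"scheduler": "energy_efficient", "carbon_efficiency": 0.82, "performance_score": 0.88},
]

front = pareto_front(rows)
print([row["scheduler"] for row in front])
```

Schedulers on the front represent different compromises between the two metrics; choosing among them still depends on priorities the data cannot settle.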

## 🎯 Best Practices

### Statistical Analysis
- **Check Assumptions**: Verify normality and equal variances before running parametric tests
- **Use Appropriate Tests**: Choose parametric vs non-parametric based on data
- **Multiple Comparisons**: Apply corrections (Bonferroni, FDR) when testing multiple hypotheses
- **Effect Size**: Always report effect sizes alongside p-values
- **Confidence Intervals**: Provide confidence intervals for estimates
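
To make the multiple-comparisons item concrete, here is a small pure-Python sketch of the Bonferroni and Benjamini-Hochberg procedures. The function names and the p-values are illustrative only. In practice a library routine such as `multipletests` from `statsmodels.stats.multitest` may be preferable; this version just shows the logic.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up procedure controlling the false discovery rate at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values from five pairwise scheduler comparisons.
p_values = [0.001, 0.012, 0.025, 0.045, 0.200]
print("Bonferroni:        ", bonferroni(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg(p_values))
```

Bonferroni controls the family-wise error rate and is the more conservative of the two; Benjamini-Hochberg controls the false discovery rate and typically rejects more hypotheses.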

### Data Quality
- **Validate Data**: Check for missing values, outliers, inconsistencies
- **Document Sources**: Keep track of data sources and collection methods
- **Version Control**: Track dataset versions and changes
- **Reproducibility**: Ensure the analysis can be reproduced from the same data

### Reporting
- **Clear Metrics**: Define all metrics and their units
- **Statistical Details**: Report test statistics, p-values, effect sizes
- **Visualizations**: Use appropriate charts for different data types
- **Limitations**: Acknowledge analysis limitations and assumptions

## 🔍 Troubleshooting

### Common Issues

#### Data Loading Problems
- **File not found**: Check file paths and working directory
- **Encoding issues**: Specify encoding when reading CSV files
- **Column mismatches**: Verify column names match expected format

#### Statistical Analysis Issues
- **Small sample sizes**: Use non-parametric tests or bootstrap methods
- **Non-normal data**: Apply transformations or use robust methods
- **Missing values**: Handle appropriately (imputation, exclusion)
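
One possible way to act on the first two points is to check assumptions and fall back to a non-parametric test when they fail. The sketch below assumes SciPy is installed; the function `compare_two_groups` and the sample values are hypothetical. Note that assumption tests themselves have low power on small samples, so the result is a guide and not a substitute for inspecting the data.

```python
from scipy import stats


def compare_two_groups(a, b, alpha=0.05):
    """Pick a two-sample test based on simple assumption checks (illustrative heuristic)."""
    # Shapiro-Wilk: a small p-value suggests the sample is not normally distributed.
    normal = stats.shapiro(a).pvalue > alpha and stats.shapiro(b).pvalue > alpha
    if normal:
        # Levene: a small p-value suggests unequal variances, so use Welch's t-test.
        equal_var = stats.levene(a, b).pvalue > alpha
        result = stats.ttest_ind(a, b, equal_var=equal_var)
        name = "Student t-test" if equal_var else "Welch t-test"
    else:
        result = stats.mannwhitneyu(a, b, alternative="two-sided")
        name = "Mann-Whitney U"
    return name, float(result.pvalue)


# Made-up carbon_efficiency values, for illustration only.
carbon_aware = [0.89, 0.91, 0.87, 0.90, 0.88, 0.92]
default = [0.75, 0.78, 0.74, 0.77, 0.73, 0.76]

test_name, p_value = compare_two_groups(carbon_aware, default)
print(f"{test_name}: p = {p_value:.4f}")
```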

#### Performance Issues
- **Large datasets**: Use sampling or chunked processing
- **Memory errors**: Reduce data size or use more efficient data types
- **Slow computations**: Consider parallel processing or approximations
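
For the large-dataset case, chunked processing with pandas might look like the sketch below. The helper `chunked_mean_by_scheduler` is hypothetical; it accumulates per-scheduler sums and counts chunk by chunk so the full file never has to fit in memory. The tiny `chunksize` in the demo exists only to exercise the chunking on a four-row sample.

```python
import io

import pandas as pd


def chunked_mean_by_scheduler(source, metric, chunksize=100_000):
    """Compute per-scheduler means of one metric without loading the whole CSV (hypothetical helper)."""
    sums, counts = {}, {}
    # usecols limits memory further by reading only the two columns that are needed.
    for chunk in pd.read_csv(source, usecols=["scheduler", metric], chunksize=chunksize):
        grouped = chunk.groupby("scheduler")[metric].agg(["sum", "count"])
        for scheduler, row in grouped.iterrows():
            sums[scheduler] = sums.get(scheduler, 0.0) + float(row["sum"])
            counts[scheduler] = counts.get(scheduler, 0) + int(row["count"])
    return {scheduler: sums[scheduler] / counts[scheduler] for scheduler in sums}


# Made-up rows, for illustration only.
demo_csv = io.StringIO(
    "scheduler,carbon_efficiency\n"
    "kubernetes_default,0.75\n"
    "carbon_aware_v1,0.89\n"
    "kubernetes_default,0.77\n"
    "carbon_aware_v1,0.91\n"
)
means = chunked_mean_by_scheduler(demo_csv, "carbon_efficiency", chunksize=2)
print(means)
```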

### Getting Help
- Check notebook documentation and comments
- Review error messages carefully
- Consult statistical references for method details
- Use built-in help functions (`help()`, `?`)

## 📚 References and Resources

### Statistical Methods
- **Hypothesis Testing**: Neyman-Pearson framework, Type I/II errors
- **Effect Sizes**: Cohen's conventions for small/medium/large effects
- **Bootstrap Methods**: Efron & Tibshirani (1993)
- **Multiple Comparisons**: Benjamini-Hochberg procedure

### Carbon-Efficient Computing
- Green computing principles
- Energy-aware scheduling algorithms
- Carbon footprint measurement
- Sustainable computing practices

### Tools and Libraries
- **pandas**: Data manipulation and analysis
- **scipy**: Statistical functions and tests
- **statsmodels**: Advanced statistical modeling
- **scikit-learn**: Machine learning and preprocessing
- **matplotlib/seaborn**: Data visualization

## 🚀 Next Steps

Ready to start your analysis? Here's what to do next:

1. **📖 Read the Getting Started notebook**: `01_Getting_Started.ipynb`
2. **📊 Prepare your data**: Ensure it matches the required format
3. **⚙️ Configure the framework**: Set up your dataset configuration
4. **🔬 Run your first analysis**: Start with basic comparisons
5. **📈 Explore advanced features**: Try ablation studies and statistical analysis

### Quick Navigation
- [Getting Started →](01_Getting_Started.ipynb)
- [Baseline Comparison →](03_Baseline_Comparison.ipynb)
- [Ablation Studies →](02_Ablation_Studies.ipynb)
- [Statistical Analysis →](04_Statistical_Analysis.ipynb)

---

**Happy analyzing! 🎉**

*The Carbon-Kube evaluation framework is designed to make rigorous performance evaluation accessible and reproducible. If you have questions or suggestions, please don't hesitate to reach out.*