### **Pipeline.run_benchmark Tutorial**

The `Pipeline.run_benchmark` method in the SAGED library runs an end-to-end benchmark in three stages: generating responses, extracting features from them, and analyzing the results. This notebook shows how to configure and run the pipeline.

In [None]:

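# `your_module` is a placeholder; import Pipeline from wherever it lives in your SAGED installation
from your_module.pipeline import Pipeline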
import pandas as pd

# Example function for generation (Replace with actual generation logic)
def example_function():
    return "Generated data example"


#### **Define Configurations**

In [None]:

config = {
    "generation": {
        "require": True,
        "generate_dict": {
            "example_generation": example_function  # Replace with a valid function
        },
        "generation_saving_location": "output/generated_benchmark.csv",
        "generation_list": ["example_generation"]
    },
    "extraction": {
        "feature_extractors": ["sentiment_classification", "toxicity_classification"],
        "calibration": True,
        "extraction_saving_location": "output/extracted_features.csv",
        "extractor_configs": {
            "sentiment_classification": {"some_param": "value"}  # Placeholder; replace with the extractor's actual parameters
        }
    },
    "analysis": {
        "specifications": ["concept", "source_tag"],
        "analyzers": ["mean", "selection_rate"],
        "analyzer_configs": {
            "selection_rate": {"standard_by": "mean"}
        },
        "statistics_saving_location": "output/statistics.csv",
        "disparity_saving_location": "output/disparities.csv"
    }
}
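
The saving locations above all point into an `output/` directory. This tutorial does not say whether `run_benchmark` creates missing directories, so one cautious option is to create them beforehand. The sketch below assumes that every key ending in `_saving_location` holds a file path; `ensure_output_dirs` is a hypothetical helper name, not part of SAGED.

```python
from pathlib import Path


def ensure_output_dirs(config):
    """Create the parent directory of every '*_saving_location' path in config.

    Hypothetical helper, not part of SAGED. Returns the sorted list of
    directories it made sure exist.
    """
    dirs = set()
    for section in config.values():
        if not isinstance(section, dict):
            continue
        for key, value in section.items():
            if key.endswith("_saving_location") and isinstance(value, str):
                parent = Path(value).parent
                # exist_ok=True makes the call safe to repeat
                parent.mkdir(parents=True, exist_ok=True)
                dirs.add(str(parent))
    return sorted(dirs)


# Trimmed-down config using the same saving-location keys as the example above
example_config = {
    "generation": {"generation_saving_location": "output/generated_benchmark.csv"},
    "extraction": {"extraction_saving_location": "output/extracted_features.csv"},
    "analysis": {
        "statistics_saving_location": "output/statistics.csv",
        "disparity_saving_location": "output/disparities.csv",
    },
}

print(ensure_output_dirs(example_config))
```

Calling the helper on the full `config` before `run_benchmark` should work the same way, since it skips keys that are not saving locations.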


#### **Running the Benchmark**

In [None]:

# Define the domain
domain = "example_domain"

# Run the benchmark
Pipeline.run_benchmark(config=config, domain=domain)


#### **Outputs**


A run writes four files, at the locations set in `config`:
1. **Generated Benchmark**: Saved to `config["generation"]["generation_saving_location"]`.
2. **Extracted Features**: Saved to `config["extraction"]["extraction_saving_location"]`.
3. **Analysis Statistics**: Saved to `config["analysis"]["statistics_saving_location"]`.
4. **Disparities**: Saved to `config["analysis"]["disparity_saving_location"]`.
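
As a rough sketch of how these four locations can be read back out of the configuration, the helper below collects them and reports which files are already on disk. `output_report` and `OUTPUT_KEYS` are hypothetical names, not SAGED functions; the key names are taken from the example config in this notebook.

```python
from pathlib import Path

# (section, key) pairs for the four outputs listed above
OUTPUT_KEYS = [
    ("generation", "generation_saving_location"),
    ("extraction", "extraction_saving_location"),
    ("analysis", "statistics_saving_location"),
    ("analysis", "disparity_saving_location"),
]


def output_report(config):
    """Map each configured output path to True if the file exists, else False."""
    report = {}
    for section, key in OUTPUT_KEYS:
        path = config.get(section, {}).get(key)
        if path is not None:
            report[path] = Path(path).is_file()
    return report


# Same saving locations as the example config in this notebook
example_config = {
    "generation": {"generation_saving_location": "output/generated_benchmark.csv"},
    "extraction": {"extraction_saving_location": "output/extracted_features.csv"},
    "analysis": {
        "statistics_saving_location": "output/statistics.csv",
        "disparity_saving_location": "output/disparities.csv",
    },
}

for path, exists in output_report(example_config).items():
    print(f"{path}: {'found' if exists else 'missing'}")
```

Running this after the benchmark gives a quick check that every stage wrote its file before any of them is loaded.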


#### **Example Output Inspection**

In [None]:

# Load generated benchmark
generated_benchmark = pd.read_csv("output/generated_benchmark.csv")
print(generated_benchmark.head())

# Load extracted features
extracted_features = pd.read_csv("output/extracted_features.csv")
print(extracted_features.head())

# Load analysis statistics
statistics = pd.read_csv("output/statistics.csv")
print(statistics.head())

# Load disparities
disparities = pd.read_csv("output/disparities.csv")
print(disparities.head())


#### **Notes**


- Make sure the directories used by the saving locations in `config` (here `output/`) exist, or adjust the paths to your environment.
- Check that every entry in `config["generation"]["generate_dict"]` is a working function before running, to avoid runtime errors.
- Review the outputs and refine the configuration as needed.
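
The check on `generate_dict` suggested in these notes can be sketched as a small pre-flight function. This is a minimal sketch, not part of SAGED: `check_generation_config` is a hypothetical name, and the rule that every name in `generation_list` needs a matching key in `generate_dict` is an assumption inferred from the example config.

```python
def check_generation_config(generation):
    """Return a list of problems found in a 'generation' config section.

    Hypothetical pre-flight check, not part of SAGED. An empty list means
    no problems were found by these two checks.
    """
    problems = []
    generate_dict = generation.get("generate_dict", {})

    # Every registered generator should be something that can be called
    for name, func in generate_dict.items():
        if not callable(func):
            problems.append(f"'{name}' is not callable")

    # Assumption: each requested generation needs a registered function
    for name in generation.get("generation_list", []):
        if name not in generate_dict:
            problems.append(f"'{name}' is listed but has no function in generate_dict")

    return problems


def example_function():
    return "Generated data example"


generation = {
    "generate_dict": {"example_generation": example_function},
    "generation_list": ["example_generation"],
}

problems = check_generation_config(generation)
print(problems or "generation config looks consistent")
```

The function only inspects the dictionary and never calls the generators, so it is cheap to run before a long benchmark.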
