# Metrics-Based Impact Approximation

This notebook demonstrates **metrics-based impact approximation** using `evaluate_impact()`.

## Workflow Overview

1. User provides a product catalog (`products.csv`)
2. User configures the `DATA.ENRICHMENT` section of the YAML config
3. User calls `evaluate_impact()` with the config path and an output directory
4. Engine handles the rest internally (adapter, enrichment, transform, model)

## Setup

In [None]:
import json
from pathlib import Path

from impact_engine import evaluate_impact
from online_retail_simulator import simulate

## Step 1: Create Products Catalog

This demo simulates a catalog with `online_retail_simulator`. In production, you would supply your actual product catalog instead.

In [None]:
output_path = Path("output/demo_metrics_approximation")
output_path.mkdir(parents=True, exist_ok=True)

job_info = simulate("configs/demo_metrics_approximation_catalog.yaml", job_id="catalog")
products = job_info.load_df("products")

print(f"Created: {job_info.get_store().full_path('products.csv')}")
print(f"Products: {len(products)}")
products

## Step 2: Configure Metrics Approximation

The YAML config file sets up the impact engine through three sections:
- **ENRICHMENT**: Quality boost parameters
- **TRANSFORM**: Prepare data for approximation
- **MODEL**: `metrics_approximation` with response function

In [None]:
config_path = "configs/demo_metrics_approximation.yaml"

## Step 3: Run Impact Evaluation

A single call to `evaluate_impact()` handles everything:
- Engine creates `CatalogSimulatorAdapter`
- Adapter simulates metrics
- Adapter generates `product_details`
- Adapter applies enrichment (quality boost)
- Transform extracts `quality_before`/`quality_after`
- `MetricsApproximationAdapter` computes impact
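
To make the last two steps concrete, the sketch below shows one way a response function could turn a quality change into a per-product impact. It is an illustration only, not the engine's implementation: `ProductMetrics`, `linear_response`, `approximate_impact`, and the `coefficient` value are all hypothetical, and the linear form is an assumption. Only the field names `quality_before`, `quality_after`, `delta_metric`, `baseline_outcome`, and `impact` are taken from this notebook.

```python
from dataclasses import dataclass


@dataclass
class ProductMetrics:
    """Hypothetical container for the values the transform step extracts."""
    product_id: str
    quality_before: float
    quality_after: float
    baseline_outcome: float


def linear_response(delta_metric: float, baseline_outcome: float,
                    coefficient: float = 0.5) -> float:
    """Assumed response function: impact scales with the quality change
    and the baseline outcome. The coefficient is a made-up placeholder."""
    return coefficient * delta_metric * baseline_outcome


def approximate_impact(products, response=linear_response):
    """Compute one row per product with delta, baseline, and impact."""
    rows = []
    for p in products:
        delta = p.quality_after - p.quality_before
        rows.append({
            "product_id": p.product_id,
            "delta_metric": delta,
            "baseline_outcome": p.baseline_outcome,
            "impact": response(delta, p.baseline_outcome),
        })
    return rows


# Illustrative inputs, not real catalog data.
demo = [
    ProductMetrics("P001", 0.25, 0.75, 100.0),
    ProductMetrics("P002", 0.50, 0.50, 200.0),
]
rows = approximate_impact(demo)
total_impact = sum(r["impact"] for r in rows)
print(rows[0]["impact"], rows[1]["impact"], total_impact)
```

In this toy setup a product whose quality does not change contributes zero impact, which is the behavior one would expect from any response function that is anchored at a zero delta.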

In [None]:
results_path = evaluate_impact(config_path, str(output_path), job_id="results")
print(f"Results saved to: {results_path}")

## Step 4: Review Results

In [None]:
with open(results_path) as f:
    results = json.load(f)

data = results["data"]
model_params = data["model_params"]
estimates = data["impact_estimates"]
summary = data["model_summary"]

print("=" * 60)
print("METRICS-BASED IMPACT APPROXIMATION RESULTS")
print("=" * 60)

print(f"\nModel Type: {results['model_type']}")
print(f"Response Function: {model_params['response_function']}")

print("\n--- Aggregate Impact Estimates ---")
print(f"Total Impact:        ${estimates['impact']:.2f}")
print(f"Number of Products:  {summary['n_products']}")
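
The cell above indexes straight into the results dictionary, so a missing section surfaces as a `KeyError`. A small check like the following sketch can report every missing section at once. The key names come from the cell above; the helper `missing_result_keys` and the example values are hypothetical.

```python
# Keys read by the review cell in this notebook.
REQUIRED_TOP_LEVEL = ("model_type", "data")
REQUIRED_DATA_KEYS = ("model_params", "impact_estimates", "model_summary")


def missing_result_keys(results: dict) -> list[str]:
    """Return the dotted names of expected keys that are absent."""
    missing = [k for k in REQUIRED_TOP_LEVEL if k not in results]
    data = results.get("data", {})
    missing += [f"data.{k}" for k in REQUIRED_DATA_KEYS if k not in data]
    return missing


# Placeholder payload for illustration; values are not real engine output.
example = {
    "model_type": "metrics_approximation",
    "data": {
        "model_params": {"response_function": "placeholder"},
        "impact_estimates": {"impact": 0.0},
    },
}
print(missing_result_keys(example))
```

Calling `missing_result_keys(results)` right after `json.load(f)` and stopping when the list is non-empty gives a clearer message than a bare `KeyError`.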

In [None]:
import pandas as pd

# Per-product impacts are stored in a separate Parquet file, prefixed with the model type
results_dir = Path(results_path).parent
per_product_path = results_dir / "metrics_approximation__product_level_impacts.parquet"
per_product_df = pd.read_parquet(per_product_path)

print("\n--- Per-Product Breakdown ---")
print("-" * 60)
print(f"{'Product':<20} {'Delta Quality':<15} {'Baseline':<12} {'Impact':<12}")
print("-" * 60)
for _, p in per_product_df.iterrows():
    print(
        f"{p['product_id']:<20} {p['delta_metric']:<15.4f} "
        f"${p['baseline_outcome']:<11.2f} ${p['impact']:<11.2f}"
    )

print("\n" + "=" * 60)
print("Demo Complete!")
print("=" * 60)
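
As an optional sanity check, the aggregate estimate can be compared with the per-product breakdown. The sketch below assumes the total impact is the plain sum of per-product impacts, which this notebook does not confirm; if the engine aggregates differently, a mismatch is expected. The helper `impacts_reconcile` and its tolerances are hypothetical.

```python
import math


def impacts_reconcile(total_impact, per_product_impacts,
                      rel_tol=1e-6, abs_tol=1e-9):
    """Return True if the total matches the sum of per-product impacts.

    Assumes the aggregate is a simple sum. math.fsum reduces floating-point
    accumulation error when there are many products.
    """
    return math.isclose(
        total_impact,
        math.fsum(per_product_impacts),
        rel_tol=rel_tol,
        abs_tol=abs_tol,
    )


# Illustrative numbers only.
print(impacts_reconcile(30.0, [10.0, 20.0]))
print(impacts_reconcile(30.0, [10.0, 15.0]))
```

With the variables from the cells above, the call would be `impacts_reconcile(estimates["impact"], per_product_df["impact"])`.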