# Impact Engine Loop

End-to-end demonstration of the Impact Engine Orchestrator.

The orchestrator runs a five-stage pipeline:

1. **MEASURE (pilot)** — estimate causal effects for all initiatives
2. **EVALUATE** — score confidence based on methodology
3. **ALLOCATE** — select a portfolio under budget constraints
4. **MEASURE (scale)** — re-measure selected initiatives at larger sample size
5. **REPORT** — compare predicted vs actual returns

This notebook uses the real `Measure` adapter (wrapping `impact_engine`), the
real `Evaluate` component (from `impact_engine_evaluate`), and a mock allocation
component (`MockAllocate`). As real components are integrated, only the
constructor calls change; the orchestrator logic stays the same.
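
To make the data flow between the five stages concrete, here is a minimal sketch that chains them with plain callables. `run_pipeline`, `toy_measure`, `toy_evaluate`, and `toy_allocate` are hypothetical names invented for illustration; they are not the `Orchestrator` API, and the numbers are placeholders.

```python
def run_pipeline(initiative_ids, measure, evaluate, allocate, scale_sample_size):
    """Chain the five stages; every stage is passed in as a plain callable."""
    # 1. MEASURE (pilot): one effect estimate per initiative
    pilot = {iid: measure(iid, sample_size=None) for iid in initiative_ids}
    # 2. EVALUATE: one confidence score per pilot estimate
    scores = {iid: evaluate(estimate) for iid, estimate in pilot.items()}
    # 3. ALLOCATE: choose which initiatives to scale
    selected = allocate(pilot, scores)
    # 4. MEASURE (scale): re-measure only the selected initiatives
    scale = {iid: measure(iid, sample_size=scale_sample_size) for iid in selected}
    # 5. REPORT: pair each pilot prediction with its scale measurement
    return [
        {"initiative_id": iid, "predicted": pilot[iid], "actual": scale[iid]}
        for iid in selected
    ]


# Toy stand-ins (assumptions, not the real components).
def toy_measure(iid, sample_size=None):
    return 0.10 if sample_size is None else 0.08


def toy_evaluate(estimate):
    return 0.5


def toy_allocate(pilot, scores):
    return sorted(pilot)[:1]  # keep only the first initiative alphabetically


reports = run_pipeline(
    ["a", "b"], toy_measure, toy_evaluate, toy_allocate, scale_sample_size=5000
)
print(reports)
```

Because each stage is injected, swapping a mock for a real component only changes what is passed in, which mirrors the constructor-level swap described above.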

## Setup

In [None]:
import tempfile
from pathlib import Path

import pandas as pd
import yaml

from impact_engine_orchestrator.config import InitiativeConfig, PipelineConfig
from impact_engine_orchestrator.orchestrator import Orchestrator
from impact_engine_orchestrator.components.measure.measure import Measure
from impact_engine_evaluate import Evaluate
from impact_engine_orchestrator.components.allocate.mock import MockAllocate

# Create working directory with products data and measure configs
_workdir = Path(tempfile.mkdtemp())
_products_path = _workdir / "products.csv"
pd.DataFrame(
    {
        "product_id": [f"prod_{i:03d}" for i in range(5)],
        "name": [f"Product {i}" for i in range(5)],
        "category": ["Electronics"] * 5,
        "price": [99.99, 149.99, 79.99, 59.99, 199.99],
    }
).to_csv(_products_path, index=False)

_measure_config = {
    "DATA": {
        "SOURCE": {
            "type": "simulator",
            "CONFIG": {
                "path": str(_products_path),
                "mode": "rule",
                "seed": 42,
                "start_date": "2024-01-01",
                "end_date": "2024-01-31",
            },
        },
        "TRANSFORM": {"FUNCTION": "aggregate_by_date", "PARAMS": {"metric": "revenue"}},
    },
    "MEASUREMENT": {
        "MODEL": "interrupted_time_series",
        "PARAMS": {"dependent_variable": "revenue", "intervention_date": "2024-01-15"},
    },
}

INITIATIVE_IDS = [
    "product-desc-enhancement",
    "checkout-flow-optimization",
    "search-relevance-tuning",
    "pricing-display-test",
    "recommendation-engine-v2",
]
_storage_url = str(_workdir / "storage")


def _measure_config_path(initiative_id):
    path = _workdir / f"{initiative_id}.yaml"
    if not path.exists():
        with open(path, "w") as f:
            yaml.dump(_measure_config, f)
    return str(path)

## Configure the Pipeline

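Parameters are set at two levels:
- **Problem-level** (`PipelineConfig`): `budget`, `scale_sample_size`, plus the execution setting `max_workers`
- **Initiative-level** (`InitiativeConfig`): `initiative_id`, `cost_to_scale`, and the path to its `measure_config`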

In [None]:
# Cost to scale each initiative, in the same units as `budget`
COST_TO_SCALE = {
    "product-desc-enhancement": 15_000,
    "checkout-flow-optimization": 25_000,
    "search-relevance-tuning": 20_000,
    "pricing-display-test": 10_000,
    "recommendation-engine-v2": 30_000,
}

initiatives = [
    InitiativeConfig(
        initiative_id,
        cost_to_scale=COST_TO_SCALE[initiative_id],
        measure_config=_measure_config_path(initiative_id),
    )
    for initiative_id in INITIATIVE_IDS
]

config = PipelineConfig(
    budget=100_000,
    scale_sample_size=5000,
    max_workers=4,
    initiatives=initiatives,
)
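
As a quick sanity check on these numbers, the sketch below sums the per-initiative `cost_to_scale` values and compares the total with `budget`. It copies the values from the configuration above into a plain dict, so it does not depend on `PipelineConfig`; the variable names are illustrative.

```python
# Values copied from the configuration above.
costs = {
    "product-desc-enhancement": 15_000,
    "checkout-flow-optimization": 25_000,
    "search-relevance-tuning": 20_000,
    "pricing-display-test": 10_000,
    "recommendation-engine-v2": 30_000,
}
budget = 100_000

total_cost = sum(costs.values())
fits = total_cost <= budget
print(f"Total cost to scale: ${total_cost:,} (budget ${budget:,}, fits={fits})")
```

With these values the total cost equals the budget, so cost alone does not force any initiative out of the portfolio. Whether the allocator actually selects all five depends on its own selection rule.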

## Create and Run the Orchestrator

In [None]:
orchestrator = Orchestrator(
    measure=Measure(initiatives=initiatives, storage_url=_storage_url),
    evaluate=Evaluate(),
    allocate=MockAllocate(),
    config=config,
)

result = orchestrator.run()

## Stage 1 — Pilot Measurements

Each initiative gets a causal effect estimate with a confidence interval (`ci_lower`, `ci_upper`).

In [None]:
for p in result["pilot_results"]:
    print(
        f"{p['initiative_id']:.<40s} "
        f"effect={p['effect_estimate']:.2%}  "
        f"CI=[{p['ci_lower']:.2%}, {p['ci_upper']:.2%}]  "
        f"model={p['model_type'].value}"
    )

## Stage 2 — Evaluation Scores

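Confidence scores reflect methodology quality (experiments score highest).
In this notebook every initiative shares the same measure config, so all of
them are scored as `interrupted_time_series`.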

In [None]:
for e in result["evaluate_results"]:
    print(
        f"{e['initiative_id']:.<40s} confidence={e['confidence']:.2f}"
        f"  return_median={e['return_median']:.2%}  cost=${e['cost']:,.0f}"
    )
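
The idea of scoring confidence by methodology can be sketched as a simple lookup. The method names other than `interrupted_time_series` and all of the scores below are hypothetical, chosen only so that experiments rank highest; they are not the values or the logic used by `impact_engine_evaluate`.

```python
# Hypothetical scores (assumption): only the ordering matters for this sketch.
METHOD_CONFIDENCE = {
    "experiment": 0.9,
    "quasi_experiment": 0.7,
    "interrupted_time_series": 0.5,
    "observational": 0.3,
}


def score_confidence(model_type: str) -> float:
    """Return the illustrative confidence score for a methodology name."""
    return METHOD_CONFIDENCE[model_type]


best = max(METHOD_CONFIDENCE, key=METHOD_CONFIDENCE.get)
print(best, score_confidence("interrupted_time_series"))
```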

## Stage 3 — Allocation

The allocator selects initiatives by confidence-weighted return until the budget is exhausted.

In [None]:
alloc = result["allocate_result"]
print(f"Selected {len(alloc['selected_initiatives'])} of {len(config.initiatives)} initiatives\n")

for iid in alloc["selected_initiatives"]:
    print(
        f"  {iid:.<40s} budget=${alloc['budget_allocated'][iid]:,.0f}  predicted={alloc['predicted_returns'][iid]:.2%}"
    )

total = sum(alloc["budget_allocated"].values())
print(f"\nTotal allocated: ${total:,.0f} / ${config.budget:,.0f}")
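
One way to read "confidence-weighted return until the budget is exhausted" is a greedy rule: rank by `confidence * return_median`, then take initiatives in that order while they still fit. The sketch below shows that rule on made-up candidates. `greedy_allocate` is a hypothetical helper, and this is an assumption about the general technique, not a description of how `MockAllocate` is implemented.

```python
def greedy_allocate(candidates, budget):
    """Pick candidates by confidence-weighted return while they fit in the budget."""
    ranked = sorted(
        candidates,
        key=lambda c: c["confidence"] * c["return_median"],
        reverse=True,
    )
    selected, remaining = [], budget
    for c in ranked:
        if c["cost"] <= remaining:  # skip anything that no longer fits
            selected.append(c["initiative_id"])
            remaining -= c["cost"]
    return selected, budget - remaining


# Illustrative inputs only; not output from the pipeline.
candidates = [
    {"initiative_id": "a", "confidence": 0.9, "return_median": 0.10, "cost": 60},
    {"initiative_id": "b", "confidence": 0.5, "return_median": 0.30, "cost": 50},
    {"initiative_id": "c", "confidence": 0.8, "return_median": 0.05, "cost": 40},
]
selected, spent = greedy_allocate(candidates, budget=100)
print(selected, spent)
```

Note that this variant keeps scanning after an initiative fails to fit, so a cheaper, lower-ranked initiative can still be selected. A stricter variant would stop at the first one that does not fit.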

## Stage 4 & 5 — Scale Measurement and Outcome Reports

Selected initiatives are re-measured at `scale_sample_size=5000`.
The outcome report compares pilot predictions against scale actuals.

In [None]:
for report in result["outcome_reports"]:
    print(report["initiative_id"])
    print(f"  Predicted:  {report['predicted_return']:.2%}")
    print(f"  Actual:     {report['actual_return']:.2%}")
    print(f"  Error:      {report['prediction_error']:+.2%}")
    print(f"  Confidence: {report['confidence_score']:.2f} ({report['model_type'].value})")
    print(f"  Samples:    {report['sample_size_pilot']} → {report['sample_size_scale']}")
    print()
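
For intuition about the `prediction_error` field, the sketch below builds a report-like dict from a predicted and an actual return. It assumes the error is defined as actual minus predicted; the sign convention used by the real report is not stated in this notebook. `make_outcome_report` and the input values are hypothetical.

```python
def make_outcome_report(initiative_id, predicted_return, actual_return):
    """Build a report-like dict; error = actual - predicted (assumed convention)."""
    return {
        "initiative_id": initiative_id,
        "predicted_return": predicted_return,
        "actual_return": actual_return,
        "prediction_error": actual_return - predicted_return,
    }


# Illustrative values only; not output from the pipeline.
example = make_outcome_report("demo-initiative", predicted_return=0.10, actual_return=0.08)
error_text = f"{example['prediction_error']:+.2%}"
print(error_text)
```

Under this convention a negative error means the pilot overestimated the return observed at scale.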

## Determinism Check

The simulator uses a fixed seed (`"seed": 42` in the measure config), so repeated runs should produce identical results.

In [None]:
result2 = orchestrator.run()
assert result["pilot_results"] == result2["pilot_results"]
assert result["outcome_reports"] == result2["outcome_reports"]
print("Determinism verified — identical results across runs.")
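
The same principle can be shown in isolation with the standard library: a generator seeded with a fixed value yields the same sequence every time it is recreated. `simulate_revenue` is a hypothetical stand-in, not the simulator used by `impact_engine`.

```python
import random


def simulate_revenue(seed, days=5):
    """Return `days` pseudo-random daily revenue figures from a private, seeded generator."""
    rng = random.Random(seed)  # private generator, so no shared global state
    return [round(rng.gauss(1000, 50), 2) for _ in range(days)]


first = simulate_revenue(seed=42)
second = simulate_revenue(seed=42)
print("same seed, same series:", first == second)
```

Using a private `random.Random` instance rather than the module-level functions keeps the result independent of any other code that draws random numbers in between runs.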