# Impact Engine Loop

End-to-end demonstration of the Impact Engine Orchestrator.

The orchestrator runs a five-stage pipeline:

1. **MEASURE (pilot)** — estimate causal effects for all initiatives
2. **EVALUATE** — score confidence based on methodology
3. **ALLOCATE** — select a portfolio under budget constraints
4. **MEASURE (scale)** — re-measure selected initiatives at larger sample size
5. **REPORT** — compare predicted vs actual returns

This notebook uses mock components. As real components are integrated,
only the constructor calls change — the orchestrator logic stays the same.
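
The swap-in claim above can be illustrated with a minimal sketch. Everything here is hypothetical: the `Measure` protocol, its `measure(initiative_id, sample_size)` signature, and both stand-in classes are assumptions made for illustration, not the package's actual interface.

```python
from typing import Protocol


class Measure(Protocol):
    """Hypothetical interface: any object with a matching `measure` method qualifies."""

    def measure(self, initiative_id: str, sample_size: int) -> dict: ...


class MockMeasureSketch:
    """Stand-in mock that returns a fixed effect regardless of input."""

    def measure(self, initiative_id: str, sample_size: int) -> dict:
        return {"initiative_id": initiative_id, "effect_estimate": 0.05, "sample_size": sample_size}


class RealMeasureSketch:
    """Stand-in for a real component; the body is a placeholder, not a real estimator."""

    def measure(self, initiative_id: str, sample_size: int) -> dict:
        return {"initiative_id": initiative_id, "effect_estimate": 0.03, "sample_size": sample_size}


def run_pilot(measure: Measure, initiative_ids: list[str], sample_size: int) -> list[dict]:
    # The caller depends only on the interface, so either class can be passed in.
    return [measure.measure(iid, sample_size) for iid in initiative_ids]


pilot_mock = run_pilot(MockMeasureSketch(), ["a", "b"], 100)
pilot_real = run_pilot(RealMeasureSketch(), ["a", "b"], 100)
```

Only the object handed to `run_pilot` differs between the two calls, which is the property the notebook relies on when mocks are replaced.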

## Setup

In [None]:
from impact_engine_orchestrator.config import InitiativeConfig, PipelineConfig
from impact_engine_orchestrator.orchestrator import Orchestrator
from impact_engine_orchestrator.components.measure.mock import MockMeasure
from impact_engine_orchestrator.components.evaluate.mock import MockEvaluate
from impact_engine_orchestrator.components.allocate.mock import MockAllocate

## Configure the Pipeline

The configuration sets parameters at two levels:
- **Problem-level** (on `PipelineConfig`): `budget`, `scale_sample_size`
- **Initiative-level** (on `InitiativeConfig`): `initiative_id`, `cost_to_scale`

In [None]:
config = PipelineConfig(
    budget=100_000,
    scale_sample_size=5000,
    max_workers=4,
    initiatives=[
        InitiativeConfig("product-desc-enhancement", cost_to_scale=15_000),
        InitiativeConfig("checkout-flow-optimization", cost_to_scale=25_000),
        InitiativeConfig("search-relevance-tuning", cost_to_scale=20_000),
        InitiativeConfig("pricing-display-test", cost_to_scale=10_000),
        InitiativeConfig("recommendation-engine-v2", cost_to_scale=30_000),
    ],
)
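
As a quick sanity check on the numbers above, the sketch below sums the five `cost_to_scale` values and compares the total with `budget`. It uses a plain dictionary copied from the configuration instead of the package's config classes, so it runs on its own.

```python
# Costs copied from the configuration above, keyed by initiative_id.
costs = {
    "product-desc-enhancement": 15_000,
    "checkout-flow-optimization": 25_000,
    "search-relevance-tuning": 20_000,
    "pricing-display-test": 10_000,
    "recommendation-engine-v2": 30_000,
}
budget = 100_000

total_cost = sum(costs.values())
headroom = budget - total_cost

print(f"Total cost to scale: ${total_cost:,.0f}")
print(f"Budget headroom:     ${headroom:,.0f}")
```

The costs add up to exactly the budget, so the budget alone does not rule out any initiative in this configuration. Whether all five are actually selected depends on the allocator.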

## Create and Run the Orchestrator

In [None]:
orchestrator = Orchestrator(
    measure=MockMeasure(),
    evaluate=MockEvaluate(),
    allocate=MockAllocate(),
    config=config,
)

result = orchestrator.run()

## Stage 1 — Pilot Measurements

Each initiative gets a causal effect estimate with a confidence interval.

In [None]:
for p in result["pilot_results"]:
    print(
        f"{p['initiative_id']:.<40s} "
        f"effect={p['effect_estimate']:.2%}  "
        f"CI=[{p['ci_lower']:.2%}, {p['ci_upper']:.2%}]  "
        f"model={p['model_type']}"
    )
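
The format specifiers in the cell above can be tried in isolation. The row below is made up for illustration and is not output from the mock components.

```python
# Illustrative row only; the values are invented for this example.
row = {"initiative_id": "pricing-display-test", "effect_estimate": 0.0312}

# `.<40s` left-aligns the string and pads it with dots to a width of 40.
label = f"{row['initiative_id']:.<40s}"

# `.2%` multiplies by 100 and appends a percent sign, with two decimals.
effect = f"{row['effect_estimate']:.2%}"

print(label, effect)
```

The dot padding keeps the numeric columns aligned even though the initiative names differ in length.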

## Stage 2 — Evaluation Scores

Confidence scores reflect methodology quality (experiments score highest).

In [None]:
for e in result["evaluate_results"]:
    print(
        f"{e['initiative_id']:.<40s} confidence={e['confidence']:.2f}  R_med={e['R_med']:.2%}  cost=${e['cost']:,.0f}"
    )

## Stage 3 — Allocation

The allocator selects initiatives by confidence-weighted return until the budget is exhausted.

In [None]:
alloc = result["allocate_result"]
print(f"Selected {len(alloc['selected_initiatives'])} of {len(config.initiatives)} initiatives\n")

for iid in alloc["selected_initiatives"]:
    print(
        f"  {iid:.<40s} budget=${alloc['budget_allocated'][iid]:,.0f}  predicted={alloc['predicted_returns'][iid]:.2%}"
    )

total = sum(alloc["budget_allocated"].values())
print(f"\nTotal allocated: ${total:,.0f} / ${config.budget:,.0f}")
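
The selection rule described for this stage can be sketched as a greedy pass. This is an assumption-laden illustration, not the `MockAllocate` implementation: it assumes the score is `confidence * R_med`, and that an initiative that does not fit the remaining budget is skipped so cheaper ones can still be considered. The function name `allocate_greedy` and the toy candidates are hypothetical.

```python
def allocate_greedy(candidates, budget):
    """Pick candidates in descending confidence * R_med order while they fit the budget."""
    ranked = sorted(candidates, key=lambda c: c["confidence"] * c["R_med"], reverse=True)
    selected = []
    remaining = budget
    for c in ranked:
        # Assumption: skip anything too expensive and keep scanning cheaper options.
        if c["cost"] <= remaining:
            selected.append(c["initiative_id"])
            remaining -= c["cost"]
    return selected, budget - remaining


# Toy inputs, invented for illustration.
candidates = [
    {"initiative_id": "a", "confidence": 0.9, "R_med": 0.04, "cost": 60},
    {"initiative_id": "b", "confidence": 0.5, "R_med": 0.10, "cost": 50},
    {"initiative_id": "c", "confidence": 0.7, "R_med": 0.02, "cost": 30},
]
selected, spent = allocate_greedy(candidates, budget=100)
print(selected, spent)
```

In this toy case `b` has the highest weighted return and is taken first, `a` no longer fits, and `c` fills part of the remaining budget. A greedy pass like this is simple but not guaranteed to find the best portfolio; an exact knapsack solver would be the alternative.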

## Stages 4 and 5 — Scale Measurement and Outcome Reports

Selected initiatives are re-measured at `scale_sample_size=5000`.
The outcome report compares pilot predictions against scale actuals.

In [None]:
for report in result["outcome_reports"]:
    print(f"{report['initiative_id']}")
    print(f"  Predicted: {report['predicted_return']:.2%}")
    print(f"  Actual:    {report['actual_return']:.2%}")
    print(f"  Error:     {report['prediction_error']:+.2%}")
    print(f"  Confidence: {report['confidence_score']:.2f} ({report['model_type']})")
    print(f"  Samples:   {report['sample_size_pilot']} → {report['sample_size_scale']}")
    print()
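
The notebook does not show how `prediction_error` is computed. A minimal sketch, assuming it is the actual return minus the predicted return (so a negative value means the pilot overestimated), looks like this. The function and the example numbers are hypothetical.

```python
def prediction_error(predicted_return, actual_return):
    # Assumed sign convention: positive means the scale result beat the prediction.
    return actual_return - predicted_return


# Invented values for illustration.
error = prediction_error(predicted_return=0.04, actual_return=0.03)
formatted = f"{error:+.2%}"
print(formatted)
```

The `+` in the format specifier forces an explicit sign, which makes over- and under-predictions easy to tell apart when scanning the report.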

## Determinism Check

Mock components are seeded by `initiative_id`, so repeated runs produce identical results.

In [None]:
result2 = orchestrator.run()
assert result["pilot_results"] == result2["pilot_results"]
assert result["outcome_reports"] == result2["outcome_reports"]
print("Determinism verified — identical results across runs.")
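
How the mocks derive their seeds is not shown in this notebook. One way to get a seed that is stable across processes is to hash the `initiative_id` with `hashlib`, as sketched below. Python's built-in `hash()` on strings is randomized per process unless `PYTHONHASHSEED` is set, so it would not give reproducible results between sessions. The helper name `rng_for` and the Gaussian draw are assumptions made for illustration.

```python
import hashlib
import random


def rng_for(initiative_id: str) -> random.Random:
    """Return a random generator whose seed depends only on the initiative_id."""
    digest = hashlib.sha256(initiative_id.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")  # first 8 bytes are enough for a seed
    return random.Random(seed)


# Two generators built from the same id produce the same draw.
first = rng_for("pricing-display-test").gauss(0.05, 0.01)
second = rng_for("pricing-display-test").gauss(0.05, 0.01)

# A different id gives an independent stream.
other = rng_for("search-relevance-tuning").gauss(0.05, 0.01)

print(first == second, first == other)
```

Giving each initiative its own generator, instead of sharing one global seed, also keeps results independent of the order in which initiatives are processed.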