# NB09: Efficiency-Adjusted Performance

**Question:** Is the best RAG config worth the extra latency/cost?

**Status: BLOCKED** -- No timing data available in current experiment results.

This notebook requires the `duration` and `throughput_qps` fields in `results.json`,
which are not yet populated. Once timing instrumentation is added, it
will produce:
1. Pareto frontier (F1 vs latency)
2. Cost per correct answer
3. Agent complexity vs improvement
4. Practical recommendations at different latency budgets

**Prerequisite**: Add per-experiment timing to the execution pipeline.
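
As a rough illustration of output 1, a Pareto frontier over F1 and latency could be computed as sketched below. The column names `f1` and `duration`, the helper `pareto_frontier`, and the toy values are assumptions made for this example only. They are not taken from `analysis_utils` or from any real experiment results.

```python
import pandas as pd


def pareto_frontier(df, metric_col="f1", latency_col="duration"):
    """Return rows that no other row beats on both latency and metric.

    Hypothetical helper: a config is kept only if it scores strictly
    higher than every config that is at least as fast.
    """
    # Fastest first; among equal latencies, the higher metric comes first.
    ordered = df.sort_values([latency_col, metric_col], ascending=[True, False])
    best_so_far = float("-inf")
    keep = []
    for idx, row in ordered.iterrows():
        if row[metric_col] > best_so_far:
            keep.append(idx)
            best_so_far = row[metric_col]
    return df.loc[keep]


# Toy data for illustration only (not real results).
toy = pd.DataFrame({
    "config": ["a", "b", "c", "d"],
    "f1": [0.50, 0.60, 0.55, 0.70],
    "duration": [1.0, 2.0, 3.0, 4.0],
})
frontier = pareto_frontier(toy)
print(frontier)
```

In this toy case, config `c` is dropped because `b` is both faster and more accurate. The remaining rows are the candidates worth comparing at different latency budgets.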

In [None]:
from pathlib import Path
import numpy as np
import pandas as pd

from analysis_utils import (
    load_all_results, setup_plotting,
    PRIMARY_METRIC, BROKEN_MODELS,
)

setup_plotting()
STUDY_PATH = Path("../outputs/smart_retrieval_slm")

df_all = load_all_results(STUDY_PATH)
df = df_all[~df_all['model_short'].isin(BROKEN_MODELS)].copy()

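def _has_positive(col):
    """Boolean mask of rows where `col` exists and holds a value > 0."""
    # The timing columns may be absent entirely until instrumentation is added.
    if col not in df.columns:
        return pd.Series(False, index=df.index)
    values = pd.to_numeric(df[col], errors='coerce')
    return values.notna() & (values > 0)

has_duration = _has_positive('duration')
has_throughput = _has_positive('throughput')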

print(f"Total experiments: {len(df)}")
print(f"With duration > 0: {has_duration.sum()}")
print(f"With throughput > 0: {has_throughput.sum()}")

if has_duration.sum() == 0:
    print("\nNo timing data available. This notebook is blocked until ")
    print("per-experiment timing instrumentation is added to the execution pipeline.")
    print("See docs/FUTURE_WORK.md for the implementation plan.")
else:
    print(f"\nTiming data found! Proceeding with {has_duration.sum()} experiments...")