# Telemetry Report Comparison

This notebook demonstrates how to load multiple telemetry tuning reports,
merge them, and visualize objective improvements using the helper
`analyze_tuner_reports.py`.


## Setup

We'll reuse the CLI helpers that generated the latest `demo_tuner_report.csv`
under `docs/examples/analytics/data/tuner_reports/`.


In [1]:
from pathlib import Path

import pandas as pd

try:
    import altair as alt
except ImportError:  # pragma: no cover - optional for CI
    alt = None

CANDIDATE_DIRS = [
    Path("data/tuner_reports"),
    Path("../data/tuner_reports"),
    Path("examples/analytics/data/tuner_reports"),
    Path("docs/examples/analytics/data/tuner_reports"),
    Path.cwd() / "data/tuner_reports",
    Path.cwd() / "examples/analytics/data/tuner_reports",
]

DATA_DIR = None
for candidate in CANDIDATE_DIRS:
    candidate = candidate.resolve() if not candidate.is_absolute() else candidate
    if (candidate / "demo_tuner_report.csv").exists():
        DATA_DIR = candidate
        break

if DATA_DIR is None:
    print("Telemetry reports not found; the next cell falls back to synthetic sample data.")
    BASELINE = None
    EXPERIMENT = None
else:
    BASELINE = DATA_DIR / "demo_tuner_report.csv"
    # Same file as BASELINE by default, so every delta below is 0.0.
    # Point EXPERIMENT at a newer report to see real differences.
    EXPERIMENT = DATA_DIR / "demo_tuner_report.csv"

## Load Reports
We can either call the helper script via `subprocess` or load the CSVs directly
for ad-hoc comparisons.
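
For the `subprocess` route, a minimal sketch might look like the following. The script path and the three flags are copied from the history cell at the end of this notebook; nothing else about the helper's CLI is assumed. `build_analyze_command` is a hypothetical helper introduced only for illustration.

```python
import subprocess
import sys
from pathlib import Path

# Script location as used in the history cell; adjust to your checkout.
SCRIPT = Path("scripts/analyze_tuner_reports.py")


def build_analyze_command(report, history_dir, out_markdown, label="latest"):
    """Assemble the helper invocation as an argument list (hypothetical helper)."""
    return [
        sys.executable,
        str(SCRIPT),
        "--report",
        f"{label}={report}",
        "--history-dir",
        str(history_dir),
        "--out-history-markdown",
        str(out_markdown),
    ]


cmd = build_analyze_command(
    Path("data/tuner_reports/demo_tuner_report.csv"),
    Path("data/tuner_reports"),
    Path("tmp/notebook_history.md"),
)

if SCRIPT.exists():
    # Create the output directory first, in case the helper does not.
    Path("tmp").mkdir(exist_ok=True)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    print(result.stdout or result.stderr)
else:
    print("Helper script not found; the command would be:", " ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when a report path contains spaces.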


> CI executes with a lightweight telemetry sweep, so the required CSV files are generated automatically.


In [2]:
if BASELINE is None or EXPERIMENT is None or not BASELINE.exists() or not EXPERIMENT.exists():
    print("Demo telemetry reports not found. Generating synthetic sample data instead.")
    sample = pd.DataFrame(
        [
            {
                "algorithm": "random",
                "scenario": "SampleScenario",
                "best_objective": 6.2,
                "mean_objective": 5.9,
                "runs": 2,
                "label": "baseline",
            },
            {
                "algorithm": "random",
                "scenario": "SampleScenario",
                "best_objective": 7.1,
                "mean_objective": 6.6,
                "runs": 2,
                "label": "experiment",
            },
            {
                "algorithm": "grid",
                "scenario": "SampleScenario",
                "best_objective": 6.8,
                "mean_objective": 6.3,
                "runs": 2,
                "label": "baseline",
            },
            {
                "algorithm": "grid",
                "scenario": "SampleScenario",
                "best_objective": 7.4,
                "mean_objective": 6.9,
                "runs": 2,
                "label": "experiment",
            },
            {
                "algorithm": "bayes",
                "scenario": "SampleScenario",
                "best_objective": 7.0,
                "mean_objective": 6.7,
                "runs": 1,
                "label": "baseline",
            },
            {
                "algorithm": "bayes",
                "scenario": "SampleScenario",
                "best_objective": 7.6,
                "mean_objective": 7.2,
                "runs": 1,
                "label": "experiment",
            },
        ]
    )
    baseline = sample[sample["label"] == "baseline"].drop(columns="label").reset_index(drop=True)
    experiment = (
        sample[sample["label"] == "experiment"].drop(columns="label").reset_index(drop=True)
    )
else:
    baseline = pd.read_csv(BASELINE)
    experiment = pd.read_csv(EXPERIMENT)

merged = pd.concat(
    [baseline.assign(label="baseline"), experiment.assign(label="experiment")], ignore_index=True
)
display(merged)

Unnamed: 0,algorithm,scenario,runs,best_objective,mean_objective,best_run_id,best_started_at,best_config,summary_best,summary_configurations,summary_updated_at,label
0,bayes,FHOPS MiniToy,1,3.0,3.0,73a84eb549714c6da52b92ffda8752b6,2025-11-11T21:15:55+00:00,iters=50; operators=(block_insertion:1.9858747...,,,,baseline
1,grid,FHOPS MiniToy,1,9.0,9.0,f590aa00bb3543debcaebd6bdf288e6b,2025-11-11T21:15:54+00:00,"iters=50; operators=(block_insertion:0.0, cros...",,,,baseline
2,random,FHOPS MiniToy,1,-6.0,-6.0,ad611c7346f84950be678527ae2de434,2025-11-11T21:15:53+00:00,batch_size=2; iters=50; operators=(block_inser...,,,,baseline
0,bayes,FHOPS MiniToy,1,3.0,3.0,73a84eb549714c6da52b92ffda8752b6,2025-11-11T21:15:55+00:00,iters=50; operators=(block_insertion:1.9858747...,,,,experiment
1,grid,FHOPS MiniToy,1,9.0,9.0,f590aa00bb3543debcaebd6bdf288e6b,2025-11-11T21:15:54+00:00,"iters=50; operators=(block_insertion:0.0, cros...",,,,experiment
2,random,FHOPS MiniToy,1,-6.0,-6.0,ad611c7346f84950be678527ae2de434,2025-11-11T21:15:53+00:00,batch_size=2; iters=50; operators=(block_inser...,,,,experiment


## Visualize Best Objectives
Plot the best objective per algorithm. Each line runs from baseline to
experiment, so its slope shows the improvement (or regression).


In [3]:
if alt is None or merged.empty:
    display(
        "Altair not installed or no data available; install `altair` or supply reports to render the chart."
    )
else:
    chart = (
        alt.Chart(merged)
        .mark_line(point=True)
        .encode(x="label:N", y="best_objective:Q", color="algorithm:N", column="scenario:N")
    )
    # A bare expression inside an if/else block is not auto-rendered by Jupyter.
    display(chart)
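
When Altair is not installed, the same baseline-versus-experiment comparison can be read from a pivot table. The sketch below rebuilds a small frame from the synthetic fallback values used in the loading cell; `demo_merged` is a hypothetical name chosen so it does not overwrite `merged`.

```python
import pandas as pd

# Synthetic rows mirroring the fallback sample from the loading cell.
demo_merged = pd.DataFrame(
    {
        "algorithm": ["random", "random", "grid", "grid", "bayes", "bayes"],
        "scenario": ["SampleScenario"] * 6,
        "best_objective": [6.2, 7.1, 6.8, 7.4, 7.0, 7.6],
        "label": ["baseline", "experiment"] * 3,
    }
)

# One row per (scenario, algorithm), one column per label.
pivot = demo_merged.pivot_table(
    index=["scenario", "algorithm"], columns="label", values="best_objective"
)
pivot["delta"] = pivot["experiment"] - pivot["baseline"]
print(pivot.round(2))
```

In the notebook itself, replace `demo_merged` with `merged` to pivot the loaded reports.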

## Delta Table
Join baseline and experiment on `algorithm` and `scenario` with pandas, then
subtract the best objectives to get a per-algorithm delta.


In [4]:
comparison = baseline.merge(
    experiment, on=["algorithm", "scenario"], suffixes=("_baseline", "_experiment")
)
comparison["best_delta"] = (
    comparison["best_objective_experiment"] - comparison["best_objective_baseline"]
)
comparison

Unnamed: 0,algorithm,scenario,runs_baseline,best_objective_baseline,mean_objective_baseline,best_run_id_baseline,best_started_at_baseline,best_config_baseline,summary_best_baseline,summary_configurations_baseline,...,best_objective_experiment,mean_objective_experiment,best_run_id_experiment,best_started_at_experiment,best_config_experiment,summary_best_experiment,summary_configurations_experiment,summary_updated_at_experiment,label_experiment,best_delta
0,bayes,FHOPS MiniToy,1,3.0,3.0,73a84eb549714c6da52b92ffda8752b6,2025-11-11T21:15:55+00:00,iters=50; operators=(block_insertion:1.9858747...,,,...,3.0,3.0,73a84eb549714c6da52b92ffda8752b6,2025-11-11T21:15:55+00:00,iters=50; operators=(block_insertion:1.9858747...,,,,experiment,0.0
1,grid,FHOPS MiniToy,1,9.0,9.0,f590aa00bb3543debcaebd6bdf288e6b,2025-11-11T21:15:54+00:00,"iters=50; operators=(block_insertion:0.0, cros...",,,...,9.0,9.0,f590aa00bb3543debcaebd6bdf288e6b,2025-11-11T21:15:54+00:00,"iters=50; operators=(block_insertion:0.0, cros...",,,,experiment,0.0
2,random,FHOPS MiniToy,1,-6.0,-6.0,ad611c7346f84950be678527ae2de434,2025-11-11T21:15:53+00:00,batch_size=2; iters=50; operators=(block_inser...,,,...,-6.0,-6.0,ad611c7346f84950be678527ae2de434,2025-11-11T21:15:53+00:00,batch_size=2; iters=50; operators=(block_inser...,,,,experiment,0.0
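
`DataFrame.merge` defaults to an inner join, so an algorithm/scenario pair that appears in only one report silently drops out of the delta table. One way to surface such gaps, sketched here on made-up frames (`base` and `exp` are illustrative names, not the notebook's variables), is an outer join with `indicator=True`:

```python
import pandas as pd

# Illustrative frames: "random" is missing from exp, "bayes" from base.
base = pd.DataFrame(
    {
        "algorithm": ["random", "grid"],
        "scenario": ["SampleScenario", "SampleScenario"],
        "best_objective": [6.2, 6.8],
    }
)
exp = pd.DataFrame(
    {
        "algorithm": ["grid", "bayes"],
        "scenario": ["SampleScenario", "SampleScenario"],
        "best_objective": [7.4, 7.6],
    }
)

checked = base.merge(
    exp,
    on=["algorithm", "scenario"],
    how="outer",
    suffixes=("_baseline", "_experiment"),
    indicator=True,  # adds a `_merge` column: left_only / right_only / both
)
checked["best_delta"] = (
    checked["best_objective_experiment"] - checked["best_objective_baseline"]
)

# Rows present in only one report have a NaN delta.
unmatched = checked[checked["_merge"] != "both"]
print(unmatched[["algorithm", "scenario", "_merge"]])
```

Rows flagged `left_only` or `right_only` indicate a report that did not cover the same algorithms, which is worth checking before reading anything into the deltas.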


Use this notebook as a starting point for richer analytics (e.g., trendlines
across multiple nightly reports).
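
As a rough sketch of the trendline idea, several report CSVs can be stacked with a column recording which file each row came from. The file names (`nightly_01.csv`, `nightly_02.csv`), the values, and the `load_report_history` helper are all hypothetical; the demo writes throwaway files so it runs on its own.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_report_history(report_paths):
    """Concatenate reports, tagging each row with its file stem as `snapshot`."""
    frames = [
        pd.read_csv(path).assign(snapshot=Path(path).stem)
        for path in sorted(report_paths)
    ]
    return pd.concat(frames, ignore_index=True)


# Demo data only: two tiny reports written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    for name, best in [("nightly_01", 6.2), ("nightly_02", 7.1)]:
        pd.DataFrame(
            {
                "algorithm": ["random"],
                "scenario": ["SampleScenario"],
                "best_objective": [best],
            }
        ).to_csv(tmp_dir / f"{name}.csv", index=False)
    history = load_report_history(tmp_dir.glob("*.csv"))

# One row per snapshot, one column per algorithm: ready for a line plot.
trend = history.pivot_table(
    index="snapshot", columns="algorithm", values="best_objective"
)
print(trend)
```

Sorting by file name only gives chronological order if the names sort that way; a timestamp column would be a sturdier key when one is available.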

## Compare Against Historical Summary
If you have downloaded history artefacts, you can re-use the helper script to
compare the latest report against prior snapshots. Update `HISTORY_DIR` to point
to your archive.


This notebook focuses on objectives. See `tuner_history_analysis.ipynb` for multi-metric (KPI) trends.

In [None]:
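from pathlib import Path

from IPython.display import Markdown

HISTORY_DIR = Path("docs/examples/analytics/data/tuner_reports")
HISTORY_MD = Path("tmp/notebook_history.md")
if BASELINE is None:
    # Without a real report there is nothing to pass to --report.
    print("No baseline report available. Run analyze_tuner_reports.py manually.")
elif HISTORY_DIR.exists():
    HISTORY_MD.parent.mkdir(parents=True, exist_ok=True)  # ensure tmp/ exists
    !python scripts/analyze_tuner_reports.py --report latest={BASELINE} --history-dir {HISTORY_DIR} --out-history-markdown {HISTORY_MD}
    display(Markdown(HISTORY_MD.read_text()))  # render instead of showing the raw string
else:
    print("History directory not found. Run analyze_tuner_reports.py manually.")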