# 00 – Metric Smoke Test

This notebook is a quick **sanity check** for the local metric implementation
in `src/santa2025/evaluation.py`.

It will:
1. Import your metric functions from `santa2025.evaluation`.
2. Run them on `data/raw/sample_submission.csv` (or any other submission).
3. Show the per‑`n` table and total score.
4. Let you compare the **local total score** against the **Kaggle leaderboard**
   for the same CSV.

If the local score matches the Kaggle score (up to tiny floating‑point noise),
then your metric is wired up correctly.
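
Step 1 only works if the `santa2025` package is importable from the notebook kernel. A minimal sketch, assuming the package lives under `<project root>/src` and the kernel was started in the project root, of making that true outside an IDE. The helper name `add_src_to_path` is hypothetical and not part of the project:

```python
import sys
from pathlib import Path


def add_src_to_path(project_root: Path) -> Path:
    """Prepend <project_root>/src to sys.path if it is not already there."""
    src_dir = (project_root / "src").resolve()
    if str(src_dir) not in sys.path:
        sys.path.insert(0, str(src_dir))
    return src_dir


# Assumption: the current working directory is the project root.
src_dir = add_src_to_path(Path("."))
print("Added to import path:", src_dir)
```

Run it before the import cell. Calling it again is harmless, because the path is only inserted once.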


In [None]:
from pathlib import Path
import pandas as pd

pd.set_option('display.float_format', '{:.12f}'.format)

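# Assumes the kernel's working directory is the project root; adjust this
# if the notebook is launched from a subdirectory.
PROJECT_ROOT = Path('.').resolve()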
DATA_RAW = PROJECT_ROOT / 'data' / 'raw'
print('Project root:', PROJECT_ROOT)
print('Raw data dir:', DATA_RAW)


In [None]:
# This assumes you have `src/` marked as a Sources Root in PyCharm
# and that `src/santa2025/evaluation.py` defines
#   - metric_from_df(df: pd.DataFrame) -> pd.DataFrame
#   - evaluate_submission_csv(path: str | Path) -> (score_table: pd.DataFrame, total_score: float)

from santa2025.evaluation import metric_from_df, evaluate_submission_csv
# Echo the function objects to confirm the import resolved.
metric_from_df, evaluate_submission_csv


In [None]:
sample_path = DATA_RAW / 'sample_submission.csv'
if not sample_path.exists():
    raise FileNotFoundError(
        f"Missing {sample_path}. Put Kaggle's sample_submission.csv into data/raw/."
    )

score_table, local_total = evaluate_submission_csv(sample_path)
display(score_table.head())
print('\nLocal total score:', f'{local_total:.12f}')
print('Distinct n:', score_table['n'].nunique())
print('Min n:', score_table['n'].min(), 'Max n:', score_table['n'].max())
print('Total trees:', score_table['count'].sum())


## Compare with Kaggle leaderboard

To verify that your local metric is correct:

1. Take the **exact same CSV** you just evaluated locally (e.g. `sample_submission.csv`
   or a new submission you created with your packer).
2. Upload it to the Kaggle **Santa 2025 – Christmas Tree Packing** competition.
3. Wait for the submission to score, then copy the **score shown on the leaderboard**.
4. Paste that score into the next cell as `kaggle_total`.
5. Run the cell to see the absolute difference.

Ideally, the difference should be **0** or on the order of numerical noise
(e.g. < 1e-9). A larger difference points to a bug in the local metric
implementation, provided the uploaded file is the one you evaluated locally.
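
A threshold such as `1e-9` is only meaningful if the pasted score carries that many digits. As a hedged sketch: if the leaderboard shows the score rounded to a fixed number of decimals (an assumption; check how many digits Kaggle displays for your submission), the best agreement you can expect is half a unit in the last displayed place. The helper names `rounding_tolerance` and `scores_agree` are hypothetical and not part of `santa2025.evaluation`:

```python
from decimal import Decimal


def rounding_tolerance(displayed: str) -> float:
    """Half a unit in the last decimal place of the displayed score."""
    # Decimal keeps trailing zeros, so '0.120' has exponent -3.
    exponent = Decimal(displayed).as_tuple().exponent
    return 0.5 * 10.0 ** exponent


def scores_agree(local_total: float, displayed: str, slack: float = 1e-12) -> bool:
    """True if local_total is consistent with the displayed, rounded score."""
    tolerance = rounding_tolerance(displayed) + slack
    return abs(local_total - float(displayed)) <= tolerance


# Illustrative values only, not real competition scores.
print(rounding_tolerance("123.456789"))
print(scores_agree(123.4567894, "123.456789"))
```

Pass the leaderboard score as a string, exactly as displayed, so that trailing zeros still count toward the precision.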


In [None]:
# Paste the Kaggle leaderboard score for the SAME CSV here.
# Example: kaggle_total = 123.456789012345
kaggle_total = None  # <-- replace None with the numeric score from Kaggle

if kaggle_total is None:
    print('Set kaggle_total to the numeric score from Kaggle and re-run this cell.')
else:
    diff = abs(local_total - kaggle_total)
    print('Local total:', f'{local_total:.12f}')
    print('Kaggle total:', f'{kaggle_total:.12f}')
    print('Absolute difference:', f'{diff:.12e}')
    if diff < 1e-9:
        print('✅ Metrics match within tight numerical tolerance.')
    else:
        print('⚠️ Difference is larger than expected. Double-check evaluation.py.')
