# 02 – System Evaluation

**Purpose:**  
This notebook evaluates the AI Data Analyst Agent on a fixed set of
predefined natural-language queries.

The evaluation focuses on:
- Correctness of execution
- Determinism of results
- Quality of generated insights

All experiments correspond to configurations defined in `experiments/*.yaml`.
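
One way the determinism criterion could be checked is to run the same query several times and compare the resulting tables. The sketch below is an assumption about how such a check might look, not the project's actual test: `run_query` and `is_deterministic` are hypothetical names, `run_query` stands in for the planner-plus-executor pipeline, and the toy frame only mimics the `COUNTRY` and `SALES` columns with made-up values.

```python
import pandas as pd

# Toy data standing in for the real dataset (values are made up).
toy_df = pd.DataFrame({
    "COUNTRY": ["USA", "France", "USA", "Spain"],
    "SALES": [100.0, 50.0, 25.0, 75.0],
})


def run_query(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for plan generation followed by execution."""
    return df.groupby("COUNTRY", as_index=False)["SALES"].sum()


def is_deterministic(fn, df: pd.DataFrame, runs: int = 3) -> bool:
    """Return True if fn(df) yields an identical table on every run."""
    baseline = fn(df)
    return all(fn(df).equals(baseline) for _ in range(runs - 1))


deterministic = is_deterministic(run_query, toy_df)
print("Deterministic:", deterministic)
```

In a real evaluation the stand-in would be replaced by the actual pipeline call, and an LLM-backed planner may need a fixed seed or temperature setting for this check to pass.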


In [10]:
import sys
from pathlib import Path

# Add the project root to sys.path so the `src` package is importable
project_root = Path("..").resolve()
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))


In [12]:
import pandas as pd

from src.agents.planner import PlannerAgent
from src.executor.executor import execute_plan
from src.schemas.plan_validator import validate_plan


In [7]:
df = pd.read_csv("../data/BOOK1.csv")

In [13]:
evaluation_queries = [
    "Show total SALES by COUNTRY",
    "Find top 5 COUNTRIES by SALES",
    "Show average sales by COUNTRY"
]
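
To judge correctness, it can help to have reference answers computed directly with pandas. The sketch below shows what each of the three queries might correspond to on a small made-up frame. The column names match the queries, but the data is illustrative, and reading "top 5 countries by sales" as a ranking of total sales per country is an assumption.

```python
import pandas as pd

# Illustrative data only; the real evaluation uses the loaded CSV.
toy_df = pd.DataFrame({
    "COUNTRY": ["USA", "France", "USA", "Spain", "France"],
    "SALES": [100.0, 50.0, 20.0, 70.0, 30.0],
})

# "Show total SALES by COUNTRY"
total_by_country = toy_df.groupby("COUNTRY", as_index=False)["SALES"].sum()

# "Find top 5 COUNTRIES by SALES" (assumed to rank total sales per country)
top_countries = total_by_country.nlargest(5, "SALES").reset_index(drop=True)

# "Show average sales by COUNTRY"
mean_by_country = toy_df.groupby("COUNTRY", as_index=False)["SALES"].mean()

print(total_by_country)
print(top_countries)
print(mean_by_country)
```

Comparing the agent's output against frames like these, for example with `DataFrame.equals`, gives a direct pass/fail signal per query.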


In [14]:
planner = PlannerAgent()
columns = list(df.columns)

for query in evaluation_queries:
    print("=" * 80)
    print("Query:", query)

    plan = planner.generate_plan(columns, query)
    validate_plan(plan, columns)  # validate the plan against the dataset columns before executing

    result = execute_plan(df, plan)
    print(result)




Query: Show total SALES by COUNTRY
(        COUNTRY       SALES
0     Australia   630623.10
1       Austria   202062.53
2       Belgium   108412.62
3        Canada   224078.56
4       Denmark   245637.15
5       Finland   329581.91
6        France  1110916.52
7       Germany   220472.09
8       Ireland    57756.43
9         Italy   374674.31
10        Japan   188167.81
11       Norway   307463.70
12  Philippines    94015.73
13    Singapore   288488.41
14        Spain  1215686.92
15       Sweden   210014.21
16  Switzerland   117713.56
17           UK   478880.46
18          USA  3627982.83, Figure({
    'data': [{'hovertemplate': 'COUNTRY=%{x}<br>SALES=%{y}<extra></extra>',
              'legendgroup': '',
              'marker': {'color': '#636efa', 'pattern': {'shape': ''}},
              'name': '',
              'orientation': 'v',
              'showlegend': False,
              'textposition': 'auto',
              'type': 'bar',
              'x': array(['Australia', 'Austria', '