# KD-GAT Visualization Playground

Rapid prototyping surface for exploring experiment data with DuckDB + pyobsplot.

**Workflow:** Shape data & plots here → copy Observable Plot spec to `reports/playground.qmd` → promote to dashboard/paper.

Data source: `reports/data/*.parquet` (same files the Quarto dashboard uses).

In [None]:
# Cell 1: Connect to data — load all Parquet tables from reports/data/
import duckdb
from pathlib import Path

DATA_DIR = Path("../reports/data")

db = duckdb.connect()

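# Auto-discover and load all Parquet files as tables
parquet_files = sorted(DATA_DIR.glob("*.parquet"))
if not parquet_files:
    print(f"No Parquet files found in {DATA_DIR.resolve()}")
for f in parquet_files:
    table_name = f.stem
    # Quote the identifier (a stem may contain '-' or start with a digit) and
    # use CREATE OR REPLACE so the cell can be re-run without "table exists" errors.
    db.execute(
        f'CREATE OR REPLACE TABLE "{table_name}" AS '
        f"SELECT * FROM read_parquet('{f.as_posix()}')"
    )
    n_rows = db.sql(f'SELECT COUNT(*) FROM "{table_name}"').fetchone()[0]
    print(f"  {table_name}: {n_rows} rows")

print(f"\nLoaded {len(parquet_files)} tables.")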
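
Table names in Cell 1 come directly from file stems. If a stem contains characters such as `-` or a space, every later query has to quote that name. One way around this is to normalize stems to plain identifiers before creating the tables. The sketch below is a minimal illustration: `safe_table_name` is a hypothetical helper, and the sample stems are made up, not taken from `reports/data/`.

```python
import re


def safe_table_name(stem: str) -> str:
    """Hypothetical helper: map a file stem to a plain, unquoted SQL identifier."""
    # Replace each run of characters outside [A-Za-z0-9_] with one underscore,
    # then trim leading and trailing underscores.
    name = re.sub(r"[^0-9A-Za-z_]+", "_", stem).strip("_")
    # Unquoted identifiers cannot start with a digit; prefix those
    # (and empty results) with "t_".
    if not name or name[0].isdigit():
        name = f"t_{name}"
    return name.lower()


# Made-up stems for illustration only.
for stem in ["metrics", "training-curves", "2024 runs"]:
    print(f"{stem!r} -> {safe_table_name(stem)}")
```

In Cell 1 this would replace `table_name = f.stem` with `table_name = safe_table_name(f.stem)`. Two different stems can normalize to the same name, so a collision check may be worth adding if the directory grows.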

In [None]:
# Cell 2: Schema browser — inspect any table
db.sql("SHOW TABLES").show()

# Change table name to inspect a different table
db.sql("DESCRIBE metrics").show()

In [None]:
# Cell 3: Sample query — F1 scores by dataset and model
df = db.sql("""
    SELECT
        split_part(run_id, '/', 1) AS dataset,
        model,
        f1, accuracy, auc, mcc
    FROM metrics
    ORDER BY f1 DESC
""").df()
df
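
Cell 3 derives `dataset` with `split_part(run_id, '/', 1)`. DuckDB's `split_part` is 1-indexed, so index 1 is the text before the first `/`. The pure-Python equivalent below is a sketch that assumes `run_id` values look like `<dataset>/<rest>`. The sample IDs are invented for illustration and are not real run IDs.

```python
def dataset_from_run_id(run_id: str) -> str:
    """Python equivalent of DuckDB's split_part(run_id, '/', 1)."""
    # str.split is 0-indexed, so element 0 corresponds to split_part index 1.
    return run_id.split("/")[0]


# Hypothetical run IDs, not taken from the real data.
for rid in ["dataset_a/model_x", "dataset_b/model_y/seed_0", "no_separator"]:
    print(rid, "->", dataset_from_run_id(rid))
```

A `run_id` without any `/` comes back unchanged. If unexpected values show up in the `dataset` column, check the raw `run_id` format first.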

In [None]:
# Cell 4: Visualize with Observable Plot (renders inline)
from pyobsplot import Plot

Plot.plot({
    "marks": [
        Plot.dot(df, {"x": "dataset", "y": "f1", "fill": "model", "r": 6})
    ],
    "color": {"legend": True},
    "width": 680,
    "marginBottom": 80,
    "x": {"tickRotate": -45}
})

In [None]:
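# Cell 5: Training curves
# Note: LIMIT without ORDER BY returns an arbitrary 5000 rows, so a series may
# be cut off mid-run. Add a WHERE filter to inspect specific runs or metrics.
tc = db.sql("""
    SELECT *
    FROM training_curves
    LIMIT 5000
""").df()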

tc.head()

In [None]:
# Cell 6: Training curve line chart
Plot.plot({
    "marks": [
        # Sort by epoch so each line is drawn left to right regardless of row order
        Plot.lineY(tc, {"x": "epoch", "y": "value", "stroke": "metric_name",
                        "strokeWidth": 1.5, "sort": "epoch"})
    ],
    "color": {"legend": True},
    "width": 680,
    "height": 400
})

In [None]:
# Cell 7: Runs overview
runs_df = db.sql("""
    SELECT *
    FROM runs
    LIMIT 50
""").df()
runs_df.head(10)

---
## Scratch Area

Use cells below for ad-hoc exploration. When a query + plot works well,
copy the Plot spec to `reports/playground.qmd` and translate to Mosaic/vgplot.

In [None]:
# Scratch 1: SQL query
df = db.sql("""
    SELECT 1 AS placeholder
""").df()
df

In [None]:
# Scratch 2: Observable Plot chart
# Plot.plot({
#     "marks": [
#         Plot.barY(df, {"x": "col_x", "y": "col_y", "fill": "col_color"})
#     ],
#     "width": 680
# })

In [None]:
# Scratch 3: More exploration
