# 01 — Data Understanding
## Fleet Data Pipeline / Self-Driving Metrics

This notebook explores the schema and sample data for **vehicle telemetry**, **perception events**, and **driving events**, the tables used to measure Self-Driving performance. It also previews the most recent **alerts**.

In [None]:
import sys
from pathlib import Path

# Make the parent of the current working directory importable so that `config` resolves.
sys.path.insert(0, str(Path().resolve().parent))

import pandas as pd
import psycopg2

from config import load_config

In [None]:
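# Load the TimescaleDB connection settings from the project config.
cfg = load_config()
db = cfg["timescaledb"]

# One connection is shared by every query cell below and closed in the final cell.
conn = psycopg2.connect(
    host=db["host"], port=db["port"], dbname=db["database"],
    user=db["user"], password=db["password"],
)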
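
A missing key in the `timescaledb` config section surfaces as a bare `KeyError` in the cell above. A minimal sketch of a pre-flight check follows. `missing_db_keys` is a hypothetical helper, not part of the project's `config` module, and the example values are made up; the key names are the five used in the `psycopg2.connect` call.

```python
# Hypothetical helper (not part of the project's config module).
# The key names mirror the ones read in the connection cell.
REQUIRED_DB_KEYS = ("host", "port", "database", "user", "password")


def missing_db_keys(db: dict) -> list:
    """Return the required keys that are absent, None, or an empty string."""
    return [key for key in REQUIRED_DB_KEYS if db.get(key) in (None, "")]


# Deliberately incomplete, made-up config section.
example_db = {"host": "localhost", "port": 5432, "database": "fleet"}
print(missing_db_keys(example_db))
```

In the notebook this could be called as `missing_db_keys(cfg["timescaledb"])` before connecting, so that an incomplete config is reported with all missing keys at once.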

### Tables (TimescaleDB hypertables)

In [None]:
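# List which of the expected pipeline tables exist in the public schema.
# pg_tables also lists ordinary tables, so this confirms existence, not hypertable status.
tables = pd.read_sql("""
SELECT tablename
FROM pg_tables
WHERE schemaname = 'public'
  AND tablename IN ('vehicle_telemetry', 'perception_events', 'driving_events',
                    'alerts', 'self_driving_metrics')
ORDER BY tablename
""", conn)
print(tables.to_string(index=False))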
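
The listing above shows which tables exist, but a missing table is easy to overlook in the printed output. The sketch below makes the gap explicit. `missing_tables` is a hypothetical helper, and `HYPERTABLE_QUERY` assumes TimescaleDB 2.x, where the `timescaledb_information.hypertables` view is available; neither appears in the original notebook.

```python
# The five table names the notebook expects in the public schema.
EXPECTED_TABLES = {
    "vehicle_telemetry",
    "perception_events",
    "driving_events",
    "alerts",
    "self_driving_metrics",
}

# Assumption: TimescaleDB 2.x. This view lists only tables that are hypertables,
# so its result can be compared against EXPECTED_TABLES in the same way.
HYPERTABLE_QUERY = """
SELECT hypertable_name
FROM timescaledb_information.hypertables
WHERE hypertable_schema = 'public'
"""


def missing_tables(found_names) -> list:
    """Return the expected tables that are absent from found_names, sorted."""
    return sorted(EXPECTED_TABLES - set(found_names))


# Made-up result standing in for tables["tablename"].
print(missing_tables(["alerts", "vehicle_telemetry"]))
```

In the notebook, `missing_tables(tables["tablename"])` would report tables that were never created, and passing the `hypertable_name` column returned by `HYPERTABLE_QUERY` would report tables that exist but were not converted to hypertables.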

### Vehicle telemetry — sample

In [None]:
telemetry = pd.read_sql("SELECT * FROM vehicle_telemetry ORDER BY time DESC LIMIT 100", conn)
telemetry.head(10)
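
`head(10)` shows values but not column types or how many values are missing. A small, hedged sketch of a per-column profile follows. `profile_sample` is a hypothetical helper, and apart from `time` the column names in the example frame are invented for illustration; the real `vehicle_telemetry` schema may differ.

```python
import pandas as pd


def profile_sample(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise each column: dtype, number of nulls, number of distinct non-null values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "nulls": df.isna().sum(),
        "distinct": df.nunique(),
    })


# Made-up rows standing in for `telemetry`; `vehicle_id` and `speed_mps` are illustrative names.
sample = pd.DataFrame({
    "time": pd.to_datetime([
        "2024-01-01 00:00:00",
        "2024-01-01 00:00:01",
        "2024-01-01 00:00:02",
    ]),
    "vehicle_id": ["v1", "v1", "v2"],
    "speed_mps": [12.5, None, 13.1],
})
print(profile_sample(sample))
```

The same call works on any of the sampled frames in this notebook, for example `profile_sample(telemetry)`.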

### Perception events — sample

In [None]:
perception = pd.read_sql("SELECT * FROM perception_events ORDER BY time DESC LIMIT 100", conn)
perception.head(10)

### Driving events (interventions / disengagements)

In [None]:
events = pd.read_sql("SELECT * FROM driving_events ORDER BY time DESC LIMIT 50", conn)
events.head(10)

### Alerts

In [None]:
alerts = pd.read_sql("SELECT * FROM alerts ORDER BY time DESC LIMIT 20", conn)
alerts
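
The four sample cells above repeat one query pattern with a different table and row limit. One possible way to factor that out is sketched below. `sample_query` is a hypothetical helper, not part of the project. Table names cannot be bound as query parameters, so the sketch checks them against a fixed allowlist before formatting them into the SQL string.

```python
# Tables sampled in this notebook; anything else is rejected.
SAMPLE_TABLES = ("vehicle_telemetry", "perception_events", "driving_events", "alerts")


def sample_query(table: str, limit: int = 100) -> str:
    """Build the 'most recent rows' query for one allowlisted table."""
    if table not in SAMPLE_TABLES:
        raise ValueError(f"unexpected table: {table!r}")
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(limit, int) or isinstance(limit, bool) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM {table} ORDER BY time DESC LIMIT {limit}"


print(sample_query("driving_events", 50))
```

With this in place, a sample cell could read `events = pd.read_sql(sample_query("driving_events", 50), conn)`.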

In [None]:
conn.close()
print("Done. If the tables above are empty, run the pipeline (producer + consumer) to populate them.")