# Telemetry-to-Insight: Full Pipeline Demo

End-to-end demo: load telemetry with cuDF, retrieve the filtered rows, and send them to NIM for natural-language answers.

**Colab:** You need NIM running on GKE with a public URL. Paste your NIM URL in the Setup cell below. See "Deploy NIM on GKE" at the end of this notebook.

<a href="https://colab.research.google.com/github/KarthikSriramGit/Project-Insight/blob/main/notebooks/03_query_telemetry.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Configure runtime first:** Runtime > Change runtime type > Hardware accelerator: **GPU** or **None** > Save.

In [2]:
# Colab setup: clone repo and install dependencies (run this cell first)
try:
    import google.colab
    get_ipython().system("git clone -q https://github.com/KarthikSriramGit/Project-Insight.git")
    get_ipython().run_line_magic("cd", "Project-Insight")
    get_ipython().system("pip install -q -r requirements.txt")
except ImportError:
    pass  # not running in Colab; assume the repo and dependencies are already set up

/content/Project-Insight


## Setup

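**Edit the cell below:** Set the default in `NIM_BASE_URL` to your GKE NIM LoadBalancer address (the external IP from `kubectl get svc -n nim`, with port 8000 as in the example), or export it as the `NIM_BASE_URL` environment variable before running the cell.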

In [3]:
import os
import sys
import subprocess
from pathlib import Path

ROOT = Path(".").resolve()
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

# Paste your GKE NIM URL here (e.g. http://34.123.45.67:8000)
NIM_BASE_URL = os.environ.get("NIM_BASE_URL", "http://136.119.171.193:8000")

data_path = ROOT / "data" / "synthetic" / "fleet_telemetry.parquet"
if not data_path.exists():
    subprocess.run([
        sys.executable, "data/synthetic/generate_telemetry.py",  # same interpreter as the notebook kernel
        "--rows", "100000", "--output-dir", "data/synthetic", "--format", "parquet",
    ], check=True, cwd=str(ROOT))

from src.query.engine import TelemetryQueryEngine

engine = TelemetryQueryEngine(
    data_path=str(data_path),
    nim_base_url=NIM_BASE_URL,
    max_context_rows=500,
)
print(f"Engine ready. NIM URL: {NIM_BASE_URL}")

Engine ready. NIM URL: http://136.119.171.193:8000
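
Before sending queries, it can help to confirm that the endpoint is reachable at all. The following is a minimal sketch, assuming the NIM deployment serves the OpenAI-compatible model-listing route at `/v1/models`; the helper names `nim_models_url` and `is_nim_reachable` are hypothetical and not part of Project-Insight.

```python
import http.client
import urllib.error
import urllib.request


def nim_models_url(base_url: str) -> str:
    """Build the model-listing URL, tolerating a trailing slash on the base URL."""
    return base_url.rstrip("/") + "/v1/models"


def is_nim_reachable(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if the (assumed) /v1/models route answers with HTTP 200."""
    try:
        with urllib.request.urlopen(nim_models_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False
```

In this notebook the check could be run as `is_nim_reachable(NIM_BASE_URL)` right after the Setup cell. A `False` result usually points to a wrong IP, a closed port, or a pod that is still starting.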


## Query 1: Max brake pressure

In [4]:
print("=" * 60)
print("Query 1: Max brake pressure percentage across all vehicles")
print("=" * 60)
answer1 = engine.query(
    "What is the maximum brake_pressure_pct value across all vehicles? Which vehicle had it?",
    sensor_type="can",
)
print(answer1)

Unfortunately, the provided telemetry data does not contain enough information to determine the maximum brake pressure across all vehicles. The "brake_pressure_pct" column only contains a percentage value and not an actual pressure value. Therefore, it is not possible to determine the maximum brake pressure.
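
The answer above declines to report a maximum, even though the maximum of a percentage column is well defined. One plausible cause, stated here as an assumption because the engine internals are not shown, is that at most `max_context_rows=500` rows reach the model, so it cannot see a fleet-wide aggregate. A deterministic cross-check computed directly on the data gives a reference value to compare against. The sketch below uses plain-Python records with made-up values; `vehicle_id` is an assumed column name and `max_by_column` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable


def max_by_column(rows: Iterable[Dict[str, Any]], column: str) -> Dict[str, Any]:
    """Return the record holding the largest non-missing value in `column`."""
    candidates = [r for r in rows if r.get(column) is not None]
    if not candidates:
        raise ValueError(f"no non-missing values in column {column!r}")
    return max(candidates, key=lambda r: r[column])


# Made-up sample records; "vehicle_id" is an assumed column name.
sample = [
    {"vehicle_id": "V001", "brake_pressure_pct": 12.5},
    {"vehicle_id": "V002", "brake_pressure_pct": 97.0},
    {"vehicle_id": "V003", "brake_pressure_pct": None},
]
top = max_by_column(sample, "brake_pressure_pct")
print(top["vehicle_id"], top["brake_pressure_pct"])
```

On the real Parquet file, the same lookup with pandas would be `df.loc[df["brake_pressure_pct"].idxmax()]` after `pd.read_parquet(data_path)`, again assuming the column names match.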


## Query 2: Vehicle speed analysis

In [None]:
print("=" * 60)
print("Query 2: Average speed per vehicle")
print("=" * 60)
answer2 = engine.query(
    "What is the average vehicle_speed_kmh for each vehicle? List them from fastest to slowest.",
    sensor_type="can",
)
print(answer2)

## Query 3: Hard braking events

In [None]:
print("=" * 60)
print("Query 3: Hard braking events (brake_pressure_pct > 90)")
print("=" * 60)
answer3 = engine.query(
    "How many rows have brake_pressure_pct above 90? What is the average vehicle_speed_kmh during these hard braking events?",
    sensor_type="can",
)
print(answer3)

## Query 4: IMU anomaly detection

In [None]:
print("=" * 60)
print("Query 4: IMU anomaly detection")
print("=" * 60)
answer4 = engine.query(
    "Are there any unusual acceleration values in the data? Look at accel_x, accel_y, accel_z "
    "and identify any readings that seem abnormally high or low compared to the typical range.",
    sensor_type="imu",
)
print(answer4)
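
A statistical baseline makes it easier to judge whether the model's notion of "unusual" is reasonable. The sketch below applies a simple z-score rule; it is a swapped-in technique for comparison, not what the engine or the model does. The readings are made up, the unit (m/s²) is an assumption, and `zscore_outliers` is a hypothetical helper.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def zscore_outliers(values: Sequence[float], threshold: float = 3.0) -> List[int]:
    """Return indices of values more than `threshold` population std devs from the mean."""
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all readings identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up accel_z readings (assumed m/s^2) near gravity, with one injected spike.
accel_z = [9.81, 9.79, 9.83, 9.80, 9.82, 9.78, 9.81, 9.80, 9.79, 9.82, 9.81, 25.0]
print(zscore_outliers(accel_z))
```

With a population standard deviation, a single point among `n` readings can never exceed a z-score of `sqrt(n - 1)`, so a threshold of 3 only triggers on samples with more than 10 readings. For heavy-tailed sensor noise, a median-based rule may be a better fit.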

## Query 5: LiDAR point cloud overview

In [None]:
print("=" * 60)
print("Query 5: LiDAR point cloud overview")
print("=" * 60)
answer5 = engine.query(
    "What is the average and max point_count across all LiDAR readings? "
    "What is the typical max_range observed?",
    sensor_type="lidar",
)
print(answer5)

## Query 6: GPS coverage

In [None]:
print("=" * 60)
print("Query 6: GPS coverage area")
print("=" * 60)
answer6 = engine.query(
    "What is the geographic bounding box of the fleet? "
    "Give the min/max latitude and longitude values.",
    sensor_type="gps",
)
print(answer6)

## Query 7: Fleet-wide summary

In [None]:
print("=" * 60)
print("Query 7: Fleet-wide CAN bus summary")
print("=" * 60)
answer7 = engine.query(
    "Provide a brief fleet health summary: average speed, average throttle, "
    "average brake pressure percentage, and the most common gear position.",
    sensor_type="can",
)
print(answer7)

## All answers summary

Collected output from all queries above for easy review.

In [None]:
queries = {
    "1. Max brake pressure (CAN)": answer1,
    "2. Avg speed per vehicle (CAN)": answer2,
    "3. Hard braking events (CAN)": answer3,
    "4. IMU anomaly detection": answer4,
    "5. LiDAR point cloud overview": answer5,
    "6. GPS coverage area": answer6,
    "7. Fleet-wide CAN summary": answer7,
}

print("=" * 60)
print("TELEMETRY INSIGHT SUMMARY")
print("=" * 60)
for title, ans in queries.items():
    print(f"\n{'─' * 60}")
    print(f"  {title}")
    print(f"{'─' * 60}")
    print(ans)

print(f"\n{'=' * 60}")
print("Pipeline: cuDF + UVM → NIM (Llama 3 8B) on GKE")
print(f"NIM URL: {NIM_BASE_URL}")
print(f"{'=' * 60}")