A CLI bot that answers natural-language questions about indoor location data. It generates SQL from your question, runs it against a SQLite database, and returns a grounded, readable answer — always showing the SQL it ran.
Q: Who was in the lobby today?
SQL:
SELECT DISTINCT e.name, e.department, z.name AS zone, ze.entered_at, ze.exited_at
FROM zone_events ze
JOIN entities e ON e.entity_id = ze.entity_id
JOIN zones z ON z.zone_id = ze.zone_id
WHERE e.entity_type = 'person'
AND z.name = 'Lobby'
AND ze.entered_at BETWEEN '2026-02-21 00:00:00' AND '2026-02-21 12:18:03'
ORDER BY ze.entered_at DESC;
Answer:
Found 5 unique person(s) in 8 event record(s):
Alice Chen, David Kim, Frank Nguyen, Iris Johnson, Jack Thompson
ops-assistant/
├── schema.sql Database schema (zones, entities, pings, zone_events)
├── generate_data.py Generates ops_assistant.db with 48h of sample data
├── query_engine.py NL→SQL translator (rule-based, no external LLM required)
├── formatter.py Formats SQL results into readable answers
├── assistant.py CLI entrypoint (interactive REPL + single-question mode)
├── sample_outputs.txt 15 example questions with real output
└── README.md This file
- Python 3.8+
- No external packages needed (uses only the standard library)
python generate_data.pyOutput:
Schema created.
zones : 12
entities : 15
pings : 18463
zone_events : 404
Database ready → ops_assistant.db
python assistant.py "Who is currently in Meeting Room 3?"
python assistant.py "Show floor-jump anomalies"
python assistant.py "How long did Henry Brooks spend in Meeting Room 3 today?"python assistant.pyThen type questions at the You: prompt. Type help to see examples, quit to exit.
python assistant.py --demopython assistant.py --db /path/to/other.db "Who is in the lobby?"| Column | Type | Notes |
|---|---|---|
| zone_id | INTEGER | Primary key |
| name | TEXT | Unique zone name (e.g. "Open Office A") |
| floor | INTEGER | Physical floor number |
| building | TEXT | Building identifier |
| capacity | INTEGER | Max intended occupancy |
| zone_type | TEXT | office / meeting / lobby / warehouse / common |
| Column | Type | Notes |
|---|---|---|
| entity_id | INT | Primary key |
| name | TEXT | Display name |
| entity_type | TEXT | person or asset |
| department | TEXT | Org unit |
| tag_id | TEXT | BLE/UWB hardware tag ID |
| Column | Type | Notes |
|---|---|---|
| ping_id | INTEGER | Primary key |
| entity_id | INTEGER | FK → entities |
| zone_id | INTEGER | FK → zones |
| ts | DATETIME | UTC timestamp of observation |
| rssi | INTEGER | Signal strength in dBm (−40 to −100) |
| floor_detected | INTEGER | Floor reported by reader hardware |
| reader_id | TEXT | Which antenna/reader saw the tag |
| Column | Type | Notes |
|---|---|---|
| event_id | INTEGER | Primary key |
| entity_id | INTEGER | FK → entities |
| zone_id | INTEGER | FK → zones |
| entered_at | DATETIME | When entity entered the zone |
| exited_at | DATETIME | When entity exited (NULL = still present) |
| dwell_secs | INTEGER | Computed dwell duration in seconds |
All questions support natural time expressions:
today→ midnight UTC to nowyesterday→ full previous calendar daylast 30 minutes→ rolling 30-min windowlast hour,last 2 hours,last 24 hours,last 48 hours,last week
Who is in the lobby right now?
Who was in Meeting Room 1 today?
Who is currently in Open Office B?
What is the current occupancy?
Where is Alice Chen?
Where has Bob Martinez been in the last 2 hours?
Show me the movement of Grace Lee today
How long did Henry Brooks spend in Meeting Room 3 today?
Show dwell time for everyone in the cafeteria yesterday
How busy was Meeting Room 2 today?
What is the traffic in the cafeteria today?
Which zones had the most visitors?
Show floor-jump anomalies ← reader detected wrong floor vs zone floor
Show low RSSI pings ← weak signal reads (below −85 dBm by default)
Show low RSSI pings below -90 dBm ← custom threshold
Show data quality issues
When was Karen Adams last seen?
List all zones
List all people
If the assistant can't map your question to a query, it responds gracefully:
I couldn't map that question to a query. Try asking about:
• Who is in a zone (now / today / yesterday / last 30 minutes)
...
generate_data.py simulates 48 hours of activity for 12 people and 3 assets across 12 zones in a 4-floor building.
Data includes:
- Realistic work-hour schedules (8am–6pm concentrations, rare after-hours)
- 18,000+ ping records at 30–90 second intervals
- 400+ zone entry/exit events with dwell times
- ~5% floor-jump anomalies (reader reports wrong floor)
- 30 deliberately injected low-RSSI pings (< −85 dBm)
Zones (HQ building):
| Floor | Zones |
|---|---|
| 1 | Lobby, Reception, Cafeteria, Warehouse A |
| 2 | Open Office A, Open Office B, Meeting Room 1, Meeting Room 2 |
| 3 | Meeting Room 3, Executive Suite, Server Room |
| 4 | Gym |
The current query_engine.py uses rule-based pattern matching — zero external
dependencies, fully offline. If you want to upgrade to LLM-generated SQL:
- Install the Anthropic SDK:
pip install anthropic - Set your API key:
export ANTHROPIC_API_KEY=sk-ant-... - Replace the
translate()function inquery_engine.pywith an API call:
import anthropic, sqlite3, json
def translate_llm(question: str, schema: str):
client = anthropic.Anthropic()
prompt = f"""You are a SQLite expert. Given this schema:
{schema}
Generate a SQL query to answer: {question}
Return only valid SQL, nothing else."""
msg = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}]
)
sql = msg.content[0].text.strip()
return sql, [], "LLM-generated query"The rest of the pipeline (execution, formatting) works unchanged.
User question (natural language)
│
▼
query_engine.translate() ← NL → (SQL, params, description)
├── Time window detection "today", "last 30 minutes", etc.
├── Entity/zone extraction name matching
└── Handler dispatch 12 specialised query builders
│
▼
sqlite3.execute(sql, params) ← Run against ops_assistant.db
│
▼
formatter.format_results() ← rows + col_names → readable text
│
▼
Printed answer + SQL shown
To start fresh:
python generate_data.py # deletes and recreates ops_assistant.dbThe random seed is fixed (random.seed(42)) so outputs are reproducible.
Change the seed or hours_back parameter in simulate_person() to vary coverage.