A full-stack platform that lets users query real NYC public datasets using plain English, visualize results on an interactive map, and receive real-time alerts.
Demo query: "Show me noise complaints in Brooklyn last 30 days near parks" → Claude parses it → Elasticsearch fires → results pin to Mapbox → Celery watches for new matches → WebSocket alert.
| Layer | Tech |
|---|---|
| Backend | Python 3.11 + Django 4.2 + Django REST Framework |
| Database | PostgreSQL 15 + PostGIS (geospatial) |
| Search | Elasticsearch 8 + django-elasticsearch-dsl |
| Task Queue | Celery 5 + Redis 7 + django-celery-beat |
| WebSocket | Django Channels 4 + channels-redis |
| AI | Claude claude-sonnet-4-6 (tool calling + agentic mode) |
| Frontend | React 18 + Redux Toolkit + Mapbox GL JS + Vite |
| Infra | Docker Compose (dev) · GitHub Actions CI |
- NYC 311: 30M+ service requests via Socrata API (data.cityofnewyork.us)
- NYPD Crime Stats: complaint data via NYC Open Data
- MTA Subway Alerts: real-time transit disruptions via MTA RSS feed
- FEMA: disaster declarations (optional, add to `ingestion/sources/`)
cp .env.example .env
# Fill in ANTHROPIC_API_KEY and VITE_MAPBOX_TOKEN
docker compose up --build
# In a second terminal, run migrations + seed ES index
docker compose exec backend python manage.py migrate
docker compose exec backend python manage.py search_index --rebuild
# Trigger first ingestion manually
docker compose exec backend python manage.py shell -c "from apps.ingestion.tasks import run_all_ingestion; run_all_ingestion()"

- Frontend: http://localhost:5173
- Backend API: http://localhost:8000
- Django Admin: http://localhost:8000/admin
| Method | Path | Description |
|---|---|---|
| GET | /api/incidents/ | Paginated incident list |
| GET | /api/search/?q=noise&borough=Brooklyn&days=30 | Elasticsearch full-text + geo search |
| POST | /api/ai/chat/ | Natural language → tool calls → answer |
| POST | /api/ai/agent/ | Autonomous multi-step agent |
| GET | /api/alerts/ | Active alert subscriptions |
| WS | ws://localhost:8000/ws/alerts/ | Live incident push feed |
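
As a rough illustration of the search endpoint in the table above, the sketch below builds the request URL with the Python standard library. The `build_search_url` helper is hypothetical and not part of the project; only the path and the `q`, `borough`, and `days` parameters are taken from the table.

```python
from typing import Optional
from urllib.parse import urlencode

# Backend address as listed in the quick-start notes.
BASE_URL = "http://localhost:8000"


def build_search_url(q: str, borough: Optional[str] = None,
                     days: Optional[int] = None) -> str:
    """Return the /api/search/ URL for the given filters (hypothetical helper)."""
    params = {"q": q}
    # Optional filters are only added when the caller supplies them.
    if borough is not None:
        params["borough"] = borough
    if days is not None:
        params["days"] = days
    return f"{BASE_URL}/api/search/?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("noise", borough="Brooklyn", days=30))
```

The resulting URL can be passed to any HTTP client once the stack is running. Which fields the response contains is not specified here, so the sketch stops at building the request.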
User: "noise complaints in Brooklyn last 30 days near parks"
↓
Claude claude-sonnet-4-6 → tool_use: search_incidents({query: "noise", borough: "Brooklyn", days: 30})
↓
Backend executes Elasticsearch query → returns 847 results
↓
Claude → tool_use: aggregate_stats({group_by: "neighborhood", borough: "Brooklyn", days: 30})
↓
Backend queries Postgres → returns top neighborhoods by count
↓
Claude → end_turn: "Brooklyn had 847 noise complaints in the last 30 days. Bushwick (234) and Bed-Stuy (198) lead..."
↓
Frontend renders text + pins 847 locations on Mapbox map
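
The flow above is a loop: the model either requests a tool or ends its turn, and the backend runs each requested tool and feeds the result back. The sketch below shows that loop in simplified form. It is a sketch under assumptions and not the project's implementation: the tool bodies are stubs returning the figures from the example, the turn dictionaries are a simplified stand-in for the real API response objects, and `run_tool_loop` and `scripted_model` are hypothetical names.

```python
from typing import Any, Callable, Dict, List


# Stub tools; the real ones would query Elasticsearch and Postgres.
def search_incidents(query: str, borough: str, days: int) -> Dict[str, Any]:
    return {"count": 847}


def aggregate_stats(group_by: str, borough: str, days: int) -> Dict[str, Any]:
    return {"top": [["Bushwick", 234], ["Bed-Stuy", 198]]}


TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "search_incidents": search_incidents,
    "aggregate_stats": aggregate_stats,
}


def run_tool_loop(next_turn: Callable[[List[dict]], dict], max_steps: int = 5) -> str:
    """Ask the model for turns until it stops requesting tools."""
    history: List[dict] = []
    for _ in range(max_steps):
        turn = next_turn(history)
        if turn["stop_reason"] == "end_turn":
            return turn["text"]
        # stop_reason == "tool_use": dispatch by name and record the result.
        result = TOOLS[turn["name"]](**turn["input"])
        history.append({"tool": turn["name"], "result": result})
    raise RuntimeError("tool loop did not finish within max_steps")


def scripted_model(history: List[dict]) -> dict:
    """Stand-in for the model: replays the three turns from the example."""
    if len(history) == 0:
        return {"stop_reason": "tool_use", "name": "search_incidents",
                "input": {"query": "noise", "borough": "Brooklyn", "days": 30}}
    if len(history) == 1:
        return {"stop_reason": "tool_use", "name": "aggregate_stats",
                "input": {"group_by": "neighborhood", "borough": "Brooklyn", "days": 30}}
    count = history[0]["result"]["count"]
    return {"stop_reason": "end_turn",
            "text": f"Brooklyn had {count} noise complaints in the last 30 days."}


if __name__ == "__main__":
    print(run_tool_loop(scripted_model))
```

The `max_steps` cap is a design choice of this sketch: it keeps a misbehaving model from looping on tool calls indefinitely.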
Status: Accepted
Date: 2024-01
Context:
The ingestion pipeline needs a message queue for async data ingestion tasks and the alert system needs a pub/sub mechanism to fan out new incidents to WebSocket clients.
Decision:
Use Celery + Redis in development instead of Kafka.
Reasoning:
Kafka (with KRaft or ZooKeeper) adds 2-3 containers, requires broker configuration, and has meaningful operational overhead with no dev benefit when message throughput is low (< 1000 msg/day in dev). Celery provides the same logical abstraction — producers schedule tasks, consumers process them — without the setup cost.
Production migration path:
The codebase is structured so that only the transport layer needs to swap:
- `apps/ingestion/tasks.py`: Replace `@shared_task` decorators with `confluent-kafka` producers publishing to a `raw-incidents` topic
- `apps/alerts/tasks.py`: Replace Celery beat with a Kafka consumer group subscribing to `processed-incidents`
- `config/settings/base.py`: Add `KAFKA_BOOTSTRAP_SERVERS` setting; keep `CHANNEL_LAYERS` (Redis stays for WebSocket fan-out)
Zero model/API changes required — all Django models, serializers, Elasticsearch documents, and REST endpoints remain identical.
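
One way to keep such a swap confined to the transport layer is to have ingestion code depend on a small publisher interface and not on Celery or Kafka directly. The sketch below is an assumption about how that could look, not the project's actual code. `IncidentPublisher`, `InMemoryPublisher`, and `ingest` are hypothetical names, and the in-memory class stands in for a Celery-backed or Kafka-backed implementation.

```python
import json
from typing import Dict, List, Protocol, Tuple


class IncidentPublisher(Protocol):
    """Anything that can deliver a payload to a named topic or queue."""

    def publish(self, topic: str, payload: Dict) -> None: ...


class InMemoryPublisher:
    """Test double; a production class would wrap the real transport."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def publish(self, topic: str, payload: Dict) -> None:
        # Serialize as a real broker client would need bytes or text.
        self.sent.append((topic, json.dumps(payload, sort_keys=True)))


def ingest(records: List[Dict], publisher: IncidentPublisher,
           topic: str = "raw-incidents") -> int:
    """Publish each record and return how many were sent."""
    for record in records:
        publisher.publish(topic, record)
    return len(records)


if __name__ == "__main__":
    pub = InMemoryPublisher()
    ingest([{"id": 1, "borough": "Brooklyn"}], pub)
    print(pub.sent)
```

Because `ingest` only sees the interface, replacing the transport means supplying a different publisher class and leaving the calling code unchanged.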
Consequences:
- Dev: single `redis` container, simple setup
- Prod: Kafka handles terabyte-scale throughput with replay, consumer group parallelism, and durable logs
- The ADR itself demonstrates systems-thinking: knowing when to add complexity, not just how to