SafePath is a map-based safety routing and crime-risk visualization app for Atlanta, GA. It combines recent incident data, H3 spatial indexing, and XGBoost models to score risk by area and along routes. The stack includes a FastAPI backend, a React/Vite frontend, and data/ML pipelines that update Supabase tables and heatmap artifacts.
Features
- Fastest vs safest route comparison with qualitative summaries.
- H3-based heatmaps for map overlays and point sampling.
- Area analytics for recent incidents, top offenses, and trends.
- Point risk scoring by latitude/longitude.
- Data refresh pipeline that ingests Atlanta Open Data crime feeds.
Architecture
- frontend/: React + Vite + Google Maps UI.
- backend/: FastAPI API, Supabase client, and XGBoost inference.
- backend/jobs/: data pipeline and heatmap generation.
- training/: data prep and model training scripts.
- data/processed/: local outputs (feature snapshots, status, GeoJSON).
Requirements
- Python 3.10+ (backend)
- Node.js 18+ (frontend)
- Supabase project with required tables
Environment Variables
Backend (backend/.env)
- SUPABASE_URL
- SUPABASE_SERVICE_ROLE_KEY
- GEMINI_API_KEY or GOOGLE_API_KEY (optional, for natural-language summaries)
- GEMINI_MODEL (optional, defaults to gemini-2.5-flash)
Frontend (frontend/.env)
- VITE_API_URL (defaults to http://localhost:8000)
- VITE_SUPABASE_URL
- VITE_SUPABASE_ANON_KEY
- VITE_GOOGLE_MAPS_API_KEY
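The backend variables listed above could be read and validated along the following lines. This is a minimal sketch, not the project's actual configuration code: `BackendConfig` and `load_backend_config` are hypothetical names, treating the two Supabase variables as required is an assumption based on them not being marked optional, and preferring GEMINI_API_KEY over GOOGLE_API_KEY when both are set is also an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BackendConfig:  # hypothetical container, not taken from the SafePath code
    supabase_url: str
    supabase_service_role_key: str
    gemini_api_key: Optional[str]
    gemini_model: str


def load_backend_config(env: Mapping[str, str]) -> BackendConfig:
    """Build a config from a mapping such as os.environ."""
    required = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required variables: {', '.join(missing)}")
    return BackendConfig(
        supabase_url=env["SUPABASE_URL"],
        supabase_service_role_key=env["SUPABASE_SERVICE_ROLE_KEY"],
        # Either key name enables summaries; the precedence here is an assumption.
        gemini_api_key=env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY"),
        # Documented default model when GEMINI_MODEL is unset.
        gemini_model=env.get("GEMINI_MODEL") or "gemini-2.5-flash",
    )
```

Calling `load_backend_config(os.environ)` after loading backend/.env would then fail fast on a missing Supabase setting while leaving the Gemini key optional.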
Quickstart
- Backend setup and run:
python -m venv .venv
source .venv/bin/activate
pip install -r backend/requirements.txt
uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000
- Frontend setup and run:
cd frontend
npm install
npm run dev
- Open http://localhost:5173 in your browser.
Data Pipeline
Build hourly features and (optionally) upsert incidents:
python backend/jobs/pipeline.py --days 30 --out data/processed/features_current_hour.csv
Generate heatmaps (GeoJSON) from Supabase risk scores:
python backend/jobs/heatmaps.py --zoom 14 --out data/processed/heatmaps_latest.geojson
Trigger both from the API:
curl -X POST http://localhost:8000/update
Training
Scripts in training/ clean data and train XGBoost models. They require additional dependencies (e.g., scikit-learn) beyond backend/requirements.txt. Trained models should be exported to:
- backend/models/anycrime_prob.json
- backend/models/severity_given_crime.json
Risk Scoring
Risk assessment combines two XGBoost models with percentile-based ranking.
Incident risk weighting
Each raw incident is assigned a weighted risk value used during feature aggregation:
risk = 1 + 3 × is_person + 4 × is_gun + 0.5 × victim_count
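As a worked illustration, the weighting formula above translates directly into a small function. The function name and argument types are illustrative assumptions; only the coefficients come from the formula.

```python
def incident_risk(is_person: bool, is_gun: bool, victim_count: int) -> float:
    """Weighted risk for one incident: 1 + 3*is_person + 4*is_gun + 0.5*victim_count."""
    return 1.0 + 3.0 * int(is_person) + 4.0 * int(is_gun) + 0.5 * victim_count


# A person crime involving a firearm with two victims: 1 + 3 + 4 + 1 = 9.0
armed_assault = incident_risk(is_person=True, is_gun=True, victim_count=2)

# A property crime with no firearm and no victims keeps the base weight of 1.0
property_crime = incident_risk(is_person=False, is_gun=False, victim_count=0)
```

Under this weighting a single armed person crime contributes as much aggregated risk as several property incidents.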
Features
Rolling features are computed per H3 cell across multiple windows (1 h, 6 h, 24 h, 7 d, 30 d) for total incidents, person crimes, property crimes, firearm involvement, and aggregated risk. Calendar features (hour, day-of-week, month, is_weekend) are also included.
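The windowed aggregation described above can be sketched for a single H3 cell as follows. This is a simplified illustration rather than the pipeline's code: the feature names, the half-open window `[as_of - span, as_of)`, and the reduction to two of the five aggregates (incident count and summed risk) are all assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple

# Window lengths taken from the text; the short labels are illustrative.
WINDOWS = {
    "1h": timedelta(hours=1),
    "6h": timedelta(hours=6),
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}


def cell_features(
    incidents: Iterable[Tuple[datetime, float]], as_of: datetime
) -> Dict[str, float]:
    """Rolling and calendar features for ONE cell.

    `incidents` holds (timestamp, weighted risk) pairs for that cell.
    """
    rows = list(incidents)
    features: Dict[str, float] = {}
    for label, span in WINDOWS.items():
        # Assumed window: incidents strictly before `as_of`, no older than `span`.
        in_window = [risk for ts, risk in rows if as_of - span <= ts < as_of]
        features[f"incidents_{label}"] = float(len(in_window))
        features[f"risk_{label}"] = float(sum(in_window))
    # Calendar features for the hour being scored.
    features["hour"] = float(as_of.hour)
    features["day_of_week"] = float(as_of.weekday())  # Monday = 0
    features["month"] = float(as_of.month)
    features["is_weekend"] = float(as_of.weekday() >= 5)
    return features
```

The per-category counts (person, property, firearm) would follow the same pattern with an extra filter on each incident.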
Models
| Model | File | Type | Predicts |
|---|---|---|---|
| Crime probability | anycrime_prob.json | XGBoost binary classifier | p_anycrime: probability that any crime occurs in the cell during the next hour |
| Severity | severity_given_crime.json | XGBoost regressor | severity_given_crime: expected severity if a crime does occur |
The combined expected risk for a cell is:
expected_risk = p_anycrime × severity_given_crime
Percentile-based risk ranking
Raw expected_risk values are converted to a 0–100 risk_index using a percentile rank across all cells for the current hour (pandas .rank(pct=True, method="average") × 100). The ranking is relative: a cell's index reflects roughly what share of cells have equal or lower risk at that hour, with tied cells sharing the average of their ranks.
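To make the ranking step concrete, here is a dependency-free sketch that multiplies the two model outputs and ranks the result. It is intended to match what pandas `.rank(pct=True, method="average") * 100` produces for non-missing values, but it is not the project's code; the function names and the cell ids in the usage example are hypothetical.

```python
from typing import Dict, List, Tuple


def percentile_rank(values: List[float]) -> List[float]:
    """Percentile rank on a 0-100 scale, with ties sharing their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Extend `end` across the run of values tied with order[start].
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return [100.0 * rank / n for rank in ranks]


def risk_index(cells: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Map cell id -> risk_index from (p_anycrime, severity_given_crime) pairs."""
    ids = list(cells)
    expected_risk = [cells[c][0] * cells[c][1] for c in ids]
    return dict(zip(ids, percentile_rank(expected_risk)))


# Hypothetical cell ids with (p_anycrime, severity_given_crime) outputs.
scores = risk_index({"a": (0.1, 2.0), "b": (0.2, 2.0), "c": (0.4, 1.0), "d": (0.5, 3.0)})
```

Cells "b" and "c" have the same expected risk here, so they receive the same index, which illustrates why the index is only approximately "share of cells at or below this risk".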
Risk bands used for heatmap colouring:
| Band | Percentile | Colour |
|---|---|---|
| Very High | ≥ 99 | Red |
| High | 95 – 99 | Orange |
| Elevated | 90 – 95 | Yellow |
| Medium | 75 – 90 | Light Green |
| Low | < 75 | Blue |
Only cells with risk_index ≥ 95 are shown on the heatmap overlay.
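The banding and the overlay cut-off can be expressed as a simple lookup. In this sketch the function and constant names are illustrative, and treating each band's lower bound as inclusive (so that exactly 95 counts as High) is an assumption inferred from the "≥ 99" and "≥ 95" thresholds.

```python
from typing import Dict

# (lower bound, band, colour) from the table, checked from highest to lowest.
BANDS = [
    (99.0, "Very High", "Red"),
    (95.0, "High", "Orange"),
    (90.0, "Elevated", "Yellow"),
    (75.0, "Medium", "Light Green"),
]
HEATMAP_MIN_INDEX = 95.0  # only cells at or above this index are drawn


def risk_band(risk_index: float) -> Dict[str, str]:
    """Return the band name and colour for a 0-100 risk_index."""
    for lower_bound, band, colour in BANDS:
        if risk_index >= lower_bound:
            return {"band": band, "colour": colour}
    return {"band": "Low", "colour": "Blue"}


def heatmap_cells(indices: Dict[str, float]) -> Dict[str, Dict[str, str]]:
    """Keep only the cells that would appear on the overlay."""
    return {
        cell: risk_band(idx)
        for cell, idx in indices.items()
        if idx >= HEATMAP_MIN_INDEX
    }
```

With the 95 cut-off, only the High and Very High bands ever reach the overlay, so the yellow, green, and blue colours apply to other views of the index.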
Route risk score
Routes are scored by sampling risk_index values from cells along the path:
route_risk_score = 0.35 × mean(risk_index) + 0.65 × p90(risk_index)
The 65% weight on the 90th-percentile value lets the riskiest stretches of a route dominate the score, so a mostly calm route that passes through a dangerous segment still scores high. A route is classified as High risk when incident density ≥ 8 per mile or the p90 risk index ≥ 85, Medium when density ≥ 3 per mile or p90 ≥ 60, and Low otherwise.
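The score and classification rules above can be sketched as follows. The function names are hypothetical, and the percentile uses linear interpolation between sorted samples, which is an assumption since the text does not say how p90 is computed.

```python
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) with linear interpolation between sorted samples."""
    ordered: List[float] = sorted(values)
    position = (len(ordered) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)


def route_risk_score(risk_indices: Sequence[float]) -> float:
    """0.35 * mean + 0.65 * p90 over the risk_index samples along a route."""
    mean = sum(risk_indices) / len(risk_indices)
    return 0.35 * mean + 0.65 * percentile(risk_indices, 90)


def classify_route(incidents_per_mile: float, p90_risk_index: float) -> str:
    """Apply the High / Medium / Low thresholds from the text."""
    if incidents_per_mile >= 8 or p90_risk_index >= 85:
        return "High"
    if incidents_per_mile >= 3 or p90_risk_index >= 60:
        return "Medium"
    return "Low"


# A mostly calm route with one risky sample near the end.
samples = [10.0, 20.0, 30.0, 40.0, 100.0]
score = route_risk_score(samples)
label = classify_route(incidents_per_mile=2.0, p90_risk_index=percentile(samples, 90))
```

In this example the mean is 40 but the interpolated p90 is about 76, so the score lands near 63 and the route is labelled Medium even though most samples are low.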
Supabase Tables
The backend expects these tables:
- incidents_raw: raw incident rows from Atlanta Open Data.
- cell_map: cell_id (H3) ↔ cell_idx mapping.
- features_hourly: feature snapshots per cell/hour.
- risk_scores: model outputs (p_anycrime, severity_given_crime, expected_risk, risk_index) and heatmap fields.
API Summary
Base URL: http://localhost:8000
- GET /status: health + last update time.
- POST /update: run pipeline + heatmap updates.
- GET /heatmaps: GeoJSON hexes for map overlays.
- GET /heatmap: weighted point cloud for heatmap visualization.
- POST /score_point: risk score for a lat/lng.
- GET /incidents/analytics: summary stats for recent incidents.
- GET /cell/insights: insights for a single H3 cell.
- POST /route/insights: risk summary for a route.
- POST /route/selection_summary: explains safest vs fastest selection.
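As a rough illustration of calling one of these endpoints from Python, the sketch below builds a POST /score_point request with the standard library. The JSON field names `lat` and `lng` are assumptions (the request schema is not documented here), as are the helper names; check the FastAPI docs page of a running backend for the real payload shape.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "http://localhost:8000"


def build_score_point_request(
    lat: float, lng: float, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Prepare (but do not send) a POST /score_point request."""
    # Assumed payload shape: {"lat": ..., "lng": ...}
    body = json.dumps({"lat": lat, "lng": lng}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/score_point",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Downtown Atlanta coordinates; call send(request) with the backend running.
request = build_score_point_request(33.749, -84.388)
```

Separating request construction from sending keeps the example inspectable without a live server.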
Notes
- Default map center and data pipeline target Atlanta, GA.
- Route summaries fall back to a rule-based summary if no Gemini API key is configured.