SafePath

SafePath is a map-based safety routing and crime-risk visualization app for Atlanta, GA. It combines recent incident data, H3 spatial indexing, and XGBoost models to score risk by area and along routes. The stack includes a FastAPI backend, a React/Vite frontend, and data/ML pipelines that update Supabase tables and heatmap artifacts.

Features

Fastest vs safest route comparison with qualitative summaries.
H3-based heatmaps for map overlays and point sampling.
Area analytics for recent incidents, top offenses, and trends.
Point risk scoring by latitude/longitude.
Data refresh pipeline that ingests Atlanta Open Data crime feeds.

Architecture

frontend/: React + Vite + Google Maps UI.
backend/: FastAPI API, Supabase client, and XGBoost inference.
backend/jobs/: data pipeline and heatmap generation.
training/: data prep and model training scripts.
data/processed/: local outputs (feature snapshots, status, GeoJSON).

Requirements

Python 3.10+ (backend)
Node.js 18+ (frontend)
Supabase project with required tables

Environment Variables

Backend (backend/.env)

SUPABASE_URL
SUPABASE_SERVICE_ROLE_KEY
GEMINI_API_KEY or GOOGLE_API_KEY (optional, for natural-language summaries)
GEMINI_MODEL (optional, defaults to gemini-2.5-flash)
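
As a rough illustration of how these variables might be read at startup, the sketch below validates the two required values and applies the documented default for GEMINI_MODEL. The function name `load_backend_settings`, the returned dictionary keys, and the choice to prefer GEMINI_API_KEY over GOOGLE_API_KEY are assumptions made for this example, not the backend's actual code.

```python
import os
from typing import Mapping, Optional

# Variables the backend cannot run without, per the list above.
REQUIRED = ("SUPABASE_URL", "SUPABASE_SERVICE_ROLE_KEY")


def load_backend_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect backend settings from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {
        "supabase_url": env["SUPABASE_URL"],
        "supabase_key": env["SUPABASE_SERVICE_ROLE_KEY"],
        # Assumption: GEMINI_API_KEY wins when both keys are set.
        "gemini_api_key": env.get("GEMINI_API_KEY") or env.get("GOOGLE_API_KEY"),
        # Documented default when GEMINI_MODEL is unset.
        "gemini_model": env.get("GEMINI_MODEL", "gemini-2.5-flash"),
    }
```

With no Gemini or Google key present, `gemini_api_key` comes back as `None`, which is the situation in which route summaries fall back to the rule-based summary.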

Frontend (frontend/.env)

VITE_API_URL (defaults to http://localhost:8000)
VITE_SUPABASE_URL
VITE_SUPABASE_ANON_KEY
VITE_GOOGLE_MAPS_API_KEY

Quickstart

Backend setup and run:

python -m venv .venv
source .venv/bin/activate
pip install -r backend/requirements.txt
uvicorn backend.app.main:app --reload --host 0.0.0.0 --port 8000

Frontend setup and run:

cd frontend
npm install
npm run dev

Open http://localhost:5173 in your browser.

Data Pipeline

Build hourly features and (optionally) upsert incidents:

python backend/jobs/pipeline.py --days 30 --out data/processed/features_current_hour.csv

Generate heatmaps (GeoJSON) from Supabase risk scores:

python backend/jobs/heatmaps.py --zoom 14 --out data/processed/heatmaps_latest.geojson

Trigger both from the API:

curl -X POST http://localhost:8000/update
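
The same trigger can be issued from Python using only the standard library. This is a minimal sketch: the helper names and the 600-second timeout are assumptions, and the shape of the response body is not documented here, so it is returned as raw text.

```python
import urllib.request


def build_update_request(base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST /update request without sending it (hypothetical helper)."""
    url = base_url.rstrip("/") + "/update"
    # An empty body is sent, matching the curl call above.
    return urllib.request.Request(url, data=b"", method="POST")


def trigger_update(base_url: str = "http://localhost:8000", timeout: float = 600.0):
    """Send the request and return (HTTP status, response text)."""
    with urllib.request.urlopen(build_update_request(base_url), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Calling `trigger_update()` requires the backend to be running on port 8000.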

Training

Scripts in training/ clean data and train XGBoost models. They require additional dependencies (e.g., scikit-learn) beyond backend/requirements.txt. Trained models should be exported to:

backend/models/anycrime_prob.json
backend/models/severity_given_crime.json

Risk Scoring

Risk assessment combines two XGBoost models with percentile-based ranking.

Incident risk weighting

Each raw incident is assigned a weighted risk value used during feature aggregation:

risk = 1 + 3 × is_person + 4 × is_gun + 0.5 × victim_count
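
The weighting formula can be written as a small function. The name `incident_risk` and its argument types are chosen for this example; only the coefficients come from the formula above.

```python
def incident_risk(is_person: bool, is_gun: bool, victim_count: int) -> float:
    """Weighted risk for one raw incident, using the coefficients above."""
    return 1.0 + 3.0 * int(is_person) + 4.0 * int(is_gun) + 0.5 * victim_count


# A property incident with no victims scores the base value.
print(incident_risk(False, False, 0))  # 1.0
# A person crime involving a firearm with two victims: 1 + 3 + 4 + 1.
print(incident_risk(True, True, 2))    # 9.0
```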

Features

Rolling features are computed per H3 cell across multiple trailing windows (1 h, 6 h, 24 h, 7 d, 30 d) for total incidents, person crimes, property crimes, firearm involvement, and aggregated risk. Calendar features (hour, day-of-week, month, is_weekend) are also included.
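
To make the window idea concrete, here is a simplified sketch that counts total incidents per cell in each trailing window and derives the calendar fields. It is not the pipeline's implementation: the input shape (`(cell_id, occurred_at)` tuples), the feature names, and the strict "age less than window width" boundary are all assumptions. The same pattern would extend to person, property, firearm, and risk-sum features.

```python
from datetime import datetime, timedelta

# Trailing windows named in the text above.
WINDOWS = {
    "1h": timedelta(hours=1),
    "6h": timedelta(hours=6),
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}


def window_counts(incidents, now: datetime) -> dict:
    """Count incidents per cell in each window ending at `now`.

    `incidents` is an iterable of (cell_id, occurred_at) tuples (assumed shape).
    """
    features = {}
    for cell_id, occurred_at in incidents:
        age = now - occurred_at
        if age < timedelta(0):
            continue  # ignore incidents stamped after the reference time
        row = features.setdefault(cell_id, {f"incidents_{k}": 0 for k in WINDOWS})
        for label, width in WINDOWS.items():
            if age < width:
                row[f"incidents_{label}"] += 1
    return features


def calendar_features(now: datetime) -> dict:
    """Hour, day-of-week (Monday = 0), month, and weekend flag for `now`."""
    return {
        "hour": now.hour,
        "day_of_week": now.weekday(),
        "month": now.month,
        "is_weekend": int(now.weekday() >= 5),
    }
```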

Models

| Model | File | Type | Predicts |
| --- | --- | --- | --- |
| Crime probability | `anycrime_prob.json` | XGBoost binary classifier | `p_anycrime`: probability any crime occurs in the cell during the next hour |
| Severity | `severity_given_crime.json` | XGBoost regressor | `severity_given_crime`: expected severity if a crime does occur |

The combined expected risk for a cell is:

expected_risk = p_anycrime × severity_given_crime

Percentile-based risk ranking

Raw expected_risk values are converted to a 0–100 risk_index using a percentile rank across all cells for the current hour (pandas .rank(pct=True, method="average") × 100). The ranking is relative: a cell's index reflects roughly what share of cells have equal or lower risk at that point in time, with tied cells sharing the average of their ranks.
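
The sketch below spells out the arithmetic behind that ranking in plain Python so the tie handling is visible. It is intended to reproduce what `rank(pct=True, method="average") × 100` yields for values without missing entries; the function names and the sample probabilities and severities are invented for illustration.

```python
from bisect import bisect_left, bisect_right


def expected_risk(p_anycrime: float, severity_given_crime: float) -> float:
    """Combine the two model outputs as described above."""
    return p_anycrime * severity_given_crime


def risk_index(expected_risks):
    """Average-rank percentile (0-100] for each value, ties sharing a rank."""
    ordered = sorted(expected_risks)
    n = len(ordered)
    result = []
    for value in expected_risks:
        lo = bisect_left(ordered, value)   # number of strictly smaller values
        hi = bisect_right(ordered, value)  # number of values <= value
        avg_rank = (lo + 1 + hi) / 2       # mean of the tied ranks lo+1 .. hi
        result.append(100 * avg_rank / n)
    return result


# Four hypothetical cells as (p_anycrime, severity_given_crime) pairs.
cells = [(0.1, 2.0), (0.5, 1.0), (0.25, 2.0), (0.9, 4.0)]
risks = [expected_risk(p, s) for p, s in cells]
print(risk_index(risks))  # [25.0, 62.5, 62.5, 100.0]
```

The two middle cells have the same expected risk, so they share the average of ranks 2 and 3.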

Risk bands used for heatmap colouring:

| Band | Percentile | Colour |
| --- | --- | --- |
| Very High | ≥ 99 | Red |
| High | 95 to < 99 | Orange |
| Elevated | 90 to < 95 | Yellow |
| Medium | 75 to < 90 | Light Green |
| Low | < 75 | Blue |

Only cells with risk_index ≥ 95 are shown on the heatmap overlay.
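
A band lookup along these lines would reproduce the bands and the overlay cutoff. Treating each lower bound as inclusive is an assumption, based on the ≥ 99 and ≥ 95 thresholds stated above; the function names are illustrative.

```python
def risk_band(risk_index: float) -> str:
    """Map a 0-100 risk_index to its band (lower bounds assumed inclusive)."""
    if risk_index >= 99:
        return "Very High"
    if risk_index >= 95:
        return "High"
    if risk_index >= 90:
        return "Elevated"
    if risk_index >= 75:
        return "Medium"
    return "Low"


def shown_on_heatmap(risk_index: float) -> bool:
    """Only High and Very High cells appear on the overlay."""
    return risk_index >= 95


print(risk_band(96.2), shown_on_heatmap(96.2))  # High True
print(risk_band(80.0), shown_on_heatmap(80.0))  # Medium False
```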

Route risk score

Routes are scored by sampling risk_index values from cells along the path:

route_risk_score = 0.35 × mean(risk_index) + 0.65 × p90(risk_index)

The 65% weight on the 90th percentile makes the riskiest stretch of a route count for more than the route average, so a route with a dangerous segment is not scored as safe just because the rest of it is calm. A route is classified as High risk when incident density ≥ 8 per mile or the p90 risk index ≥ 85, Medium when density ≥ 3 or p90 ≥ 60, and Low otherwise.
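
The score and classification can be sketched as follows. The percentile here uses linear interpolation between sorted samples, which is an assumption about how p90 is computed; the function names and sample values are also illustrative.

```python
from statistics import fmean


def percentile(values, q: float) -> float:
    """q-th percentile (0-100) with linear interpolation (assumed method)."""
    if not values:
        raise ValueError("at least one sample is required")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def route_risk_score(samples) -> float:
    """0.35 x mean + 0.65 x p90 of the sampled risk_index values."""
    return 0.35 * fmean(samples) + 0.65 * percentile(samples, 90)


def route_risk_level(incidents_per_mile: float, p90: float) -> str:
    """Classify a route using the thresholds stated above."""
    if incidents_per_mile >= 8 or p90 >= 85:
        return "High"
    if incidents_per_mile >= 3 or p90 >= 60:
        return "Medium"
    return "Low"


# Eleven samples: mostly calm, with a dangerous stretch covering two of them.
samples = [10] * 9 + [90, 90]
p90 = percentile(samples, 90)
print(p90)                                  # 90.0
print(round(route_risk_score(samples), 2))  # 67.09
print(route_risk_level(1.0, p90))           # High
```

In this example the mean is only about 24.5, yet the score lands near 67 because the p90 term carries most of the weight.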

Supabase Tables

The backend expects these tables:

incidents_raw: raw incident rows from Atlanta Open Data.
cell_map: cell_id (H3) ↔ cell_idx mapping.
features_hourly: feature snapshots per cell/hour.
risk_scores: model outputs (p_anycrime, severity_given_crime, expected_risk, risk_index) and heatmap fields.

API Summary

Base URL: http://localhost:8000

GET /status: health + last update time.
POST /update: run pipeline + heatmap updates.
GET /heatmaps: GeoJSON hexes for map overlays.
GET /heatmap: weighted point cloud for heatmap visualization.
POST /score_point: risk score for a lat/lng.
GET /incidents/analytics: summary stats for recent incidents.
GET /cell/insights: insights for a single H3 cell.
POST /route/insights: risk summary for a route.
POST /route/selection_summary: explains safest vs fastest selection.

Notes

Default map center and data pipeline target Atlanta, GA.
Route summaries fall back to a rule-based summary if no Gemini API key is configured.
