feat(location): add AI enrichment, geocode cache, and map drilldown #9

Merged: darron merged 3 commits into main from closer-map on Mar 3, 2026
@darron (Owner) commented Mar 3, 2026

Introduce an end-to-end location enrichment pipeline so incidents can be verified and visualized at city level with less manual cleanup.

- Add migration `0003_location_enrichment.sql` with verified-city metadata, lat/lon fields, timestamps, and a `city_geocode_cache` table + indexes.
- Add `src/ai-location.js` to extract municipality from linked source text, score confidence, apply fallbacks, geocode Canadian cities via Nominatim, and cache geocode outcomes.
- Add admin APIs for single and bulk enrichment: `POST /admin/api/records/:id/enrich-location` and `POST /admin/api/records/enrich-location-all`, with `force`, `geocode`, and `min_confidence` controls.
- Extend admin UI with per-record and bulk location actions, plus display of verified city, confidence, coordinates, and location source.
- Update map data flow to use enriched fields and add Leaflet province drilldown markers with mapped-vs-total visibility.
- Document the new behavior/env vars and bump Wrangler to `^4.69.0` (lockfile refresh).
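The geocode-and-cache step listed above could look roughly like the following. This is a minimal sketch, not the code in `src/ai-location.js`: the helper name `geocodeCityCached`, the injected `fetchJson` function, and the in-memory `Map` (standing in for the `city_geocode_cache` table) are all assumptions made for illustration. The query parameters are standard Nominatim search parameters.

```javascript
// Hypothetical helper: an in-memory Map stands in for the city_geocode_cache table.
const NOMINATIM_URL = "https://nominatim.openstreetmap.org/search";

// Case- and whitespace-insensitive key so "Toronto" and " toronto " share an entry.
function cacheKey(city, province) {
  return `${city.trim().toLowerCase()}|${province.trim().toLowerCase()}`;
}

// fetchJson(url) is injected (assumption) so the network call can be stubbed.
async function geocodeCityCached(city, province, cache, fetchJson) {
  const key = cacheKey(city, province);
  // Hits and misses are both cached, so an unresolvable city is not re-queried.
  if (cache.has(key)) return cache.get(key);

  const params = new URLSearchParams({
    format: "jsonv2",
    city,
    state: province,
    countrycodes: "ca",
    limit: "1",
  });
  const results = await fetchJson(`${NOMINATIM_URL}?${params}`);
  const top = Array.isArray(results) ? results[0] : undefined;
  const outcome = top
    ? { status: "ok", lat: Number(top.lat), lon: Number(top.lon) }
    : { status: "not_found", lat: null, lon: null };
  cache.set(key, outcome);
  return outcome;
}
```

Caching negative outcomes as well as positive ones is one way to keep repeated bulk runs from spending rate-limited requests on cities the provider cannot resolve.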

Behavior changes:

- `/map/canada` now prefers `city_verified` + `location_lat`/`location_lon` when available.
- Bulk enrichment in `only_missing && !force` mode processes unchecked missing rows first to avoid endlessly reprocessing unresolved records.
- Enrichment endpoints return `412` with a clear schema message if migration `0003` has not been applied.
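The bulk ordering rule above can be illustrated with a small sketch. The function name `orderBulkCandidates` and the `location_checked_at` field are hypothetical (the PR mentions timestamps but does not name the columns); only `city_verified`, `only_missing`, and `force` come from the description.

```javascript
// Hypothetical field: location_checked_at (null/undefined means never attempted).
function orderBulkCandidates(records, { onlyMissing = true, force = false } = {}) {
  // With force, or when not restricted to missing rows, keep the input order.
  if (force || !onlyMissing) return [...records];
  return records
    .filter((r) => !r.city_verified)
    .sort((a, b) => {
      const aChecked = a.location_checked_at ? 1 : 0;
      const bChecked = b.location_checked_at ? 1 : 0;
      // Never-checked rows come first so unresolved rows cannot starve them.
      if (aChecked !== bChecked) return aChecked - bChecked;
      // Among already-checked rows, retry the least recently checked first.
      const at = a.location_checked_at ?? "";
      const bt = b.location_checked_at ?? "";
      return at < bt ? -1 : at > bt ? 1 : 0;
    });
}
```

The timestamp comparison assumes ISO 8601 strings in a single time zone, which sort correctly as plain strings.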

Risks:

- City extraction/geocoding quality is probabilistic; some false positives/negatives are still possible.
- External geocoding adds network and provider rate-limit failure modes.
- Wrangler `4.69.0` raises local tooling expectations to Node 20+.

Follow-ups:

- Apply migration `0003_location_enrichment.sql` in staging/production before running backfill.
- Monitor unresolved/failure counts and geocode-cache hit rate.
- Consider scheduled retries for unresolved records and optional provider failover.

@darron darron self-assigned this Mar 3, 2026
Comment thread src/ai-location.js Outdated
darron added 2 commits March 2, 2026 22:01
Throttle `triggerBulkRecordLocationEnrichment` geocode calls to ~1 request per
1.1 seconds after the first record. This aligns bulk enrichment with
Nominatim’s rate limit and reduces rate-limit-related failures during
`geocode=true` runs.
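As a rough illustration of the throttle described above, the loop below waits 1.1 seconds before every record except the first when geocoding is enabled. It is a sketch under assumptions, not the body of `triggerBulkRecordLocationEnrichment`: `enrichAllThrottled`, `enrichOne`, and the injectable `sleep` are hypothetical names.

```javascript
const GEOCODE_DELAY_MS = 1100; // slightly above Nominatim's one-request-per-second limit

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical loop shape; enrichOne and sleep are injected for illustration.
async function enrichAllThrottled(records, enrichOne, { geocode = true, sleep = defaultSleep } = {}) {
  const results = [];
  for (let i = 0; i < records.length; i++) {
    // No wait before the first record, and no wait at all when geocoding is off.
    if (geocode && i > 0) await sleep(GEOCODE_DELAY_MS);
    results.push(await enrichOne(records[i]));
  }
  return results;
}
```

With this shape a batch of N records takes at least (N - 1) × 1.1 seconds when `geocode` is true, which is the source of the timeout risk noted below.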

Add `integrity` and `crossorigin` to Leaflet CSS/JS CDN includes in the Canada
map template to harden third-party asset loading and improve frontend supply
chain safety.

Behavior changes:
- Bulk geocoding now runs slower by design when geocoding is enabled.
- Leaflet assets now require hash match; mismatches block asset execution/load.

Risks:
- Longer enrichment jobs may increase chance of worker/request timeout on large
  batches.
- Future Leaflet CDN/version/hash changes can break map rendering until hashes
  are updated.

Follow-ups:
- Move throttling/retry into shared geocode rate-limit handling with adaptive
  backoff on 429 responses.
- Consider self-hosting pinned Leaflet assets to reduce CDN hash drift.
Normalize `city_verified` with `normalizeCityName` and treat it as the
primary fallback when AI verification does not meet `minConfidence`.
This keeps previously verified data from being replaced by weaker
fallbacks (`record.city`) and preserves existing verification metadata.

Change the early skip condition to require both a verified city and
coordinates before returning `skipped_already_enriched` (unless `force`
is set). Records without coordinates now continue through enrichment even
when geocoding is disabled, so city verification/metadata can still be
refreshed.
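The revised skip gate can be summarized as a predicate. This is a sketch of the described condition only: `shouldSkipEnrichment` is a hypothetical name, the field names follow the PR text, and coordinates are assumed to be stored as numbers.

```javascript
// Hypothetical predicate mirroring the described gate; field names follow the PR text.
function shouldSkipEnrichment(record, { force = false } = {}) {
  if (force) return false; // force always re-runs enrichment
  const hasCity = typeof record.city_verified === "string" && record.city_verified.trim() !== "";
  // Assumes numeric coordinates; null or missing values count as absent.
  const hasCoords = Number.isFinite(record.location_lat) && Number.isFinite(record.location_lon);
  // Skip (skipped_already_enriched) only when the record is fully enriched.
  return hasCity && hasCoords;
}
```

Note that the `geocode` flag does not appear in the predicate: a record with a verified city but no coordinates proceeds whether or not geocoding is enabled.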

Also carry forward prior verification notes/source when AI reasoning is
empty, and compute fallback confidence from the max of AI confidence,
existing stored confidence, and a 0.55 floor.
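Taken together, the city precedence and fallback confidence rules might be sketched as follows. `resolveCity` and the `city_confidence` field are hypothetical, and the `normalizeCityName` shown here is only a stand-in, since the PR does not show the real normalization rules.

```javascript
const FALLBACK_CONFIDENCE_FLOOR = 0.55;

// Stand-in for the PR's normalizeCityName; the real rules are not shown in the PR.
function normalizeCityName(value) {
  return typeof value === "string" ? value.trim().replace(/\s+/g, " ") : "";
}

// ai: { city, confidence } from the AI step; record.city_confidence is an assumed field.
function resolveCity(record, ai, minConfidence) {
  const aiCity = normalizeCityName(ai.city);
  if (aiCity && ai.confidence >= minConfidence) {
    return { city: aiCity, confidence: ai.confidence, source: "ai" };
  }
  // Low-confidence AI result: prefer the previously verified city over record.city.
  const city = normalizeCityName(record.city_verified) || normalizeCityName(record.city);
  const confidence = Math.max(
    ai.confidence ?? 0,
    record.city_confidence ?? 0,
    FALLBACK_CONFIDENCE_FLOOR
  );
  return { city, confidence, source: "fallback" };
}
```

The floor means a fallback result never reports less than 0.55, even when neither the AI nor the stored value supports that level, which is the overstatement risk listed below.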

Behavior changes:
- Partially enriched records (verified city but missing coords) are no longer skipped.
- Existing verified cities are retained on low-confidence AI results.
- Verification notes/source are less likely to be blanked out.

Risks:
- Stale verified cities may persist longer if AI confidence stays below threshold.
- The 0.55 confidence floor can overstate certainty for legacy verified data.

Follow-ups:
- Add regression tests for skip gating (`force`, `geocode`, coords presence).
- Add tests for city selection precedence and note/source preservation paths.
@darron darron merged commit 9084770 into main Mar 3, 2026
1 check passed
@darron darron deleted the closer-map branch May 5, 2026 00:36