feat: SAR sensor fusion — unmatched_sar_detections_30d (#84) #126

Merged
yohei1126 merged 9 commits into main from feat/84-sar-sensor-fusion on Apr 8, 2026
Conversation

@yohei1126
Contributor

Summary

  • Adds a new sar_detections table to store Sentinel-1 radar vessel detections (lat/lon/time, no MMSI)
  • Implements unmatched_sar_detections_30d feature: counts SAR detections per vessel that occurred during an AIS gap and were spatially close to the vessel's last known position — a direct dark-vessel signal
  • Wires the feature into the full scoring pipeline (feature matrix → Isolation Forest anomaly scoring)

Algorithm:

  1. Match each SAR detection to AIS broadcasts within 5 km / 60 min → labeled matched
  2. Unmatched detections are checked against per-vessel AIS gaps (> 6 h); a detection that falls inside a gap and lies within 50 km (haversine distance) of the vessel's last known position is attributed to the most likely vessel
  3. Count attributed unmatched detections per MMSI → unmatched_sar_detections_30d
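
The three steps above can be sketched in plain Python. This is a minimal illustration, not the code in src/features/sar_detections.py: the `Ping` and `Detection` types, the function names, and the rule that "most likely vessel" means the nearest last known position are all assumptions made for the example. It also treats a gap as the silence between two consecutive pings, so an open-ended gap with no later ping is not covered.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

MATCH_KM = 5.0                         # step 1: spatial tolerance
MATCH_WINDOW = timedelta(minutes=60)   # step 1: temporal tolerance
MIN_GAP = timedelta(hours=6)           # step 2: silence longer than this is a gap
ATTRIBUTION_KM = 50.0                  # step 2: attribution radius


@dataclass(frozen=True)
class Ping:            # hypothetical AIS broadcast record
    mmsi: int
    ts: datetime
    lat: float
    lon: float


@dataclass(frozen=True)
class Detection:       # hypothetical SAR detection record (no MMSI)
    ts: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def is_matched(det, pings):
    """Step 1: any AIS broadcast within 5 km and 60 min marks the detection matched."""
    return any(
        abs(det.ts - p.ts) <= MATCH_WINDOW
        and haversine_km(det.lat, det.lon, p.lat, p.lon) <= MATCH_KM
        for p in pings
    )


def ais_gaps(pings):
    """Yield (last ping before the gap, timestamp ending the gap) per vessel."""
    tracks = {}
    for p in sorted(pings, key=lambda p: p.ts):
        tracks.setdefault(p.mmsi, []).append(p)
    for track in tracks.values():
        for prev, nxt in zip(track, track[1:]):
            if nxt.ts - prev.ts > MIN_GAP:
                yield prev, nxt.ts


def count_unmatched(detections, pings):
    """Steps 2 and 3: attribute unmatched detections to gaps, then count per MMSI."""
    gaps = list(ais_gaps(pings))
    counts = Counter()
    for det in detections:
        if is_matched(det, pings):
            continue
        in_range = [
            (dist, last.mmsi)
            for last, gap_end in gaps
            if last.ts < det.ts < gap_end
            and (dist := haversine_km(det.lat, det.lon, last.lat, last.lon)) <= ATTRIBUTION_KM
        ]
        if in_range:
            # Assumption: the nearest last known position wins the attribution.
            counts[min(in_range)[1]] += 1
    return dict(counts)
```

Calling `count_unmatched(detections, pings)` returns a mapping from MMSI to the number of attributed detections; restricting `detections` to the last 30 days beforehand would give the windowed feature.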

All spatial joins are done in DuckDB (in-memory) with a bounding-box pre-filter + haversine verification.
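
As a rough illustration of the bounding-box pre-filter followed by haversine verification, here is a pure-Python sketch. The PR runs this as a DuckDB join; the function names below are hypothetical, and the box computation assumes positions away from the poles and the antimeridian.

```python
from math import asin, cos, degrees, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a spherical Earth."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) enclosing the search circle.

    Assumes the circle does not reach a pole or cross the antimeridian.
    """
    d = radius_km / EARTH_RADIUS_KM                      # angular radius in radians
    dlat = degrees(d)
    dlon = degrees(asin(sin(d) / cos(radians(lat))))     # widest longitude extent of the circle
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def in_box(box, lat, lon):
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


def within_radius(center, points, radius_km):
    """Cheap box comparison first, exact haversine check only on the survivors."""
    box = bounding_box(center[0], center[1], radius_km)
    candidates = [p for p in points if in_box(box, p[0], p[1])]
    return [p for p in candidates
            if haversine_km(center[0], center[1], p[0], p[1]) <= radius_km]
```

The box is a superset of the circle, so the pre-filter never discards a true neighbour; points near the box corners pass the cheap test and are then rejected by the haversine check.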

Files changed

| File | Change |
| --- | --- |
| src/ingest/sar.py | New — CSV/record ingestion into sar_detections |
| src/ingest/schema.py | Add sar_detections table + vessel_features column |
| src/features/sar_detections.py | New — compute_unmatched_sar_detections() |
| src/features/build_matrix.py | Wire SAR feature into feature matrix pipeline |
| src/score/anomaly.py | Add feature to ANOMALY_FEATURE_COLUMNS |
| tests/test_sar_detections.py | New — 6 unit tests |
| tests/test_schema.py | Update expected table set |
| tests/test_enhancements_cap_vista.py | Sync local ANOMALY_FEATURE_COLUMNS |

Test plan

  • test_empty_sar_table_returns_empty — no SAR data → empty result
  • test_matched_sar_detection_not_counted — AIS-matched detections excluded
  • test_unmatched_sar_during_gap_counts — gap + nearby SAR → counted
  • test_unmatched_sar_far_from_vessel_not_counted — 500 km away → not attributed
  • test_multiple_unmatched_detections_counted — 3 detections → count=3
  • test_sar_outside_window_not_counted — 45-day old detection → excluded
  • All 241 existing tests still pass

Closes #84

🤖 Generated with Claude Code

yohei1126 and others added 9 commits April 8, 2026 11:51
Add a Sentinel-1 SAR vessel detection subsystem that counts how many
SAR radar detections per vessel occurred during AIS gaps and were
spatially close to the vessel's last known position (dark vessel signal).

New files:
- src/ingest/sar.py          – ingest SAR detections from CSV or records
- src/features/sar_detections.py – compute unmatched_sar_detections_30d
- tests/test_sar_detections.py   – 6 unit tests covering match/no-match/
                                   attribution/window/multi-detection cases

Modified files:
- src/ingest/schema.py       – add sar_detections table + feature column
- src/features/build_matrix.py – wire SAR feature into matrix pipeline
- src/score/anomaly.py       – include feature in Isolation Forest training
- tests/test_schema.py       – update expected table set
- tests/test_enhancements_cap_vista.py – sync local ANOMALY_FEATURE_COLUMNS

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Seeds a fresh DuckDB with a vessel AIS gap + 3 unmatched SAR detections,
runs compute_unmatched_sar_detections, and asserts count=3. Useful for
verifying issue #84 locally without needing real SAR data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add to FEATURE_VALUE_COLUMNS so raw value is stored in watchlist
- Add valueLabel formatter ("3 detections") in index.html SHAP table
- Extend ops shell option 11 to optionally run full pipeline + print
  dashboard verification steps

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… Forest

Isolation Forest needs ≥4 samples to train. The smoke test previously
seeded only 1 vessel, causing composite scoring to fail. Now seeds 5
normal vessels (frequent pings, no gaps) as background population.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
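
A background population like the one described in this commit could be seeded roughly as follows. This is only a sketch: the MMSI values, the ping interval, the coordinates, and the function names are made up for illustration and are not taken from the smoke test script.

```python
from datetime import datetime, timedelta

MIN_TRAINING_SAMPLES = 4             # threshold quoted in the commit message
GAP_THRESHOLD = timedelta(hours=6)   # AIS gap threshold from the PR summary


def seed_background_pings(n_vessels=5, hours=24, interval_min=10,
                          start=datetime(2026, 4, 1)):
    """Return (mmsi, ts, lat, lon) rows for vessels that ping frequently and never go dark."""
    rows = []
    steps = hours * 60 // interval_min
    for v in range(n_vessels):
        mmsi = 999000001 + v                     # hypothetical placeholder MMSIs
        for i in range(steps + 1):
            ts = start + timedelta(minutes=i * interval_min)
            rows.append((mmsi, ts, 1.0 + 0.1 * v, 103.0 + 0.001 * i))
    return rows


def max_gap(rows, mmsi):
    """Longest silence between consecutive pings of one vessel."""
    ts = sorted(r[1] for r in rows if r[0] == mmsi)
    return max((b - a for a, b in zip(ts, ts[1:])), default=timedelta(0))
```

With the defaults, every seeded vessel has a maximum gap of 10 minutes, well under the 6 h threshold, so none of them should pick up SAR attributions, and five vessels clears the minimum sample count.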
…t it

- Change default DB path from /tmp/ to data/processed/sar_smoke_test.duckdb
- Remove COPY data/ from Dockerfile (data/ is a runtime volume mount)
- Add .dockerignore to exclude data/, .git/, .venv/ from build context

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o overrides

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ections_30d

- Add ALTER TABLE ... ADD COLUMN IF NOT EXISTS in init_schema so existing
  DBs get the column without needing a full rebuild
- Delete existing DB file before seeding in smoke test (it's explicitly fresh)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
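
The idempotent migration idea from this commit can be tried without DuckDB. The sketch below uses the standard-library `sqlite3` module as a stand-in; SQLite has no `ADD COLUMN IF NOT EXISTS` clause, so the helper inspects `PRAGMA table_info` instead. The helper name, the minimal `vessel_features` definition, and the `INTEGER DEFAULT 0` column type are assumptions, not the project's schema.

```python
import sqlite3


def add_column_if_missing(conn, table, column, decl):
    """Add a column only when it is absent; return True if the table was altered."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {decl}")
    return True


def init_schema(conn):
    """Safe to call on both fresh and pre-existing databases."""
    conn.execute("CREATE TABLE IF NOT EXISTS vessel_features (mmsi INTEGER PRIMARY KEY)")
    # Assumed column type; the real schema may differ.
    add_column_if_missing(conn, "vessel_features",
                          "unmatched_sar_detections_30d", "INTEGER DEFAULT 0")


def column_names(conn, table):
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Running `init_schema` twice on the same connection leaves the table unchanged after the first call, which is the property that lets existing databases pick up the column without a full rebuild.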
@yohei1126 yohei1126 self-assigned this Apr 8, 2026
@yohei1126 yohei1126 merged commit 1ce70a4 into main Apr 8, 2026
5 checks passed
@yohei1126 yohei1126 deleted the feat/84-sar-sensor-fusion branch April 8, 2026 04:18
yohei1126 added a commit that referenced this pull request Apr 9, 2026
Adds `device_scale_factor=2` to the Playwright browser context so all
screenshots — including `05_sar_shap.png` — render at 2× pixel density on
retina displays.

The screenshot now shows `unmatched_sar_detections_30d` as the primary SHAP
signal (SHAP +0.24, rank #1) for SARI NOUR (MMSI 613115678), satisfying the
Annex A PoC Week 7 deliverable for the Cap Vista submission.

Supporting code shipped in prior PRs:
- `capture_sar_shap()` and `seed_demo_sar.py` — #135
- `unmatched_sar_detections_30d` SAR feature — #126
- `annex-a-submission.md` SAR section updated locally (gitignored _outputs/)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

[FEAT] Multi-modal Sensor Fusion (AIS + SAR Imaging) for Shadow Fleet Detection
