4D Open Source Intelligence (OSINT) Replay Engine — time-resolved, multi-modal intelligence over a unified graph + vector world model.
GodEye ingests time-stamped multi-modal sensor observations (ADS-B, AIS, GPS jamming, network events, NOTAM-style documents) into SurrealDB, fuses them into higher-level events, links them to entities via a knowledge graph, and exposes a 4D replay agent that answers "what happened in this window?" with a structured, evidence-grounded narrative.
Unlike shallow RAG systems that retrieve documents and hope for the best, GodEye grounds its reasoning in a persistent, multi-model world model: a graph of entities, observations, and events, enriched with three distinct retrieval paths — query-driven RAG, entity-augmented Graph-RAG, and a baseline-only path for comparison. The agent explicitly contrasts what it can conclude with the event graph versus without it.
Built for AI engineers, data engineers, and OSINT-curious developers who want a concrete, production-shaped pattern for agent workflows over graph + vector data.
This project would not be feasible in its current form without both LangChain/LangGraph and SurrealDB. LangChain/LangGraph provides the orchestration layer for deterministic multi-step agent execution (tooling, parallel phases, retries/fallbacks, and persistent checkpoint flow), while SurrealDB provides the unified persistent substrate where graph relationships, time-windowed events, vector/BM25 retrieval data, and replay state all coexist and evolve together.
- Features
- Tech Stack
- Architecture Overview
- Evening Update (March 8, 2026)
- Installation
- Usage
- Configuration
- Screenshots / Demo
- API Reference
- Tests
- Roadmap
- Bonus Incentive: Open-Source Contribution
- Contributing
- License
- Contact / Support
- Multi-model world model — SurrealDB stores `entity`, `observation`, `event`, and `doc_chunk` tables with graph edges (`observed_in`, `involves`, `evidence`) linking them into a traversable knowledge graph.
- Time-windowed event fusion — Groups observations by feed type and 10-minute buckets into typed events with Noisy-OR calibrated confidence (`1 - 0.8ⁿ`) and severity derived from evidence thresholds.
- Cross-feed correlation detection — When ≥2 distinct feed types fire in the same 10-minute window, a multi-axis `correlation` event is created, surfacing compound signals the LLM would otherwise have to infer itself.
- Three-phase retrieval pipeline — Query-RAG (BM25 + vector on the user question), entity-augmented Graph-RAG (re-queries using the names of entities detected in the event graph), and a baseline-only path for structured-vs-baseline comparison.
- Hybrid BM25 + vector retrieval with RRF — `doc_chunk` retrieval merges BM25 full-text and HNSW cosine vector search via Reciprocal Rank Fusion (Cormack et al. 2009, k=30), outperforming either alone on keyword-heavy OSINT queries.
- Before/after window comparison — Each replay automatically fetches the prior equal-duration window and feeds it to the LLM for escalation/de-escalation analysis.
- Parallel LangGraph pipeline — Fusion + vector search run concurrently (Phase 1); current + previous timelines run concurrently (Phase 2); narrative + event-summary LLM calls run concurrently. Cold-start pool pre-warming eliminates connection overhead on first requests.
- Fault-isolated concurrency — All `asyncio.gather` calls use `return_exceptions=True`; a failed BM25 index gracefully degrades to vector-only retrieval, and vice versa.
- Structured vs baseline comparison — The narrative explicitly explains what the agent could not have concluded using only RAG, demonstrating the value of the event graph per request.
- Agent observability — All agent actions are logged to `agent_log` in SurrealDB with timestamps and structured details. Non-fatal: log failures never abort business logic.
- Persistent agent memory — Checkpoints are persisted to SurrealDB (`agent_checkpoint`) via a custom `SurrealDBCheckpointSaver`, so replay sessions can resume across process restarts.
- FastAPI backend + glassmorphism frontend — Dark-mode bento-card UI with markdown-rendered narratives, severity-badged expandable event table, confidence sparkbars, and copy-to-clipboard.
- Structured Memory / Knowledge Usage
  - SurrealDB is the system of record for graph, vector, and replay state (entities, observations, events, doc chunks, agent logs, annotations, checkpoints).
  - Context evolves during execution through `fuse_events` writes, graph-linked annotations (`flag_suspicious_event`), and replay-phase reads (`get_timeline`, `get_event_annotations`).
  - Files: `src/agents/tools.py`, `schema.surql`.
- Agent Workflow Quality
  - LangGraph orchestrates a two-node, multi-phase replay pipeline with parallel tool/LLM calls and explicit state transitions.
  - Tool coordination is handled through LangChain tool wrappers (`@tool`) and LLM invocations via the graph nodes.
  - Files: `src/agents/graph.py`, `src/agents/tools.py`.
- Persistent Agent State
  - Custom `SurrealDBCheckpointSaver` implements LangGraph checkpoint persistence in SurrealDB with retention controls.
  - Replay sessions use a deterministic `thread_id` and expose `/api/checkpoints` for auditability.
  - Files: `src/agents/checkpointer.py`, `api/replay/api.py`.
- Practical Use Case
  - Real-time-style OSINT replay use case: time-window event fusion, cross-feed correlation, prior-window comparison, and structured incident narratives.
  - Includes baseline-vs-graph comparison to show value-added reasoning.
  - Files: `src/agents/graph.py`, `api/replay/api.py`, `readme.md` usage examples.
- Observability
  - Structured action logs (`agent_log`) and execution traces in graph nodes (phase timing + failures), persisted and visible for debugging.
  - Optional LangSmith tracing config is included for full trace-level visibility.
  - Files: `src/agents/tools.py`, `src/agents/graph.py`, `.env.example`, `src/agents/checkpointer.py`.
Current implementation is benchmarked by:
- Full test suite: `pytest -q` (tests + checkpointer + endpoint coverage)
- CI workflow: `.github/workflows/ci.yml` (compile + tests)
- Runtime checks: SurrealDB-connected `/health` and checkpoint inspection via `/api/checkpoints`
| Layer | Technology |
|---|---|
| Language | Python 3.11+ |
| Database | SurrealDB 3.x (graph, vector, time-series in one engine) |
| Agent orchestration | LangGraph |
| LLM | Anthropic Claude (claude-sonnet-4-6) via LangChain |
| Embeddings | HuggingFace sentence-transformers/all-mpnet-base-v2 (768-dim, local) |
| Backend | FastAPI + Uvicorn |
| Frontend | Static HTML / CSS / JavaScript (no build step) |
```mermaid
flowchart LR
    User([User]) --> UI[Web Frontend\nHTML/CSS/JS]
    User --> CLI[CLI\ndemo.py]
    UI -->|POST /api/replay| API[FastAPI\nport 8001]
    CLI --> LG
    API --> LG[LangGraph\nAgent Graph]
    subgraph LG [LangGraph Agent Graph]
        R[reconstruct_node] --> N[narrate_node]
    end
    subgraph R [reconstruct_node — 3 phases]
        P1A[fuse_events\nwrite] -.concurrent.- P1B[vector_search\nQuery-RAG]
        P1A --> P2[get_timeline ×2\ncurrent + prev concurrent]
        P1B --> P2
        P2 --> P3[entity Graph-RAG\nre-query with entity names]
    end
    subgraph N [narrate_node]
        NR[narrative LLM] -.concurrent.- SE[summarise_events LLM]
    end
    LG <-->|graph queries\nvector search\nevent writes| SDB[(SurrealDB\ngod_eye / world)]
    SDB --- T1[entity]
    SDB --- T2[observation]
    SDB --- T3[event]
    SDB --- T4[doc_chunk\nHNSW 768d + BM25]
    SDB --- T5[agent_log]
```
reconstruct_node runs in three phases: (1) event fusion and query-RAG concurrently, (2) current and previous-window timeline reads concurrently, (3) entity-augmented Graph-RAG using entity names extracted from the detected events. narrate_node receives all three retrieval contexts and generates the narrative and event summary via two concurrent LLM calls. SurrealDB serves as the single source of truth for all graph, vector, and time-series data.
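The fault-isolated fan-out pattern described above can be sketched with plain `asyncio`. This is a minimal illustration — concurrent phases via `asyncio.gather(..., return_exceptions=True)`, with a failed retrieval path degrading to an empty result instead of aborting the replay. The coroutine names are illustrative stand-ins, not GodEye's actual tool signatures.

```python
import asyncio


async def fuse_events() -> list[str]:
    # Stand-in for the Phase-1 event-fusion write.
    return ["event:abc123"]


async def vector_search() -> list[str]:
    # Stand-in for the Phase-1 Query-RAG read; simulate a failed path.
    raise RuntimeError("BM25 index unavailable")


async def reconstruct() -> dict:
    # Phase 1: fusion and query-RAG run concurrently; a raised exception
    # is returned as a value rather than propagated, so one failure
    # cannot take down the sibling task.
    events, docs = await asyncio.gather(
        fuse_events(), vector_search(), return_exceptions=True
    )
    return {
        "events": events if not isinstance(events, Exception) else [],
        "docs": docs if not isinstance(docs, Exception) else [],  # degrade gracefully
    }


result = asyncio.run(reconstruct())
```

Here `result["events"]` carries the successful fusion output while `result["docs"]` degrades to an empty list, mirroring the vector-only / BM25-only fallback behaviour described in the Features section.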
- Full architecture write-up: Architecture Overview.md
- Diagram source/reference: diagrams.md
Implemented in the latest evening update:
- Observation Map upgrades: feed filters, mode filters (`All`, `Only Anomalies`, `High-Severity Linked`), timeline scrubber, play/pause demo mode, marker detail panel, interactive observation list.
- Replay analytics upgrades: cross-window diff (`new`, `escalated`, `resolved`), alert rules summary, data-quality checks, latency budget panel, structured-vs-baseline scorecard.
- Workflow usability upgrades: role view toggle (`Analyst` / `Operator`) and replay presets (save/load/delete).
- Provenance upgrades: event-row expansion now includes a provenance summary (`source_tags`, confidence, linked entities).
- Export upgrades: visual pack now includes JSON + PNGs + a markdown report (`godeye-replay-brief.md`).
- Backend replay response upgrades: `runtime_metrics`, `llm_model_used`, `thread_id`, `trace_url`.
- Dataset used: `evaluation/replay_eval_dataset.json` (10 windows/questions).
- Experiment runner: `python evaluation/run_prompt_experiment.py`
- Results:
  - `v1` avg score: 5.0265, avg runtime: 11686.14 ms
  - `v2` avg score: 5.8311, avg runtime: 4786.51 ms
- Winner: `v2`
- Config kept in `.env`:
  - `GODEYE_PROMPT_VERSION=v2`
  - `GODEYE_LLM_MODEL=claude-3-haiku-20240307`
Best 3 traced runs from the 10-case batch (open LangSmith and filter by thread_id):
- LangSmith run reference 1 — `thread_id=0367633a205a0a0711ca836ed01d4e3c`
- LangSmith run reference 2 — `thread_id=28c5c8dd6ea80c1f4f3ab32ea5b281e8`
- LangSmith run reference 3 — `thread_id=ef1be0453f093a8ec8dc85d6dd866c65`
- Python 3.11+
- SurrealDB 3.x binary (`surreal`) — download here
- An Anthropic API key
```bash
git clone https://github.com/MasteraSnackin/GodEye.git
cd GodEye
pip install -r requirements.txt
```

In a separate terminal:

```bash
surreal start --user root --pass root
```

In the SurrealDB shell (`surreal sql --user root --pass root`):

```sql
USE NS god_eye DB world;
SOURCE "schema.surql";
```

```bash
cp .env.example .env
# Edit .env and set ANTHROPIC_API_KEY=sk-ant-...
python load_synthetic_world.py
```

Verify in the SurrealDB shell:

```sql
SELECT * FROM observation LIMIT 5;
SELECT count() FROM doc_chunk GROUP ALL;
```

Note: If you previously loaded data with an older schema (384-dim embeddings), drop the `doc_chunk` table and re-run the loader after applying the updated `schema.surql` to rebuild the 768-dim HNSW index.
```bash
python demo.py
```

Expected output:

```
=== GOD EYE DEMO ===

Narrative (structured vs baseline):
[LLM-generated explanation referencing fused events, Graph-RAG docs, and entity-linked context...]

Event summary:
[Per-cluster one-liners with axis, severity, Noisy-OR confidence...]

Events (structured graph):
- event:abc123 | anomaly     | axis=air   | severity=medium | tags=['adsb', 'auto-fuse']
- event:def456 | jamming     | axis=cyber | severity=high   | tags=['jamming', 'auto-fuse']
- event:ghi789 | correlation | axis=multi | severity=high   | tags=['adsb', 'jamming', 'auto-correlate']

Structured RAG docs (query-RAG):
- NOTAM @ 2026-02-28T02:00:00Z: Airspace restrictions in the Strait...

Graph-RAG docs (entity-augmented):
- advisory @ 2026-02-28T03:00:00Z: SIRIUS STAR route advisory — GPS degradation...
```
Start the API server:
```bash
uvicorn api.replay.api:app --reload --port 8001
```

Open `frontend/index.html` in your browser (or serve `frontend/` via any static file server).
Fill in the form fields:
| Field | Example |
|---|---|
| From (UTC) | 2026-02-28T02:00:00Z |
| To (UTC) | 2026-02-28T04:00:00Z |
| Region | Hormuz |
| Scenario | EPIC_FURY_DEMO |
| Question | What anomalies occurred near Hormuz? |
Click Run Replay (or press Ctrl+Enter). The UI renders:
- A markdown-formatted narrative with AI Analysis badge and copy-to-clipboard
- A concise per-cluster event summary
- An expandable events table (ID, type, axis, severity, confidence sparkbar, start time, tags)
- A severity distribution bar across the event set
| Variable | Required | Default | Description |
|---|---|---|---|
| `ANTHROPIC_API_KEY` | Yes | — | Anthropic API key for Claude LLM calls |
| `SURREAL_URL` | No | `ws://127.0.0.1:8000/rpc` | SurrealDB WebSocket RPC endpoint |
| `SURREAL_USER` | No | `root` | SurrealDB username |
| `SURREAL_PASSWORD` | No | `root` | SurrealDB password |
| `CORS_ORIGINS` | No | `http://localhost:8001,http://127.0.0.1:8001` | Comma-separated allowed origins for browser clients |
| `GODEYE_API_KEYS` | No | (unset) | Comma-separated API keys for optional API authentication (`X-API-Key`). Leave unset to disable auth in local mode |
| `GODEYE_LLM_MODEL` | No | `claude-3-5-sonnet-latest` | Primary LLM model id. If unavailable, the fallback list is attempted |
| `GODEYE_LLM_MODELS` | No | `claude-3-5-sonnet-latest,claude-3-5-sonnet-20240620,claude-3-haiku-20240307` | Optional ordered fallback chain for narrative + summary generation |
| `GODEYE_CHECKPOINT_LIMIT` | No | (unset) | Checkpoint retention per `thread_id` and namespace: unset = unlimited, 0 = keep none, a positive integer keeps the most recent N checkpoints |
- Namespace: `god_eye`
- Database: `world`
- Defined in `schema.surql` — re-apply after any schema change, then re-run `load_synthetic_world.py`
Events are tagged with `scenario = 'EPIC_FURY_DEMO'` by default. The API and frontend accept a `scenario` field — add scenarios by loading fixture data tagged with a different scenario name.

The HNSW index dimension (768) and model name (`sentence-transformers/all-mpnet-base-v2`) must stay in sync across `schema.surql`, `src/agents/tools.py`, and `load_synthetic_world.py`. Update all three and re-run the loader if switching models.
Screenshots below are from the current dashboard flow and replay outputs.
Initial dashboard state
Replay result with narrative + analytics
Deployment view (post-run)
Additional interaction captures
Demo endpoint URLs (local):

- API health: `http://127.0.0.1:8001/health`
- Replay API: `http://127.0.0.1:8001/api/replay`
- Frontend: `http://127.0.0.1:8081/index.html?api=http://127.0.0.1:8001`
POST /api/replay

Run the full LangGraph replay pipeline for a time window.

Request body:

```json
{
  "mode": "replay",
  "from_time": "2026-02-28T02:00:00Z",
  "to_time": "2026-02-28T04:00:00Z",
  "region": "Hormuz",
  "scenario": "EPIC_FURY_DEMO",
  "query": "What anomalies occurred in this window?"
}
```

Response:
```json
{
  "narrative": "In this window, a GPS jamming burst co-occurring with anomalous AIS tracks...",
  "events": [
    {
      "id": "event:abc123",
      "type": "correlation",
      "axis": "multi",
      "severity": "high",
      "confidence": 0.67,
      "start_time": "2026-02-28T02:10:00Z",
      "source_tags": ["adsb", "jamming", "auto-correlate"],
      "entities": [{"id": "entity:ship1", "name": "SIRIUS STAR", "type": "ship"}]
    }
  ],
  "event_summary": "**Correlation (multi/high, conf=0.67):** ADS-B + jamming co-occurrence at 02:10 UTC...",
  "narrative_status": "ok",
  "summary_status": "ok",
  "runtime_metrics": {
    "phase1_ms": 421.4,
    "phase2_ms": 41.9,
    "phase3_ms": 138.1,
    "total_reconstruct_ms": 602.0,
    "narrate_ms": 1128.2,
    "total_ms": 1730.2
  },
  "llm_model_used": "claude-3-5-sonnet-20240620",
  "thread_id": "f2a9a3...",
  "trace_url": "https://smith.langchain.com/o/.../r/..."
}
```

Query fused events directly without running the LLM pipeline.
GET /api/events?scenario=EPIC_FURY_DEMO&from_time=2026-02-28T02:00:00Z&to_time=2026-02-28T04:00:00Z
Query raw sensor observations by time window and/or feed type.
GET /api/observations?from_time=2026-02-28T02:00:00Z&to_time=2026-02-28T04:00:00Z&feed_type=jamming&limit=100
| Parameter | Required | Description |
|---|---|---|
| `from_time` | No | ISO 8601 start (inclusive) |
| `to_time` | No | ISO 8601 end (exclusive) |
| `feed_type` | No | Filter: `adsb` \| `ais` \| `jamming` \| `net` \| `sat_pass` |
| `limit` | No | Max results (default 100, max 1000) |
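As a hedged illustration, a query URL for this endpoint can be assembled from the parameter table above. The helper below is hypothetical (not part of the GodEye codebase) and assumes the local default base URL from the Usage section; unset parameters are simply dropped, matching the "all parameters optional" contract.

```python
from urllib.parse import urlencode


def observations_url(base: str = "http://127.0.0.1:8001", **params) -> str:
    """Build a GET /api/observations URL, dropping parameters left unset.

    Keyword names mirror the parameter table: from_time, to_time,
    feed_type, limit.
    """
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base}/api/observations" + (f"?{query}" if query else "")


url = observations_url(
    from_time="2026-02-28T02:00:00Z",
    to_time="2026-02-28T04:00:00Z",
    feed_type="jamming",
    limit=100,
)
print(url)
```

The resulting URL can be fetched with any HTTP client (e.g. `urllib.request.urlopen(url)`) once the FastAPI server from the Usage section is running.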
Query all entities in the knowledge graph.
GET /api/entities?entity_type=ship
List all distinct scenario names present in the database.
GET /api/scenarios
High-severity jamming events where at least one involved entity is a ship. Walks the graph via involves→entity edges.
GET /api/jamming/tankers?from_time=2026-02-28T02:00:00Z&to_time=2026-02-28T04:00:00Z&scenario=EPIC_FURY_DEMO
Inspect persisted LangGraph checkpoint history for a replay thread.
GET /api/checkpoints?thread_id=<thread_id>&checkpoint_ns=replay&limit=20
Response:

```json
{
  "thread_id": "f2a9a3...",
  "checkpoint_ns": "replay",
  "checkpoints": [
    {
      "checkpoint_id": "cp-1a2b",
      "parent_checkpoint_id": null,
      "metadata": {"source": "replay"}
    }
  ]
}
```

GET /health

Returns:

- `{"status":"ok","db":"connected","llm":"configured"}` if SurrealDB and `ANTHROPIC_API_KEY` are available.
- `{"status":"degraded","db":"connected","llm":"missing_api_key"}` if SurrealDB is reachable but the LLM key is not set.
- HTTP 503 if SurrealDB is unreachable.
Automated tests are included:
- `tests/test_tools_scoring.py`
- `tests/test_graph_windowing.py`
- `tests/test_api_validation.py`
- `tests/test_tools_annotations.py`
- `tests/test_api_endpoints.py`
- `tests/test_checkpointer.py`

To run the suite:

```bash
pytest -q
```

Manual verification is still useful:
```sql
-- In the SurrealDB shell
SELECT * FROM event LIMIT 10;
SELECT * FROM agent_log ORDER BY time DESC LIMIT 10;
SELECT * FROM agent_checkpoint ORDER BY created_at DESC LIMIT 10;
SELECT count() FROM doc_chunk GROUP ALL;
```

End-to-end pipeline check:

```bash
python demo.py
```

The repository includes a GitHub Actions workflow at `.github/workflows/ci.yml` that runs on every push and pull request:

- `py_compile` checks for core modules and tests
- `pytest -q` executes the full test suite
Planned: pytest unit tests for the fusion logic and confidence scoring, plus a LangGraph integration test against an in-memory SurrealDB instance. Contributions welcome.
- LangSmith — a configuration/deployment issue with LangSmith was reported fixed as of Sunday morning; the integration will be retried.
- DBSCAN spatiotemporal clustering — replace hard 10-minute fixed windows with density-based grouping over `(lat, lon, time)`, eliminating split-at-boundary artefacts.
- Real data sources — live ADS-B via dump1090, AIS via AISHub, configurable OSINT feed ingestion pipeline.
- Streaming fusion — SurrealDB `LIVE SELECT` for real-time event creation as observations land, rather than batch replay.
- Isolation Forest anomaly scoring — per-observation anomaly scores against a rolling baseline rate, replacing binary time-bucket presence.
- 3D globe / map visualisation — globe.gl or MapLibre rendered over the events JSON from `/api/events`.
- Agent metrics dashboard — per-request latency breakdown, event counts, confidence distributions, Graph-RAG uplift measurement.
- Local LLM option — offline or cost-sensitive deployments via Ollama or llama.cpp.
Reference package:
- `github.com/MasteraSnackin/langchain-surrealdb` — standalone package
- Public repository with commits/issues/PR activity tied to SurrealDB + LangChain integration.
- Explicit usage or design influence in GodEye (integration notes, architecture link, or dependency path).
- Notes on how it improved reliability/usability for agent workflows.
`src/agents/checkpointer.py` implements `SurrealDBCheckpointSaver`, a custom LangGraph `BaseCheckpointSaver` that writes state checkpoints into SurrealDB. This enables:

- Stateful multi-step replay continuation by `thread_id` across process restarts.
- Structured, queryable checkpoint history for auditing and replay failure analysis.
- Checkpoint pruning and per-thread deletion helpers for operational cleanup.
Contributions are welcome.
- Open a GitHub Issue for bugs, feature requests, or questions.
- For pull requests: fork the repo, create a focused feature branch, keep changes scoped, and include a brief description of what changed and why.
- Follow existing code style: async Python, typed where practical, errors logged not swallowed, `return_exceptions=True` on all `asyncio.gather` calls.
This project is licensed under the MIT License. See the LICENSE file for details.
- GitHub: MasteraSnackin
- Repository: github.com/MasteraSnackin/GodEye
- Maintainer: <ADD_MAINTAINER_NAME_HERE>
- Email: <ADD_CONTACT_EMAIL_HERE>
- Website: <ADD_WEBSITE_OR_BLOG_URL_HERE>
For bugs and feature requests, please open a GitHub Issue.




