A modular, investigation-focused OSINT platform for collecting, structuring, and analyzing open-source intelligence using graph-based link analysis.
This repository is designed for:
- local-first development on limited hardware
- clean separation between research, processing, and graph analysis
- scalable architecture without requiring a rewrite
This system is not a search tool. It is a data pipeline + investigation workspace.
Separation of concerns:
- Research → collect and extract data
- Processing → normalize and structure data
- Graph → store entities and relationships
- Analysis → query graph + run algorithms
- LLM → assist, summarize, explain (not source of truth)
Frontend (React)
↓
API (FastAPI)
↓
Queue (Redis)
↓
Worker (RQ)
↓
Processing + Extraction
↓
Storage:
- App DB (SQLite/Postgres)
- Graph DB (Neo4j)
↓
Graph Query + Analysis
↓
LLM (Ollama / API)
Backend:
- FastAPI
- RQ (background jobs)
- Redis (queue + cache)
- SQLAlchemy
- Neo4j Python Driver

Frontend:
- React + TypeScript
- Vite
- Zustand (state)
- React Query
- React Flow (graph UI)

Storage:
- SQLite (local) → PostgreSQL (later)
- Neo4j (AuraDB Free or local)
- Filesystem (optional raw artifacts)

LLM:
- Ollama (local)
- Optional paid API (for higher-quality reasoning)
osint-platform/
  docs/
  frontend/
  backend/
    app/
      api/
      connectors/
      extraction/
      graph/
      services/
      models/
      schemas/
      worker/
connectors/ handles external data sources. Each connector:
- builds queries
- fetches data
- normalizes output
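A minimal sketch of that contract, assuming a `Connector` protocol and a `SearchResult` shape that are illustrative rather than the repository's actual classes:

```python
# Illustrative connector contract -- Connector and SearchResult are
# assumptions, not the repository's actual classes.
from dataclasses import dataclass
from typing import Protocol


@dataclass
class SearchResult:
    source: str        # e.g. "web", "whois"
    url: str
    title: str
    raw_text: str


class Connector(Protocol):
    name: str

    def build_query(self, terms: list[str]) -> str:
        """Turn investigation terms into a source-specific query."""
        ...

    def fetch(self, query: str) -> list[SearchResult]:
        """Call the external source and return raw hits."""
        ...

    def normalize(self, results: list[SearchResult]) -> list[SearchResult]:
        """Deduplicate and clean results before extraction."""
        ...
```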
extraction/ converts raw data into structured intelligence:
- entities
- relationships
- evidence
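As a sketch, that output could be typed with Pydantic (already pulled in by FastAPI); the model names and fields below are assumptions:

```python
# Illustrative extraction output -- model names and fields are assumptions.
from pydantic import BaseModel


class EntityCandidate(BaseModel):
    label: str           # e.g. "Person", "Domain"
    value: str           # e.g. "example.com"
    confidence: float    # extractor's own score, 0.0-1.0


class RelationshipCandidate(BaseModel):
    source: str
    target: str
    rel_type: str        # e.g. "USES_EMAIL"
    evidence_id: int     # every relationship points at evidence


class ExtractionResult(BaseModel):
    entities: list[EntityCandidate]
    relationships: list[RelationshipCandidate]
```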
graph/ handles Neo4j:
- Cypher templates
- ingestion logic
- graph queries
- algorithms
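A minimal ingestion sketch with the official Neo4j Python driver; the connection details and node shape are placeholders:

```python
# Illustrative ingestion helper using the official neo4j driver.
# MERGE keeps ingestion idempotent: re-running a job never duplicates nodes.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))


def ingest_person(tx, name: str):
    tx.run("MERGE (p:Person {name: $name})", name=name)


with driver.session() as session:
    session.execute_write(ingest_person, "Jane Doe")
```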
worker/ runs background jobs:
- research
- extraction
- graph ingestion
- enrichment
api/ stays a thin layer only:
- creates jobs
- returns results
- never runs heavy logic
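A sketch of what such a route could look like with FastAPI and RQ; the route path, queue name, and `run_research` import are assumptions:

```python
# Illustrative thin API route: it only enqueues work and returns a job id.
from fastapi import FastAPI
from redis import Redis
from rq import Queue

from app.worker.jobs import run_research  # hypothetical worker function

app = FastAPI()
queue = Queue("research", connection=Redis())


@app.post("/investigations/{investigation_id}/research")
def start_research(investigation_id: int, query: str):
    # Heavy work happens in the worker; the request returns immediately.
    job = queue.enqueue(run_research, investigation_id, query)
    return {"job_id": job.id, "status": "queued"}
```

Because the route only enqueues, the request returns immediately and the frontend can poll for the job result.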
App DB models:
- Investigation
- Job
- SourceDocument
- Evidence
- Annotation
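For illustration, one of these models in SQLAlchemy 2.0 style; the column set is an assumption, not the actual schema:

```python
# Illustrative SQLAlchemy 2.0-style model -- the columns are assumptions.
from datetime import datetime

from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class Investigation(Base):
    __tablename__ = "investigations"

    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str]
    description: Mapped[str | None]
    archived: Mapped[bool] = mapped_column(default=False)
    created_at: Mapped[datetime] = mapped_column(default=datetime.utcnow)
```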
Graph DB nodes:
- Person
- Organization
- Email
- Phone
- Address
- Domain
- Account
- Event
- Evidence
Graph DB relationships:
- ASSOCIATED_WITH
- USES_EMAIL
- USES_PHONE
- LOCATED_AT
- OWNS_DOMAIN
- HAS_ACCOUNT
- MENTIONS
- SUPPORTS
Rule:
All relationships must be backed by evidence.
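One way to enforce that rule in a Cypher template: write the edge and its evidence link in the same statement, so no unbacked edge can exist. Labels and properties here are illustrative:

```python
# Illustrative Cypher template: the USES_EMAIL edge carries the id of the
# Evidence node that backs it, and the Evidence node is linked via SUPPORTS.
EVIDENCE_BACKED_EDGE = """
MATCH (ev:Evidence {id: $evidence_id})
MERGE (p:Person {name: $person})
MERGE (m:Email {address: $email})
MERGE (p)-[r:USES_EMAIL]->(m)
SET r.evidence_id = $evidence_id
MERGE (ev)-[:SUPPORTS]->(m)
"""
```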
Research jobs:
- search sources
- fetch documents
- extract entities + relationships
- store evidence
Graph analysis:
- query Neo4j
- run graph algorithms
- explore connections
- explain relationships
These are intentionally separate systems.
User request
→ API creates job
→ Worker processes job
→ Results stored
→ Frontend retrieves results
Benefits:
- no blocking requests
- retry support
- scalable execution
- consistent processing
1. User runs search
2. Worker collects data
3. Extract entities + evidence
4. Store candidates
5. Ingest into Neo4j
6. Run graph queries
7. Summarize results
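Sketched as a single worker job; every imported helper below is hypothetical, standing in for the real connectors/extraction/graph modules:

```python
# Illustrative end-to-end job; all imports below are hypothetical names
# standing in for the real connectors/extraction/graph/services modules.
from app.connectors import collect                       # steps 1-2
from app.extraction import extract                       # step 3
from app.graph import ingest, query_graph                # steps 5-6
from app.services import store_candidates, summarize     # steps 4 and 7


def run_research(investigation_id: int, query: str) -> dict:
    documents = collect(query)                    # search + fetch
    result = extract(documents)                   # entities + evidence
    store_candidates(investigation_id, result)    # app DB
    ingest(result)                                # Neo4j
    findings = query_graph(investigation_id)      # deterministic queries
    return {"summary": summarize(findings)}       # LLM-assisted summary
```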
Start with deterministic queries:
- find node by name
- 1-hop / 2-hop neighbors
- shortest path
- evidence lookup
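As Cypher strings for the graph layer, these starting queries might look like the following; node labels and properties are illustrative:

```python
# Illustrative Cypher for the deterministic starting queries.
FIND_BY_NAME = "MATCH (p:Person {name: $name}) RETURN p"

TWO_HOP_NEIGHBORS = """
MATCH (p:Person {name: $name})-[*1..2]-(n)
RETURN DISTINCT n
"""

SHORTEST_PATH = """
MATCH (a:Person {name: $a}), (b:Person {name: $b}),
      path = shortestPath((a)-[*..6]-(b))
RETURN path
"""
```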
Then add graph algorithms:
- centrality
- similarity
- clustering
- link prediction
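If the Graph Data Science plugin is installed, centrality can start as small as a projection plus streamed PageRank; the graph name and labels below are assumptions:

```python
# Illustrative GDS calls (require the Graph Data Science plugin).
PROJECT = "CALL gds.graph.project('people', 'Person', 'ASSOCIATED_WITH')"

PAGERANK = """
CALL gds.pageRank.stream('people')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC LIMIT 10
"""
```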
LLMs are used for:
- extraction (structured output)
- summarization
- explanation
LLMs are NOT used for:
- defining truth
- modifying graph directly
- replacing structured data
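A sketch of extraction as structured output against Ollama's local REST API; the model name and prompt are assumptions:

```python
# Illustrative call to Ollama's local REST API asking for JSON-only output.
# The model name and prompt are assumptions.
import json

import requests


def extract_entities(text: str) -> dict:
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3.1",
            "prompt": f"Extract entities as JSON with keys "
                      f"'people' and 'domains':\n\n{text}",
            "format": "json",   # ask Ollama to constrain output to JSON
            "stream": False,
        },
        timeout=120,
    )
    response.raise_for_status()
    # Ollama returns the model's text under "response"; parse it ourselves.
    return json.loads(response.json()["response"])
```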
Requirements:
- Python 3.12
- Node.js
- Redis
- Ollama
- Neo4j (AuraDB Free recommended)
The backend is the first working slice of the app right now. Start there before trying to run Redis workers, Neo4j ingestion, or the frontend.
- Create or use the project virtual environment

cd backend
python3 -m venv ../venv
source ../venv/bin/activate

If you already created /Users/hadeelmusallam/Mosaic/venv, reuse it:

source /Users/hadeelmusallam/Mosaic/venv/bin/activate
cd /Users/hadeelmusallam/Mosaic/backend

- Install backend dependencies

python3 -m pip install --upgrade pip
python3 -m pip install -e .

- Run the FastAPI server

uvicorn app.main:app --reload

- Verify the backend is running

Open these in the browser or call them with curl:

http://127.0.0.1:8000/health
http://127.0.0.1:8000/docs
The backend currently creates the local SQLite database automatically on startup.
The database file lives at backend/mosaic.db.
Available endpoints:
- GET /health
- GET /investigations
- POST /investigations
- PATCH /investigations/{investigation_id}/archive
Example create request:
curl -X POST http://127.0.0.1:8000/investigations \
-H "Content-Type: application/json" \
-d '{
"title": "Test Investigation",
"description": "First DB-backed investigation"
}'

Once the rest of the local stack is wired up, the intended run commands are:
redis-server
cd backend && uvicorn app.main:app --reload
rq worker
npm run dev
Planned workflow:
- create investigation
- run research job
- extract entities + evidence
- store candidates
- ingest into Neo4j
- query graph
- visualize results
Future additions:
- Graph Data Science algorithms
- GraphRAG integration
- multi-user investigations
- screenshot service
- export/reporting
- confidence scoring
Anti-patterns:
- do not store every search permanently
- do not mix scraping inside API routes
- do not allow unrestricted Cypher from LLMs
- do not over-engineer infrastructure early
- do not rely on LLMs for truth
Design goals:
- modular
- explainable
- traceable
- scalable
- lightweight for local development
Engineering rules:
- connectors, extraction, and graph are separate layers
- all data must flow through normalization
- graph queries live only in graph/
- avoid tight coupling between frontend and sources
- use typed schemas everywhere
- treat evidence as first-class data
- keep research and analysis separate
This platform evolves from a search tool into a structured intelligence system, with:
- reusable data
- explainable relationships
- scalable architecture