Skip to content

Storage Schema

cyb3rjerry edited this page May 23, 2026 · 1 revision

Storage Schema

Every table, every column, every migration. Source: internal/orchestrator/storage/migrations/.

Two backends — SQLite (default, pure Go via modernc.org/sqlite) and Postgres 14+. Both implement the same storage.Backend interface; a shared contract suite in internal/orchestrator/storage/storagetest/ runs against both in CI.

Migrations

Embedded at migrations/{sqlite,postgres}/. Migration loader applies *.up.sql files in lex order, records applied versions in schema_migrations.

Version Up file What it adds
0001 0001_init.up.sql initial schema: runs, events, baseline_fingerprints, deviations, packages, releases
0002 0002_notifiers.up.sql notifiers + notifications tables (per-run audit log)
0003 0003_run_result.up.sql events_emitted, events_dropped, duration_ns columns on runs
0004 0004_allowlists.up.sql allowlists table (CIDR/path/SNI suppression)

schema_migrations (auto-created):

CREATE TABLE schema_migrations (
  version    TEXT PRIMARY KEY,
  applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

Down migrations exist (*.down.sql) but aren't run automatically. The migration loader is up-only; rolling back requires manual SQL.

runs

One row per sandbox scan. The central spine — every other table references this via run_id.

CREATE TABLE runs (
  id              TEXT PRIMARY KEY,           -- hex-encoded 16-byte ULID
  package_name    TEXT NOT NULL,
  version         TEXT NOT NULL,
  tarball_sha256  TEXT NOT NULL DEFAULT '',
  lockfile_sha256 TEXT NOT NULL DEFAULT '',
  node_version    TEXT NOT NULL DEFAULT '',
  npm_version     TEXT NOT NULL DEFAULT '',
  state           TEXT NOT NULL,              -- pending|building|sandboxed|analyzed|done|failed
  attempt         INTEGER NOT NULL DEFAULT 1,
  is_baseline     INTEGER NOT NULL DEFAULT 0,
  started_at      TEXT,
  finished_at     TEXT,
  failure_reason  TEXT NOT NULL DEFAULT '',
  -- added in 0003:
  events_emitted  INTEGER NOT NULL DEFAULT 0,
  events_dropped  INTEGER NOT NULL DEFAULT 0,
  duration_ns     INTEGER NOT NULL DEFAULT 0
);

state lifecycle:

pending → done | failed | timeout

The intermediate states (building, sandboxed, analyzed) exist in the schema for future use but the current runner posts a terminal state directly via /v1/runs/<id>/result.

is_baseline=1 means the Differ accepted the run's fingerprints as the package's baseline. Set on first-run-of-package (auto-seed) or on zero-deviation subsequent runs (auto-promote) or via fangs baseline promote (manual).

events

Raw sensor events. JSON payload preserves the typed event shape (with parsed-string fields appended by the runner).

CREATE TABLE events (
  id     INTEGER PRIMARY KEY AUTOINCREMENT,   -- sqlite; BIGSERIAL on postgres
  run_id TEXT NOT NULL REFERENCES runs(id) ON DELETE CASCADE,
  ts_ns  INTEGER NOT NULL,
  type   TEXT NOT NULL,                       -- file_access|exec|net_connect|dns_query|tls_sni
  data   TEXT NOT NULL                        -- JSON-encoded payload
);

CREATE INDEX events_by_run_ts ON events (run_id, ts_ns);
CREATE INDEX events_by_ts_ns ON events (ts_ns);  -- retention pruner

data is the runner's typed event with Header + per-type fields. Read example:

SELECT id, type,
       json_extract(data, '$.PathName')    AS path,
       json_extract(data, '$.Header.PID')  AS pid
  FROM events WHERE run_id = ? AND type = 'file_access'
  ORDER BY ts_ns;

ts_ns is the kernel monotonic timestamp (bpf_ktime_get_ns), not wall-clock. Ordering within a run is reliable; cross-run comparison isn't.

Retention: the pruner deletes rows whose ts_ns < now - retention_days EXCEPT those referenced as evidence by an existing deviation. Pinned evidence keeps deviations' "click to see the proof" link working past the retention horizon.

baseline_fingerprints

The package's rolling baseline. One row per distinct (package, category, value) tuple.

CREATE TABLE baseline_fingerprints (
  package_name      TEXT NOT NULL,
  category          TEXT NOT NULL,           -- net_new_destination|net_new_dns|net_new_https_host|fs_new_path_read|fs_new_path_write|proc_new_exec
  value             TEXT NOT NULL,
  first_seen_run_id TEXT NOT NULL REFERENCES runs(id),
  last_seen_run_id  TEXT NOT NULL REFERENCES runs(id),
  occurrence_count  INTEGER NOT NULL DEFAULT 1,
  PRIMARY KEY (package_name, category, value)
);

CREATE INDEX baseline_by_package ON baseline_fingerprints (package_name);

occurrence_count ticks up every time a run re-observes the fingerprint. last_seen_run_id updates so operators can identify fossil entries (last seen 200 days ago) vs. actively-confirmed ones.

deviations

Findings — fingerprints that were in a run's event stream but NOT in the package's baseline.

CREATE TABLE deviations (
  id                TEXT PRIMARY KEY,         -- random 32-hex
  run_id            TEXT NOT NULL REFERENCES runs(id) ON DELETE CASCADE,
  category          TEXT NOT NULL,
  value             TEXT NOT NULL,
  evidence_event_id INTEGER NOT NULL REFERENCES events(id) ON DELETE CASCADE,
  severity          TEXT NOT NULL,            -- low|medium|high|critical
  detected_at       TEXT NOT NULL,
  notified_at       TEXT,
  suppressed        INTEGER NOT NULL DEFAULT 0
);

CREATE INDEX deviations_by_run ON deviations (run_id);

evidence_event_id is the first event in the run that produced the matching fingerprint. The retention pruner explicitly EXCLUDES these events so the link survives.

suppressed is reserved for future use — currently always 0. A future "acknowledge without promote" workflow could flip it.

notified_at is also reserved — currently the Notifier writes a separate notifications row per attempt instead of stamping the deviation. Could be repurposed if we ever want at-most-once delivery.

packages

Watcher's watch list.

CREATE TABLE packages (
  name              TEXT PRIMARY KEY,
  added_at          TEXT NOT NULL,
  last_checked_at   TEXT,
  last_seen_version TEXT NOT NULL DEFAULT ''
);

fangs package add <name> inserts here. The watcher's poll loop reads from this table, compares last_seen_version to the registry's current dist-tags.latest, dispatches a scan + updates the row on mismatch.

releases

Registry versions the watcher has observed. One row per (package_name, version).

CREATE TABLE releases (
  package_name   TEXT NOT NULL,
  version        TEXT NOT NULL,
  tarball_sha256 TEXT NOT NULL DEFAULT '',
  npm_integrity  TEXT NOT NULL DEFAULT '',
  published_at   TEXT,
  discovered_at  TEXT NOT NULL,
  PRIMARY KEY (package_name, version)
);

CREATE INDEX releases_by_discovered ON releases (discovered_at DESC);

fangs release list -package P reads this; RecordRelease is idempotent on the primary key so a watcher restart doesn't duplicate.

notifiers

Configured webhook targets. Added in migration 0002.

CREATE TABLE notifiers (
  name         TEXT PRIMARY KEY,
  url          TEXT NOT NULL,
  template     TEXT NOT NULL,        -- slack|discord|generic
  secret_env   TEXT,                 -- env var name for HMAC secret
  headers      TEXT,                 -- JSON-encoded extra headers
  min_severity TEXT,                 -- low|medium|high|critical|''
  enabled      INTEGER NOT NULL DEFAULT 1,
  created_at   TEXT NOT NULL,
  updated_at   TEXT NOT NULL
);

UpsertNotifier uses INSERT … ON CONFLICT(name) DO UPDATE so both fangs notifier add (overwrites on duplicate name) and the -notifiers-file boot-time loader behave identically.

notifications

Per-attempt delivery audit log. Added in migration 0002 (replaced the 0001 placeholder).

CREATE TABLE notifications (
  id                TEXT PRIMARY KEY,
  run_id            TEXT NOT NULL REFERENCES runs(id) ON DELETE CASCADE,
  notifier_name     TEXT NOT NULL REFERENCES notifiers(name) ON DELETE CASCADE,
  attempt           INTEGER NOT NULL,         -- 1..N
  status            TEXT NOT NULL,            -- queued|sent|failed|permanent
  last_attempted_at TEXT,
  next_attempt_at   TEXT,
  response_code     INTEGER,
  response_body     TEXT,                     -- truncated to 2048 bytes
  error_msg         TEXT,                     -- network/timeout messages
  deviation_count   INTEGER NOT NULL DEFAULT 0,
  created_at        TEXT NOT NULL
);

CREATE INDEX notifications_by_status ON notifications (status, next_attempt_at);
CREATE INDEX notifications_by_run    ON notifications (run_id);

Append-only. Each retry writes a new row; nothing is updated. The status, next_attempt_at index supports a hypothetical "resume-pending-retries-on-restart" implementation (not wired today).

allowlists

Operator-managed suppressions. Added in migration 0004.

CREATE TABLE allowlists (
  id           TEXT PRIMARY KEY,
  scope        TEXT NOT NULL,           -- global|package
  package_name TEXT,                    -- non-NULL iff scope=package
  kind         TEXT NOT NULL,           -- cidr|path|sni
  value        TEXT NOT NULL,
  note         TEXT NOT NULL DEFAULT '',
  created_at   TEXT NOT NULL,
  CHECK (scope IN ('global','package')),
  CHECK (kind  IN ('cidr','path','sni')),
  CHECK ((scope='global' AND package_name IS NULL)
      OR (scope='package' AND package_name IS NOT NULL))
);

CREATE INDEX allowlists_by_package ON allowlists (package_name);
CREATE INDEX allowlists_by_scope   ON allowlists (scope);

ID semantics:

  • CLI-added: random 16-hex via crypto/rand
  • Config-managed: cfg + first 12 hex chars of sha256(kind+"|"+value). Deterministic so re-applies are idempotent.

The CHECK constraints encode the invariant that scope=global ↔ package_name IS NULL.

Backend-specific differences

Topic SQLite Postgres
Timestamps TEXT in RFC3339Nano TIMESTAMPTZ
Auto-increment INTEGER PRIMARY KEY AUTOINCREMENT BIGSERIAL
Bool INTEGER 0/1 BOOLEAN
JSON columns TEXT + json_extract TEXT + data::jsonb if needed
Placeholder ? $1, $2, ...
ON CONFLICT works works

The dual-backend contract suite verifies every storage method behaves identically. CI runs the suite against a real Postgres service container.

Sizing + retention

Table Growth rate Retention
events by far the largest — thousands of rows per run × hundreds of runs/day 90 days default, pinned-evidence excepted
deviations small — only novel fingerprints indefinite
baseline_fingerprints bounded per-package indefinite
runs one row per scan indefinite
releases one per registry version observed indefinite
notifications small — one per delivery attempt indefinite
allowlists tiny — operator-curated indefinite
packages tiny — operator-curated indefinite

For a typical deployment watching 50 packages with weekly releases:

  • ~250 runs/week
  • ~250 × 500 = 125k events/week
  • After 90 days: ~1.5M events
  • ~5-10 KB per event row → ~10 GB on-disk for the events table

Postgres recommended at this scale; SQLite handles it but pragma tuning (WAL mode, large cache_size) matters.

Direct queries

The CLI talks directly to the DB via storage.Backend. Operators running ad-hoc queries can use the same path:

sqlite3 var/lib/fangs/fangs.db
sqlite> SELECT package_name, COUNT(*) AS runs
   ...> FROM runs GROUP BY package_name ORDER BY runs DESC LIMIT 10;

Or via psql for Postgres. The schema is stable — migrations are additive; columns aren't removed without a new migration version.

Clone this wiki locally