Threat Model
What FANGS catches, what it doesn't, what assumptions it makes about
its host. Lifted from docs/THREAT_MODEL.md and extended.
For every watched npm release:
1. Pull the npm tarball + metadata from `registry.npmjs.org`. The runner pulls it inside the sandbox via `npm install`. The orchestrator never touches the tarball directly.
2. Run `npm install <pkg>@<ver>` inside an unprivileged Docker container on the runner host. This executes the package's `preinstall`/`install`/`postinstall` scripts.
3. Observe every syscall the container makes via eBPF tracepoints attached on the host kernel, scoped to the container's cgroup.
4. Stream events back to the orchestrator over HTTP(S).
The malicious code in step 2 is the live threat. Everything else is defense.
| Component | Trust posture |
|---|---|
| Orchestrator | Trusted. Operators run it; holds the DB; signs HMACs. |
| Runner | Trusted but disposable. Mints cgroups, drives Docker, holds the eBPF sensor. Operators run it; treat it as the dirty box. |
| Sandbox container | Hostile. Every byte inside is attacker-controlled. |
| Operator CLI | Trusted. Talks directly to the same DB the orchestrator uses. |
| Kernel | Trusted. The sensor's eBPF probes live in the kernel; sandbox can't subvert what they report about its own syscalls. |
The package has root inside the container. Specifically:
- Anything `npm install` scripts can do as `root` (the default `User: 0:0`).
- Outbound network: `NetworkMode: bridge` → DNS, TCP, UDP to anywhere reachable from the Docker bridge.
- Filesystem writes inside the container: rootfs is writable. `/tmp` is a 256 MB noexec tmpfs.
What the attacker doesn't control:
- Kernel-side view of their syscalls.
- Capabilities: `CapDrop: ALL` + `no-new-privileges:true`.
- PIDs / forks: `PidsLimit: 256` blocks fork bombs.
- Resource caps: `Memory: 512 MB`, `NanoCpus: 1`, `Ulimits: nofile/nproc/fsize`.
- Host filesystem persistence: no volume mounts, no `--device`. Container artifacts get cleaned up on `docker rm`.
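The limits listed above correspond to Docker Engine API container-create fields. A minimal sketch, assuming a hypothetical `sandboxSpec` struct and `validate` helper (neither is FANGS's actual type). The byte count, the nano-CPU value (one full CPU), and the tmpfs option string are assumptions derived from the figures above. Ulimits are omitted because their values are not given here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// sandboxSpec is a hypothetical mirror of the Docker Engine API fields
// named in the lists above. It is not the real FANGS type.
type sandboxSpec struct {
	User        string            `json:"User"`
	NetworkMode string            `json:"NetworkMode"`
	CapDrop     []string          `json:"CapDrop"`
	SecurityOpt []string          `json:"SecurityOpt"`
	PidsLimit   int64             `json:"PidsLimit"`
	Memory      int64             `json:"Memory"`   // bytes
	NanoCpus    int64             `json:"NanoCpus"` // units of 1e-9 CPUs
	Tmpfs       map[string]string `json:"Tmpfs"`
}

// defaultSandboxSpec encodes the defaults described above.
func defaultSandboxSpec() sandboxSpec {
	return sandboxSpec{
		User:        "0:0",
		NetworkMode: "bridge",
		CapDrop:     []string{"ALL"},
		SecurityOpt: []string{"no-new-privileges:true"},
		PidsLimit:   256,
		Memory:      512 * 1024 * 1024,
		NanoCpus:    1_000_000_000, // assumption: "NanoCpus: 1" means one full CPU
		Tmpfs:       map[string]string{"/tmp": "rw,noexec,size=256m"},
	}
}

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// validate rejects specs that loosen the hardening described above.
func (s sandboxSpec) validate() error {
	switch {
	case !contains(s.CapDrop, "ALL"):
		return errors.New("CapDrop must include ALL")
	case !contains(s.SecurityOpt, "no-new-privileges:true"):
		return errors.New("SecurityOpt must include no-new-privileges:true")
	case s.PidsLimit <= 0 || s.Memory <= 0 || s.NanoCpus <= 0:
		return errors.New("PidsLimit, Memory and NanoCpus must be positive")
	}
	return nil
}

func main() {
	spec := defaultSandboxSpec()
	if err := spec.validate(); err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(spec, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Validating the spec before container creation is one way to make sure a per-scan override cannot silently drop a limit.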
Threat. Kernel CVE that lets the sandboxed process break out of
its namespaces / cgroup. Once out, attacker has runner's privileges
— including root (because runner needs CAP_BPF + CAP_PERFMON +
Docker socket).
Mitigation:
- Keep runner host's kernel patched. eBPF + container CVEs land ~quarterly; FANGS doesn't reduce patching urgency.
- Don't share the runner host with anything that matters. The runner is a disposable scan-only box; treat as compromised.
- `seccomp` is NOT applied today. Future: Docker default seccomp profile + a FANGS-tuned overlay.
Threat. Package exfiltrates host secrets. Even without escape, anything the sandbox can read (its env, its writable rootfs) it can send out.
Mitigation:
- Sandbox env is the install context (npm token if you populated one). Don't populate the sandbox with secrets you can't afford to leak. The default scan config doesn't.
- Network observation is full-fidelity: every TCP connect (including io_uring path), every TLS SNI, every DNS qname. Exfil to novel destinations becomes a deviation.
- DNS-over-HTTPS to a non-baseline host shows up as `net_new_https_host`. DoH to an existing baseline host (Cloudflare 1.1.1.1 via `cloudflare-dns.com`) is invisible at the FANGS layer: the encrypted query inside the TLS session can't be parsed. At best, JA4 fingerprinting of the TLS client (not in v1) could flag an unusual client talking to a known host.
- For deployments where any exfil at all is catastrophic, set `NetworkMode: none` on the scan submission's sandbox spec. Trade-off: `npm install` can't reach the registry, so this only works for pre-cached tarballs.
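The DoH blind spot follows from how a set difference against the baseline behaves. A minimal sketch, assuming a hypothetical `newHTTPSHosts` helper that compares observed TLS SNI hosts with the baseline. The real differ's types and normalization rules are not described in this document, so the names and the lower-casing step are illustrative only.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalize lower-cases a hostname and strips whitespace and a trailing dot.
func normalize(h string) string {
	return strings.TrimSuffix(strings.ToLower(strings.TrimSpace(h)), ".")
}

// newHTTPSHosts returns the observed SNI hosts that are absent from the
// baseline, deduplicated and sorted. Hypothetical helper, not FANGS code.
func newHTTPSHosts(baseline, observed []string) []string {
	known := make(map[string]struct{}, len(baseline))
	for _, h := range baseline {
		known[normalize(h)] = struct{}{}
	}
	seen := make(map[string]struct{})
	var fresh []string
	for _, h := range observed {
		n := normalize(h)
		if _, ok := known[n]; ok {
			continue // already in baseline: no deviation, whatever the payload was
		}
		if _, dup := seen[n]; dup {
			continue
		}
		seen[n] = struct{}{}
		fresh = append(fresh, n)
	}
	sort.Strings(fresh)
	return fresh
}

func main() {
	baseline := []string{"registry.npmjs.org", "cloudflare-dns.com"}
	observed := []string{"registry.npmjs.org", "cloudflare-dns.com", "dns.example.net"}
	// Only the host missing from the baseline is reported.
	fmt.Println(newHTTPSHosts(baseline, observed)) // [dns.example.net]
}
```

A connection to `cloudflare-dns.com` never reaches the output because the comparison sees only the hostname, not what travels inside the TLS session.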
Threat. Sophisticated package detects it's in a sandbox and behaves benignly. Released to production, it lights up.
Mitigation:
- FANGS's first run for a new version IS the baseline. If the package behaves benignly in the sandbox, that's baseline. Real-prod behavior that differs would create a deviation, except FANGS doesn't observe prod.
- This is a known unsolvable problem in pure-sandbox malware analysis. Mitigations are: (a) make the sandbox look enough like real prod (Node version, hostname patterns, env vars) that simple checks fail; (b) run long enough that timer-gated payloads still fire (default `duration: 60s`; tune higher for sneaky packages).
- FANGS doesn't try to be undetectable. The cgroup parent is named `/fangs/<run_id>`, the runner hostname is the real host. A determined attacker has signals available. v2 work: per-job sandbox identity randomization.
Threat. Package uses kernel mechanisms FANGS doesn't observe.
Covered (probes attached):
- `sys_enter_openat` (file access)
- `sys_enter_execve` (process exec + 5-level ancestry)
- `sys_enter_connect` + `kprobe:tcp_v{4,6}_connect` (TCP/UDP, IPv4+IPv6, including io_uring TCP)
- `sys_enter_sendto` + `sys_enter_sendmsg` + `sys_enter_sendmmsg` (DNS, TLS ClientHello)
- `sys_enter_write` (TLS ClientHello on TCP; Node's bundled OpenSSL path)
- `uprobe:SSL_ctrl` (libssl SNI)
Known gaps:
- `openat2` (Linux 5.6+, semantically equivalent). Modern glibc may use it; we only hook `openat`. Most npm-install workloads still hit legacy `openat`.
- `io_uring_enter` with file-op opcodes (read/write/open via io_uring). Rare in npm packages but possible in malware designed to evade.
- `mmap` + dirty-page writes. Not observed.
- Raw netlink-driven sockets. Edge case.
Each gap is a future probe addition. The delta-vs-baseline approach means a new probe retroactively catches new behaviors as soon as fingerprints land.
Threat. Attacker controls the first release FANGS sees. They ship a benign payload to seed the baseline, then turn on malicious behavior in a later release — but the malicious fingerprints look "normal" because the baseline already includes them.
Mitigation:
- FANGS doesn't enforce "watch from version 1." Operators add packages they already trust at some version. Baseline starts there.
- For cold-start packages (operator has no prior trust), today's D38 auto-promote on first run is too trusting. Operators in this case should manually inspect the first run's events before promoting. UI shows first-run events normally; deviations only show post-baseline.
- Open mitigation (v2): require K consecutive zero-deviation runs before auto-promotion. Source change in `differ/analyze.go`.
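The K-consecutive-runs rule can be sketched as a small gate. This is a minimal illustration, not the contents of `differ/analyze.go`: the `runResult` type and `readyToPromote` function are hypothetical, and it assumes deviations for not-yet-promoted runs are counted against the candidate baseline.

```go
package main

import "fmt"

// runResult is a hypothetical summary of one sandbox run.
type runResult struct {
	Version    string
	Deviations int // deviations against the candidate baseline (assumption)
}

// readyToPromote reports whether the most recent k runs all finished with
// zero deviations. runs must be ordered oldest to newest.
func readyToPromote(runs []runResult, k int) bool {
	if k <= 0 || len(runs) < k {
		return false // not enough history yet
	}
	for _, r := range runs[len(runs)-k:] {
		if r.Deviations != 0 {
			return false // any deviation inside the window blocks promotion
		}
	}
	return true
}

func main() {
	history := []runResult{
		{Version: "1.0.0", Deviations: 0},
		{Version: "1.0.1", Deviations: 2},
		{Version: "1.0.2", Deviations: 0},
		{Version: "1.0.3", Deviations: 0},
	}
	fmt.Println(readyToPromote(history, 2)) // true: the last two runs are clean
	fmt.Println(readyToPromote(history, 3)) // false: 1.0.1 falls inside the window
}
```

Larger K raises the cost of seeding a baseline with a single benign release, at the price of a longer manual-review period for cold-start packages.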
Threat. Attacker who can intercept the orchestrator's outbound HTTP can forge "all-clear" notifications or replay real alerts to overwhelm the receiver.
Mitigation:
- HTTPS-only for webhook targets (loopback http exempted for local testing).
- HMAC opt-in per target via `secret_env`. Receivers verify `X-FANGS-Signature: sha256=<hex>` against their own copy of the secret. Slack and Discord skip HMAC (URL is the secret).
- Receiver should validate timestamps for replay protection. FANGS doesn't embed a timestamp in the signed body today; add a `time_unix_nano` field to the generic envelope if you need replay protection at the receiver.
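A receiver-side check might look like the sketch below. It assumes the signature is HMAC-SHA256 over the raw request body, hex-encoded after the `sha256=` prefix, and that the envelope has been extended with the `time_unix_nano` field suggested above. The `sign` and `verify` function names and the `envelope` struct are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// envelope holds only the field this check needs (assumed field name).
type envelope struct {
	TimeUnixNano int64 `json:"time_unix_nano"`
}

// sign produces a header value in the sha256=<hex> form.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// verify checks the signature first, then the timestamp window.
func verify(secret, body []byte, header string, nowUnixNano int64, maxSkew time.Duration) error {
	if !strings.HasPrefix(header, "sha256=") {
		return errors.New("missing sha256= prefix")
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, "sha256="))
	if err != nil {
		return errors.New("signature is not valid hex")
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	if !hmac.Equal(got, mac.Sum(nil)) { // constant-time comparison
		return errors.New("signature mismatch")
	}
	var env envelope
	if err := json.Unmarshal(body, &env); err != nil {
		return fmt.Errorf("decode envelope: %w", err)
	}
	age := time.Duration(nowUnixNano - env.TimeUnixNano)
	if age > maxSkew || age < -maxSkew {
		return errors.New("timestamp outside replay window")
	}
	return nil
}

func main() {
	secret := []byte("example-secret")
	now := time.Now().UnixNano()
	body := []byte(fmt.Sprintf(`{"time_unix_nano":%d}`, now))
	fmt.Println(verify(secret, body, sign(secret, body), now, 5*time.Minute))
}
```

The timestamp only bounds how long a captured request stays replayable. Rejecting duplicates inside the window would additionally need a nonce or event ID tracked by the receiver.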
Threat. Attacker with DB write access mutates baselines (silently suppressing real findings), forges deviations (noise), or adds allowlist entries (suppressing whole categories).
Mitigation:
- DB is the single source of truth for what FANGS thinks is benign. Treat accordingly: filesystem permissions, Postgres role separation, backups.
- Migrations are embedded + run idempotently from `schema_migrations`. A rogue DB writer can't sneak schema changes through the orchestrator's startup path.
- In steady state, only the orchestrator process should be writing the DB. The `fangs` CLI writes via the storage layer but should run as a trusted operator.
- v2 idea: append-only audit log of all DB mutations.
Threat. A runner connects to the orchestrator and either poisons baselines by reporting fake clean runs, or sniffs the scan jobs (which contain the orchestrator's view of what packages are worth watching).
Mitigation:
- Without mTLS, this is unmitigated. Anyone who can reach the orchestrator's listen address can register.
- With mTLS, only runners holding a cert signed by the orchestrator's CA can register. CN of the cert is the runner identity, recorded in the registration metadata.
- See TLS-mTLS.
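For orientation, the mTLS requirement corresponds to two pieces on a Go server: a `tls.Config` that demands a client certificate chaining to the runner CA, and a lookup of the CN on the verified leaf. This is a minimal sketch using only the standard library. The `orchestratorTLSConfig` and `runnerIdentity` names are hypothetical, and the server's own certificate setup is left out.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// orchestratorTLSConfig requires every client to present a certificate that
// chains to runnerCA. The server's own Certificates field is omitted here
// and must be set before the config can actually serve connections.
func orchestratorTLSConfig(runnerCA *x509.CertPool) *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS12,
		ClientCAs:  runnerCA,
		ClientAuth: tls.RequireAndVerifyClientCert,
	}
}

// runnerIdentity returns the CN of the verified client leaf certificate.
// Pass tls.ConnectionState.VerifiedChains from the accepted connection.
func runnerIdentity(verifiedChains [][]*x509.Certificate) (string, error) {
	if len(verifiedChains) == 0 || len(verifiedChains[0]) == 0 {
		return "", errors.New("no verified client certificate")
	}
	cn := verifiedChains[0][0].Subject.CommonName
	if cn == "" {
		return "", errors.New("client certificate has an empty CN")
	}
	return cn, nil
}

func main() {
	cfg := orchestratorTLSConfig(x509.NewCertPool())
	fmt.Println(cfg.ClientAuth == tls.RequireAndVerifyClientCert)

	// Without a verified chain there is no identity to record.
	_, err := runnerIdentity(nil)
	fmt.Println(err)
}
```

Reading the CN from `VerifiedChains` rather than `PeerCertificates` keeps the identity tied to a chain the TLS stack has already validated against the CA pool.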
Threat. Attacker gets root on the runner via some unrelated vulnerability. Now they can inject events, drop legit events, or attack the orchestrator via the trusted runner→orchestrator channel.
Mitigation:
- Runner host should be dedicated and minimal — nothing else running there.
- Runner certs are per-host so a compromised host's cert can be revoked + the runner replaced.
- Event-streaming endpoint trusts the runner's reported PIDs and comms — there's no cross-check against what Docker says is in the container. A malicious runner could fabricate events. Defense-in-depth: signing per-event with a runner-side key (v2 item).
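The per-event signing idea could take roughly the following shape. This is a minimal sketch with Ed25519 from the standard library. The `event` fields, the `signedEvent` wrapper, and all function names are hypothetical, since this document does not specify FANGS's event schema.

```go
package main

import (
	"crypto/ed25519"
	"encoding/json"
	"errors"
	"fmt"
)

// event is a hypothetical, trimmed-down sensor event.
type event struct {
	RunID   string `json:"run_id"`
	Seq     uint64 `json:"seq"`
	PID     int    `json:"pid"`
	Comm    string `json:"comm"`
	Syscall string `json:"syscall"`
}

// signedEvent carries the exact bytes that were signed plus the signature.
type signedEvent struct {
	Payload []byte `json:"payload"`
	Sig     []byte `json:"sig"`
}

// newRunnerKey generates a runner-side key pair (nil selects crypto/rand).
func newRunnerKey() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(nil)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

// signEvent serializes the event once and signs those bytes.
func signEvent(priv ed25519.PrivateKey, e event) (signedEvent, error) {
	payload, err := json.Marshal(e)
	if err != nil {
		return signedEvent{}, err
	}
	return signedEvent{Payload: payload, Sig: ed25519.Sign(priv, payload)}, nil
}

// verifyEvent checks the signature before decoding the payload.
func verifyEvent(pub ed25519.PublicKey, se signedEvent) (event, error) {
	if !ed25519.Verify(pub, se.Payload, se.Sig) {
		return event{}, errors.New("event signature invalid")
	}
	var e event
	if err := json.Unmarshal(se.Payload, &e); err != nil {
		return event{}, err
	}
	return e, nil
}

func main() {
	pub, priv := newRunnerKey()
	se, err := signEvent(priv, event{RunID: "run-1", Seq: 1, PID: 42, Comm: "node", Syscall: "connect"})
	if err != nil {
		panic(err)
	}
	e, err := verifyEvent(pub, se)
	fmt.Println(e.Comm, err)
}
```

Signing binds each event to one runner key and makes in-transit tampering detectable. It does not stop an attacker who already has root on the runner and can use the key, which is why it counts as defense-in-depth only.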
- Compile-time supply-chain attacks. Malicious code in a package's pre-publish tooling that injects payloads into the tarball before upload. FANGS observes what runs, not what was modified pre-publish. Use SLSA + provenance attestations for that layer.
- Account takeover detection. FANGS sees behavior change; it doesn't know who published the version. If an attacker compromises a maintainer account and ships a version with EXACTLY the same behavior as priors, FANGS reports no deviation. Combine with npm publish-event monitoring for ATO detection.
- Typosquats at install time. Watching `lodahs` (typo) won't help if devs typo `lodaSh` instead. FANGS only watches packages operators explicitly added.
- Dependency confusion. If your private registry serves `internal-tooling@1.0.0` and someone publishes the same name to npmjs.org with a higher version, FANGS only watches what's on its list. If `internal-tooling` is watched and the public version gets installed instead, the behavior comparison happens against the PUBLIC version's baseline, possibly not what operators expect.
- In-memory-only payloads with no syscall side effects. Theoretical; no real-world npm attack uses this because at some point exfil needs a network syscall.
FANGS handles a lot. It cannot handle:
- Keeping the host kernel patched.
- Keeping the Docker daemon updated.
- Choosing which packages to watch.
- Reading the deviations. Notifier is a push channel, but a human has to decide what each finding means.
- Not running the runner on the same box as production services. If you must, sandbox the runner itself in its own VM.
Found a way to subvert FANGS — make the sensor miss something, escape the sandbox, fool the differ? Open an issue against the repo or contact the maintainers directly. Coordinated disclosure preferred for sandbox-escape-class findings.