Operating
Day-2 reference. Workflows, triage, troubleshooting.
FANGS is a delta detector. It compares each run against the package's rolling baseline and surfaces what's new. Three states a run can be in:
| State | Meaning |
|---|---|
| First run for a package | Baseline gets seeded; zero deviations regardless |
| Subsequent zero-deviation run | Auto-promoted to baseline (occurrence_count bump only) |
| Subsequent any-deviation run | Lands in fangs pending for operator decision |
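
The table reads as a small decision function. Here is a minimal sketch in shell, assuming two inputs that are not part of the FANGS CLI: whether the package already has a baseline (0 or 1) and how many deviations the run produced. The function name and state labels are illustrative only.

```shell
# Hypothetical helper, not a FANGS command: maps the table above to a label.
# $1 = 1 if the package already has a baseline, 0 otherwise
# $2 = number of deviations found in this run
classify_run() {
    has_baseline=$1
    deviations=$2
    if [ "$has_baseline" -eq 0 ]; then
        echo "seed-baseline"   # first run: seeds baseline, zero deviations regardless
    elif [ "$deviations" -eq 0 ]; then
        echo "auto-promote"    # clean run: occurrence_count bump only
    else
        echo "pending"         # lands in fangs pending for operator decision
    fi
}

classify_run 0 7   # prints: seed-baseline
classify_run 1 0   # prints: auto-promote
classify_run 1 3   # prints: pending
```

The first call shows why a first run never produces findings: with no baseline there is nothing to compare against, whatever the run did.
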
fangs package add lodash
What happens:
- Validates lodash exists on registry.npmjs.org. Bogus names get rejected before the DB row lands.
- Inserts into the packages table.
- Records the current dist-tags.latest as last_seen_version so the next watcher poll doesn't re-flag it as "new."
- Queues an immediate kickoff scan of that version. First run → seeds baseline.
Subsequent releases auto-trigger via the watcher (default 5min poll).
Useful subcommands:
fangs package watched # current watch list
fangs package list # packages with runs (post-scan summary view)
fangs package remove lodash # stop watching
Skip the kickoff scan if you want; useful for offline / batch flows:
fangs package add lodash -skip-initial-scan

fangs scan submit -package axios -version 1.7.7
Validates axios@1.7.7 exists on the registry, then POSTs /v1/scans
to the orchestrator (default http://127.0.0.1:8443). Returns the
assigned run_id + the watch URLs.
Useful flags:
| Flag | Purpose |
|---|---|
| -orchestrator URL | when the orchestrator isn't on localhost |
| -runner ID | target a specific runner (default: the operator's hostname) |
| -duration 90s | longer sandbox time for heavy installs |
| -skip-registry-validate | useful for offline tests against private fake-registries |
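
For a batch flow, the two commands above can be combined: add each package with the kickoff scan skipped, then submit the version you want scanned. The sketch below uses only the flags documented above. The input file format (one "name version" pair per line) and the function name batch_scan are assumptions made for illustration.

```shell
# batch_scan LIST_FILE
# LIST_FILE holds one "name version" pair per line (a hypothetical format);
# blank lines and lines starting with # are ignored.
batch_scan() {
    while read -r name version; do
        case "$name" in ''|'#'*) continue ;; esac
        # Watch the package but skip the automatic kickoff scan.
        # If the add fails (e.g. the name is rejected), move on to the next line.
        fangs package add "$name" -skip-initial-scan < /dev/null || continue
        # Submit the pinned version explicitly, with a longer sandbox window.
        fangs scan submit -package "$name" -version "$version" -duration 90s < /dev/null
    done < "$1"
}
```

Call it as batch_scan packages.txt. Redirecting each command's stdin from /dev/null keeps it from consuming the rest of the list while the loop reads it.
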
The triage queue lives in two places — one for browsing, one for scripting:
# Browser
http://127.0.0.1:8443/ui/pending
# CLI
fangs pending
fangs pending -package axios
fangs pending -min-severity high
fangs pending --json | jq . # for piping into other tools
Each row shows the run, package, version, deviation count, max
severity, last detected, and the literal fangs baseline promote …
command you can copy-paste.
Five steps:
- Look at the package + version. Is it a known dependency? Was there a recent advisory? Sometimes the answer comes from context alone.
- Look at the deviation list for the run. fangs deviation list -run <run_id> or click into the run on the UI.
- Look at the evidence event. fangs deviation show <id> prints the full kernel-event JSON that triggered the finding. Or click "→ lineage" in the UI to see the process tree.
- Decide. Three real options:
  - Promote: fangs baseline promote <run_id>. The whole run's fingerprints merge into baseline, deviation rows clear.
  - Allowlist the noise: fangs allow add with the appropriate kind. See below.
  - Investigate: pull the package tarball, do offline analysis. Nothing in FANGS changes until you act.
Suppress known-benign noise before it becomes a deviation:
# Global — applies to every package
fangs allow add -kind cidr -value 10.0.0.0/8 -note "internal"
# Per-package — only applies to runs of `axios`
fangs allow add -kind sni -value telemetry.example -package axios
# Path exclusion — silence noisy file_access events
fangs allow add -kind path -value /opt/vendor/ -note "trusted vendored dir"
Three kinds map to three Differ categories:
| kind | suppresses | example |
|---|---|---|
| cidr | net_new_destination | 10.0.0.0/8 |
| path | fs_new_path_* | /opt/vendor/ |
| sni | net_new_https_host | telemetry.example |
The hardcoded CDN allowlist (Cloudflare/GitHub/Google/Fastly/CloudFront) applies underneath — entries here are additive.
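
To make the cidr kind concrete, here is a rough sketch of the containment check that decides whether a destination address falls inside an allowlisted block. It is IPv4-only and is not FANGS's actual matching code. The helper names ip_to_int and in_cidr are invented for illustration, and the arithmetic assumes a shell with 64-bit integers.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr ADDRESS BLOCK
# Returns 0 when ADDRESS falls inside BLOCK (e.g. 10.0.0.0/8).
in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    if [ "$bits" -eq 0 ]; then
        mask=0         # /0 matches every address
    else
        mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    fi
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

in_cidr 10.1.2.3 10.0.0.0/8 && echo "suppressed"   # prints: suppressed
```

With the global 10.0.0.0/8 entry above, a connection to 10.1.2.3 would match and, under this reading, not surface as net_new_destination, while 192.168.1.1 would.
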
fangs allow list — show all entries. Config-managed ones (from
config/orchestrator.yaml) have cfg…-prefixed IDs.
fangs allow remove <id-prefix> — git-style short ID match.
UI: /ui/allowlist.
When a clean release should join baseline but the run had deviations you've accepted as legitimate:
fangs baseline list -package lodash # see what's in baseline
fangs baseline promote <run-id-prefix>
Promote re-extracts the run's fingerprints (with allowlists applied),
merges them into baseline_fingerprints, marks the run
is_baseline=true, and clears any deviation rows for it.
Without a notifier, deviations sit in the DB waiting for someone to refresh the UI. With one, every run with ≥1 deviation fires one webhook per configured + enabled target.
# Slack
fangs notifier add -name soc-slack \
-url 'https://hooks.slack.com/services/T.../B.../...' \
-template slack
# Discord
fangs notifier add -name soc-discord \
-url 'https://discord.com/api/webhooks/.../...' \
-template discord
# Generic — for SIEM / event bus / Lambda
fangs notifier add -name siem \
-url 'https://intake.internal/fangs' \
-template generic \
-secret-env FANGS_HMAC \
-min-severity high
Knobs:
| Flag | Purpose |
|---|---|
| -template | slack, discord, or generic |
| -secret-env ENV_VAR | env-var name holding HMAC secret (generic targets only) |
| -min-severity | only fire when ≥1 deviation has severity ≥ threshold |
| -headers JSON | extra HTTP headers as JSON object |
| -enabled=false | disable without removing |
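
The -min-severity rule can be sketched as a threshold check over a run's deviation severities. Only "high" appears in this page, so the ladder used here (low, medium, high, critical) is an assumption, as are the function names.

```shell
# Assumed severity ladder; only "high" is confirmed by the flags above.
severity_rank() {
    case "$1" in
        low)      echo 1 ;;
        medium)   echo 2 ;;
        high)     echo 3 ;;
        critical) echo 4 ;;
        *)        echo 0 ;;
    esac
}

# should_fire THRESHOLD SEVERITY...
# Returns 0 when at least one deviation severity meets the threshold.
should_fire() {
    threshold=$(severity_rank "$1")
    shift
    for sev in "$@"; do
        [ "$(severity_rank "$sev")" -ge "$threshold" ] && return 0
    done
    return 1
}
```

Under these assumptions, should_fire high low medium returns non-zero (no webhook), while should_fire high low critical returns zero. A run with no deviations never fires, matching the "≥1 deviation" condition.
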
Verify wiring:
fangs notifier test soc-slack # fires a synthetic message
fangs notifier list
fangs notifier history -run <run_id> # delivery attempts for one run
UI: /ui/notifiers.
See Notifier for retry policy, template internals, HMAC details.
UI overview (/ui/) shows:
- Packages watched / packages ever tracked
- Runs total / runs on baseline
- Open deviations / packages affected
- Lifetime events dropped (sensor ringbuf overflow indicator)
- Runner pool with heartbeat freshness + active-run links
- Recent runs + recent deviations
Prometheus at /metrics — see Metrics for every series.
CLI:
fangs run list -package lodash -limit 20
fangs run show <run-id-prefix>
fangs release list -package lodash

The orchestrator received a scan request but has no runner. Cause: no
fangs-runner is running, or its heartbeat went stale (>90s) and it
got pruned. Restart the runner.
Runner crashed without deregistering. Wait ~90 seconds and the heartbeat pruner evicts. Pruner ticks every 30s.
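
The timing works out as simple arithmetic. This sketch assumes the pruner only evaluates staleness on its 30s ticks. The variable and function names are illustrative and not part of FANGS.

```shell
STALE_AFTER=90   # seconds without a heartbeat before a runner counts as stale
PRUNE_TICK=30    # seconds between pruner passes

# is_stale LAST_HEARTBEAT_EPOCH NOW_EPOCH
# Returns 0 when the heartbeat is older than STALE_AFTER seconds.
is_stale() {
    [ $(( $2 - $1 )) -gt "$STALE_AFTER" ]
}

# Worst case: the runner crosses the stale mark just after a tick,
# so eviction waits for the next pass.
worst_case_eviction() {
    echo $(( STALE_AFTER + PRUNE_TICK ))
}
```

Under that assumption, a crashed runner can linger for up to about 120 seconds, so a dead runner still listed after the 90-second mark is not by itself a sign the pruner is broken.
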
The eBPF ringbuf overflowed during one or more runs. Causes:
- Very-high-throughput sandboxes (e.g. ESM-only packages that open thousands of files at install).
- Slow consumer: the runner's event-stream HTTP POST is backed up. Check runner.log for high batch_send_errors in event streamer closed lines.
- Orchestrator backpressure under heavy concurrent scans.
The ringbuf is 64 MB per probe (compile-time constant). Most cases resolve by tuning sandbox concurrency.
OOM kill. The sandbox hit Memory: 512 MB (default). For packages
with heavy install steps (TypeScript, native bindings), bump memory
in your scan submission's sandbox spec. Per-package memory override
is a v2 item.
Docker daemon couldn't start the container. Usually a permission
problem on /var/run/docker.sock or a Docker daemon misconfiguration.
Check journalctl -u docker.
Two scans tried to share the same cgroup_id — shouldn't happen in practice. Restart the runner.
Check the runner log for attach tracepoint lines. Common causes:
- tracefs not mounted: runner auto-mounts at startup; manual fallback: sudo mount -t tracefs nodev /sys/kernel/tracing
- libssl not loadable: Debian/Ubuntu/Kali libssl is mode 644; fix: sudo chmod +x /usr/lib/x86_64-linux-gnu/libssl.so.3
- kprobe tcp_v4_connect attach failed: symbol absent on very old kernels; sensor logs warn-and-continue, io_uring TCP connects on this path go unobserved
In order of preference:
- Allowlist the recurring noise.
- Promote a known-good run as baseline.
- Tighten path normalization in internal/orchestrator/differ/normalize.go (source edit, rebuild).
SQLite — copy var/lib/fangs/fangs.db while orchestrator isn't
running, or use sqlite3 .backup for hot backups.
Postgres — standard pg_dump / pg_restore.
The DB is the only persistent state. Nothing else needs preserving between hosts.
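
For the SQLite hot-backup path, a minimal sketch that wraps the sqlite3 shell's .backup command and checks the copy afterwards. The function name and destination path are placeholders. Only the source path var/lib/fangs/fangs.db comes from this page.

```shell
# backup_fangs_db SRC DEST
# Takes a hot backup of SRC using the sqlite3 shell's .backup command,
# then verifies the copy before reporting success.
backup_fangs_db() {
    src=$1
    dest=$2
    sqlite3 "$src" ".backup '$dest'" || return 1
    # PRAGMA integrity_check prints "ok" for a healthy database.
    [ "$(sqlite3 "$dest" 'PRAGMA integrity_check;')" = "ok" ]
}
```

For example, backup_fangs_db var/lib/fangs/fangs.db /path/to/backups/fangs.db. A non-zero return means either the backup or the integrity check failed, so the copy should not be trusted.
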
# Orchestrator: SIGINT/SIGTERM cleanly stops the HTTP listener.
sudo systemctl stop fangs-orchestrator
# Runner: SIGTERM cleans up the per-run cgroup parent before exit.
sudo systemctl stop fangs-runner
Hard kill (SIGKILL) leaves orphan cgroups at
/sys/fs/cgroup/.../fangs/<run_id>/. Remove with sudo rmdir once
the container has exited.
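
If several runs were orphaned, the cleanup can be looped. This is a sketch only: the fangs cgroup parent directory is host-specific, so it is passed in as an argument rather than guessed here, and the function name is illustrative.

```shell
# cleanup_orphan_cgroups PARENT_DIR
# PARENT_DIR is the fangs cgroup directory on your host (the one that
# contains the per-run <run_id> directories).
cleanup_orphan_cgroups() {
    parent=$1
    for dir in "$parent"/*/; do
        [ -d "$dir" ] || continue   # glob matched nothing
        # rmdir fails on a cgroup that still has live processes,
        # so a container that is still running is left alone.
        if rmdir "$dir" 2>/dev/null; then
            echo "removed $dir"
        else
            echo "skipped $dir (still in use)" >&2
        fi
    done
}
```

Run it from a root shell, since sudo cannot invoke a shell function directly, and preferably while the runner is stopped so that no new run directory appears mid-loop.
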
Raw events are auto-pruned after -retention-days (default 90).
Pruner runs daily; first prune fires 30s after orchestrator startup
so operators can verify it's wired.
Deviation-evidence events are PINNED forever — the "click for evidence" link on historical findings keeps working past the retention horizon.
Baselines never expire. They're the load-bearing state.
Disable pruning entirely: -retention-days=0.
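
The retention rules combine into one predicate. In the sketch below, the inputs (event age in days, a pinned flag) and the function name are assumptions for illustration. Whether the real cutoff is strictly "older than" or "at least" the retention window is not stated here, so the sketch uses strictly older.

```shell
# should_prune AGE_DAYS PINNED [RETENTION_DAYS]
# AGE_DAYS:       age of the raw event in days
# PINNED:         1 if the event is evidence for a deviation, else 0
# RETENTION_DAYS: value of -retention-days (default 90, 0 disables pruning)
should_prune() {
    age=$1
    pinned=$2
    retention=${3:-90}
    [ "$retention" -eq 0 ] && return 1   # pruning disabled
    [ "$pinned" -eq 1 ] && return 1      # deviation evidence is kept forever
    [ "$age" -gt "$retention" ]
}
```

So should_prune 91 0 succeeds, should_prune 91 1 does not because the event is pinned, and should_prune 365 0 0 does not because pruning is disabled.
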