AEGIS v0.1.0
First public release. A small local binary that guards an AI coding agent from the untrusted content it reads — prompt injection, jailbreaks, and infrastructure-impersonation (content that doesn't look like an attack but pushes the agent to act without the user).
How it works
- L1: Aho-Corasick pattern matching (pure Rust, microseconds, runs anywhere, including a Raspberry Pi). Matches known injection/IoC strings, with decode passes for base64, hex, rot13, homoglyphs, and zero-width characters.
- L3: local Qwen3-1.7B judge (two passes via llama.cpp, fully offline). One pass asks "is this attacking the agent?", the other "is it pushing the agent to act without the user?"; a yes to either counts as unsafe. A safe verdict vetoes L1's keyword hits, so a security doc that quotes an injection isn't flagged.
- Degrades gracefully to L1-only where the model can't run.
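
The layering above can be sketched in a few lines of Python. This is an illustration only, not the AEGIS source: the pattern list, the homoglyph map, and the names `decoded_variants`, `l1_hits`, and `is_flagged` are hypothetical, plain substring search stands in for the Aho-Corasick automaton, and the handling of "judge says unsafe but L1 found nothing" is an assumption (the release only states that a safe verdict vetoes L1 hits).

```python
import base64
import binascii
import codecs
import re

# Hypothetical patterns; the real tool ships its own injection/IoC strings.
PATTERNS = ("ignore previous instructions", "disregard the system prompt")

# Zero-width characters are deleted before matching.
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))
# Tiny illustrative homoglyph map (Cyrillic look-alikes -> Latin).
HOMOGLYPHS = str.maketrans({
    "\u0430": "a", "\u0435": "e", "\u043e": "o",
    "\u0440": "p", "\u0441": "c", "\u0456": "i",
})

B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_TOKEN = re.compile(r"\b(?:[0-9a-fA-F]{2}){8,}\b")


def decoded_variants(text):
    """Yield the cleaned text plus each decoded view of it."""
    cleaned = text.translate(ZERO_WIDTH).translate(HOMOGLYPHS)
    yield cleaned
    yield codecs.decode(cleaned, "rot_13")
    for token in B64_TOKEN.findall(cleaned):
        try:
            raw = base64.b64decode(token, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64, skip this token
        yield raw.decode("utf-8", "ignore")
    for token in HEX_TOKEN.findall(cleaned):
        yield bytes.fromhex(token).decode("utf-8", "ignore")


def l1_hits(text):
    """Return the set of patterns found in any decoded view (substring stand-in)."""
    hits = set()
    for view in decoded_variants(text):
        lowered = view.lower()
        hits.update(p for p in PATTERNS if p in lowered)
    return hits


def is_flagged(hits, judge):
    """Combine L1 hits with the judge verdict.

    judge is None when the model cannot run, otherwise a pair
    (attacking, acting_without_user) with one boolean per judge pass.
    """
    if judge is None:
        return bool(hits)  # L1-only fallback
    attacking, acting_without_user = judge
    # Assumption: the judge decides when available, so a safe verdict
    # vetoes L1 hits and an unsafe verdict flags even with no hits.
    return bool(attacking or acting_without_user)
```

For example, `is_flagged(l1_hits(doc), (False, False))` returns `False` for a security write-up that quotes an injection string, while `is_flagged(l1_hits(doc), None)` would flag the same file in L1-only mode.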
Held-out evaluation (190 files never used for tuning)
| Recall | Precision | FP-rate | F1 |
|---|---|---|---|
| 82.1% | 95.1% | 4.2% | 88.1% |
Zero false positives on 80 real benign dev and agent-surface files (code with `subprocess`/`eval`, command-heavy skills, MCP configs, security docs). Reproduce with `tests/held_out_eval/`.
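
For readers who want to sanity-check the table, the four metrics follow the standard confusion-matrix definitions, sketched below in Python. The function names are illustrative. The counts in the second half are hypothetical: the release does not state the attack/benign split of the 190 files, and these numbers were chosen only because they sum to 190 and reproduce the rounded figures.

```python
def precision(tp, fp):
    """Share of flagged files that really were attacks."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of attack files that were flagged."""
    return tp / (tp + fn)


def fp_rate(fp, tn):
    """Share of benign files that were wrongly flagged."""
    return fp / (fp + tn)


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# The reported F1 follows from the reported precision and recall.
print(f"{f1(0.951, 0.821):.1%}")  # 88.1%

# Hypothetical counts (NOT from the release), summing to 190 files.
tp, fn, fp, tn = 78, 17, 4, 91
print(f"{recall(tp, fn):.1%}")      # 82.1%
print(f"{precision(tp, fp):.1%}")   # 95.1%
print(f"{fp_rate(fp, tn):.1%}")     # 4.2%
```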
Install
macOS arm64: download `aegis-macos-arm64` below, make it executable with `chmod +x aegis-macos-arm64`, and move it to `/usr/local/bin/aegis`. Then run `brew install llama.cpp && aegis install-models`. Full steps and build-from-source instructions are in the README.
Footprint
Binary 848 KB · judge model ~1.8 GB · ~2.2 GB RAM when the judge runs · nothing leaves the machine.