One command. Zero config. Finds what scanners can't.
AI-Sec uses Claude to reason about your entire codebase — auth flows, business logic, infrastructure — and audit it against OWASP ASVS, MASVS, CIS benchmarks, and CI/CD Top 10. It catches the things regex will never find.
```
/install sergey-ko/ai-sec
```

Then from your project:

```
/audit
```

Or clone the repo directly:

```
git clone https://github.com/sergey-ko/ai-sec.git
```

Then in Claude Code, from your project directory:

```
/audit
```

That's it. AI-Sec auto-detects your stack and runs the relevant checks.
/audit is one-shot. For ongoing security work, use /program — a stateful operator workflow that persists architecture, attack surface, and findings in a .ai-sec/ folder so each run builds on the last.
```
/program init     # Bootstrap: map architecture, enumerate attack surfaces, build scan plan
/program scan     # Execute today's step from the plan, append findings
/program triage   # Rank open findings by exploit × blast radius × reachability
/program report   # Exec-ready snapshot you can forward upward
```

On-demand commands:

```
/program threat-model "adding OAuth login"    # STRIDE analysis before building a feature
/program incident "spike in /api/users/:id"   # Reverse mode: narrow hypotheses for a suspected issue
/program refresh                              # Re-scan architecture, catch drift
/program baseline                             # Freeze pre-existing debt, see only new findings going forward
```

A typical week:

```
Mon  /program init      → .ai-sec/ created: architecture.md, attack-surface.md, scan-plan.md
Mon  /program baseline  → Defer 38 pre-existing findings, start from a clean slate
Tue  /program scan      → Daily diff-check on changed files. 2 new HIGH findings.
Tue  /program triage    → Ranked fix order, top issue: IDOR on /api/orders
     (fix the IDOR)
Wed  /program scan      → IDOR marked [RESOLVED]. 0 new findings.
Thu  /program threat-model "adding webhook endpoint for Stripe"
                        → STRIDE table flags HMAC-on-parsed-body risk before it ships
Fri  /program report    → Monthly exec report: posture, trend, top 5 risks, 3 next steps
```
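The HMAC-on-parsed-body risk mentioned above is worth spelling out. Here is a minimal sketch of the bug class (the verifier, payload, and secret are illustrative, not taken from AI-Sec or Stripe's SDK): signing a re-serialized `req.body` means verification depends on a JSON round-trip, so a semantically identical but byte-different body fails, while verifying the raw request bytes is stable.

```typescript
import * as crypto from "crypto";

// Hypothetical webhook verifier. The safe version hashes the raw bytes
// exactly as received, before any JSON parsing middleware touches them.
function verifySignature(rawBody: Buffer, signature: string, secret: string): boolean {
  const expected = crypto
    .createHmac("sha256", secret)
    .update(rawBody) // raw bytes, never JSON.stringify(JSON.parse(...))
    .digest("hex");
  const a = Buffer.from(expected, "hex");
  const b = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so guard first
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

// The failure mode: a byte-different but semantically identical body.
const secret = "whsec_demo";
const raw = Buffer.from('{"amount": 100,  "currency": "usd"}'); // note the spaces
const sig = crypto.createHmac("sha256", secret).update(raw).digest("hex");

// Re-serializing drops the extra whitespace, changing the signed bytes.
const reserialized = Buffer.from(JSON.stringify(JSON.parse(raw.toString())));

console.log(verifySignature(raw, sig, secret));          // true
console.log(verifySignature(reserialized, sig, secret)); // false: parsed-body HMAC breaks
```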
Scanner output is noisy and ephemeral. .ai-sec/ keeps the living state of your security program:
```
.ai-sec/
├── architecture.md     # Current system map — overwritten by refresh
├── attack-surface.md   # Current surfaces, ranked by exposure
├── scan-plan.md        # Daily/weekly/monthly cadence
├── findings.md         # OPEN findings only (triaged, deduplicated)
├── baseline.md         # Accepted/deferred findings — excluded from future scans
├── history/            # Per-run deltas (what ran, what changed) — audit trail
└── reports/            # Monthly exec reports
```
Delta-first: no duplicate findings across runs, no 400-line reports nobody reads.
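Delta-first filtering reduces to simple set logic. A hypothetical sketch (the finding IDs are made up, and the real state lives in findings.md and baseline.md, not in-memory sets):

```typescript
// Hypothetical delta-first filter: report only findings that are neither
// already open (findings.md) nor accepted debt (baseline.md).
const openFindings = new Set(["AUTH-03", "IDOR-01"]); // from findings.md
const baselined = new Set(["XSS-07", "CSRF-02"]);     // from baseline.md
const thisRun = ["AUTH-03", "XSS-07", "CICD-05"];     // raw scan output

const newFindings = thisRun.filter(
  (id) => !openFindings.has(id) && !baselined.has(id)
);

console.log(newFindings); // ["CICD-05"], the only item worth reading today
```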
| Category | Example | Why Scanners Miss It |
|---|---|---|
| Broken Auth | JWT tokens with 30-day expiry, no rotation | Requires understanding the auth flow |
| IDOR | /api/users/:id returns any user's data | Can't reason about authorization logic |
| Business Logic | Negative prices bypass validation | Requires understanding intent |
| Webhook Forgery | HMAC uses parsed body, not raw bytes | Requires tracing data through middleware |
| CI/CD Supply Chain | Actions pinned to mutable tags, not SHA | Requires understanding CI/CD threat model |
| Infra Drift | EKS API server publicly accessible | Requires reading Terraform + cloud context |
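To make the IDOR row concrete, here is a hedged sketch of that bug class (the route, store, and helper names are hypothetical, not from any audited codebase). The vulnerable handler is syntactically valid code, which is exactly why a pattern-matching scanner sees nothing wrong:

```typescript
type Order = { id: string; ownerId: string; total: number };

const orders = new Map<string, Order>([
  ["o1", { id: "o1", ownerId: "alice", total: 42 }],
]);

// VULNERABLE: trusts the :id path parameter and never checks ownership.
// The flaw is in the authorization logic, not the syntax.
function getOrderVulnerable(orderId: string): Order | undefined {
  return orders.get(orderId);
}

// FIXED: authorize against the authenticated user, not just the id.
function getOrder(orderId: string, userId: string): Order | undefined {
  const order = orders.get(orderId);
  // Return "not found" for both cases to avoid an existence oracle.
  if (!order || order.ownerId !== userId) return undefined;
  return order;
}

console.log(getOrderVulnerable("o1")?.ownerId); // "alice": any caller gets it
console.log(getOrder("o1", "mallory"));         // undefined: ownership enforced
```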
| Capability | AI-Sec | Snyk | Semgrep | CodeQL |
|---|---|---|---|---|
| Understands auth flows | ✅ | ❌ | ❌ | |
| Detects business logic flaws | ✅ | ❌ | ❌ | ❌ |
| IDOR detection | ✅ | ❌ | | |
| Webhook forgery | ✅ | ❌ | ❌ | ❌ |
| CI/CD supply chain | ✅ | ✅ | ❌ | |
| Infrastructure drift | ✅ | ❌ | ❌ | ❌ |
| Framework compliance matrix | ✅ | ❌ | ❌ | ❌ |
| Zero config | ✅ | ❌ | ❌ | ❌ |
| Free & open source | ✅ | Freemium | ✅ | ✅ |
Why the difference? Traditional scanners pattern-match against known CVEs and syntax rules. AI-Sec reads your code the way a security engineer does — understanding what it's supposed to do, then finding where it doesn't.
```
/audit
│
├─ 1. DETECT     What's in this repo?
│                → Next.js + GitHub Actions + Terraform + Dockerfile
│
├─ 2. DISPATCH   Run specialized agents in parallel
│      ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│      │ web-app-auditor │ │  cicd-auditor   │ │  infra-auditor  │
│      │  OWASP ASVS L2  │ │  CI/CD Top 10   │ │ CIS Benchmarks  │
│      └────────┬────────┘ └────────┬────────┘ └────────┬────────┘
│               │                   │                   │
├─ 3. AUDIT     Each agent: Enumerate → Check → Test → Report
│               │                   │                   │
│               ▼                   ▼                   ▼
└─ 4. REPORT    ai-sec-report.md
     ├── 12 findings (2 Critical, 4 High, 3 Medium, 2 Low, 1 Info)
     └── ASVS compliance matrix (47 checks: 31 PASS, 12 FAIL, 4 N/A)
```
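The DETECT step amounts to mapping repository signals to auditors. A simplified, hypothetical illustration follows; the real detection lives in the agent prompts, and these file-to-agent mappings are assumptions for the sake of the sketch:

```typescript
// Hypothetical stack detection: map marker files to the auditors to dispatch.
const markers: Record<string, string[]> = {
  "web-app-auditor": ["next.config.js", "package.json"],
  "cicd-auditor": [".github/workflows", "Dockerfile"],
  "infra-auditor": ["main.tf"],
};

function detectAgents(repoFiles: string[]): string[] {
  const present = new Set(repoFiles);
  return Object.entries(markers)
    .filter(([, files]) => files.some((f) => present.has(f)))
    .map(([agent]) => agent);
}

console.log(
  detectAgents(["next.config.js", ".github/workflows", "main.tf", "Dockerfile"])
);
// → ["web-app-auditor", "cicd-auditor", "infra-auditor"]
```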
Five specialized auditors. Each follows a 4-phase process: Enumerate → Check → Test → Report.
| Agent | Framework | Scope |
|---|---|---|
| `web-app-auditor` | OWASP ASVS v4.0.3 L2 | Auth, sessions, access control, input validation, config |
| `api-auditor` | ASVS V13 + V2.10 | REST/GraphQL, webhooks, rate limiting, schema validation |
| `cicd-auditor` | OWASP CI/CD Top 10 | GitHub Actions, Docker, deps, secrets, branch protection |
| `infra-auditor` | CIS Benchmarks | K8s, AWS/GCP/Azure, Terraform, IAM, networking |
| `mobile-auditor` | OWASP MASVS v2.1 L1 | Storage, crypto, auth, network, platform security |
Run individually:
```
Claude, run the web-app-auditor agent on this codebase
```
Every finding includes severity, CVSS, framework reference, evidence, and actionable fix:
### [HIGH] AUTH-03: JWT tokens never expire
**CVSS:** 7.5 (AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N)
**Framework:** ASVS V3.5.3
**Component:** src/auth/jwt.service.ts:42
JWT tokens issued with no expiration claim. A stolen token grants permanent access.
**Evidence:**

```typescript
const token = jwt.sign({ userId }, SECRET);
// No expiresIn — token never expires
```

**Recommendation:**

```typescript
const token = jwt.sign({ userId }, SECRET, { expiresIn: '15m' });
// Add refresh token rotation for long-lived sessions
```

AI-Sec supports a findings-registry.yaml to track findings across audit cycles:
```
/track   # Initialize or update the findings registry
```

The registry tracks each finding's lifecycle — when it was first found, current status, and automated verification patterns that confirm whether a fix has been applied. No more re-triaging the same findings every sprint.
See methodology/findings-registry-template.yaml for the schema.
Add AI-Sec to your GitHub Actions to get security findings as PR review comments — automatically, on every pull request.
1. Generate a token (one time):

```
claude setup-token
# → copies sk-ant-oat01-... to clipboard
```

2. Add it to your GitHub repo secrets as CLAUDE_CODE_OAUTH_TOKEN.

3. Create .github/workflows/ai-sec.yml:
```yaml
name: AI-Sec Security Review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  security-audit:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          prompt: |
            Run a security audit on the changes in this PR.
            Focus on the diff — check for auth issues, injection,
            business logic flaws, and CI/CD security.
            Post findings as review comments on the relevant lines.
```

Uses your existing Claude Max/Pro subscription — no separate API billing.
Tip: For repos using an API key instead, replace `claude_code_oauth_token` with `anthropic_api_key` and use an `ANTHROPIC_API_KEY` secret.
87+ findings across 5 security zones on a VARA-regulated cryptocurrency exchange in production — a platform handling real money under regulatory oversight.
This is the same methodology, packaged as an open-source tool.
The frameworks aren't theoretical. Every checklist in methodology/ comes from real audits on real systems — auth flows protecting real money, CI/CD pipelines deploying to regulated infrastructure, Kubernetes clusters running production workloads.
- Not a replacement for pentesting. White-box code review only — it doesn't test running applications.
- Not a SAST scanner. No regex patterns or CVE matching. It reasons about architecture.
- Not perfect. Catches ~60-70% of what a senior security consultant finds. The gap is why consulting exists.
For expert-reviewed audits, compliance-ready reports, and continuous monitoring — visit ai-sec.pro.
Contributions welcome. The best ways to help:
- Run it on your project and report false positives/negatives
- Add methodology checklists for frameworks we don't cover yet
- Improve agent prompts — the agents live in .claude/agents/
- Share your audit reports (redacted) so we can benchmark accuracy
See the methodology/ folder for the audit frameworks.
MIT — use it, fork it, audit everything.