Team: GHOST//SHELL
Systematic. Standards-Mapped. Audit-Grade.
AuditLLM is a modular red-teaming framework that systematically evaluates LLM safety across the OWASP LLM Top 10 (2025) risk categories, with findings mapped to NIST AI RMF and MITRE ATLAS. It produces audit-grade evidence reports suitable for enterprise security teams, regulators, and AI governance boards.
LLMs are being deployed across Indian BFSI, governance, and healthcare without structured safety evaluation. Current red-teaming is ad hoc, inconsistent, and produces no standardised evidence trail.
AuditLLM provides three layers:
- Attack Library — 30+ adversarial test cases across 10 OWASP categories
- Evaluation Engine — Automated execution + LLM-as-judge scoring
- Audit Dashboard — Risk heatmaps, model comparisons, NIST mapping, exportable reports
# 1. Clone and install
cd auditllm
pip install -r requirements.txt
# 2. Set API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."
# 3. Dry run — list all attacks
python core/harness.py --dry-run
# 4. Run full scan
python core/harness.py
# 5. Launch dashboard
streamlit run dashboard.py# Scan specific model
python core/harness.py --model gpt-4o-mini
# Scan specific OWASP category
python core/harness.py --category LLM01
# Filter by severity
python core/harness.py --severity critical
# Custom output path
python core/harness.py --output reports/my_scan.jsonauditllm/
├── attacks/
│ └── taxonomy.yaml # Full attack taxonomy (OWASP-mapped)
├── core/
│ └── harness.py # Red team execution engine
├── data/
│ └── scan_results.json # Scan output (generated)
├── reports/ # Exported reports
├── config.yaml # Target models & settings
├── dashboard.py # Streamlit visualisation
├── requirements.txt
└── README.md
| Framework | Coverage |
|---|---|
| OWASP LLM Top 10 | LLM01–LLM10 (all categories) |
| NIST AI RMF | GOVERN, MAP, MEASURE, MANAGE functions |
| MITRE ATLAS | Adversarial ML threat techniques |
- Multi-model: Test OpenAI, Anthropic, Groq, or any OpenAI-compatible endpoint
- LLM-as-Judge: Automated pass/fail scoring with reasoning
- Multi-turn attacks: Escalation and context manipulation tests
- Indirect injection: Document-embedded and tool-result injection
- Audit-grade output: Timestamped evidence, OWASP/NIST mapping, exportable JSON
- Interactive dashboard: Risk heatmaps, model comparison, severity breakdown
Ananya Aithal Graduate Programme Associate — Information & Cybersecurity Audit Standard Chartered GBS | Quantum ML Researcher, ICTS-TIFR
MIT — Built for the ISB Cybersecurity & AI Safety Hackathon 2026