🛡️ AuditLLM — Automated LLM Safety Evaluation & Red-Team Framework

Team: GHOST//SHELL

Systematic. Standards-Mapped. Audit-Grade.

What is AuditLLM?

AuditLLM is a modular red-teaming framework that systematically evaluates LLM safety across the OWASP LLM Top 10 (2025) risk categories, with findings mapped to NIST AI RMF and MITRE ATLAS. It produces audit-grade evidence reports suitable for enterprise security teams, regulators, and AI governance boards.

The Problem

LLMs are being deployed across Indian BFSI, governance, and healthcare without structured safety evaluation. Current red-teaming is ad hoc, inconsistent, and produces no standardised evidence trail.

The Solution

AuditLLM provides three layers:

Attack Library — 30+ adversarial test cases across 10 OWASP categories
Evaluation Engine — Automated execution + LLM-as-judge scoring
Audit Dashboard — Risk heatmaps, model comparisons, NIST mapping, exportable reports

Quick Start

# 1. Clone and install
cd auditllm
pip install -r requirements.txt

# 2. Set API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GROQ_API_KEY="gsk_..."

# 3. Dry run — list all attacks
python core/harness.py --dry-run

# 4. Run full scan
python core/harness.py

# 5. Launch dashboard
streamlit run dashboard.py

Run Options

# Scan specific model
python core/harness.py --model gpt-4o-mini

# Scan specific OWASP category
python core/harness.py --category LLM01

# Filter by severity
python core/harness.py --severity critical

# Custom output path
python core/harness.py --output reports/my_scan.json

Project Structure

auditllm/
├── attacks/
│   └── taxonomy.yaml        # Full attack taxonomy (OWASP-mapped)
├── core/
│   └── harness.py           # Red team execution engine
├── data/
│   └── scan_results.json    # Scan output (generated)
├── reports/                  # Exported reports
├── config.yaml              # Target models & settings
├── dashboard.py             # Streamlit visualisation
├── requirements.txt
└── README.md

Standards Mapping

Framework	Coverage
OWASP LLM Top 10	LLM01–LLM10 (all categories)
NIST AI RMF	GOVERN, MAP, MEASURE, MANAGE functions
MITRE ATLAS	Adversarial ML threat techniques

Key Features

Multi-model: Test OpenAI, Anthropic, Groq, or any OpenAI-compatible endpoint
LLM-as-Judge: Automated pass/fail scoring with reasoning
Multi-turn attacks: Escalation and context manipulation tests
Indirect injection: Document-embedded and tool-result injection
Audit-grade output: Timestamped evidence, OWASP/NIST mapping, exportable JSON
Interactive dashboard: Risk heatmaps, model comparison, severity breakdown

Author

Ananya Aithal Graduate Programme Associate — Information & Cybersecurity Audit Standard Chartered GBS | Quantum ML Researcher, ICTS-TIFR

License

MIT — Built for the ISB Cybersecurity & AI Safety Hackathon 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🛡️ AuditLLM — Automated LLM Safety Evaluation & Red-Team Framework

What is AuditLLM?

The Problem

The Solution

Quick Start

Run Options

Project Structure

Standards Mapping

Key Features

Author

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
attacks		attacks
core		core
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config.yaml		config.yaml
dashboard.py		dashboard.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

🛡️ AuditLLM — Automated LLM Safety Evaluation & Red-Team Framework

What is AuditLLM?

The Problem

The Solution

Quick Start

Run Options

Project Structure

Standards Mapping

Key Features

Author

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages