GenAI Red Team Handbook

The GenAI Red Team Initiative Repository is part of the OWASP GenAI Security Project. It is a companion for the GenAI Red Team Initiative documents, such as the GenAI Red Teaming Handbook.

This repository provides a collection of resources, sandboxes, and examples designed to facilitate Red Teaming exercises for Generative AI systems. It aims to help security researchers and developers test, probe, and evaluate the safety and security of LLM applications.

Directory Structure

.
├── exploitation
│   ├── agent0
│   ├── example
│   ├── garak
│   ├── LangGrinch
│   └── promptfoo
└── sandboxes
    ├── RAG_local
    ├── llm_local
    └── llm_local_langchain_core_v1.2.4

Architecture

graph LR
    subgraph "Exploitation Environment<br/>(uv Env or Podman Container)"
        Tool["Exploitation Tool<br/>(Scripts, Scanners, Agents)"]
        Config["Configuration<br/>(Prompts, Settings)"]
    end

    subgraph "Sandbox Container"
        UI["Interface<br/>(Gradio :7860)"]
        API["API Gateway<br/>(FastAPI :8000)"]
        Logic["Application Logic"]
    end

    Config --> Tool
    Tool -->|Attack Request| UI
    UI -->|Internal API Call| API
    API --> Logic
    Logic --> API
    API --> UI
    UI -->|Response| Tool

System Requirements

This project supports Linux and macOS. Windows users are encouraged to use WSL2 (Windows Subsystem for Linux).

Required Tools

Podman
Ollama
Python 3.10+
uv
Make

Required for Promptfoo:

Node.js (v18+)
npx

Installation Instructions

macOS

Install Dependencies:
```
brew install podman ollama node make
```

Initialize Podman Machine:

podman machine init
podman machine start

Linux (Ubuntu/Debian)

Install Dependencies:

sudo apt-get update
sudo apt-get install -y podman nodejs npm make

Install Ollama:

curl -fsSL https://ollama.com/install.sh | sh

Install uv:
```
pip install uv
```

Verification

Verify the installation by checking the versions of the installed tools:

podman version
ollama --version
node --version
make --version
uv --version

Index of Sub-Projects

`sandboxes/`

Sandboxes Overview
- Summary: The central hub for all available sandboxes. It explains the purpose of these isolated environments and lists the available options.
RAG Local Sandbox
- Summary: A comprehensive Retrieval-Augmented Generation (RAG) sandbox. It includes a mock Vector Database (Pinecone compatible), mock Object Storage (S3 compatible), and a mock LLM API. Designed for testing vulnerabilities like embedding inversion and data poisoning.
- Sub-guides:
  - Adding New Mock Services: Guide for extending the sandbox with new API mocks.
LLM Local Sandbox
- Summary: A lightweight local sandbox that mocks an OpenAI-compatible LLM API using Ollama. Ideal for testing client-side interactions and prompt injection vulnerabilities without external costs.
- Sub-guides:
  - Adding New Mock Services: Guide for extending the sandbox with new API mocks.
LangChain Local Sandbox (Vulnerable)
- Summary: A specialized version of the local sandbox configured with langchain-core v1.2.4 to demonstrate CVE-2025-68664 (LangGrinch). It contains an intentional insecure deserialization vulnerability for educational and testing purposes.

`exploitation/`

Red Team Example
- Summary: Demonstrates a red team operation against a local LLM sandbox. It includes an adversarial attack script (attack.py) targeting the Gradio interface (port 7860). By targeting the application layer, this approach tests the entire system—including the configurable system prompt—providing a more realistic assessment of the sandbox's security posture compared to testing the raw LLM API in isolation.
Agent0 Red Team Example
- Summary: A complete, end‑to‑end, agentic example. Agent0 orchestrates multiple autonomous agents to attack the sandbox, demonstrating complex, multi-step adversarial workflows.
  
  There are options for running it: through the UI (manual prompt interaction) and through the Makefile (programmatic run based on pre-defined prompts).
  
  The set of pre-defined prompts include prompts for testing vulnerabilities from based on OWASP Top 10, OWASP Top 10 for LLM Applications, and Mitre Atlas Matrix.
Garak Scanner Example
- Summary: A comprehensive vulnerability scan using Garak. It probes the sandbox for a wide range of weaknesses, including prompt injection, hallucination, and insecure output handling, mapping results to the OWASP Top 10.
Promptfoo Scanner Example
- Summary: A powerful red teaming setup using Promptfoo. It runs automated probes to identify vulnerabilities such as PII leakage and prompt injection, providing detailed reports and regression testing capabilities.
LangGrinch Exploitation
- Summary: A dedicated exploitation module for CVE-2025-68664 in the LangChain sandbox. It demonstrates how to use prompt injection to force the LLM into generating a malicious JSON payload, which is then insecurely deserialized by the application to leak environment secrets.

Contribution Guide

Please refer to CONTRIBUTING.md for instructions on how to add new sandboxes and exploitation examples.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github		.github
exploitation		exploitation
sandboxes		sandboxes
tutorials		tutorials
.gitignore		.gitignore
.mailmap		.mailmap
CONTRIBUTING.md		CONTRIBUTING.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GenAI Red Team Handbook

Directory Structure

Architecture

System Requirements

Required Tools

Installation Instructions

macOS

Linux (Ubuntu/Debian)

Verification

Index of Sub-Projects

`sandboxes/`

`exploitation/`

Contribution Guide

About

Uh oh!

Uh oh!

Contributors 3

Uh oh!

Languages

GenAI-Security-Project/GenAI-Red-Team-Lab

Folders and files

Latest commit

History

Repository files navigation

GenAI Red Team Handbook

Directory Structure

Architecture

System Requirements

Required Tools

Installation Instructions

macOS

Linux (Ubuntu/Debian)

Verification

Index of Sub-Projects

sandboxes/

exploitation/

Contribution Guide

About

Resources

Contributing

Uh oh!

Stars

Watchers

Forks

Uh oh!

Contributors 3

Uh oh!

Languages

`sandboxes/`

`exploitation/`