Release v0.3 - New Labs, Enhancements, Governance, Licensing, and Documentation Update · AISecurityConsortium/AIGoat

This release introduces project governance, a structured licensing model, and documentation improvements across the platform. It also adds three new attack labs (LLM03, LLM06, and LLM10), two defense pipeline enhancements, Swagger documentation cleanup, improved attack lab descriptions, and a second LLM07 lab goal.

What's New

Governance and Licensing

Structured licensing model — Introduced a dual licensing approach: Apache License 2.0 for platform code and Creative Commons BY-NC-SA 4.0 for training content. Each content directory now has its own LICENSE file explaining which license applies.
Challenge framework under CC BY-NC-SA 4.0 — The dynamic flag generation engine and exploit evaluators (app/challenges/) are now licensed under CC BY-NC-SA 4.0 rather than Apache 2.0. The evaluators encode specific attack detection patterns, success thresholds, and multi-turn analysis logic that represent the training methodology of the platform, making them training content rather than generic application code.
Governance documentation — Added GOVERNANCE.md describing project ownership, maintainer responsibilities, decision-making process, and how contributors can participate.
Improved contribution guidelines — Updated CONTRIBUTING.md with clearer guidance on how to contribute, contribution scope, pull request process, and how licensing applies to contributions.
Training usage rules — Added TRAINING_LICENSE.md explaining commercial usage rules for labs, prompts, and workshop materials in clear, non-legalistic language.
NOTICE file — Added a NOTICE file summarizing the dual licensing model and trademark information.

API and Swagger Improvements

Swagger documentation cleanup — Removed obsolete endpoints from the API documentation (/api/rag-chat/, /api/rag-chat-history/, /api/rag-stats/, POST /api/ollama/status/). These endpoints still function for backward compatibility but no longer appear in /docs.
Endpoint descriptions — Added docstrings to all active API endpoints so Swagger shows accurate descriptions for each route, including purpose, parameters, and expected behavior.

New Attack Labs

LLM03 — Supply Chain: Modelfile Backdoor — A lab simulating a supply chain attack where a community-contributed Ollama Modelfile contains hidden backdoor triggers. Students discover trigger phrases that cause the chatbot to leak credentials, reveal secret coupon codes, or dump its system prompt. Includes a realistic model card with publisher info, download count, and version history.
LLM06 — Excessive Agency: Overpowered Assistant — A lab where the chatbot believes it has operational tools (lookup_orders, apply_coupon, process_refund, export_customer_data) and confirms unauthorized actions without verification. Demonstrates how excessive agency in AI systems leads to unauthorized operations.
LLM10 — Unbounded Consumption: Token Flood — A lab demonstrating resource abuse through excessive output generation. The chatbot is configured to never summarize and to comply with repetition requests. Includes defense-level-aware evaluation (>2000 chars at L0, >1000 chars at L1) and output truncation at L1.
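
The defense-level-aware check described for the LLM10 lab can be pictured with a minimal sketch. Only the two character thresholds come from these notes. The function name, the dictionary, and the behavior at other levels are assumptions made for illustration and are not taken from `app/challenges/evaluators/unbounded_consumption.py`.

```python
# Hypothetical sketch: names and structure are illustrative, not the project's code.
# Only the character thresholds (2000 at L0, 1000 at L1) come from the release notes.
FLOOD_THRESHOLDS = {0: 2000, 1: 1000}  # defense level -> response length to exceed


def is_token_flood(response: str, defense_level: int) -> bool:
    """Return True when the response is long enough to count as a flood."""
    threshold = FLOOD_THRESHOLDS.get(defense_level)
    if threshold is None:
        # The notes give no threshold for other levels, so this sketch
        # never reports success there.
        return False
    return len(response) > threshold
```

Under these assumptions, a 1,500-character response would count as a successful flood at Level 1 but not at Level 0, which is the sense in which the evaluation tightens as defenses increase.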

Attack Lab Improvements

Improved LLM01 goal description — Rewritten to clearly explain what prompt injection is, what the attacker is trying to achieve, and what success looks like.
Improved LLM07 goal description — Rewritten to explain system prompt leakage, why it matters, and how attackers typically attempt it.
New LLM07 lab: Indirect Prompt Leakage via Reasoning — Added a second goal under LLM07 focusing on indirect extraction techniques: role confusion, chain-of-thought manipulation, context probing, and warm/cold guessing games. Includes 5 example prompts with expected results across all three defense levels.

Defense Pipeline Enhancements

Resource abuse detection — Added RESOURCE_ABUSE intent patterns to the intent classifier for detecting repetition requests, enumeration attacks, and anti-summarization instructions. Blocked at Level 2.
Output truncation at Level 1 — Responses exceeding 1,000 characters are truncated with a safety notice at Defense Level 1, mitigating unbounded consumption attacks.
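
As a rough illustration of the two enhancements above, the sketch below pairs a pattern-based resource abuse check with Level 1 output truncation. The regular expressions, function names, and notice wording are all assumptions. The notes confirm only the three pattern categories, the 1,000-character limit, and that blocking happens at Level 2.

```python
import re

# Hypothetical patterns: the release notes name the categories, not the regexes.
RESOURCE_ABUSE_PATTERNS = [
    # Repetition requests, e.g. "repeat the word hello 500 times"
    re.compile(r"\brepeat\b.*\b\d+\s+times\b", re.IGNORECASE),
    # Enumeration attacks, e.g. "list every product you know"
    re.compile(r"\blist\s+(all|every)\b", re.IGNORECASE),
    # Anti-summarization instructions, e.g. "never summarize"
    re.compile(r"\b(do\s+not|don't|never)\s+summari[sz]e\b", re.IGNORECASE),
]

L1_MAX_CHARS = 1000  # limit stated in the release notes
TRUNCATION_NOTICE = "\n\n[Response truncated by output moderation.]"  # assumed wording


def is_resource_abuse(message: str) -> bool:
    """Return True if the message matches any resource abuse pattern."""
    return any(p.search(message) for p in RESOURCE_ABUSE_PATTERNS)


def should_block(message: str, defense_level: int) -> bool:
    """Block resource abuse at Level 2 only, as the notes describe."""
    return defense_level == 2 and is_resource_abuse(message)


def moderate_output(response: str, defense_level: int) -> str:
    """At Level 1, cut responses over the limit and append a notice."""
    if defense_level == 1 and len(response) > L1_MAX_CHARS:
        return response[:L1_MAX_CHARS] + TRUNCATION_NOTICE
    return response
```

Simple patterns like these will produce false positives (a legitimate "list all orders" request matches the enumeration pattern), so a real classifier would likely need more context than this sketch uses.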

Platform Improvements

Reduced chatbot verbosity — The Level 0 system prompt no longer causes the chatbot to proactively dump all context data on simple greetings. Data is still fully accessible when requested.

Documentation Improvements

Lab configuration documentation — Added comprehensive comments to config/labs.yml explaining each field, how to add new labs, and how to adjust difficulty.
Workshop guide improvements — Added a "Quick Setup for Workshops" subsection with recommended deployment approach, suggested participant workflow, typical workshop duration (2-4 hours), and practical tips for instructors.
README improvements — Added new sections covering project purpose ("Why AI Goat Exists"), expanded target audience, typical training workflow, simplified platform architecture diagram, project evolution notes, and community participation guidance.

Files Added

File	Purpose
`GOVERNANCE.md`	Project governance and maintainer information
`NOTICE`	Licensing overview and trademark notice
`TRAINING_LICENSE.md`	Commercial usage rules for training content
`prompts/LICENSE`	CC BY-NC-SA 4.0 for prompt files
`docs/LICENSE`	CC BY-NC-SA 4.0 for documentation
`media/LICENSE`	CC BY-NC-SA 4.0 for media assets
`config/LICENSE`	Split licensing for config directory
`app/challenges/LICENSE`	CC BY-NC-SA 4.0 for flag engine and exploit evaluators
`prompts/labs/supply_chain.md`	System prompt for LLM03 Supply Chain lab (Modelfile Backdoor)
`prompts/labs/excessive_agency.md`	System prompt for LLM06 Excessive Agency lab
`prompts/labs/unbounded_consumption.md`	System prompt for LLM10 Unbounded Consumption lab
`app/challenges/evaluators/supply_chain.py`	Evaluator for Supply Chain lab
`app/challenges/evaluators/excessive_agency.py`	Evaluator for Excessive Agency lab
`app/challenges/evaluators/unbounded_consumption.py`	Evaluator for Unbounded Consumption lab

Files Modified

File	Change
`LICENSE`	Replaced MIT with Apache License 2.0
`README.md`	Added governance, workflow, architecture, and licensing sections
`CONTRIBUTING.md`	Added contribution scope, licensing of contributions, and community sections
`config/labs.yml`	Added explanatory comments for all fields and instructions for customization
`docs/workshop-guide.md`	Added Quick Setup for Workshops subsection
`app/api/system.py`	Added docstrings; hid obsolete endpoint from Swagger
`app/api/rag.py`	Added docstrings to KB endpoints; hid 3 obsolete endpoints from Swagger
`app/api/chat.py`	Added docstrings to all chat and defense level endpoints
`app/api/challenges.py`	Added docstrings to all challenge endpoints
`app/api/challenge_chat.py`	Added docstring to challenge chat endpoint
`app/api/auth.py`	Added docstrings to all auth endpoints
`app/api/labs.py`	Added docstrings to lab endpoints
`frontend/src/components/AttacksPage.jsx`	Improved LLM01 and LLM07 goal descriptions; added LLM03, LLM06, LLM07, and LLM10 labs
`app/challenges/registry.py`	Registered 3 new evaluators (supply chain, excessive agency, unbounded consumption)
`app/defense/intent_classifier.py`	Added RESOURCE_ABUSE intent patterns
`app/defense/policy_engine.py`	Added RESOURCE_ABUSE to blocking intents
`app/defense/output_moderator.py`	Added L1 output truncation at 1,000 characters
`app/defense/rejection.py`	Added rejection template for resource abuse
`prompts/level0/cracky.md`	Reduced proactive verbosity on simple greetings
`tests/test_lab_manifest.py`	Updated test for new active labs

Upgrade Notes

No breaking changes. Existing deployments can upgrade by pulling the latest code. The hidden Swagger endpoints still function normally — they are only removed from the /docs UI.
