feat: Add Crucible agentic challenge attack notebook #333

Merged
rdheekonda merged 2 commits into main from feat/crucible-agentic-challenge-attacks on Mar 6, 2026

rdheekonda (Contributor) commented Mar 6, 2026

Summary

  • Adds examples/airt/agentic_red_teaming_attacks.ipynb demonstrating automated adversarial attacks against three Crucible agentic challenges (toolshed, webwhisper, vaultguard)
  • Uses TAP, GOAT, and Crescendo attack strategies with diverse transforms (perturbation, role-play, authority exploitation)
  • Includes real-time console() output, contains_crucible_flag scoring, and recommended mitigations

Test plan

  • Run notebook end-to-end against platform.dreadnode.io
  • Verify all 6 attack cells execute and produce results
  • Confirm contains_crucible_flag scorer validates captured flags
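The scorer check in the test plan can be illustrated with a small stand-alone sketch. This is not the `contains_crucible_flag` implementation from the notebook; it assumes that Crucible flags are Fernet-style tokens beginning with `gAAAAA`, and the names `FLAG_PATTERN`, `find_flag_candidates`, and `looks_like_flag` are hypothetical.

```python
import re

# Assumption: flags are Fernet-style tokens that start with "gAAAAA".
# The minimum length of 20 trailing characters is an arbitrary guard
# against matching the bare prefix, not a documented rule.
FLAG_PATTERN = re.compile(r"gAAAAA[A-Za-z0-9_\-=]{20,}")


def find_flag_candidates(text: str) -> list[str]:
    """Return every substring of `text` that looks like a flag."""
    return FLAG_PATTERN.findall(text)


def looks_like_flag(text: str) -> float:
    """Score a model response: 1.0 if a flag-like token appears, else 0.0."""
    return 1.0 if find_flag_candidates(text) else 0.0
```

A pattern match only shows that a response contains something flag-shaped. Confirming that a captured flag is valid would still require submitting it to the platform.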

🤖 Generated with Claude Code

Generated Summary:

  • Added a new Jupyter Notebook for Agentic AI Red Teaming utilizing the AIRT framework.
  • Implemented automated adversarial attacks against multiple challenges in the Dreadnode Crucible platform.
  • Included various attack methodologies: TAP (beam search), GOAT (graph exploration), Crescendo (progressive escalation).
  • Defined targets and scoring mechanisms for each challenge:
    • toolshed: DevOps tool misuse.
    • webwhisper: Indirect prompt injection.
    • vaultguard: Multi-agent defense bypass.
  • Provided environmental setup instructions and required API key configurations.
  • Added multiple code cells demonstrating setup for targets, transformations, and scoring.
  • Highlighted the defenses each attack has to work around, such as input filters, system prompts, and other security measures.
  • Enhanced documentation for each challenge outlining objectives and defenses in a structured format.

This summary was generated with ❤️ by rigging
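The generated summary describes TAP as beam search. As a rough illustration of that idea only, the sketch below keeps the highest-scoring candidates at each depth and prunes the rest. It is a generic beam search, not the AIRT or notebook implementation; `beam_search`, `expand`, and `score` are hypothetical names, and in a real attack `expand` would ask an attacker model for prompt refinements while `score` would judge the target's response.

```python
from typing import Callable


def beam_search(
    seed: str,
    expand: Callable[[str], list[str]],
    score: Callable[[str], float],
    beam_width: int = 2,
    depth: int = 3,
) -> tuple[str, float]:
    """Generic beam search: expand the frontier, keep the top `beam_width`."""
    frontier = [seed]
    best = (seed, score(seed))
    for _ in range(depth):
        # Generate every child of every surviving candidate.
        candidates = [child for parent in frontier for child in expand(parent)]
        if not candidates:
            break
        # Prune: keep only the highest-scoring candidates for the next round.
        frontier = sorted(candidates, key=score, reverse=True)[:beam_width]
        top_score = score(frontier[0])
        if top_score > best[1]:
            best = (frontier[0], top_score)
    return best


# Toy stand-ins: each step appends "a" or "b"; the score counts the "a"s.
result = beam_search(
    seed="",
    expand=lambda s: [s + "a", s + "b"],
    score=lambda s: float(s.count("a")),
)
```

The beam width trades coverage for cost: a wider beam explores more candidate prompts per round but needs more calls to the target and the scorer.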

Demonstrates automated adversarial attacks (TAP, GOAT, Crescendo) against
Crucible agentic challenges (toolshed, webwhisper, vaultguard) using the
AIRT framework with diverse transforms and real-time console output.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dreadnode-renovate-bot (bot) added the area/examples (Changes to example code and demonstrations) label Mar 6, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
rdheekonda added this pull request to the merge queue Mar 6, 2026
Merged via the queue into main with commit 48198fd Mar 6, 2026
8 checks passed
rdheekonda deleted the feat/crucible-agentic-challenge-attacks branch March 6, 2026 22:12
Labels

area/examples Changes to example code and demonstrations

1 participant