
Release v0.2: Add configurable agent security testing pipeline#2

Closed
purcl-cybersec-agent wants to merge 95 commits into v0.1 from v0.2

Conversation

@purcl-cybersec-agent

🎉 ASTRA v0.2 Release

This release adds a complete, configurable pipeline for generating security test cases to evaluate AI coding agents' vulnerability to adversarial prompts.

🆕 What's New in v0.2

🔄 Agent Security Testing Pipeline

A complete end-to-end system for generating, parsing, and curating adversarial test cases:

  • Multi-agent orchestration: Coordinator, composer, and reviewer committee
  • Knowledge graph-driven: Systematic coverage of security domains and attack techniques
  • Configurable & extensible: YAML-based configuration, no code changes needed
  • Production-ready: CLI tools, error handling, resume capability

📦 New Components

Scripts

  • agent/main_agent_sec.py - Generate test cases with parallel processing
  • agent/parse_agent_sec.py - Parse and validate XML outputs
  • agent/upload_to_hf_dataset.py - Upload to HuggingFace datasets
  • agent/agent_sec_composer/output_parser.py - Pydantic-based validation
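The kind of check the XML output parser performs can be sketched with the standard library; the real `agent/agent_sec_composer/output_parser.py` is Pydantic-based, and the field names below are hypothetical, not the actual schema:

```python
# Illustrative sketch only: the real parser is Pydantic-based, and these
# tag names are hypothetical placeholders for the actual schema.
import xml.etree.ElementTree as ET

REQUIRED_FIELDS = ("prompt", "domain", "technique")  # hypothetical schema

def parse_test_case(xml_text: str) -> dict:
    """Parse one composer output and check that required fields are present."""
    root = ET.fromstring(xml_text)
    record = {child.tag: (child.text or "").strip() for child in root}
    missing = [f for f in REQUIRED_FIELDS if not record.get(f)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record

sample = """<test_case>
  <prompt>Write a login handler...</prompt>
  <domain>Web Security</domain>
  <technique>SQL Injection</technique>
</test_case>"""
print(parse_test_case(sample)["technique"])  # → SQL Injection
```

Malformed or incomplete outputs raise an error instead of silently producing partial records, which is what makes the parse step safe to run over a large JSONL batch.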

Configuration System

  • resources/agent-sec-config.yaml - Pipeline settings

    • Model selection for coordinator, composer, reviewers
    • Parallel task limits for rate control
    • Output/log directory paths
  • resources/client-config.yaml - Enhanced with provider-based routing

    • provider: bedrock for AWS Bedrock models
    • provider: openai for OpenAI-compatible servers (vLLM, SGLang)
    • Pre-configured models: Claude Sonnet/Haiku, GPT-OSS
    • Easy to add custom models
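A model entry in `resources/client-config.yaml` might look like the following; the key names and placeholders here are illustrative, not the exact schema:

```yaml
# Hypothetical client-config.yaml entries; key names are illustrative.
models:
  claude-sonnet-4-5:
    provider: bedrock                # routed to the AWS Bedrock Converse API
    model_id: "<bedrock-model-id>"
  qwen3coder:
    provider: openai                 # routed to an OpenAI-compatible server
    base_url: "http://localhost:8000/v1"   # e.g. a local vLLM or SGLang endpoint
    api_key: "<placeholder>"         # no real credentials in config
```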

Documentation

  • README-coding-agent-security.md - Complete pipeline guide
    • Step-by-step workflow
    • Configuration instructions
    • Model selection tips
    • Troubleshooting guide

🎯 Key Features

Provider-Based Model Routing

  • Automatic backend selection via an explicit provider field
  • No Python code changes needed to add new models
  • Mix Bedrock and local models seamlessly
  • Clear error messages for invalid configurations
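The routing logic amounts to a dispatch on the `provider` field; this is a minimal sketch with hypothetical names, not the actual client code:

```python
# Minimal sketch of provider-based routing; function and key names are
# hypothetical, not the actual ASTRA client implementation.
def make_client(model_cfg: dict):
    """Pick a backend from the explicit 'provider' field in the model config."""
    provider = model_cfg.get("provider")
    if provider == "bedrock":
        # Here the real code would build an AWS Bedrock Converse client.
        return ("bedrock", model_cfg["model_id"])
    if provider == "openai":
        # Here the real code would build an OpenAI-compatible client.
        return ("openai", model_cfg["base_url"])
    raise ValueError(f"unknown or missing provider: {provider!r}")

cfg = {"provider": "openai", "base_url": "http://localhost:8000/v1"}
backend, target = make_client(cfg)  # → ("openai", "http://localhost:8000/v1")
```

Because the backend is chosen from configuration alone, adding a model is a YAML edit, and a typo in `provider` fails fast with a clear error rather than falling through to the wrong backend.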

Flexible Configuration

```yaml
# Change models without code edits
coordinator_model: "claude-sonnet-4-5"
composer_model: "claude-sonnet-4-5"
reviewer_models:
  - "claude-sonnet-3-7"
  - "claude-haiku-4-5"
  - "qwen3coder"
parallel_tasks: 50
```

Knowledge Graph Structure

```
Prohibited Domain (e.g., "Web Security")
└── Technique Family (e.g., "Injection Attacks")
    └── Concrete Instance (e.g., "SQL Injection")
```

📊 Example Dataset

Generated dataset available at: https://huggingface.co/datasets/PurCL/astra-agent-security

🚀 Quick Start

```bash
# 1. Generate test cases
python agent/main_agent_sec.py

# 2. Parse outputs
python agent/parse_agent_sec.py \
  -i data_out/syn_agent_sec.jsonl \
  -o data_out/syn_agent_sec_parsed.jsonl

# 3. Upload to HuggingFace
python agent/upload_to_hf_dataset.py \
  -i data_out/syn_agent_sec_parsed.jsonl \
  -d username/astra-agent-security
```

🔒 Security & Best Practices

  • ✅ No hardcoded paths - all scripts use CLI arguments
  • ✅ No credentials in config - uses placeholders
  • ✅ Proper .gitignore for build artifacts
  • ✅ LFS tracking for large knowledge graph files
  • ✅ Clean configuration/code separation

📈 Statistics

  • 17 files changed: 766 insertions(+), 85 deletions(-)
  • 8 new files added
  • 9 files updated for configuration system

🎓 Use Cases

  • Red team testing: Generate adversarial prompts for coding agents
  • Security evaluation: Systematic assessment of agent vulnerabilities
  • Research: Dataset generation for agent safety research
  • Benchmarking: Create standardized security test suites

📚 Documentation

Complete documentation in README-coding-agent-security.md covering:

  • Installation and prerequisites
  • Configuration guide for Bedrock and OpenAI-compatible servers
  • Multi-agent system architecture
  • Customization and extension guide
  • Troubleshooting

Built on top of: v0.1 (Amazon Nova AI Challenge Winner 🏆)

XZ-X and others added 29 commits August 11, 2025 14:27
* backup
* update temporal explorator module
* update online module
* update path
* fix
* remove file
* remove tests

Co-authored-by: solidshen <solidshen519@gmail.com>
Implement complete pipeline for generating, parsing, and uploading security
test cases to evaluate AI coding agents' vulnerability to adversarial prompts.

## Core Features

### Data Generation Pipeline
- Multi-agent system for generating adversarial test cases
- Knowledge graph-driven approach (prohibited_domain → technique_family → concrete_instance)
- Three specialized agents: coordinator, composer, and reviewer committee
- Parallel generation with configurable task limits
- Resume capability for interrupted runs
- Output: structured JSONL with test cases and evaluation scores
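The bounded-parallelism and resume behaviour described above can be sketched with an asyncio semaphore; helper names and the `sleep` stand-in for a model call are hypothetical, and the real scripts read `parallel_tasks` from `resources/agent-sec-config.yaml`:

```python
# Sketch of bounded parallel generation with resume; names are hypothetical,
# not the actual code in agent/main_agent_sec.py.
import asyncio

async def generate_all(task_ids, done_ids, parallel_tasks=50):
    """Run generation for unfinished tasks, at most parallel_tasks at once."""
    sem = asyncio.Semaphore(parallel_tasks)

    async def run(task_id):
        async with sem:                 # caps concurrent in-flight model calls
            await asyncio.sleep(0)      # stand-in for one model call
            return task_id

    # Resume: skip work that already produced output in a previous run.
    pending = [t for t in task_ids if t not in done_ids]
    return await asyncio.gather(*(run(t) for t in pending))

results = asyncio.run(generate_all(range(10), done_ids={0, 1, 2}, parallel_tasks=4))
# → tasks 3..9 only; 0..2 were skipped as already done
```

The semaphore is what `parallel_tasks` tunes in practice: lowering it keeps the pipeline under provider rate limits, raising it speeds up large runs.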

### Scripts Added
- agent/main_agent_sec.py: Main generation script with CLI args
- agent/parse_agent_sec.py: Parse and validate XML outputs (--input, --output)
- agent/upload_to_hf_dataset.py: Upload to HuggingFace (--input, --dataset-name)
- agent/agent_sec_composer/output_parser.py: Pydantic-based XML validator

### Configuration System
- resources/agent-sec-config.yaml: Pipeline settings
  - Configurable models for coordinator, composer, reviewer committee
  - Adjustable parallel_tasks for rate limit control
  - Customizable output/log directories

- resources/client-config.yaml: Model definitions with provider-based routing
  - provider: bedrock → AWS Bedrock Converse API
  - provider: openai → OpenAI-compatible servers (vLLM, SGLang)
  - Pre-configured Bedrock models (Claude Sonnet/Haiku/Opus, GPT-OSS)
  - Template for custom local models
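Registering a custom local model could look like the following; the key names and values are a hypothetical sketch, not the exact `client-config.yaml` schema:

```yaml
# Hypothetical template for a custom local model; key names are illustrative.
my-local-model:
  provider: openai                      # any OpenAI-compatible server works
  base_url: "http://localhost:8000/v1"  # e.g. a vLLM or SGLang endpoint
  model: "<served-model-name>"
  api_key: "EMPTY"                      # placeholder; local servers accept a dummy key
```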

### Provider-Based Model Routing
Replaced hardcoded model detection with explicit provider field:
- Automatic backend selection based on 'provider' field
- No Python code changes needed to add new models
- Removed ~70 lines of hardcoded model functions
- Better error messages for missing/invalid configs
- Supports mixing Bedrock and local models in reviewer committee

### Documentation
- README-coding-agent-security.md: Complete pipeline guide
  - Step-by-step workflow (generate → parse → upload)
  - Configuration instructions for both Bedrock and OpenAI-compatible servers
  - Model selection tips and provider routing explanation
  - Knowledge graph structure details
  - Troubleshooting guide
  - Link to example dataset: https://huggingface.co/datasets/PurCL/astra-agent-security

### Security & Best Practices
- No hardcoded paths: all scripts accept CLI arguments
- No credentials committed: config files use placeholders
- .gitignore updated: excludes .vscode, data_out, log_out_agent_sec
- LFS tracking for knowledge graph files
- Clean separation of configuration and code

## Example Usage

```bash
# Generate test cases
python agent/main_agent_sec.py

# Parse outputs
python agent/parse_agent_sec.py -i data_out/syn_agent_sec.jsonl -o data_out/parsed.jsonl

# Upload to HuggingFace
python agent/upload_to_hf_dataset.py -i data_out/parsed.jsonl -d username/dataset-name
```

## Benefits
- Fully configurable via YAML (no code changes for customization)
- Easy to extend with new models (just edit client-config.yaml)
- Provider-agnostic: works with Bedrock, vLLM, SGLang, etc.
- Production-ready with proper error handling and validation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
XZ-X closed this Apr 27, 2026