Releases · nuclide-research/ai-llm-redteam-operator

First functional release, structured as a two-stage agentic workflow: plan writes a Scenario Packet (the policy), and run executes it against one authorized target as a sense-plan-act loop, returning an evidence-ledger Run Report.

Highlights

New run subcommand and agent.py: RedTeamAgent runs the packet through a recon pre-pass, signal evaluation, and attack-chain walking that advances a step only on confirmation, then emits a Run Report (findings, chain outcomes, and evidence ledger). plan is unchanged and remains the default subcommand, so the bare category X form still works.
Four default-safe gates, each lifted only by an explicit flag:
- Authorization: no --authorize reference, no send.
- Dry-run by default: plans every request, sends nothing. A dry-run can never produce a confirmed finding.
- Single-host scope: every request is the target's scheme and host with a packet path appended. Redirects are captured, not followed, so a probe cannot walk the agent off-target.
- Two independent probe gates: a noise cap (--max-aggressiveness) for reads and a separate mutation gate (--allow-writes) that blocks every write method until set.
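The first two gates above can be pictured as an ordered check that runs before any request leaves the machine. The following is a minimal sketch, not the project's code: RunConfig, PlannedRequest, plan_or_send, the field names, and the /health path are all hypothetical, and the network call itself is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunConfig:
    # Hypothetical field names; they are not the tool's real options.
    authorize_ref: Optional[str] = None  # engagement reference, if any
    dry_run: bool = True                 # default-safe: plan only


@dataclass(frozen=True)
class PlannedRequest:
    method: str
    url: str
    sent: bool    # False means the request was planned but never sent
    reason: str   # which gate decided the outcome


def plan_or_send(cfg: RunConfig, method: str, url: str) -> PlannedRequest:
    """Apply the authorization gate, then the dry-run gate."""
    if not cfg.authorize_ref:
        return PlannedRequest(method, url, False, "no authorization reference")
    if cfg.dry_run:
        return PlannedRequest(method, url, False, "dry-run")
    # A real agent would perform the request here; this sketch only records
    # that both gates were lifted.
    return PlannedRequest(method, url, True, "authorized live run")


if __name__ == "__main__":
    cfg = RunConfig(authorize_ref="ENG-2026-014")
    print(plan_or_send(cfg, "GET", "https://10.0.0.5:4000/health"))
```

Because every record carries a sent flag, a later evaluation stage can refuse to confirm anything that was only planned.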
Evidence-backed findings only - a hypothesis is confirmed solely when a sent observation carries a matching status, header, or body token. Restraint is enforced in the loop: one proof artifact per step, a byte cap on every response sample, a global request budget.
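As an illustration of the confirmation rule, the sketch below treats a hypothesis as confirmed only when an observation that was actually sent matches on status, header, or body token. Observation, Signal, confirms, and the 512-byte cap are assumptions made for this example; the real evaluator may combine the checks differently.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MAX_SAMPLE_BYTES = 512  # hypothetical byte cap on every stored sample


@dataclass
class Observation:
    sent: bool                      # False for dry-run (planned) requests
    status: Optional[int] = None
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""

    def __post_init__(self) -> None:
        # Enforce the cap at construction so no larger sample is ever kept.
        self.body = self.body[:MAX_SAMPLE_BYTES]


@dataclass
class Signal:
    status: Optional[int] = None
    header: Optional[str] = None        # header name expected to be present
    body_token: Optional[bytes] = None  # byte string expected in the body


def confirms(obs: Observation, sig: Signal) -> bool:
    """Return True only for a sent observation with at least one match."""
    if not obs.sent or obs.status is None:
        return False
    if sig.status is not None and obs.status == sig.status:
        return True
    if sig.header is not None:
        names = {name.lower() for name in obs.headers}
        if sig.header.lower() in names:
            return True
    if sig.body_token is not None and sig.body_token in obs.body:
        return True
    return False
```

Note that a token located past the cap is never seen in this sketch, which is the cost of bounding the stored sample.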
Optional LLM strategist (OpenAI-compatible endpoint via urllib, no SDK) ranks which chain to pursue first. Off unless an endpoint is supplied; the report records the data egress and warns on remote or plaintext endpoints.
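One way such endpoint warnings could be derived is sketched below using only the standard library. The function name egress_warnings, the warning strings, and the definition of "remote" as "not loopback" are assumptions for illustration, not the tool's documented behaviour.

```python
import ipaddress
from typing import List
from urllib.parse import urlsplit


def _is_loopback(host: str) -> bool:
    """Treat 'localhost' and loopback IP literals as local."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A hostname that cannot be classified is treated as remote.
        return False


def egress_warnings(endpoint: str) -> List[str]:
    """Return warnings to record in the report for an LLM endpoint URL."""
    parts = urlsplit(endpoint)
    warnings: List[str] = []
    if parts.scheme != "https":
        warnings.append("plaintext endpoint: prompt data is sent unencrypted")
    if not _is_loopback(parts.hostname or ""):
        warnings.append("remote endpoint: prompt data leaves this machine")
    return warnings


if __name__ == "__main__":
    for url in ("https://127.0.0.1:8000/v1", "http://llm.example.com/v1"):
        print(url, egress_warnings(url))
```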
Standard library only, including the LLM path.

Security hardening

This release shipped after an adversarial multi-lens review (19 verified findings, 0 uncertain). Notable fixes:

Critical: urllib's default opener auto-followed 3xx redirects off the target host, defeating the single-host scope lock and corrupting the evidence ledger. Fixed with a no-follow opener plus a final-URL assertion; scheme-relative and absolute packet paths are now rejected.
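The general shape of this kind of fix can be sketched with the standard library alone. The names NoFollowRedirects, scoped_url, and fetch are hypothetical, and rejecting paths without a leading slash is an extra assumption of this sketch; the project's actual implementation may differ.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


class NoFollowRedirects(urllib.request.HTTPRedirectHandler):
    """Decline every redirect so urllib raises HTTPError for the 3xx."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def scoped_url(target: str, path: str) -> str:
    """Join the target's scheme and host with a packet path."""
    parsed = urlsplit(path)
    # Reject absolute URLs (scheme set) and scheme-relative paths (netloc set).
    # Requiring a leading slash is an assumption of this sketch.
    if parsed.scheme or parsed.netloc or not path.startswith("/") or path.startswith("//"):
        raise ValueError(f"packet path rejected: {path!r}")
    base = urlsplit(target)
    return f"{base.scheme}://{base.netloc}{path}"


def fetch(url: str, timeout: float = 5.0):
    """Send one request without following redirects; return (status, headers)."""
    opener = urllib.request.build_opener(NoFollowRedirects)
    try:
        with opener.open(url, timeout=timeout) as resp:
            final, status, headers = resp.url, resp.status, dict(resp.headers)
    except urllib.error.HTTPError as err:
        # Non-2xx responses, including 3xx, arrive here. The Location header
        # is captured in the returned headers but never followed.
        final, status, headers = err.url, err.code, dict(err.headers)
    # Final-URL assertion: the response must come from the requested host.
    if urlsplit(final).netloc != urlsplit(url).netloc:
        raise RuntimeError(f"off-target response: {final!r}")
    return status, headers
```

Passing the handler subclass to build_opener replaces the default redirect handler, so no separate step is needed to disable it.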
Signal evaluator: single-direction path scoping, prefer-2xx evidence selection, confirmation gated on 2xx-with-body or a token on a 2xx response, punctuation-safe path extraction.
Findings are evaluated for every test case independent of chain walking, so a confirmable exposure is never dropped when a chain stalls early.
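The separation described here might look like the following sketch, where chain walking and finding evaluation are two independent passes over the same confirmation results. walk_chain, run_report, and the report keys are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List


def walk_chain(steps: List[str], confirmed: Callable[[str], bool]) -> int:
    """Return how many steps advanced, stopping at the first unconfirmed one."""
    advanced = 0
    for step in steps:
        if not confirmed(step):
            break
        advanced += 1
    return advanced


def run_report(
    test_cases: List[str],
    chains: Dict[str, List[str]],
    confirmed: Callable[[str], bool],
) -> dict:
    # Findings come from every test case, regardless of chain progress.
    findings = [case for case in test_cases if confirmed(case)]
    # Chain outcomes are computed separately and cannot hide a finding.
    outcomes = {name: walk_chain(steps, confirmed) for name, steps in chains.items()}
    return {"findings": findings, "chain_outcomes": outcomes}


if __name__ == "__main__":
    hits = {"c"}
    report = run_report(["a", "b", "c"], {"x": ["a", "b", "c"]}, hits.__contains__)
    print(report)
```

In the usage example the chain stalls at its first step, yet test case c still appears under findings.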
Mutation gate made independent of the noise cap (write probes rated medium would otherwise have fired on the cap alone); _agg_rank fails safe on unknown labels.
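A minimal sketch of two independent gates with a fail-safe rank lookup follows. Only the name _agg_rank and the label medium come from the notes above; the low and high labels, the set of write methods, _cap_rank, and probe_allowed are assumptions made for illustration.

```python
_RANKS = {"low": 0, "medium": 1, "high": 2}  # assumed label set
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}  # assumed write methods


def _agg_rank(label: str) -> int:
    """Rank a probe's aggressiveness; unknown labels rank above every cap."""
    return _RANKS.get(str(label).strip().lower(), len(_RANKS))


def _cap_rank(label: str) -> int:
    """Rank the configured cap; an unknown cap permits nothing."""
    return _RANKS.get(str(label).strip().lower(), -1)


def probe_allowed(method: str, aggressiveness: str,
                  max_aggressiveness: str, allow_writes: bool) -> bool:
    # Mutation gate: checked first and independently of the noise cap.
    if method.upper() in WRITE_METHODS and not allow_writes:
        return False
    # Noise cap: applies to every probe, reads and writes alike.
    return _agg_rank(aggressiveness) <= _cap_rank(max_aggressiveness)


if __name__ == "__main__":
    # A medium-rated write stays blocked even under a high cap.
    print(probe_allowed("POST", "medium", "high", allow_writes=False))
```

Ranking the probe label and the cap label with different fallbacks means a typo on either side blocks the probe rather than allowing it.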

Install

pip install -e .
ai-llm-redteam-operator plan platform LiteLLM
ai-llm-redteam-operator run platform LiteLLM --target https://10.0.0.5:4000 --authorize ENG-2026-014   # dry-run

Authorized assessment tooling. The agent performs network activity by design and is gated accordingly. Every scenario assumes explicit, written authorization for the target in scope.

v0.2.0 - Agentic execution layer