FEAT add StrategySequenceAttack compound attack primitive #1819
Merged
hannahwestra25 merged 8 commits on May 28, 2026
rlundeen2
reviewed
May 27, 2026
d281a26 to
1557e09
Compare
444d727 to
236f80d
Compare
Adds a thin AttackStrategy that runs a sequence of inner attacks against
one objective, controlled by a single SequenceMode.
- SequentialAttack chains AttackStrategy items via AttackExecutor and
returns one envelope SequentialAttackResult, preserving the
one-objective to one-AttackResult invariant.
- SequentialAttackItem bundles per-item strategy + seed_group +
adversarial_chat + objective_scorer + memory_labels.
- SequentialAttackResult(AttackResult) exposes metadata-backed
attempt_result_ids listing each inner attempt id in dispatch order
(mirrors TAPAttackResult / CrescendoAttackResult pattern: no new
dataclass fields, safe with to_dict()).
- SequenceMode collapses iteration + outcome aggregation into a single
intent-named knob:
    FIRST_SUCCESS  - stop on SUCCESS; resilient past ERROR/FAILURE (default)
    FIRST_DECISIVE - stop on SUCCESS or ERROR; fail-fast adaptive
    STRICT_ALL     - stop on first non-SUCCESS; required pipeline
    EXHAUSTIVE     - run all; any-success aggregation
    LAST_RESULT    - run all; inherit final item's outcome
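
The stop conditions listed above can be sketched as a small predicate. This is a minimal illustration, not the code from this PR: `Outcome`, `should_stop`, and `attempts_run` are hypothetical stand-ins, and the enum string values other than `first_success` are assumed by analogy.

```python
from enum import Enum
from typing import Sequence


class Outcome(Enum):
    """Hypothetical stand-in for a per-attempt outcome (not PyRIT's own type)."""
    SUCCESS = "success"
    FAILURE = "failure"
    ERROR = "error"
    UNDETERMINED = "undetermined"


class SequenceMode(Enum):
    # Member names come from the commit message; string values are assumed.
    FIRST_SUCCESS = "first_success"
    FIRST_DECISIVE = "first_decisive"
    STRICT_ALL = "strict_all"
    EXHAUSTIVE = "exhaustive"
    LAST_RESULT = "last_result"


def should_stop(mode: SequenceMode, outcome: Outcome) -> bool:
    """Return True if the sequence should stop after an item with this outcome."""
    if mode is SequenceMode.FIRST_SUCCESS:
        return outcome is Outcome.SUCCESS
    if mode is SequenceMode.FIRST_DECISIVE:
        return outcome in (Outcome.SUCCESS, Outcome.ERROR)
    if mode is SequenceMode.STRICT_ALL:
        return outcome is not Outcome.SUCCESS
    # EXHAUSTIVE and LAST_RESULT always run every item.
    return False


def attempts_run(mode: SequenceMode, outcomes: Sequence[Outcome]) -> list[Outcome]:
    """Replay a fixed list of outcomes and return the prefix that would execute."""
    ran: list[Outcome] = []
    for outcome in outcomes:
        ran.append(outcome)
        if should_stop(mode, outcome):
            break
    return ran
```

EXHAUSTIVE and LAST_RESULT share the same stop behaviour here; per the list above they differ only in how the final outcome is chosen.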
Splits the compound primitive out of microsoft#1760 so the adaptive scenario
rewrite can sit on top of it. Uses AttackContext[AttackParameters]
directly per review feedback (no thin context/params subclasses).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
236f80d to
cd690e6
Compare
The Item suffix was generic. Step reads naturally for an ordered sequence, pairs cleanly with the existing StopPolicy/SequenceMode vocabulary, and has no class-name collisions in the codebase.

Cascade renames for internal consistency:
- constructor kwarg `items=` -> `steps=`
- internal `self._items` -> `self._steps` and loop vars
- private method `_run_item_async` -> `_run_step_async` (and its keyword-only `item` parameter -> `step`)
- docstring/example/comment vocabulary updated throughout
- test helpers (`_patch_run_item` -> `_patch_run_step`) and the affected test method names

Intentionally left alone: the `attempt_result_ids` property, `ATTEMPT_RESULT_IDS_KEY` constant, and `metadata[attempt_result_ids]` - each step still produces one attempt, and renaming would also break any persisted-metadata readers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each enum value encodes a *rule* (stop condition + outcome aggregation) governing the sequence, which Policy describes more precisely than the generic Mode. Also harmonizes with the PR description's original `StopPolicy` terminology.

Cascade renames for internal consistency:
- constructor kwarg `mode=` -> `policy=`
- attribute `self._mode` -> `self._policy`
- all `self._mode is SequenceMode.X` comparisons updated
- docstrings / code-block / comment vocabulary updated where `mode` referred to the SequencePolicy concept
- test parametrize tuple name, function parameter, `mode=mode` kwarg in the dispatch call, and `test_default_mode_is_first_success` -> `test_default_policy_is_first_success`

Intentionally left alone: enum member names (FIRST_SUCCESS, FIRST_DECISIVE, STRICT_ALL, EXHAUSTIVE, LAST_RESULT) and their string-value backings (`first_success` etc.) -- changing the string values would break any persisted metadata using them.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- new 4_sequential_attack.{py,ipynb} demonstrating a Crescendo -> PromptSending fallback chain, per-step memory_labels, inspection of inner attempts via attempt_result_ids, and a SequencePolicy reference table
- 0_attack.md gains a Compound Attacks bullet and updates the AttackStrategy mermaid diagram with SequentialAttack
- myst.yml registers the new notebook in the docs nav
- framework.md mentions compound strategies alongside single/multi-turn attacks
- 3_crescendo_attack.{py,ipynb} adds a tip pointing at the new Sequential notebook
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rlundeen2
reviewed
May 28, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
37a1703 to
4bafd79
Compare
Addresses review threads on PR microsoft#1819:
- Rename SequentialAttackStep -> SequentialChildAttack; kwarg steps= -> child_attacks=; helper _run_step_async -> _run_child_attack_async.
- Rename SequencePolicy -> SequenceCompletionPolicy; kwarg policy= -> completion_policy=; saved on the result as both a typed completion_policy field and metadata['completion_policy'] (string) for DB round-trip.
- Add SequentialAttackResult.child_attack_results: list[AttackResult], populated at execute time. child_attack_result_ids (renamed from attempt_result_ids) now derives from it when populated, falling back to metadata['child_attack_result_ids'] for envelopes loaded from the DB. Constant renamed to CHILD_ATTACK_RESULT_IDS_KEY.
- Forward context._attribution through _run_child_attack_async to AttackExecutor.execute_attack_from_seed_groups_async so inner rows carry parent linkage when the compound is nested under a Scenario.
- Envelope now has conversation_id='', last_response=None, last_score=None -- the wrapper owns no conversation; callers use child_attack_results for per-child detail. executed_turns is the sum across child attacks that ran.
- Notebook and 0_attack.md updated to reflect the renamed surface and the new child_attack_results/child_attack_result_ids views.

Six new tests cover: envelope no-conversation invariant, dispatch-order child results, completion-policy round-trip, metadata-fallback on child_attack_result_ids, summed executed_turns, and attribution forwarding. 41/41 pass; pre-commit clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
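
The derive-or-fall-back behaviour of child_attack_result_ids described in this commit can be sketched roughly as below. This is an assumption-laden illustration rather than the merged code: `ChildResult`, `EnvelopeSketch`, and the `attack_result_id` field name are hypothetical, and `executed_turns` is modelled as a computed property only to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Any

# Constant name taken from the commit message; the string value is assumed
# to match the metadata key it mentions.
CHILD_ATTACK_RESULT_IDS_KEY = "child_attack_result_ids"


@dataclass
class ChildResult:
    """Hypothetical stand-in for an inner AttackResult."""
    attack_result_id: str  # assumed field name
    executed_turns: int = 0


@dataclass
class EnvelopeSketch:
    """Hypothetical stand-in for the SequentialAttackResult envelope."""
    child_attack_results: list[ChildResult] = field(default_factory=list)
    metadata: dict[str, Any] = field(default_factory=dict)

    @property
    def child_attack_result_ids(self) -> list[str]:
        # Live envelope: derive the IDs from the in-memory child results.
        if self.child_attack_results:
            return [child.attack_result_id for child in self.child_attack_results]
        # Envelope loaded from storage: only the persisted IDs are available.
        return list(self.metadata.get(CHILD_ATTACK_RESULT_IDS_KEY, []))

    @property
    def executed_turns(self) -> int:
        # Sum of turns across the child attacks that actually ran.
        return sum(child.executed_turns for child in self.child_attack_results)
```

Preferring the live list over metadata keeps the two views from drifting apart while the result is still in memory; the metadata path only matters after a round-trip through the database.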
rlundeen2
approved these changes
May 28, 2026
What does this PR do?
Adds SequentialAttack, a compound AttackStrategy that chains multiple inner attacks against a single objective with a configurable completion policy. Think "try Crescendo first, fall back to PromptSending" without leaving the AttackStrategy layer.

Why?
Right now, if you want to try strategy A then B on the same objective, you either push that logic up to the Scenario layer (which breaks the one-objective → one-result invariant) or write custom glue code. SequentialAttack keeps that branching where it belongs, at the attack level, and stays composable in notebooks.

What's in the box
- SequentialAttack: runs the child attacks and returns one SequentialAttackResult.
- SequentialChildAttack
- SequentialAttackResult: subclass of AttackResult; exposes child_attack_results (live) and child_attack_result_ids (ID-only, for DB round-trip).
- SequenceCompletionPolicy: FIRST_SUCCESS (default), FIRST_DECISIVE, STRICT_ALL, EXHAUSTIVE, or LAST_RESULT.

Outcome aggregation
SUCCESS if any child succeeded, ERROR if all errored, UNDETERMINED if all undetermined, otherwise FAILURE.

Files changed
- pyrit/executor/attack/compound/sequential_attack.py + __init__.py exports
- Notebook (4_sequential_attack.ipynb/.py), updated attack overview (0_attack.md, framework.md), cross-link from Crescendo notebook
- test_sequential_attack.py covering completion policies, per-child overrides, outcome aggregation, label stamping, result shape, and edge cases

What this enables later
- Replacing max_attempts_on_failure with SequentialAttack([same] * N, completion_policy=FIRST_SUCCESS)
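
To make the retry idea concrete, here is a minimal sketch of repeating one attempt N times with first-success stopping, combined with the aggregation rule stated under "Outcome aggregation". It does not use PyRIT: `Outcome`, `aggregate`, `run_first_success`, and `make_flaky` are hypothetical names, and raising on an empty sequence is an assumption the PR does not address.

```python
from enum import Enum
from typing import Callable, Sequence


class Outcome(Enum):
    """Hypothetical stand-in for an attack outcome (not PyRIT's own type)."""
    SUCCESS = "success"
    FAILURE = "failure"
    ERROR = "error"
    UNDETERMINED = "undetermined"


def aggregate(outcomes: Sequence[Outcome]) -> Outcome:
    """SUCCESS if any succeeded, ERROR if all errored,
    UNDETERMINED if all undetermined, otherwise FAILURE."""
    if not outcomes:
        # Assumption: an empty sequence is treated as a caller error.
        raise ValueError("at least one child outcome is required")
    if any(o is Outcome.SUCCESS for o in outcomes):
        return Outcome.SUCCESS
    if all(o is Outcome.ERROR for o in outcomes):
        return Outcome.ERROR
    if all(o is Outcome.UNDETERMINED for o in outcomes):
        return Outcome.UNDETERMINED
    return Outcome.FAILURE


def run_first_success(children: Sequence[Callable[[], Outcome]]) -> tuple[Outcome, int]:
    """Run children in order, stop at the first SUCCESS, and return
    the aggregated outcome plus the number of children that ran."""
    outcomes: list[Outcome] = []
    for child in children:
        outcomes.append(child())
        if outcomes[-1] is Outcome.SUCCESS:
            break
    return aggregate(outcomes), len(outcomes)


def make_flaky(succeed_on: int) -> Callable[[], Outcome]:
    """Build a fake attempt that fails until its `succeed_on`-th call."""
    calls = 0

    def attempt() -> Outcome:
        nonlocal calls
        calls += 1
        return Outcome.SUCCESS if calls == succeed_on else Outcome.FAILURE

    return attempt
```

Calling `run_first_success([make_flaky(3)] * 5)` repeats the same attempt up to five times and stops after the third, which is the retry behaviour the bullet above describes. LAST_RESULT, which inherits the final child's outcome instead, is not covered by this sketch.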