Skip to content

FEAT Breaking: Adding Scenario registry#2115

Merged
rlundeen2 merged 20 commits into
microsoft:mainfrom
rlundeen2:rlundeen2-phase-5-scenario-registry
Jul 2, 2026
Merged

FEAT Breaking: Adding Scenario registry#2115
rlundeen2 merged 20 commits into
microsoft:mainfrom
rlundeen2:rlundeen2-phase-5-scenario-registry

Conversation

@rlundeen2

@rlundeen2 rlundeen2 commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Description

Migrates scenarios onto PyRIT's unified Registry base so the registry — not the Scenario class or the backend — owns building, parameter resolution, and async initialization (via ScenarioRegistry.create_and_initialize_async). All parameter coerce/validate/default/persist logic is consolidated into one shared kernel in pyrit.registry.resolution, and a single ScenarioIdentifier projection replaces the former persistence-only identifier. This is a breaking change to the scenario identifier's wire/serialization shape.

Breaking the old ScenarioIdentifier, which was legacy and not actually an identifier. I thought just deleting it was less confusing.

Full design and rationale: https://gist.github.com/rlundeen2/f7960f7e8973fbb705b1b4bb48d8cdb2

Tests and Documentation

Existing unit tests across models, memory, scenario core, registry, backend, and CLI were updated to the new shapes and pass; pre-commit (ruff format/check, ty, async-suffix) is clean.

rlundeen2 and others added 9 commits June 30, 2026 21:47
Unify ScenarioIdentifier onto ComponentIdentifier and move ScenarioRegistry
from class_registries/ to components/, subclassing the unified Registry.
Update all consumers (backend services, initializers) and tests to the new
Registry API (get_all_registered_class_metadata / get_class_names /
register_class). No back-compat export, following the AttackTechniqueRegistry
precedent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Earlier commits on this branch were formatted with black (line length 88);
the repo standard is ruff-format (line length 120). Reformat the affected
files so they pass the ruff-format pre-commit hook.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The registry now owns the full scenario build lifecycle. New
ScenarioRegistry.create_and_initialize_async(name, *, scenario_params,
scenario_result_id, **initialize_kwargs) builds the scenario via
create_instance, sets its declared parameters via set_params_from_args,
and initializes it via initialize_async -- a single canonical entry point.

The backend scenario_run_service._initialize_scenario_async now delegates
to this method instead of manually chaining create_instance +
set_params_from_args + initialize_async. Run-specific common-parameter
resolution (target, strategies, dataset config) stays in _build_init_kwargs
and is forwarded through **initialize_kwargs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extract the coerce/validate/inject-defaults mapping out of
Scenario.set_params_from_args (plus its _validate_declarations /
_validate_params helpers) into a single registry-owned function,
apply_declared_parameters, in pyrit/registry/resolution.py. Scenario now
only declares its parameters via supported_parameters() and delegates the
mapping, so the programmatic, CLI, and registry paths can't diverge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ctor_args

Rename apply_declared_parameters -> resolve_declared_params and its 'args'
keyword -> 'raw_args' so the two Parameter-contract entry points read as a
parallel pair (resolve_constructor_args / resolve_declared_params). Reframe the
resolution.py module docstring to present both resolve variants and call out
that the Parameter model is their one shared coercion/validation kernel; the
loops differ only in contract source (constructor vs declared list), reference
resolution, and default materialization.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lution layer

Scenario no longer owns bespoke parameter-handling logic: the JSON snapshot/
serializability guard and the resume param-mismatch diff move into
pyrit.registry.resolution as snapshot_params_for_persistence and
describe_param_mismatch, alongside resolve_declared_params. Removes the
_assert_json_serializable/_format_param_key_diff free functions and the
_declarations_validated bookkeeping flag; initialize_async now converges on the
single registry-owned resolve path (idempotent re-resolution) so programmatic,
CLI, and registry flows can't diverge.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Resolve conflicts by porting microsoft#2112's Parameter-model pattern onto the
unified-Registry scenario registry: carry raw Parameter objects end-to-end,
delete ScenarioParameterMetadata / ScenarioParameterSummary projections,
and drop the local display_choices/_param_type_display in favor of the
shared pyrit.models.parameter serialization.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
It's a per-run dataset setting, not catalog metadata, so drop it from
ScenarioMetadata / RegisteredScenario / scenario_service and the CLI
scenario-list output. The run-side RunScenarioRequest.max_dataset_size and
DatasetConfiguration.max_dataset_size are unchanged.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
… docstring

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@rlundeen2 rlundeen2 changed the title FEAT: Adding Scenario registry FEAT Breaking: Adding Scenario registry Jul 1, 2026
rlundeen2 and others added 2 commits July 1, 2026 12:50
The method had no callers anywhere in the codebase (already orphaned on
main). Its intended second-phase discovery of user-defined scenarios from
initializer scripts was never wired in, and it cannot live inside the
one-shot cached _discover(). Drop it as dead code along with its now-unused
discover_subclasses_in_loaded_modules import.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Move per-run persistence concerns (description, init_data) off
ScenarioIdentifier and onto ScenarioResult, keeping the identifier as a
pure identity projection like TargetIdentifier/ScorerIdentifier. version
stays as identity-bearing state in attributes; description and init_data
now live on the ScenarioResult persistence aggregate. Removed
with_init_data (which mutated the content hash) and slimmed for_scenario
to identity-only fields.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@behnam-o behnam-o self-assigned this Jul 1, 2026
rlundeen2 and others added 2 commits July 1, 2026 14:50
Drop the per-run ScenarioIdentifier from Scenario and ScenarioResult.
ScenarioResult now carries denormalized scalar identity facts
(scenario_name / scenario_version / pyrit_version) matching the DB columns,
and ScenarioIdentifier becomes a build-time registry projection that promotes
version / techniques / datasets into the content and eval hash.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Replace the divergent per-registry _discover overrides with a single base implementation that enumerates concrete subclasses of the domain base type in memory (recursive __subclasses__), filtered to classes under the discovery package. This finds the exact curated catalog by type rather than walking the filesystem, so it is unaffected by packages that omit __init__.py. Lazily-exported (PEP 562) classes are force-loaded first so they exist before enumeration. ScenarioRegistry drops its bespoke _discover and keeps only dotted-name keying via _get_registry_name.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Comment thread pyrit/registry/resolution.py
Comment thread pyrit/memory/alembic/versions/ab8f2c1a9d07_pre_alembic_release_schema.py Outdated
rlundeen2 and others added 5 commits July 2, 2026 10:38
Replace ScenarioResult's flat identity fields (scenario_name,
scenario_version, init_data, objective_target_identifier,
objective_scorer_identifier) with a single canonical scenario_identifier,
exposed as read-only computed-field projections. Add a before-validator so
model_dump/model_validate round-trips under extra='forbid'.

Resume drift detection now compares ScenarioEvaluationIdentifier eval hashes
instead of bespoke param snapshots; drop snapshot_params_for_persistence and
describe_param_mismatch. Memory persists/reconstructs the typed identifier.

Migrate all test call sites to the make_scenario_result / make_scenario_identifier
helpers in tests/unit/mocks.py.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
…_params

ScenarioResult.pyrit_version now delegates to scenario_identifier.pyrit_version
(a storage tag every ComponentIdentifier already carries) via @computed_field
instead of a redundant standalone field. Memory read no longer passes it
separately; the identifier carries the stored version.

Rename _reject_unknown_parameters -> _reject_undeclared_params to make clear it
is specific to the declared-params path (resolve_declared_params batches all
undeclared keys); the constructor path rejects unknown args inline.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The eval hash already encodes class_name/class_module, so an explicit
scenario-name mismatch check was redundant. Drop it and rely solely on
stored_eval_hash != current_eval_hash, with a clearer mismatch message.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
The scenario definition version is owned by the ScenarioIdentifier and is
promoted into params during identifier construction, so a scenario declaring
a param named 'version' would be silently overwritten. Reject it explicitly
at set_params_from_args time. Other scenario-declared params flow into the
identity unchanged.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Add new head migration d4e6f8a0b2c4 that renames ScenarioResultEntries
.scenario_init_data to scenario_identifier (non-nullable) via
batch_alter_table, instead of mutating the immutable baseline migration.
Apply ruff format to scenario.py and the scenario param/core tests.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
@rlundeen2 rlundeen2 force-pushed the rlundeen2-phase-5-scenario-registry branch from cd2ecb9 to 6d78894 Compare July 2, 2026 17:45
rlundeen2 and others added 2 commits July 2, 2026 10:51
Add tests that pin the ComponentType -> registry mapping: each resolvable
family (target/converter/scorer) maps to its registry's .instances, scenario
is deliberately non-resolvable, and every ComponentType member must be
classified as one or the other. A newly added ComponentType now fails fast
here instead of only at build time.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
Reconcile per-technique strategy_converters feature (main) with the
registry-owned scenario lifecycle (this branch): strategy_converters and
scenario_strategies now flow through ScenarioRegistry.create_and_initialize_async.
Adapt test_start_run_forwards_strategy_converters to assert on the registry
call surface instead of scenario_instance.initialize_async.

Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

@behnam-o behnam-o left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a minor preference/nit

Comment thread pyrit/registry/components/scenario_registry.py
@rlundeen2 rlundeen2 added this pull request to the merge queue Jul 2, 2026
Merged via the queue into microsoft:main with commit 4d8b9f6 Jul 2, 2026
53 checks passed
@rlundeen2 rlundeen2 deleted the rlundeen2-phase-5-scenario-registry branch July 2, 2026 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants