MAINT: Refactoring Identifiers to be Pydantic classes #1881

Merged
rlundeen2 merged 5 commits into
microsoft:main from
rlundeen2:rlundeen2/pydantic-identifiers
Jun 1, 2026

Conversation

@rlundeen2
Contributor

This converts ComponentIdentifier to a Pydantic BaseModel.

It is phase 2 of the pyrit.models refactor: https://gist.github.com/rlundeen2/3e8daa8e12a11b4b6e52587b3c9b1dca

rlundeen2 and others added 5 commits June 1, 2026 12:36
Convert ComponentIdentifier, ChildEvalRule, and IdentifierFilter from frozen dataclasses to Pydantic v2 BaseModel with frozen ConfigDict, matching the rest of pyrit.models.

Key design decisions:

- Unified serialization shape: model_dump and model_validate produce/accept the same flat shape today's to_dict and from_dict use. A model_validator(mode='before') normalizes flat/structured/legacy input, and a model_serializer(mode='plain') emits the flat shape with optional context-based truncation.

- to_dict and from_dict become DeprecationWarning shims; internal call sites are migrated to model_dump and model_validate in this PR so end users don't see internal warnings.

- with_eval_hash is kept (not deprecated) as the blessed, single-purpose helper for the one update operation that should preserve the stored content hash.

- model_copy is overridden to raise ValueError if a hash-affecting field (class_name, class_module, params, children) is updated without an explicit new 'hash', preventing silent hash drift.

- Reserved-name collision: params containing keys that would collide with structural keys (class_name, hash, etc.) are rejected loudly to keep storage round-trips lossless.

- ComponentIdentifier is now hashable via its content hash; equality is keyed off the content hash rather than per-field.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
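
The design decisions above can be illustrated with a minimal sketch. This is not the PyRIT implementation: `SketchIdentifier`, `_STRUCTURAL_KEYS`, and `_content_hash` are hypothetical names, the hashing scheme (SHA-256 over sorted JSON) is an assumption, and children and context-based truncation are left out. It only shows how a frozen Pydantic v2 model could accept both flat and structured input, emit the flat shape, reject reserved keys in params, and compare by content hash.

```python
import hashlib
import json
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, model_serializer, model_validator

# Assumed set of structural keys; the real list in PyRIT may differ.
_STRUCTURAL_KEYS = frozenset({"class_name", "class_module", "hash"})


def _content_hash(class_name: Any, class_module: Any, params: Any) -> str:
    """Assumed hashing scheme: SHA-256 over a key-sorted JSON payload."""
    payload = json.dumps(
        {"class_name": class_name, "class_module": class_module, "params": params},
        sort_keys=True,
        default=str,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SketchIdentifier(BaseModel):
    """Hypothetical stand-in for ComponentIdentifier."""

    model_config = ConfigDict(frozen=True)

    class_name: str
    class_module: str
    params: dict[str, Any] = Field(default_factory=dict)
    hash: str

    @model_validator(mode="before")
    @classmethod
    def _normalize(cls, data: Any) -> Any:
        if not isinstance(data, dict):
            return data
        data = dict(data)  # do not mutate the caller's dict
        if "params" not in data:
            # Flat input: every non-structural key is treated as a param.
            data["params"] = {
                key: data.pop(key) for key in list(data) if key not in _STRUCTURAL_KEYS
            }
        params = data["params"]
        if isinstance(params, dict):
            collisions = _STRUCTURAL_KEYS & set(params)
            if collisions:
                raise ValueError(f"params use reserved keys: {sorted(collisions)}")
        if not data.get("hash"):
            # No stored hash supplied, so derive one from the content.
            data["hash"] = _content_hash(
                data.get("class_name"), data.get("class_module"), params
            )
        return data

    @model_serializer(mode="plain")
    def _to_flat(self) -> dict[str, Any]:
        # Emit structural keys and params side by side (the flat shape).
        return {
            "class_name": self.class_name,
            "class_module": self.class_module,
            "hash": self.hash,
            **self.params,
        }

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, SketchIdentifier):
            return NotImplemented
        return self.hash == other.hash

    def __hash__(self) -> int:
        return hash(self.hash)


original = SketchIdentifier(
    class_name="Target", class_module="demo", params={"temperature": 0.5}
)
flat = original.model_dump()
restored = SketchIdentifier.model_validate(flat)
```

In this sketch `restored == original` holds because the stored hash travels through the flat shape unchanged, and both instances collapse to one entry when placed in a set.
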
Reimplement ComponentIdentifier.with_eval_hash by constructing a fresh
instance (threading the existing hash through explicitly) instead of
relying on model_copy. This removes the model_copy override and its
_HASH_AFFECTING_FIELDS guard, which existed only to prevent silent hash
drift through model_copy -- a path nothing in production used. Leaves a
single, obvious construction path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
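
A rough sketch of the "construct a fresh instance" approach described in this commit, assuming a plain frozen dataclass as a stand-in for the Pydantic model. `SketchIdentifier` and its fields are hypothetical; only the `with_eval_hash` name comes from the commit message. The point is that the existing content hash is passed through explicitly rather than recomputed or copied implicitly.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class SketchIdentifier:
    """Hypothetical stand-in; the real class is a Pydantic BaseModel."""

    class_name: str
    params: dict = field(default_factory=dict)
    hash: Optional[str] = None
    eval_hash: Optional[str] = None

    def __post_init__(self) -> None:
        if self.hash is None:
            # Assumed scheme: derive the content hash when none is supplied.
            payload = json.dumps(
                {"class_name": self.class_name, "params": self.params},
                sort_keys=True,
                default=str,
            )
            digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
            # Frozen dataclasses need object.__setattr__ during initialisation.
            object.__setattr__(self, "hash", digest)

    def with_eval_hash(self, eval_hash: str) -> "SketchIdentifier":
        # Build a new instance and thread the stored hash through explicitly,
        # so the content hash is preserved rather than recomputed.
        return SketchIdentifier(
            class_name=self.class_name,
            params=dict(self.params),
            hash=self.hash,
            eval_hash=eval_hash,
        )


base = SketchIdentifier(class_name="Scorer", params={"threshold": 0.5})
tagged = base.with_eval_hash("abc123")
```

Here `tagged` is a distinct object with the same `hash` as `base` and a populated `eval_hash`, while `base` itself is unchanged.
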
…entifiers

# Conflicts:
#	pyrit/models/message_piece.py
Comment thread pyrit/models/identifiers/component_identifier.py
@rlundeen2 rlundeen2 enabled auto-merge June 1, 2026 21:29
@rlundeen2 rlundeen2 added this pull request to the merge queue Jun 1, 2026
Merged via the queue into microsoft:main with commit 83caadd Jun 1, 2026
48 checks passed
@rlundeen2 rlundeen2 deleted the rlundeen2/pydantic-identifiers branch June 1, 2026 21:58
hannahwestra25 added a commit to hannahwestra25/PyRIT that referenced this pull request Jun 1, 2026
…port

PR microsoft#1881 (Pydantic identifiers refactor) re-merged with this PR's
compute_inner_attack_eval_hash addition and hoisted
`from pyrit.executor.attack.core.attack_strategy import AttackStrategy`
to module level. That import triggers a cycle through
`pyrit.executor.attack` -> `pyrit.message_normalizer` ->
`pyrit.common.data_url_converter` -> `pyrit.models`, which fails
because `pyrit.models` is still being initialised at that point
(DataTypeSerializer is not yet defined).

`from __future__ import annotations` is already enabled, so the
type annotation `attack: AttackStrategy` works as a string. Moving
the import inside `if TYPE_CHECKING:` restores the original lazy
boundary and unblocks CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
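
The fix described in this commit follows the usual `TYPE_CHECKING` pattern, sketched below. The function `describe_attack` is hypothetical and exists only for illustration. The annotation is quoted here so the snippet does not depend on the future import; with `from __future__ import annotations` enabled, as the commit notes, the quotes would be unnecessary.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; never executed at runtime, so it cannot
    # participate in an import cycle.
    from pyrit.executor.attack.core.attack_strategy import AttackStrategy


def describe_attack(attack: "AttackStrategy") -> str:
    """Hypothetical helper: the annotation stays a string at runtime."""
    return type(attack).__name__


# Runs without importing pyrit.executor at all.
label = describe_attack(object())
```

Because `TYPE_CHECKING` is `False` at runtime, the module imports cleanly even while `pyrit.models` is still initialising; the trade-off is that `AttackStrategy` cannot be used in runtime checks such as `isinstance` within that module.
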
hannahwestra25 added a commit to hannahwestra25/PyRIT that referenced this pull request Jun 1, 2026
…Pydantic identifiers)

Four related fixes needed to unblock CI after the upstream merge that
brought in PR microsoft#1881 (Refactoring Identifiers to be Pydantic classes):

1. `pyrit/models/identifiers/evaluation_identifier.py`:
   The merge hoisted `from pyrit.executor.attack.core.attack_strategy
   import AttackStrategy` to module level, forming a cycle through
   `pyrit.executor.attack` -> `pyrit.message_normalizer` ->
   `pyrit.common.data_url_converter` -> `pyrit.models`. Move it back
   inside `if TYPE_CHECKING:` (`from __future__ import annotations`
   is already enabled, so the string annotation `attack: AttackStrategy`
   still resolves at type-check time).

2. `tests/unit/models/identifiers/test_evaluation_identifier.py`:
   Add missing blank lines between top-level classes (ruff format) and
   replace four `from pyrit.identifiers import ...` lines with
   `from pyrit.models.identifiers import ...` (the former is now a
   deprecation shim that the static-scan deprecation test forbids
   internal callers from using).

3. `pyrit/scenario/scenarios/adaptive/adaptive_scenario.py`:
   Mark the three classmethod stubs as `@abstractmethod` so
   `inspect.isabstract(AdaptiveScenario)` returns `True` and the
   scenario registry's auto-discovery skips it (otherwise the registry
   tries to instantiate the abstract base and raises
   `NotImplementedError`, breaking `test_load_default_datasets`).

4. `pyrit/scenario/scenarios/adaptive/adaptive_scenario.py` and
   `pyrit/scenario/scenarios/adaptive/text_adaptive.py`:
   Move `from pyrit.setup.initializers.components.scenario_techniques
   import build_scenario_technique_factories` from module level back
   into the function body. `scenario_techniques` imports
   `pyrit.scenario.core`, which transitively re-imports the adaptive
   package during `pyrit.scenario` initialization, so a top-level
   import forms a cycle.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
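
Item 3 relies on how `inspect.isabstract` decides whether a class is abstract. A small sketch, with hypothetical class and method names and a hypothetical `discover` function standing in for the registry's auto-discovery, shows why a stub that merely raises `NotImplementedError` is not enough:

```python
import inspect
from abc import ABC, abstractmethod


class StubOnlyBase(ABC):
    """Stub raises NotImplementedError but is not marked abstract."""

    @classmethod
    def default_dataset_names(cls) -> list:
        raise NotImplementedError


class MarkedBase(ABC):
    """Same stub, now marked with @abstractmethod."""

    @classmethod
    @abstractmethod
    def default_dataset_names(cls) -> list:
        raise NotImplementedError


class ConcreteScenario(MarkedBase):
    @classmethod
    def default_dataset_names(cls) -> list:
        return ["demo"]


def discover(candidates: list) -> list:
    # Hypothetical registry step: skip anything Python considers abstract.
    return [cls for cls in candidates if not inspect.isabstract(cls)]


found = discover([StubOnlyBase, MarkedBase, ConcreteScenario])
```

In this sketch `StubOnlyBase` slips through discovery because nothing marks it abstract, whereas `MarkedBase` is skipped, which mirrors the behaviour the fix restores for `AdaptiveScenario`.
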

2 participants