MAINT: Standardize JSON serialization in models #1813

Merged
romanlutz merged 3 commits into microsoft:main from romanlutz:romanlutz/standardize-model-json-serialization
May 27, 2026

Conversation

romanlutz (Contributor) commented May 26, 2026

Description

Cleanup pass that standardizes how models do JSON (de)serialization. Two parts.

Part A — Pydantic models: forward to Pydantic built-ins, deprecate the wrappers

ChatMessage and EmbeddingResponse are both pydantic.BaseModel subclasses, but each defined a to_json (and ChatMessage also a from_json) that was a one-line passthrough to model_dump_json() / model_validate_json(). The wrappers add no value over the Pydantic built-ins.

  • Migrated 5 in-tree call sites across 3 files to the Pydantic API directly:
    • tests/unit/models/test_chat_message.py (4 call sites in 3 tests; test names renamed to reflect the Pydantic API they exercise)
    • doc/code/memory/embeddings.py and doc/code/memory/embeddings.ipynb (1 call site each, kept in sync per the docs sync rule)
  • Kept the 3 methods as deprecated shims (ChatMessage.to_json, ChatMessage.from_json, EmbeddingResponse.to_json) so external callers get a soft migration window. Each shim now emits DeprecationWarning via print_deprecation_message with removed_in="0.15.0" and forwards to the underlying Pydantic method. Tests added to assert each shim still works and emits the warning.
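
The shim pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PyRIT code: `ChatMessageSketch` and its `role`/`content` fields are hypothetical stand-ins, and the standard-library `warnings.warn` is used in place of `print_deprecation_message`, whose full signature is not shown in this PR. It assumes Pydantic v2.

```python
import warnings

from pydantic import BaseModel


class ChatMessageSketch(BaseModel):
    """Hypothetical stand-in for a Pydantic model with legacy JSON wrappers."""

    role: str
    content: str

    def to_json(self) -> str:
        # Deprecated shim: warn, then forward to the Pydantic built-in.
        warnings.warn(
            "to_json is deprecated; use model_dump_json instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.model_dump_json()

    @classmethod
    def from_json(cls, json_str: str) -> "ChatMessageSketch":
        # Deprecated shim: warn, then forward to the Pydantic built-in.
        warnings.warn(
            "from_json is deprecated; use model_validate_json instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return cls.model_validate_json(json_str)


message = ChatMessageSketch(role="user", content="hello")

# Preferred: call the Pydantic API directly.
payload = message.model_dump_json()
restored = ChatMessageSketch.model_validate_json(payload)

# Legacy path: still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_payload = message.to_json()
    legacy_restored = ChatMessageSketch.from_json(legacy_payload)
```

Migrated call sites use only the two lines under "Preferred"; the legacy path exists so that external callers keep working during the deprecation window.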

Part B — ScorerMetrics: stop misnaming the file-based loader

ScorerMetrics is a plain @dataclass (not Pydantic) and its serialization helpers have real behavior beyond json.dumps(asdict(...)):

  • from_json accepted a file path, opened the file, unwrapped a top-level "metrics" key (used by evaluation result files), and filtered out internal underscore-prefixed fields (e.g., init=False cached attrs like _harm_definition_obj).

The name from_json strongly suggested a JSON string argument, which it never accepted. Changes:

  • Renamed ScorerMetrics.from_json → ScorerMetrics.from_json_file to honestly reflect what it does (to_json is kept as-is, since it correctly returns a string).
  • Kept from_json as a deprecated shim that forwards to from_json_file and emits DeprecationWarning via print_deprecation_message with removed_in="0.15.0".
  • Clarified docstrings on both to_json and from_json_file — they're now documented as the canonical (de)serialization entry points for ScorerMetrics subclasses, and from_json_file calls out that it takes a file path and unwraps the "metrics" key plus filters underscore fields.
  • Updated 2 in-tree call sites in tests/unit/score/test_scorer_metrics.py to use the new name, and added a test that asserts the deprecated alias still works and emits the warning.
  • Verified there are no other places in pyrit/score/ that hand-roll json.dumps(asdict(metrics)) or json.load(...) plus ScorerMetrics(**...) outside these methods themselves. (scorer_metrics_io.py parses JSONL with wrapper metadata, which is a different abstraction and unchanged.)
  • Did not add from_json_str — no caller needs a string-only round trip today.
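
For illustration, the loader behavior and the deprecated alias described above might look roughly like this. It is a sketch under stated assumptions, not the actual `ScorerMetrics` implementation: `MetricsSketch`, its `accuracy`/`f1_score` fields, and the `_cache` attribute are hypothetical, and `warnings.warn` stands in for `print_deprecation_message`.

```python
import json
import tempfile
import warnings
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional, Union


@dataclass
class MetricsSketch:
    """Hypothetical stand-in for a plain-dataclass metrics type."""

    accuracy: float
    f1_score: float
    # Internal cached attribute: not a constructor argument (init=False).
    _cache: Optional[str] = field(default=None, init=False, repr=False)

    @classmethod
    def from_json_file(cls, file_path: Union[str, Path]) -> "MetricsSketch":
        with open(file_path, encoding="utf-8") as f:
            data = json.load(f)
        # Evaluation result files nest the values under a top-level "metrics" key.
        if "metrics" in data:
            data = data["metrics"]
        # Underscore-prefixed keys are internal and cannot be passed to __init__.
        init_kwargs = {k: v for k, v in data.items() if not k.startswith("_")}
        return cls(**init_kwargs)

    @classmethod
    def from_json(cls, file_path: Union[str, Path]) -> "MetricsSketch":
        # Deprecated alias: the argument was always a file path, never a JSON string.
        warnings.warn(
            "from_json is deprecated; use from_json_file instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return cls.from_json_file(file_path)


with tempfile.TemporaryDirectory() as tmp_dir:
    result_file = Path(tmp_dir) / "result.json"
    result_file.write_text(
        json.dumps({"metrics": {"accuracy": 0.9, "f1_score": 0.8, "_cache": "stale"}}),
        encoding="utf-8",
    )

    loaded = MetricsSketch.from_json_file(result_file)

    # The old name still works but emits a DeprecationWarning.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        legacy = MetricsSketch.from_json(result_file)
```

Filtering underscore-prefixed keys matters here because passing `_cache` to the constructor would raise a `TypeError` for a field declared with `init=False`.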

Verification

  • uv run pytest tests/unit -x -q — full unit suite passes (8049+ tests); new deprecation tests pass too
  • uv run pre-commit run --all-files — clean
  • grep confirms zero remaining non-deprecation in-tree callers of the old to_json/from_json symbols on ChatMessage / EmbeddingResponse / ScorerMetrics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
romanlutz force-pushed the romanlutz/standardize-model-json-serialization branch from 848449a to 67b847c on May 26, 2026 22:35
Comment thread pyrit/models/chat_message.py
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
romanlutz enabled auto-merge May 26, 2026 23:31
romanlutz added this pull request to the merge queue May 26, 2026
Merged via the queue into microsoft:main with commit 520d689 May 27, 2026
48 checks passed
romanlutz deleted the romanlutz/standardize-model-json-serialization branch May 27, 2026 00:09

2 participants