Skip to content

feat: Enhance agent snapshot serialization with error handling for non-serializable inputs#11108

Merged
bogdankostic merged 2 commits intomainfrom
agent-snapshot-error
Apr 16, 2026
Merged

feat: Enhance agent snapshot serialization with error handling for non-serializable inputs#11108
bogdankostic merged 2 commits intomainfrom
agent-snapshot-error

Conversation

@bogdankostic
Copy link
Copy Markdown
Contributor

Related Issues

Proposed Changes:

_create_agent_snapshot called _serialize_value_with_schema() without error handling. If serialization failed (e.g. due to non-serializable objects), the resulting SerializationError would mask the real pipeline runtime error (e.g. PipelineRuntimeError).

This PR wraps each _serialize_value_with_schema() call for chat_generator and tool_invoker inputs in try-except blocks. On failure, a warning is logged and an empty dictionary is used as a fallback, matching the existing pattern in _create_pipeline_snapshot.

How did you test it?

Added three unit tests.

Notes for the reviewer

Checklist

  • I have read the contributors guidelines and the code of conduct.
  • I have updated the related issue with new insights and changes.
  • I have added unit tests and updated the docstrings.
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I have documented my code.
  • I have added a release note file, following the contributors guidelines.
  • I have run pre-commit hooks and fixed any issue.

@bogdankostic bogdankostic requested a review from a team as a code owner April 15, 2026 09:49
@bogdankostic bogdankostic requested review from davidsbatista and removed request for a team April 15, 2026 09:49
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
haystack-docs Ignored Ignored Preview Apr 15, 2026 9:52am

Request Review

@coveralls
Copy link
Copy Markdown
Collaborator

coveralls commented Apr 15, 2026

Coverage Report for CI Build 24448007183

Coverage increased (+0.02%) to 92.856%

Details

  • Coverage increased (+0.02%) from the base build.
  • Patch coverage: No coverable lines changed in this PR.
  • No coverage regressions found.

Uncovered Changes

No uncovered changes found.

Coverage Regressions

No coverage regressions found.


Coverage Stats

Coverage Status
Relevant Lines: 17203
Covered Lines: 15974
Line Coverage: 92.86%
Coverage Strength: 0.93 hits per line

💛 - Coveralls

Copy link
Copy Markdown
Contributor

@davidsbatista davidsbatista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Copy Markdown
Contributor

@davidsbatista davidsbatista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@bogdankostic bogdankostic merged commit 8d22008 into main Apr 16, 2026
29 of 30 checks passed
@bogdankostic bogdankostic deleted the agent-snapshot-error branch April 16, 2026 08:16
shaun0927 added a commit to shaun0927/haystack that referenced this pull request Apr 16, 2026
…ialized

The fallback added for agent snapshot serialization errors preserved the
original runtime failure, but it could also replace the entire
chat_generator or tool_invoker payload with an empty dict. That made the
saved snapshot impossible to resume even when only a runtime-only field
such as a streaming callback was non-serializable.

This change narrows the fallback behavior: Haystack now retries those
component inputs field-by-field and omits only the fields that cannot be
serialized, preserving resumable fields like messages, state, and tools.
A regression test covers resuming from a tool-invoker snapshot created
with a non-serializable runtime callback.

Constraint: Must preserve the original deepset-ai#11108 goal of not masking the real runtime error
Rejected: Keep saving `{}` and document snapshots as non-resumable | breaks the existing resume contract more than necessary
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If agent snapshot fallback behavior changes again, verify both error preservation and snapshot resumability
Tested: hatch -e test run pytest test/components/agents/test_agent_breakpoints.py -k 'resume_from_tool_invoker' -q
Tested: hatch -e test run pytest test/core/pipeline/test_breakpoint.py -k 'create_agent_snapshot' -q
Tested: hatch run fmt-check haystack/core/pipeline/breakpoint.py test/components/agents/test_agent_breakpoints.py
Not-tested: Full unit suite and integration suite
Related: deepset-ai#11126
shaun0927 pushed a commit to shaun0927/haystack that referenced this pull request Apr 21, 2026
…puts fail to serialize

Address the Copilot review on deepset-ai#11127: when every field of a
chat_generator or tool_invoker input fails to serialize,
_serialize_agent_component_inputs previously returned a bare `{}`. The
downstream `_deserialize_value_with_schema` rejects `{}` with
DeserializationError, which would silently re-introduce the exact
non-resumable snapshot behavior the fix was meant to prevent (e.g. when
resuming from a ToolBreakpoint where the sub-component's inputs are
not strictly required).

The helper now always returns a structurally valid
`{"serialization_schema", "serialized_data"}` pair. When all fields
are omitted the payload degrades to the schema-valid empty object
(`{"type": "object", "properties": {}}` + `{}`), which deserializes
back to `{}` without raising.

Existing unit tests in TestCreateAgentSnapshot are updated to assert
the new empty-but-valid payload shape, and a new test verifies that
the all-fields-fail payload round-trips through
`_deserialize_value_with_schema` without raising. The release note is
extended to describe the empty-payload edge case.

Constraint: Must preserve the original deepset-ai#11108 goal of not masking the real runtime error
Constraint: Must keep the narrower field-by-field fallback from the previous commit intact
Rejected: Keep returning bare `{}` and document snapshots as non-resumable in this edge case | regresses the snapshot resume contract in the very scenario the previous commit promised to preserve
Rejected: Fall back to a different marker payload (e.g. string sentinel) | breaks downstream deserializer's object/properties contract and would require changes outside the fallback helper
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: If agent snapshot fallback behavior changes again, verify both DeserializationError is not raised on the empty-fields case and that the helper never returns a bare `{}`
Tested: hatch -e test run pytest test/core/pipeline/test_breakpoint.py -k 'create_agent_snapshot' -q
Tested: hatch -e test run pytest test/components/agents/test_agent_breakpoints.py -k 'non_serializable_runtime_callback' -q
Tested: hatch -e test run pytest test/components/agents/test_agent_breakpoints.py -k 'resume_from_tool_invoker and not new_breakpoint' -q
Tested: hatch run fmt-check haystack/core/pipeline/breakpoint.py test/core/pipeline/test_breakpoint.py test/components/agents/test_agent_breakpoints.py
Tested: hatch run test:types haystack/core/pipeline/breakpoint.py
Not-tested: Full unit suite and integration suite
Related: deepset-ai#11126
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update _create_agent_snapshot to be more robust towards errors just like _create_pipeline_snapshot

4 participants