
feat: introduce checkpoint in experimental #2181

Merged
JackYPCOnline merged 2 commits into strands-agents:main from JackYPCOnline:checkpoint_1
Apr 21, 2026

Conversation

@JackYPCOnline
Contributor

@JackYPCOnline JackYPCOnline commented Apr 21, 2026

Description

Replaces the experimental checkpoint module with a redesigned Checkpoint dataclass and adds "checkpoint" to the StopReason type. This is PR 1 of 2 — types and tests only, no behavioral changes.

What changed

  • New Checkpoint dataclass (experimental/checkpoint/checkpoint.py): Schema-versioned with to_dict()/from_dict() serialization, CheckpointPosition literal ("after_model" | "after_tools"), and app_data dict for provider-owned metadata (e.g. Temporal workflow IDs). from_dict() rejects schema version mismatches and logs a warning on unknown keys for debuggability.
  • "checkpoint" added to StopReason (types/event_loop.py): Type-level addition only — nothing emits this value yet.
  • Updated experimental/checkpoint/__init__.py: exports Checkpoint, CHECKPOINT_SCHEMA_VERSION, and CheckpointPosition.
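The shape described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the merged implementation: the version value `"1.0"`, the exact field set, the field types, and the error message are guesses; only the names `Checkpoint`, `CHECKPOINT_SCHEMA_VERSION`, `CheckpointPosition`, `snapshot`, `app_data`, `to_dict()`, and `from_dict()` come from this PR.

```python
from __future__ import annotations

import logging
from dataclasses import asdict, dataclass, field, fields
from typing import Any, Literal

logger = logging.getLogger(__name__)

# Assumed value: the PR exports this constant but does not state its value here.
CHECKPOINT_SCHEMA_VERSION = "1.0"

CheckpointPosition = Literal["after_model", "after_tools"]


@dataclass
class Checkpoint:
    """Illustrative stand-in for the experimental Checkpoint dataclass."""

    position: CheckpointPosition
    # Field types are assumptions; the PR only names the fields.
    snapshot: dict[str, Any] = field(default_factory=dict)
    app_data: dict[str, Any] = field(default_factory=dict)
    # init=False keeps callers from overriding the version via the constructor.
    schema_version: str = field(default=CHECKPOINT_SCHEMA_VERSION, init=False)

    def to_dict(self) -> dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> Checkpoint:
        # A missing version yields None here, so it is rejected as well.
        version = data.get("schema_version")
        if version != CHECKPOINT_SCHEMA_VERSION:
            raise ValueError(
                f"unsupported checkpoint schema version {version!r}; "
                f"expected {CHECKPOINT_SCHEMA_VERSION!r}"
            )
        known = {f.name for f in fields(cls) if f.init}
        unknown = set(data) - known - {"schema_version"}
        if unknown:
            # Unknown keys are logged and ignored rather than rejected.
            logger.warning("ignoring unknown checkpoint keys: %s", sorted(unknown))
        return cls(**{k: v for k, v in data.items() if k in known})


# Round trip with provider-owned metadata (the workflow ID is made up).
original = Checkpoint(position="after_tools", app_data={"workflow_id": "wf-123"})
restored = Checkpoint.from_dict(original.to_dict())
```

Keeping `schema_version` out of the constructor is one way to get the immutability the tests mention; a frozen dataclass would be another.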

What this PR does NOT change

  • No changes to Agent, AgentResult, EventLoopStopEvent, or event_loop.py behavior.
  • No new runtime code paths — nothing emits "checkpoint" stop reason yet.
  • All existing tests pass unmodified (2611 passed).

Follow-up PR (PR 2 of 2) will include

  • checkpointing=True flag on Agent constructor
  • Event loop integration: checkpoint emission at after_model and after_tools cycle boundaries
  • checkpointResume content block for resuming from a checkpoint
  • checkpoint field on AgentResult and extended EventLoopStopEvent tuple
  • Checkpoint promoted to top-level strands export
  • Integration tests for the full pause/resume flow

Related Issues

Documentation PR

Type of Change

New feature

Testing

  • 6 unit tests with 100% coverage on checkpoint.py: round-trip serialization, schema version immutability, schema mismatch rejection, defaults, unknown field warning, missing version rejection.

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@JackYPCOnline
Contributor Author

Adding "checkpoint" to StopReason is considered non-breaking. We have already added several items to it and will keep doing so.
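As a rough illustration of why widening the literal is low risk: a consumer that falls through to a default branch keeps working when a new member appears. The member list below is an abbreviated, assumed subset rather than the definition in types/event_loop.py, and `describe` is a hypothetical helper.

```python
from typing import Literal, get_args

# Abbreviated, assumed subset of members; the real type has more entries.
StopReason = Literal["end_turn", "max_tokens", "tool_use", "checkpoint"]


def describe(stop_reason: StopReason) -> str:
    """Hypothetical consumer written before "checkpoint" existed."""
    if stop_reason == "end_turn":
        return "model finished its turn"
    if stop_reason == "tool_use":
        return "model requested a tool"
    # Default branch: values added later land here instead of raising.
    return f"unhandled stop reason: {stop_reason}"


message = describe("checkpoint")
members = get_args(StopReason)
```

Code that matches exhaustively on the literal would still get a type-checker warning after the addition, which is the one case callers may need to revisit.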

@codecov

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Comment thread src/strands/experimental/checkpoint/checkpoint.py Outdated
Comment thread src/strands/experimental/checkpoint/checkpoint.py
@github-actions

Issue: Per AGENTS.md: "After making changes that affect the directory structure (adding new directories, moving files, or adding significant new files), you MUST update this directory structure section to reflect the current state of the repository." The new checkpoint/ directory under experimental/ is not reflected in AGENTS.md.

Suggestion: Add a checkpoint/ entry under the experimental/ section in the directory structure, e.g.:

│   │   ├── checkpoint/                   # Durable agent execution checkpoints
│   │   │   └── checkpoint.py             # Checkpoint dataclass and serialization

@github-actions

Issue: This PR introduces a new public dataclass (Checkpoint, CheckpointPosition) that customers will use to interact with the checkpoint/resume flow. Per the API Bar Raising process, new public classes that customers use should have the needs-api-review label. This is at least a "moderate change" — a new class that customers use to achieve new behavior.

Suggestion: Add the needs-api-review label to this PR (or to the combined PR 1+2 series). The PR description does a nice job documenting use cases and the API shape, which is exactly what the API proposer should prepare. Given this is experimental, a lightweight API review pass should be sufficient.

Comment thread src/strands/experimental/checkpoint/checkpoint.py Outdated
@github-actions

Assessment: Comment

Clean, well-scoped PR that introduces types and tests only — good separation of concerns across the PR 1/2 series. The Checkpoint dataclass follows existing patterns (to_dict/from_dict, schema versioning) and tests provide solid coverage.

Review Categories
  • Terminology: Module docstrings use "event loop" instead of "agent loop" per SDK decision records; the class docstring is already correct.
  • Design Clarity: The relationship between Checkpoint.snapshot/app_data and the existing Snapshot type in types/_snapshot.py should be documented to avoid confusion.
  • Repo Maintenance: AGENTS.md directory structure needs updating per repository policy.
  • API Process: This introduces new public types — consider adding the needs-api-review label per the API bar-raising process.

Good use of the experimental module for iterating on the API before promotion.

@github-actions github-actions Bot added size/m and removed size/m labels Apr 21, 2026
@JackYPCOnline JackYPCOnline added the needs-api-review Makes changes to the public API surface label Apr 21, 2026
Comment thread src/strands/experimental/checkpoint/checkpoint.py
@github-actions

Assessment: Comment

All prior review feedback has been addressed — terminology is consistent ("agent loop"), the Snapshot/Checkpoint relationship is documented, AGENTS.md is updated, and the error message is improved. One remaining item on exception type consistency.

Details
  • Exception pattern: Checkpoint.from_dict() uses ValueError while the SDK convention is domain-specific exceptions (e.g. SnapshotException). Minor for experimental, but worth aligning before promotion.
  • API process: This PR introduces new public types (Checkpoint, CheckpointPosition) — consider adding the needs-api-review label per the bar-raising process, especially before PR 2/2 merges.
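The exception alignment suggested above might look like the sketch below. `CheckpointException` and `validate_schema_version` are hypothetical names chosen for illustration, and the version value is assumed; the merged code raises `ValueError`.

```python
from typing import Any

# Assumed value, for illustration only.
CHECKPOINT_SCHEMA_VERSION = "1.0"


class CheckpointException(Exception):
    """Hypothetical domain-specific error, mirroring the SnapshotException pattern."""


def validate_schema_version(data: dict[str, Any]) -> None:
    """Raise CheckpointException when the version is missing or does not match."""
    version = data.get("schema_version")
    if version != CHECKPOINT_SCHEMA_VERSION:
        raise CheckpointException(
            f"unsupported checkpoint schema version {version!r}; "
            f"expected {CHECKPOINT_SCHEMA_VERSION!r}"
        )


# Callers can then catch checkpoint failures without also catching unrelated ValueErrors.
try:
    validate_schema_version({"schema_version": "0.9"})
    rejected = False
except CheckpointException:
    rejected = True
```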

Clean iteration — the code is well-structured and ready for the behavioral PR to build on.

Comment thread src/strands/experimental/checkpoint/checkpoint.py
Comment thread src/strands/experimental/checkpoint/checkpoint.py
Comment thread src/strands/experimental/checkpoint/checkpoint.py
Comment thread tests/strands/experimental/checkpoint/test_checkpoint.py
Comment thread src/strands/experimental/checkpoint/checkpoint.py
@JackYPCOnline JackYPCOnline enabled auto-merge (squash) April 21, 2026 21:06
@JackYPCOnline JackYPCOnline merged commit 724b591 into strands-agents:main Apr 21, 2026
19 of 21 checks passed
@JackYPCOnline JackYPCOnline deleted the checkpoint_1 branch April 21, 2026 21:12

Labels

needs-api-review Makes changes to the public API surface size/m

Projects

None yet

2 participants