Summary
CosmosCheckpointStorage (in agent-framework-azure-cosmos) should reach API and behavior parity with FileCheckpointStorage (in agent-framework) for checkpoint deserialization. Two related changes should land together:
- Accept an allowed_checkpoint_types constructor parameter on CosmosCheckpointStorage, with the same "module:qualname" format and semantics as FileCheckpointStorage.
- Align the Cosmos load path with File so it flows through the same restricted-type loading behavior that File already uses by default.
Today, the two providers differ in how they call into decode_checkpoint_value, which in turn decides whether checkpoint documents are loaded via RestrictedUnpickler or via plain pickle.loads. Bringing them into alignment makes the two providers interchangeable from a user's perspective and removes a surprising behavior difference that is easy to miss when swapping providers.
Background and code paths
FileCheckpointStorage stores an _allowed_types frozenset on the instance and forwards it on every load:
python/packages/core/agent_framework/_workflows/_checkpoint.py:
- L279 — self._allowed_types: frozenset[str] = frozenset(allowed_checkpoint_types or [])
- L355 — decode_checkpoint_value(encoded_checkpoint, allowed_types=self._allowed_types)
- L381 — same pattern in the second load path
Inside the encoding module, the loading mode is decided by a simple is not None check:
python/packages/core/agent_framework/_workflows/_checkpoint_encoding.py:
_base64_to_unpickle — if allowed_types is not None: uses _RestrictedUnpickler, otherwise falls through to pickle.loads.
Because FileCheckpointStorage always passes a frozenset (even an empty one), it is always on the restricted-type path. The empty frozenset still enforces the built-in safe set plus agent_framework.* framework types — the additive allowed_checkpoint_types layers user types on top of that floor.
CosmosCheckpointStorage does not pass allowed_types at all:
python/packages/azure-cosmos/agent_framework_azure_cosmos/_checkpoint_storage.py:
- L416 — decoded = decode_checkpoint_value(cleaned)
The two providers therefore take different branches of _base64_to_unpickle for the same inputs, which is surprising given they implement the same CheckpointStorage protocol and share the same hybrid JSON + pickle encoding.
Expected behavior
CosmosCheckpointStorage.__init__ should accept allowed_checkpoint_types: list[str] | None = None as a keyword-only argument, with the same format and semantics as FileCheckpointStorage.
CosmosCheckpointStorage should forward the stored allowed-types frozenset to decode_checkpoint_value on every load path, matching FileCheckpointStorage. A user who swaps FileCheckpointStorage(...) for CosmosCheckpointStorage(...) with the same allowed_checkpoint_types should observe the same load behavior for the same checkpoint contents.
- Behavior should be consistent across all load code paths on CosmosCheckpointStorage (load, list_checkpoints, get_latest, and any internal helpers that call _document_to_checkpoint).
Suggested implementation
Mirror the File pattern:
- In CosmosCheckpointStorage.__init__ (python/packages/azure-cosmos/agent_framework_azure_cosmos/_checkpoint_storage.py), add a keyword-only allowed_checkpoint_types: list[str] | None = None parameter and store it as self._allowed_types: frozenset[str] = frozenset(allowed_checkpoint_types or []).
- Change _document_to_checkpoint from a @staticmethod to an instance method and pass allowed_types=self._allowed_types into decode_checkpoint_value. Update all call sites accordingly.
- Update the class docstring to describe allowed_checkpoint_types alongside the existing authentication and container setup notes, pointing at the same Learn docs section as FileCheckpointStorage.
Note: the error message raised by _RestrictedUnpickler currently names FileCheckpointStorage.allowed_checkpoint_types as the canonical example. Once this change lands, consider either generalizing that message to reference the user's actual storage class or leaving it as-is and accepting File as the documented example.
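A minimal sketch of the suggested shape follows. The decoder is stubbed out here so the snippet is self-contained, and the class is a stand-in: the real CosmosCheckpointStorage has many more constructor parameters, and decode_checkpoint_value is the framework function referenced above, not this stub.

```python
from __future__ import annotations


def decode_checkpoint_value(encoded: str, *, allowed_types: frozenset[str] | None = None) -> dict:
    """Stub standing in for the real agent_framework decoder; it just echoes
    its inputs so the forwarding behavior is observable."""
    return {"encoded": encoded, "allowed_types": allowed_types}


class SketchCosmosCheckpointStorage:
    """Illustrates the suggested pattern only; not the real class."""

    def __init__(self, *, allowed_checkpoint_types: list[str] | None = None) -> None:
        # Mirror FileCheckpointStorage: always a frozenset, never None, so every
        # load takes the restricted-type path.
        self._allowed_types: frozenset[str] = frozenset(allowed_checkpoint_types or [])

    def _document_to_checkpoint(self, document: dict) -> dict:
        # Instance method (previously a @staticmethod) so it can reach self._allowed_types.
        return decode_checkpoint_value(document["checkpoint"], allowed_types=self._allowed_types)
```

With no allow-list configured, the method still forwards frozenset() rather than omitting the argument, which is exactly what keeps Cosmos on the restricted path by default.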
Tests to add
Under the azure-cosmos package tests, add coverage that parallels the existing File tests in packages/core/tests/workflow/test_checkpoint_unrestricted_pickle.py and test_checkpoint.py:
- Built-in safe set still loads without opt-in. A checkpoint whose state uses only JSON-native types, datetime, uuid, Decimal, common collections, and agent_framework.* types loads cleanly with no allowed_checkpoint_types configured.
- Application types require opt-in. Save a checkpoint whose state contains an application-defined class; loading it without allowed_checkpoint_types raises WorkflowCheckpointException. Passing that type's "module:qualname" via allowed_checkpoint_types makes the load succeed and return an instance of the expected class.
- All load paths are covered. Exercise load, list_checkpoints, and get_latest so a future regression in any single path is caught.
- No regressions in existing Cosmos checkpoint tests.
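When writing these tests, the "module:qualname" key for a class can be derived mechanically rather than hand-typed. The helper name below is made up for illustration; only the key format comes from the proposal above.

```python
def checkpoint_type_key(cls: type) -> str:
    """Build the "module:qualname" string expected by allowed_checkpoint_types."""
    return f"{cls.__module__}:{cls.__qualname__}"
```

For example, checkpoint_type_key applied to datetime.datetime yields "datetime:datetime", matching the format FileCheckpointStorage already documents.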
Docs follow-up
Once this ships, the Learn docs page at agent-framework/workflows/checkpoints.md in MicrosoftDocs/semantic-kernel-pr should be updated so the "Pickle serialization" subsection describes allowed_checkpoint_types as available on both providers. A separate docs PR is already in flight to add CosmosCheckpointStorage to that page; this follow-up can be rolled into a subsequent docs update once the Python change lands.
Acceptance criteria
- CosmosCheckpointStorage.__init__ accepts allowed_checkpoint_types: list[str] | None = None with the same "module:qualname" format as FileCheckpointStorage.
- All CosmosCheckpointStorage load paths forward the stored allowed-types frozenset into decode_checkpoint_value.
- New Cosmos tests cover loads with and without allowed_checkpoint_types across load, list_checkpoints, and get_latest.
- The changelog calls out that users storing application-defined types must pass allowed_checkpoint_types when upgrading.
Migration note
This is a behavior change for existing CosmosCheckpointStorage users who store application-defined types in their checkpoints. After this change, those loads will raise WorkflowCheckpointException until the application passes its types via allowed_checkpoint_types. The exception message from _RestrictedUnpickler tells users exactly which key to add, which should keep the migration straightforward, and the changelog entry should call this out explicitly with a one-liner example.
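One possible shape for that changelog one-liner, shown as an illustrative fragment (the application type name is hypothetical and the elided constructor arguments stand for the existing Cosmos connection parameters):

```python
# Before: application-defined types in checkpoints loaded implicitly via plain pickle.
# After: opt in explicitly, exactly as with FileCheckpointStorage:
#   CosmosCheckpointStorage(..., allowed_checkpoint_types=["myapp.models:OrderState"])
```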