Add history retrieval and terminal instance ID paging APIs#133
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds new gRPC client APIs to retrieve orchestration history and to page through terminal instance IDs, plus in-memory backend support and tests to validate both sync and async client behavior.
Changes:
- Added
get_orchestration_history()sync/async client APIs with chunk aggregation and payload de-externalization. - Added
list_instance_ids()sync/async client APIs returning a typedPagewith continuation tokens. - Implemented in-memory backend support for
StreamInstanceHistoryandListInstanceIds, plus added/updated tests and changelog.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/durabletask/test_orchestration_e2e.py | Adds an end-to-end test for orchestration history retrieval. |
| tests/durabletask/test_client.py | Adds unit tests for history chunk aggregation, payload de-externalization, and instance ID paging (sync + async). |
| tests/durabletask/test_batch_actions.py | Adds an integration-style test for paging terminal instance IDs in the in-memory backend. |
| durabletask/testing/in_memory_backend.py | Implements StreamInstanceHistory and ListInstanceIds, and tracks completed_at for terminal orchestration instances. |
| durabletask/internal/history_helpers.py | Adds shared helpers for collecting streamed history chunks and converting to typed history events. |
| durabletask/history.py | Introduces typed history event dataclasses and protobuf-to-dataclass conversion logic. |
| durabletask/client.py | Adds Page[T], plus sync/async get_orchestration_history() and list_instance_ids() client APIs. |
| CHANGELOG.md | Documents newly added public client APIs and in-memory backend support. |
| .github/copilot-instructions.md | Clarifies changelog guidance for user-visible outcomes vs internal implementation details. |
YunchuWang
left a comment
There was a problem hiding this comment.
Top 3 Risks (by severity × likelihood)
- Timezone Inconsistency — Naive datetimes from Timestamp.ToDatetime() (Medium × Likely = HIGH)
Mechanism: In history.py, _base_kwargs() calls event.timestamp.ToDatetime() without a tzinfo argument, producing naive (timezone-unaware) datetime objects. Similarly, _timestamp_value() and direct .ToDatetime() calls on timer events produce naive datetimes. Meanwhile, the in-memory
backend uses datetime.now(timezone.utc) everywhere. Users comparing timestamps from history events with timestamps from orchestration state or the backend will encounter TypeError: can't compare offset-naive and offset-aware datetimes.
Severity: Medium — doesn't crash the core flow but will surprise users doing timestamp comparisons or serialization.
Likelihood: Likely — this is the default ToDatetime() behavior and the Copilot reviewer already flagged it.
- Potential TypeError when passing None for protobuf repeated field runtimeStatus (Medium × Possible = MEDIUM)
Mechanism: In list_instance_ids(), when runtime_status is None, the expression [status.value for status in runtime_status] if runtime_status else None evaluates to None, which is passed as the runtimeStatus argument to pb.ListInstanceIdsRequest(...). Some protobuf Python
implementations raise TypeError when None is passed for a repeated field. The safe approach is to pass [] or omit the argument entirely.
Severity: Medium — would cause a runtime error on the most common call path (no status filter).
Likelihood: Possible — depends on protobuf library version; may silently accept None in some versions.
- Unbounded response for list_instance_ids with no page_size (Low × Possible = LOW-MEDIUM)
Mechanism: When page_size is None or 0, page_size or 0 sends 0 in the request, and the in-memory backend interprets this as "return all matching" (len(matching)). For large instance sets, this could produce very large responses.
Severity: Low (in-memory backend is for tests, not production).
Likelihood: Possible in integration tests with many orchestrations.
LGTM, some comments from claude
|
Thanks - merged without commenting first but wanted to mention that the above comments were all addressed. |
Summary
Validation