fix(adk): harden checkpoint resume compatibility #896
Merged
shentongmartin merged 14 commits into main on Mar 18, 2026
Change-Id: Ia874fd20a27d19f9b0f6cbb078bcffcc7bfb1a33
- Clarify and harden CMA checkpoint byte migration
- Stabilize State gob names and remove internals map
- Add v0.8.3 checkpoint fixture and resume test
Change-Id: If335d36232a8bf8dc0011a4549c574032b13b4df
Change-Id: Id5ee4f19fb6801f2ef64ec6b6774b02be50ffe82
Change-Id: Ieb91a1f15f9093f85038bc5c4ff6f8932b19f0f2
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #896      +/-   ##
==========================================
+ Coverage   81.86%   81.92%   +0.06%
==========================================
  Files         146      146
  Lines       16035    16050      +15
==========================================
+ Hits        13127    13149      +22
  Misses       1963     1963
+ Partials      945      938       -7
Change-Id: Ib28262a51aa10c15871979067496c92684fafb97
Change-Id: Iadc660e90e8b28bebb9354d000ce26e77270f83e
- Generate checkpoint_data_v0.8.4.bin using current interrupt format
- Add resume test for v0.8.4 fixture
Change-Id: Iea6dd9377f936506cb24044212c23d3ebb821fee
Change-Id: I700b6e82bbdecd914662585f876ed8628e96048b
06c69a1 to 5293ed2
- Fix stale comments and correct version ranges
- Propagate checkpoint migration errors
- Make deep compat tests table-driven
Change-Id: I8c47abc8ce486ffc163ee78df37131af9790709d
5293ed2 to 72c3828
Change-Id: I1e38adc115e00b27161eedf4bf2bdd188e5dc047
Change-Id: Idff42e9910397495ca65f79daf6068fb8cb605b5
Change-Id: I4408fddb062df6ff5f87293af0b49b70d7fdf787
hi-pender reviewed four times on Mar 18, 2026
b22119e to 4b176f8
hi-pender reviewed Mar 18, 2026
- Make State a plain gob struct (remove GobEncode/GobDecode)
- Drop stateV07 and keep stateV080 for v0.8.0-v0.8.3
- Update v0.8.4 fixture
Change-Id: I34ff4f4e40b55a8afd4dfab75b9a11f6f0da8207
4b176f8 to 887a4a2
Change-Id: I0fb39f23be00167f09303f807ed58ca6722ff00d
hi-pender approved these changes Mar 18, 2026
shentongmartin added a commit that referenced this pull request Mar 18, 2026
Fix ADK checkpoint resume across v0.7 and v0.8.0–v0.8.3
Problem
Resuming from old ADK checkpoints can fail even when the logical adk.State is compatible.
This is caused by Go gob decoding rules:
- The state is stored as any/interface{}, so the stream carries a concrete type name.
- A struct wire and a GobEncoder payload are incompatible even if the type name is identical.
Historically, _eino_adk_react_state was reused with two different wire kinds:
- v0.7: struct wire
- v0.8.0–v0.8.3: GobEncode payload (opaque bytes)
Solution
- Make State a plain gob struct (remove GobEncode/GobDecode) and register it under _eino_adk_react_state.
- Decode v0.7 checkpoints directly into *State (gob ignores fields that did not exist).
- Map v0.8.0–v0.8.3 checkpoint bytes to a distinct legacy name (stateGobNameV080) so gob routes them to a GobDecode-compatible type (stateV080).
- Migrate stateV080 to the current *State.
Key Insight
If a type name is reused across versions but its wire kind changes, old bytes must be routed to a different local decoder. The safest options are:
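For illustration only, the routing idea can be sketched in a small standalone Go program. This is not the PR's code: the wire names `demo_adk_state` and `demo_adk_st080`, the types `State` and `legacyState`, and the helpers `encodeAny`, `decodeAny`, `oldCheckpoint`, and `resume` are all hypothetical. The sketch assumes the two names have equal length, so a plain byte replace keeps gob's length prefixes valid, and it assumes a retry-on-error strategy; a real migration would have to handle names of different length and might detect the version instead.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Hypothetical wire names for this sketch (not the real ADK values). They are
// deliberately the same length so a byte replace keeps length prefixes valid.
const (
	currentName = "demo_adk_state" // name reused across versions
	legacyName  = "demo_adk_st080" // distinct name for the old wire kind
)

// State stands in for the current state: a plain gob struct.
type State struct{ Messages []string }

// legacyState stands in for an older state that used GobEncode/GobDecode.
type legacyState struct{ Messages []string }

func (s legacyState) GobEncode() ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(s.Messages)
	return buf.Bytes(), err
}

func (s *legacyState) GobDecode(b []byte) error {
	return gob.NewDecoder(bytes.NewReader(b)).Decode(&s.Messages)
}

func init() {
	gob.RegisterName(currentName, &State{})
	gob.RegisterName(legacyName, &legacyState{})
}

// encodeAny encodes v behind an interface, so the stream carries v's
// registered name.
func encodeAny(v any) ([]byte, error) {
	var buf bytes.Buffer
	err := gob.NewEncoder(&buf).Encode(&v)
	return buf.Bytes(), err
}

// decodeAny decodes an interface value; gob picks the local type by name.
func decodeAny(b []byte) (any, error) {
	var v any
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&v)
	return v, err
}

// oldCheckpoint fakes bytes from an older build in which currentName referred
// to the GobEncoder-based type.
func oldCheckpoint(msgs ...string) ([]byte, error) {
	b, err := encodeAny(&legacyState{Messages: msgs})
	if err != nil {
		return nil, err
	}
	return bytes.Replace(b, []byte(legacyName), []byte(currentName), 1), nil
}

// resume tries the current wire kind first, then retries with the name
// rewritten so gob routes the bytes to the legacy decoder.
func resume(b []byte) (*State, error) {
	v, err := decodeAny(b)
	if err != nil {
		migrated := bytes.Replace(b, []byte(currentName), []byte(legacyName), 1)
		if v, err = decodeAny(migrated); err != nil {
			return nil, fmt.Errorf("decode migrated checkpoint: %w", err)
		}
	}
	switch s := v.(type) {
	case *State:
		return s, nil
	case *legacyState:
		return &State{Messages: s.Messages}, nil
	default:
		return nil, fmt.Errorf("unexpected checkpoint state type %T", v)
	}
}

func main() {
	old, err := oldCheckpoint("hello", "world")
	if err != nil {
		panic(err)
	}
	if _, err := decodeAny(old); err != nil {
		fmt.Println("direct decode fails:", err)
	}
	st, err := resume(old)
	if err != nil {
		panic(err)
	}
	fmt.Println(st.Messages)
}
```

The direct decode fails because the name resolves to a plain struct locally while the bytes carry a GobEncoder payload. After the name is rewritten, the same bytes reach a type with a matching GobDecode method and can be converted to the current struct.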
Summary