feat(adk): improve TurnLoop stop cleanup and add StopOption controls#925
Merged
meguminnnnnnnnn merged 3 commits intoalpha/09from Apr 8, 2026
Merged
feat(adk): improve TurnLoop stop cleanup and add StopOption controls#925meguminnnnnnnnn merged 3 commits intoalpha/09from
meguminnnnnnnnn merged 3 commits intoalpha/09from
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## alpha/09 #925 +/- ##
===========================================
Coverage ? 81.72%
===========================================
Files ? 158
Lines ? 19024
Branches ? 0
===========================================
Hits ? 15547
Misses ? 2345
Partials ? 1132 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
886a131 to
165eb23
Compare
Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b
Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6
Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
e1b3d66 to
77b6d6d
Compare
meguminnnnnnnnn
approved these changes
Apr 8, 2026
shentongmartin
added a commit
that referenced
this pull request
Apr 8, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
shentongmartin
added a commit
that referenced
this pull request
Apr 8, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
mrh997
pushed a commit
that referenced
this pull request
Apr 10, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
shentongmartin
added a commit
that referenced
this pull request
Apr 13, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
shentongmartin
added a commit
that referenced
this pull request
Apr 14, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
N3kox
pushed a commit
that referenced
this pull request
Apr 14, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
shentongmartin
added a commit
that referenced
this pull request
Apr 14, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
shentongmartin
added a commit
that referenced
this pull request
Apr 15, 2026
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
TurnLoop stop cleanup and explicit StopOption controls
Problem
TurnLoophad several cleanup and control gaps afterStop().Cleanup blind spots (existing):
First, pushes that arrived after the loop was already stopped only returned
false; the caller had no structured way to recover those items later.Second, checkpoint persistence status was folded back into general exit handling. That made it hard to answer two different questions independently:
The stale-checkpoint cleanup path also relied on an undocumented fallback behavior for stores without
CheckPointDeleter, which made the ownership of checkpoint lifecycle unclear.Missing stop controls (new):
Third, callers had no way to tell the TurnLoop "do not persist a checkpoint for this stop" — useful when the business knows it will not resume in the future. Every Stop with pending state would produce a checkpoint write, even when the data would never be consumed.
Fourth, there was no way to attach a business-level reason to Stop. The existing
ExitReasoncarries the technical error (context cancellation,*CancelError, callback error), but callers could not distinguish why the stop was requested at the business level.Solution
Part 1: Late-stop recovery and checkpoint observability (existing commits)
TurnLoopExitStatenow carries:Checkpointedto indicate whether cleanup attempted a checkpoint saveCheckpointErrto report the save result separately fromExitReasonTakeLateItems()to retrieve inputs that were pushed after stop and therefore never entered the normal unhandled-item pathOn the write path, failed post-stop
Push()calls append to a dedicated late-item buffer. On the cleanup path, checkpoint saves are gated to genuine Stop-caused exits, so concurrent callback errors do not produce resumable state from an inconsistent turn.The checkpoint deletion contract is also made explicit: stores that want automatic stale-checkpoint cleanup must implement
CheckPointDeleter; otherwise lifecycle management belongs to the store owner.Part 2: Explicit StopOption controls (new commit)
Two new
StopOptionvariants forTurnLoop.Stop():WithSkipCheckpoint()— tells the TurnLoop not to persist a checkpoint on this stop. The flag is sticky: once anyStop()call sets it, subsequent escalation calls cannot undo it. When set, theshouldSaveCheckpointcondition incleanup()evaluates to false, and any stale checkpoint from a prior run is deleted.WithStopCause(cause string)— attaches a business-supplied reason string. The cause is surfaced in two places:TurnLoopExitState.StopCause— available afterWait()returnsTurnContext.StopCause()— available insideOnAgentEventsafter<-tc.StoppedclosesUses first-non-empty-wins semantics: the first
Stop()call with a cause sets it; subsequent escalation calls do not overwrite.Both fields are threaded through
stopSignalwith proper mutex protection.StopCauseonTurnContextis afunc() string(closure overstopSignal.getStopCause) rather than a plain string, becauseTurnContextis constructed beforeStop()may be called. The caller reads it after<-tc.Stopped, at which point the value is guaranteed to be set.Decisions
ExitReasonso loop termination and checkpoint durability are observable as independent outcomes.UnhandledItems.CheckPointDeletersemantics explicit instead of preserving an implicit nil-write deletion convention.skipCheckpointsticky across escalation calls — the intent "don't persist" should not be undone by a subsequentStop()that only escalates the cancel mode.stopCause— the business sets the cause once; escalation calls are about cancel mechanics, not about changing the reason.StopCauseonTurnContextas a function to avoid races — the context is constructed at turn start but the cause may arrive mid-turn.Key Insight
TurnLoop has three distinct classes of leftover data at shutdown, and they need different ownership:
UnhandledItems: accepted by the loop but never consumedCanceledItems: tied to the interrupted turn and resumable checkpoint stateAdditionally, the reason for stopping is a separate dimension from the mechanism of stopping.
ExitReasoncaptures the technical mechanism (*CancelError, context cancellation), whileStopCausecaptures the business intent ("user session timeout", "quota exceeded"). Keeping these separate makes stop/resume behavior predictable and debuggable.Summary
TakeLateItems()CheckpointedandCheckpointErralongsideExitReasonCheckPointDeleteras the explicit cleanup mechanismWithSkipCheckpoint()StopOption, sticky across escalationWithStopCause()StopOption, surfaced inTurnLoopExitState.StopCauseandTurnContext.StopCause()TurnLoop 停止清理及显式 StopOption 控制
问题
TurnLoop在Stop()之后有若干清理和控制方面的缺陷。清理盲点(已有):
第一,loop 已经停止后又到来的
Push()虽然会返回false,但调用方没有结构化方式在之后取回这些输入。第二,检查点持久化状态会和通用退出结果混在一起,导致两个问题难以独立判断:loop 是否正常退出?是否尝试保存检查点?保存是否成功?
另外,旧检查点清理路径此前依赖 store 在未实现
CheckPointDeleter时的隐式兜底行为,这使检查点生命周期的责任边界不清晰。缺少停止控制(新增):
第三,调用方无法告诉 TurnLoop "这次 Stop 不要持久化检查点"——当业务侧确定以后不会 Resume 时,这很常见。此前每次 Stop 只要有待处理状态都会产生一次检查点写入,即使数据永远不会被消费。
第四,无法在 Stop 时附带业务层面的停止原因。已有的
ExitReason承载的是技术层面的错误(context 取消、*CancelError、callback error),但调用方无法区分为什么在业务层面要求停止。方案
第一部分:晚到输入恢复和检查点可观测性(已有 commit)
TurnLoopExitState现在新增:Checkpointed:表示 cleanup 是否尝试保存检查点CheckpointErr:单独表示检查点保存结果,而不是塞进ExitReasonTakeLateItems():取回 Stop 之后才被 push、因此没有进入正常未处理路径的输入在写入路径上,Stop 之后失败的
Push()会进入独立的晚到输入缓冲区。在 cleanup 路径上,检查点保存仅允许发生在真正由 Stop 导致的退出场景。检查点删除契约也被明确化:如果 store 需要自动清理过期检查点,就必须实现
CheckPointDeleter。第二部分:显式 StopOption 控制(新 commit)
为
TurnLoop.Stop()新增两个StopOption:WithSkipCheckpoint()— 告知 TurnLoop 本次 Stop 不要持久化检查点。标志位是粘性的:一旦任何Stop()调用设置了它,后续升级调用无法撤消。设置后cleanup()中的shouldSaveCheckpoint条件为 false,并且会清理之前加载的旧检查点。WithStopCause(cause string)— 附带一个业务侧提供的原因字符串。原因在两个地方可获取:TurnLoopExitState.StopCause— 在Wait()返回后可用TurnContext.StopCause()— 在OnAgentEvents回调中,<-tc.Stopped关闭后可用使用"首个非空值胜出"语义:第一个带 cause 的
Stop()设置值;后续升级调用不会覆盖。两个字段都通过
stopSignal传递,有正确的 mutex 保护。TurnContext上的StopCause是func() string(闭包引用stopSignal.getStopCause),因为TurnContext在 turn 开始时就构造了,而Stop()可能在 turn 中途才到来。调用方在<-tc.Stopped之后读取,此时值已保证就绪。决策
ExitReason分离,使 loop 退出结果和检查点耐久性成为两个独立可观测结果。UnhandledItems。CheckPointDeleter语义,而不是继续依赖隐式的 nil 写入删除约定。skipCheckpoint在升级调用中保持粘性 —— "不要持久化" 的意图不应被后续仅升级取消模式的Stop()撤消。stopCause使用首个非空值胜出 —— 业务侧设置一次原因;升级调用关心的是取消机制,不是改变原因。TurnContext上以函数方式暴露StopCause,避免竞态 —— context 在 turn 开始时构造,但 cause 可能在 turn 中途才到达。关键洞察
TurnLoop 在停止时会留下三类语义不同的数据,必须分别归属:
UnhandledItems:已经被 loop 接收,但还没被消费CanceledItems:属于被中断 turn 的可恢复状态此外,停止的原因和停止的机制是两个独立维度。
ExitReason捕捉的是技术机制(*CancelError、context 取消),而StopCause捕捉的是业务意图("用户会话超时"、"配额用尽")。将二者分开,使 Stop / Resume 语义可预测且可调试。总结
TakeLateItems()暴露ExitReason之外新增Checkpointed和CheckpointErrCheckPointDeleter文档化为显式清理机制WithSkipCheckpoint()StopOption,在升级调用中保持粘性WithStopCause()StopOption,在TurnLoopExitState.StopCause和TurnContext.StopCause()中暴露