refactor(adk): improve cancel propagation, encapsulate TurnLoop stop options, add UntilIdleFor#942
Merged
shentongmartin merged 27 commits intoalpha/09from Apr 14, 2026
Merged
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## alpha/09 #942 +/- ##
===========================================
Coverage ? 82.14%
===========================================
Files ? 162
Lines ? 20228
Branches ? 0
===========================================
Hits ? 16616
Misses ? 2438
Partials ? 1174 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
5dd483a to
284d35d
Compare
54f0d53 to
36317f0
Compare
8304d17 to
833e586
Compare
4c6d7e9 to
26b0af2
Compare
…thRecursive - CancelAfterChatModel/CancelAfterToolCalls/CancelImmediate now only affect the root agent by default; descendant agents inside AgentTools are not notified unless WithRecursive() is passed as an AgentCancelOption. - Add recursive int32 + recursiveChan fields to cancelContext for gating deriveChild propagation with support for escalation from non-recursive to recursive mid-flight. - Modify wrapGraphInterruptWithGracePeriod to only apply grace period when recursive mode is active. - Update existing tests that relied on deep propagation to include WithRecursive(). - Add 17 new unit tests covering shallow/recursive behavior, escalation, grandchild propagation, race conditions, and edge cases. Change-Id: Id4c63e59fb9ec40cddb71df50159134cb548ca31
- Stop: replace 'idempotent' with 'may be called multiple times' since subsequent calls update cancel options (not idempotent by definition). - Wait: fix 'ever' to 'never' — Wait blocks forever when Run is never called, not when it is called. Change-Id: I71b7020223e53d0f9ec52a7551b81c02245ec9ec
Change-Id: I297ed416bc2345ec87c224a790270416c0b80742
- Add SafePoint bitmask type (AfterToolCalls, AfterChatModelCall, AnySafePoint) - Replace WithAgentCancel with WithGraceful/WithGracefulTimeout for Stop - Replace generic WithPreempt[T](opts...) with WithPreempt[T](SafePoint) and WithPreemptTimeout[T](SafePoint, Duration) for Push - WithGraceful bundles safe-point + recursive automatically - WithPreemptTimeout adds timeout-based escalation to immediate cancel - Update all doc comments referencing old APIs Change-Id: Id938445bebdcfc3261177af6eec51fbc4ac95ac6
Add guidance that the callback should not propagate CancelError: return non-nil error only for callback-internal failures that should terminate the loop; return nil when the agent is canceled by Stop or Preempt. Change-Id: I71d99bd407b79d7612743ee7e422c9fb3d35ee79
TestTurnLoop_ConcurrentPreemptsDuringTurn uses a mock agent that blocks on ctx.Done(). With WithPreempt(AnySafePoint) the agent waits at a safe point that never fires because the mock has no cancel infrastructure. Switch to WithPreemptTimeout with 1ms so the timeout escalates to immediate cancel and unblocks the agent reliably. Change-Id: I1a21c6cda86a3051bc040e83d2b229aef21559aa
- Rename AfterChatModelCall to AfterChatModel to match CancelAfterChatModel and align with the plural pattern of AfterToolCalls. - Add type-level doc comment to SafePoint explaining bitmask combinability and its relationship to CancelMode. - Document Stop() root-only default: immediate cancel only affects the root agent; point users to WithGraceful() for recursive propagation. - Add validation panic for gracePeriod <= 0 in WithGracefulTimeout, consistent with WithPreempt's zero-check. - Add internal comments: deriveChild two-phase double-select pattern, setRecursive monotonic semantics, wrapGraphInterruptWithGracePeriod shallow-mode guard, buildCancelFunc recursive escalation asymmetry. Change-Id: I82e54c0005b8cedc72f0d4b90a58727278c035cc
- Delete duplicate TestTurnLoop_StopOptionsArePassed (identical to StopWithMode). - Add 7 panic coverage tests: WithGracefulTimeout(<=0), WithPreempt(0), WithPreemptTimeout(0), SafePoint.toCancelMode(), NewTurnLoop nil fields, and deriveChild(nil). - Rewrite cancel_recursive_test.go: add assertNotClosedWithin channel helper replacing 6 time.Sleep(200ms) negative assertions; add setupParentChild helper; restructure 17 flat tests into t.Run groups (Shallow/Recursive/Escalation + Race). - Tighten assertions: GreaterOrEqual to require.GreaterOrEqual for fail-fast, pushCount to Equal(2) where deterministic. - Replace 16 assert.True(t, len(x) > 0) with assert.NotEmpty across cancel_test.go and turn_loop_test.go. - Extract 3 cancel error helpers (assertHasCancelError, drainAndAssertCancelError, drainEventsAndAssertCancelError) applied to 6 sites in cancel_test.go. - Extract newPreemptTestLoop helper applied to 4 preempt tests, removing ~84 lines of scaffold in turn_loop_test.go. Change-Id: Ic7e6a6b3c9efc7c892af7079fb90db625c0369e5
- WithImmediate(): explicitly request immediate cancel (previously the bare Stop() default). Bare Stop() now means turn-boundary exit. - UntilIdleFor(d): deferred stop that fires after the loop has been continuously idle for d. Timer resets on item arrival. Escalatable by subsequent Stop calls with cancel options. - Refactor Stop into commitStop to support deferred-commit semantics. - signal() now guards agentCancelOpts with nil-check for monotonic escalation (bare Stop never overwrites previously-set cancel opts). - Extract tryCancel closure in watchStopSignal to reduce duplication. Change-Id: Ic5c5f23a16ecec74eda9c37480be799f52c1a8ef
Change-Id: Ie5ddcd1e8bf4d02362ae0f6ea96de3a5778d09cf
Change-Id: I83c00a19051de11601b44837f81500ad18483413
Change-Id: I10fa4fdcf56c2367c779971e4cd6c9f5f32b1a1a
…review v4 Change-Id: I49a5049d75e398aa9287aa4f9c3583460db6a178
Change-Id: Ib29775ffc1b41a6aad6bca437db7f36f70c256cf
Change-Id: Ie65fbfa202c41951a7cf1021420b615369c0b1bb
Change-Id: I1b75fde802cdd29d418cb4099d6ebaca53c9d507
Change-Id: I5178edc93c337a3450a8a18184d1efe35afc819f
Change-Id: I018f237745f065420ce70c2ce769bedf7f5652d1
…turn Change-Id: Iaa951a9fe7d0a1a02e6e6b8182488f9b28d5132e
…ePoint Change-Id: Ibeca6f9c8aa994257fa4ebd50802aafe1b3bc2ac
Change-Id: I74400bc84085deb7f205dd1b27cd3e4160965f10
Change-Id: Iad6e40b5af1074e77b003469be76c78dcda2c639
…oundaries Change-Id: I55560a4ec22b9b66711f50f07f53e8023b71ebeb
…e in Stop() doc - 'cancel' → 'abort' for WithImmediate (consistent with earlier rewrite) - 'the stop commits automatically' → 'the loop shuts down automatically' - 'A non-UntilIdleFor call commits the stop immediately' → 'A Stop() call without UntilIdleFor shuts down the loop immediately' - 'still pending' → 'still waiting' Change-Id: Icab4d29abb57a55d17fd8b79ff40e70986f0cc51
…ed doc Same consolidation as Stop() doc — replace internal 'commit' jargon with 'the loop shuts down' in user-facing documentation. Change-Id: I0370380aa7312bfcb7d2b12b5fefdecff6d916c0
TurnLoop's buffer needs PushFront, TakeAll, Wakeup, and ClearWakeup — operations that are not part of a general-purpose unbounded channel. These were bolted onto internal.UnboundedChan, making it do double duty as both a simple MPSC queue (for compose/react) and a turn-execution buffer with priority requeue and idle-wakeup signaling. Extract a dedicated turnBuffer[T] type in the adk package that owns all TurnLoop-specific buffer semantics. Strip UnboundedChan back to its original Send/TrySend/Receive/Close surface. Change-Id: Ic6d0677b4550b98326bf582b608136286bfb61b1
…afety - Add attack_test.go with 12 adversarial tests covering UntilIdleFor concurrency, stop escalation, de-escalation guard, CanceledItems invariant, turnBuffer semantics, and concurrent Stop race conditions. - Upgrade 2 assert.True(errors.As) to require.True in turn_loop_test.go where the extracted *CancelError was immediately dereferenced. Change-Id: I18001c4b92d19bedb26ed005162d4778ee9c909d
f691573 to
db499c0
Compare
meguminnnnnnnnn
approved these changes
Apr 14, 2026
shentongmartin
added a commit
that referenced
this pull request
Apr 14, 2026
…options, add UntilIdleFor (#942)
shentongmartin
added a commit
that referenced
this pull request
Apr 14, 2026
…options, add UntilIdleFor (#942)
shentongmartin
added a commit
that referenced
this pull request
Apr 15, 2026
…options, add UntilIdleFor (#942)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
WithRecursive(): cancel modes only affect the root agent by defaultTurnLoop.StopandTurnLoop.Pushdirectly exposeAgentCancelOptionandCancelMode, leaking lower-level ChatModelAgent concepts into the higher-level TurnLoop APIWithGraceful/WithGracefulTimeoutfor Stop,WithPreempt/WithPreemptTimeoutfor PushFeature 1: Opt-in Recursive Cancel (
WithRecursive)Key Insight
Cancel has two orthogonal dimensions: timing (when to cancel) and scope (where to cancel).
CancelMode:CancelImmediate,CancelAfterChatModel,CancelAfterToolCallsAgentCancelOption:WithRecursive()Conflating scope with timing (e.g., making
CancelRecursivea bitmask bit inCancelMode) leads to problems:CancelImmediate = 0cannot be OR'd with anythingDesign Decisions
Why root-only by default? In
ChatModelAgent → AgentTool → ChatModelAgent, users expectCancelAfterChatModelto fire after the root agent's ChatModel, not an arbitrary inner one. Deep propagation is powerful but surprising for the common case.Monotonic escalation. Once
WithRecursive()is set via any cancel call, it stays set. Escalation is monotonic, never reversed.Mid-flight escalation. A
recursiveChan(closed on firstsetRecursive(true)) enables mid-flight escalation: parkedderiveChildgoroutines wake up and propagate immediately when a later cancel call addsWithRecursive().Grace period gating.
wrapGraphInterruptWithGracePeriodnow checksisRecursive() && hasActiveChildren()— without recursive mode, there are no child interrupts to wait for.Feature 2: TurnLoop Cancel Option Encapsulation
Key Insight
TurnLoop is a higher-level orchestrator; it should not expose ChatModelAgent-specific cancel concepts (
CancelMode,AgentCancelOption) in its public API. Instead, it should present use-case-driven vocabulary:The
SafePointbitmask (AfterToolCalls,AfterChatModelCall,AnySafePoint) maps internally toCancelModebut hides that detail from TurnLoop users.API Mapping
Stop(WithAgentCancel(WithAgentCancelMode(CancelImmediate)))Stop()Stop(WithAgentCancel(WithAgentCancelMode(CancelAfterToolCalls)))Stop(WithGraceful())Stop(WithAgentCancel(..., WithAgentCancelTimeout(d)))Stop(WithGracefulTimeout(d))Push(x, WithPreempt[T]())Push(x, WithPreempt[T](AnySafePoint))Push(x, WithPreempt[T](WithAgentCancelMode(CancelAfterToolCalls)))Push(x, WithPreempt[T](AfterToolCalls))Push(x, WithPreempt[T](WithAgentCancelMode(CancelImmediate)))Push(x, WithPreemptTimeout[T](AnySafePoint, smallDuration))Design Decisions
Two-function pattern over variadic/pointer.
WithGraceful()+WithGracefulTimeout(d)instead ofWithGraceful(d ...time.Duration)orWithGraceful(d *time.Duration). Variadic allows passing multiple values (semantic mismatch);0as a duration conflicts withWithGraphInterruptTimeout(0)meaning "immediate".WithGraceful bundles recursive automatically. Graceful shutdown implies wanting to reach a safe point across the entire agent hierarchy, so
WithRecursive()is included by default. Immediate stop (bareStop()) does not set recursive — it just cancels the context.Preempt does not set recursive. Preemption is a turn-level concern. The user preempts the current turn to process a higher-priority item; recursive propagation is not the default expectation.
摘要
WithRecursive()显式传播TurnLoop.Stop和Push直接暴露AgentCancelOption/CancelMode,将底层 ChatModelAgent 概念泄露到更高层的 TurnLoop APIWithGraceful/WithGracefulTimeout,Push 用WithPreempt/WithPreemptTimeout功能 1:可选递归取消(
WithRecursive)核心洞察
Cancel 有两个正交维度:时机(
CancelMode)和范围(WithRecursive())。将 scope 混入 bitmask 会导致CancelImmediate = 0无法做 OR 运算。分离为独立的AgentCancelOption更清晰。设计要点
CancelAfterChatModel在根 Agent 触发,而非内层。WithRecursive()一旦设置不可回退。recursiveChan中途升级:后续 cancel 调用加入WithRecursive()后,已阻塞的deriveChildgoroutine 立即被唤醒。功能 2:TurnLoop Cancel Option 封装
核心洞察
TurnLoop 是更高层的编排器,不应暴露 ChatModelAgent 特有的 cancel 概念。用面向用例的词汇替代:
WithGraceful/WithGracefulTimeout)vs 立即关停(默认)WithPreempt(SafePoint)),可选超时升级(WithPreemptTimeout)SafePoint位掩码(AfterToolCalls、AfterChatModelCall、AnySafePoint)内部映射为CancelMode,但对 TurnLoop 用户隐藏了这一细节。设计要点
WithGraceful()+WithGracefulTimeout(d)而非变参或指针,避免语义歧义。WithGraceful自动包含递归:优雅关停意味着整个层级都应到达安全点。