refactor(runtime): improve testability and simplify package structure#2554
Merged
dgageot merged 12 commits intodocker:mainfrom Apr 28, 2026
Merged
Conversation
…detector reset These trivial helpers had 0% coverage, leaving the constructor shape of every ResumeRequest variant untested. Add table-driven tests for IsValidResumeType / ValidResumeTypes (round-trip), each ResumeApprove* constructor, ElicitationError.Error formatting, and toolLoopDetector.reset (state cleared, counter restarts at 1). Assisted-By: docker-agent
All time-dependent operations inside the runtime (assistant message CreatedAt, tool result CreatedAt, tool-call latency in runAgentTool, fallback cooldown windows, compaction prompt timestamps) now go through r.now() instead of time.Now(). The new WithClock(func() time.Time) Opt lets tests pin time deterministically and verify cooldown expiry without sleeping. Event.go's AgentContext.Timestamp is intentionally left on time.Now: it is wire-format only and the existing clearTimestamps helper already scrubs it. Includes a clock_test.go that verifies the seam: WithClock(nil) keeps the default, and the fallback-cooldown evict path is now testable without real wall-clock advancement. Assisted-By: docker-agent
Move the default ModelStore behind a lazyModelStore wrapper that calls modelsdev.NewStore only when GetModel or GetDatabase are first invoked. NewLocalRuntime no longer performs disk I/O (os.UserHomeDir + os.MkdirAll for ~/.cagent) at construction time and no longer fails when the home-dir lookup fails. Tests that don't touch the catalog can now construct a runtime cheaply, and tests that do can still inject a stub via WithModelStore — that path bypasses the lazy default entirely. Assisted-By: docker-agent
…bility Define a small Telemetry interface in pkg/runtime that captures the five observability events the runtime emits (session start/end, error, tool call, token usage). Default to a defaultTelemetry struct that forwards to the package-level pkg/telemetry helpers (preserving the existing context-attached-client behaviour). Replace every direct telemetry.RecordX call site (loop.go x4, tool_dispatch.go, streaming.go) with r.telemetry.RecordX. handleStream stays a free function — its design contract is to be runtime-state-free — but now takes the Telemetry as an explicit parameter; fallback.go forwards r.telemetry when calling it. Tests can now inject a recordingTelemetry via WithTelemetry to assert lifecycle ordering, agent/session correlation, and token-usage totals without standing up an OTel client. Assisted-By: docker-agent
…flow cap Pull every event-channel buffer literal (loop.go, client.go, persistent_runtime.go, remote_runtime.go) and the emitToolsChanged 5-second timeout into named constants in defaults.go so they can be reasoned about in one place. Promote the inline maxOverflowCompactions=1 constant into a per-runtime field (LocalRuntime.maxOverflowCompactions) defaulted via defaultMaxOverflowCompactions. New WithMaxOverflowCompactions(n) Opt lets tests exercise both the "compaction succeeded" and "compaction exhausted" branches without driving many real overflows; negative values clamp to 0. Includes overflow_compaction_test.go covering the cap=0, default-cap, and negative-clamp cases. Assisted-By: docker-agent
…send race contract The two raw fields elicitationEventsChannel + elicitationEventsChannelMux on LocalRuntime carried a non-trivial concurrency contract: the swap-back at finalizeEventChannel must not race with close(events), and a send must hold the read lock to keep the channel reference open until the send returns. That contract was implicit and hard to reason about while diff'ing the surrounding 250-line RunStream goroutine. Move the channel + mutex + swap/send into a dedicated elicitationBridge type. LocalRuntime now exposes a single elicitation field, the surrounding methods are simple delegations, and the contract has its own race-detector test that drives the inflight-send-vs-swap interleave directly without spinning up a runtime, agent, or MCP toolset. Note: an earlier plan to push the channel into context.WithValue and drop the mutex would have been a regression — the MCP elicitation handler can be invoked with a context that did not flow from RunStream (OAuth path), so a per-stream context value would not reach every callsite, and dropping the mutex would re-introduce the close-during-send race. Assisted-By: docker-agent
… from RunStream Two of the largest, most-branchy regions of the RunStream goroutine were unreachable from unit tests because they were anonymous code blocks inside a 250-line closure: the max-iterations enforcement / resume-or-reject flow (~60 lines) and the model-error recovery path that decides between auto-compaction retry and fatal exit (~30 lines). Move both into named methods on *LocalRuntime in a new loop_steps.go: enforceMaxIterations returns (newMax, decision) and handleStreamError returns a streamErrorOutcome. The run loop becomes a small switch-on-outcome at each callsite, and each branch (below cap / non-interactive auto-stop / interactive approve / interactive reject / context cancel for max-iter; canceled / overflow-retry / overflow-exhausted / generic-fatal for stream errors) gets its own focused test. Net effect on loop.go: -81 lines, +191 lines of testable helpers, +9 unit tests covering branches that previously needed full end-to-end runs to exercise. Assisted-By: docker-agent
The fallback cooldown logic was scattered across three methods on *LocalRuntime (getCooldownState/setCooldownState/clearCooldownState), each repeating the same map+mutex+slog ceremony, plus the eviction-on-read invariant lived inline in getCooldownState. Reasoning about that invariant required reading 60 lines of fallback.go and was untestable without spinning up an agent + team + runtime. Move the map, mutex, eviction logic, and slog emissions into a small cooldownManager type with Get/Set/Clear and an injectable clock. The three runtime methods become one-line delegations. The clock is wired through r.now so WithClock(fakeClock) makes the eviction window deterministic and the new test exercises the contract directly under -race without going through tryModelWithFallback. Net effect on fallback.go: -50 lines, +50 lines of standalone testable type, +8 unit tests covering Set/Get/Clear, eviction, replacement, isolation, nil-clock default, and concurrent safety. Assisted-By: docker-agent
…the testability series Five methods on *LocalRuntime (CurrentAgentName, setCurrentAgent, SetCurrentAgent, CurrentAgent, resolveSessionAgent) all touched the same two raw fields (currentAgent string + currentAgentMu sync.RWMutex) and reimplemented the same patterns: name lookup, mutex-guarded write, validated write, team.Agent(...) resolve. Same story as cooldownManager — a small mutable subdomain hidden inside a 1100-line struct. Move it to a dedicated agentRouter (Name/Set/SetValidated/Current/ResolveSession). LocalRuntime keeps single-line wrappers for the existing callsites; the router can be unit-tested directly with a team of two agents and a session, exercising the session-pinned-vs-current fallback, the unknown-pin fallback to current, the validated-set rejection of unknown names, and concurrent safety under -race. Drive-by lint cleanup picked up while finishing the testability series: - gci formatting on runtime.go and telemetry_test.go (struct field alignment) - testifylint require-error in lazy_model_store_test.go - modernize WaitGroup.Go in elicitation_bridge_test.go Assisted-By: docker-agent
…build Several thin wrappers added during the testability series can go now that the underlying types are stable: - LocalRuntime.swapElicitationEventsChannel was a one-line delegation to elicitationBridge.swap with two callsites in loop.go. Replace both callsites with the direct call. - LocalRuntime.setCooldownState / clearCooldownState were one-line delegations with single callsites in fallback.go. Inline them. getCooldownState had three callsites; inline those too — r.cooldowns.Get(...) reads the same as before. - Move fallbackCooldownState from fallback.go to cooldown_manager.go where the manager that owns it lives. The struct is purely internal to the cooldown subsystem. NewLocalRuntime built the cooldown manager twice: once before opts and again after, just in case WithClock changed r.now. Rebuild once after opts (no opt depends on r.cooldowns existing during construction). One initialisation, no "rebuild now that opts have run" comment to puzzle over. enforceMaxIterations had two identical addAgentMessage(stop msg) blocks (non-interactive auto-stop + interactive reject). Pull the formatting and the assistant-message append into a tiny local helper. Assisted-By: docker-agent
…method
RunStream's anonymous "go func() { ~250 lines }()" closure was the kind of thing that's hard to navigate in editors, hard to identify in stack traces, and hard to read because everything inside was indented two extra tab stops.
Move the body verbatim into a runStreamLoop method on *LocalRuntime. RunStream becomes a 4-line dispatcher (slog.Debug, channel make, go r.runStreamLoop, return). The body's logic is unchanged \u2014 every line is identical, just dedented by one tab \u2014 but it now has a real symbol name and a focused doc comment.
Net diff is mostly reindentation; +3 / -3 net lines of meaningful change.
Assisted-By: docker-agent
…iles runtime.go was a 1081-line catch-all that mixed five distinct concerns: the LocalRuntime struct + Opts + constructor, current-agent helpers, MCP prompts, session/permissions/lifecycle plumbing, the Resume API surface, and the elicitation API + handler. Splitting the elicitation and resume concerns into their own files removes \u003c100 lines from runtime.go and groups related logic by topic. Move ElicitationResult / ElicitationError / ElicitationRequestHandler / ResumeElicitation method / elicitationHandler method, plus the elicitationBridge type from elicitation_bridge.go, into a single elicitation.go. Everything elicitation-related now lives in one ~140-line file. Move ResumeType + constants + ResumeRequest + ResumeApprove* helpers + IsValidResumeType / ValidResumeTypes (previously split across runtime.go and resume_validation.go) into resume.go. Doc comments expanded to make the type act as an entry point for readers. Test files renamed correspondingly: resume_validation_test.go \u2192 resume_test.go, elicitation_bridge_test.go \u2192 elicitation_test.go (TestElicitationError_Error moved over too). Net effect: runtime.go drops from 1081 to 963 lines; the package gains two well-named topical files; no behaviour changes. Assisted-By: docker-agent
404001f to
38075cd
Compare
trungutt
approved these changes
Apr 28, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Improve the testability of
pkg/runtimethrough targeted dependency injection, extraction of inline subdomains into focused types, and a final pass of cleanup/reorg. No public API changes; no observable behaviour changes.Coverage of
pkg/runtimerises from 53.6% → 56.5%; the package gains 17 new files (mostly tests) andruntime.goshrinks from 1081 to 963 lines.Why
Several patterns in the runtime forced tests to either skip whole code paths or do invasive things to reach them:
time.Now()was hardcoded in 8 places (timestamps, fallback cooldowns, tool-call latency); tests had to scrub timestamps with reflection or sleep on real wall-clock.NewLocalRuntimedid disk I/O viamodelsdev.NewStoreand could fail at construction on hosts whereos.UserHomeDiris unhappy.128event-channel buffer,maxOverflowCompactions = 1,5*time.Second) lived inline; tests couldn't drive both branches of the overflow-recovery path without 1+ real overflows.RunStreamgoroutine was a 250-line anonymous closure mixing iteration limits, fallbacks, hooks, compaction, tool dispatch, steer/follow-up draining — none of it independently testable.LocalRuntimewith concurrency contracts that could only be reasoned about by reading the surrounding methods.model_switcher_test.gowas the smoking gun: 5 tests bypassedNewLocalRuntimeentirely with&LocalRuntime{...}struct literals because the constructor was too coupled.What
12 focused commits, each independently reviewable:
test(runtime): cover resume helpers, ElicitationError, and tool loop detector resetrefactor(runtime): inject clock via WithClock for deterministic teststime.Now()→r.now(); newWithClockOpt; cooldown-eviction now testable without sleepingrefactor(runtime): defer modelsdev.NewStore until first uselazyModelStore; constructor no longer hits diskrefactor(runtime): introduce Telemetry interface for testable observabilityTelemetryinterface +WithTelemetryOpt; default forwards topkg/telemetryrefactor(runtime): name magic numbers and let tests override the overflow capdefaults.goconstants +WithMaxOverflowCompactionsOptrefactor(runtime): extract elicitationBridge to encapsulate the swap+send race contractrefactor(runtime): extract enforceMaxIterations and handleStreamError from RunStreamrefactor(runtime): extract cooldownManager from fallback.gorefactor(runtime): extract agentRouterrefactor(runtime): drop pass-through wrappers and constructor double-buildrefactor(runtime): hoist RunStream goroutine body into runStreamLoop methodgo func(){...}→ named method, real symbol in stack tracesrefactor(runtime): regroup elicitation and resume code into focused filesresume.go+elicitation.go; runtime.go drops to 963 linesNew testability seams
WithClock(func() time.Time)WithTelemetry(Telemetry)WithMaxOverflowCompactions(n)lazyModelStoreelicitationBridge,cooldownManager,agentRouterenforceMaxIterations,handleStreamErrorAll concurrency primitives have race-detector tests that drive the contract directly (e.g.
TestElicitationBridge_SwapWaitsForInflightSendersexercises the swap-vs-inflight-send interleave that the previous raw mutex+field carried implicitly).Validation
go build ./...✅go test ./pkg/runtime/... -race -count=1✅mise lint✅ (golangci-lint, custom lint,go mod tidy)mise test✅ (full module, exit 0)pkg/runtime: 53.6% → 56.5%Notes for reviewers
go build ./...confirms no callers outsidepkg/runtimewere affected.pkg/telemetry/context.go(argument order swap in the wrapper, cancelled by an inverse swap at the runtime callsite) was discovered during commit 4 and explicitly preserved to keep this PR's scope tight. Worth a separate PR.event.go'stime.Now()forAgentContext.Timestampis intentionally untouched — wire-format only, scrubbed by the existingclearTimestampstest helper.Assisted-By: docker-agent