fix starlark goroutine leak in exec() with context.AfterFunc #81
Merged
Every Eval() call spawned a goroutine blocking on <-ctx.Done(). With non-cancellable contexts (e.g., context.Background()), these goroutines leaked unboundedly. Replace with context.AfterFunc which registers a callback and returns a stop function, avoiding the leak.
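The difference between the two patterns can be sketched as follows. This is a minimal illustration, not the evaluator's actual code: `fakeThread`, `execLeaky`, `execWithAfterFunc`, and the two helper functions are hypothetical names, and the callback is wrapped in a closure on the assumption that the thread's `Cancel` method takes a reason string.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync/atomic"
)

// fakeThread is a hypothetical stand-in for the interpreter thread. Only the
// Cancel method matters for this sketch.
type fakeThread struct{ cancelled atomic.Bool }

func (t *fakeThread) Cancel(reason string) { t.cancelled.Store(true) }

// execLeaky mirrors the old pattern: the goroutine exits only when ctx is
// cancelled, so with context.Background() it never exits.
func execLeaky(ctx context.Context, thread *fakeThread, script func()) {
	go func() {
		<-ctx.Done() // nil channel for context.Background(): blocks forever
		thread.Cancel(ctx.Err().Error())
	}()
	script()
}

// execWithAfterFunc registers a callback instead of parking a goroutine.
// The callback only runs (in its own goroutine) once ctx is done.
func execWithAfterFunc(ctx context.Context, thread *fakeThread, script func()) {
	stop := context.AfterFunc(ctx, func() {
		thread.Cancel(ctx.Err().Error())
	})
	defer stop() // unregister the callback once the script has finished
	script()
}

// cancelledWithBackground runs a trivial script under a context that can
// never be cancelled and reports whether Cancel was called.
func cancelledWithBackground() bool {
	th := &fakeThread{}
	execWithAfterFunc(context.Background(), th, func() {})
	return th.cancelled.Load()
}

// cancelledMidScript cancels the context from inside the script and waits
// for the AfterFunc callback to reach the thread.
func cancelledMidScript() bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	th := &fakeThread{}
	execWithAfterFunc(ctx, th, func() {
		cancel()
		for !th.cancelled.Load() {
			runtime.Gosched() // yield until the callback goroutine has run
		}
	})
	return th.cancelled.Load()
}

func main() {
	fmt.Println("background:", cancelledWithBackground()) // false
	fmt.Println("mid-script:", cancelledMidScript())      // true
}
```

`execLeaky` is included only for contrast and is never called, so running the sketch leaves no goroutines behind.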
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Pull request overview
Fixes a goroutine leak in the Starlark evaluator’s exec() cancellation handling by replacing a per-evaluation goroutine waiting on ctx.Done() with context.AfterFunc, and adds a regression test intended to ensure goroutine counts don’t grow across many Eval() calls.
Changes:
- Replace the `go func { <-ctx.Done(); thread.Cancel(...) }` pattern with `context.AfterFunc(...); defer stop()` in `exec()`.
- Add a regression test that performs repeated `Eval()` calls and checks goroutine-count growth stays bounded.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| engines/starlark/evaluator/evaluator.go | Uses context.AfterFunc + defer stop() to prevent unbounded goroutine accumulation when contexts never cancel. |
| engines/starlark/evaluator/evaluator_test.go | Adds a goroutine-leak regression test around repeated Eval() calls. |
drop t.Parallel() since runtime.NumGoroutine() reads a process-wide counter and parallel tests perturb the baseline. switch the loop's context back to context.Background() so ctx.Done() is nil — the exact scenario the bare-goroutine bug couldn't escape.
the previous commit reverted t.Context() back to context.Background() to actually exercise the Done()==nil scenario, but golangci-lint's usetesting rule flagged it. add a targeted nolint with a pointer to the rationale comment immediately above.
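The `Done()==nil` scenario that these two commits refer to can be checked directly. The sketch below uses hypothetical helper names and relies only on the documented behaviour that `Done` may return nil for a context that can never be cancelled. A test context such as `t.Context()` is cancelled when the test finishes, so it would not exercise this case.

```go
package main

import (
	"context"
	"fmt"
)

// backgroundDoneIsNil reports whether context.Background() exposes a nil
// Done channel. A receive from a nil channel blocks forever, which is why a
// goroutine parked on <-ctx.Done() could never exit for such a context.
func backgroundDoneIsNil() bool {
	return context.Background().Done() == nil
}

// cancellableDoneIsNil does the same check for a context that can be
// cancelled, where Done returns a real channel.
func cancellableDoneIsNil() bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	return ctx.Done() == nil
}

func main() {
	fmt.Println(backgroundDoneIsNil())  // true
	fmt.Println(cancellableDoneIsNil()) // false
}
```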
This was referenced May 14, 2026
robbyt added a commit that referenced this pull request May 14, 2026
Closes #125

PR #81 added `context.AfterFunc(ctx, thread.Cancel)` to Starlark's exec() to fix a goroutine leak, but no test asserted that **cancellation actually halts execution within a bounded time**. The same gap existed for Risor: the Risor v2 VM checks ctx.Done() at DefaultContextCheckInterval (1000 instructions) and via a background goroutine, but nothing locked the behavior in. Extism is already covered by the execHelper cancellation cases added in PR #121.

Add TestEval_CancellationHaltsExecution to both Risor and Starlark evaluator test files. The script body is a long-running loop sized so natural completion would far outrun the 2-second test deadline; only ctx cancellation can return Eval early.

- Risor: Risor v2 has no for/while statements (it's a functional language with .map/.filter/.each higher-order methods), so the long-running construct is `range(1e9).each(x => x)`. The VM's periodic ctx-check inside .each's callable.Call propagates cancellation.
- Starlark: lazy `range(1e12)` with a Python-style for-loop. The AfterFunc-registered thread.Cancel halts at the next instruction boundary.

Both tests:

1. Launch Eval in a goroutine.
2. Sleep 50ms so the engine is observably mid-script.
3. Cancel the context.
4. Assert Eval returns within 2s with a cancellation-shaped error (errors.Is(context.Canceled) OR "context canceled" / "cancel" substring; Starlark's thread.Cancel wraps the reason in an EvalError that doesn't unwrap to context.Canceled).

Stress-tested locally with `-count=20 -race`: 40 runs, no flakes, ~2 seconds total wall time.

Out of scope: Eval-halts-on-deadline (context.WithTimeout); same code path as cancel.

Co-authored-by: Claude <noreply@anthropic.com>
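The four test steps in the commit message above can be sketched without either engine. Everything below is an assumption made for illustration: `slowEval` is a hypothetical stand-in for `Eval` that spins until an `AfterFunc` callback sets a flag, and its error deliberately carries the cancellation reason only as text, to mimic an error that does not unwrap to `context.Canceled`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"strings"
	"sync/atomic"
	"time"
)

// slowEval is a hypothetical stand-in for Eval. It loops until the AfterFunc
// callback sets the cancel flag, then returns an error whose text includes
// the reason but which is not wrapped with %w.
func slowEval(ctx context.Context) error {
	var cancelled atomic.Bool
	stop := context.AfterFunc(ctx, func() { cancelled.Store(true) })
	defer stop()
	for !cancelled.Load() {
		time.Sleep(time.Millisecond) // stand-in for executing instructions
	}
	return fmt.Errorf("eval error: %s", ctx.Err().Error())
}

// looksCancelled accepts either a proper unwrap chain or a substring match.
func looksCancelled(err error) bool {
	if err == nil {
		return false
	}
	return errors.Is(err, context.Canceled) ||
		strings.Contains(err.Error(), "cancel")
}

// runCancellationCheck follows the four steps: launch, sleep, cancel, assert.
func runCancellationCheck() error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	done := make(chan error, 1)
	go func() { done <- slowEval(ctx) }() // 1. launch Eval in a goroutine

	time.Sleep(50 * time.Millisecond) // 2. let the script get under way
	cancel()                          // 3. cancel the context

	select { // 4. Eval must return promptly with a cancellation-shaped error
	case err := <-done:
		if !looksCancelled(err) {
			return fmt.Errorf("unexpected error: %v", err)
		}
		return nil
	case <-time.After(2 * time.Second):
		return errors.New("Eval did not return within the deadline")
	}
}

func main() {
	if err := runCancellationCheck(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok")
}
```

The buffered `done` channel lets the evaluating goroutine finish even if the deadline branch wins, so the check itself does not leak a goroutine.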



Summary
Every `Eval()` call on the Starlark evaluator spawned a goroutine that blocked on `<-ctx.Done()` to interrupt the running thread on cancellation. With non-cancellable contexts (e.g. `context.Background()`), those goroutines never exited and accumulated unboundedly.
This swaps the bare goroutine for `context.AfterFunc`, which registers a callback and returns a stop function. We `defer stop()` so the callback is unregistered as soon as the script completes: no leak, regardless of whether the context is cancellable.
Changes
- `engines/starlark/evaluator/evaluator.go`: replace the goroutine in `exec()` with `context.AfterFunc(ctx, thread.Cancel)` plus `defer stop()`.
- `engines/starlark/evaluator/evaluator_test.go`: new `TestEval_NoGoroutineLeak` regression test: runs 100 evaluations with a non-cancellable context and asserts the goroutine count doesn't grow.
Test plan
- `go test -race ./engines/starlark/...` passes locally
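A goroutine-count check along the lines of the regression test described above might look like the sketch below. It is not the test from this PR: `evalLeaky`, `evalFixed`, and `goroutineGrowth` are hypothetical names, the evaluation body is empty, and the short sleep before the second reading is an assumption to let any finished goroutines exit.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// evalLeaky reproduces the old pattern: one goroutine per call, parked on
// ctx.Done(). It intentionally leaks when ctx can never be cancelled.
func evalLeaky(ctx context.Context) {
	go func() {
		<-ctx.Done()
	}()
}

// evalFixed registers a callback with context.AfterFunc and unregisters it
// on return, so no goroutine is left behind.
func evalFixed(ctx context.Context) {
	stop := context.AfterFunc(ctx, func() {})
	defer stop()
}

// goroutineGrowth runs eval n times with a non-cancellable context and
// returns how much the process-wide goroutine count grew.
func goroutineGrowth(n int, eval func(context.Context)) int {
	before := runtime.NumGoroutine()
	for i := 0; i < n; i++ {
		eval(context.Background())
	}
	time.Sleep(50 * time.Millisecond) // give finished goroutines time to exit
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("fixed growth:", goroutineGrowth(100, evalFixed))
	fmt.Println("leaky growth:", goroutineGrowth(100, evalLeaky))
}
```

Because `runtime.NumGoroutine()` is process-wide, a real test built this way would likely need to run without `t.Parallel()` and allow a small tolerance rather than demanding exactly zero growth.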