
workflow: Create run pool reduce alloc #163

Merged

andrewwormald merged 5 commits into main from andreww-create-run-pool-reduce-alloc on Dec 11, 2025

Conversation


andrewwormald (Collaborator) commented Dec 11, 2025

| Benchmark Case | Before (ns/op) | After (ns/op) | Faster Run | Improvement |
| --- | --- | --- | --- | --- |
| Workflow/5 | 1,251,504,917 | 1,002,638,042 | After | 19.9% |
| Workflow/10 | 2,501,506,750 | 1,753,387,000 | After | 29.9% |
| Workflow/100 | 19,882,173,167 | 16,414,042,084 | After | 17.5% |


coderabbitai bot commented Dec 11, 2025

Walkthrough

The changes add a Run object pooling mechanism: a new runPool *sync.Pool field on Workflow[Type, Status], a newRunPool constructor, newRunObj() and releaseRun() methods, and two helper types, runCollector and runReleaser. buildRun now obtains pooled Runs via a collector, and callers across the callback, step, timeout, testing, and waiting logic are updated to use w.newRunObj() and to defer w.releaseRun(run) after creation. Tests and benchmarks are adapted to initialise the pool, and the step consumer signatures are extended to accept the run collector and releaser.
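
For orientation, a minimal sketch of the pattern the walkthrough describes. Run and Workflow are reduced to stand-ins here (the real Run embeds TypedRecord[Type, Status] and holds a RunStateController), so treat this as illustrative rather than the PR's exact code:

```go
package workflow

import "sync"

// Simplified stand-ins for the real types.
type Run[Type any, Status any] struct {
	Object     *Type
	controller any
}

type Workflow[Type any, Status any] struct {
	runPool *sync.Pool
}

// newRunPool gives the pool a New function so Get never returns nil.
func newRunPool[Type any, Status any]() *sync.Pool {
	return &sync.Pool{
		New: func() any { return &Run[Type, Status]{} },
	}
}

// newRunObj returns a collector closure that fetches a pooled Run.
func (w *Workflow[Type, Status]) newRunObj() func() *Run[Type, Status] {
	return func() *Run[Type, Status] {
		return w.runPool.Get().(*Run[Type, Status])
	}
}

// releaseRun clears references so nothing stays reachable from the pool,
// then returns the Run for reuse.
func (w *Workflow[Type, Status]) releaseRun(run *Run[Type, Status]) {
	run.Object = nil
	run.controller = nil
	w.runPool.Put(run)
}
```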

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas needing extra attention:

  • All updated buildRun call sites and new parameter ordering (collector first).
  • Correct placement of defer w.releaseRun(run) to ensure release on all paths.
  • releaseRun clearing of Run fields (Object, controller) to avoid stale references.
  • Concurrency considerations for sync.Pool lifecycle within Workflow.
  • Updated stepConsumer signature and all test call-site adjustments.
  • Benchmark changes in workflow_test.go affecting expected final status.

Suggested reviewers

  • echarrod
  • ScaleneZA

Poem

🐰
I hop to fetch a Run with care,
I tidy tails and comb its hair.
Borrowed, used, then set it free —
A little pool keeps springs for me.
Hooray for lighter memory!

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 6.25%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.

✅ Passed checks (2 passed)
  • Title check ✅: The title directly describes the main change, introducing a run pool to reduce allocations, which is the core purpose of this PR.
  • Description check ✅: The description provides benchmark metrics demonstrating the practical impact of the changes, showing improvements of 17.5% to 29.9% across the test cases.

coderabbitai bot requested review from ScaleneZA and echarrod on December 11, 2025 at 21:23

coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
run.go (1)

18-19: Unused reset() method.

This method is defined but never called anywhere in the pooling lifecycle. Either implement it with the field-clearing logic (and call it from releaseRun), or remove it to avoid dead code.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ad7a9c0 and 1c24fdd.

📒 Files selected for processing (12)
  • builder.go (2 hunks)
  • callback.go (1 hunks)
  • callback_internal_test.go (2 hunks)
  • run.go (3 hunks)
  • schedule_test.go (1 hunks)
  • step.go (3 hunks)
  • step_internal_test.go (4 hunks)
  • testing.go (1 hunks)
  • timeout.go (2 hunks)
  • timeout_internal_test.go (2 hunks)
  • workflow.go (1 hunks)
  • workflow_test.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
workflow_test.go (2)
run.go (1)
  • Run (10-16)
testing.go (1)
  • Require (90-135)
callback_internal_test.go (1)
run.go (1)
  • Run (10-16)
step_internal_test.go (1)
run.go (1)
  • Run (10-16)
timeout_internal_test.go (1)
run.go (1)
  • Run (10-16)
run.go (4)
status.go (1)
  • StatusType (3-7)
record.go (2)
  • Record (7-45)
  • TypedRecord (80-84)
runstate.go (1)
  • NewRunStateController (90-95)
workflow.go (1)
  • Workflow (59-105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build
  • GitHub Check: core (1)
  • GitHub Check: core (1.24)
🔇 Additional comments (17)
schedule_test.go (1)

285-285: Inconsistent sleep duration compared to similar tests.

This sleep was increased from 10ms to 1 second, but other scheduling tests in this file (lines 60, 71, 83, 188, 200, 217, 232, 340, 393, 400) use 10ms sleeps for the same purpose. Either this change is masking a timing issue introduced by the pooling changes, or it makes the test unnecessarily slow.

Consider investigating whether there's an underlying race condition, or revert to 10ms if the longer sleep was added as a quick fix.
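
If the longer sleep is papering over a timing issue, one option is to poll for the expected state instead of sleeping a fixed duration. A sketch using testify's require.Eventually, where recordCount is a hypothetical helper standing in for whatever state the schedule test asserts on:

```go
package workflow_test

import (
	"testing"
	"time"

	"github.com/stretchr/testify/require"
)

// waitForScheduledRuns polls until the condition holds, failing the test
// if it does not hold within 5 seconds. Replaces a fixed time.Sleep.
func waitForScheduledRuns(t *testing.T, recordCount func() int, want int) {
	t.Helper()
	require.Eventually(t, func() bool {
		return recordCount() >= want
	}, 5*time.Second, 10*time.Millisecond, "expected at least %d scheduled runs", want)
}
```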

timeout_internal_test.go (1)

27-31: LGTM!

The runPool initialisation correctly mirrors the production pool setup, enabling the test to exercise the pooled Run lifecycle.

run.go (3)

54-86: LGTM on the collector/releaser pattern and buildRun refactoring.

The abstraction cleanly separates pool acquisition from Run initialisation, and buildRun correctly populates the pooled Run's fields before returning.


88-93: LGTM!

The newRunObj method correctly wraps the pool's Get call with the appropriate type assertion.


95-103: No changes needed. The Record.Meta field contains only value types (string and uint), so no references persist between pool reuses. The existing comment is accurate.

testing.go (1)

153-159: LGTM!

The pool integration is correctly implemented: buildRun obtains a Run from the pool, the error is checked before the defer is registered, and releaseRun ensures the Run is returned after fn(run) completes.

workflow.go (1)

95-96: The pool is already properly initialised with a New function.

In builder.go lines 42–46, the runPool is initialised within NewBuilder with a New function that creates &Run[Type, Status]{} instances. This ensures sync.Pool.Get() will never return nil, preventing any panic on the type assertion in newRunObj().

callback.go (1)

66-72: LGTM! Run pooling integration is correctly implemented.

The newRunObj() retrieval and deferred releaseRun(run) ensure proper lifecycle management. The defer placement after the error check on lines 67-69 is correct: if buildRun fails, there is no Run to release.
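
The call-site pattern generalises; building on the earlier sketch, with buildRun and handle passed in as stand-ins rather than the PR's exact signatures:

```go
type Record struct{ ID string } // hypothetical minimal record

// process registers the defer only after buildRun succeeds, so an error
// path never releases a Run that was never collected from the pool.
func process[Type any, Status any](
	w *Workflow[Type, Status],
	record *Record,
	buildRun func(collect func() *Run[Type, Status], rec *Record) (*Run[Type, Status], error),
	handle func(*Run[Type, Status]) error,
) error {
	run, err := buildRun(w.newRunObj(), record)
	if err != nil {
		return err // decode failed: nothing was collected, nothing to release
	}
	defer w.releaseRun(run) // covers success, later errors, and panics in handle

	return handle(run)
}
```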

builder.go (1)

42-46: LGTM! Pool initialisation is correctly configured.

The sync.Pool is properly initialised with a New function that returns a fresh &Run[Type, Status]{}. This ensures the pool can create new instances when empty.

callback_internal_test.go (1)

25-29: LGTM! Test setup correctly mirrors production pool configuration.

The test properly initialises the runPool field to match the production configuration in builder.go, ensuring the pooling behaviour is exercised during tests.

workflow_test.go (1)

238-244: LGTM! Status indexing logic is consistent.

The loop now creates steps from status(1) through status(numberOfSteps+1), and the final assertion on line 271 correctly expects status(numberOfSteps+1). The 1-based indexing is internally consistent.

timeout.go (2)

121-127: LGTM! Run pooling correctly integrated in timeout processing.

The newRunObj() retrieval and deferred releaseRun(run) follow the same correct pattern established in callback.go. The defer is placed after the error check, ensuring no premature release attempts.


293-294: The original concern is unfounded. w.newRunObj() returns a runCollector[Type, Status] function (defined in run.go:89-92), not a *Run instance. Both call sites correctly pass this function: line 121 passes it to buildRun, and lines 293-294 pass it to stepConsumer. The buildRun function receives the collector and invokes it at line 74 to retrieve a Run from the pool. Types align correctly across both usages.

step_internal_test.go (1)

93-94: Test parameter updates look correct.

The use of inline factory and no-op releaser functions for tests is appropriate. This isolates test behaviour from the actual pooling mechanism whilst maintaining the correct function signature contract.

Also applies to: 146-147, 197-198, 250-251

step.go (3)

94-95: Wiring of pool collector and releaser looks correct.

The integration passes w.newRunObj() as the factory function and w.releaseRun as the cleanup callback, properly connecting the workflow's pool to the step consumer.


116-117: Function signature update aligns with the pooling pattern.

The addition of runCollector and runReleaser parameters follows a clean collector/releaser pattern that enables dependency injection for both production pooling and test isolation.


167-173: Correct placement of defer for pool release with proper cleanup.

The defer is correctly placed after buildRun succeeds (line 173), ensuring the run is only returned to the pool when one was actually obtained. This handles all exit paths (success, error, and panic). The releaseRun function properly clears critical references (run.Object and run.controller) before returning the object to the pool, preventing data leakage between workflow runs.


coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
run.go (2)

19-20: Wire reset into releaseRun and fully clear pooled state

reset is currently empty and unused, while releaseRun manually clears only Object and controller. This both triggers the static analysis warning and means any references held in Record stay reachable while the Run sits in the pool.

I’d suggest centralising clean‑up in reset and calling it from releaseRun, e.g.:

-func (r *Run[Type, Status]) reset() {
-}
+func (r *Run[Type, Status]) reset() {
+	// Prepare the Run for reuse in the pool.
+	r.TypedRecord = TypedRecord[Type, Status]{}
+	r.controller = nil
+}

 // releaseRun returns a Run object back to the workflow's pool for reuse
 func (w *Workflow[Type, Status]) releaseRun(run *Run[Type, Status]) {
-	// Clear references to prevent memory leaks
-	run.Object = nil
-	run.controller = nil
-	// Note: We don't clear the Record as it's a value type
-
-	w.runPool.Put(run)
+	run.reset()
+	w.runPool.Put(run)
 }

This keeps the reset logic in one place, avoids retaining old Record contents unnecessarily, and resolves the SonarCloud complaint.

Also applies to: 113-119


60-74: newRunPool implementation is sound; pre‑allocation is optional

The generic newRunPool setup is correct and safe for concurrent use. If benchmarks do not show a clear win from warming the pool with 10 items, you could simplify by dropping the pre‑allocation loop and relying solely on the New function, but that’s an optional micro‑optimisation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1c24fdd and c9b5a8f.

📒 Files selected for processing (6)
  • builder.go (1 hunks)
  • callback_internal_test.go (1 hunks)
  • run.go (4 hunks)
  • timeout_internal_test.go (1 hunks)
  • workflow.go (1 hunks)
  • workflow_test.go (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • workflow_test.go
  • callback_internal_test.go
  • builder.go
  • timeout_internal_test.go
  • workflow.go
🧰 Additional context used
🧬 Code graph analysis (1)
run.go (3)
status.go (1)
  • StatusType (3-7)
record.go (2)
  • Record (7-45)
  • TypedRecord (80-84)
runstate.go (1)
  • NewRunStateController (90-95)
🪛 GitHub Check: SonarCloud Code Analysis
run.go

[failure] 19-20: Add a nested comment explaining why this function is empty or complete the implementation.

See more on https://sonarcloud.io/project/issues?id=luno_workflow&issues=AZsPWOuZ0z0nsieMzw38&open=AZsPWOuZ0z0nsieMzw38&pullRequest=163

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: core (1)
🔇 Additional comments (4)
run.go (4)

3-6: sync import is appropriate

Importing sync here is appropriate for the new sync.Pool‑based run pooling below. Nothing to change.


55-58: Collector/releaser function types read cleanly

The runCollector / runReleaser generics give clear names to the pooling callbacks and keep call‑sites tidy. No issues here.


76-103: buildRun correctly reinitialises pooled Run instances

Using the injected collector to fetch a Run and then fully assigning TypedRecord and controller ensures pooled instances are correctly reinitialised and no stale state is carried over. The logic around the RunStateInitiated → RunStateRunning transition is preserved. This looks correct.


105-110: newRunObj is a thin, suitable wrapper over the pool

Returning a closure that does w.runPool.Get().(*Run[Type, Status]) keeps the pooling concern encapsulated in Workflow. Assuming runPool is always initialised when a Workflow is constructed, this is fine as‑is.


coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
run.go (2)

52-71: Pooling helpers look good; consider making preallocation more explicit

The runCollector / runReleaser aliases and newRunPool are clear and keep pooling concerns separate from buildRun. The preallocation of 10 Run instances is a reasonable starting point, but it is effectively a magic number.

If you expect very different concurrency levels across deployments, consider either documenting why “10” is a good default, or making it configurable at Workflow construction time. Otherwise, this is fine as is.
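
A sketch of the named-constant option; the constant name and value here are illustrative, not the PR's code:

```go
// runPoolPrewarmSize documents why the pool is seeded and makes the
// value trivial to tune (or zero out if benchmarks show no benefit).
const runPoolPrewarmSize = 10

func newRunPoolPrewarmed[Type any, Status any]() *sync.Pool {
	pool := &sync.Pool{
		New: func() any { return &Run[Type, Status]{} },
	}
	for i := 0; i < runPoolPrewarmSize; i++ {
		pool.Put(&Run[Type, Status]{}) // seed zeroed instances for startup bursts
	}
	return pool
}
```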


73-100: buildRun pooling integration preserves behaviour; maybe add a defensive check

The change to obtain a Run via the injected collector and then reinitialise TypedRecord and controller looks behaviourally equivalent to the previous direct allocation, while enabling reuse. The ordering (unmarshal, run‑state transition, then allocation/initialisation) is also sensible and avoids touching the pool when unmarshalling fails.

Given buildRun now depends on the caller providing a non‑nil collector that returns a correctly typed *Run[Type, Status], you may want to add a cheap defensive check (e.g. panic or error) if collector is nil or returns nil, to fail fast in case of misuse rather than later via a nil‑pointer dereference.
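
A sketch of such a guard, written as a hypothetical helper that the real buildRun could inline:

```go
// collectRun fails fast on misuse: a nil collector or a nil Run panics
// with a clear message instead of surfacing later as a nil dereference.
func collectRun[Type any, Status any](collector func() *Run[Type, Status]) *Run[Type, Status] {
	if collector == nil {
		panic("workflow: buildRun called with a nil run collector")
	}
	run := collector()
	if run == nil {
		panic("workflow: run collector returned a nil *Run")
	}
	return run
}
```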

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c9b5a8f and 1702164.

📒 Files selected for processing (1)
  • run.go (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
run.go (4)
status.go (1)
  • StatusType (3-7)
record.go (2)
  • Record (7-45)
  • TypedRecord (80-84)
runstate.go (1)
  • NewRunStateController (90-95)
workflow.go (1)
  • Workflow (59-105)
🔇 Additional comments (1)
run.go (1)

3-6: Import of sync is appropriate

Using sync here is appropriate for the new sync.Pool‑based run pooling; no issues from an import/use perspective.


coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
run.go (2)

57-71: newRunPool implementation looks correct; pre‑warm size is a tunable constant

The pool setup is sound: New always returns *Run[Type, Status], and the pre‑warm loop simply seeds 10 zeroed instances, which buildRun then fully reinitialises. The only thing to consider is whether 10 should be a small named constant so it’s easy to tune (or even set to 0 if benchmarks ever show pre‑warming doesn’t help on some workloads).

Please confirm your benchmarks still look good if you vary the pre‑warm size; if needed, we can parameterise this via a const or helper.


102-107: newRunObj is straightforward; consider guarding against a nil pool

The closure directly wraps w.runPool.Get() and asserts to *Run[Type, Status], which is fine given the way newRunPool is constructed. As a small safety improvement, you could either lazily initialise w.runPool here if it’s nil or add a comment/defensive check to make it clear that all Workflow instances must be constructed via the builder that sets runPool.

If you expect any callers to instantiate Workflow directly in this repo, it’s worth verifying they all initialise runPool before newRunObj is used.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1702164 and 588ade2.

📒 Files selected for processing (1)
  • run.go (3 hunks)
🔇 Additional comments (4)
run.go (4)

3-6: Import of sync for pooling is appropriate

Using sync here is expected for the new sync.Pool-based Run pooling; nothing to change.


52-55: Generic collector / releaser aliases are clear and type‑safe

The runCollector / runReleaser type aliases read well and make the pooling contract explicit at call sites, without leaking sync.Pool details. No changes needed.


73-100: Pooling integration in buildRun is correct and avoids leaks

Using the collector only after Unmarshal succeeds ensures pooled Run instances aren’t leaked on decode errors. Reassigning a fresh TypedRecord (including Record, Status, and Object) plus a new RunStateController on every call correctly wipes any previous state from the pooled object. This matches the previous non‑pooled semantics and should be safe across reuse.


109-113: releaseRun now correctly drops all embedded state before pooling

Zeroing run.TypedRecord and clearing run.controller before returning the object to runPool ensures there are no lingering references, including to Record.Object byte slices, addressing the earlier retention concern noted in past review. The pooling lifecycle (buildRun fully reinitialises; releaseRun fully clears) looks consistent.

andrewwormald merged commit 91f313c into main on Dec 11, 2025
10 of 11 checks passed