-
-
Notifications
You must be signed in to change notification settings - Fork 740
v4: Experimental process keep alive #2183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: 7c70800 The changes in this PR will be included in the next version bump. Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
WalkthroughThis update introduces a comprehensive process keep-alive and process pooling mechanism for task execution in both development and managed environments. New classes ( 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (25)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 19
🔭 Outside diff range comments (1)
apps/webapp/app/v3/otlpExporter.server.ts (1)
289-304
: Span-side extraction duplicates log logicSame comment as above – consider centralising the merge logic to avoid two divergent copies.
♻️ Duplicate comments (1)
packages/cli-v3/src/entryPoints/dev-run-worker.ts (1)
468-470
: Same Node-version concern as managed worker
AbortSignal.any()
requires Node 20. Add the same fallback / poly-fill strategy here to stay compatible with common Node 18 dev environments.
🧹 Nitpick comments (22)
packages/core/src/v3/locals/manager.ts (1)
37-39
: Minor: consider idempotent reset in multi-instance scenarios.
reset()
blindly clears the sharedstore
. If multiple logical subsystems share the sameStandardLocalsManager
instance, one caller could erase locals needed by another.If cross-component sharing is expected, expose a new isolated instance instead of clearing the global one, or guard the call behind ownership checks.
packages/core/src/v3/waitUntil/manager.ts (2)
6-8
: Reset logic is sound, but duplicated withblockUntilSettled()
finaliser.
reset()
clearsmaybeDeferredPromises
, yetblockUntilSettled()
also does so on line 28.
Not harmful, but the double-clear is redundant.
Optionally callreset()
from withinblockUntilSettled()
to centralise the logic.
15-21
: Typo in property name reduces readability.
promisesRequringResolving
is missing an “i”.
While internal, consistent naming helps maintenance.- private get promisesRequringResolving(): MaybeDeferredPromise[] { + private get promisesRequiringResolving(): MaybeDeferredPromise[] {.changeset/polite-badgers-suffer.md (1)
5-5
: Nit: capitalise feature description for consistency.-experimental processKeepAlive +Experimental: processKeepAlivepackages/core/src/v3/usage/devUsageManager.ts (1)
53-57
: Stop or finalize in-flight measurements before wiping maps
reset()
clears_currentMeasurements
, but anyDevUsageMeasurement
already handed out keeps running and will never be stopped or sampled.
Consider finishing them (e.g.measurement.stop()
) or at least logging dropped measurements before the map is emptied. This prevents silent loss of usage data and avoids dangling references.reset(): void { - this._firstMeasurement = undefined; - this._currentMeasurements.clear(); - this._pauses.clear(); + // Finalise any active measurements to avoid leaking state + for (const m of this._currentMeasurements.values()) { + m.stop(); + } + + this._firstMeasurement = undefined; + this._currentMeasurements.clear(); + this._pauses.clear(); }packages/core/src/v3/runtime/sharedRuntimeManager.ts (1)
44-47
: Consider cancelling the 5-minute debug interval on resetWhile
reset()
clears resolver maps, thesetInterval
created in the constructor keeps running forever.
If the same manager instance is reused across executions, each call toreset()
adds no new interval but also never removes the old one – acceptable, yet you may want to clear it to minimise noise and avoid multiple loggers if the class is ever reinstantiated.Minor, but worth noting.
packages/cli-v3/src/entryPoints/managed-index-worker.ts (1)
162-167
: Factor out keep-alive config mapping to a shared helperIdentical ternary logic is popping up in multiple entry points. Centralising the conversion to a small helper (e.g.
toProcessKeepAliveConfig(config.experimental_processKeepAlive)
) removes duplication and guarantees consistent behaviour if the rules ever change.packages/core/src/v3/otel/tracingSDK.ts (1)
60-63
: Getter is fine – consider thread-safety annotation
isResolved
is a harmless addition and useful for the call-sites.
If you anticipate concurrent access from multiple worker threads, mark the fieldprivate _resolved: volatile boolean
(or document the race expectations) – currently it’s fine in single-threaded Node.packages/core/src/v3/timeout/api.ts (1)
45-48
: Consider flipping the reset / disable order
reset()
first calls#getManager().reset()
before unregistering the global manager.
Ifreset()
implementations inspect global state (e.g. call back intoTimeoutAPI.getInstance()
) they will still see themselves registered, which is good, but conversely any code running concurrently afterreset()
but beforeunregisterGlobal()
still sees a “live” manager that has just been reset – potentially surprising.- this.#getManager().reset(); - this.disable(); + const manager = this.#getManager(); + this.disable(); // make future look-ups return NOOP immediately + manager.reset(); // finally reset the previous instanceThis keeps externally-visible state consistent while still allowing the underlying manager to clean itself up.
Feel free to ignore if your implementation relies on the current order.packages/cli-v3/src/executions/taskRunProcess.ts (1)
296-298
: Minor nit – markisExecuting()
asreadonly
getterA getter is more idiomatic and prevents accidental reassignment.
- isExecuting() { - return this._currentExecution !== undefined; - } + get isExecuting() { + return this._currentExecution !== undefined; + }packages/core/src/v3/config.ts (1)
237-261
:enabled
property seems redundant when the parent key is already optionalBecause
experimental_processKeepAlive
can itself befalse
, the nested{ enabled }
flag duplicates information:experimental_processKeepAlive?: boolean | { enabled: boolean; // duplicated … }Simplify the shape by dropping
enabled
and relying on:experimental_processKeepAlive?: { maxExecutionsPerProcess?: number; devMaxPoolSize?: number; } | false;This removes one boolean and a layer of cognitive overhead for users.
packages/core/src/v3/workers/populateEnv.ts (1)
34-35
: Minor:previousEnv
belongs in destructuring but isn’t documented in JSDoc.Consider adding it to the function comment for clarity.
packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts (1)
41-46
: Duplicate metric risk.
#seedMetricsFromEnvironment()
runs every call toregisterMetricsFromExecution
; with a long-lived process this seeds the samefork
metric repeatedly.
Guard with a boolean or clear inreset()
.packages/cli-v3/src/dev/devSupervisor.ts (1)
100-133
: Boolean plumbing is overly repetitive.
enableProcessReuse
is computed on 103-107, but the ternary logic is duplicated when constructing the pool (117-122). Reuse the variable to avoid divergence.packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)
777-785
: Fallback ifreturnProcess
fails.When returning the process in
runFinished
fails, the process handle is lost but still running. Callkill()
after logging the failure to avoid leaks.packages/cli-v3/src/entryPoints/managed/controller.ts (1)
41-43
: Duplicate state – local vars already exist in manifest.
processKeepAliveEnabled/MaxExecutionCount
duplicateworkerManifest.processKeepAlive
. You can read straight from the manifest where needed, reducing state drift.packages/cli-v3/src/entryPoints/managed/execution.ts (1)
612-616
: Leaked reference to completed processAfter a successful
returnProcess
,this.taskRunProcess
still holds the (possibly pooled) process instance.
On the next attemptgetProcess()
will create / supply another process but the old reference remains, preventing GC and complicating reasoning.
Set it toundefined
once the provider takes ownership back.await this.taskRunProcessProvider.returnProcess(this.taskRunProcess); -// ... +this.taskRunProcess = undefined;packages/cli-v3/src/dev/taskRunProcessPool.ts (1)
130-134
: Guardpid
before using as map key
process.pid
is assumed truthy. If spawning failed or the child died early,pid
can beundefined
, leading to the magic key0
and count collisions.
Skip bookkeeping whenpid
is falsy.apps/webapp/app/v3/otlpExporter.server.ts (4)
186-188
: Good abstraction, but consider memoising per‐resource
resourceProperties
is recomputed for every resource group, which is fine, but if you later loop over multiple scopes of the same resource you could avoid repeating the extraction by caching the result in a map keyed by the resource pointer or hash.
191-213
: Reuse the computedlogLevel
You already compute
const logLevel = logLevelToEventLevel(log.severityNumber);
but later call the helper again for thelevel
property.
Use the cached variable to eliminate an unnecessary function call.- level: logLevelToEventLevel(log.severityNumber), + level: logLevel,
418-494
: Type-safety ofextractEventProperties
- The direct cast
as CreatableEventEnvironmentType
is unsafe (value may be undefined or an arbitrary string).
Provide a whitelist check or default:- environmentType: extractStringAttribute(... ) as CreatableEventEnvironmentType, + environmentType: + (extractStringAttribute(...) as CreatableEventEnvironmentType) ?? "DEVELOPMENT",
- Repeating
extractStringAttribute(attributes, [prefix, X])
can be shortened by a local helper:const k = (k: string) => extractStringAttribute(attributes, [prefix, k]);which will further simplify maintenance.
716-807
: Overload bloat can be trimmedThe four helper extractors (
extractStringAttribute
, etc.) repeat two overload signatures each. TS supports an optional parameter, so one overload is enough:function extractStringAttribute( attrs: KeyValue[], name: string | (string | undefined)[], fallback?: string, ): string | undefined { ... }Less code, same type safety.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
references/hello-world/src/trigger/example.ts
is excluded by!references/**
references/hello-world/src/trigger/schedule.ts
is excluded by!references/**
references/hello-world/trigger.config.ts
is excluded by!references/**
📒 Files selected for processing (36)
.changeset/polite-badgers-suffer.md
(1 hunks)apps/supervisor/src/util.ts
(1 hunks)apps/supervisor/src/workloadManager/docker.ts
(2 hunks)apps/webapp/app/v3/otlpExporter.server.ts
(8 hunks)packages/cli-v3/src/dev/devSupervisor.ts
(6 hunks)packages/cli-v3/src/dev/taskRunProcessPool.ts
(1 hunks)packages/cli-v3/src/entryPoints/dev-run-controller.ts
(4 hunks)packages/cli-v3/src/entryPoints/dev-run-worker.ts
(13 hunks)packages/cli-v3/src/entryPoints/managed-index-worker.ts
(1 hunks)packages/cli-v3/src/entryPoints/managed-run-worker.ts
(12 hunks)packages/cli-v3/src/entryPoints/managed/controller.ts
(9 hunks)packages/cli-v3/src/entryPoints/managed/execution.ts
(13 hunks)packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts
(1 hunks)packages/cli-v3/src/executions/taskRunProcess.ts
(4 hunks)packages/core/src/v3/config.ts
(1 hunks)packages/core/src/v3/lifecycleHooks/manager.ts
(1 hunks)packages/core/src/v3/locals/manager.ts
(1 hunks)packages/core/src/v3/otel/tracingSDK.ts
(2 hunks)packages/core/src/v3/runMetadata/manager.ts
(1 hunks)packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts
(2 hunks)packages/core/src/v3/runtime/sharedRuntimeManager.ts
(1 hunks)packages/core/src/v3/schemas/build.ts
(1 hunks)packages/core/src/v3/schemas/schemas.ts
(1 hunks)packages/core/src/v3/taskContext/index.ts
(1 hunks)packages/core/src/v3/taskContext/otelProcessors.ts
(3 hunks)packages/core/src/v3/timeout/api.ts
(2 hunks)packages/core/src/v3/timeout/types.ts
(1 hunks)packages/core/src/v3/timeout/usageTimeoutManager.ts
(1 hunks)packages/core/src/v3/usage/api.ts
(1 hunks)packages/core/src/v3/usage/devUsageManager.ts
(1 hunks)packages/core/src/v3/usage/noopUsageManager.ts
(1 hunks)packages/core/src/v3/usage/prodUsageManager.ts
(1 hunks)packages/core/src/v3/usage/types.ts
(1 hunks)packages/core/src/v3/waitUntil/manager.ts
(1 hunks)packages/core/src/v3/workers/populateEnv.ts
(3 hunks)packages/core/src/v3/workers/taskExecutor.ts
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (24)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
- GitHub Check: typecheck / typecheck
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (13)
packages/core/src/v3/taskContext/index.ts (1)
38-48
:attributes
getter no longer includesresourceAttributes
– verify downstream expectations.Prior versions merged all context + resource keys into
.attributes
.
After introducingresourceAttributes()
, callers that still consume.attributes
will silently lose environment / organisation / machine data.Confirm that:
- All OTEL spans now call
taskContext.resourceAttributes
explicitly, or.attributes
is intentionally slim and documented as such.If not deliberate, append
...this.resourceAttributes
to the spread list.packages/core/src/v3/usage/noopUsageManager.ts (1)
30-32
: LGTM – maintains interface parity.packages/core/src/v3/usage/types.ts (1)
17-18
: Interface change is breaking – ensure all implementations compileAdding
reset(): void
toUsageManager
is a breaking API change. Double-check that every custom/third-party implementation used inside the repo (and in downstream extensions) now implements this method to avoid compile/runtime errors.
A quickrg "implements UsageManager"
can confirm.packages/core/src/v3/timeout/types.ts (1)
4-4
: Sync allTimeoutManager
implementations with newreset
contract
reset
is now mandatory. VerifyNoopTimeoutManager
, any test fakes, and external plugins compile and provide a no-op implementation.packages/core/src/v3/usage/api.ts (1)
51-54
: Looks good – reset path correctly cleans up usage managerThe two-step
reset()
(manager reset ➔ disable global) is sound and avoids losing the reference before cleanup.apps/supervisor/src/workloadManager/docker.ts (1)
80-82
: Good call – host-aware warm-start URLUsing
normalizeDockerHostUrl
forTRIGGER_WARM_START_URL
prevents “localhost“ pitfalls on macOS/Windows Docker.packages/core/src/v3/schemas/schemas.ts (1)
165-167
: Verify ramifications of raisingconcurrencyLimit
hard-cap 100×Jumping from 1 000 to 100 000 materially alters worst-case resource usage.
Please double-check:
- Any counters, DB fields, or metrics that assume the old 4-digit bound.
- Overflow / memory concerns when pre-allocating arrays or semaphore tokens.
- Operational safeguards (e.g. Kubernetes HPA, queue depth safeguards) that could now be bypassed.
If downstream code already treats the value as “untrusted” and clamps to environment limits, great – otherwise consider introducing a softer default (e.g. keep
max(…)
at 100 000 but applymin(requested, envLimit)
at runtime).packages/core/src/v3/otel/tracingSDK.ts (1)
130-132
: 👍 Adds useful SDK metadata to the resourceEmbedding
SDK_VERSION
andSDK_LANGUAGE
helps trace attribution and doesn’t break prior contracts. Looks good.packages/core/src/v3/taskContext/otelProcessors.ts (1)
22-23
: 👍 Cleaner attribute propagationSwitching to
taskContext.attributes
removes duplication and keeps the attribute contract centralised – looks good.Also applies to: 72-73, 98-99
packages/cli-v3/src/executions/taskRunProcess.ts (1)
312-313
: 👍 Helpful debug contextIncluding the PID in exit logs will make post-mortem debugging far easier.
packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts (1)
36-38
:reset()
not invoked anywhere yet.If the intent is to clear metrics between pooled executions, ensure callers (e.g. worker warm-start path) actually call
reset()
.packages/cli-v3/src/dev/devSupervisor.ts (1)
336-343
: Guard may hide bugs when pool creation fails.The pool is always instantiated; reaching this branch means init failed silently. Prefer throwing during
init()
rather than silently skipping runs.packages/cli-v3/src/entryPoints/managed/execution.ts (1)
131-144
:prepareForExecution
becameasync
– ensure callers awaitThe method now returns a
Promise
, but its signature didn't change in callers.
Search & update all invocations (await runExecution.prepareForExecution(...)
) to avoid executing with an un-initialised process.
packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
packages/cli-v3/src/entryPoints/managed/execution.ts (1)
131-144
: Warm-start process is created without resetting its event listeners
prepareForExecution()
fetches a (possibly reused)TaskRunProcess
, but does not callattachTaskRunProcessHandlers()
.
UntilexecuteRun()
runs, the process keeps the handlers from the previous attempt, so any heartbeat/debug-log emitted during that window will be routed to the stale callbacks and reference the wrong run ID.Attach handlers immediately after acquiring the process (or detach them here) to guarantee a clean slate.
🧹 Nitpick comments (1)
packages/cli-v3/src/entryPoints/managed/execution.ts (1)
616-623
: Clearthis.taskRunProcess
after returning it to the providerAfter a successful return the reference is stale and its handlers are detached.
Nulling it out prevents accidental reuse and makes future state checks (if (this.taskRunProcess)
) more accurate.- this.taskRunProcessProvider.returnProcess(this.taskRunProcess) + this.taskRunProcessProvider.returnProcess(this.taskRunProcess) + .then(() => { + this.taskRunProcess = undefined; + })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
apps/supervisor/src/util.ts
(1 hunks)packages/cli-v3/src/dev/devSupervisor.ts
(6 hunks)packages/cli-v3/src/entryPoints/managed/controller.ts
(9 hunks)packages/cli-v3/src/entryPoints/managed/execution.ts
(16 hunks)packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts
(1 hunks)packages/cli-v3/src/executions/taskRunProcess.ts
(4 hunks)packages/core/src/v3/lifecycleHooks/manager.ts
(1 hunks)packages/core/src/v3/runMetadata/manager.ts
(1 hunks)packages/core/src/v3/schemas/build.ts
(1 hunks)packages/core/src/v3/timeout/usageTimeoutManager.ts
(1 hunks)packages/core/src/v3/usage/prodUsageManager.ts
(2 hunks)packages/core/src/v3/workers/populateEnv.ts
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
- packages/core/src/v3/timeout/usageTimeoutManager.ts
- packages/core/src/v3/schemas/build.ts
- packages/core/src/v3/lifecycleHooks/manager.ts
- packages/core/src/v3/runMetadata/manager.ts
- apps/supervisor/src/util.ts
- packages/core/src/v3/workers/populateEnv.ts
- packages/core/src/v3/usage/prodUsageManager.ts
- packages/cli-v3/src/executions/taskRunProcess.ts
- packages/cli-v3/src/dev/devSupervisor.ts
- packages/cli-v3/src/entryPoints/managed/controller.ts
- packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
packages/cli-v3/src/entryPoints/managed/execution.ts (2)
118-121
: Handle abort failures to avoid unhandled-promise exits
handleProcessAbort()
is awaited, but any rejection will bubble up and makekill()
itself throw, potentially bringing the whole supervisor down during a shutdown path.- if (this.taskRunProcess) { - await this.taskRunProcessProvider.handleProcessAbort(this.taskRunProcess); - } + if (this.taskRunProcess) { + const [err] = await tryCatch( + this.taskRunProcessProvider.handleProcessAbort(this.taskRunProcess) + ); + if (err) { + this.sendDebugLog("kill: failed to abort process", { error: err.message }); + } + }
185-196
:canExecute
may over-report readinessReturning
true
solely becausehasPersistentProcess
istrue
ignores whether the process has been prepared with the correct env for the next run.
If a pooled process is still configured for another project / task, the controller could dispatch a run that will inevitably fail later.Recommend additionally checking that
currentTaskRunEnv
is set (or that the provider can confirm the env matches the incoming run) before returningtrue
.
650090d
to
247ca37
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (5)
apps/webapp/app/v3/otlpExporter.server.ts (2)
239-268
: Code duplication issue persists - consider refactoring.The explicit field-by-field property merging pattern is duplicated between logs and spans conversion. This creates maintenance overhead and potential for inconsistencies.
358-386
: Type casting issue remains unresolved.The
environmentType
property casting toCreatableEventEnvironmentType
without validation can hide undefined values, as noted in previous reviews.packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (2)
211-217
: Makecleanup()
awaitable for deterministic shutdownThe
cleanup()
method callsthis.cleanupProcess()
which returns a promise, but doesn't await or return it. This creates a race condition where callers may assume cleanup is complete when the process might still be terminating.- async cleanup() { + async cleanup(): Promise<void> { if (this.persistentProcess) { this.sendDebugLog("cleanup() called"); - await this.cleanupProcess(this.persistentProcess); + await this.cleanupProcess(this.persistentProcess); } }
259-259
: Guard against NaN in HEARTBEAT_INTERVAL_MS environment variableThe multiplication by 1000 can produce
NaN
ifTRIGGER_HEARTBEAT_INTERVAL_SECONDS
is undefined, which will be passed as a string to the child process.- HEARTBEAT_INTERVAL_MS: String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000), + ...(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS + ? { HEARTBEAT_INTERVAL_MS: String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000) } + : {}),packages/cli-v3/src/entryPoints/managed-run-worker.ts (1)
506-506
: AbortSignal.any() compatibility issue with Node.js ≤ 18
AbortSignal.any()
is only available in Node.js 20+. This will cause runtime crashes on older Node versions commonly used in serverless environments.- const signal = AbortSignal.any([_cancelController.signal, timeoutController.signal]); + // Feature detection for AbortSignal.any() compatibility + const signal = typeof AbortSignal.any === 'function' + ? AbortSignal.any([_cancelController.signal, timeoutController.signal]) + : createManualAbortSignal([_cancelController.signal, timeoutController.signal]);Add a fallback implementation:
function createManualAbortSignal(signals: AbortSignal[]): AbortSignal { const controller = new AbortController(); for (const signal of signals) { if (signal.aborted) { controller.abort(signal.reason); break; } signal.addEventListener('abort', () => { controller.abort(signal.reason); }, { once: true }); } return controller.signal; }
🧹 Nitpick comments (4)
packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (3)
42-56
: Consider simplifying immediate retry logicThe
handleImmediateRetry()
method only cleans up processes when keep-alive is disabled, but the method name suggests it should handle immediate retries regardless of keep-alive settings. This could be confusing.Consider either:
- Renaming the method to better reflect its actual behavior
- Or handling immediate retry scenarios for both keep-alive enabled and disabled cases
- async handleImmediateRetry(): Promise<void> { + async handleImmediateRetryWhenKeepAliveDisabled(): Promise<void> { if (!this.processKeepAliveEnabled) {
263-276
: Enhance process health checks for robustnessThe current health check only verifies the process isn't being killed and has a PID. Consider adding more comprehensive health checks for better reliability.
private isProcessHealthy(process: TaskRunProcess): boolean { - // Basic health check - TaskRunProcess will handle more detailed internal health checks - return !process.isBeingKilled && process.pid !== undefined; + // Enhanced health check including process responsiveness + return !process.isBeingKilled && + process.pid !== undefined && + process.isExecuting && + process.isPreparedForNextAttempt; }
146-190
: Improve process suspension logic clarityThe
suspendProcess
method has complex conditional logic that could be simplified for better readability and maintainability.Consider extracting the process matching logic into a helper method:
+ private async suspendMatchingProcess(process: TaskRunProcess, flush: boolean): Promise<void> { + if (this.persistentProcess?.pid === process.pid) { + this.persistentProcess = null; + this.executionCount = 0; + } + await process.suspend({ flush }); + } async suspendProcess(flush: boolean, process?: TaskRunProcess): Promise<void> { if (process) { - // Complex conditional logic here + await this.suspendMatchingProcess(process, flush); } else if (this.persistentProcess) { // Handle persistent process suspension } }packages/cli-v3/src/entryPoints/managed-run-worker.ts (1)
454-454
: Process title update may be unnecessary in process reuse scenariosSetting the process title for each task execution might not be optimal when processes are being reused across different tasks, as it could be misleading or cause unnecessary overhead.
Consider whether process title updates are needed in keep-alive scenarios:
- process.title = `trigger-dev-worker: ${execution.task.id} ${execution.run.id}`; + // Process title updates may not be necessary with process reuse + // process.title = `trigger-dev-worker: ${execution.task.id} ${execution.run.id}`;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
references/hello-world/src/trigger/example.ts
is excluded by!references/**
references/hello-world/src/trigger/schedule.ts
is excluded by!references/**
references/hello-world/trigger.config.ts
is excluded by!references/**
📒 Files selected for processing (37)
.changeset/polite-badgers-suffer.md
(1 hunks)apps/supervisor/src/util.ts
(1 hunks)apps/supervisor/src/workloadManager/docker.ts
(2 hunks)apps/webapp/app/v3/otlpExporter.server.ts
(8 hunks)packages/cli-v3/src/dev/devSupervisor.ts
(6 hunks)packages/cli-v3/src/dev/taskRunProcessPool.ts
(1 hunks)packages/cli-v3/src/entryPoints/dev-run-controller.ts
(4 hunks)packages/cli-v3/src/entryPoints/dev-run-worker.ts
(14 hunks)packages/cli-v3/src/entryPoints/managed-index-worker.ts
(1 hunks)packages/cli-v3/src/entryPoints/managed-run-worker.ts
(14 hunks)packages/cli-v3/src/entryPoints/managed/controller.ts
(9 hunks)packages/cli-v3/src/entryPoints/managed/execution.ts
(21 hunks)packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts
(1 hunks)packages/cli-v3/src/executions/taskRunProcess.ts
(4 hunks)packages/core/src/v3/config.ts
(1 hunks)packages/core/src/v3/lifecycleHooks/manager.ts
(1 hunks)packages/core/src/v3/locals/manager.ts
(1 hunks)packages/core/src/v3/otel/tracingSDK.ts
(2 hunks)packages/core/src/v3/runMetadata/manager.ts
(1 hunks)packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts
(2 hunks)packages/core/src/v3/runtime/sharedRuntimeManager.ts
(1 hunks)packages/core/src/v3/schemas/build.ts
(1 hunks)packages/core/src/v3/schemas/schemas.ts
(1 hunks)packages/core/src/v3/semanticInternalAttributes.ts
(1 hunks)packages/core/src/v3/taskContext/index.ts
(1 hunks)packages/core/src/v3/taskContext/otelProcessors.ts
(3 hunks)packages/core/src/v3/timeout/api.ts
(2 hunks)packages/core/src/v3/timeout/types.ts
(1 hunks)packages/core/src/v3/timeout/usageTimeoutManager.ts
(1 hunks)packages/core/src/v3/usage/api.ts
(1 hunks)packages/core/src/v3/usage/devUsageManager.ts
(1 hunks)packages/core/src/v3/usage/noopUsageManager.ts
(1 hunks)packages/core/src/v3/usage/prodUsageManager.ts
(2 hunks)packages/core/src/v3/usage/types.ts
(1 hunks)packages/core/src/v3/waitUntil/manager.ts
(1 hunks)packages/core/src/v3/workers/populateEnv.ts
(3 hunks)packages/core/src/v3/workers/taskExecutor.ts
(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (33)
- packages/core/src/v3/schemas/build.ts
- packages/core/src/v3/usage/types.ts
- packages/core/src/v3/semanticInternalAttributes.ts
- packages/core/src/v3/runtime/sharedRuntimeManager.ts
- packages/core/src/v3/otel/tracingSDK.ts
- packages/core/src/v3/locals/manager.ts
- apps/supervisor/src/workloadManager/docker.ts
- packages/core/src/v3/timeout/usageTimeoutManager.ts
- packages/core/src/v3/waitUntil/manager.ts
- .changeset/polite-badgers-suffer.md
- packages/core/src/v3/timeout/types.ts
- packages/core/src/v3/usage/noopUsageManager.ts
- packages/core/src/v3/usage/devUsageManager.ts
- packages/core/src/v3/schemas/schemas.ts
- packages/cli-v3/src/entryPoints/managed-index-worker.ts
- packages/core/src/v3/usage/api.ts
- packages/core/src/v3/lifecycleHooks/manager.ts
- packages/core/src/v3/workers/populateEnv.ts
- packages/core/src/v3/taskContext/otelProcessors.ts
- apps/supervisor/src/util.ts
- packages/core/src/v3/timeout/api.ts
- packages/core/src/v3/runMetadata/manager.ts
- packages/core/src/v3/config.ts
- packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts
- packages/core/src/v3/usage/prodUsageManager.ts
- packages/cli-v3/src/executions/taskRunProcess.ts
- packages/core/src/v3/taskContext/index.ts
- packages/core/src/v3/workers/taskExecutor.ts
- packages/cli-v3/src/entryPoints/managed/controller.ts
- packages/cli-v3/src/entryPoints/dev-run-controller.ts
- packages/cli-v3/src/entryPoints/managed/execution.ts
- packages/cli-v3/src/dev/devSupervisor.ts
- packages/cli-v3/src/dev/taskRunProcessPool.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (13)
apps/webapp/app/v3/otlpExporter.server.ts (5)
186-186
: Function rename supports hierarchical attribute extraction.The rename from
extractResourceProperties
toextractEventProperties
better reflects the function's enhanced capability to extract properties at different levels (resource vs event-specific with prefix).
197-200
: Good use of metadata prefix for event-specific properties.The addition of
SemanticInternalAttributes.METADATA
as a prefix enables extraction of nested log-specific properties, allowing event-level overrides of resource-level properties.
300-303
: Consistent metadata prefix usage for spans.The span property extraction follows the same pattern as logs, maintaining consistency in how metadata prefixes are handled.
418-494
: Well-implemented hierarchical key support.The enhanced
extractEventProperties
function correctly handles the optional prefix parameter by passing it to the extraction helpers, enabling nested attribute lookup while maintaining backward compatibility.
716-806
: Robust hierarchical key construction with proper type safety.The attribute extraction helpers are well-enhanced with:
- Function overloads providing clear type contracts
- Array support for hierarchical keys with proper filtering of undefined values
- Consistent dot-joining logic across all extraction functions
The implementation correctly handles both string and array inputs while maintaining existing functionality.
packages/cli-v3/src/entryPoints/dev-run-worker.ts (5)
166-252
: LGTM! Bootstrap caching implementation is well-structuredThe extraction of bootstrap logic into
doBootstrap()
with caching inbootstrap()
is a clean approach that avoids repeated initialization while maintaining the same interface. The metric measurement wrapper provides good observability.
325-334
: Flush synchronization prevents race conditionsThe implementation correctly waits for any ongoing flush operations before starting a new task execution. This prevents race conditions between flush operations and new task starts.
584-624
: Flush promise lifecycle management is well-implementedThe use of
promiseWithResolvers
to create and resolve flush promises provides proper synchronization. The promise is created at the start offlushAll()
and resolved at the end, ensuring subsequent task runs wait for completion.
392-459
: Task import optimization with resource catalog checkThe refactored task import logic first checks the resource catalog before importing, which is an optimization for process reuse scenarios. The error handling and logging are comprehensive.
284-303
: Let’s retry locating and inspectingStandardResourceCatalog
without the incorrect flags:#!/bin/bash # 1. Find the file defining StandardResourceCatalog file=$(rg -l "class StandardResourceCatalog" -g "*.ts") echo "Definition file: $file" # 2. Show its class declaration and surrounding lines echo -e "\n=== Class declaration (+5/-2 lines) ===" rg -n "class StandardResourceCatalog" "$file" -A5 -B2 # 3. Search within that file for any reset or clear methods echo -e "\n=== Methods matching reset|clear ===" rg -nE "reset\s*\(|clear\s*\(" "$file"packages/cli-v3/src/entryPoints/managed-run-worker.ts (3)
157-241
: Bootstrap caching implementation matches dev worker patternThe bootstrap caching implementation is consistent with the dev worker approach, properly extracting the initialization logic and caching results. The managed-specific configuration (different telemetry structure) is handled correctly.
270-289
: Environment reset function is comprehensiveThe
resetExecutionEnvironment()
function properly resets all global managers and state variables. The usage ofusage.reset()
andtimeout.reset()
instead of manager-specific resets reflects the managed environment's usage manager pattern.
584-624
: Flush synchronization implementation mirrors dev workerThe flush promise management using
promiseWithResolvers
is implemented consistently with the dev worker, ensuring proper synchronization between task executions.
…ributes because resource attributes are global and immutable
…tions in the attempt span
Also don’t re-import the task file if the task is already in the resourceCatalog
b0bbc08
to
847109f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (7)
apps/webapp/app/v3/otlpExporter.server.ts (3)
239-268
: Refactor the property merging to reduce duplication and maintenance burdenThe explicit property-by-property merging creates a maintenance nightmare with extensive duplication between logs and spans. This pattern is error-prone and makes it easy to miss properties when adding new fields.
Create a helper function to handle the property merging:
+function mergeEventProperties( + eventProps: ReturnType<typeof extractEventProperties>, + resourceProps: ReturnType<typeof extractEventProperties> +): Partial<CreatableEvent> { + return { + metadata: eventProps.metadata ?? resourceProps.metadata, + serviceName: eventProps.serviceName ?? resourceProps.serviceName ?? "unknown", + serviceNamespace: eventProps.serviceNamespace ?? resourceProps.serviceNamespace ?? "unknown", + environmentId: eventProps.environmentId ?? resourceProps.environmentId ?? "unknown", + environmentType: eventProps.environmentType ?? resourceProps.environmentType ?? "DEVELOPMENT", + organizationId: eventProps.organizationId ?? resourceProps.organizationId ?? "unknown", + projectId: eventProps.projectId ?? resourceProps.projectId ?? "unknown", + projectRef: eventProps.projectRef ?? resourceProps.projectRef ?? "unknown", + runId: eventProps.runId ?? resourceProps.runId ?? "unknown", + runIsTest: eventProps.runIsTest ?? resourceProps.runIsTest ?? false, + taskSlug: eventProps.taskSlug ?? resourceProps.taskSlug ?? "unknown", + taskPath: eventProps.taskPath ?? resourceProps.taskPath ?? "unknown", + workerId: eventProps.workerId ?? resourceProps.workerId ?? "unknown", + workerVersion: eventProps.workerVersion ?? resourceProps.workerVersion ?? "unknown", + queueId: eventProps.queueId ?? resourceProps.queueId ?? "unknown", + queueName: eventProps.queueName ?? resourceProps.queueName ?? "unknown", + batchId: eventProps.batchId ?? resourceProps.batchId, + idempotencyKey: eventProps.idempotencyKey ?? resourceProps.idempotencyKey, + machinePreset: eventProps.machinePreset ?? resourceProps.machinePreset, + machinePresetCpu: eventProps.machinePresetCpu ?? resourceProps.machinePresetCpu, + machinePresetMemory: eventProps.machinePresetMemory ?? resourceProps.machinePresetMemory, + machinePresetCentsPerMs: eventProps.machinePresetCentsPerMs ?? resourceProps.machinePresetCentsPerMs, + }; +} return { // ... other properties - metadata: logProperties.metadata ?? resourceProperties.metadata, - serviceName: logProperties.serviceName ?? resourceProperties.serviceName ?? "unknown", - // ... rest of the manual property assignments + ...mergeEventProperties(logProperties, resourceProperties), attemptId: extractStringAttribute( log.attributes ?? [], [SemanticInternalAttributes.METADATA, SemanticInternalAttributes.ATTEMPT_ID].join(".") ) ?? resourceProperties.attemptId, // ... continue with attempt-specific properties
358-388
: Same property merging duplication issueThis section has the identical property merging pattern as the logs conversion, reinforcing the need for the helper function mentioned in the previous comment.
Apply the same
mergeEventProperties
helper function here:return { // ... other properties - metadata: spanProperties.metadata ?? resourceProperties.metadata, - serviceName: spanProperties.serviceName ?? resourceProperties.serviceName ?? "unknown", - // ... rest of the manual property assignments + ...mergeEventProperties(spanProperties, resourceProperties), attemptId: extractStringAttribute( span.attributes ?? [], [SemanticInternalAttributes.METADATA, SemanticInternalAttributes.ATTEMPT_ID].join(".") ) ?? resourceProperties.attemptId, // ... continue with attempt-specific properties
418-433
: Fix the unsafe type casting for environmentTypeThe
environmentType
property is cast toCreatableEventEnvironmentType
without ensuring it's a valid enum value, which can hide undefined or invalid values.- environmentType: extractStringAttribute(attributes, [ - prefix, - SemanticInternalAttributes.ENVIRONMENT_TYPE, - ]) as CreatableEventEnvironmentType, + environmentType: (extractStringAttribute(attributes, [ + prefix, + SemanticInternalAttributes.ENVIRONMENT_TYPE, + ]) ?? "DEVELOPMENT") as CreatableEventEnvironmentType,Or better yet, validate the value:
+function validateEnvironmentType(value: string | undefined): CreatableEventEnvironmentType { + const validTypes = ["DEVELOPMENT", "STAGING", "PRODUCTION"] as const; + return validTypes.includes(value as any) ? (value as CreatableEventEnvironmentType) : "DEVELOPMENT"; +} - environmentType: extractStringAttribute(attributes, [ - prefix, - SemanticInternalAttributes.ENVIRONMENT_TYPE, - ]) as CreatableEventEnvironmentType, + environmentType: validateEnvironmentType(extractStringAttribute(attributes, [ + prefix, + SemanticInternalAttributes.ENVIRONMENT_TYPE, + ])),packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (2)
211-217
: cleanup() fires asynchronous work without awaiting
cleanup()
callsthis.cleanupProcess(...)
but returnsvoid
, discarding the promise. If callers assume all processes are dead aftercleanup()
, they might race with the still-running kill.#!/bin/bash # Check if cleanup() is called and expected to be synchronous rg -A 5 '\.cleanup\(\)' --glob '*.ts' | grep -E 'await.*cleanup|\.then'
255-261
: PossibleNaN
/'undefined'
inHEARTBEAT_INTERVAL_MS
TRIGGER_HEARTBEAT_INTERVAL_SECONDS
can be empty →Number(undefined)
→NaN
, which is stringified and forwarded to the child.Guard the value before injecting it:
- HEARTBEAT_INTERVAL_MS: String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000), + HEARTBEAT_INTERVAL_MS: this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS + ? String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000) + : "30000", // Default to 30 secondspackages/cli-v3/src/entryPoints/managed-run-worker.ts (1)
506-507
:AbortSignal.any()
may not be available on Node ≤ 18
AbortSignal.any()
is only available in Node 20+. This will crash at runtime on earlier versions.Implement a fallback for Node 18 compatibility:
-const signal = AbortSignal.any([_cancelController.signal, timeoutController.signal]); +// Create a compatible abort signal that works on Node 18+ +const signal = AbortSignal.any + ? AbortSignal.any([_cancelController.signal, timeoutController.signal]) + : (() => { + const controller = new AbortController(); + const abort = () => controller.abort(); + _cancelController.signal.addEventListener('abort', abort); + timeoutController.signal.addEventListener('abort', abort); + return controller.signal; + })();packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)
610-626
:version
sent to pool may differ on return
getProcess
receivesversion: this.opts.worker.serverWorker?.version
(possiblyundefined
), butreturnProcess
uses"unknown"
when undefined. This mismatch can leak processes in the pool.Use consistent version normalization:
+private normalizeVersion(): string { + return this.opts.worker.serverWorker?.version || "unknown"; +} // In executeRun: -version: this.opts.worker.serverWorker?.version, +version: this.normalizeVersion(), // In all returnProcess calls: -const version = this.opts.worker.serverWorker?.version || "unknown"; +const version = this.normalizeVersion();Also applies to: 656-659
🧹 Nitpick comments (2)
packages/cli-v3/src/dev/taskRunProcessPool.ts (2)
129-134
: Consider cleaning up execution counts for killed processesThe execution count map will grow indefinitely as it's never cleaned up when processes are killed.
Clean up the execution count when killing a process:
private async killProcess(process: TaskRunProcess): Promise<void> { try { await process.cleanup(true); + if (process.pid) { + this.executionCountsPerProcess.delete(process.pid); + } } catch (error) { logger.debug("[TaskRunProcessPool] Error killing process", { error }); } }
179-207
: Comprehensive reuse eligibility checksThe method thoroughly evaluates all relevant factors for process reuse eligibility.
Consider extracting the magic number for readability:
+private readonly DEFAULT_MAX_EXECUTIONS = 50; + constructor(options: TaskRunProcessPoolOptions) { this.options = options; this.maxPoolSize = options.maxPoolSize ?? 3; - this.maxExecutionsPerProcess = options.maxExecutionsPerProcess ?? 50; + this.maxExecutionsPerProcess = options.maxExecutionsPerProcess ?? this.DEFAULT_MAX_EXECUTIONS; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (3)
references/hello-world/src/trigger/example.ts
is excluded by!references/**
references/hello-world/src/trigger/schedule.ts
is excluded by!references/**
references/hello-world/trigger.config.ts
is excluded by!references/**
📒 Files selected for processing (38)
.changeset/polite-badgers-suffer.md
(1 hunks)apps/supervisor/src/util.ts
(1 hunks)apps/supervisor/src/workloadManager/docker.ts
(2 hunks)apps/webapp/app/utils/timelineSpanEvents.ts
(0 hunks)apps/webapp/app/v3/otlpExporter.server.ts
(8 hunks)packages/cli-v3/src/dev/devSupervisor.ts
(6 hunks)packages/cli-v3/src/dev/taskRunProcessPool.ts
(1 hunks)packages/cli-v3/src/entryPoints/dev-run-controller.ts
(4 hunks)packages/cli-v3/src/entryPoints/dev-run-worker.ts
(14 hunks)packages/cli-v3/src/entryPoints/managed-index-worker.ts
(1 hunks)packages/cli-v3/src/entryPoints/managed-run-worker.ts
(14 hunks)packages/cli-v3/src/entryPoints/managed/controller.ts
(9 hunks)packages/cli-v3/src/entryPoints/managed/execution.ts
(21 hunks)packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts
(1 hunks)packages/cli-v3/src/executions/taskRunProcess.ts
(4 hunks)packages/core/src/v3/config.ts
(1 hunks)packages/core/src/v3/lifecycleHooks/manager.ts
(1 hunks)packages/core/src/v3/locals/manager.ts
(1 hunks)packages/core/src/v3/otel/tracingSDK.ts
(2 hunks)packages/core/src/v3/runMetadata/manager.ts
(1 hunks)packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts
(2 hunks)packages/core/src/v3/runtime/sharedRuntimeManager.ts
(1 hunks)packages/core/src/v3/schemas/build.ts
(1 hunks)packages/core/src/v3/schemas/schemas.ts
(1 hunks)packages/core/src/v3/semanticInternalAttributes.ts
(1 hunks)packages/core/src/v3/taskContext/index.ts
(1 hunks)packages/core/src/v3/taskContext/otelProcessors.ts
(3 hunks)packages/core/src/v3/timeout/api.ts
(2 hunks)packages/core/src/v3/timeout/types.ts
(1 hunks)packages/core/src/v3/timeout/usageTimeoutManager.ts
(1 hunks)packages/core/src/v3/usage/api.ts
(1 hunks)packages/core/src/v3/usage/devUsageManager.ts
(1 hunks)packages/core/src/v3/usage/noopUsageManager.ts
(1 hunks)packages/core/src/v3/usage/prodUsageManager.ts
(2 hunks)packages/core/src/v3/usage/types.ts
(1 hunks)packages/core/src/v3/waitUntil/manager.ts
(1 hunks)packages/core/src/v3/workers/populateEnv.ts
(3 hunks)packages/core/src/v3/workers/taskExecutor.ts
(5 hunks)
💤 Files with no reviewable changes (1)
- apps/webapp/app/utils/timelineSpanEvents.ts
🚧 Files skipped from review as they are similar to previous changes (32)
- .changeset/polite-badgers-suffer.md
- packages/core/src/v3/schemas/build.ts
- apps/supervisor/src/workloadManager/docker.ts
- packages/core/src/v3/usage/types.ts
- packages/core/src/v3/locals/manager.ts
- packages/core/src/v3/usage/api.ts
- packages/core/src/v3/timeout/types.ts
- packages/core/src/v3/waitUntil/manager.ts
- packages/core/src/v3/usage/devUsageManager.ts
- packages/core/src/v3/usage/noopUsageManager.ts
- packages/core/src/v3/semanticInternalAttributes.ts
- packages/core/src/v3/runMetadata/manager.ts
- packages/cli-v3/src/entryPoints/managed-index-worker.ts
- packages/core/src/v3/runtime/sharedRuntimeManager.ts
- packages/core/src/v3/timeout/api.ts
- packages/core/src/v3/timeout/usageTimeoutManager.ts
- packages/core/src/v3/schemas/schemas.ts
- packages/core/src/v3/lifecycleHooks/manager.ts
- packages/core/src/v3/taskContext/otelProcessors.ts
- packages/core/src/v3/workers/populateEnv.ts
- packages/core/src/v3/otel/tracingSDK.ts
- apps/supervisor/src/util.ts
- packages/core/src/v3/config.ts
- packages/core/src/v3/usage/prodUsageManager.ts
- packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts
- packages/core/src/v3/taskContext/index.ts
- packages/cli-v3/src/executions/taskRunProcess.ts
- packages/core/src/v3/workers/taskExecutor.ts
- packages/cli-v3/src/dev/devSupervisor.ts
- packages/cli-v3/src/entryPoints/managed/controller.ts
- packages/cli-v3/src/entryPoints/dev-run-worker.ts
- packages/cli-v3/src/entryPoints/managed/execution.ts
🧰 Additional context used
📓 Path-based instructions (2)
`apps/webapp/app/v3/otlpExporter.server.ts`: This file contains code for TaskEvent data (otel data sent from tasks to servers).
apps/webapp/app/v3/otlpExporter.server.ts
: This file contains code for TaskEvent data (otel data sent from tasks to servers).
📄 Source: CodeRabbit Inference Engine (.cursor/rules/webapp.mdc)
List of files the instruction was applied to:
apps/webapp/app/v3/otlpExporter.server.ts
`apps/webapp/**/*`: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20. Use zod extensively in the webapp.
apps/webapp/**/*
: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20.
Use zod extensively in the webapp.
📄 Source: CodeRabbit Inference Engine (.github/copilot-instructions.md)
List of files the instruction was applied to:
apps/webapp/app/v3/otlpExporter.server.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (13)
apps/webapp/app/v3/otlpExporter.server.ts (8)
186-186
: LGTM: Function rename and parameter additionThe rename from
extractResourceProperties
toextractEventProperties
better reflects the function's purpose, and the optional prefix parameter enables hierarchical attribute extraction.
197-200
: LGTM: Consistent prefix usage for nested metadata extractionThe addition of the
SemanticInternalAttributes.METADATA
prefix for log properties extraction is consistent with the span implementation.
300-303
: LGTM: Consistent prefix usage for nested metadata extractionThe span properties extraction correctly uses the same prefix pattern as the log implementation.
434-490
: LGTM: Prefix support for nested attribute extractionThe implementation correctly uses the optional prefix parameter to build hierarchical attribute keys. The logic handles both single string keys and array-based nested keys appropriately.
714-733
: LGTM: Enhanced key handling with array supportThe function overloads and array key handling logic are well-implemented. The
filter(Boolean).join(".")
approach correctly handles undefined values in the array and creates proper dot-notation keys.
737-756
: LGTM: Consistent implementation across attribute extractorsThe
extractNumberAttribute
function follows the same pattern asextractStringAttribute
for array key handling.
760-779
: LGTM: Consistent implementation continuesThe
extractDoubleAttribute
function maintains consistency with the other extractor functions.
783-802
: LGTM: Complete implementation of array key supportThe
extractBooleanAttribute
function completes the consistent implementation of array key support across all attribute extractors.packages/cli-v3/src/entryPoints/managed-run-worker.ts (3)
157-259
: Excellent refactoring of bootstrap logicThe extraction of
doBootstrap()
with metric tracking and caching inbootstrap()
is a clean implementation that will significantly improve warm start performance.
316-324
: Well-implemented flush synchronizationThe addition of
_lastFlushPromise
tracking and awaiting ensures proper synchronization between task executions, preventing potential data races during flush operations.Also applies to: 585-624
270-289
: Comprehensive environment reset implementationThe
resetExecutionEnvironment()
function properly resets all global state, which is crucial for safe process reuse. Good job ensuring all managers have reset methods.Verify that all global managers have reset() methods implemented:
#!/bin/bash # Check if all managers referenced in resetExecutionEnvironment have reset methods for manager in "StandardLocalsManager" "StandardLifecycleHooksManager" "StandardRunTimelineMetricsManager" "StandardMetadataManager" "StandardWaitUntilManager" "SharedRuntimeManager" "DurableClock"; do echo "Checking $manager for reset method:" ast-grep --pattern "class $manager { $$$ reset($$$) { $$$ } $$$ }" donepackages/cli-v3/src/entryPoints/dev-run-controller.ts (1)
656-664
: Good defensive error handling for pool operationsThe consistent pattern of catching and logging errors when returning processes to the pool ensures that pool management failures don't interrupt the main execution flow.
Also applies to: 776-785, 814-821
packages/cli-v3/src/dev/taskRunProcessPool.ts (1)
46-120
: Well-structured process acquisition logicThe
getProcess
method correctly handles both reuse and new process creation scenarios with proper state management between available and busy pools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
apps/webapp/test/timelineSpanEvents.test.ts (1)
26-33
: Consider updating test data comments for clarity.While the test data correctly preserves the original structure with specific file names (which is good for testing the transformation), consider adding comments to clarify that the specific file information in the test data gets transformed to generic event names.
This would help future maintainers understand why the test data contains specific file names but the assertions expect generic names.
{ name: "trigger.dev/start", time: new Date("2025-04-04T08:39:27.224Z"), properties: { + // File name in properties gets transformed to generic "Importing task file" event name file: "src/trigger/chat.ts", event: "import", duration: 67, entryPoint: "/Users/eric/code/triggerdotdev/trigger.dev/references/d3-chat/.trigger/tmp/build-AL7zTl/references/d3-chat/src/trigger/chat.mjs", }, },
Also applies to: 51-58
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/webapp/test/timelineSpanEvents.test.ts
(5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`apps/webapp/**/*`: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20. Use zod extensively in the webapp.
apps/webapp/**/*
: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20.
Use zod extensively in the webapp.
📄 Source: CodeRabbit Inference Engine (.github/copilot-instructions.md)
List of files the instruction was applied to:
apps/webapp/test/timelineSpanEvents.test.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
apps/webapp/test/timelineSpanEvents.test.ts (5)
79-79
: LGTM: Consistent test assertion update.The test assertion correctly expects the generic "Importing task file" event name, aligning with the updated event naming strategy.
89-89
: LGTM: Event ordering test updated correctly.The event ordering test properly expects "Importing task file" as the 4th event in the sorted timeline.
179-181
: LGTM: Help text verification updated consistently.The test correctly verifies that the generic "Importing task file" event has the appropriate help text "A task file was imported".
190-190
: LGTM: Duration preservation test updated correctly.The test properly verifies that duration is preserved for the generic "Importing task file" event.
217-217
: LGTM: Non-admin visibility test updated consistently.The test correctly verifies that non-admin users can see the generic "Importing task file" event when no fork event exists.
const enableProcessReuse = | ||
typeof this.options.config.experimental_processKeepAlive === "boolean" | ||
? this.options.config.experimental_processKeepAlive | ||
: false; | ||
|
||
if (enableProcessReuse) { | ||
logger.debug("[DevSupervisor] Enabling process reuse", { | ||
enableProcessReuse, | ||
}); | ||
} | ||
|
||
this.taskRunProcessPool = new TaskRunProcessPool({ | ||
env, | ||
cwd: this.options.config.workingDir, | ||
enableProcessReuse: | ||
typeof this.options.config.experimental_processKeepAlive === "boolean" | ||
? this.options.config.experimental_processKeepAlive | ||
: typeof this.options.config.experimental_processKeepAlive === "object" | ||
? this.options.config.experimental_processKeepAlive.enabled | ||
: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enableProcessReuse
var and prop differ
await this.taskRunProcess?.kill("SIGKILL"); | ||
if (this.taskRunProcess) { | ||
await this.taskRunProcessProvider.handleProcessAbort(this.taskRunProcess); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to check that this does the same thing as before. handleProcessAbort
calls process.cleanup instead of process.kill
@@ -249,7 +232,11 @@ export class RunExecution { | |||
if (this.currentAttemptNumber && this.currentAttemptNumber !== run.attemptNumber) { | |||
this.sendDebugLog("error: attempt number mismatch", snapshotMetadata); | |||
// This is a rogue execution, a new one will already have been created elsewhere | |||
await this.exitTaskRunProcessWithoutFailingRun({ flush: false }); | |||
// TODO: keep this one, kill the process even if it's a keep-alive one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo?
@@ -263,7 +250,11 @@ export class RunExecution { | |||
if (deprecated) { | |||
this.sendDebugLog("run execution is deprecated", { incomingSnapshot: snapshot }); | |||
|
|||
await this.exitTaskRunProcessWithoutFailingRun({ flush: false }); | |||
// TODO: keep this one, kill the process even if it's a keep-alive one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo?
case "FINISHED": { | ||
this.sendDebugLog("run is finished", snapshotMetadata); | ||
|
||
await this.exitTaskRunProcessWithoutFailingRun({ flush: true }); | ||
// This can sometimes be called before the handleCompletionResult, so we don't need to do anything here | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we need to exit here after all. Maybe not immediately, but as a backup. It could help catch race conditions where the poller detects a run is finished, even if handleCompletionResult is never called.
@@ -150,6 +148,7 @@ export class TaskRunProcess { | |||
PATH: process.env.PATH, | |||
TRIGGER_PROCESS_FORK_START_TIME: String(Date.now()), | |||
TRIGGER_WARM_START: this.options.isWarmStart ? "true" : "false", | |||
TRIGGERDOTDEV: "1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥳
Currently during a warm start, we still recreate the actual task run process between runs, leading to a completely fresh global environment for each run. This experimental option will keep the task run process alive between run executions, leading to even faster warm starts. This option is respected in both the
dev
CLI and in deployed tasks.To enable this option, add this to your
trigger.config.ts
:You can also pass an object to
experimental_processKeepAlive
to provide more options:Gotchas
run
function of a task.We recommend enabling this option and testing in a staging or preview environment before trying it out in prod, as there could be unknown issues depending on what you are doing in your tasks.
How to use
Install the following version of all the
@trigger.dev/*
packages and thetrigger.dev
CLI:0.0.0-process-keep-alive-20250623153513
This is part 1 of 2 in a stack made with GitButler: