Skip to content

v4: Experimental process keep alive #2183

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 32 commits into
base: main
Choose a base branch
from
Open

Conversation

ericallam
Copy link
Member

@ericallam ericallam commented Jun 18, 2025

Currently during a warm start, we still recreate the actual task run process between runs, leading to a completely fresh global environment for each run. This experimental option will keep the task run process alive between run executions, leading to even faster warm starts. This option is respected in both the dev CLI and in deployed tasks.

To enable this option, add this to your trigger.config.ts:

import { defineConfig } from "@trigger.dev/sdk";

export default defineConfig({
  project: "<project ref>",
  // This is false by default
  experimental_processKeepAlive: true,
  maxDuration: 60,
});

You can also pass an object to experimental_processKeepAlive to provide more options:

import { defineConfig } from "@trigger.dev/sdk";

export default defineConfig({
  project: "<project ref>",
  experimental_processKeepAlive: {
    enabled: true,
    // After 20 runs execute with a single process, we'll restart the process and start fresh
    maxExecutionsPerProcess: 20,
    // In dev, you can combine this option with setting a max pool size, giving you the ability to limit the number of processes created on your local dev machine. Has no effect on deployed tasks
    devMaxPoolSize: 10,
  },
  maxDuration: 60,
});

Gotchas

  • Be careful with memory usage and memory leaks, as this will cause memory to persist across run executions.
  • It's possible different tasks get executed in the same persisted process.
  • If you configure any 3rd party SDKs globally using env vars for API keys, those SDKs will not change between runs. So if you change an env var between runs, the SDK will be "stale" and continue using the old env var value. Instead, you should initialize SDKs using env vars at runtime (in any of the lifecycle hooks or inside the run function of a task.
  • Cancelling a task will cause the task run process to be restarted. Exiting the process is the only reliable way to actually stop a running function from stopping.
  • This DOES NOT effect cold starts, warm starts only will be improved.

We recommend enabling this option and testing in a staging or preview environment before trying it out in prod, as there could be unknown issues depending on what you are doing in your tasks.

How to use

Install the following version of all the @trigger.dev/* packages and the trigger.dev CLI: 0.0.0-process-keep-alive-20250623153513


This is part 1 of 2 in a stack made with GitButler:

Copy link

changeset-bot bot commented Jun 18, 2025

🦋 Changeset detected

Latest commit: 7c70800

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link
Contributor

coderabbitai bot commented Jun 18, 2025

Walkthrough

This update introduces a comprehensive process keep-alive and process pooling mechanism for task execution in both development and managed environments. New classes (TaskRunProcessPool, TaskRunProcessProvider) manage the lifecycle and reuse of task execution processes, governed by configuration options in the TriggerConfig and WorkerManifest schemas. The DevSupervisor, DevRunController, and managed run controller/execution flows are refactored to utilize these pools/providers, supporting process reuse, version deprecation, and cleanup. Numerous core managers (locals, hooks, usage, timeout, waitUntil, metadata, metrics, and shared runtime) gain reset() methods to facilitate environment resets between executions. The worker entry points are updated to reset execution state before each run and to synchronize flush operations. Additional changes include enhancements to environment variable population, resource attribute extraction supporting nested keys, and schema updates for process keep-alive and queue concurrency limits.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 352b046 and 7c70800.

📒 Files selected for processing (1)
  • internal-packages/run-engine/vitest.config.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • internal-packages/run-engine/vitest.config.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 19

🔭 Outside diff range comments (1)
apps/webapp/app/v3/otlpExporter.server.ts (1)

289-304: Span-side extraction duplicates log logic

Same comment as above – consider centralising the merge logic to avoid two divergent copies.

♻️ Duplicate comments (1)
packages/cli-v3/src/entryPoints/dev-run-worker.ts (1)

468-470: Same Node-version concern as managed worker

AbortSignal.any() requires Node 20. Add the same fallback / poly-fill strategy here to stay compatible with common Node 18 dev environments.

🧹 Nitpick comments (22)
packages/core/src/v3/locals/manager.ts (1)

37-39: Minor: consider idempotent reset in multi-instance scenarios.

reset() blindly clears the shared store. If multiple logical subsystems share the same StandardLocalsManager instance, one caller could erase locals needed by another.

If cross-component sharing is expected, expose a new isolated instance instead of clearing the global one, or guard the call behind ownership checks.

packages/core/src/v3/waitUntil/manager.ts (2)

6-8: Reset logic is sound, but duplicated with blockUntilSettled() finaliser.

reset() clears maybeDeferredPromises, yet blockUntilSettled() also does so on line 28.
Not harmful, but the double-clear is redundant.
Optionally call reset() from within blockUntilSettled() to centralise the logic.


15-21: Typo in property name reduces readability.

promisesRequringResolving is missing an “i”.
While internal, consistent naming helps maintenance.

-  private get promisesRequringResolving(): MaybeDeferredPromise[] {
+  private get promisesRequiringResolving(): MaybeDeferredPromise[] {
.changeset/polite-badgers-suffer.md (1)

5-5: Nit: capitalise feature description for consistency.

-experimental processKeepAlive
+Experimental: processKeepAlive
packages/core/src/v3/usage/devUsageManager.ts (1)

53-57: Stop or finalize in-flight measurements before wiping maps

reset() clears _currentMeasurements, but any DevUsageMeasurement already handed out keeps running and will never be stopped or sampled.
Consider finishing them (e.g. measurement.stop()) or at least logging dropped measurements before the map is emptied. This prevents silent loss of usage data and avoids dangling references.

reset(): void {
-  this._firstMeasurement = undefined;
-  this._currentMeasurements.clear();
-  this._pauses.clear();
+  // Finalise any active measurements to avoid leaking state
+  for (const m of this._currentMeasurements.values()) {
+    m.stop();
+  }
+
+  this._firstMeasurement = undefined;
+  this._currentMeasurements.clear();
+  this._pauses.clear();
}
packages/core/src/v3/runtime/sharedRuntimeManager.ts (1)

44-47: Consider cancelling the 5-minute debug interval on reset

While reset() clears resolver maps, the setInterval created in the constructor keeps running forever.
If the same manager instance is reused across executions, each call to reset() adds no new interval but also never removes the old one – acceptable, yet you may want to clear it to minimise noise and avoid multiple loggers if the class is ever reinstantiated.

Minor, but worth noting.

packages/cli-v3/src/entryPoints/managed-index-worker.ts (1)

162-167: Factor out keep-alive config mapping to a shared helper

Identical ternary logic is popping up in multiple entry points. Centralising the conversion to a small helper (e.g. toProcessKeepAliveConfig(config.experimental_processKeepAlive)) removes duplication and guarantees consistent behaviour if the rules ever change.

packages/core/src/v3/otel/tracingSDK.ts (1)

60-63: Getter is fine – consider thread-safety annotation

isResolved is a harmless addition and useful for the call-sites.
If you anticipate concurrent access from multiple worker threads, mark the field private _resolved: volatile boolean (or document the race expectations) – currently it’s fine in single-threaded Node.

packages/core/src/v3/timeout/api.ts (1)

45-48: Consider flipping the reset / disable order

reset() first calls #getManager().reset() before unregistering the global manager.
If reset() implementations inspect global state (e.g. call back into TimeoutAPI.getInstance()) they will still see themselves registered, which is good, but conversely any code running concurrently after reset() but before unregisterGlobal() still sees a “live” manager that has just been reset – potentially surprising.

-    this.#getManager().reset();
-    this.disable();
+    const manager = this.#getManager();
+    this.disable();          // make future look-ups return NOOP immediately
+    manager.reset();         // finally reset the previous instance

This keeps externally-visible state consistent while still allowing the underlying manager to clean itself up.
Feel free to ignore if your implementation relies on the current order.

packages/cli-v3/src/executions/taskRunProcess.ts (1)

296-298: Minor nit – mark isExecuting() as readonly getter

A getter is more idiomatic and prevents accidental reassignment.

-  isExecuting() {
-    return this._currentExecution !== undefined;
-  }
+  get isExecuting() {
+    return this._currentExecution !== undefined;
+  }
packages/core/src/v3/config.ts (1)

237-261: enabled property seems redundant when the parent key is already optional

Because experimental_processKeepAlive can itself be false, the nested { enabled } flag duplicates information:

experimental_processKeepAlive?: boolean | {
  enabled: boolean;   // duplicated
  
}

Simplify the shape by dropping enabled and relying on:

experimental_processKeepAlive?: {
  maxExecutionsPerProcess?: number;
  devMaxPoolSize?: number;
} | false;

This removes one boolean and a layer of cognitive overhead for users.

packages/core/src/v3/workers/populateEnv.ts (1)

34-35: Minor: previousEnv belongs in destructuring but isn’t documented in JSDoc.

Consider adding it to the function comment for clarity.

packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts (1)

41-46: Duplicate metric risk.

#seedMetricsFromEnvironment() runs every call to registerMetricsFromExecution; with a long-lived process this seeds the same fork metric repeatedly.
Guard with a boolean or clear in reset().

packages/cli-v3/src/dev/devSupervisor.ts (1)

100-133: Boolean plumbing is overly repetitive.

enableProcessReuse is computed on 103-107, but the ternary logic is duplicated when constructing the pool (117-122). Reuse the variable to avoid divergence.

packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)

777-785: Fallback if returnProcess fails.

When returning the process in runFinished fails, the process handle is lost but still running. Call kill() after logging the failure to avoid leaks.

packages/cli-v3/src/entryPoints/managed/controller.ts (1)

41-43: Duplicate state – local vars already exist in manifest.

processKeepAliveEnabled/MaxExecutionCount duplicate workerManifest.processKeepAlive. You can read straight from the manifest where needed, reducing state drift.

packages/cli-v3/src/entryPoints/managed/execution.ts (1)

612-616: Leaked reference to completed process

After a successful returnProcess, this.taskRunProcess still holds the (possibly pooled) process instance.
On the next attempt getProcess() will create / supply another process but the old reference remains, preventing GC and complicating reasoning.
Set it to undefined once the provider takes ownership back.

await this.taskRunProcessProvider.returnProcess(this.taskRunProcess);
-// ...
+this.taskRunProcess = undefined;
packages/cli-v3/src/dev/taskRunProcessPool.ts (1)

130-134: Guard pid before using as map key

process.pid is assumed truthy. If spawning failed or the child died early, pid can be undefined, leading to the magic key 0 and count collisions.
Skip bookkeeping when pid is falsy.

apps/webapp/app/v3/otlpExporter.server.ts (4)

186-188: Good abstraction, but consider memoising per‐resource

resourceProperties is recomputed for every resource group, which is fine, but if you later loop over multiple scopes of the same resource you could avoid repeating the extraction by caching the result in a map keyed by the resource pointer or hash.


191-213: Reuse the computed logLevel

You already compute const logLevel = logLevelToEventLevel(log.severityNumber); but later call the helper again for the level property.
Use the cached variable to eliminate an unnecessary function call.

-  level: logLevelToEventLevel(log.severityNumber),
+  level: logLevel,

418-494: Type-safety of extractEventProperties

  1. The direct cast as CreatableEventEnvironmentType is unsafe (value may be undefined or an arbitrary string).
    Provide a whitelist check or default:
- environmentType: extractStringAttribute(... ) as CreatableEventEnvironmentType,
+ environmentType:
+   (extractStringAttribute(...) as CreatableEventEnvironmentType) ?? "DEVELOPMENT",
  1. Repeating extractStringAttribute(attributes, [prefix, X]) can be shortened by a local helper:
const k = (k: string) => extractStringAttribute(attributes, [prefix, k]);

which will further simplify maintenance.


716-807: Overload bloat can be trimmed

The four helper extractors (extractStringAttribute, etc.) repeat two overload signatures each. TS supports an optional parameter, so one overload is enough:

function extractStringAttribute(
  attrs: KeyValue[],
  name: string | (string | undefined)[],
  fallback?: string,
): string | undefined { ... }

Less code, same type safety.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c93783 and bdc515a.

⛔ Files ignored due to path filters (3)
  • references/hello-world/src/trigger/example.ts is excluded by !references/**
  • references/hello-world/src/trigger/schedule.ts is excluded by !references/**
  • references/hello-world/trigger.config.ts is excluded by !references/**
📒 Files selected for processing (36)
  • .changeset/polite-badgers-suffer.md (1 hunks)
  • apps/supervisor/src/util.ts (1 hunks)
  • apps/supervisor/src/workloadManager/docker.ts (2 hunks)
  • apps/webapp/app/v3/otlpExporter.server.ts (8 hunks)
  • packages/cli-v3/src/dev/devSupervisor.ts (6 hunks)
  • packages/cli-v3/src/dev/taskRunProcessPool.ts (1 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts (4 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts (13 hunks)
  • packages/cli-v3/src/entryPoints/managed-index-worker.ts (1 hunks)
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts (12 hunks)
  • packages/cli-v3/src/entryPoints/managed/controller.ts (9 hunks)
  • packages/cli-v3/src/entryPoints/managed/execution.ts (13 hunks)
  • packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (1 hunks)
  • packages/cli-v3/src/executions/taskRunProcess.ts (4 hunks)
  • packages/core/src/v3/config.ts (1 hunks)
  • packages/core/src/v3/lifecycleHooks/manager.ts (1 hunks)
  • packages/core/src/v3/locals/manager.ts (1 hunks)
  • packages/core/src/v3/otel/tracingSDK.ts (2 hunks)
  • packages/core/src/v3/runMetadata/manager.ts (1 hunks)
  • packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts (2 hunks)
  • packages/core/src/v3/runtime/sharedRuntimeManager.ts (1 hunks)
  • packages/core/src/v3/schemas/build.ts (1 hunks)
  • packages/core/src/v3/schemas/schemas.ts (1 hunks)
  • packages/core/src/v3/taskContext/index.ts (1 hunks)
  • packages/core/src/v3/taskContext/otelProcessors.ts (3 hunks)
  • packages/core/src/v3/timeout/api.ts (2 hunks)
  • packages/core/src/v3/timeout/types.ts (1 hunks)
  • packages/core/src/v3/timeout/usageTimeoutManager.ts (1 hunks)
  • packages/core/src/v3/usage/api.ts (1 hunks)
  • packages/core/src/v3/usage/devUsageManager.ts (1 hunks)
  • packages/core/src/v3/usage/noopUsageManager.ts (1 hunks)
  • packages/core/src/v3/usage/prodUsageManager.ts (1 hunks)
  • packages/core/src/v3/usage/types.ts (1 hunks)
  • packages/core/src/v3/waitUntil/manager.ts (1 hunks)
  • packages/core/src/v3/workers/populateEnv.ts (3 hunks)
  • packages/core/src/v3/workers/taskExecutor.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (24)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (13)
packages/core/src/v3/taskContext/index.ts (1)

38-48: attributes getter no longer includes resourceAttributes – verify downstream expectations.

Prior versions merged all context + resource keys into .attributes.
After introducing resourceAttributes(), callers that still consume .attributes will silently lose environment / organisation / machine data.

Confirm that:

  1. All OTEL spans now call taskContext.resourceAttributes explicitly, or
  2. .attributes is intentionally slim and documented as such.

If not deliberate, append ...this.resourceAttributes to the spread list.

packages/core/src/v3/usage/noopUsageManager.ts (1)

30-32: LGTM – maintains interface parity.

packages/core/src/v3/usage/types.ts (1)

17-18: Interface change is breaking – ensure all implementations compile

Adding reset(): void to UsageManager is a breaking API change. Double-check that every custom/third-party implementation used inside the repo (and in downstream extensions) now implements this method to avoid compile/runtime errors.
A quick rg "implements UsageManager" can confirm.

packages/core/src/v3/timeout/types.ts (1)

4-4: Sync all TimeoutManager implementations with new reset contract

reset is now mandatory. Verify NoopTimeoutManager, any test fakes, and external plugins compile and provide a no-op implementation.

packages/core/src/v3/usage/api.ts (1)

51-54: Looks good – reset path correctly cleans up usage manager

The two-step reset() (manager reset ➔ disable global) is sound and avoids losing the reference before cleanup.

apps/supervisor/src/workloadManager/docker.ts (1)

80-82: Good call – host-aware warm-start URL

Using normalizeDockerHostUrl for TRIGGER_WARM_START_URL prevents “localhost“ pitfalls on macOS/Windows Docker.

packages/core/src/v3/schemas/schemas.ts (1)

165-167: Verify ramifications of raising concurrencyLimit hard-cap 100×

Jumping from 1 000 to 100 000 materially alters worst-case resource usage.
Please double-check:

  1. Any counters, DB fields, or metrics that assume the old 4-digit bound.
  2. Overflow / memory concerns when pre-allocating arrays or semaphore tokens.
  3. Operational safeguards (e.g. Kubernetes HPA, queue depth safeguards) that could now be bypassed.

If downstream code already treats the value as “untrusted” and clamps to environment limits, great – otherwise consider introducing a softer default (e.g. keep max(…) at 100 000 but apply min(requested, envLimit) at runtime).

packages/core/src/v3/otel/tracingSDK.ts (1)

130-132: 👍 Adds useful SDK metadata to the resource

Embedding SDK_VERSION and SDK_LANGUAGE helps trace attribution and doesn’t break prior contracts. Looks good.

packages/core/src/v3/taskContext/otelProcessors.ts (1)

22-23: 👍 Cleaner attribute propagation

Switching to taskContext.attributes removes duplication and keeps the attribute contract centralised – looks good.

Also applies to: 72-73, 98-99

packages/cli-v3/src/executions/taskRunProcess.ts (1)

312-313: 👍 Helpful debug context

Including the PID in exit logs will make post-mortem debugging far easier.

packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts (1)

36-38: reset() not invoked anywhere yet.

If the intent is to clear metrics between pooled executions, ensure callers (e.g. worker warm-start path) actually call reset().

packages/cli-v3/src/dev/devSupervisor.ts (1)

336-343: Guard may hide bugs when pool creation fails.

The pool is always instantiated; reaching this branch means init failed silently. Prefer throwing during init() rather than silently skipping runs.

packages/cli-v3/src/entryPoints/managed/execution.ts (1)

131-144: prepareForExecution became async – ensure callers await

The method now returns a Promise, but its signature didn't change in callers.
Search & update all invocations (await runExecution.prepareForExecution(...)) to avoid executing with an un-initialised process.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
packages/cli-v3/src/entryPoints/managed/execution.ts (1)

131-144: Warm-start process is created without resetting its event listeners

prepareForExecution() fetches a (possibly reused) TaskRunProcess, but does not call attachTaskRunProcessHandlers().
Until executeRun() runs, the process keeps the handlers from the previous attempt, so any heartbeat/debug-log emitted during that window will be routed to the stale callbacks and reference the wrong run ID.

Attach handlers immediately after acquiring the process (or detach them here) to guarantee a clean slate.

🧹 Nitpick comments (1)
packages/cli-v3/src/entryPoints/managed/execution.ts (1)

616-623: Clear this.taskRunProcess after returning it to the provider

After a successful return the reference is stale and its handlers are detached.
Nulling it out prevents accidental reuse and makes future state checks (if (this.taskRunProcess)) more accurate.

-      this.taskRunProcessProvider.returnProcess(this.taskRunProcess)
+      this.taskRunProcessProvider.returnProcess(this.taskRunProcess)
+        .then(() => {
+          this.taskRunProcess = undefined;
+        })
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bdc515a and 6f1a596.

📒 Files selected for processing (12)
  • apps/supervisor/src/util.ts (1 hunks)
  • packages/cli-v3/src/dev/devSupervisor.ts (6 hunks)
  • packages/cli-v3/src/entryPoints/managed/controller.ts (9 hunks)
  • packages/cli-v3/src/entryPoints/managed/execution.ts (16 hunks)
  • packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (1 hunks)
  • packages/cli-v3/src/executions/taskRunProcess.ts (4 hunks)
  • packages/core/src/v3/lifecycleHooks/manager.ts (1 hunks)
  • packages/core/src/v3/runMetadata/manager.ts (1 hunks)
  • packages/core/src/v3/schemas/build.ts (1 hunks)
  • packages/core/src/v3/timeout/usageTimeoutManager.ts (1 hunks)
  • packages/core/src/v3/usage/prodUsageManager.ts (2 hunks)
  • packages/core/src/v3/workers/populateEnv.ts (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (11)
  • packages/core/src/v3/timeout/usageTimeoutManager.ts
  • packages/core/src/v3/schemas/build.ts
  • packages/core/src/v3/lifecycleHooks/manager.ts
  • packages/core/src/v3/runMetadata/manager.ts
  • apps/supervisor/src/util.ts
  • packages/core/src/v3/workers/populateEnv.ts
  • packages/core/src/v3/usage/prodUsageManager.ts
  • packages/cli-v3/src/executions/taskRunProcess.ts
  • packages/cli-v3/src/dev/devSupervisor.ts
  • packages/cli-v3/src/entryPoints/managed/controller.ts
  • packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
packages/cli-v3/src/entryPoints/managed/execution.ts (2)

118-121: Handle abort failures to avoid unhandled-promise exits

handleProcessAbort() is awaited, but any rejection will bubble up and make kill() itself throw, potentially bringing the whole supervisor down during a shutdown path.

-    if (this.taskRunProcess) {
-      await this.taskRunProcessProvider.handleProcessAbort(this.taskRunProcess);
-    }
+    if (this.taskRunProcess) {
+      const [err] = await tryCatch(
+        this.taskRunProcessProvider.handleProcessAbort(this.taskRunProcess)
+      );
+      if (err) {
+        this.sendDebugLog("kill: failed to abort process", { error: err.message });
+      }
+    }

185-196: canExecute may over-report readiness

Returning true solely because hasPersistentProcess is true ignores whether the process has been prepared with the correct env for the next run.
If a pooled process is still configured for another project / task, the controller could dispatch a run that will inevitably fail later.

Recommend additionally checking that currentTaskRunEnv is set (or that the provider can confirm the env matches the incoming run) before returning true.

@ericallam ericallam force-pushed the process-keep-alive branch from 650090d to 247ca37 Compare June 19, 2025 16:20
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (5)
apps/webapp/app/v3/otlpExporter.server.ts (2)

239-268: Code duplication issue persists - consider refactoring.

The explicit field-by-field property merging pattern is duplicated between logs and spans conversion. This creates maintenance overhead and potential for inconsistencies.


358-386: Type casting issue remains unresolved.

The environmentType property casting to CreatableEventEnvironmentType without validation can hide undefined values, as noted in previous reviews.

packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (2)

211-217: Make cleanup() awaitable for deterministic shutdown

The cleanup() method calls this.cleanupProcess() which returns a promise, but doesn't await or return it. This creates a race condition where callers may assume cleanup is complete when the process might still be terminating.

-  async cleanup() {
+  async cleanup(): Promise<void> {
     if (this.persistentProcess) {
       this.sendDebugLog("cleanup() called");
 
-      await this.cleanupProcess(this.persistentProcess);
+      await this.cleanupProcess(this.persistentProcess);
     }
   }

259-259: Guard against NaN in HEARTBEAT_INTERVAL_MS environment variable

The multiplication by 1000 can produce NaN if TRIGGER_HEARTBEAT_INTERVAL_SECONDS is undefined, which will be passed as a string to the child process.

-      HEARTBEAT_INTERVAL_MS: String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000),
+      ...(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS
+        ? { HEARTBEAT_INTERVAL_MS: String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000) }
+        : {}),
packages/cli-v3/src/entryPoints/managed-run-worker.ts (1)

506-506: AbortSignal.any() compatibility issue with Node.js ≤ 18

AbortSignal.any() is only available in Node.js 20+. This will cause runtime crashes on older Node versions commonly used in serverless environments.

- const signal = AbortSignal.any([_cancelController.signal, timeoutController.signal]);
+ // Feature detection for AbortSignal.any() compatibility
+ const signal = typeof AbortSignal.any === 'function'
+   ? AbortSignal.any([_cancelController.signal, timeoutController.signal])
+   : createManualAbortSignal([_cancelController.signal, timeoutController.signal]);

Add a fallback implementation:

function createManualAbortSignal(signals: AbortSignal[]): AbortSignal {
  const controller = new AbortController();
  
  for (const signal of signals) {
    if (signal.aborted) {
      controller.abort(signal.reason);
      break;
    }
    signal.addEventListener('abort', () => {
      controller.abort(signal.reason);
    }, { once: true });
  }
  
  return controller.signal;
}
🧹 Nitpick comments (4)
packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (3)

42-56: Consider simplifying immediate retry logic

The handleImmediateRetry() method only cleans up processes when keep-alive is disabled, but the method name suggests it should handle immediate retries regardless of keep-alive settings. This could be confusing.

Consider either:

  1. Renaming the method to better reflect its actual behavior
  2. Or handling immediate retry scenarios for both keep-alive enabled and disabled cases
- async handleImmediateRetry(): Promise<void> {
+ async handleImmediateRetryWhenKeepAliveDisabled(): Promise<void> {
   if (!this.processKeepAliveEnabled) {

263-276: Enhance process health checks for robustness

The current health check only verifies the process isn't being killed and has a PID. Consider adding more comprehensive health checks for better reliability.

  private isProcessHealthy(process: TaskRunProcess): boolean {
-   // Basic health check - TaskRunProcess will handle more detailed internal health checks
-   return !process.isBeingKilled && process.pid !== undefined;
+   // Enhanced health check including process responsiveness
+   return !process.isBeingKilled && 
+          process.pid !== undefined && 
+          process.isExecuting && 
+          process.isPreparedForNextAttempt;
  }

146-190: Improve process suspension logic clarity

The suspendProcess method has complex conditional logic that could be simplified for better readability and maintainability.

Consider extracting the process matching logic into a helper method:

+ private async suspendMatchingProcess(process: TaskRunProcess, flush: boolean): Promise<void> {
+   if (this.persistentProcess?.pid === process.pid) {
+     this.persistentProcess = null;
+     this.executionCount = 0;
+   }
+   await process.suspend({ flush });
+ }

  async suspendProcess(flush: boolean, process?: TaskRunProcess): Promise<void> {
    if (process) {
-     // Complex conditional logic here
+     await this.suspendMatchingProcess(process, flush);
    } else if (this.persistentProcess) {
      // Handle persistent process suspension
    }
  }
packages/cli-v3/src/entryPoints/managed-run-worker.ts (1)

454-454: Process title update may be unnecessary in process reuse scenarios

Setting the process title for each task execution might not be optimal when processes are being reused across different tasks, as it could be misleading or cause unnecessary overhead.

Consider whether process title updates are needed in keep-alive scenarios:

- process.title = `trigger-dev-worker: ${execution.task.id} ${execution.run.id}`;
+ // Process title updates may not be necessary with process reuse
+ // process.title = `trigger-dev-worker: ${execution.task.id} ${execution.run.id}`;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 650090d and 247ca37.

⛔ Files ignored due to path filters (3)
  • references/hello-world/src/trigger/example.ts is excluded by !references/**
  • references/hello-world/src/trigger/schedule.ts is excluded by !references/**
  • references/hello-world/trigger.config.ts is excluded by !references/**
📒 Files selected for processing (37)
  • .changeset/polite-badgers-suffer.md (1 hunks)
  • apps/supervisor/src/util.ts (1 hunks)
  • apps/supervisor/src/workloadManager/docker.ts (2 hunks)
  • apps/webapp/app/v3/otlpExporter.server.ts (8 hunks)
  • packages/cli-v3/src/dev/devSupervisor.ts (6 hunks)
  • packages/cli-v3/src/dev/taskRunProcessPool.ts (1 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts (4 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts (14 hunks)
  • packages/cli-v3/src/entryPoints/managed-index-worker.ts (1 hunks)
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts (14 hunks)
  • packages/cli-v3/src/entryPoints/managed/controller.ts (9 hunks)
  • packages/cli-v3/src/entryPoints/managed/execution.ts (21 hunks)
  • packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (1 hunks)
  • packages/cli-v3/src/executions/taskRunProcess.ts (4 hunks)
  • packages/core/src/v3/config.ts (1 hunks)
  • packages/core/src/v3/lifecycleHooks/manager.ts (1 hunks)
  • packages/core/src/v3/locals/manager.ts (1 hunks)
  • packages/core/src/v3/otel/tracingSDK.ts (2 hunks)
  • packages/core/src/v3/runMetadata/manager.ts (1 hunks)
  • packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts (2 hunks)
  • packages/core/src/v3/runtime/sharedRuntimeManager.ts (1 hunks)
  • packages/core/src/v3/schemas/build.ts (1 hunks)
  • packages/core/src/v3/schemas/schemas.ts (1 hunks)
  • packages/core/src/v3/semanticInternalAttributes.ts (1 hunks)
  • packages/core/src/v3/taskContext/index.ts (1 hunks)
  • packages/core/src/v3/taskContext/otelProcessors.ts (3 hunks)
  • packages/core/src/v3/timeout/api.ts (2 hunks)
  • packages/core/src/v3/timeout/types.ts (1 hunks)
  • packages/core/src/v3/timeout/usageTimeoutManager.ts (1 hunks)
  • packages/core/src/v3/usage/api.ts (1 hunks)
  • packages/core/src/v3/usage/devUsageManager.ts (1 hunks)
  • packages/core/src/v3/usage/noopUsageManager.ts (1 hunks)
  • packages/core/src/v3/usage/prodUsageManager.ts (2 hunks)
  • packages/core/src/v3/usage/types.ts (1 hunks)
  • packages/core/src/v3/waitUntil/manager.ts (1 hunks)
  • packages/core/src/v3/workers/populateEnv.ts (3 hunks)
  • packages/core/src/v3/workers/taskExecutor.ts (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (33)
  • packages/core/src/v3/schemas/build.ts
  • packages/core/src/v3/usage/types.ts
  • packages/core/src/v3/semanticInternalAttributes.ts
  • packages/core/src/v3/runtime/sharedRuntimeManager.ts
  • packages/core/src/v3/otel/tracingSDK.ts
  • packages/core/src/v3/locals/manager.ts
  • apps/supervisor/src/workloadManager/docker.ts
  • packages/core/src/v3/timeout/usageTimeoutManager.ts
  • packages/core/src/v3/waitUntil/manager.ts
  • .changeset/polite-badgers-suffer.md
  • packages/core/src/v3/timeout/types.ts
  • packages/core/src/v3/usage/noopUsageManager.ts
  • packages/core/src/v3/usage/devUsageManager.ts
  • packages/core/src/v3/schemas/schemas.ts
  • packages/cli-v3/src/entryPoints/managed-index-worker.ts
  • packages/core/src/v3/usage/api.ts
  • packages/core/src/v3/lifecycleHooks/manager.ts
  • packages/core/src/v3/workers/populateEnv.ts
  • packages/core/src/v3/taskContext/otelProcessors.ts
  • apps/supervisor/src/util.ts
  • packages/core/src/v3/timeout/api.ts
  • packages/core/src/v3/runMetadata/manager.ts
  • packages/core/src/v3/config.ts
  • packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts
  • packages/core/src/v3/usage/prodUsageManager.ts
  • packages/cli-v3/src/executions/taskRunProcess.ts
  • packages/core/src/v3/taskContext/index.ts
  • packages/core/src/v3/workers/taskExecutor.ts
  • packages/cli-v3/src/entryPoints/managed/controller.ts
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts
  • packages/cli-v3/src/entryPoints/managed/execution.ts
  • packages/cli-v3/src/dev/devSupervisor.ts
  • packages/cli-v3/src/dev/taskRunProcessPool.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (13)
apps/webapp/app/v3/otlpExporter.server.ts (5)

186-186: Function rename supports hierarchical attribute extraction.

The rename from extractResourceProperties to extractEventProperties better reflects the function's enhanced capability to extract properties at different levels (resource vs event-specific with prefix).


197-200: Good use of metadata prefix for event-specific properties.

The addition of SemanticInternalAttributes.METADATA as a prefix enables extraction of nested log-specific properties, allowing event-level overrides of resource-level properties.


300-303: Consistent metadata prefix usage for spans.

The span property extraction follows the same pattern as logs, maintaining consistency in how metadata prefixes are handled.


418-494: Well-implemented hierarchical key support.

The enhanced extractEventProperties function correctly handles the optional prefix parameter by passing it to the extraction helpers, enabling nested attribute lookup while maintaining backward compatibility.


716-806: Robust hierarchical key construction with proper type safety.

The attribute extraction helpers are well-enhanced with:

  • Function overloads providing clear type contracts
  • Array support for hierarchical keys with proper filtering of undefined values
  • Consistent dot-joining logic across all extraction functions

The implementation correctly handles both string and array inputs while maintaining existing functionality.

packages/cli-v3/src/entryPoints/dev-run-worker.ts (5)

166-252: LGTM! Bootstrap caching implementation is well-structured

The extraction of bootstrap logic into doBootstrap() with caching in bootstrap() is a clean approach that avoids repeated initialization while maintaining the same interface. The metric measurement wrapper provides good observability.


325-334: Flush synchronization prevents race conditions

The implementation correctly waits for any ongoing flush operations before starting a new task execution. This prevents race conditions between flush operations and new task starts.


584-624: Flush promise lifecycle management is well-implemented

The use of promiseWithResolvers to create and resolve flush promises provides proper synchronization. The promise is created at the start of flushAll() and resolved at the end, ensuring subsequent task runs wait for completion.


392-459: Task import optimization with resource catalog check

The refactored task import logic first checks the resource catalog before importing, which is an optimization for process reuse scenarios. The error handling and logging are comprehensive.


284-303: Let’s retry locating and inspecting StandardResourceCatalog without the incorrect flags:

#!/bin/bash
# 1. Find the file defining StandardResourceCatalog
file=$(rg -l "class StandardResourceCatalog" -g "*.ts")
echo "Definition file: $file"

# 2. Show its class declaration and surrounding lines
echo -e "\n=== Class declaration (+5/-2 lines) ==="
rg -n "class StandardResourceCatalog" "$file" -A5 -B2

# 3. Search within that file for any reset or clear methods
echo -e "\n=== Methods matching reset|clear ==="
rg -nE "reset\s*\(|clear\s*\(" "$file"
packages/cli-v3/src/entryPoints/managed-run-worker.ts (3)

157-241: Bootstrap caching implementation matches dev worker pattern

The bootstrap caching implementation is consistent with the dev worker approach, properly extracting the initialization logic and caching results. The managed-specific configuration (different telemetry structure) is handled correctly.


270-289: Environment reset function is comprehensive

The resetExecutionEnvironment() function properly resets all global managers and state variables. The usage of usage.reset() and timeout.reset() instead of manager-specific resets reflects the managed environment's usage manager pattern.


584-624: Flush synchronization implementation mirrors dev worker

The flush promise management using promiseWithResolvers is implemented consistently with the dev worker, ensuring proper synchronization between task executions.

@ericallam ericallam force-pushed the process-keep-alive branch from b0bbc08 to 847109f Compare June 25, 2025 14:31
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (7)
apps/webapp/app/v3/otlpExporter.server.ts (3)

239-268: Refactor the property merging to reduce duplication and maintenance burden

The explicit property-by-property merging creates a maintenance nightmare with extensive duplication between logs and spans. This pattern is error-prone and makes it easy to miss properties when adding new fields.

Create a helper function to handle the property merging:

+function mergeEventProperties(
+  eventProps: ReturnType<typeof extractEventProperties>,
+  resourceProps: ReturnType<typeof extractEventProperties>
+): Partial<CreatableEvent> {
+  return {
+    metadata: eventProps.metadata ?? resourceProps.metadata,
+    serviceName: eventProps.serviceName ?? resourceProps.serviceName ?? "unknown",
+    serviceNamespace: eventProps.serviceNamespace ?? resourceProps.serviceNamespace ?? "unknown",
+    environmentId: eventProps.environmentId ?? resourceProps.environmentId ?? "unknown",
+    environmentType: eventProps.environmentType ?? resourceProps.environmentType ?? "DEVELOPMENT",
+    organizationId: eventProps.organizationId ?? resourceProps.organizationId ?? "unknown",
+    projectId: eventProps.projectId ?? resourceProps.projectId ?? "unknown",
+    projectRef: eventProps.projectRef ?? resourceProps.projectRef ?? "unknown",
+    runId: eventProps.runId ?? resourceProps.runId ?? "unknown",
+    runIsTest: eventProps.runIsTest ?? resourceProps.runIsTest ?? false,
+    taskSlug: eventProps.taskSlug ?? resourceProps.taskSlug ?? "unknown",
+    taskPath: eventProps.taskPath ?? resourceProps.taskPath ?? "unknown",
+    workerId: eventProps.workerId ?? resourceProps.workerId ?? "unknown",
+    workerVersion: eventProps.workerVersion ?? resourceProps.workerVersion ?? "unknown",
+    queueId: eventProps.queueId ?? resourceProps.queueId ?? "unknown",
+    queueName: eventProps.queueName ?? resourceProps.queueName ?? "unknown",
+    batchId: eventProps.batchId ?? resourceProps.batchId,
+    idempotencyKey: eventProps.idempotencyKey ?? resourceProps.idempotencyKey,
+    machinePreset: eventProps.machinePreset ?? resourceProps.machinePreset,
+    machinePresetCpu: eventProps.machinePresetCpu ?? resourceProps.machinePresetCpu,
+    machinePresetMemory: eventProps.machinePresetMemory ?? resourceProps.machinePresetMemory,
+    machinePresetCentsPerMs: eventProps.machinePresetCentsPerMs ?? resourceProps.machinePresetCentsPerMs,
+  };
+}

        return {
          // ... other properties
-          metadata: logProperties.metadata ?? resourceProperties.metadata,
-          serviceName: logProperties.serviceName ?? resourceProperties.serviceName ?? "unknown",
-          // ... rest of the manual property assignments
+          ...mergeEventProperties(logProperties, resourceProperties),
          attemptId:
            extractStringAttribute(
              log.attributes ?? [],
              [SemanticInternalAttributes.METADATA, SemanticInternalAttributes.ATTEMPT_ID].join(".")
            ) ?? resourceProperties.attemptId,
          // ... continue with attempt-specific properties

358-388: Same property merging duplication issue

This section has the identical property merging pattern as the logs conversion, reinforcing the need for the helper function mentioned in the previous comment.

Apply the same mergeEventProperties helper function here:

        return {
          // ... other properties
-          metadata: spanProperties.metadata ?? resourceProperties.metadata,
-          serviceName: spanProperties.serviceName ?? resourceProperties.serviceName ?? "unknown",
-          // ... rest of the manual property assignments
+          ...mergeEventProperties(spanProperties, resourceProperties),
          attemptId:
            extractStringAttribute(
              span.attributes ?? [],
              [SemanticInternalAttributes.METADATA, SemanticInternalAttributes.ATTEMPT_ID].join(".")
            ) ?? resourceProperties.attemptId,
          // ... continue with attempt-specific properties

418-433: Fix the unsafe type casting for environmentType

The environmentType property is cast to CreatableEventEnvironmentType without ensuring it's a valid enum value, which can hide undefined or invalid values.

-    environmentType: extractStringAttribute(attributes, [
-      prefix,
-      SemanticInternalAttributes.ENVIRONMENT_TYPE,
-    ]) as CreatableEventEnvironmentType,
+    environmentType: (extractStringAttribute(attributes, [
+      prefix,
+      SemanticInternalAttributes.ENVIRONMENT_TYPE,
+    ]) ?? "DEVELOPMENT") as CreatableEventEnvironmentType,

Or better yet, validate the value:

+function validateEnvironmentType(value: string | undefined): CreatableEventEnvironmentType {
+  const validTypes = ["DEVELOPMENT", "STAGING", "PRODUCTION"] as const;
+  return validTypes.includes(value as any) ? (value as CreatableEventEnvironmentType) : "DEVELOPMENT";
+}

-    environmentType: extractStringAttribute(attributes, [
-      prefix,
-      SemanticInternalAttributes.ENVIRONMENT_TYPE,
-    ]) as CreatableEventEnvironmentType,
+    environmentType: validateEnvironmentType(extractStringAttribute(attributes, [
+      prefix,
+      SemanticInternalAttributes.ENVIRONMENT_TYPE,
+    ])),
packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (2)

211-217: cleanup() fires asynchronous work without awaiting

cleanup() calls this.cleanupProcess(...) but returns void, discarding the promise. If callers assume all processes are dead after cleanup(), they might race with the still-running kill.

#!/bin/bash
# Check if cleanup() is called and expected to be synchronous
rg -A 5 '\.cleanup\(\)' --glob '*.ts' | grep -E 'await.*cleanup|\.then'

255-261: Possible NaN / 'undefined' in HEARTBEAT_INTERVAL_MS

TRIGGER_HEARTBEAT_INTERVAL_SECONDS can be empty → Number(undefined)NaN, which is stringified and forwarded to the child.

Guard the value before injecting it:

- HEARTBEAT_INTERVAL_MS: String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000),
+ HEARTBEAT_INTERVAL_MS: this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS 
+   ? String(this.env.TRIGGER_HEARTBEAT_INTERVAL_SECONDS * 1000)
+   : "30000", // Default to 30 seconds
packages/cli-v3/src/entryPoints/managed-run-worker.ts (1)

506-507: AbortSignal.any() may not be available on Node ≤ 18

AbortSignal.any() is only available in Node 20+. This will crash at runtime on earlier versions.

Implement a fallback for Node 18 compatibility:

-const signal = AbortSignal.any([_cancelController.signal, timeoutController.signal]);
+// Create a compatible abort signal that works on Node 18+
+const signal = AbortSignal.any 
+  ? AbortSignal.any([_cancelController.signal, timeoutController.signal])
+  : (() => {
+      const controller = new AbortController();
+      const abort = () => controller.abort();
+      _cancelController.signal.addEventListener('abort', abort);
+      timeoutController.signal.addEventListener('abort', abort);
+      return controller.signal;
+    })();
packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)

610-626: version sent to pool may differ on return

getProcess receives version: this.opts.worker.serverWorker?.version (possibly undefined), but returnProcess uses "unknown" when undefined. This mismatch can leak processes in the pool.

Use consistent version normalization:

+private normalizeVersion(): string {
+  return this.opts.worker.serverWorker?.version || "unknown";
+}

 // In executeRun:
-version: this.opts.worker.serverWorker?.version,
+version: this.normalizeVersion(),

 // In all returnProcess calls:
-const version = this.opts.worker.serverWorker?.version || "unknown";
+const version = this.normalizeVersion();

Also applies to: 656-659

🧹 Nitpick comments (2)
packages/cli-v3/src/dev/taskRunProcessPool.ts (2)

129-134: Consider cleaning up execution counts for killed processes

The execution count map will grow indefinitely as it's never cleaned up when processes are killed.

Clean up the execution count when killing a process:

 private async killProcess(process: TaskRunProcess): Promise<void> {
   try {
     await process.cleanup(true);
+    if (process.pid) {
+      this.executionCountsPerProcess.delete(process.pid);
+    }
   } catch (error) {
     logger.debug("[TaskRunProcessPool] Error killing process", { error });
   }
 }

179-207: Comprehensive reuse eligibility checks

The method thoroughly evaluates all relevant factors for process reuse eligibility.

Consider extracting the magic number for readability:

+private readonly DEFAULT_MAX_EXECUTIONS = 50;
+
 constructor(options: TaskRunProcessPoolOptions) {
   this.options = options;
   this.maxPoolSize = options.maxPoolSize ?? 3;
-  this.maxExecutionsPerProcess = options.maxExecutionsPerProcess ?? 50;
+  this.maxExecutionsPerProcess = options.maxExecutionsPerProcess ?? this.DEFAULT_MAX_EXECUTIONS;
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b0bbc08 and 847109f.

⛔ Files ignored due to path filters (3)
  • references/hello-world/src/trigger/example.ts is excluded by !references/**
  • references/hello-world/src/trigger/schedule.ts is excluded by !references/**
  • references/hello-world/trigger.config.ts is excluded by !references/**
📒 Files selected for processing (38)
  • .changeset/polite-badgers-suffer.md (1 hunks)
  • apps/supervisor/src/util.ts (1 hunks)
  • apps/supervisor/src/workloadManager/docker.ts (2 hunks)
  • apps/webapp/app/utils/timelineSpanEvents.ts (0 hunks)
  • apps/webapp/app/v3/otlpExporter.server.ts (8 hunks)
  • packages/cli-v3/src/dev/devSupervisor.ts (6 hunks)
  • packages/cli-v3/src/dev/taskRunProcessPool.ts (1 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-controller.ts (4 hunks)
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts (14 hunks)
  • packages/cli-v3/src/entryPoints/managed-index-worker.ts (1 hunks)
  • packages/cli-v3/src/entryPoints/managed-run-worker.ts (14 hunks)
  • packages/cli-v3/src/entryPoints/managed/controller.ts (9 hunks)
  • packages/cli-v3/src/entryPoints/managed/execution.ts (21 hunks)
  • packages/cli-v3/src/entryPoints/managed/taskRunProcessProvider.ts (1 hunks)
  • packages/cli-v3/src/executions/taskRunProcess.ts (4 hunks)
  • packages/core/src/v3/config.ts (1 hunks)
  • packages/core/src/v3/lifecycleHooks/manager.ts (1 hunks)
  • packages/core/src/v3/locals/manager.ts (1 hunks)
  • packages/core/src/v3/otel/tracingSDK.ts (2 hunks)
  • packages/core/src/v3/runMetadata/manager.ts (1 hunks)
  • packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts (2 hunks)
  • packages/core/src/v3/runtime/sharedRuntimeManager.ts (1 hunks)
  • packages/core/src/v3/schemas/build.ts (1 hunks)
  • packages/core/src/v3/schemas/schemas.ts (1 hunks)
  • packages/core/src/v3/semanticInternalAttributes.ts (1 hunks)
  • packages/core/src/v3/taskContext/index.ts (1 hunks)
  • packages/core/src/v3/taskContext/otelProcessors.ts (3 hunks)
  • packages/core/src/v3/timeout/api.ts (2 hunks)
  • packages/core/src/v3/timeout/types.ts (1 hunks)
  • packages/core/src/v3/timeout/usageTimeoutManager.ts (1 hunks)
  • packages/core/src/v3/usage/api.ts (1 hunks)
  • packages/core/src/v3/usage/devUsageManager.ts (1 hunks)
  • packages/core/src/v3/usage/noopUsageManager.ts (1 hunks)
  • packages/core/src/v3/usage/prodUsageManager.ts (2 hunks)
  • packages/core/src/v3/usage/types.ts (1 hunks)
  • packages/core/src/v3/waitUntil/manager.ts (1 hunks)
  • packages/core/src/v3/workers/populateEnv.ts (3 hunks)
  • packages/core/src/v3/workers/taskExecutor.ts (5 hunks)
💤 Files with no reviewable changes (1)
  • apps/webapp/app/utils/timelineSpanEvents.ts
🚧 Files skipped from review as they are similar to previous changes (32)
  • .changeset/polite-badgers-suffer.md
  • packages/core/src/v3/schemas/build.ts
  • apps/supervisor/src/workloadManager/docker.ts
  • packages/core/src/v3/usage/types.ts
  • packages/core/src/v3/locals/manager.ts
  • packages/core/src/v3/usage/api.ts
  • packages/core/src/v3/timeout/types.ts
  • packages/core/src/v3/waitUntil/manager.ts
  • packages/core/src/v3/usage/devUsageManager.ts
  • packages/core/src/v3/usage/noopUsageManager.ts
  • packages/core/src/v3/semanticInternalAttributes.ts
  • packages/core/src/v3/runMetadata/manager.ts
  • packages/cli-v3/src/entryPoints/managed-index-worker.ts
  • packages/core/src/v3/runtime/sharedRuntimeManager.ts
  • packages/core/src/v3/timeout/api.ts
  • packages/core/src/v3/timeout/usageTimeoutManager.ts
  • packages/core/src/v3/schemas/schemas.ts
  • packages/core/src/v3/lifecycleHooks/manager.ts
  • packages/core/src/v3/taskContext/otelProcessors.ts
  • packages/core/src/v3/workers/populateEnv.ts
  • packages/core/src/v3/otel/tracingSDK.ts
  • apps/supervisor/src/util.ts
  • packages/core/src/v3/config.ts
  • packages/core/src/v3/usage/prodUsageManager.ts
  • packages/core/src/v3/runTimelineMetrics/runTimelineMetricsManager.ts
  • packages/core/src/v3/taskContext/index.ts
  • packages/cli-v3/src/executions/taskRunProcess.ts
  • packages/core/src/v3/workers/taskExecutor.ts
  • packages/cli-v3/src/dev/devSupervisor.ts
  • packages/cli-v3/src/entryPoints/managed/controller.ts
  • packages/cli-v3/src/entryPoints/dev-run-worker.ts
  • packages/cli-v3/src/entryPoints/managed/execution.ts
🧰 Additional context used
📓 Path-based instructions (2)
`apps/webapp/app/v3/otlpExporter.server.ts`: This file contains code for TaskEvent data (otel data sent from tasks to servers).

apps/webapp/app/v3/otlpExporter.server.ts: This file contains code for TaskEvent data (otel data sent from tasks to servers).

📄 Source: CodeRabbit Inference Engine (.cursor/rules/webapp.mdc)

List of files the instruction was applied to:

  • apps/webapp/app/v3/otlpExporter.server.ts
`apps/webapp/**/*`: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20. Use zod extensively in the webapp.

apps/webapp/**/*: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20.
Use zod extensively in the webapp.

📄 Source: CodeRabbit Inference Engine (.github/copilot-instructions.md)

List of files the instruction was applied to:

  • apps/webapp/app/v3/otlpExporter.server.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (13)
apps/webapp/app/v3/otlpExporter.server.ts (8)

186-186: LGTM: Function rename and parameter addition

The rename from extractResourceProperties to extractEventProperties better reflects the function's purpose, and the optional prefix parameter enables hierarchical attribute extraction.


197-200: LGTM: Consistent prefix usage for nested metadata extraction

The addition of the SemanticInternalAttributes.METADATA prefix for log properties extraction is consistent with the span implementation.


300-303: LGTM: Consistent prefix usage for nested metadata extraction

The span properties extraction correctly uses the same prefix pattern as the log implementation.


434-490: LGTM: Prefix support for nested attribute extraction

The implementation correctly uses the optional prefix parameter to build hierarchical attribute keys. The logic handles both single string keys and array-based nested keys appropriately.


714-733: LGTM: Enhanced key handling with array support

The function overloads and array key handling logic are well-implemented. The filter(Boolean).join(".") approach correctly handles undefined values in the array and creates proper dot-notation keys.


737-756: LGTM: Consistent implementation across attribute extractors

The extractNumberAttribute function follows the same pattern as extractStringAttribute for array key handling.


760-779: LGTM: Consistent implementation continues

The extractDoubleAttribute function maintains consistency with the other extractor functions.


783-802: LGTM: Complete implementation of array key support

The extractBooleanAttribute function completes the consistent implementation of array key support across all attribute extractors.

packages/cli-v3/src/entryPoints/managed-run-worker.ts (3)

157-259: Excellent refactoring of bootstrap logic

The extraction of doBootstrap() with metric tracking and caching in bootstrap() is a clean implementation that will significantly improve warm start performance.


316-324: Well-implemented flush synchronization

The addition of _lastFlushPromise tracking and awaiting ensures proper synchronization between task executions, preventing potential data races during flush operations.

Also applies to: 585-624


270-289: Comprehensive environment reset implementation

The resetExecutionEnvironment() function properly resets all global state, which is crucial for safe process reuse. Good job ensuring all managers have reset methods.

Verify that all global managers have reset() methods implemented:

#!/bin/bash
# Check if all managers referenced in resetExecutionEnvironment have reset methods
for manager in "StandardLocalsManager" "StandardLifecycleHooksManager" "StandardRunTimelineMetricsManager" "StandardMetadataManager" "StandardWaitUntilManager" "SharedRuntimeManager" "DurableClock"; do
  echo "Checking $manager for reset method:"
  ast-grep --pattern "class $manager {
    $$$
    reset($$$) {
      $$$
    }
    $$$
  }"
done
packages/cli-v3/src/entryPoints/dev-run-controller.ts (1)

656-664: Good defensive error handling for pool operations

The consistent pattern of catching and logging errors when returning processes to the pool ensures that pool management failures don't interrupt the main execution flow.

Also applies to: 776-785, 814-821

packages/cli-v3/src/dev/taskRunProcessPool.ts (1)

46-120: Well-structured process acquisition logic

The getProcess method correctly handles both reuse and new process creation scenarios with proper state management between available and busy pools.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
apps/webapp/test/timelineSpanEvents.test.ts (1)

26-33: Consider updating test data comments for clarity.

While the test data correctly preserves the original structure with specific file names (which is good for testing the transformation), consider adding comments to clarify that the specific file information in the test data gets transformed to generic event names.

This would help future maintainers understand why the test data contains specific file names but the assertions expect generic names.

 {
   name: "trigger.dev/start",
   time: new Date("2025-04-04T08:39:27.224Z"),
   properties: {
+    // File name in properties gets transformed to generic "Importing task file" event name
     file: "src/trigger/chat.ts",
     event: "import",
     duration: 67,
     entryPoint:
       "/Users/eric/code/triggerdotdev/trigger.dev/references/d3-chat/.trigger/tmp/build-AL7zTl/references/d3-chat/src/trigger/chat.mjs",
   },
 },

Also applies to: 51-58

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 847109f and 352b046.

📒 Files selected for processing (1)
  • apps/webapp/test/timelineSpanEvents.test.ts (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`apps/webapp/**/*`: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20. Use zod extensively in the webapp.

apps/webapp/**/*: The webapp at apps/webapp is a Remix 2.1 app that uses Node.js v20.
Use zod extensively in the webapp.

📄 Source: CodeRabbit Inference Engine (.github/copilot-instructions.md)

List of files the instruction was applied to:

  • apps/webapp/test/timelineSpanEvents.test.ts
⏰ Context from checks skipped due to timeout of 90000ms (25)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (10, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (9, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 10)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 10)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
apps/webapp/test/timelineSpanEvents.test.ts (5)

79-79: LGTM: Consistent test assertion update.

The test assertion correctly expects the generic "Importing task file" event name, aligning with the updated event naming strategy.


89-89: LGTM: Event ordering test updated correctly.

The event ordering test properly expects "Importing task file" as the 4th event in the sorted timeline.


179-181: LGTM: Help text verification updated consistently.

The test correctly verifies that the generic "Importing task file" event has the appropriate help text "A task file was imported".


190-190: LGTM: Duration preservation test updated correctly.

The test properly verifies that duration is preserved for the generic "Importing task file" event.


217-217: LGTM: Non-admin visibility test updated consistently.

The test correctly verifies that non-admin users can see the generic "Importing task file" event when no fork event exists.

Comment on lines +104 to +123
const enableProcessReuse =
typeof this.options.config.experimental_processKeepAlive === "boolean"
? this.options.config.experimental_processKeepAlive
: false;

if (enableProcessReuse) {
logger.debug("[DevSupervisor] Enabling process reuse", {
enableProcessReuse,
});
}

this.taskRunProcessPool = new TaskRunProcessPool({
env,
cwd: this.options.config.workingDir,
enableProcessReuse:
typeof this.options.config.experimental_processKeepAlive === "boolean"
? this.options.config.experimental_processKeepAlive
: typeof this.options.config.experimental_processKeepAlive === "object"
? this.options.config.experimental_processKeepAlive.enabled
: false,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enableProcessReuse var and prop differ

Comment on lines -114 to +120
await this.taskRunProcess?.kill("SIGKILL");
if (this.taskRunProcess) {
await this.taskRunProcessProvider.handleProcessAbort(this.taskRunProcess);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to check that this does the same thing as before. handleProcessAbort calls process.cleanup instead of process.kill

@@ -249,7 +232,11 @@ export class RunExecution {
if (this.currentAttemptNumber && this.currentAttemptNumber !== run.attemptNumber) {
this.sendDebugLog("error: attempt number mismatch", snapshotMetadata);
// This is a rogue execution, a new one will already have been created elsewhere
await this.exitTaskRunProcessWithoutFailingRun({ flush: false });
// TODO: keep this one, kill the process even if it's a keep-alive one
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo?

@@ -263,7 +250,11 @@ export class RunExecution {
if (deprecated) {
this.sendDebugLog("run execution is deprecated", { incomingSnapshot: snapshot });

await this.exitTaskRunProcessWithoutFailingRun({ flush: false });
// TODO: keep this one, kill the process even if it's a keep-alive one
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo?

Comment on lines 283 to 288
case "FINISHED": {
this.sendDebugLog("run is finished", snapshotMetadata);

await this.exitTaskRunProcessWithoutFailingRun({ flush: true });
// This can sometimes be called before the handleCompletionResult, so we don't need to do anything here
return;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need to exit here after all. Maybe not immediately, but as a backup. It could help catch race conditions where the poller detects a run is finished, even if handleCompletionResult is never called.

@@ -150,6 +148,7 @@ export class TaskRunProcess {
PATH: process.env.PATH,
TRIGGER_PROCESS_FORK_START_TIME: String(Date.now()),
TRIGGER_WARM_START: this.options.isWarmStart ? "true" : "false",
TRIGGERDOTDEV: "1",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants