[rush] Should Rush CLI tracing start earlier? #5791

@kevin-y-ang

Description

Summary

While investigating Rush performance telemetry, I noticed that the tracing infrastructure captures timing data starting after the CLI bootstrap phase. I'd like to understand if this is intentional and discuss whether capturing earlier timing would be valuable.

This is a discussion/question rather than a bug report.

Current Tracing Architecture

Entry Points

Rush has two distinct entry points that are often confused:

| File | Purpose | When Used |
|---|---|---|
| `apps/rush/src/start.ts` | CLI frontend | Normal `rush` / `rushx` / `rush-pnpm` commands |
| `libraries/rush-lib/src/start.ts` | Direct library usage | Programmatic API consumers (`isManaged: false`) |

Where Tracing Currently Starts

The existing performance.mark('rush:start') is located at:

libraries/rush-lib/src/start.ts:6

This file is on the direct library usage path, not the CLI path. The CLI path is:

```
apps/rush/bin/rush                           # Line 2: require('../lib-commonjs/start.js')
  → apps/rush/src/start.ts                   # 105 lines of bootstrap code
  → RushCommandSelector.execute()            # apps/rush/src/RushCommandSelector.ts:28
  → Rush.launch()                            # libraries/rush-lib/src/api/Rush.ts:79
  → measureAsyncFn('rush:parser:executeAsync', ...)  # Line 99 - first traced span
```

The first performance measurement on the CLI path is rush:parser:executeAsync at libraries/rush-lib/src/api/Rush.ts:99.
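For context on what such a call site records, a `measureFn`-style helper is typically a thin wrapper over `performance.mark` / `performance.measure`. The sketch below is hypothetical (the real utilities live in `libraries/rush-lib/src/utilities/performance.ts` and may differ in signature and error handling); the async variant would be the same idea with `await`:

```typescript
import { performance } from 'node:perf_hooks';

// Hypothetical sketch of a measureFn-style helper: run a function and
// record a PerformanceMeasure spanning its execution, even on throw.
function measureFn<T>(name: string, fn: () => T): T {
  const startMark = `${name}:start`;
  performance.mark(startMark);
  try {
    return fn();
  } finally {
    // Creates a measure from the start mark to "now".
    performance.measure(name, startMark);
  }
}

// The measure covers only the wrapped call, not any bootstrap work
// that ran before it — which is the crux of this issue.
const result = measureFn('rush:demo:parse', () => 21 * 2);
const [span] = performance.getEntriesByName('rush:demo:parse');
```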

Existing Performance Instrumentation

For reference, here are the existing measureAsyncFn / measureFn calls in rush-lib:

| Location | Measurement Name | What It Measures |
|---|---|---|
| `Rush.ts:99` | `rush:parser:executeAsync` | Parser execution (first span) |
| `RushCommandLineParser.ts:242` | `rush:initializeUnassociatedPlugins` | Plugin init |
| `RushCommandLineParser.ts:323` | `rush:commandLineParser:onExecuteAsync` | Command execution |
| `BaseRushAction.ts:81` | `rush:<action>:runAsync` | Action body |
| `BaseRushAction.ts:127-134` | `rush:<action>:initializePluginsAsync` | Per-action plugin init |
| `PhasedScriptAction.ts:354+` | `rush:phased:*` | ~20 internal spans |
| `BaseInstallAction.ts:289-333` | `rush:installManager:*` | Install operations |
| `GlobalScriptAction.ts:187` | `rush:globalScriptAction:prepareAutoinstaller` | Autoinstaller prep |
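Each of these spans ends up as a `PerformanceEntry` that a collector can harvest for the telemetry payload. A rough sketch of how such collection can work (the function name and output shape here are illustrative, not the actual `collectPerformanceEntries()` contract):

```typescript
import { performance } from 'node:perf_hooks';

// Record a span the way the instrumented call sites do.
performance.mark('rush:demo:start');
performance.measure('rush:demo:span', 'rush:demo:start');

// Hypothetical collector: snapshot all 'measure' entries into plain
// objects suitable for serialization into a telemetry payload.
function collectMeasures(): Array<{ name: string; startTime: number; duration: number }> {
  return performance.getEntriesByType('measure').map((e) => ({
    name: e.name,
    startTime: e.startTime, // ms since performance.timeOrigin (process start)
    duration: e.duration
  }));
}

const entries = collectMeasures();
```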

Observations

Bootstrap Code Outside Current Tracing

The following code executes before the first performance measurement:

| Location | Lines | Work Performed |
|---|---|---|
| `apps/rush/src/start.ts:9-15` | 7 | Node.js version compatibility check |
| `apps/rush/src/start.ts:17-20` | 4 | Node "too new" warning check |
| `apps/rush/src/start.ts:22-34` | 13 | Import 8 modules (semver, node-core-library, terminal, rush-lib types) |
| `apps/rush/src/start.ts:37-38` | 2 | `MinimalRushConfiguration.loadFromDefaultLocation()` - finds and parses `rush.json` |
| `apps/rush/src/start.ts:40` | 1 | Load own `package.json` |
| `apps/rush/src/start.ts:44-84` | 40 | Preview version handling, version selection logic |
| `apps/rush/src/start.ts:89-91` | 3 | Create terminal provider, build launch options |
| `apps/rush/src/start.ts:95-104` | 10 | Version selector or direct launch |
| `Rush.ts:80-96` | 16 | Banner, migration advisor, parser construction |
| **Total** | ~96 | |

Additionally, RushCommandLineParser constructor (RushCommandLineParser.ts:99-180) performs work before executeAsync() is called:

  • Lines 134-144: Find rush.json, load .env, load the full RushConfiguration
  • Lines 149-153: Node.js compatibility checks
  • Line 155: Create RushGlobalFolder
  • Lines 157-168: Create RushSession, PluginManager
  • Lines 170-171: Read plugin command-line configurations
  • Line 180: Populate all actions

Timing Comparison

For a hypothetical rush build command with 50ms of actual build work:

| Metric | What It Measures | Example Value |
|---|---|---|
| `durationInSeconds` (telemetry) | Stopwatch started inside the action | 0.05s |
| `rush:parser:executeAsync` (perf entry) | From the `parser.executeAsync()` call | 0.08s |
| Wall-clock time | From bin stub to exit | 0.35s |

The difference (~270ms in this example) represents startup overhead from module loading, config parsing, and plugin initialization.
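The arithmetic behind that estimate, spelled out with the hypothetical values from the table (these are example figures, not measurements):

```typescript
// Example values from the table above (seconds).
const actionOnlySeconds = 0.05; // durationInSeconds (telemetry stopwatch)
const tracedSpanSeconds = 0.08; // rush:parser:executeAsync
const wallClockSeconds = 0.35;  // bin stub to exit

// Startup overhead invisible to the current first traced span:
const overheadMs = Math.round((wallClockSeconds - tracedSpanSeconds) * 1000);
```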

Existing Mark Location

There is a performance.mark('rush:start') at libraries/rush-lib/src/start.ts:6, added in commit 25ecb0bc2f (#5269, July 2025). This mark fires when using rush-lib as a direct module import (the isManaged: false path), which is separate from the CLI path. I'm curious whether this placement was intentional for API consumers, or if it was meant to cover the CLI path as well.

Possible Approaches (If We Want Earlier Tracing)

Option 1: Add mark to CLI start.ts (1 line change)

Add a performance mark at the top of apps/rush/src/start.ts:

```ts
// apps/rush/src/start.ts - line 3 (after copyright)
performance.mark('rush:cli:start');
```

Pros:

  • Minimal change (1 line)
  • Captures most of the bootstrap phase
  • Mark's startTime property shows ms since Node process start
  • Existing collectPerformanceEntries() automatically captures it

Cons:

  • Misses bin stub + initial require (~5ms typically)
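The `startTime` point above is worth emphasizing: in Node.js, `performance.timeOrigin` is the process start, so a mark's `startTime` is by itself an end-to-end bootstrap measurement with no explicit "start" needed. A small self-contained check (the mark name mirrors the proposal above):

```typescript
import { performance } from 'node:perf_hooks';

// performance.mark() returns the PerformanceMark (Node.js >= 16), and its
// startTime is milliseconds since performance.timeOrigin, i.e. since the
// Node.js process started.
const mark = performance.mark('rush:cli:start');

// Everything that ran before the mark (module loading, config parsing)
// is therefore already included in startTime.
const bootstrapMs = mark.startTime;
```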

Option 2: Add mark to bin stubs (3-4 line change)

Add marks at the earliest possible point:

```js
// apps/rush/bin/rush (and rushx, rush-pnpm)
#!/usr/bin/env node
performance.mark('rush:launcher:start');
require('../lib-commonjs/start.js');
```

Optionally add a measure in rush-lib to compute the span:

```ts
// libraries/rush-lib/src/api/Rush.ts - inside Rush.launch()
if (performance.getEntriesByName('rush:launcher:start').length > 0) {
  performance.measure('rush:launcher', 'rush:launcher:start');
}
```
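The `length` guard matters because `performance.measure()` throws when the named start mark does not exist, which would be the case on the direct library path where the launcher stub never ran. A quick sketch of the guarded pattern (mark names are illustrative):

```typescript
import { performance } from 'node:perf_hooks';

// Guarded measure as in the snippet above: only record the span when the
// start mark exists, since performance.measure() throws on a missing mark.
function safeMeasure(measureName: string, startMark: string): boolean {
  if (performance.getEntriesByName(startMark).length === 0) {
    return false; // mark never set on this code path
  }
  performance.measure(measureName, startMark);
  return true;
}

// Without the mark, nothing is recorded; after marking, the span is created.
const withoutMark = safeMeasure('demo:launcher', 'demo:launcher:start');
performance.mark('demo:launcher:start');
const withMark = safeMeasure('demo:launcher', 'demo:launcher:start');
```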

Pros:

  • Captures timing from the absolute earliest JS execution point
  • Covers all three entry points (rush, rushx, rush-pnpm)

Cons:

  • 3-4 lines across multiple files
  • Bin stubs are simple by design; adding logic may be undesirable

Option 3: Environment variable timestamp (9 lines)

Set timestamp in bin stub, read in telemetry:

```ts
// bin stubs
process.env.RUSH_LAUNCH_TIMESTAMP = String(Date.now());

// In telemetry consumers
const launchTimestamp = process.env.RUSH_LAUNCH_TIMESTAMP
  ? Number(process.env.RUSH_LAUNCH_TIMESTAMP)
  : undefined;
extraData.totalDurationMs = launchTimestamp
  ? Date.now() - launchTimestamp
  : undefined;
```

Pros:

  • Survives child process spawning (useful for version selector path)
  • Visible in debugging (env | grep RUSH)
  • Consistent with existing RUSH_* environment variables

Cons:

  • More lines changed
  • String serialization overhead (negligible)
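The "survives child process spawning" point is the key differentiator: `process.env` is copied into spawned children by default, whereas in-process performance entries are not. A quick demonstration of that inheritance, using the variable name from the proposal above and a trivial `node -e` script standing in for a re-launched Rush version:

```typescript
import { spawnSync } from 'node:child_process';

// Set the timestamp as the bin stub would.
process.env.RUSH_LAUNCH_TIMESTAMP = String(Date.now());

// A spawned child inherits the parent's environment by default,
// so the timestamp crosses the process boundary with no extra plumbing.
const child = spawnSync(process.execPath, [
  '-e',
  'console.log(process.env.RUSH_LAUNCH_TIMESTAMP)'
]);
const inherited = Number(child.stdout.toString().trim());
```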

Option 4: Wrap entire CLI in measurement

Create a new traced entry point that wraps the existing start.ts:

```ts
// apps/rush/src/start-traced.ts
performance.mark('rush:cli:start');
import './start.js';
```

Pros:

  • Clean separation
  • Doesn't modify existing start.ts

Cons:

  • Adds a new file
  • Relies on the mark executing before the import so that start.ts's own module load time is captured; this holds for the CommonJS output Rush ships, but native ESM hoists imports above preceding statements, which would reorder them

Questions

  1. Is the current tracing scope intentional? Perhaps measuring from parser.executeAsync() is the intended design - focusing on command execution rather than CLI bootstrap. I'd like to understand the original intent.

  2. Is there value in capturing startup overhead? For users investigating "why does rush build feel slow?", knowing the bootstrap cost could be helpful. But maybe that's out of scope for the telemetry system.

  3. Should bin stubs remain logic-free? The current 2-line stubs are intentionally minimal. If earlier tracing is desirable, is adding a performance mark acceptable there?

  4. What about the version selector path? When apps/rush/src/RushVersionSelector.ts downloads and launches a different Rush version, the timing story gets more complex. Would we want to trace across that boundary?

  5. Separate metrics or combined? If we add earlier tracing:

    • Keep durationInSeconds as action-only time, add totalDurationInSeconds separately
    • Or add both and let consumers choose what matters to them

Details

Related Code References

  • Bin stubs: apps/rush/bin/rush, apps/rush/bin/rushx, apps/rush/bin/rush-pnpm
  • CLI entry: apps/rush/src/start.ts
  • Library entry: libraries/rush-lib/src/start.ts
  • Rush.launch(): libraries/rush-lib/src/api/Rush.ts:79-100
  • Parser constructor: libraries/rush-lib/src/cli/RushCommandLineParser.ts:99-180
  • Performance utilities: libraries/rush-lib/src/utilities/performance.ts
  • Telemetry collection: libraries/rush-lib/src/logic/Telemetry.ts:172-173
  • Original commit: 25ecb0bc2f - [rush] Enhance performance data in telemetry payload (#5269)

Standard questions

| Question | Answer |
|---|---|
| `@microsoft/rush` globally installed version? | N/A (source investigation) |
| `rushVersion` from `rush.json`? | 5.156.0 |
| Operating system? | macOS |
| Would you consider contributing a PR? | Yes |
| Node.js version (`node -v`)? | v22.14.0 |
