Threat-detection job omits Setup Node.js but launches copilot_driver.cjs via node → 'node: command not found' #28143

@romainh-betclic

Summary

Affected engine: copilot

The gh-aw workflow compiler emits Setup Node.js in the main agent job (via DetectRuntimeRequirements in compiler_yaml_main_job.go:147) but not in the separately compiled detection job, even though that job unconditionally wraps the Copilot CLI with node ${RUNNER_TEMP}/gh-aw/actions/copilot_driver.cjs (copilot_engine_execution.go:186-200). On any runner without Node preinstalled on PATH, the detection job dies with node: command not found and the downstream parse step fails with No THREAT_DETECTION_RESULT found. The main agent job in the same run succeeds. Claude and Codex detection engines are unaffected because their GetInstallationSteps calls already bundle Setup Node.js via BuildStandardNpmEngineInstallSteps(..., includeNodeSetup=true, ...). Copilot relies on the shared runtime-detection pipeline for Node setup — a pipeline the detection job never goes through.

Affected Area

Detection job compiler (pkg/workflow/threat_detection.go, buildDetectionEngineExecutionStep). The main job calls DetectRuntimeRequirements which invokes requiresNodeForEngineDriver and emits Setup Node.js for any engine implementing DriverProvider with a non-empty GetDriverScriptName(). The detection job calls engine.GetInstallationSteps() directly and never reaches that pipeline. Copilot's install steps do not bundle Node setup; its driver invocation always requires node on PATH.

Reproduction Outline

  1. Compile any gh-aw workflow with copilot and at least one safe-outputs type configured so a detection job is generated.
  2. Run the compiled workflow on a runner without ambient Node (e.g. a stripped Ubuntu image, or any self-hosted runner that does not preinstall Node globally).
  3. Inspect the generated .github/workflows/*.lock.yml: confirm the main agent job has a Setup Node.js step and the detection job has none.
  4. Observe the detection job fail at the Execute GitHub Copilot CLI step with /bin/bash: line 1: node: command not found.
  5. Observe the subsequent Parse and conclude threat detection step fail with No lines containing THREAT_DETECTION_RESULT found / No THREAT_DETECTION_RESULT found in detection log.
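
Step 3 of the outline can be checked mechanically. The following is a minimal sketch, not part of gh-aw: jobsWithStep is a hypothetical helper that scans lock-file text line by line and assumes two-space YAML indentation and unquoted step names. The sample fragment, including the job names and the setup-node version, is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// jobsWithStep reports, for each job under the top-level "jobs:" key, whether
// the job contains a step with the given name. Heuristic only: it assumes
// two-space indentation and unquoted "- name:" values.
func jobsWithStep(lockYAML, stepName string) map[string]bool {
	result := map[string]bool{}
	inJobs := false
	current := ""
	for _, line := range strings.Split(lockYAML, "\n") {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" || strings.HasPrefix(trimmed, "#") {
			continue
		}
		indent := len(line) - len(strings.TrimLeft(line, " "))
		switch {
		case indent == 0:
			// Any top-level key either opens or closes the jobs section.
			inJobs = trimmed == "jobs:"
			current = ""
		case inJobs && indent == 2 && strings.HasSuffix(trimmed, ":"):
			// A key directly under "jobs:" starts a new job.
			current = strings.TrimSuffix(trimmed, ":")
			result[current] = false
		case inJobs && current != "" && trimmed == "- name: "+stepName:
			result[current] = true
		}
	}
	return result
}

// sampleLock is a made-up fragment for illustration, not real compiler output.
const sampleLock = `name: example
jobs:
  agent:
    steps:
      - name: Setup Node.js
        uses: actions/setup-node@v4
      - name: Execute GitHub Copilot CLI
        run: echo agent
  detection:
    steps:
      - name: Execute GitHub Copilot CLI
        run: echo detection
`

func main() {
	for job, ok := range jobsWithStep(sampleLock, "Setup Node.js") {
		fmt.Printf("%s: has Setup Node.js = %v\n", job, ok)
	}
}
```

A real lock file would be read from disk and passed in place of sampleLock; a proper YAML parser would be more robust than this line scan.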

Observed Behavior

The generated detection job contains no Setup Node.js step in its step list, but its Execute GitHub Copilot CLI step still discovers and invokes node to run copilot_driver.cjs:

GH_AW_NODE_BIN=$(command -v node 2>/dev/null || true)
...
"$GH_AW_NODE_EXEC" ${RUNNER_TEMP}/gh-aw/actions/copilot_driver.cjs ...

On a runner without ambient Node this produces:

[entrypoint] Using host PATH for chroot
...
/bin/bash: line 1: node: command not found
[WARN] Command completed with exit code: 127

Downstream parse step:

No lines containing THREAT_DETECTION_RESULT found in 117 lines
Failed to parse detection result: No THREAT_DETECTION_RESULT found in detection log.
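
The cascade from exit 127 to the parse failure can be illustrated with a small sketch. This is not the real parse step: findDetectionResult is a hypothetical stand-in that only assumes the parser looks for lines containing the THREAT_DETECTION_RESULT marker, and the payload in okLog is invented because the real result format is not shown in this issue.

```go
package main

import (
	"fmt"
	"strings"
)

// resultMarker is the marker named in the failure message above.
const resultMarker = "THREAT_DETECTION_RESULT"

// findDetectionResult returns the first log line containing the marker.
// Hypothetical stand-in for the real parse step.
func findDetectionResult(text string) (string, error) {
	lines := strings.Split(strings.TrimRight(text, "\n"), "\n")
	for _, line := range lines {
		if strings.Contains(line, resultMarker) {
			return line, nil
		}
	}
	return "", fmt.Errorf("no lines containing %s found in %d lines", resultMarker, len(lines))
}

// failedLog mirrors the log excerpt above: the driver never starts, so no
// result line is ever written.
const failedLog = `[entrypoint] Using host PATH for chroot
/bin/bash: line 1: node: command not found
[WARN] Command completed with exit code: 127
`

// okLog ends with a made-up result line; the payload is a placeholder.
const okLog = `[entrypoint] Using host PATH for chroot
THREAT_DETECTION_RESULT:{"placeholder":true}
`

func main() {
	for _, text := range []string{failedLog, okLog} {
		line, err := findDetectionResult(text)
		fmt.Printf("line=%q err=%v\n", line, err)
	}
}
```

The point of the sketch is that the parse error is a symptom: the only actionable failure is the missing node binary in the step before it.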

Expected Behavior

If the generated detection job is going to invoke copilot_driver.cjs via node, the job must guarantee Node is available first — by emitting the same Setup Node.js bootstrap the main agent job receives. The generated workflow should not rely on ambient runner Node availability for detection when the main job already requires explicit Node setup.

Additional Context

Root cause: buildDetectionEngineExecutionStep calls engine.GetInstallationSteps(threatDetectionData) directly rather than going through DetectRuntimeRequirements. Copilot's install steps (copilot_engine_installation.go) do not include GenerateNodeJsSetupStep() because Copilot has always depended on the main job's runtime-detection pipeline to emit it. The detection job never goes through that pipeline.

Related issues:

  • github/gh-aw#27829: same node: command not found symptom, but a distinct scope. That issue affects the main agent job on scheduled workflows and attributes the failure to lock-file staleness. This issue is a detection-job-only compiler-emission asymmetry; recompiling lock files does not fix it.

Versions confirmed affected: gh-aw v0.69.3. The asymmetry between the main job and detection job compilation paths pre-dates this version; any v0.69.x producing a Copilot-based detection job is affected.


Proposed Fix: Implementation plan

Analysis

The detection-job compiler path in pkg/workflow/threat_detection.go::buildDetectionEngineExecutionStep accumulates steps by calling engine.GetInstallationSteps(threatDetectionData) directly and appending the result. It never goes through DetectRuntimeRequirements / requiresNodeForEngineDriver the way compiler_yaml_main_job.go does for the main agent job. Copilot's installation steps (GenerateCopilotInstallerSteps in pkg/workflow/copilot_installer.go) emit only the Install GitHub Copilot CLI step — no Setup Node.js. Meanwhile copilot_engine_execution.go (around lines 186–203) unconditionally wraps the CLI with node ${RUNNER_TEMP}/gh-aw/actions/copilot_driver.cjs because CopilotEngine.GetDriverScriptName() (pkg/workflow/copilot_engine.go:128–131) always returns "copilot_driver.cjs". On a runner without ambient Node the driver invocation fails with exit 127.

Claude and Codex are not affected because their install steps flow through BuildStandardNpmEngineInstallSteps(..., includeNodeSetup=true, ...) (pkg/workflow/engine_helpers.go), which bundles Setup Node.js via GenerateNpmInstallSteps. The fix must (a) emit Setup Node.js in the detection job whenever the selected engine wraps its CLI with a Node-launched driver script, (b) not duplicate Setup Node.js when the engine's install steps already bundle one (the JobManager.ValidateDuplicateSteps validator in pkg/workflow/jobs_validation.go treats duplicates as a compiler bug and hard-fails the compile), and (c) repair an adjacent correctness gap in the same function — the detection-engine config rebuild drops DriverScript.

The correct predicate is interface-level. CopilotEngine.GetDriverScriptName() returns a non-empty default even when EngineConfig.DriverScript is empty, so a config-keyed check (the requiresNodeForEngineDriver(workflowData) helper in pkg/workflow/runtime_detection.go) would miss the default-driver case at this call site. The existing DriverProvider interface in pkg/workflow/agentic_engine.go is the right seam.
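
The difference between the two kinds of predicate can be shown with stub types. This is a sketch under stated assumptions: the types below are simplified stand-ins that borrow the names used in this issue, the stub engines are invented, and configKeyedNeedsNode is a hypothetical illustration of a config-keyed check rather than a copy of any gh-aw helper.

```go
package main

import "fmt"

// Simplified stand-ins for the gh-aw types named above; the real definitions
// have more fields and methods.
type EngineConfig struct {
	DriverScript string
}

type CodingAgentEngine interface {
	GetID() string
}

type DriverProvider interface {
	GetDriverScriptName() string
}

// copilotStub always reports a default driver, like the Copilot engine.
type copilotStub struct{}

func (copilotStub) GetID() string               { return "copilot" }
func (copilotStub) GetDriverScriptName() string { return "copilot_driver.cjs" }

// plainStub does not implement DriverProvider at all.
type plainStub struct{}

func (plainStub) GetID() string { return "plain" }

// configKeyedNeedsNode (hypothetical) only sees an explicit override in the
// config, so it misses a driver that the engine supplies by default.
func configKeyedNeedsNode(cfg *EngineConfig) bool {
	return cfg != nil && cfg.DriverScript != ""
}

// interfaceLevelNeedsNode asks the engine itself, so the default driver counts.
func interfaceLevelNeedsNode(engine CodingAgentEngine) bool {
	dp, ok := engine.(DriverProvider)
	return ok && dp.GetDriverScriptName() != ""
}

func main() {
	cfg := &EngineConfig{} // no driver override configured
	fmt.Println("config-keyed:", configKeyedNeedsNode(cfg))
	fmt.Println("interface-level (copilot):", interfaceLevelNeedsNode(copilotStub{}))
	fmt.Println("interface-level (plain):", interfaceLevelNeedsNode(plainStub{}))
}
```

With an empty config, the config-keyed check returns false while the interface-level check returns true for the Copilot-like stub, which is the default-driver case described above.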

Implementation steps

  1. Add an interface-level predicate in pkg/workflow/agentic_engine.go (next to the existing DriverProvider interface, around line 243).

    Add a package-level helper:

    // engineRequiresNodeDriver reports whether the engine's execution command wraps
    // the CLI with a driver script launched via node (see nodeRuntimeResolutionCommand
    // in copilot_engine_execution.go). Used by call sites that must ensure node is on
    // PATH before the driver runs — notably the detection job, which does not go
    // through DetectRuntimeRequirements.
    func engineRequiresNodeDriver(engine CodingAgentEngine) bool {
        if engine == nil {
            return false
        }
        dp, ok := engine.(DriverProvider)
        if !ok {
            return false
        }
        return dp.GetDriverScriptName() != ""
    }

    Why interface-level and not EngineConfig.DriverScript: copilot_engine_execution.go:193–196 resolves the driver as e.GetDriverScriptName() first and only overrides with EngineConfig.DriverScript when it is non-empty. A config-keyed predicate would miss the common default-driver case.

  2. Add a dedup guard in pkg/workflow/nodejs.go (append after GenerateNpmInstallStepsWithScope).

    // installStepsContainNodeSetup reports whether any of the provided steps is already
    // a "Setup Node.js" step. Uses the same extractStepName matcher as
    // JobManager.ValidateDuplicateSteps so the guard cannot drift from what the
    // validator would flag as a duplicate.
    func installStepsContainNodeSetup(steps []GitHubActionStep) bool {
        for _, step := range steps {
            if extractStepName(strings.Join(step, "\n")) == "Setup Node.js" {
                return true
            }
        }
        return false
    }

    extractStepName is defined in pkg/workflow/jobs.go. Reuse is deliberate — it guarantees the dedup guard matches exactly what JobManager.ValidateDuplicateSteps would flag as a duplicate, and stays aligned if the duplicate detector ever changes its matching logic. Import strings if not already imported in the file.

  3. Preserve DriverScript when rebuilding the detection engine config in pkg/workflow/threat_detection.go::buildDetectionEngineExecutionStep, around the existing field copy at lines 508–516.

    The current rebuild preserves ID, Model, Version, Env, Config, Args, APITarget but drops DriverScript. Since engine.driver is a validated, supported field (see compiler_orchestrator_workflow.go::validateEngineDriverScript), a threat-detection-specific override via safe-outputs.threat-detection.engine-config.driver currently does not propagate to the detection job — the generated detection step silently uses the engine default driver instead. Add DriverScript to the preserved field list:

    detectionEngineConfig = &EngineConfig{
        ID:           detectionEngineConfig.ID,
        Model:        detectionEngineConfig.Model,
        Version:      detectionEngineConfig.Version,
        Env:          detectionEngineConfig.Env,
        Config:       detectionEngineConfig.Config,
        Args:         detectionEngineConfig.Args,
        APITarget:    detectionEngineConfig.APITarget,
        DriverScript: detectionEngineConfig.DriverScript,
    }

    This is a distinct correctness fix adjacent to the Node-setup bug: without it, the Node-setup plumbing would still work (the interface-level predicate triggers on the default driver), but the detection job would execute the wrong driver script when a user configured a threat-detection-specific override. Fix both in the same change because they touch the same config rebuild and the same conceptual surface.

  4. Conditionally prepend Setup Node.js in pkg/workflow/threat_detection.go::buildDetectionEngineExecutionStep, between the installSteps := engine.GetInstallationSteps(threatDetectionData) call and the existing loop that appends installSteps to steps:

    installSteps := engine.GetInstallationSteps(threatDetectionData)
    
    // Ensure node is on PATH when the engine's execution wraps the CLI with a driver
    // script (see engineRequiresNodeDriver). The detection job does not go through
    // DetectRuntimeRequirements, so the setup must be emitted here explicitly. Guard
    // against engines whose install steps already bundle Setup Node.js (Claude/Codex
    // via BuildStandardNpmEngineInstallSteps) — a duplicate would trip
    // JobManager.ValidateDuplicateSteps and hard-fail the compile.
    if engineRequiresNodeDriver(engine) && !installStepsContainNodeSetup(installSteps) {
        for _, line := range GenerateNodeJsSetupStep() {
            steps = append(steps, line+"\n")
        }
    }
    
    for _, step := range installSteps {
        for _, line := range step {
            steps = append(steps, line+"\n")
        }
    }

    Ordering is deliberate: Setup Node.js must appear before the engine install step, because the Copilot installer script itself shells out through the runner's PATH. Do not hoist the setup-node block to the top of the detection job — the compiler-generated pre-steps above this call site use Actions-JS runtimes (native Node) rather than shell node, so moving the setup earlier is unnecessary and widens the diff.
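
The combined behavior of the predicate, the dedup guard, and the ordering rule can be sketched in isolation. Everything here is a simplified stand-in: step, stepName, containsNodeSetup, nodeSetupStep and assembleDetectionSteps are hypothetical names that mirror the roles of the real functions above, and the step contents are invented.

```go
package main

import (
	"fmt"
	"strings"
)

// step is a simplified stand-in for one workflow step rendered as YAML lines.
type step []string

const nodeSetupName = "Setup Node.js"

// stepName returns the value of the first "- name:" line in a step, or "".
func stepName(s step) string {
	for _, line := range s {
		t := strings.TrimSpace(line)
		if strings.HasPrefix(t, "- name: ") {
			return strings.TrimPrefix(t, "- name: ")
		}
	}
	return ""
}

// containsNodeSetup reports whether any step is already a Node setup step.
func containsNodeSetup(steps []step) bool {
	for _, s := range steps {
		if stepName(s) == nodeSetupName {
			return true
		}
	}
	return false
}

// nodeSetupStep returns an invented Node setup step for illustration.
func nodeSetupStep() step {
	return step{"- name: " + nodeSetupName, "  uses: actions/setup-node@v4"}
}

// assembleDetectionSteps prepends the Node setup step only when the engine
// needs a node-launched driver and the install steps do not already bundle it.
func assembleDetectionSteps(needsNodeDriver bool, install []step) []string {
	var out []string
	if needsNodeDriver && !containsNodeSetup(install) {
		out = append(out, []string(nodeSetupStep())...)
	}
	for _, s := range install {
		out = append(out, []string(s)...)
	}
	return out
}

// countNamed counts how many lines declare a step with the given name.
func countNamed(lines []string, name string) int {
	n := 0
	for _, line := range lines {
		if strings.TrimSpace(line) == "- name: "+name {
			n++
		}
	}
	return n
}

func main() {
	installOnly := []step{{"- name: Install CLI", "  run: ./install.sh"}}
	bundled := []step{nodeSetupStep(), {"- name: Install CLI", "  run: ./install.sh"}}

	fmt.Println(strings.Join(assembleDetectionSteps(true, installOnly), "\n"))
	fmt.Println("bundled count:", countNamed(assembleDetectionSteps(true, bundled), nodeSetupName))
}
```

In both driver-wrapped cases the assembled output contains exactly one Node setup step, and it sits ahead of the install step, which is the invariant the tests in the next section assert against the real function.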

Tests

  5. Add tests in pkg/workflow/threat_detection_test.go. Follow the existing file style: table-driven tests with t.Run subtests. The //go:build !integration tag is not used in this file because it is already part of the default (non-integration) build.

    5a. TestBuildDetectionEngineExecutionStepEmitsNodeSetupForCopilot — table-driven with four cases:

    Case                                  | AI        | SafeOutputs.ThreatDetection.EngineConfig | Expected install step name
    copilot main engine                   | "copilot" | nil                                      | Install GitHub Copilot CLI
    copilot via threat-detection override | "claude"  | &EngineConfig{ID: "copilot"}             | Install GitHub Copilot CLI
    claude main engine (dedup path)       | "claude"  | nil                                      | Install Claude Code CLI
    codex main engine (dedup path)        | "codex"   | nil                                      | Install Codex CLI

    Each case must assert:

    • strings.Count(stepsString, "- name: Setup Node.js") == 1 — catches both a missing prepend on the Copilot cases (count 0) and a duplicate on the Claude/Codex cases (count 2, which would later trip JobManager.ValidateDuplicateSteps).
    • strings.Index(stepsString, "- name: Setup Node.js") < strings.Index(stepsString, "- name: <expected install step>") — ordering invariant.

    Skeleton:

    func TestBuildDetectionEngineExecutionStepEmitsNodeSetupForCopilot(t *testing.T) {
        compiler := NewCompiler()
        tests := []struct {
            name                string
            data                *WorkflowData
            expectedInstallStep string
        }{ /* four cases as above */ }
        for _, tt := range tests {
            t.Run(tt.name, func(t *testing.T) {
                steps := compiler.buildDetectionEngineExecutionStep(tt.data)
                if len(steps) == 0 { t.Fatal("expected non-empty steps") }
                s := strings.Join(steps, "")
                if c := strings.Count(s, "- name: Setup Node.js"); c != 1 {
                    t.Errorf("want exactly one Setup Node.js, got %d.\n%s", c, s)
                }
                nodeIdx := strings.Index(s, "- name: Setup Node.js")
                installIdx := strings.Index(s, "- name: "+tt.expectedInstallStep)
                if installIdx == -1 { t.Fatalf("missing %q step", tt.expectedInstallStep) }
                if nodeIdx > installIdx {
                    t.Errorf("Setup Node.js must precede %q", tt.expectedInstallStep)
                }
            })
        }
    }

    5b. TestInstallStepsContainNodeSetup — direct unit test for the dedup guard, independent of engine wiring. Cases: empty input; canonical setup-node step produced by GenerateNodeJsSetupStep(); install-only step; setup-node preceded by an unrelated step; differently-indented setup-node (to confirm extractStepName whitespace tolerance). Rationale: the only current DriverProvider engine is Copilot, whose install steps do not bundle setup-node, so the dedup branch is never exercised by the higher-level test; a direct unit test locks the guard against drift if a future DriverProvider engine also bundles its own Node setup.

    5c. TestBuildDetectionEngineExecutionStepPropagatesDriverScriptOverride — asserts the DriverScript preservation fix from step 3. Construct WorkflowData{ AI: "copilot", SafeOutputs: &SafeOutputsConfig{ ThreatDetection: &ThreatDetectionConfig{ EngineConfig: &EngineConfig{ ID: "copilot", DriverScript: "custom_copilot_driver.cjs" } } } }, call buildDetectionEngineExecutionStep, join the returned steps and assert the substring custom_copilot_driver.cjs is present. Without the preservation fix, the rebuilt config would silently fall back to the default copilot_driver.cjs. Pair with a negative assertion that the default driver name is not present in this specific case, to make the regression direction unambiguous.
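
The regression direction that 5c guards can be illustrated without the compiler. This is a sketch with stand-in types: engineConfig, resolveDriver, rebuildDropping and rebuildPreserving are hypothetical and only model the override-or-default resolution and the field copy described in step 3.

```go
package main

import "fmt"

// engineConfig is a trimmed stand-in for the real EngineConfig.
type engineConfig struct {
	ID           string
	Model        string
	DriverScript string
}

const defaultDriver = "copilot_driver.cjs"

// resolveDriver models the resolution described in step 1: use the engine
// default unless the config carries a non-empty override.
func resolveDriver(cfg *engineConfig) string {
	if cfg != nil && cfg.DriverScript != "" {
		return cfg.DriverScript
	}
	return defaultDriver
}

// rebuildDropping models the current field copy, which omits DriverScript.
func rebuildDropping(cfg *engineConfig) *engineConfig {
	return &engineConfig{ID: cfg.ID, Model: cfg.Model}
}

// rebuildPreserving models the proposed field copy, which keeps DriverScript.
func rebuildPreserving(cfg *engineConfig) *engineConfig {
	return &engineConfig{ID: cfg.ID, Model: cfg.Model, DriverScript: cfg.DriverScript}
}

func main() {
	override := &engineConfig{ID: "copilot", DriverScript: "custom_copilot_driver.cjs"}
	fmt.Println("dropping:  ", resolveDriver(rebuildDropping(override)))
	fmt.Println("preserving:", resolveDriver(rebuildPreserving(override)))
}
```

The dropping variant silently resolves to the default driver, which is why 5c pairs a positive assertion on the custom name with a negative assertion on the default name.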

Lock-file regeneration

  6. Regenerate all .github/workflows/*.lock.yml via the standard Makefile target:

    make recompile
    

    make recompile runs ./gh-aw init and then ./gh-aw compile --validate --verbose --purge --stats (see Makefile). Commit the regenerated lock files in the same commit as the code change so each commit is internally consistent and bisectable; do not split code and generated artifacts into separate commits.

    Expected lock-file diff pattern: every workflow that uses Copilot for the main agent engine, or overrides the detection engine to Copilot via safe-outputs.threat-detection.engine-config.id: copilot, gains a Setup Node.js block inside the detection job, inserted immediately before Install GitHub Copilot CLI.

    Workflows using Claude or Codex for detection must produce zero lock-file delta: their install steps already bundle Setup Node.js, and the dedup guard must suppress the prepend. If Claude/Codex workflows show a diff, the dedup guard is wrong and must be fixed before commit. Lock files are compiler output and must never be hand-edited.

Validation

  7. Run the project's standard validation pipeline before opening the PR:

    make fmt
    make test-unit    # fast loop for pkg/workflow changes
    make test         # full unit + integration
    make lint
    make recompile    # regenerates lock files
    make agent-finish # required pre-commit gate per AGENTS.md / CONTRIBUTING.md
    

    make agent-finish (defined at Makefile:748–749) runs deps-dev fmt lint build build-wasm test-all fix recompile dependabot generate-schema-docs generate-agent-factory security-scan. Failures in any of these must block the commit.

Suggested commit message

fix(workflow): emit Setup Node.js in detection job for driver-wrapped engines
