[bug] run-tests.sh lifecycle echoes corrupt --format json when stdout is captured #25

@dogkeeper886

Context

Surfaced during the first end-to-end dispatch of test-pipeline.yml on the self-hosted runner registered in #22. See run 24659380207 — Build suite passed, Smoke suite failed on the Check test results step:

parse error: Invalid numeric literal at line 1, column 4
##[error]Process completed with exit code 4.

The failure is not in a test. It's upstream — the JSON the check step tries to parse is not actually JSON.

Root cause

The workflow step in .github/workflows/test-suite.yml does:

bash cicd/scripts/run-tests.sh $SUITE_FLAG $JUDGE_FLAGS --format json > /tmp/test-results.json || true

run-tests.sh and ci-up.sh both echo lifecycle banners to stdout:

  • run-tests.sh:58: echo "=== Session setup ==="
  • run-tests.sh:48-49: echo "" + echo "=== Session teardown (exit code: $exit_code) ==="
  • ci-up.sh:47-50: echo "=== CI Environment Ready ===" + echo "TestLink URL: ..." + echo "Admin user: ..." + echo "API key: ..."

All of these are captured by > /tmp/test-results.json and interleaved with the framework's --format json output. jq chokes on the leading === of the === Session setup === banner (reported as line 1, column 4) and bails.
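
The interleaving is easy to reproduce in isolation. The sketch below uses a hypothetical stand-in function, fake_run, in place of run-tests.sh. The banner text is taken from the script; the JSON payload is made up for illustration and is not the framework's real envelope.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for run-tests.sh: banner and JSON both go to stdout.
fake_run() {
  echo "=== Session setup ==="
  echo '{"passed": 1, "failed": 0}'   # illustrative payload only
}

out_file="$(mktemp)"
fake_run > "$out_file"

# The first line of the captured "JSON" file is the banner, not JSON.
first_line="$(head -n 1 "$out_file")"
echo "first line: $first_line"
rm -f "$out_file"
```

Any parser reading that file sees the banner before it ever reaches the JSON.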

Why build suite passes

Per run-tests.sh:38-42, the build suite skips the whole lifecycle (NEEDS_STACK=false). No ci-up.sh, no Session setup banner, no Session teardown. Only npx tsx src/cli.ts run --format json writes to stdout, and that output is clean JSON. Build suite happens to dodge the bug.

Smoke and every suite after it trip over it.

Repro

On any checkout with the current testlink_1_9_20_fixed:

bash cicd/scripts/run-tests.sh --suite smoke --format json --no-llm > /tmp/out.json
jq . /tmp/out.json
# parse error: Invalid numeric literal at line 1, column 4

Not a regression introduced by FR-005

The pattern predates FR-005 / #22 — it was dormant because the workflows hadn't actually been dispatched end-to-end against the self-hosted runner (all three previous runs in the history also failed, for the same reason or pre-existing ones). FR-005's wiring is what made us finally dispatch a real run and notice.

Proposed fix

Redirect lifecycle echoes to stderr in run-tests.sh, ci-up.sh, and ci-down.sh:

echo "=== Session setup ===" >&2

They are logs, not data, so they belong on stderr. Once they move, > /tmp/test-results.json captures only the framework's JSON and jq parses cleanly.
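
A minimal sketch of the pattern, assuming a small helper is preferred over appending >&2 to every echo. The helper name log and the stand-in function fake_run are hypothetical, not part of the current scripts, and the JSON payload is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send human-readable lifecycle messages to stderr.
log() {
  printf '%s\n' "$*" >&2
}

# Stand-in for run-tests.sh after the fix: banners via log, data on stdout.
fake_run() {
  log "=== Session setup ==="
  echo '{"passed": 1, "failed": 0}'   # illustrative payload only
  log "=== Session teardown (exit code: 0) ==="
}

# stdout alone now holds only the JSON ...
captured_stdout="$(fake_run 2>/dev/null)"
# ... while the banners are still emitted, on stderr.
captured_stderr="$(fake_run 2>&1 >/dev/null)"

echo "stdout: $captured_stdout"
echo "stderr: $captured_stderr"
```

Either form works. A helper keeps the redirect in one place, while a bare >&2 on each echo keeps the diff smaller.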

Secondary consideration: scan cli.ts and related for any other process.stdout.write that isn't the final JSON envelope. The earlier [CONFIG] … lines are already correctly on stderr — good — but worth a full pass.
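
One way to do that pass is a recursive grep. The sketch below assumes the TypeScript sources live under src/ (only src/cli.ts is confirmed above) and also matches console.log, which writes to stdout in Node.js. find_stdout_writes is a hypothetical helper name. Hits still need manual review, since the final JSON envelope is a legitimate stdout write.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list candidate stdout writes in TypeScript sources.
# Usage: find_stdout_writes <dir>
find_stdout_writes() {
  grep -rnE --include='*.ts' 'process\.stdout\.write|console\.log' "$1"
}

# Example: scan src/ if it exists; grep exits non-zero when nothing matches.
if [ -d src ]; then
  find_stdout_writes src || echo "no stdout writes found under src/" >&2
fi
```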

Alternative: change the workflow to read the on-disk results file (cicd/results/<timestamp>_<suite>/…) instead of redirecting stdout. This is more involved, because the output dir is timestamped and would need globbing. Rejected as the default fix; the stderr redirect is a small, mechanical change and preserves the human-readable terminal output for interactive use.

Acceptance criteria

  • bash cicd/scripts/run-tests.sh --suite smoke --format json --no-llm > /tmp/out.json && jq . /tmp/out.json succeeds.
  • A dispatch of test-pipeline.yml with runner=self-hosted runs Check test results successfully when all tests pass (i.e. the pipeline doesn't trip on the parser step).
  • Interactive runs without stdout redirection still show the lifecycle banners on the terminal (stderr is still a terminal).

Out of scope

  • Fixing any actual test failures the LLM judge catches once the parser step is healthy — separate problem, separate tickets if they appear.
  • Rewriting the workflow to use on-disk results files.

Blocks
