fix(cascade-tools): explicit help topics + envelope-only stdout + pin strict mode #1285

Merged
zbigniewsobiecki merged 4 commits into dev from fix/cascade-tools-help-stdout-strict
May 9, 2026

@zbigniewsobiecki
Member

Summary

Three independent fixes for the remaining issues from the 2026-05-09 prod-run analysis. PR #1281 (merged) closed the dominant boolean-flag pair; this closes the rest.

#1 - cascade-tools --help topic summaries are explicit

When pjson.oclif.topics is unset, oclif borrows each topic's description from its first command (see node_modules/@oclif/core/lib/config/config.js). Bare cascade-tools --help showed:

TOPICS
  pm        Add a checklist with items to a work item. ...
  scm       Create a GitHub pull request. Handles the full workflow ...
  alerting  Retrieve full details for an alerting event ...
  session   Call this gadget when you have completed all tasks ...

— each line is a specific gadget's description leaking into the topic line. Agents reading bare --help to map the surface got a misleading frame (≥ 2× in the 2026-05-09 corpus).

Fix: explicit pjson.oclif.topics = {...} block in bin/cascade-tools.js covering all six discovered topics (pm, scm, alerting, session, plus the legacy direct-provider github / trello whose summaries point operators back to the canonical topic).
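The borrowing behaviour and the effect of an explicit topics block can be modelled in a few lines. This is a simplified sketch, not oclif's actual code: `borrowedTopics` and `resolveTopics` are hypothetical helpers, the colon-separated command ids and the two sample commands are assumptions, and only the `pm` summary string is taken from the smoke output in this PR.

```javascript
// Simplified model of topic-description resolution; NOT oclif's real implementation.
// Assumption: command ids look like "topic:command". Sample commands are illustrative.

// Without explicit topics, the first command seen for a topic donates its text.
function borrowedTopics(commands) {
  const topics = new Map();
  for (const c of commands) {
    const name = c.id.split(':')[0];
    if (!topics.has(name)) {
      topics.set(name, { name, description: c.summary || c.description });
    }
  }
  return topics;
}

// Explicit entries (shaped like pjson.oclif.topics) replace borrowed descriptions.
function resolveTopics(commands, explicit = {}) {
  const topics = borrowedTopics(commands);
  for (const [name, { description }] of Object.entries(explicit)) {
    topics.set(name, { name, description });
  }
  return topics;
}

const commands = [
  { id: 'pm:add-checklist', description: 'Add a checklist with items to a work item.' },
  { id: 'pm:read-work-item', description: 'Read a work item.' },
];
const explicit = {
  pm: {
    description:
      'Read and write PM work items, comments, and checklists across Trello/JIRA/Linear.',
  },
};

// First line: a gadget description leaking into the topic line.
console.log(borrowedTopics(commands).get('pm').description);
// Second line: the explicit topic summary.
console.log(resolveTopics(commands, explicit).get('pm').description);
```

The point of the model is only that the topic line stops depending on which command happens to be registered first.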

#2 - cascade-tools stdout is envelope-only (drop inherited LLMIST_LOG_TEE)

Prod 2026-05-09: 75/120 cascade-tools calls (62%) emitted DEBUG/INFO log lines + ANSI escapes on stdout BEFORE the JSON envelope, polluting the agent's tool-result channel.

Root cause: the worker process at src/backends/llmist/index.ts:83-84 sets LLMIST_LOG_FILE=<engineLogPath> AND LLMIST_LOG_TEE='true' so its OWN logger tees to both the engine log file AND stdout. Both env vars are in the subprocess allowlist (src/utils/cascadeEnv.ts:14-15) and pass through to the bash subprocess that runs cascade-tools — making cascade-tools' logger ALSO tee, polluting the agent's view.

Fix: strip LLMIST_LOG_TEE at the very top of bin/cascade-tools.js (before the singleton logger reads env). With LLMIST_LOG_FILE still set, all logs (including the load-bearing [image-pipeline] work-item-fetch summary per spec 016 / src/integrations/README.md "Diagnostic log line" contract) land in the engine log the worker collects — operator observability via cascade runs logs <runId> is preserved. For standalone runs, redirect to /dev/null so dev runs stay envelope-only too; developers can override with LLMIST_LOG_FILE=/tmp/x.log cascade-tools ….

No source-tree code changes: src/utils/logging.ts, the cascade logger usage everywhere, and the [image-pipeline] log line are all unchanged. The fix is one entrypoint env tweak.
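A minimal sketch of what such an entrypoint env tweak could look like, assuming it runs before any module that constructs the singleton logger is loaded. `sanitizeLogEnv` is a hypothetical name used for illustration; the actual code in bin/cascade-tools.js may be structured differently.

```javascript
// Sketch of the entrypoint env tweak; sanitizeLogEnv is a hypothetical helper.
// In a real entrypoint it would be called with process.env before the logger loads.
function sanitizeLogEnv(env) {
  // Never tee logs to stdout: stdout is reserved for the JSON envelope.
  delete env.LLMIST_LOG_TEE;
  // Standalone runs inherit no log file, so send logger output to /dev/null.
  if (!env.LLMIST_LOG_FILE) {
    env.LLMIST_LOG_FILE = '/dev/null';
  }
  return env;
}

// Worker shape: tee is stripped, the engine log path is kept.
const worker = sanitizeLogEnv({
  LLMIST_LOG_TEE: 'true',
  LLMIST_LOG_FILE: '/tmp/engine.log',
});

// Standalone shape: nothing inherited, logs fall back to /dev/null.
const standalone = sanitizeLogEnv({});

// Dev-override shape: an explicit log file is respected.
const devOverride = sanitizeLogEnv({ LLMIST_LOG_FILE: '/tmp/x.log' });

console.log(worker, standalone, devOverride);
```

The three sample calls correspond to the worker, standalone, and dev-override env shapes named in the test plan.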

#3 — Pin oclif strict mode on CredentialScopedCommand

Locks oclif's documented default (strict = true) explicitly on the cascade-tools command base. Without it, unknown flags would slip past parse validation and reach the gadget body as positional args — silently bypassing the spec-014 unknown-flag envelope. Preventive; passes today, guards against future drift.

Note: the original 2026-05-09 analysis flagged run 27be3592 (session finish --agent-type review --review-submitted --comment "..." returning exit=0) as a strict-mode gap. Closer inspection of src/cli/session/finish.ts shows it's a hand-written oclif command — --agent-type, --pr-created, --review-submitted are legitimate CLI extensions, listed in --help. Not a real bug. The pin in this PR is preventive, not a fix for that specific run.
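To illustrate why the pin matters, here is a toy flag parser contrasting strict and non-strict behaviour. It is not oclif's parser: `parseArgs` is hypothetical, all flags are treated as booleans for brevity, and the error object is a stand-in, since the real `unknown-flag` envelope shape is defined by spec 014 and is not shown in this PR.

```javascript
// Toy model of strict vs non-strict flag parsing; NOT oclif's parser.
// The error shape is hypothetical; spec 014 defines the real unknown-flag envelope.
function parseArgs(argv, knownFlags, { strict = true } = {}) {
  const flags = {};
  const args = [];
  for (const token of argv) {
    if (!token.startsWith('--')) {
      args.push(token);
      continue;
    }
    const name = token.slice(2);
    if (knownFlags.includes(name)) {
      flags[name] = true; // simplification: every flag is boolean
    } else if (strict) {
      // Strict: unknown flags are rejected before the command body runs.
      return { success: false, error: { code: 'unknown-flag', flag: token } };
    } else {
      // Non-strict: the unknown flag leaks through as a positional argument.
      args.push(token);
    }
  }
  return { success: true, flags, args };
}

const strictResult = parseArgs(['--comment', '--bogus'], ['comment']);
const looseResult = parseArgs(['--comment', '--bogus'], ['comment'], { strict: false });
console.log(strictResult, looseResult);
```

In the non-strict call the unknown `--bogus` token ends up in `args`, which is the silent bypass the pin is meant to prevent.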

End-to-end smoke (built CLI)

$ NODE_ENV='' node bin/cascade-tools.js --help | head -10
TOPICS
  alerting  Inspect Sentry alerting issues and events.
  github    Direct GitHub provider commands. Prefer the provider-agnostic `scm` topic.
  pm        Read and write PM work items, comments, and checklists across Trello/JIRA/Linear.
  scm       Interact with GitHub PRs: create, review, comment, fetch diffs and CI logs.
  session   End the agent session. Exclusive terminal call.
  trello    Direct Trello provider commands. Prefer the provider-agnostic `pm` topic.

$ LLMIST_LOG_TEE=true LLMIST_LOG_FILE=/tmp/engine.log \
    node bin/cascade-tools.js pm read-work-item --workItemId NOT-A-REAL 1>/tmp/out.txt
$ cat /tmp/out.txt
{"success":true,"data":"Error reading work item: No Trello credentials in scope. ..."}
$ head -3 /tmp/engine.log | sed 's/\x1b\[[0-9;]*m//g'
2026-05-09 11:13:41:465  DEBUG  [cascade]  Fetching Trello card {"cardId":"NOT-A-REAL"}
2026-05-09 11:13:41:466  DEBUG  [cascade]  Fetching card checklists {"cardId":"NOT-A-REAL"}
2026-05-09 11:13:41:466  DEBUG  [cascade]  Fetching card attachments {"cardId":"NOT-A-REAL"}

Stdout is envelope-only; operator observability via the engine log file preserved.

Test plan

  • npm test — 9057 / 9057 passing
  • tests/unit/cli/cascade-tools-help.test.ts (NEW, 2 tests) — pins canonical topic summaries + per-gadget --help unaffected
  • tests/unit/cli/cascade-tools-stdout-cleanliness.test.ts (NEW, 3 tests) — pins envelope-only stdout under worker / standalone / dev-override env shapes; pins engine log gets logger output
  • tests/unit/cli/cli-command-factory.test.ts (extended, 1 new test) — pins unknown-flag envelope on factory-generated commands (locks the strict-mode default)
  • npm run lint clean (only pre-existing complexity warnings in unrelated files)
  • npm run typecheck clean
  • Pre-push hook ran the full unit suite green
  • Manual smoke as shown above — every fix verified end-to-end against the built CLI

Files changed

  • bin/cascade-tools.js: +48 lines (topics block + LLMIST_LOG_TEE strip + LLMIST_LOG_FILE fallback)
  • src/cli/base.ts: +9 lines (one-line static override strict = true plus rationale)
  • tests/unit/cli/cascade-tools-help.test.ts: +80 lines (NEW, regression net for topics)
  • tests/unit/cli/cascade-tools-stdout-cleanliness.test.ts: +124 lines (NEW, regression net for stdout cleanliness)
  • tests/unit/cli/cli-command-factory.test.ts: +26 lines (strict-mode regression test)

🤖 Generated with Claude Code

zbigniewsobiecki and others added 3 commits May 9, 2026 11:03
When `pjson.oclif.topics` is unset, oclif borrows each topic's description
from its first command (`node_modules/@oclif/core/lib/config/config.js` —
`this._topics.set(name, { description: c.summary || c.description, name })`).
That made bare `cascade-tools --help` show:

  pm        Add a checklist with items to a work item. ...
  scm       Create a GitHub pull request. Handles the full workflow ...
  alerting  Retrieve full details for an alerting event ...
  session   Call this gadget when you have completed all tasks ...

Agents reading bare --help to map the surface got a misleading frame (saw
≥ 2× in the 2026-05-09 prod corpus). Set explicit topic summaries so each
TOPICS line is one truthful sentence per topic.

Also covers the two direct-provider topics (`github`, `trello`) that mirror
the canonical `scm` / `pm` surfaces; their summaries point operators back to
the provider-agnostic topic.

Regression net at tests/unit/cli/cascade-tools-help.test.ts pins both the
borrowed-description-must-not-appear and canonical-summary-must-appear
invariants for all six topics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Locks oclif's documented default (`strict = true`) explicitly. Without it,
unknown flags would slip past parse validation and reach the gadget body as
positional args — silently bypassing the spec-014 `unknown-flag` envelope
even though every cascade-tools command claims to honor it.

Adds a regression test on the factory-generated command surface that asserts
unknown flags fire the structured `unknown-flag` envelope (passes today;
guards against any future drift if oclif loosens the default).

Note: the original 2026-05-09 analysis read run 27be3592 as a strict-mode
gap (`session finish --agent-type review --review-submitted --comment ...`
returning success despite `finishDef` declaring only `comment`). Closer
inspection of `src/cli/session/finish.ts` shows it's a hand-written oclif
command — `--agent-type`, `--pr-created`, and `--review-submitted` are
legitimate CLI extensions, listed in `--help`. Not a real bug. The pin in
this commit is preventive, not a fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…elope-only

Prod 2026-05-09: 75/120 cascade-tools calls (62%) emitted DEBUG/INFO log
lines + ANSI escapes on stdout BEFORE the JSON envelope, polluting the
agent's tool-result channel.

Root cause: the worker process at `src/backends/llmist/index.ts:83-84`
sets `LLMIST_LOG_FILE=<engineLogPath>` AND `LLMIST_LOG_TEE='true'` so its
own logger tees to both file and stdout. Both env vars are in the
subprocess allowlist (`src/utils/cascadeEnv.ts:14-15`) and pass through to
the bash subprocess that runs cascade-tools. The cascade-tools logger
then ALSO tees, polluting the agent's view.

Fix: strip `LLMIST_LOG_TEE` at the very top of `bin/cascade-tools.js`
(before the singleton logger reads env). With `LLMIST_LOG_FILE` still set,
all logs (including the load-bearing `[image-pipeline] work-item-fetch
summary` per spec 016 / `src/integrations/README.md`) land in the engine
log file the worker collects — operator observability via `cascade runs
logs <runId>` is preserved. For standalone runs (no LLMIST_LOG_FILE
inherited), redirect to /dev/null so dev runs stay envelope-only too;
developers can override with `LLMIST_LOG_FILE=/tmp/x.log cascade-tools ...`.

No source-tree code changes — `src/utils/logging.ts`, the cascade logger
usage everywhere, and the `[image-pipeline]` log line all unchanged. The
fix is one entrypoint env tweak.

Regression net at tests/unit/cli/cascade-tools-stdout-cleanliness.test.ts
spawns cascade-tools as a subprocess under three env shapes (worker /
standalone / dev-override) and asserts stdout matches `^{"success":` with
no ANSI escapes (ESC byte 0x1b absent), no `[cascade]` substring, and no
tab-separated log-level prefixes. Also asserts the engine log file DOES
receive the cascade logger output — the operator-observability invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
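The stdout invariants listed in this commit message can be expressed as a single predicate. The sketch below is an illustration only: `isEnvelopeOnly` is a hypothetical helper, the set of level names is an assumption, and the real test additionally spawns cascade-tools as a subprocess, which is omitted here.

```javascript
// Sketch of the stdout-cleanliness invariants; isEnvelopeOnly is a hypothetical helper.
// Assumption: a "tab-separated log-level prefix" is a level name surrounded by tabs.
function isEnvelopeOnly(stdout) {
  const startsWithEnvelope = /^\{"success":/.test(stdout);
  const hasAnsi = stdout.includes('\x1b'); // ESC byte 0x1b
  const hasLoggerTag = stdout.includes('[cascade]');
  const hasLevelPrefix = /\t(DEBUG|INFO|WARN|ERROR)\t/.test(stdout);
  return startsWithEnvelope && !hasAnsi && !hasLoggerTag && !hasLevelPrefix;
}

// Clean output: only the JSON envelope.
const clean = '{"success":true,"data":"ok"}\n';

// Polluted output: a colored log line precedes the envelope.
const polluted =
  '\x1b[90m2026-05-09\tDEBUG\t[cascade]\tFetching Trello card\x1b[0m\n' +
  '{"success":true,"data":"ok"}\n';

console.log(isEnvelopeOnly(clean), isEnvelopeOnly(polluted));
```

Checking several independent signals (leading envelope, ESC byte, logger tag, level prefix) keeps the check robust if the log line format changes in only one respect.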
Collaborator

@nhopeatall nhopeatall left a comment

Summary

CI is failing because the help-topic regression test is pinned to legacy topics that are not present in a clean build.

Code Issues

Blocking

  • bin/cascade-tools.js:68 — The new github / trello topic entries do not have any matching commands in the current source tree, so a fresh build has no child topics for oclif to render. I verified both HEAD and origin/dev have no src/cli/github/* or src/cli/trello/*; oclif Help._topics only shows configured topics with child topics/commands. This means the new test expectations for github and trello only pass with stale dist/cli/github / dist/cli/trello artifacts, and current CI already reports lint-and-test failing in the test step. Drop these dead topics from the topic block/test, or restore real legacy commands if they are meant to exist.

🕵️ codex · gpt-5.5 · run details

Comment thread bin/cascade-tools.js Outdated
},
alerting: { description: 'Inspect Sentry alerting issues and events.' },
session: { description: 'End the agent session. Exclusive terminal call.' },
github: {
Collaborator

There are no src/cli/github/* or src/cli/trello/* commands in this branch or in origin/dev; bin/cascade-tools.js only loads commands from ./dist/cli. In a clean build these configured topics have no child commands, and oclif filters such topics out of root help, so the new test expectations for github/trello depend on stale dist artifacts. Please remove these dead topic entries/tests or restore actual legacy commands.

Collaborator

Fixed. Removed both github and trello from the topics block in bin/cascade-tools.js. Confirmed there are no src/cli/github/* or src/cli/trello/* commands in the source tree, and oclif silently filters out topics with no child commands on a clean build. Removed the corresponding test assertions (expect(stdout).toMatch(/github.../) and expect(stdout).toMatch(/trello.../)) from cascade-tools-help.test.ts as well. All unit tests pass.

Remove `github` and `trello` from the explicit topics block in
bin/cascade-tools.js and the matching test assertions. There are no
src/cli/github/* or src/cli/trello/* commands in the source tree, so
oclif filters these topics out of root help on a clean build. The test
expectations only passed against stale dist artifacts, causing CI
failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
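The filtering behaviour the reviewer describes can be modelled as follows. This is a simplified sketch based on the review comment, not oclif's actual help code: `visibleTopics` is a hypothetical helper and the colon-separated command id format is an assumption. The topic descriptions are copied from the smoke output in the PR description.

```javascript
// Simplified model of root-help topic filtering; NOT oclif's actual code.
// Assumption: command ids look like "topic:command".
function visibleTopics(configuredTopics, commandIds) {
  // A topic is rendered only if at least one loaded command lives under it.
  const topicsWithCommands = new Set(commandIds.map((id) => id.split(':')[0]));
  return Object.keys(configuredTopics)
    .filter((name) => topicsWithCommands.has(name))
    .sort();
}

const configured = {
  pm: {
    description:
      'Read and write PM work items, comments, and checklists across Trello/JIRA/Linear.',
  },
  github: {
    description:
      'Direct GitHub provider commands. Prefer the provider-agnostic `scm` topic.',
  },
};

// Clean build: no github commands exist, so the github topic disappears.
const cleanBuild = visibleTopics(configured, ['pm:read-work-item']);

// Stale dist: a leftover github command keeps the topic visible.
const staleDist = visibleTopics(configured, ['pm:read-work-item', 'github:create-pr']);

console.log(cleanBuild, staleDist);
```

Here `github:create-pr` is an invented id standing in for a stale dist artifact; it shows how a test pinned to the `github` topic could pass locally yet fail on a clean CI build.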
@codecov
codecov Bot commented May 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Collaborator

@nhopeatall nhopeatall left a comment

LGTM - the entrypoint now strips inherited tee logging before bootstrap, preserves file logging for worker observability, pins strict parsing on the shared command base, and the CI-backed regression tests cover the help and stdout behavior.

🕵️ codex · gpt-5.5 · run details

@zbigniewsobiecki zbigniewsobiecki merged commit cc90d06 into dev May 9, 2026
9 checks passed