fix(media): suppress local whisper progress transcripts by kesslerio · Pull Request #87393 · openclaw/openclaw

kesslerio · 2026-05-27T19:41:09Z

What

CLI audio transcription now treats deterministic transcript-file outputs as authoritative instead of falling back to progress/status stdout.

This covers:

whisper-cli with -otxt / --output-txt and -of / --output-file
whisper with --output_format txt and --output_dir
parakeet-mlx with explicit --output-format txt and --output-dir
the local Whisper Node wrapper output paths via --output-dir, --output, and -o

Empty or missing inferred transcript files now produce no transcript instead of exposing status/progress output as user text. Unexpected transcript-file read errors still surface. Other Node wrappers, parakeet default/non-txt output, and other non-authoritative CLI paths keep the existing stdout fallback behavior.
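
The rule described above can be sketched as follows. This is a minimal illustration, not the code in src/media-understanding/runner.entries.ts: the helper name `resolveTranscript` and its signature are hypothetical, and it assumes the transcript path has already been inferred (or is `null` when no deterministic file output applies).

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper for illustration only; not the OpenClaw implementation.
function resolveTranscript(stdout: string, transcriptPath: string | null): string | null {
  if (transcriptPath === null) {
    // No deterministic file output was inferred: keep the stdout fallback.
    const text = stdout.trim();
    return text.length > 0 ? text : null;
  }
  try {
    // The file is authoritative: an empty file means "no transcript".
    const text = readFileSync(transcriptPath, "utf8").trim();
    return text.length > 0 ? text : null;
  } catch (err) {
    // A missing file also means "no transcript"; stdout is never consulted.
    if ((err as { code?: string }).code === "ENOENT") return null;
    // Any other read error is unexpected and should surface.
    throw err;
  }
}

// Demo: the wrapper printed progress text but wrote an empty transcript file.
const dir = mkdtempSync(join(tmpdir(), "transcript-sketch-"));
const emptyFile = join(dir, "empty.txt");
writeFileSync(emptyFile, "");
console.log(resolveTranscript("Transcribing with Whisper...", emptyFile));
```

In this sketch the progress banner is dropped because the empty file wins, while a `null` path still returns trimmed stdout.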

Fixes #87384.

Why

Some local speech-to-text wrappers print progress banners to stdout while writing the actual transcript to a file. When that transcript file is empty or missing, OpenClaw could previously treat the progress banner as the user transcript. That is wrong for live audio messages because wrapper status text can leak into the inbound conversation body.

Real behavior proof

Behavior or issue addressed: local Whisper-style CLI wrapper status text could be injected as an audio transcript when the expected transcript file was empty or missing.
Real environment tested: refreshed PR head 3f383768675217b456189d972867a1d0bdde7858, disposable OpenClaw staging state, stored Telegram DM voice media (audio/ogg) from @kesslerClawBot staging proof, and a controlled local-whisper-compatible wrapper that printed Whisper progress to stdout, wrote an empty inferred transcript file under {{OutputDir}}, and exited successfully.
Exact steps or command run after this patch: merged current origin/main into the PR branch, then ran a current-head direct media runtime proof through src/media-understanding/runner.entries.ts / runCliEntry with node <proof-wrapper> <stored-telegram-audio.ogg> --output-dir <tmp>. The same wrapper/media path was previously exercised through the disposable Telegram gateway with @kesslerClawBot.

Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): public proof gist with current-head direct runtime output plus the redacted live Telegram gateway excerpt: https://gist.github.com/kesslerio/97110dff154bef75ddb80f35c5cdc8e8

PR #87393 current-head runCliEntry proof
head=3f383768675217b456189d972867a1d0bdde7858
media=<staging-state>/media/inbound/ec282e95-e8cd-4472-b8ce-d15d73cac548.ogg
wrapper=<staging-state>/proof/local-whisper/transcribe.js

result null
PASS: runCliEntry returned null; local-whisper wrapper progress stdout was not exposed as transcript text.

Prior live Telegram gateway excerpt for the same wrapper behavior:

audio-preflight: transcribing attachment 0 for mention check
Media understanding via CLI: node <staging-state>/proof/local-whisper/transcribe.js <staging-state>/media/inbound/ec282e95...ogg --output-dir /tmp/openclaw/openclaw-media-cli-...
Whisper Voice Transcription
Model: proof
Transcribing with Whisper...
Media understanding audio: skipped (0/1) reason=empty output
telegram inbound: chatId=<redacted> from=telegram:<redacted> len=92 preview="[Telegram ...] <media:audio>"

Observed result after fix: current-head runCliEntry returned null for the empty authoritative transcript file; in the live Telegram proof, OpenClaw logged wrapper progress output but kept the inbound Telegram body as <media:audio>. The progress stdout was not injected as transcript text.
What was not tested: real Whisper model transcription quality and real upstream parakeet-mlx binary execution were not tested in the staging proof. The parakeet compatibility path is covered by focused regression tests instead.
Proof limitations or environment constraints: the current-head proof uses a controlled wrapper to force the empty-output condition deterministically. It proves OpenClaw's wrapper-output handling on the changed source path; the live Telegram proof was captured before the current-main merge, but the refreshed PR diff still only changes the same five media/doc/test files against current origin/main.
Before evidence (optional but encouraged): issue #87384 ("Bug: CLI audio transcription can use progress stdout when transcript file is empty") describes the reported failure mode; current main fell back from missing/empty inferred transcript files to stdout for known CLI paths.

Validation

pnpm exec oxfmt --check --threads=1 docs/nodes/audio.md docs/nodes/media-understanding.md src/media-understanding/apply.test.ts src/media-understanding/runner.cli-audio.test.ts src/media-understanding/runner.entries.ts
git diff --check origin/main...HEAD
timeout -k5s 180s node scripts/run-vitest.mjs run --config test/vitest/vitest.media-understanding.config.ts --reporter=verbose src/media-understanding/runner.cli-audio.test.ts -t "parakeet|whisper-cli inferred|inferred whisper transcript|local whisper|node wrapper"
timeout -k5s 180s node scripts/run-vitest.mjs run --config test/vitest/vitest.media-understanding.config.ts --reporter=verbose src/media-understanding/runner.cli-audio.test.ts src/media-understanding/apply.test.ts

Result:

formatting passed
diff whitespace check passed
focused parakeet/Whisper/local-wrapper tests passed: 9 passed, 4 skipped
broader media-understanding audio files passed: 65 passed
current-head direct runtime proof passed and is linked above

Validation limitation:

timeout -k5s 180s node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo failed after the current-main merge on declaration errors in unrelated current-main files for proper-lockfile, cross-spawn, and hosted-git-info. No media-understanding files were reported in that failure.

CI Status

CI is running on current head 3f383768675217b456189d972867a1d0bdde7858.

The branch now includes current origin/main (7d5d62511f) and contains scripts/github/dependency-guard.mjs, so the previous Dependency Guard failure from checking out 53ad531df9 without that script should be cleared by this head.

Maintainer Edits

Maintainer edits are enabled on this fork PR.

AI Assistance

AI-assisted implementation and review. Human-run proof was captured through the disposable staging gateway and supplemented with a current-head direct runtime proof. Local review included scoped simplify/code-review passes and focused media-understanding validation.

clawsweeper · 2026-05-27T19:44:59Z

Codex review: needs maintainer review before merge. Reviewed June 1, 2026, 11:52 PM ET / 03:52 UTC.

Summary
The PR changes media CLI audio output resolution, tests, and docs so deterministic transcript-file outputs are treated as authoritative instead of falling back to progress stdout.

PR surface: Source +30, Tests +337, Docs 0. Total +367 across 5 files.

Reproducibility: yes, at source level. Current main reads inferred transcript files only when non-empty and otherwise falls through to stdout, while the linked issue gives a concrete wrapper/progress-stdout reproduction shape. I did not execute a failing current-main run because this review is read-only.

Review metrics: 1 noteworthy metric.

CLI output-source policy: 4 CLI path families changed. whisper-cli, whisper, explicit-txt parakeet-mlx, and local-whisper node wrappers stop using stdout fallback when the expected transcript file is empty or missing.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P1] Get maintainer sign-off on the fail-closed transcript-file behavior and the local-whisper wrapper exception before merge.

Risk before merge

[P1] Known file-output CLI entries and the local-whisper node wrapper now fail closed on empty or missing inferred transcript files and throw unexpected transcript-file read errors instead of falling back to stdout, which can change existing local setups that relied on stdout fallback.
[P1] The local-whisper handling is path/name based in core rather than a documented generic CLI output contract, so maintainers should decide whether that external-wrapper exception belongs in core or should stay a separate plugin/skill contract.

Maintainer options:

Accept the file-output contract change
Maintainers can land this once they agree that empty or missing deterministic transcript files should fail closed for the covered CLI paths despite stdout fallback compatibility changes.
Narrow the external-wrapper exception
Before merge, remove or redesign the local-whisper path match if maintainers do not want core to recognize a skill-specific wrapper name.
Pause for a generic output policy
If custom wrapper support needs a durable contract, pause this PR and design an explicit CLI output-source policy separately from the known-command bug fix.

Next step before merge

[P2] Maintainers need to accept the compatibility change and local-wrapper boundary; there is no narrow automation repair request from this review.

Security
Cleared: No concrete security or supply-chain regression was found; the diff changes media runner output selection, focused tests, and docs without new dependencies, workflows, or secret handling.

Review details

Best possible solution:

Land a focused media-runner fix that makes deterministic file-output STT paths authoritative, with explicit maintainer acceptance or removal of the local-whisper wrapper exception before merge.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main reads inferred transcript files only when non-empty and otherwise falls through to stdout, while the linked issue gives a concrete wrapper/progress-stdout reproduction shape. I did not execute a failing current-main run because this review is read-only.

Is this the best way to solve the issue?

Mostly yes: making known deterministic transcript-file outputs authoritative is the narrowest repair for the bug. The local-whisper path-based exception is the remaining owner-boundary and compatibility decision before merge.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 39f319c7a47e.

Label changes

Label changes:

add proof: sufficient: Contributor real behavior proof is sufficient. The PR body and linked gist provide current-head runtime output plus redacted live Telegram gateway logs showing progress stdout was not exposed as transcript text after the patch.
add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body and linked gist provide current-head runtime output plus redacted live Telegram gateway logs showing progress stdout was not exposed as transcript text after the patch.
remove rating: 🌊 off-meta tidepool: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.

Label justifications:

P2: This is a normal-priority media transcription correctness fix with limited blast radius in CLI audio understanding.
merge-risk: 🚨 compatibility: Merging changes existing CLI fallback behavior for empty, missing, or unreadable deterministic transcript-file outputs.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body and linked gist provide current-head runtime output plus redacted live Telegram gateway logs showing progress stdout was not exposed as transcript text after the patch.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body and linked gist provide current-head runtime output plus redacted live Telegram gateway logs showing progress stdout was not exposed as transcript text after the patch.

Evidence reviewed

PR surface:

Source +30, Tests +337, Docs 0. Total +367 across 5 files.

Area	Files	Added	Removed	Net
Source	1	41	11	+30
Tests	2	346	9	+337
Docs	2	2	2	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	5	389	22	+367

What I checked:

Current main fallback behavior: Current main only returns an inferred transcript file when the file exists and has non-empty trimmed content; empty or missing files fall through to stdout trimming. (src/media-understanding/runner.entries.ts:217, 39f319c7a47e)
PR authoritative output behavior: The PR head resolves known file-output paths, treats any resolved file path as authoritative, returns empty output on ENOENT, and throws unexpected read errors instead of falling back to stdout. (src/media-understanding/runner.entries.ts:237, 3f3837686752)
Runtime entry point checked: runCliEntry creates the temporary OutputDir/OutputBase, templates CLI args, runs the command, and passes stdout plus the resolved media path into resolveCliOutput before deciding whether to return an audio transcription. (src/media-understanding/runner.entries.ts:924, 3f3837686752)
Regression coverage added: The PR adds focused coverage for empty/missing whisper-cli, whisper, explicit-txt parakeet-mlx, local-whisper node wrapper behavior, parakeet omitted-format stdout fallback, other node-wrapper stdout fallback, and unexpected read errors. (src/media-understanding/runner.cli-audio.test.ts:179, 3f3837686752)
Docs updated for parakeet compatibility: The docs now say parakeet-mlx reads the inferred .txt file only when --output-format txt is explicit, preserving stdout fallback for omitted or non-txt formats. Public docs: docs/nodes/media-understanding.md. (docs/nodes/media-understanding.md:300, 3f3837686752)
Real behavior proof inspected: The linked public gist shows current-head runCliEntry returning null for a controlled local-whisper-compatible wrapper that printed progress stdout and wrote an empty authoritative transcript file, plus a redacted Telegram gateway excerpt where the inbound body stayed as media:audio. (3f3837686752)

Likely related people:

steipete: GitHub path history shows repeated recent media runner and media core refactor work touching the central files and adjacent media execution contracts. (role: recent area contributor; confidence: high; commits: 242eab9d20f7, 77f1359612f6, 252a76d25c93; files: src/media-understanding/runner.entries.ts, src/media-understanding/runner.ts)
TurboTheTurtle: Authored the nearby empty Sherpa structured transcript fix in the same CLI audio output path, making them relevant for empty-output semantics. (role: recent adjacent bugfix author; confidence: medium; commits: 46030f548986; files: src/media-understanding/runner.entries.ts, src/media-understanding/runner.cli-audio.test.ts, src/media-understanding/apply.test.ts)
Spolen23: GitHub history shows earlier infer CLI reliability work with media-understanding runner tests, which is adjacent to the output-source contract this PR changes. (role: earlier media CLI reliability contributor; confidence: medium; commits: 12aaef9035ea; files: src/media-understanding/runner.cli-audio.test.ts, src/media-understanding/runner.entries.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc49c6db76

chatgpt-codex-connector · 2026-05-27T19:46:01Z

+          : commandId === "node"
+            ? resolveNodeWhisperWrapperOutputPath(params.args, params.mediaPath)
+            : null;
+  const fileOutputAuthoritative = commandId === "node" && fileOutputPath !== null;


Don't fall back from inferred file outputs

When a file path is inferred for whisper-cli, whisper, or parakeet-mlx, fileOutputAuthoritative remains false, so an empty transcript file skips the return here and falls through to params.stdout.trim() later. In configurations such as whisper ... --output_format txt --output_dir {{OutputDir}} where stdout contains progress or banner text, that status text is still injected as the transcript, leaving the file-output CLI regression unfixed for the built-in paths this runner already recognizes.
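
The gap described in this comment can be illustrated with a small sketch. The first function mirrors the condition quoted from the reviewed commit; the second is one possible repair, consistent with the later review note that the PR head treats any resolved file path as authoritative. Both function names are hypothetical and exist only for this comparison.

```typescript
// Condition as quoted from the reviewed commit: only the node wrapper path
// marks an inferred file output as authoritative.
function authoritativeInReviewedCommit(commandId: string, fileOutputPath: string | null): boolean {
  return commandId === "node" && fileOutputPath !== null;
}

// One possible repair (an assumption about the shape of the fix, not the
// actual diff): any resolved transcript path is authoritative.
function authoritativeForAnyResolvedPath(fileOutputPath: string | null): boolean {
  return fileOutputPath !== null;
}

// For whisper with an inferred .txt path, the reviewed condition still allows
// the stdout fallback, while the repaired condition does not.
console.log(authoritativeInReviewedCommit("whisper", "/tmp/out/audio.txt"));
console.log(authoritativeForAnyResolvedPath("/tmp/out/audio.txt"));
```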

clawsweeper · 2026-05-27T19:50:44Z

ClawSweeper PR egg

✨ Hatched: 🥚 common Brave Merge Sprite

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: keeps receipts.
Image traits: location proof lagoon; accessory CI status badge; palette amber, ink, and glacier blue; mood curious; pose leaning over a miniature review desk; shell translucent glimmer shell; lighting warm desk-lamp glow; background delicate sparkle particles.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

kesslerio · 2026-05-27T19:59:23Z

Updated this PR after ClawSweeper feedback.

Head: ec4e753514fbfc9f129a84f67f18c7b3f422a8b3

What changed:

Broadened the fix from local-whisper-only to the deterministic inferred transcript-file contract.
Any CLI audio path where OpenClaw can infer a .txt transcript output path now treats that file as authoritative.
Empty or missing inferred transcript files no longer fall back to progress stdout for whisper-cli, whisper, parakeet-mlx, or the local-whisper node wrapper.
Explicit non-txt/stdout modes still preserve stdout fallback.

Reviewer/bot feedback addressed:

ClawSweeper P2: “Make inferred file outputs authoritative, not local-whisper only.”

Proof:

public artifact or excerpt: the PR diff now includes focused regression tests for whisper-cli, whisper, parakeet-mlx, and local-whisper wrapper behavior in src/media-understanding/runner.cli-audio.test.ts.
raw local audit source, if any: none.
proof limitation, if any: I attempted an ad-hoc real wrapper proof through runCliEntry, but the tsx --eval proof script timed out before output. So the current pushed proof is still focused test/typecheck proof, not live audio/gateway proof. A redacted real wrapper transcript or real audio run is still the remaining proof upgrade.

Validation:

pnpm exec oxfmt --write src/media-understanding/runner.entries.ts src/media-understanding/runner.cli-audio.test.ts
pnpm exec oxfmt --check --threads=1 src/media-understanding/runner.entries.ts src/media-understanding/runner.cli-audio.test.ts
git diff --check
timeout -k5s 150s node scripts/run-vitest.mjs src/media-understanding/runner.cli-audio.test.ts -t "whisper-cli inferred|inferred whisper transcript|parakeet-mlx inferred|local whisper|node wrapper"
timeout -k5s 180s node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo

Result:

formatter write/check passed
git diff --check passed
focused Vitest passed: 1 file, 8 tests passed, 4 skipped
core test typecheck passed

Current state:

conflicts: none known before this push; CI will verify
CI: running
proof: supplied but still needs real behavior proof for ClawSweeper clearance
rating target: platinum minimum, diamond preferred after real proof
current rating before this push: unranked krab because patch scope was too narrow and proof was unit-only
remaining blocker, if any: real behavior proof from a local wrapper or real audio transcription path

@clawsweeper re-review

clawsweeper · 2026-05-27T19:59:26Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Superseded
Detail: A newer re-review for this item started before this run finished, so GitHub cancelled this older run. Check the latest ClawSweeper run for the current result.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26535409711
Updated: 2026-05-27T20:09:22.399Z

kesslerio · 2026-05-27T20:07:37Z

Proof: real Telegram audio path at PR head `ec4e753`

I added a channel-level proof pass after the earlier focused test/typecheck proof.

Setup:

disposable OpenClaw staging gateway, started from this PR head
Telegram provider: @kesslerClawBot
inbound surface: Telegram DM voice message
audio media model: CLI node .../local-whisper/transcribe.js {{MediaPath}} --output-dir {{OutputDir}}
proof wrapper behavior: prints typical Whisper progress to stdout, writes an empty inferred transcript file, exits successfully
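
A wrapper with the behavior listed above could look roughly like the sketch below. It is not the actual proof script: the function name `runProofWrapper` is hypothetical, and the transcript file name (`<media-basename>.txt` inside the output directory) is an assumption, since the thread only states that an empty inferred transcript file was written under {{OutputDir}}.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { basename, extname, join } from "node:path";

// Hypothetical stand-in for the proof wrapper: logs progress, writes an
// empty transcript file, and returns normally (a successful exit).
function runProofWrapper(argv: string[], log: (line: string) => void): string {
  const mediaPath = argv[0];
  const flagIndex = argv.indexOf("--output-dir");
  const outputDir = flagIndex === -1 ? undefined : argv[flagIndex + 1];
  if (!mediaPath || outputDir === undefined) {
    throw new Error("usage: transcribe <media> --output-dir <dir>");
  }
  // Progress banner on stdout, matching the lines seen in the gateway log.
  log("Whisper Voice Transcription");
  log("Model: proof");
  log("Transcribing with Whisper...");
  // Assumed naming scheme: <output-dir>/<media-basename>.txt
  const outPath = join(outputDir, basename(mediaPath, extname(mediaPath)) + ".txt");
  writeFileSync(outPath, ""); // deliberately empty transcript
  return outPath;
}

// Demo run against a temporary directory, capturing the banner lines.
const lines: string[] = [];
const outDir = mkdtempSync(join(tmpdir(), "proof-wrapper-"));
const written = runProofWrapper(["voice.ogg", "--output-dir", outDir], (l) => lines.push(l));
console.log(written, JSON.stringify(readFileSync(written, "utf8")), lines.length);
```

Forcing the empty-file condition this way keeps the proof deterministic without depending on a real Whisper model.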

Redacted gateway transcript:

2026-05-27T13:05:57.270-07:00 [telegram] [default] starting provider (@kesslerClawBot)
2026-05-27T13:06:00.435-07:00 [telegram] update: {... "voice":{"duration":6,"mime_type":"audio/ogg","file_size":125925}}
2026-05-27T13:06:03.048-07:00 audio-preflight: transcribing attachment 0 for mention check
2026-05-27T13:06:03.064-07:00 Media understanding via CLI: node .../proof/local-whisper/transcribe.js .../media/inbound/<redacted>.ogg --output-dir /tmp/openclaw/openclaw-media-cli-<redacted>
2026-05-27T13:06:03.151-07:00 Whisper Voice Transcription
Model: proof
Transcribing with Whisper...
2026-05-27T13:06:03.162-07:00 Media understanding audio: skipped (0/1) reason=empty output
2026-05-27T13:06:03.182-07:00 telegram inbound: ... preview="[Telegram Art Kessler (@kesslerIO) id:8352721935 +1h Wed 2026-05-27 13:05 PDT] <media:audio>"
2026-05-27T13:06:03.191-07:00 [telegram] Inbound message telegram:<redacted> -> @kesslerClawBot (direct, audio/ogg, 13 chars)
2026-05-27T13:06:03.809-07:00 Media understanding via CLI: node .../proof/local-whisper/transcribe.js .../media/inbound/<redacted>.ogg --output-dir /tmp/openclaw/openclaw-media-cli-<redacted>
2026-05-27T13:06:03.918-07:00 Whisper Voice Transcription
Model: proof
Transcribing with Whisper...
2026-05-27T13:06:03.926-07:00 Media understanding audio: skipped (0/1) reason=empty output

What this proves:

the real Telegram DM/audio path downloaded the voice note and invoked media understanding from this PR head
the wrapper produced stdout progress (Whisper Voice Transcription, Transcribing with Whisper...)
because the inferred transcript file was empty, OpenClaw treated it as authoritative empty output and skipped transcription
the message body stayed as <media:audio>; progress stdout was not injected as transcript text

Additional direct CLI check against the same disposable staging config:

OPENCLAW_CONFIG_PATH=.../openclaw.json OPENCLAW_STATE_DIR=.../state \
  node scripts/run-node.mjs infer audio transcribe --file /tmp/openclaw-whisper-proof-o3cW2g/sample.wav --json

Error: No transcript returned for audio: /tmp/openclaw-whisper-proof-o3cW2g/sample.wav

That direct check also confirms stdout progress did not become a transcript.

Validation already run on this head:

pnpm exec oxfmt --check --threads=1 src/media-understanding/runner.entries.ts src/media-understanding/runner.cli-audio.test.ts
git diff --check
timeout -k5s 150s node scripts/run-vitest.mjs src/media-understanding/runner.cli-audio.test.ts -t "whisper-cli inferred|inferred whisper transcript|parakeet-mlx inferred|local whisper|node wrapper"
timeout -k5s 180s node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo

@clawsweeper re-review

kesslerio · 2026-05-27T20:25:41Z

Updated after the latest ClawSweeper review.

Head: ba1dbe8bc36a1ebf53b9a82011fcc9993818e0e6

What changed:

Restored stdout fallback for parakeet-mlx --output-dir when --output-format txt is omitted.
Kept authoritative .txt file behavior only for explicit parakeet txt output.
Added the focused regression for omitted --output-format.
Updated the PR body to use the exact real-behavior proof fields parsed by CI.

Feedback addressed:

ClawSweeper P1: keep parakeet defaults out of authoritative txt mode.
Real behavior proof CI: required field labels are now present and the local policy check passes.

Validation:

pnpm exec oxfmt --check --threads=1 src/media-understanding/apply.test.ts src/media-understanding/runner.entries.ts src/media-understanding/runner.cli-audio.test.ts
git diff --check
timeout -k5s 180s node scripts/run-vitest.mjs run --config test/vitest/vitest.media-understanding.config.ts --reporter=verbose src/media-understanding/runner.cli-audio.test.ts -t "parakeet|whisper-cli inferred|inferred whisper transcript|local whisper|node wrapper"
timeout -k5s 180s node scripts/run-vitest.mjs run --config test/vitest/vitest.media-understanding.config.ts --reporter=verbose src/media-understanding/runner.cli-audio.test.ts src/media-understanding/apply.test.ts
timeout -k5s 180s node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
node --input-type=module <local real-behavior-proof-policy check against PR body>

Result:

formatting and git diff --check passed
focused parakeet/Whisper/local-wrapper tests passed: 9 passed, 4 skipped
broader media-understanding audio files passed: 65 passed
core TypeScript check passed
local real-behavior proof policy evaluation passed

Current state:

CI: running on the new head, no current failures visible at the time of this comment
proof: supplied; PR body now uses parseable template fields
remaining blocker expected from previous review: addressed

@clawsweeper re-review

clawsweeper · 2026-05-27T20:25:44Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26536749903
Updated: 2026-05-27T20:35:54.171Z

kesslerio · 2026-05-27T21:23:40Z

Addressed the ClawSweeper P2 docs feedback on head f85508ba3b.

What changed:

Updated docs/nodes/media-understanding.md and docs/nodes/audio.md so the parakeet note now says <output-dir>/<media-basename>.txt is read only when --output-format txt is explicit.
Clarified that omitted or non-txt parakeet output formats use stdout fallback.
No runtime behavior changed in this follow-up commit.
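
The documented parakeet rule can be sketched as a small path-inference function. The names `flagValue` and `parakeetTranscriptPath` are hypothetical, the sketch only handles space-separated `--flag value` arguments, and it assumes `<media-basename>` means the media file name without its extension.

```typescript
import { basename, extname, join } from "node:path";

// Returns the value following a flag, or undefined when the flag is absent.
function flagValue(args: string[], flag: string): string | undefined {
  const i = args.indexOf(flag);
  return i === -1 ? undefined : args[i + 1];
}

// Hypothetical sketch: a transcript path is inferred only when
// --output-format txt is explicit and --output-dir is present.
// A null result means the caller keeps the stdout fallback.
function parakeetTranscriptPath(args: string[], mediaPath: string): string | null {
  const format = flagValue(args, "--output-format");
  const outputDir = flagValue(args, "--output-dir");
  if (format !== "txt" || outputDir === undefined) return null;
  return join(outputDir, basename(mediaPath, extname(mediaPath)) + ".txt");
}

console.log(parakeetTranscriptPath(["--output-dir", "/tmp/out"], "/media/a.ogg"));
console.log(parakeetTranscriptPath(["--output-format", "txt", "--output-dir", "/tmp/out"], "/media/a.ogg"));
```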

Validation:

pnpm exec oxfmt --check --threads=1 docs/nodes/audio.md docs/nodes/media-understanding.md
git diff --check
pnpm test src/media-understanding/runner.cli-audio.test.ts src/media-understanding/apply.test.ts
pnpm check:docs

Results:

formatting and diff whitespace checks passed
media-understanding tests passed: 2 files, 65 tests
docs checks passed: format, markdownlint, MDX, i18n glossary, links

This should address the remaining author-facing blocker from the durable ClawSweeper review.

@clawsweeper re-review

clawsweeper · 2026-05-27T21:23:43Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26539608456
Updated: 2026-05-27T21:31:43.650Z

kesslerio · 2026-05-27T22:32:11Z

@clawsweeper re-review

Current head 658e605b17fdada9b985d2c3902cd3d1ebe1b980 is merged with current origin/main; the previously red CI is now green, and the PR body CI section has been updated with the merge-refresh validation.

clawsweeper · 2026-05-27T22:32:13Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26542675523
Updated: 2026-05-27T22:40:01.655Z

kesslerio · 2026-06-02T03:47:06Z

@clawsweeper re-review

Current head 3f383768675217b456189d972867a1d0bdde7858 is refreshed against current origin/main (7d5d62511f). The previous Dependency Guard run failed because it checked out 53ad531df9, where scripts/github/dependency-guard.mjs did not exist; this head contains that script.

What changed since the last review:

merged current origin/main into the PR branch
kept the PR diff scoped to the same five media/doc/test files
updated the PR body proof section to use the parseable Real behavior proof fields and current head SHA
added a public proof gist with current-head direct runtime output plus the redacted live Telegram gateway excerpt: https://gist.github.com/kesslerio/97110dff154bef75ddb80f35c5cdc8e8

Validation run locally:

pnpm exec oxfmt --check --threads=1 docs/nodes/audio.md docs/nodes/media-understanding.md src/media-understanding/apply.test.ts src/media-understanding/runner.cli-audio.test.ts src/media-understanding/runner.entries.ts
git diff --check origin/main...HEAD
timeout -k5s 180s node scripts/run-vitest.mjs run --config test/vitest/vitest.media-understanding.config.ts --reporter=verbose src/media-understanding/runner.cli-audio.test.ts -t "parakeet|whisper-cli inferred|inferred whisper transcript|local whisper|node wrapper"
timeout -k5s 180s node scripts/run-vitest.mjs run --config test/vitest/vitest.media-understanding.config.ts --reporter=verbose src/media-understanding/runner.cli-audio.test.ts src/media-understanding/apply.test.ts

Results:

formatting passed
diff whitespace check passed
focused parakeet/Whisper/local-wrapper tests passed: 9 passed, 4 skipped
broader media-understanding tests passed: 65 passed
current-head direct proof passed: runCliEntry returned null, so wrapper progress stdout was not exposed as transcript text

One validation limitation: the broad core tsgo command now fails on unrelated current-main declaration issues for proper-lockfile, cross-spawn, and hosted-git-info; no media-understanding files are involved in that failure.
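The direct-proof result above (runCliEntry returning null instead of the wrapper's progress output) can be illustrated with a minimal sketch. This is not the code in src/media-understanding/runner.entries.ts: the `TranscriptSource` type and the `pickTranscript` function are hypothetical names, and the sketch assumes an empty stdout fallback also yields null.

```typescript
// Hypothetical sketch; names are illustrative and not taken from the PR diff.

type TranscriptSource =
  | { kind: "file"; text: string | null } // inferred transcript file; null means the file is missing
  | { kind: "stdout" }; // no deterministic transcript file could be inferred

function pickTranscript(source: TranscriptSource, stdout: string): string | null {
  if (source.kind === "file") {
    // An inferred transcript file is authoritative: never fall back to stdout,
    // which may contain only progress or status banners.
    const fileText = source.text?.trim() ?? "";
    return fileText.length > 0 ? fileText : null;
  }
  // Non-authoritative CLI paths keep the stdout fallback.
  // Assumption: an empty stdout is also treated as "no transcript".
  const stdoutText = stdout.trim();
  return stdoutText.length > 0 ? stdoutText : null;
}

// Example: the wrapper prints a progress banner but writes an empty transcript file.
const progressBanner = "whisper: processing audio...";
const emptyFileResult = pickTranscript({ kind: "file", text: "" }, progressBanner);
const stdoutResult = pickTranscript({ kind: "stdout" }, "hello there\n");
```

Here `emptyFileResult` is null, so the banner never reaches the conversation body, while `stdoutResult` is still taken from stdout because no transcript file was inferred for that path.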

clawsweeper · 2026-06-02T03:47:08Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/26797117965
Updated: 2026-06-02T03:53:09.530Z

fix(media): suppress local whisper progress transcripts

fc49c6d

openclaw-barnacle Bot added the "size: M" and "proof: supplied" (External PR includes structured after-fix real behavior proof.) labels on May 27, 2026

chatgpt-codex-connector Bot reviewed May 27, 2026


clawsweeper Bot added the "rating: 🧂 unranked krab" (Not merge-ready due to missing proof or serious correctness/safety concerns.) and "status: 📣 needs proof" (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.) labels on May 27, 2026

fix(media): make inferred transcript files authoritative

ec4e753

openclaw-barnacle Bot added the "triage: needs-real-behavior-proof" label (Candidate: external PR needs after-fix proof from a real setup.) and removed the "proof: supplied" label (External PR includes structured after-fix real behavior proof.) on May 27, 2026

test(media): align whisper-cli autodetect fixture

304b620

openclaw-barnacle Bot removed the "proof: sufficient" label (ClawSweeper judged the real behavior proof convincing.) on May 27, 2026

fix(media): preserve parakeet default stdout fallback

ba1dbe8

openclaw-barnacle Bot added the "proof: supplied" label (External PR includes structured after-fix real behavior proof.) and removed the "triage: needs-real-behavior-proof" label (Candidate: external PR needs after-fix proof from a real setup.) on May 27, 2026

docs(media): clarify parakeet txt output requirement

f85508b

openclaw-barnacle Bot added the "docs" label (Improvements or additions to documentation) on May 27, 2026

Merge remote-tracking branch 'origin/main' into HEAD

658e605

openclaw-barnacle Bot removed the "proof: sufficient" label (ClawSweeper judged the real behavior proof convincing.) on May 27, 2026

clawsweeper Bot added the "proof: sufficient" label (ClawSweeper judged the real behavior proof convincing.) on May 27, 2026

RomneyDa added dependency-guard-backfill and removed dependency-guard-backfill labels May 29, 2026

Merge remote-tracking branch 'origin/main' into fix/local-whisper-cli…

3f38376

…-output

