Skip to content

pr-comment: pipe changes payload via stdin to avoid jq ARG_MAX failure#118

Merged
reuvenharrison merged 2 commits into
mainfrom
fix/pr-comment-jq-arg-list-too-long
May 22, 2026
Merged

pr-comment: pipe changes payload via stdin to avoid jq ARG_MAX failure#118
reuvenharrison merged 2 commits into
mainfrom
fix/pr-comment-jq-arg-list-too-long

Conversation

@reuvenharrison
Copy link
Copy Markdown
Contributor

Summary

The pr-comment entrypoint built its JSON payload by passing the diff content as a jq --argjson changes "$changes" value. For specs whose changelog runs into the multi-megabyte range, the JSON string exceeds the OS argument-length limit (ARG_MAX, typically 2 MB on Linux), and jq aborts with:

/entrypoint.sh: line N: jq: Argument list too long

This happens after the script has printed the free ::notice:: review URL but before the POST to oasdiff-service, so the customer sees a /review?... notice in their action log, no PR comment from oasdiff[bot], and assumes the Pro workflow swap had no effect. Silent failure under set -e because the notice line has already been printed and the workflow step exit code propagates without an obvious "this is the cause" pointer.

Fix

Pipe $changes through stdin instead. printf is a shell builtin in POSIX /bin/sh / busybox ash, so the variable is never on argv and ARG_MAX is irrelevant. The jq filter changes $changes to . (the implicit stdin input), and the -n (null-input) flag is dropped since jq now consumes real input.

-payload=$(jq -n \
+payload=$(printf '%s' "$changes" | jq \
     --arg token "$github_token" \
     ...
-    --argjson changes "$changes" \
-    '{... changes: $changes}')
+    '{... changes: .}')

The other --arg values (token, owner/repo, SHAs, file paths) are short and stay on the command line unchanged.

Regression test

A new job pr_comment_handles_large_payload in .github/workflows/test.yaml stubs oasdiff to emit a ~4 MB synthetic changelog (two entries with 2 MB of filler each, well above the 2 MB Linux ARG_MAX ceiling) and asserts the entrypoint reaches the "No oasdiff-token provided" skip without aborting. If someone later reverts to --argjson changes "$changes", the synthetic payload trips ARG_MAX on the runner and the test fails with a specific error message pointing at the cause.

Follows the same stub-oasdiff-on-PATH pattern as the existing pr_comment_tolerates_oasdiff_fail_on and pr_comment_aborts_on_oasdiff_real_failure jobs.

Test plan

  • bash -n pr-comment/entrypoint.sh clean
  • YAML parser accepts the workflow file
  • Synthetic 2 MB / 5000-change payload locally constructs a valid JSON object via the new stdin path
  • On merge, the new CI job confirms the multi-MB path works end-to-end via the stubbed entrypoint

Discovered by

A Pro customer trying to set up the pr-comment workflow on a multi-thousand-endpoint spec. Their action log surfaced the exact jq: Argument list too long error from line 89; the fix shape was unambiguous from there.

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

Generated with Claude Code

reuvenharrison and others added 2 commits May 22, 2026 19:40
The pr-comment entrypoint built its JSON payload via `jq -n --argjson
changes "$changes"`, which puts the entire JSON string on jq's command
line. For real-world specs the changes array can run into the
multi-megabyte range, exceeding the OS argument-length limit (ARG_MAX,
typically 2 MB on Linux). The script then aborts with:

    /entrypoint.sh: line N: jq: Argument list too long

right before the POST to oasdiff-service, so the customer sees:

  * The free `::notice::` review URL printed earlier (which is
    indistinguishable from what the free `breaking` action would emit)
  * No PR comment from oasdiff[bot] (because the service never received
    the report)
  * No commit-status check transition

...and assumes either the workflow swap from `breaking` to `pr-comment`
didn't take effect, or that approve/reject is broken for them. The
action's failure is silent in the workflow's overall status because the
script aborts under `set -e` after the notice line has already printed.

Fix: pipe `$changes` to jq via stdin and reference it as `.` in the
filter expression. `printf` is a shell builtin in /bin/sh / busybox ash,
so the variable is never on argv and ARG_MAX is irrelevant. The other
`--arg` values are short (token, owner/repo, SHAs, file paths) and stay
on the command line unchanged.

Adds a regression test in .github/workflows/test.yaml that stubs
oasdiff to emit a ~4 MB changelog and asserts the entrypoint reaches
the no-token skip without aborting. If the regression returns (someone
reverts to `--argjson changes`), the synthetic payload will trip
ARG_MAX on the CI runner and the test fails with a precise error
message pointing at the cause.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The new pr_comment_handles_large_payload job ran `yes a | head -c
2000000 | tr -d '\n'` to build the filler string. Under the workflow
shell's `set -o pipefail`, `yes` exits with SIGPIPE (141) when `head`
closes the pipe, which fails the pipeline and aborts the step before
the test even runs. Classic pattern: infinite source plus bounded
head plus pipefail.

Switch to `dd if=/dev/zero bs=1024 count=2000 status=none | tr '\0'
'a'`. dd reads a bounded number of zero bytes and exits 0 cleanly,
so the pipeline succeeds under pipefail. Output is identical: a 2 MB
string of 'a' characters.

Confirmed locally with `set -o pipefail` enabled: the pipeline
returns 0 and produces 2 048 000 bytes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@reuvenharrison reuvenharrison merged commit e8d6e98 into main May 22, 2026
38 checks passed
reuvenharrison added a commit that referenced this pull request May 22, 2026
)

Same class of bug as #118 (jq ARG_MAX), one layer down. After #118
fixed the jq invocation, the next line on the failure path is

    response=$(curl ... -d "$payload")

The full JSON payload (~4 MB for the customer spec that surfaced the
jq case) goes through curl's argv via `-d`, exceeds the OS argument-
length limit (ARG_MAX, typically 2 MB on Linux), and the script
aborts with:

    /entrypoint.sh: line 124: curl: Argument list too long

right after the (now-successful) jq step, with the same downstream
symptoms: free `::notice::` URL printed, no PR comment posted, no
review-token link generated. Customer saw "same problem" after
bumping their pin to @main with #118 applied — the jq step now
works but the next layer hits the same trap.

Fix: pipe the payload to curl via stdin and use `--data-binary @-`
to consume from stdin. printf is a shell builtin so the variable
never goes through execve. Same shape as the jq fix.

Extends the test workflow with `pr_comment_curl_handles_large_payload`,
a companion to the existing `pr_comment_handles_large_payload`. The
previous test exercises the jq path with an empty oasdiff_token,
which short-circuits past the curl step. The new test exercises the
curl step itself by providing a non-empty token and stubbing both
oasdiff (multi-MB changelog source) and curl (records the body byte
count from stdin and emits a fake 200 response). Asserts the curl
stub was invoked with >4 MB of body — proves the payload made it
through stdin without hitting ARG_MAX on argv.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant