Skip to content

fix: drop redundant decodeURIComponent and repair urlencoded schema validation#267

Merged
danadajian merged 2 commits into
ExpediaGroup:mainfrom
lzeligman-mark43:fix/remove-redundant-decode-uri-component
May 26, 2026
Merged

fix: drop redundant decodeURIComponent and repair urlencoded schema validation#267
danadajian merged 2 commits into
ExpediaGroup:mainfrom
lzeligman-mark43:fix/remove-redundant-decode-uri-component

Conversation

@lzeligman-mark43
Copy link
Copy Markdown
Contributor

@lzeligman-mark43 lzeligman-mark43 commented May 21, 2026

Summary

Fixes two bugs in lambda/parse-request-body.ts that together break every valid application/x-www-form-urlencoded GitHub webhook. Both bugs were introduced by the recent 094cda1 (feat(repo): switch to bun) refactor that introduced URLSearchParams.

Bug 1 — schema is parsed against the wrong shape

const params = new URLSearchParams(body);
const payloadParam = params.get("payload");        // string | null
const { payload } = bodySchema.parse(payloadParam); // bodySchema is z.object({ payload: z.string() })

URLSearchParams.get returns the value of the payload= field as a string | null. bodySchema is declared as z.object({ payload: z.string() }), so zod throws Invalid input: expected object, received string and parseRequestBody returns undefined. The handler then 403s every valid urlencoded GitHub webhook.

The existing fixture-based test (fixtures/invalid-payload-urlencoded.txt) hides this because the fixture is deliberately invalid and the test only checks that invalid payloads are rejected — it does not exercise a valid urlencoded payload's success path.

Bug 2 — redundant decodeURIComponent throws on literal %

return JSON.parse(decodeURIComponent(payload));

URLSearchParams.get already URL-decodes the value once (%7B{, etc.). Calling decodeURIComponent a second time on the already-decoded JSON string throws URIError: URI malformed whenever the JSON contains a bare % that doesn't begin a valid %XX escape — extremely common in real user content:

Pattern Example
Percentage in PR title / commit message 30% threshold rollout, 100% complete
Trailing % rate: 5%
Windows env-var reference set %USERPROFILE% to ~, %PATH%
SQL LIKE wildcard WHERE name LIKE '%foo%'
C / Go format verbs printf("%s\n", val), log.Printf("%v", obj)
Bare % followed by non-hex look here: %g and %h

GitHub sends webhook bodies single-URL-encoded per the form-urlencoded spec; the second decode was never needed for correct decoding of real traffic.

Fix

       case CONTENT_TYPES.URL_ENCODED:
         const params = new URLSearchParams(body);
-        const payloadParam = params.get("payload");
-        const { payload } = bodySchema.parse(payloadParam);
-        return JSON.parse(decodeURIComponent(payload));
+        const { payload } = bodySchema.parse({
+          payload: params.get("payload"),
+        });
+        return JSON.parse(payload);
  1. Wraps params.get("payload") into the { payload: string } shape bodySchema expects, so validation actually runs.
  2. Drops the redundant decodeURIComponent so literal % characters in user content no longer crash the parse.

Tests

Adds 9 new test cases in lambda/proxy.test.ts:

  • One end-to-end test through the handler driving a urlencoded webhook whose JSON contains literal % characters (multiple patterns combined).
  • A new describe("parseRequestBody — urlencoded payloads with literal '%' characters", …) block exercising parseRequestBody directly for each of the user-content patterns listed in the table above, plus:
    • A control with no special characters to confirm normal payloads still parse.
    • A %20 preservation test confirming that literal %20 in a JSON string value is preserved verbatim (not re-decoded to a space), which is the correct semantic now that the outer URLSearchParams.get is the sole decode pass.

Local run: 22 pass, 0 fail (12 existing + 10 new).

Backwards-compatibility analysis

The only scenario where removing the second decode would change behavior is a hypothetical client that double-URL-encodes the body. GitHub never does this. Even if it happened, the lambda only inspects enterprise.slug and sender.login for the routing/auth decision (both are GitHub identifier strings that cannot contain %), and the raw original body is forwarded byte-for-byte to the destination via axios.post(url, body, …) — so the routing decision and forwarded payload are unchanged in that edge case.

Notes

  • No deps changed.
  • bun run format-check clean.
  • bun test passes 22/22.

…edundant decodeURIComponent

The current `URL_ENCODED` branch of `parseRequestBody` has two bugs that combine to make every valid urlencoded GitHub webhook fail:

1. `bodySchema.parse(payloadParam)` is called with a `string | null` returned by `URLSearchParams.get("payload")`, but `bodySchema` is `z.object({ payload: z.string() })` — i.e. it expects an *object* with a `payload` key. zod throws `Invalid input: expected object, received string` on every valid payload, so the function returns `undefined`. The pre-existing fixture test (`fixtures/invalid-payload-urlencoded.txt`) hides this because it only exercises an invalid payload that's expected to be rejected with 403.

2. `JSON.parse(decodeURIComponent(payload))` calls `decodeURIComponent` a second time on a value `URLSearchParams.get` has already URL-decoded. This throws `URIError: URI malformed` whenever the decoded JSON contains a literal `%` that isn't part of a valid `%XX` escape — which is common in real PR titles, commit messages, and comments ("30% threshold", "set %USERPROFILE% to ~", "WHERE x LIKE '%foo%'", "printf(\"%s\", val)").

Fixes:

- Wrap the `URLSearchParams.get("payload")` value into the `{ payload: string }` shape the schema expects, so schema validation actually runs against valid payloads.
- Drop the redundant `decodeURIComponent(payload)` — `URLSearchParams.get` already URL-decodes once, which is exactly what GitHub's single-URL-encoded webhook body needs.

Tests: 9 new cases in `lambda/proxy.test.ts` cover the percent-character regression patterns plus a control and a `%20`-preservation case, exercised both end-to-end through the handler and directly against `parseRequestBody`.
Comment thread lambda/parse-request-body.ts Outdated
Comment thread lambda/proxy.test.ts Outdated
Comment thread lambda/proxy.test.ts Outdated
Comment thread lambda/proxy.test.ts Outdated
Co-authored-by: Dan Adajian <danadajian@gmail.com>
@danadajian
Copy link
Copy Markdown
Contributor

Thanks for the fix @lzeligman-mark43!

@danadajian danadajian enabled auto-merge (squash) May 26, 2026 17:46
@danadajian danadajian merged commit 94e24a1 into ExpediaGroup:main May 26, 2026
3 checks passed
@eg-oss-ci
Copy link
Copy Markdown
Contributor

🎉 This PR is included in version 2.4.2 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants