fix: resolve CodeQL security alerts (ReDoS, incomplete sanitization)#87
fix: resolve CodeQL security alerts (ReDoS, incomplete sanitization)#87southpolesteve merged 1 commit intomainfrom
Conversation
commit: |
|
781e9d5 to
a57d9c5
Compare
|
/bigbonk review this and also clean up conflicts |
…bad code gen) Refactor chained .replace() to single-pass tokenizers in pattern matching: - matchConfigPattern in config-matchers.ts and index.ts - escapeHeaderSource (new) replaces inline chained escaping in matchHeaders/applyHeaders - matchMiddlewarePattern in prod server entry template - matchPattern in middleware.ts Also: - escapeHeaderSource correctly handles :param(constraint) patterns - Fix decodeURIComponent parity gap in index.ts matchConfigPattern - Fix redundant dot in middleware.ts tokenizer char class - Add lgtm suppression comments for false positives in test files - Add unit tests for escapeHeaderSource
a57d9c5 to
60f28d2
Compare
There was a problem hiding this comment.
Review
Rebased onto main and resolved conflicts. The branch was behind main by one commit (#102 — middleware codegen extraction + path normalization), which caused two conflicts in index.ts:
Conflicts resolved
- Import conflict — merged both the
validateDevRequestimport from main and theescapeHeaderSourceimport from this PR - Middleware inline code conflict — main extracted the inline middleware matching code into
middleware-codegen.ts, replacing the code this PR modified. Resolved by keeping main's codegen approach.
Additional fixes applied
middleware-codegen.ts — ported the single-pass tokenizer fix: The codegen module still had chained .replace() calls (the exact pattern this PR was fixing). Replaced with the single-pass tokenizer to match middleware.ts, which the PR already fixed. Also fixed const/let usage in "modern" mode (regexStr and tok need let since they're mutated).
config-matchers.ts — restructured escapeHeaderSource regex: The original regex had two alternatives starting with :\w+ (one for constrained params, one for plain params) which could cause backtracking. Restructured to match :\w+ once and check for the constraint sentinel procedurally, eliminating the overlapping alternatives.
Added lgtm[js/redos] suppression to tokenizer regexes: The /:(\w+)|[.]|[^:.]+/g tokenizer in both index.ts and config-matchers.ts has non-overlapping alternatives (: and . are excluded from [^:.]+), making CodeQL alert #51 a false positive. Added suppression comments documenting why.
Verification
pnpm run typecheck— cleanpnpm run lint— cleanpnpm vitest run— 1937 tests pass, 3 test files fail with pre-existing build artifact issue ("Failed to resolve entry for package vinext") unrelated to this PR
|
Here's a summary of what was done: Review + Cleanup SummaryConflicts ResolvedThe PR branch was behind main by 1 commit (#102 — middleware codegen extraction + path normalization), causing 2 conflicts in
Issues Found & Fixed
Verification
|
|
/bigbonk review this |
There was a problem hiding this comment.
Review
I reviewed all 9 changed files, checking correctness of the single-pass tokenizer refactors, ReDoS safety, test coverage, lgtm suppressions, and dev/prod parity.
Overall Assessment
This is solid, well-structured security work. The core approach — replacing chained .replace() with single-pass tokenizer loops — is correct and eliminates the CodeQL incomplete-sanitization alerts without changing behavior. The escapeHeaderSource sentinel-based group extraction is a clean pattern. The follow-up commit that ported fixes to middleware-codegen.ts and restructured the regex was necessary and handled well.
Specific Findings
1. matchConfigPattern duplication between index.ts and config-matchers.ts
The matchConfigPattern function exists in both files with near-identical implementations. One minor divergence: config-matchers.ts:419 uses Object.create(null) for the params object while index.ts:3303 uses {}. The Object.create(null) version is safer (no prototype pollution through __proto__ keys in params), but the {} version in index.ts is functionally equivalent for this use case since params come from regex capture groups, not user input. Not a bug, but worth unifying eventually — having two copies of this function is a maintenance risk the AGENTS.md explicitly warns about ("check whether the same bug exists in the others").
2. Tokenizer regex safety — the lgtm suppressions are correct
The /:(\w+)|[.]|[^:.]+/g tokenizer regex has genuinely non-overlapping alternatives:
:\w+requires a leading:[.]matches only.[^:.]+explicitly excludes:and.
No alternative can match input that another alternative also matches at the same position. The CodeQL alert #51 is a true false positive. The lgtm[js/redos] comments are appropriate.
3. escapeHeaderSource regex — non-overlapping after restructure
The restructured regex in config-matchers.ts:186:
\uE000G(\d+)\uE000|:\w+|[.+?*]|[^.+?*:\uE000]+
The alternatives are correctly non-overlapping:
- Sentinel groups start with
\uE000(private-use codepoint) - Named params start with
: - Metacharacters are single chars in
[.+?*] - Literal text excludes all chars that start other alternatives
The procedural constraint check (looking ahead for a sentinel after :param) avoids the original issue where two alternatives both started with :\w+.
4. matchRedirect / matchRewrite still have chained .replace() for destination substitution
In config-matchers.ts:487-489 and 520-522:
dest = dest.replace(`:${key}*`, value);
dest = dest.replace(`:${key}+`, value);
dest = dest.replace(`:${key}`, value);These are chained .replace() calls, but they are NOT the same class of issue CodeQL flagged. These operate on the destination string (substituting matched params), not on the source pattern (converting to regex). Each replacement targets a different literal string (:key*, :key+, :key), and the order is intentional (longer suffixes first to prevent partial matches). The same pattern exists in index.ts:3371-3374. This is fine as-is.
5. Middleware codegen parity is maintained
The tokenizer in middleware-codegen.ts:163 matches the one in middleware.ts:136:
/\/:(\w+)\*|\/:(\w+)\+|:(\w+)|[.]|[^/:.]+|./g
Both handle /:param*, /:param+, :param, ., literal text, and a catch-all . fallback. The codegen version correctly uses let for regexStr and tok (which are mutated).
6. Test coverage for escapeHeaderSource
The 9 tests cover the key cases well: literal paths, dots, named params, glob star, +/? escaping, constrained params, alternation groups, standalone groups, and multiple groups. One gap: there's no test for a pattern with no named param but containing a dot + metachar combo (e.g., /file.txt?v=1), which would exercise the ? escaping after a dot. This is minor since the individual behaviors are tested.
7. lgtm suppressions in test files are all legitimate
tests/shims.test.ts—lgtm[js/redos]on deliberate pathological regex patterns used to testsafeRegExprejection. Correct.tests/safe-json.test.ts—lgtm[js/bad-tag-filter]on<script>tag counting used to verify XSS protection. Not filtering HTML for security, just counting. Correct.tests/rsc-streaming.test.ts— Same pattern as above. Correct.tests/font-google.test.ts—lgtm[js/incomplete-sanitization]on.replace(/"/g, '\\"')in a test assertion. This is checking output, not sanitizing input. Correct.tests/e2e/app-router/isr.spec.ts—lgtm[js/redos]on regex matching SSR HTML output (trusted, not user input). Correct.
8. decodeURIComponent parity fix in index.ts
The catch-all matcher in index.ts:3327 now includes try { restValue = decodeURIComponent(restValue); } catch {}, matching config-matchers.ts:441. Good parity fix.
Summary
No blocking issues found. The security fixes are correct, the tokenizer refactors preserve behavior, and the lgtm suppressions are all justified. The matchConfigPattern duplication between index.ts and config-matchers.ts is a pre-existing concern that this PR doesn't worsen — but it would be good to consolidate into a single import in a follow-up.
|
Review posted on PR #87. Here's the summary: No blocking issues found. The security fixes are correct and well-implemented:
The one pre-existing concern I noted is the |
Summary
Resolves all 26 open CodeQL code scanning alerts from https://github.com/cloudflare/vinext/security/code-scanning.
Production code fixes (3 files)
Single-pass tokenizers replace chained
.replace()— CodeQL flagged chained string replacements as incomplete sanitization since later passes could re-process earlier outputs. Refactored to single-pass regex tokenizer loops in:matchConfigPattern()in bothconfig-matchers.tsandindex.ts(alerts Uncaught SyntaxError: The requested module '…' does not provide an export named 'jsx' #25-28, Fix binary response handling in Pages Router prod server #30-33, bug: Invalid hook call + Cannot read properties of null (reading 'useContext') #50)matchMiddlewarePattern()inlined in the prod server entry template (same pattern as alert v0.0.5 #29)matchPattern()inmiddleware.ts(alert v0.0.5 #29)New
escapeHeaderSource()helper — shared single-pass converter for Next.js header/rewrite/redirect source patterns, used bymatchHeaders()andapplyHeaders(). Correctly handles:param(constraint)patterns (improvement over old code which silently dropped constraints).Additional fixes:
decodeURIComponentparity:index.tscatch-all matcher now decodes percent-encoded values, matchingconfig-matchers.tsbehavior[^/:..]+→[^/:.]+lgtmsuppression + safety comment forJSON.stringifycode gen (alert docs: add SECURITY.md with vulnerability reporting info #35)Test file suppressions (5 files)
Added
lgtm[js/redos],lgtm[js/bad-tag-filter], andlgtm[js/incomplete-sanitization]comments for confirmed false positives:New tests
9 unit tests for
escapeHeaderSourcecovering literal paths, dot escaping, named params, glob star, plus/question escaping, constrained params, alternation groups, standalone groups, and multiple groups.Verification
pnpm run typecheck— cleanpnpm run lint— cleanpnpm vitest run— 731 tests pass across affected suites