Skip to content

Tighten turbopack chunk filename allowlist regex#8048

Merged
gilluminate merged 5 commits intomainfrom
gill/fix-turbopack-chunk-regex
Apr 29, 2026
Merged

Tighten turbopack chunk filename allowlist regex#8048
gilluminate merged 5 commits intomainfrom
gill/fix-turbopack-chunk-regex

Conversation

@gilluminate
Copy link
Copy Markdown
Contributor

@gilluminate gilluminate commented Apr 28, 2026

Ticket ENG-3597 - follow-up to #8018

Description Of Changes

#8018 introduced TURBOPACK_CHUNK_RE to keep turbopack-emitted admin-ui chunk names (which embed ..) from being rejected by the catch-all path-traversal guard in sanitise_url_path. The original regex (^[\w~-]+\.\.[a-z]+$) was too narrow: real turbopack chunks have dots both before and after the literal .. (e.g. 0~9kfdw7..yey.js, 0te-shorr2._..js, turbopack-0bk8.vt1kbb2..js), and #8022 had to revert turbopack to webpack to dodge the breakage. This branch reverts that revert and broadens the regex empirically.

The new pattern ^[\w~-][\w~.-]*\.\.[\w~.-]+$ matches the full turbopack chunk alphabet (word chars, ~, ., -) and just forbids a leading dot, so traversal payloads (.., ..foo, ../etc/passwd, ..%2f...) still fall through to the existing guard.

To make sure this isn't another whack-a-mole iteration, I sampled chunk filenames across 10 builds spanning 9 distinct turbopack-era source states (every commit between #7956 and HEAD that touched clients/admin-ui/src):

  • 1,189 unique chunks across the union, 150 in the intersection — real variance, not deterministic rebuilds
  • 44 unique chunks containing ..
  • 0 misses against the new regex

The regex's character class is now closed over turbopack's actual chunk alphabet, so future rebuilds can't introduce /, %, or other path-relevant characters without a fundamental change to turbopack's naming scheme.

Code Changes

  • clients/admin-ui/package.json: drop the --webpack pin from build/dev/analyze/build:test/build:vercel/export scripts; remove the now-stale //webpack-flag comment. Re-enables turbopack across the board.
  • src/fides/api/main.py: broaden TURBOPACK_CHUNK_RE and update the surrounding comment to describe the alphabet and the leading-dot rule.
  • tests/ops/util/test_api_router.py: extend the parametrized fixtures with the new turbopack chunk shapes (0~9kfdw7..yey.js, abc-123..part.min.js, 0te-shorr2._..js, a.b.c..js).

Steps to Confirm

  1. git checkout this branch and cd clients/admin-ui && rm -rf .next && npm run build. Confirm it builds with turbopack (no --webpack flag).
  2. Spin up the fides webserver and hit /_next/static/chunks/<some-chunk-with-dotdot>.js from the built output. Should serve 200, not redirect to /.
  3. Try a real traversal: curl -i 'http://localhost:8080/../../../../../../etc/passwd'. Should still be caught by the guard and redirected to the admin index.
  4. Run the regression tests: pytest tests/ops/util/test_api_router.py::TestApiRouter -k sanit (or just -k turbopack) — new fixtures should pass alongside the traversal-rejection cases.

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration This indicates that a change includes a database migration label to the entry if your change includes a DB migration
    • Add a high-risk This issue suggests changes that have a high-probability of breaking existing code label to the entry if your change includes a high-risk change (i.e. potential for performance impact or unexpected regression) that should be flagged
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • All UX related changes have been reviewed by a designer
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • Ensure that your downrev is up to date with the latest revision on main
    • Ensure that your downgrade() migration is correct and works
      • If a downgrade migration is not possible for this change, please call this out in the PR description!
    • No migrations
  • Documentation:
    • Documentation complete, PR opened in fidesdocs
    • Documentation issue created in fidesdocs
    • If there are any new client scopes created as part of the pull request, remember to update public-facing documentation that references our scope registry
    • No documentation updates required

@vercel
Copy link
Copy Markdown
Contributor

vercel Bot commented Apr 28, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project Deployment Actions Updated (UTC)
fides-plus-nightly Ignored Ignored Preview Apr 29, 2026 5:38pm
fides-privacy-center Ignored Ignored Apr 29, 2026 5:38pm

Request Review

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 28, 2026

Dependency Review

✅ No vulnerabilities found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 96cdcf0.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

gilluminate added a commit that referenced this pull request Apr 28, 2026
@gilluminate gilluminate force-pushed the gill/fix-turbopack-chunk-regex branch from 1deb7a6 to aaa6b28 Compare April 28, 2026 17:34
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 28, 2026

Title Lines Statements Branches Functions
admin-ui Coverage: 8%
6.31% (2800/44305) 5.56% (1404/25231) 4.41% (579/13106)
fides-js Coverage: 78%
79.39% (2011/2533) 65.99% (1240/1879) 73.09% (345/472)
privacy-center Coverage: 88%
85.97% (331/385) 81.36% (179/220) 78.87% (56/71)

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.96%. Comparing base (6cea3cf) to head (96cdcf0).
⚠️ Report is 6 commits behind head on main.

❌ Your project status has failed because the head coverage (84.96%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8048      +/-   ##
==========================================
- Coverage   84.97%   84.96%   -0.01%     
==========================================
  Files         633      633              
  Lines       41633    41659      +26     
  Branches     4869     4879      +10     
==========================================
+ Hits        35376    35397      +21     
- Misses       5149     5152       +3     
- Partials     1108     1110       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@gilluminate gilluminate force-pushed the gill/fix-turbopack-chunk-regex branch from aaa6b28 to 8162ef5 Compare April 28, 2026 18:25
@gilluminate gilluminate marked this pull request as ready for review April 28, 2026 18:26
@gilluminate gilluminate requested review from a team as code owners April 28, 2026 18:26
@gilluminate gilluminate requested review from dsill-ethyca and jpople and removed request for a team April 28, 2026 18:26
@rayharnett
Copy link
Copy Markdown
Contributor

Code Review: ENG-3597 — Tighten turbopack chunk filename allowlist regex

PR: #8048
Author: @gilluminate
Base: maingill/fix-turbopack-chunk-regex


🚨 Critical Issues (Must Fix)

1. Stale comment referencing old regex (main.py:68)

The comment on line 68 still describes the old regex pattern:

# Turbopack (Next.js 16+) chunk filenames can embed ".." before the extension,
# e.g. "0y3j4e~tvxaz..js". These are benign static assets, not traversal.

But the regex on line 70 has been broadened. The comment on lines 69–73 (the new block) is correct, but the original comment is now misleading and contradictory. It should be removed or merged with the new comment block.

Fix: Remove lines 68–69 (the old comment) since lines 69–73 already fully describe the pattern.


⚠️ Warnings (Should Fix)

2. CI: codecov/project check is failing

The codecov/project check is marked as failing. This is likely because the new test cases in test_api_router.py don't add enough net-new coverage to offset the baseline, or the codecov project-level threshold is set too high.

Action: Check the Codecov report for details. If the failure is a project-level threshold (not patch-level), this is a CI configuration issue, not a code issue. The patch-level check (codecov/patch) succeeded.


💡 Suggestions (Nitpicks / Improvements)

3. Consider .fullmatch() for semantic clarity (main.py:70)

The regex uses .match() on line 144:

TURBOPACK_CHUNK_RE.match(token)

Since the regex has ^ and $ anchors, .match() and .fullmatch() behave identically here. However, .fullmatch() makes the intent explicit — we're validating the entire token, not just a prefix. This is a minor readability improvement:

TURBOPACK_CHUNK_RE.fullmatch(token)

4. Test edge case: single-character trailing segment

The regex [\w~.-]+$ after .. requires at least one character. The test data includes a.b.c..js (trailing js) but not a minimal case like a..b. Adding a test for a single-char trailing segment would verify the + quantifier works as expected:

("/_next/static/chunks/a..b.js", False),  # minimal trailing segment

5. Changelog: consider adding security label

The change tightens a path-traversal allowlist regex. While this is a bug fix (re-enabling turbopack), it touches security-critical code. The changelog entry has labels: [] — consider whether a security or improvement label would be appropriate for the release notes.


✅ Good Practices

  • Empirical validation: Sampling 1,189 unique chunks across 10 builds with 0 misses against the new regex is excellent engineering discipline. This is exactly the right approach to avoid whack-a-mole.
  • Test coverage: The new test cases (0~9kfdw7..yey.js, abc-123..part.min.js, 0te-shorr2._..js, a.b.c..js) cover the key edge cases: tilde in middle, hyphen, dots before and after ...
  • Commit discipline: Three clean, focused commits (regex fix → turbopack restore → changelog) make the history easy to bisect.
  • PR description: Thorough, well-structured, with clear verification steps. The empirical analysis section is a model for similar fixes.
  • Security preservation: The leading-dot restriction ([\w~-] as first char) correctly prevents bypasses via ., .., ... prefixed tokens. All traversal test cases still pass.

Summary

This is a well-researched, narrowly-scoped fix. The regex change is sound and the test coverage is adequate. The only action item is removing the stale comment (Critical #1). The failing codecov/project check should be investigated but is likely a CI configuration issue rather than a code problem.

Recommendation: ✅ Approve with the comment fix applied.

- Use fullmatch() instead of match() for explicitness
- Add minimal a..b trailing-segment test case
- Flag changelog entry as high-risk
@gilluminate
Copy link
Copy Markdown
Contributor Author

Thanks @rayharnett! Walking through the items:

  1. Stale comment (Critical): Heads up, this one's a misread of the diff. The file currently has only the updated comment block on lines 68–73 — the original "can embed '..' before the extension..." comment was replaced in a3455e3, not duplicated alongside the new block. Nothing stale to remove.

  2. codecov/project failing: It's the project-level threshold (84.96% vs 85.00%). Patch coverage is clean. Not blocking on this.

  3. .fullmatch() over .match(): Done — switched to .fullmatch() for explicitness even though the anchors make it a no-op.

  4. Minimal a..b test case: Added ("/_next/static/chunks/a..b.js", False) with a comment explaining the + quantifier is what we're guarding.

  5. high-risk label on changelog: Added. Fair call given the regex sits in front of the path-traversal guard.

All changes pushed in 96cdcf0.

@gilluminate gilluminate added this pull request to the merge queue Apr 29, 2026
Merged via the queue into main with commit 9615a32 Apr 29, 2026
77 of 78 checks passed
@gilluminate gilluminate deleted the gill/fix-turbopack-chunk-regex branch April 29, 2026 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants