Skip to content

fix(helpers): fix DecodeURIIfNeeded idempotence for literal '%' in URIs#714

Merged
wizzomafizzo merged 3 commits intomainfrom
fix/fuzz-decode-uri-percent-idempotence
Apr 19, 2026
Merged

fix(helpers): fix DecodeURIIfNeeded idempotence for literal '%' in URIs#714
wizzomafizzo merged 3 commits intomainfrom
fix/fuzz-decode-uri-percent-idempotence

Conversation

@wizzomafizzo
Copy link
Copy Markdown
Member

@wizzomafizzo wizzomafizzo commented Apr 19, 2026

Fixes #712.

The nightly fuzz run found that decode("steAm://%25000") returns "steAm://%000", but a second call returns "steAm://\x000" — violating the idempotence invariant checked by FuzzDecodeURIIfNeeded.

The custom-scheme branch's re-encoder already handles /, ?, and # (from PR #708). The same class of bug applies to the % character itself: url.PathUnescape("%25000")"%000", and a second decode pass interprets %00 as a null-byte escape. Adding %%25 as the first entry in the re-encoder prevents this.

% must be registered before the other replacements so that the %2F/%3F/%23 strings produced by those rules are not themselves re-encoded.

Side effect: doubly-encoded inputs like steam://999/Name%2520Here now return unchanged instead of decoding the %25 layer. This is required for idempotence — two existing tests updated to reflect the new semantic.

Includes corpus file 6b19b7d4146026ef (steAm://%25000) from the nightly fuzz artifact.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed URI decoding to properly handle percent-encoded characters in custom URI schemes, preventing unintended re-decoding when URLs are parsed multiple times.
  • Tests

    • Added test coverage for edge cases involving percent-encoded character sequences in URIs to ensure consistent behavior across different input patterns.

…icalization

Three related changes to the tag system:

1. Storage-only numeric padding: purely-numeric tag values are zero-padded to
   width 4 in SQLite (e.g. disc:1 → disc:0001) so ORDER BY sorts correctly.
   PadTagValue is applied at every DB write site; UnpadTagValue strips at every
   read site. Public API, NFC tokens, and ZapScript remain in natural form.

2. Net-new upstream tags from PigSaint/GameDataBase: keyboard, touchscreen,
   positional:4 (input); barcode namespace (addon); vibration:rumble and
   accelerometer (embedded); vicdual/g80/h1/model1-3/naomi and new
   manufacturers nichibutsu/taiyo/tecfri:ambush/tourvision (arcadeboard);
   gameboy:infrared and gameboy:gba (compatibility); archimedes/atari:falcon/
   sega:32x/nintendo:disksystem/nintendo:gameandwatch/wonderswan (port);
   vr and keyword:ubikey (search); comicclassics (reboxed); pcemini/
   ninjajajamaru/zeldacollection and 3dfukkoku:01/02 (rerelease); rev:f,
   set:f1/f2, alt:4/5/6 (range fills); ca (lang); ddrgb/fullchanger (addon
   controller); mobileadaptergb (link); glasses:mvd, led:powerantenna/bugsensor,
   pocketsakura, spectrumcommunicator (addon misc); seganet (reboxed).

3. Deprecated alias canonicalization: addon:barcodeboy rewrites to
   addon:barcode:barcodeboy; addon:controller:jcart to embedded:slot:jcart;
   addon:controller:rumble to embedded:vibration:rumble. Old NFC tokens and
   ZapScript files using the former names resolve transparently at query time
   via CanonicalizeTagAlias in resolveFilter.
…Is (#712)

Custom-scheme branch: add '%' → '%25' to the re-encoder (before '/', '?',
'#') so that a decoded literal percent character does not become a pct-encode
prefix on a second parse pass. Fixes crash on steAm://%25000 where %25
decoded to '%', leaving %000, which then decoded %00 to null on the next
call (idempotence failure).

Updates the double_encoding and triple_encoding table tests: with the new
re-encoder a doubly-encoded input like steam://999/Name%2520Here is returned
unchanged (the decoded '%' is re-encoded back), which is required by the
idempotence invariant.

Adds regression corpus entry 6b19b7d4146026ef ("steAm://%25000") from the
nightly fuzz artifact attached to run #24621237120.
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 19, 2026

📝 Walkthrough

Walkthrough

A bug fix for DecodeURIIfNeeded addresses a percent-character re-triggering issue during URI custom-scheme decoding. The fix reorders the re-encoding logic to percent-encode the % character first, preventing unwanted decode cycles. Includes a new fuzz corpus entry that triggered the original crash and updated test expectations.

Changes

Cohort / File(s) Summary
Fuzz Testdata
pkg/helpers/testdata/fuzz/FuzzDecodeURIIfNeeded/6b19b7d4146026ef
Added new fuzz corpus entry containing steAm://%25000 to reproduce the crash from issue #712.
URI Decoding Logic
pkg/helpers/uris.go
Modified DecodeURIIfNeeded to percent-encode % first in the reenc replacer before encoding gen-delimiters (/, ?, #), preventing % from re-triggering decoding on subsequent passes.
Test Updates
pkg/helpers/uris_test.go
Updated test expectations for double/triple-encoded URIs to reflect that percent-sequences remain encoded; descriptions clarified that literal % characters are re-encoded for idempotence.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Poem

🐰 A percent sign caused such recursive pain,
So first we encode it in the chain,
Before the slashes take their turn,
No more loops—idempotent and clean!
The fuzz crash solved, the logic keen. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: fixing DecodeURIIfNeeded idempotence for literal '%' characters in URIs, which directly addresses issue #712.
Linked Issues check ✅ Passed The PR successfully addresses issue #712 requirements by reproducing the fuzz failure, adding the corpus file (6b19b7d4146026ef), fixing the idempotence issue with literal '%' in URIs, and updating tests accordingly.
Out of Scope Changes check ✅ Passed All changes are directly scoped to fixing the DecodeURIIfNeeded idempotence issue: the fuzz corpus, the percent-encoding re-encoder fix, and test expectation updates are all necessary for resolving issue #712.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/fuzz-decode-uri-percent-idempotence

Comment @coderabbitai help to get the list of available commands and usage tips.

@sentry
Copy link
Copy Markdown

sentry bot commented Apr 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
pkg/helpers/uris_test.go (1)

943-953: Consider adding the exact fuzz-found URI as a table test case.

These expectation updates are correct, but adding steAm://%25000 directly in this test table would lock in a deterministic regression check independent of fuzz corpus execution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/helpers/uris_test.go` around lines 943 - 953, Add a deterministic table
test entry for the fuzz-found URI next to the existing "double_encoding" and
"triple_encoding" cases: create a new test case (e.g., name
"fuzz_steAm_percent") with input "steAm://%25000" and expected "steAm://%25000"
(and a short description) in the same test slice in pkg/helpers/uris_test.go so
the regression is checked without relying on the fuzz corpus.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@pkg/helpers/uris_test.go`:
- Around line 943-953: Add a deterministic table test entry for the fuzz-found
URI next to the existing "double_encoding" and "triple_encoding" cases: create a
new test case (e.g., name "fuzz_steAm_percent") with input "steAm://%25000" and
expected "steAm://%25000" (and a short description) in the same test slice in
pkg/helpers/uris_test.go so the regression is checked without relying on the
fuzz corpus.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 16fb63c9-6bc2-4b88-9d6f-3e585f86b0b9

📥 Commits

Reviewing files that changed from the base of the PR and between 7ac5dd4 and 9454dd4.

📒 Files selected for processing (3)
  • pkg/helpers/testdata/fuzz/FuzzDecodeURIIfNeeded/6b19b7d4146026ef
  • pkg/helpers/uris.go
  • pkg/helpers/uris_test.go

@wizzomafizzo wizzomafizzo merged commit d54aa51 into main Apr 19, 2026
11 checks passed
@wizzomafizzo wizzomafizzo deleted the fix/fuzz-decode-uri-percent-idempotence branch April 19, 2026 07:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nightly fuzz: crash found (2026-04-19)

1 participant