fix(helpers): fix DecodeURIIfNeeded idempotence for literal '%' in URIs#714
Conversation
…icalization Three related changes to the tag system: 1. Storage-only numeric padding: purely-numeric tag values are zero-padded to width 4 in SQLite (e.g. disc:1 → disc:0001) so ORDER BY sorts correctly. PadTagValue is applied at every DB write site; UnpadTagValue strips at every read site. Public API, NFC tokens, and ZapScript remain in natural form. 2. Net-new upstream tags from PigSaint/GameDataBase: keyboard, touchscreen, positional:4 (input); barcode namespace (addon); vibration:rumble and accelerometer (embedded); vicdual/g80/h1/model1-3/naomi and new manufacturers nichibutsu/taiyo/tecfri:ambush/tourvision (arcadeboard); gameboy:infrared and gameboy:gba (compatibility); archimedes/atari:falcon/ sega:32x/nintendo:disksystem/nintendo:gameandwatch/wonderswan (port); vr and keyword:ubikey (search); comicclassics (reboxed); pcemini/ ninjajajamaru/zeldacollection and 3dfukkoku:01/02 (rerelease); rev:f, set:f1/f2, alt:4/5/6 (range fills); ca (lang); ddrgb/fullchanger (addon controller); mobileadaptergb (link); glasses:mvd, led:powerantenna/bugsensor, pocketsakura, spectrumcommunicator (addon misc); seganet (reboxed). 3. Deprecated alias canonicalization: addon:barcodeboy rewrites to addon:barcode:barcodeboy; addon:controller:jcart to embedded:slot:jcart; addon:controller:rumble to embedded:vibration:rumble. Old NFC tokens and ZapScript files using the former names resolve transparently at query time via CanonicalizeTagAlias in resolveFilter.
…Is (#712) Custom-scheme branch: add '%' → '%25' to the re-encoder (before '/', '?', '#') so that a decoded literal percent character does not become a pct-encode prefix on a second parse pass. Fixes crash on steAm://%25000 where %25 decoded to '%', leaving %000, which then decoded %00 to null on the next call (idempotence failure). Updates the double_encoding and triple_encoding table tests: with the new re-encoder a doubly-encoded input like steam://999/Name%2520Here is returned unchanged (the decoded '%' is re-encoded back), which is required by the idempotence invariant. Adds regression corpus entry 6b19b7d4146026ef ("steAm://%25000") from the nightly fuzz artifact attached to run #24621237120.
📝 WalkthroughWalkthroughA bug fix for Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/helpers/uris_test.go (1)
943-953: Consider adding the exact fuzz-found URI as a table test case.These expectation updates are correct, but adding
steAm://%25000directly in this test table would lock in a deterministic regression check independent of fuzz corpus execution.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/helpers/uris_test.go` around lines 943 - 953, Add a deterministic table test entry for the fuzz-found URI next to the existing "double_encoding" and "triple_encoding" cases: create a new test case (e.g., name "fuzz_steAm_percent") with input "steAm://%25000" and expected "steAm://%25000" (and a short description) in the same test slice in pkg/helpers/uris_test.go so the regression is checked without relying on the fuzz corpus.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/helpers/uris_test.go`:
- Around line 943-953: Add a deterministic table test entry for the fuzz-found
URI next to the existing "double_encoding" and "triple_encoding" cases: create a
new test case (e.g., name "fuzz_steAm_percent") with input "steAm://%25000" and
expected "steAm://%25000" (and a short description) in the same test slice in
pkg/helpers/uris_test.go so the regression is checked without relying on the
fuzz corpus.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 16fb63c9-6bc2-4b88-9d6f-3e585f86b0b9
📒 Files selected for processing (3)
pkg/helpers/testdata/fuzz/FuzzDecodeURIIfNeeded/6b19b7d4146026efpkg/helpers/uris.gopkg/helpers/uris_test.go
Fixes #712.
The nightly fuzz run found that
decode("steAm://%25000")returns"steAm://%000", but a second call returns"steAm://\x000"— violating the idempotence invariant checked byFuzzDecodeURIIfNeeded.The custom-scheme branch's re-encoder already handles
/,?, and#(from PR #708). The same class of bug applies to the%character itself:url.PathUnescape("%25000")→"%000", and a second decode pass interprets%00as a null-byte escape. Adding%→%25as the first entry in the re-encoder prevents this.%must be registered before the other replacements so that the%2F/%3F/%23strings produced by those rules are not themselves re-encoded.Side effect: doubly-encoded inputs like
steam://999/Name%2520Herenow return unchanged instead of decoding the%25layer. This is required for idempotence — two existing tests updated to reflect the new semantic.Includes corpus file
6b19b7d4146026ef(steAm://%25000) from the nightly fuzz artifact.Summary by CodeRabbit
Bug Fixes
Tests