fix(visualsign): reject JSON short-form control escapes in validate_charset#332
Conversation
…harset `SignablePayload::validate_charset()` used to only block `\u` substrings in the serialized JSON, then rely on `is_ascii() && (is_ascii_graphic() || is_ascii_whitespace())`. The default `serde_json::CompactFormatter` emits short-form escapes for 0x08, 0x09, 0x0A, 0x0C, 0x0D as `\b`, `\t`, `\n`, `\f`, `\r`, which are pure ASCII graphic + whitespace bytes on the wire. Those escapes survived the validator while still smuggling real control bytes into the decoded field text once a downstream JSON parser ran, letting an attacker break a single-line wallet UI or splice display lines. The Tron converter forwarding `parameter.type_url` straight into a text field is the concrete sink called out on the ticket. Extend the substring-rejection list to cover every escape the serializer can emit: `\u`, `\n`, `\t`, `\r`, `\b`, `\f`, `\"`, `\\`, plus `\/` defensively. Adds regression tests for each escape, for the existing `\u` rule, and a positive case for plain ASCII. Closes PRS-231 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The PRS-231 deny-list rejected `\n` along with the other short-form control escapes, but `\n` is the wallet's documented multi-line separator. 30+ trusted call sites across the chain converters emit strings like "Program ID: X\nData: Y" deliberately. The original deny broke every chain's instruction display in workspace integration tests. The other short-form escapes stay rejected (`\t`, `\r`, `\b`, `\f`, plus the existing `\u`, `\"`, `\\`, `\/`). Attacker-controlled strings flowing into field text must be sanitized at the insertion site; the remaining sinks (e.g. Tron parameter.type_url) are tracked as a follow-up to PRS-231. Adds a regression test pinning the `\n` carve-out. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Tightens SignablePayload::validate_charset() to reject serde_json short-form escape sequences (and other escape forms) that can otherwise pass ASCII validation while decoding into control/structural bytes in downstream display strings.
Changes:
- Expanded validation from a single
\\ucheck to a deny-list of JSON escape sequences emitted byserde_json(plus a defensive\\/). - Added regression tests covering rejection of control-byte escapes, quotes/backslashes, and unicode escapes, plus positive tests for plain ASCII and the newline carve-out.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Reorder FORBIDDEN_JSON_ESCAPES so `\/` is checked before `\\`. With `\\` listed first, any input containing a literal backslash would short-circuit on the backslash rule and the defensive `\/` entry could never attribute a rejection, masking whether it does anything. The new order lets a hypothetical future serializer emitting bare `\/` (the compact formatter does not) be rejected first by the dedicated rule. Rename the slash-related regression test to honestly describe what it covers (a backslash-slash input combination, not a serializer-emitted `\/`), and document the unreachable-with-current-serializer caveat inline. Sharpen the short-form-control-escape test so the sanity check pins the specific escape serde_json emits (`\t`, `\r`, `\b`, `\f`) per case instead of just asserting any backslash, which would also pass if the serializer ever switched to `\uXXXX` long-form escapes. No behavioural change to the validator beyond the reordering (the deny-list is unchanged); the carve-out for `\n` is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drop the "list order matters for attribution" rationale from the FORBIDDEN_JSON_ESCAPES comment. All matches return the same generic ValidationError, so which entry fires first is not observable. Keep the defensive-first ordering as a cosmetic choice. Also corrected the parallel claim in test_validate_charset_rejects_backslash_slash_combination's doc-comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the literal tab character inside the sanity-check comment with an explicit "U+0009" reference so the intent is readable in source and not mangled by editors/linters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Heads up: this repo is public, so internal Linear refs ( This PR's leaks:
Suggested convention: describe the bug in the name (e.g., |
Removes PRS-NNN references from comments so external readers of this public repo don't see internal triage IDs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Scrubbed PRS-231 refs from body and code/comments. Branch left as-is since it'll disappear on merge. |
|
Deep-review verdict: ship as-is. Verified the validator/serde contract empirically against
No emit/validate mismatch — the most important property for a string-pattern validator. Tests inject real control bytes and route them through Residual gap (follow-up, not blocking): the #331 is now merged so the gate is closed on every chain; this PR makes the validator strict enough to catch the short-form escapes that previously slipped past |
prasanna-anchorage
left a comment
There was a problem hiding this comment.
Approving per deep-review verdict above — ship as-is.
Summary
SignablePayload::validate_charset()was only blocking\usubstrings in the serialized JSON, then leaning onis_ascii() && (is_ascii_graphic() || is_ascii_whitespace()).serde_json::CompactFormatteremits short-form escapes (\b,\t,\n,\f,\r,\",\\) for control bytes and structural chars; all of those (except\n, see carve-out below) slip past the ASCII checks while still smuggling real control bytes into the decoded display string. The Tron converter forwardingparameter.type_urlstraight into a text field is the concrete sink; an attacker can break single-line wallet UI or splice display lines via\r,\t, etc.What changed
src/visualsign/src/lib.rs::validate_charset: replaced the single"\\u"substring check with a list covering the escapes the serializer can emit that we want to reject:\u,\t,\r,\b,\f,\",\\, plus\/defensively (the compact formatter does not currently emit\/, but the rule documents the intent in case a future serializer or replacement ever does).\/is listed ahead of\\so the defensive guard actually fires first on any future input containing bare\/.\nis intentionally not in the deny-list. It is the wallet's documented line separator for multi-line field text (e.g. "Program ID: X\nData: Y") and 30+ trusted call sites across the chain converters emit it deliberately. Untrusted strings flowing into field text must be sanitized at the insertion site instead.test_validate_charset_accepts_newline_in_textpins this contract.test_validate_charset_rejects_short_form_control_escapes(4 cases:\t,\r,\b,\f; each pins the exact short-form escape the serializer emits)test_validate_charset_rejects_quote_and_backslash_escapestest_validate_charset_rejects_backslash_slash_combinationtest_validate_charset_still_rejects_unicode_escapetest_validate_charset_accepts_plain_ascii(positive case)test_validate_charset_accepts_newline_in_text(positive case, pins carve-out)No callers changed. The fix is purely an extra deny-list inside the validator, so any payload that already passed continues to pass; payloads that snuck control bytes through (other than
\n) now get rejected with the sameValidationError("Restricted Characters Detected")already used for\u.Rollback
Revert this commit. No data migrations, no protobuf changes, no on-disk state. The validator returns to its previous (looser) behavior; nothing downstream depends on the new rejections beyond the new tests in this PR.
Backwards compatibility
Strictly tightens validation on data that should never have been allowed past it. No public API changed. Any caller currently producing payloads that contain
",\, or control bytes other than\nin text fields was already producing JSON that broke the validator's intent; those callers need to sanitize their input. None of the existing chain converters in this repo exercise that path with the current fixture corpus (visualsign unit + doctest suite is green; cross-chain fixture verification is left for the PR-level CI per scope guidance on the ticket).Test plan
Verification scoped to
visualsign, ran fromsrc/:cargo fmt, no diffcargo clippy -p visualsign --all-targets -- -D warnings, cleancargo test -p visualsign, all passing (incl. new regression tests and the pre-existingtest_signable_payload_to_json/ charset suite)Cross-chain CI on this PR will exercise the remaining converters end-to-end.