fix(dashcore): make OutPoint serde tolerant of ContentDeserializer's HR-quirk#708
Conversation
`OutPoint`'s `Deserialize` impl branched on `is_human_readable()` and used two unrelated visitors — a string-only `StringVisitor` for HR and a struct-only `StructVisitor` for non-HR. Serde's `ContentDeserializer` (used internally for `#[serde(tag = "...")]`, `flatten`, and untagged enums) always reports `is_human_readable() == true` regardless of the upstream format, so a value originally produced as a non-HR struct could be replayed into the HR branch as a map and fail with `invalid type: map, expected an OutPoint`. Fix the `serde_struct_human_string_impl!` macro to use a single visitor that accepts all three input shapes (`visit_str`, `visit_seq`, `visit_map`) and use `deserialize_any` in the HR branch so the actual content shape is picked up regardless of what `is_human_readable` reports. The non-HR branch keeps `deserialize_struct` so bincode (which doesn't support `deserialize_any`) still works. The serialize side is unchanged. The trade-off: raw JSON now also accepts the struct/seq forms in addition to the canonical \`\"txid:vout\"\` string form, and tolerates unknown fields in the struct form. This is intentional and required — internally-tagged enums replay the entire buffered content (including the tag field) when resolving each variant, so \`visit_map\` must accept unknown fields. Canonical JSON output is unchanged because the serialize side still emits the string form. Add a regression test that round-trips an \`OutPoint\` through an internally-tagged enum via \`serde_json::Value\` (forcing the \`ContentDeserializer\` path) and through bincode. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughRefactor serde deserialization for human-readable struct/string types into a unified visitor that accepts strings, sequences, and maps; add a serde-focused regression test for Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dash/src/blockdata/transaction/outpoint.rs`:
- Around line 372-378: The existing test only exercises the string
deserialization path; add a second round-trip that constructs a
serde_json::Value::Object representing the OutPoint (e.g. a Map/object with
"txid" and "vout" fields) so from_value triggers the visit_map path and then
assert equality. In the same test (where types OutPoint and Tagged and variables
original/restored are used), build a serde_json::Map or use serde_json::json!({
"txid": original.txid.to_string(), "vout": original.vout }) ->
serde_json::Value, call serde_json::from_value to get a Tagged, and
assert_eq!(original, restored) to cover the human-readable map replay path.
In `@dash/src/serde_utils.rs`:
- Around line 466-478: The deserializer currently overwrites repeated map fields
(the match arm Some(Enum::$fe) sets $fe = Some(map.next_value()?) without
checking), so update that branch to reject duplicates: before assigning, check
if $fe.is_some() and if so return
Err(A::Error::duplicate_field(stringify!($fe))); otherwise set $fe =
Some(map.next_value()?); keep the rest of the post-processing (the match that
returns missing_field) unchanged so duplicate inputs cause an explicit
duplicate_field error instead of silently keeping the last value.
🪄 Autofix (Beta)
❌ Autofix failed (check again to retry)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 19f476bf-da5a-4cea-a89f-7d5611a83025
📒 Files selected for processing (2)
dash/src/blockdata/transaction/outpoint.rsdash/src/serde_utils.rs
The dev-dependency on `bincode` doesn't enable the `serde` feature, so `bincode::serde::encode_to_vec` is not in scope. The non-human-readable struct path is already exercised by the existing dashcore test suite which uses bincode extensively, so the regression test only needs to lock in the `ContentDeserializer` behavior — kept the JSON-via-tagged- enum and string-form assertions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## v0.42-dev #708 +/- ##
=============================================
+ Coverage 70.56% 70.66% +0.10%
=============================================
Files 320 320
Lines 68150 68195 +45
=============================================
+ Hits 48087 48188 +101
+ Misses 20063 20007 -56
|
Adding `features = ["serde"]` to the bincode dev-dep so the regression test can exercise the symmetric non-human-readable struct path through the tagged enum, not just the human-readable JSON path. Reword the macro doc comment so it frames the underlying issue as serde's `ContentDeserializer` HR-quirk rather than implying a bug specific to dashcore — the same sharp edge bites every custom `Deserialize` that uses `is_human_readable()` for shape dispatch and the upstream stance is "accept any shape, don't branch". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A bincode round-trip *via* a `#[serde(tag = ...)]` enum can't work — serde internally-tagged dispatch buffers content by calling `deserialize_any` on the upstream deserializer, and bincode is not self-describing (`Serde(AnyNotSupported)`). That limitation is orthogonal to the bug this PR fixes — the original failing case in dashpay/platform routes through `platform_value::Value`, which *does* support `deserialize_any`. Replace the bogus bincode-tagged-enum assertion with a plain `OutPoint` bincode round-trip, which still locks the non-human-readable struct path (`visit_seq`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`bincode::serde::encode_to_vec` takes the value by `Serialize` bound and `OutPoint` is `Copy`, so the `&raw` borrow is redundant (`clippy::needless_borrows_for_generic_args`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address CodeRabbit review feedback on the OutPoint serde fix: 1. Add a test that hand-builds the JSON object/map shape of `OutPoint` inside a `#[serde(tag = ...)]` enum and deserializes from it. This is the exact shape the original bug exhibited downstream — serde serialized as a map (because the upstream format was non-human- readable), then the tagged enum replayed through `ContentDeserializer` into the HR branch where the old string-only visitor errored. The `serde_json::to_value` round-trip was only exercising the visit_str path through `ContentDeserializer` because canonical OutPoint HR serialization is a string. 2. Reject duplicate field keys in `serde_struct_human_string_impl!`'s `visit_map`. Previously a duplicate silently kept the last value; now it errors with `duplicate field` like serde's derive. Add a test covering this — parse from a raw JSON string (not `json!`) because `Value::Object` deduplicates on construction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found. |
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found. |
Per the wasm-dpp2 CONVENTIONS.md "Versioning" section: every versioned
protocol enum should use `#[serde(tag = "$formatVersion")]`. `Validator`
and `ValidatorSet` were the last two top-level versioned enums still
defaulting to externally-tagged `{"V0": {...}}` form.
Wire shape changes from:
{"V0": {"pro_tx_hash": "...", ...}}
to the canonical:
{"$formatVersion": "0", "pro_tx_hash": "...", ...}
JSON-side tests pass — dashcore hash newtypes (`ProTxHash`, `PubkeyHash`,
`QuorumHash`) deserialize cleanly from hex strings on the HR path.
Value-side tests are `#[ignore]`'d pending dashcore PR #708
(dashpay/rust-dashcore#708) — the dashcore hash
newtypes need dual-shape visitors so they round-trip through serde's
`ContentDeserializer`, which always reports `is_human_readable: true`
even when wrapping bytes from a non-HR source like `platform_value::Value`.
This is the same root cause as the OutPoint/Txid bug fixed locally in
commit 09c0a2b; ProTxHash/PubkeyHash trip the same wire on
`tag = "$formatVersion"` deserialization through ContentDeserializer.
Once that PR lands and we bump the dashcore dependency, drop the
`#[ignore]`s on the two value tests.
Note: `ValidatorSetV0::members` is `BTreeMap<ProTxHash, ValidatorV0>`
(not `BTreeMap<ProTxHash, Validator>`), so members are the bare V0
struct on the wire without their own `$formatVersion` tag — the test
documents this inline.
dpp lib: 3639 passing, 10 ignored (+2 from the new value-path
`#[ignore]`s, otherwise unchanged).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#729) * fix(hashes): make SerdeHash tolerant of ContentDeserializer's HR-quirk This is the same kind of serde-tag incompatibility fixed for `OutPoint` in #708, applied to the hash-newtype family (sha256, sha256d, hash160, hash_x11, ripemd160, sha1, sha512 — affecting Txid, BlockHash, ProTxHash, PubkeyHash, QuorumHash, and every other type generated by `hash_newtype!` or `serde_impl!`). `SerdeHash::deserialize` used two separate visitors — a string-only HR visitor (`HexVisitor`) and a bytes-only non-HR visitor (`BytesVisitor`). That works fine in isolation but breaks the moment a hash-bearing struct is wrapped by an internally-tagged enum (`#[serde(tag = "...")]`), `flatten`, or an untagged enum. Serde routes those through `ContentDeserializer`, a format-agnostic intermediate buffer that always reports `is_human_readable() == true` regardless of the upstream format. A value originally written by a non-HR encoder is therefore replayed into the HR branch as raw bytes, which the previous `HexVisitor::visit_str` saw as "32 chars" instead of "64-char hex" and rejected with `bad hex string length 32 (expected 64)`. This was hit downstream in dashpay/platform when validators / validator sets (which contain `ProTxHash`, `PubkeyHash`, `QuorumHash`) were configured for the dpp `tag = "$formatVersion"` versioning convention. ## Fix Rework `SerdeHash::deserialize` to use a single `AnyShapeVisitor` that accepts every shape a hash can arrive in: - `visit_str` / `visit_borrowed_str` — ASCII hex (canonical HR form). - `visit_bytes` / `visit_borrowed_bytes` — disambiguated by length: exactly `N` bytes → raw hash, exactly `2*N` bytes → UTF-8 hex. Any other length is rejected. - `visit_seq` — length-prefixed `u8` sequence (used by bincode and other non-self-describing formats). Use `deserialize_any` in the HR branch so the actual content shape — not the reported HR flag — drives dispatch. Keep `deserialize_bytes` in the non-HR branch since bincode is non-self-describing and does not support `deserialize_any`. ## Trade-off Raw JSON now also accepts the byte-form (`"\x11..."` UTF-8 bytes vs. `"11..."` hex string) because `deserialize_any` in serde_json's self-describing mode dispatches based on the JSON token. We disambiguate strictly by length in `visit_bytes`, so anything that's neither N bytes nor 2*N bytes still errors. This is consistent with the OutPoint fix in #708 — accept any shape, validate by length. ## Implementation note: no_std / no alloc `dashcore_hashes` does not enable `serde/alloc` (it has only `serde-std` which transitively gates that), so `Visitor::visit_byte_buf` and `visit_string` (defined behind serde's `alloc` feature) are unavailable. The `visit_seq` path uses a stack array sized to fit the largest hash (64 bytes — sha512) instead of a `Vec`, keeping the crate's no-alloc posture. ## Tests Two new regression tests in `dash/src/hash_types.rs`: - `serde_round_trip_through_internally_tagged_enum` — wraps a `Txid` in a `#[serde(tag = "type")]` enum, round-trips through `serde_json::Value` (which forces buffering through `ContentDeserializer`), and asserts the round-trip is identity. Also verifies the canonical hex-string form still deserializes and that bincode round-trip still succeeds via the byte/seq path. - `serde_round_trip_through_internally_tagged_enum_pubkey_hash` — same shape with `PubkeyHash` (20-byte hash) to exercise the smaller-length disambiguation path. `bincode` dev-dep updated to `features = ["serde"]` (same change as #708) so the bincode regression assertion compiles. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(test): rustfmt + correct comment about N being in bytes * review: address PR feedback — real visit_seq test + debug_assert overflow Two in-scope fixes from review: 1. The Txid round-trip test had an abandoned `raw_txid_bytes` literal followed by `let _ = raw_txid_bytes; // documentation only` — leftover exploration that misled readers into thinking the bytes were used. Replace with a real assertion that constructs a `serde_json::Value::Array` of u8 numbers, wraps it in a `#[serde(tag = "type")]` enum, and round-trips through `serde_json::from_value`. This now actually exercises the new `visit_seq` path through `ContentDeserializer` — the security review noted that the prior test only hit `visit_str`, leaving `visit_bytes`/`visit_seq` regression coverage thin. 2. The `MAX_HASH_BYTES = 64` overflow check in `visit_seq` was returning a runtime error with a debug-prose string ("recompile with larger MAX") that leaked an internal type name to user error logs. Convert to `debug_assert!` — failure mode is now a test panic in debug builds (caught at CI time when adding a wider hash type), zero overhead in release. The condition is unreachable in any release build that compiled at all, since adding a wider digest would require updating `serde_impl!` invocations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
This is a serde-tag incompatibility, not a bug specific to dashcore. The same sharp edge bites every custom
Deserializethat branches onis_human_readable()for shape dispatch.OutPoint'sDeserializeimpl used two separate visitors — a string-only HR visitor and a struct-only non-HR visitor. That works fine in isolation. It breaks the moment any caller wraps anOutPoint-bearing struct in an internally-tagged enum (#[serde(tag = "...")]),flatten, or an untagged enum, because serde routes those throughContentDeserializer, a format-agnostic intermediate buffer.ContentDeserializeralways reportsis_human_readable() == trueregardless of the upstream format — intentional in serde's source, with long-standing upstream issues filed; the maintainers consider it working-as-intended and the recommended pattern is "don't branch onis_human_readable()for shape dispatch — accept any shape."A value originally written by a non-HR encoder is therefore replayed into the HR branch as a map, which the previous string-only HR visitor rejected with
invalid type: map, expected an OutPoint.This was hit downstream in dashpay/platform when an
OutPoint-bearing struct (e.g.ChainAssetLockProof) was wrapped by a tagged enum (e.g.AssetLockProof) and round-tripped throughplatform_value::Value.Fix
Rework
serde_struct_human_string_impl!to use a single visitor that accepts all three input shapes (visit_str,visit_seq,visit_map). Usedeserialize_anyin the HR branch so the actual content shape — not the reported HR flag — drives dispatch. Keepdeserialize_structin the non-HR branch since bincode doesn't supportdeserialize_any. The serialize side is unchanged.This is the well-established serde workaround documented in third-party crates that hit the same wall (e.g.
BinaryData,Identifier,Bytes32in rs-platform-value).Trade-off
Raw JSON now also accepts the struct/seq forms (
{"txid": "...", "vout": 0}) in addition to the canonical"txid:vout"string form, and tolerates unknown fields in the struct form. This is forced by the design:ContentDeserializeris strict — no fall-through acrossdeserialize_*methods, so the only way to fix the tagged-enum path is to calldeserialize_anyand let the deserializer dispatch on actual content.visit_mapmust tolerate unknown fields, because internally-tagged enums replay the entire buffered content (including the tag field) into each variant's deserializer.Canonical JSON output is unchanged — the serialize side still emits the string form.
Test plan
outpoint::tests::serde_round_trip_through_internally_tagged_enumcovering:serde_json::Valuewrapped by#[serde(tag = "type")]enum (HRContentDeserializerpath — the original failing case)"txid:vout"string-form deserialize (canonical HR path stays green)features = ["serde"]on the dashcore dev-dep on bincode so the bincode side of the test compilescargo test -p dpp --lib --features json-conversion,value-conversion,serde-conversion,fixtures-and-mocks address_funding_from_asset_lock_transition::json_convertible_tests::value_round_trip_with_per_property_assertions -- --include-ignoredshould go frominvalid type: map, expected an OutPointto green once dashcore points at this branch🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests