Skip to content

fix(dashcore): make OutPoint serde tolerant of ContentDeserializer's HR-quirk#708

Merged
xdustinface merged 6 commits intov0.42-devfrom
fix/outpoint-serde-content-deserializer
May 6, 2026
Merged

fix(dashcore): make OutPoint serde tolerant of ContentDeserializer's HR-quirk#708
xdustinface merged 6 commits intov0.42-devfrom
fix/outpoint-serde-content-deserializer

Conversation

@shumkov
Copy link
Copy Markdown
Collaborator

@shumkov shumkov commented May 1, 2026

Summary

This is a serde-tag incompatibility, not a bug specific to dashcore. The same sharp edge bites every custom Deserialize that branches on is_human_readable() for shape dispatch.

OutPoint's Deserialize impl used two separate visitors — a string-only HR visitor and a struct-only non-HR visitor. That works fine in isolation. It breaks the moment any caller wraps an OutPoint-bearing struct in an internally-tagged enum (#[serde(tag = "...")]), flatten, or an untagged enum, because serde routes those through ContentDeserializer, a format-agnostic intermediate buffer. ContentDeserializer always reports is_human_readable() == true regardless of the upstream format — intentional in serde's source, with long-standing upstream issues filed; the maintainers consider it working-as-intended and the recommended pattern is "don't branch on is_human_readable() for shape dispatch — accept any shape."

A value originally written by a non-HR encoder is therefore replayed into the HR branch as a map, which the previous string-only HR visitor rejected with invalid type: map, expected an OutPoint.

This was hit downstream in dashpay/platform when an OutPoint-bearing struct (e.g. ChainAssetLockProof) was wrapped by a tagged enum (e.g. AssetLockProof) and round-tripped through platform_value::Value.

Fix

Rework serde_struct_human_string_impl! to use a single visitor that accepts all three input shapes (visit_str, visit_seq, visit_map). Use deserialize_any in the HR branch so the actual content shape — not the reported HR flag — drives dispatch. Keep deserialize_struct in the non-HR branch since bincode doesn't support deserialize_any. The serialize side is unchanged.

This is the well-established serde workaround documented in third-party crates that hit the same wall (e.g. BinaryData, Identifier, Bytes32 in rs-platform-value).

Trade-off

Raw JSON now also accepts the struct/seq forms ({"txid": "...", "vout": 0}) in addition to the canonical "txid:vout" string form, and tolerates unknown fields in the struct form. This is forced by the design:

  • ContentDeserializer is strict — no fall-through across deserialize_* methods, so the only way to fix the tagged-enum path is to call deserialize_any and let the deserializer dispatch on actual content.
  • visit_map must tolerate unknown fields, because internally-tagged enums replay the entire buffered content (including the tag field) into each variant's deserializer.

Canonical JSON output is unchanged — the serialize side still emits the string form.

Test plan

  • Added outpoint::tests::serde_round_trip_through_internally_tagged_enum covering:
    • Round-trip through serde_json::Value wrapped by #[serde(tag = "type")] enum (HR ContentDeserializer path — the original failing case)
    • Direct JSON "txid:vout" string-form deserialize (canonical HR path stays green)
    • Bincode round-trip via the same tagged enum (symmetric non-HR struct path stays green)
  • Enabled features = ["serde"] on the dashcore dev-dep on bincode so the bincode side of the test compiles
  • CI passes on this PR
  • Downstream verification in dashpay/platform: cargo test -p dpp --lib --features json-conversion,value-conversion,serde-conversion,fixtures-and-mocks address_funding_from_asset_lock_transition::json_convertible_tests::value_round_trip_with_per_property_assertions -- --include-ignored should go from invalid type: map, expected an OutPoint to green once dashcore points at this branch

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Improved data validation to properly detect and reject duplicate fields during deserialization.
    • Enhanced validation for required fields during data restoration.
    • Increased compatibility with multiple serialization format representations.
  • Tests

    • Added comprehensive regression tests for serialization/deserialization edge cases and format handling.

`OutPoint`'s `Deserialize` impl branched on `is_human_readable()` and used
two unrelated visitors — a string-only `StringVisitor` for HR and a
struct-only `StructVisitor` for non-HR. Serde's `ContentDeserializer`
(used internally for `#[serde(tag = "...")]`, `flatten`, and untagged
enums) always reports `is_human_readable() == true` regardless of the
upstream format, so a value originally produced as a non-HR struct could
be replayed into the HR branch as a map and fail with
`invalid type: map, expected an OutPoint`.

Fix the `serde_struct_human_string_impl!` macro to use a single visitor
that accepts all three input shapes (`visit_str`, `visit_seq`,
`visit_map`) and use `deserialize_any` in the HR branch so the actual
content shape is picked up regardless of what `is_human_readable`
reports. The non-HR branch keeps `deserialize_struct` so bincode
(which doesn't support `deserialize_any`) still works. The serialize
side is unchanged.

The trade-off: raw JSON now also accepts the struct/seq forms in
addition to the canonical \`\"txid:vout\"\` string form, and tolerates
unknown fields in the struct form. This is intentional and required —
internally-tagged enums replay the entire buffered content (including
the tag field) when resolving each variant, so \`visit_map\` must accept
unknown fields. Canonical JSON output is unchanged because the
serialize side still emits the string form.

Add a regression test that round-trips an \`OutPoint\` through an
internally-tagged enum via \`serde_json::Value\` (forcing the
\`ContentDeserializer\` path) and through bincode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1e8a0c43-13af-42c3-aee6-b86c3356e084

📥 Commits

Reviewing files that changed from the base of the PR and between ec54f45 and be00e51.

📒 Files selected for processing (3)
  • dash/Cargo.toml
  • dash/src/blockdata/transaction/outpoint.rs
  • dash/src/serde_utils.rs
✅ Files skipped from review due to trivial changes (1)
  • dash/Cargo.toml

📝 Walkthrough

Walkthrough

Refactor serde deserialization for human-readable struct/string types into a unified visitor that accepts strings, sequences, and maps; add a serde-focused regression test for OutPoint round-trips through JSON/bincode and for map/duplicate-field handling; enable bincode's serde feature for dev/tests.

Changes

Cohort / File(s) Summary
Test Coverage
dash/src/blockdata/transaction/outpoint.rs
Adds a new #[cfg(feature = "serde")] unit test that constructs an internally-tagged enum containing an OutPoint, forces serde's intermediate Value representation, verifies JSON and bincode round-trips, validates map-form decoding, and asserts duplicate-txid field produces a "duplicate field" error.
Serde Deserialization Macro
dash/src/serde_utils.rs
Refactors serde_struct_human_string_impl! macro deserialization into a single Visitor implementing visit_str, visit_seq, and visit_map; when human-readable, uses deserializer.deserialize_any(Visitor) to accept buffered content; adds duplicate-field detection and missing-field errors while retaining Enum-based map discrimination.
Dev Dependency Feature
dash/Cargo.toml
Enables the serde feature for the bincode dev-dependency to allow bincode serde tests to run in development/test builds.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibble bytes in JSON light,
I hop through maps and strings at night,
One Visitor greets each door—
Round-trips and errors, checked once more,
Hooray! Serialization's right. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly references the core issue being fixed: making OutPoint's serde deserialization tolerant of ContentDeserializer's behavior when is_human_readable() reports true for non-HR content.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/outpoint-serde-content-deserializer

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 60 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dash/src/blockdata/transaction/outpoint.rs`:
- Around line 372-378: The existing test only exercises the string
deserialization path; add a second round-trip that constructs a
serde_json::Value::Object representing the OutPoint (e.g. a Map/object with
"txid" and "vout" fields) so from_value triggers the visit_map path and then
assert equality. In the same test (where types OutPoint and Tagged and variables
original/restored are used), build a serde_json::Map or use serde_json::json!({
"txid": original.txid.to_string(), "vout": original.vout }) ->
serde_json::Value, call serde_json::from_value to get a Tagged, and
assert_eq!(original, restored) to cover the human-readable map replay path.

In `@dash/src/serde_utils.rs`:
- Around line 466-478: The deserializer currently overwrites repeated map fields
(the match arm Some(Enum::$fe) sets $fe = Some(map.next_value()?) without
checking), so update that branch to reject duplicates: before assigning, check
if $fe.is_some() and if so return
Err(A::Error::duplicate_field(stringify!($fe))); otherwise set $fe =
Some(map.next_value()?); keep the rest of the post-processing (the match that
returns missing_field) unchanged so duplicate inputs cause an explicit
duplicate_field error instead of silently keeping the last value.
🪄 Autofix (Beta)

❌ Autofix failed (check again to retry)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 19f476bf-da5a-4cea-a89f-7d5611a83025

📥 Commits

Reviewing files that changed from the base of the PR and between 8a68324 and ec54f45.

📒 Files selected for processing (2)
  • dash/src/blockdata/transaction/outpoint.rs
  • dash/src/serde_utils.rs

Comment thread dash/src/blockdata/transaction/outpoint.rs
Comment thread dash/src/serde_utils.rs
The dev-dependency on `bincode` doesn't enable the `serde` feature, so
`bincode::serde::encode_to_vec` is not in scope. The non-human-readable
struct path is already exercised by the existing dashcore test suite
which uses bincode extensively, so the regression test only needs to
lock in the `ContentDeserializer` behavior — kept the JSON-via-tagged-
enum and string-form assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 1, 2026

Codecov Report

❌ Patch coverage is 89.79592% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.66%. Comparing base (8a68324) to head (be00e51).

Files with missing lines Patch % Lines
dash/src/serde_utils.rs 81.13% 10 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           v0.42-dev     #708      +/-   ##
=============================================
+ Coverage      70.56%   70.66%   +0.10%     
=============================================
  Files            320      320              
  Lines          68150    68195      +45     
=============================================
+ Hits           48087    48188     +101     
+ Misses         20063    20007      -56     
Flag Coverage Δ
core 76.04% <89.79%> (+0.23%) ⬆️
ffi 43.37% <ø> (ø)
rpc 20.00% <ø> (ø)
spv 87.42% <ø> (+0.08%) ⬆️
wallet 69.48% <ø> (ø)
Files with missing lines Coverage Δ
dash/src/blockdata/transaction/outpoint.rs 82.43% <100.00%> (+4.93%) ⬆️
dash/src/serde_utils.rs 36.03% <81.13%> (+19.36%) ⬆️

... and 5 files with indirect coverage changes

Adding `features = ["serde"]` to the bincode dev-dep so the regression
test can exercise the symmetric non-human-readable struct path through
the tagged enum, not just the human-readable JSON path. Reword the
macro doc comment so it frames the underlying issue as
serde's `ContentDeserializer` HR-quirk rather than implying a bug
specific to dashcore — the same sharp edge bites every custom
`Deserialize` that uses `is_human_readable()` for shape dispatch and
the upstream stance is "accept any shape, don't branch".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@shumkov shumkov changed the title fix(dashcore): make OutPoint serde survive ContentDeserializer replays fix(dashcore): make OutPoint serde tolerant of ContentDeserializer's HR-quirk May 1, 2026
shumkov and others added 3 commits May 1, 2026 18:51
A bincode round-trip *via* a `#[serde(tag = ...)]` enum can't work — serde
internally-tagged dispatch buffers content by calling `deserialize_any`
on the upstream deserializer, and bincode is not self-describing
(`Serde(AnyNotSupported)`). That limitation is orthogonal to the bug
this PR fixes — the original failing case in dashpay/platform routes
through `platform_value::Value`, which *does* support `deserialize_any`.

Replace the bogus bincode-tagged-enum assertion with a plain `OutPoint`
bincode round-trip, which still locks the non-human-readable struct
path (`visit_seq`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`bincode::serde::encode_to_vec` takes the value by `Serialize` bound and
`OutPoint` is `Copy`, so the `&raw` borrow is redundant
(`clippy::needless_borrows_for_generic_args`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address CodeRabbit review feedback on the OutPoint serde fix:

1. Add a test that hand-builds the JSON object/map shape of `OutPoint`
   inside a `#[serde(tag = ...)]` enum and deserializes from it. This
   is the exact shape the original bug exhibited downstream — serde
   serialized as a map (because the upstream format was non-human-
   readable), then the tagged enum replayed through `ContentDeserializer`
   into the HR branch where the old string-only visitor errored. The
   `serde_json::to_value` round-trip was only exercising the visit_str
   path through `ContentDeserializer` because canonical OutPoint HR
   serialization is a string.

2. Reject duplicate field keys in `serde_struct_human_string_impl!`'s
   `visit_map`. Previously a duplicate silently kept the last value;
   now it errors with `duplicate field` like serde's derive. Add a
   test covering this — parse from a raw JSON string (not `json!`)
   because `Value::Object` deduplicates on construction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the ready-for-review CodeRabbit has approved this PR label May 1, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 4, 2026

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 4, 2026

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Autofix skipped. No unresolved CodeRabbit review comments with fix instructions found.

shumkov added a commit to dashpay/platform that referenced this pull request May 4, 2026
Per the wasm-dpp2 CONVENTIONS.md "Versioning" section: every versioned
protocol enum should use `#[serde(tag = "$formatVersion")]`. `Validator`
and `ValidatorSet` were the last two top-level versioned enums still
defaulting to externally-tagged `{"V0": {...}}` form.

Wire shape changes from:

    {"V0": {"pro_tx_hash": "...", ...}}

to the canonical:

    {"$formatVersion": "0", "pro_tx_hash": "...", ...}

JSON-side tests pass — dashcore hash newtypes (`ProTxHash`, `PubkeyHash`,
`QuorumHash`) deserialize cleanly from hex strings on the HR path.

Value-side tests are `#[ignore]`'d pending dashcore PR #708
(dashpay/rust-dashcore#708) — the dashcore hash
newtypes need dual-shape visitors so they round-trip through serde's
`ContentDeserializer`, which always reports `is_human_readable: true`
even when wrapping bytes from a non-HR source like `platform_value::Value`.
This is the same root cause as the OutPoint/Txid bug fixed locally in
commit 09c0a2b; ProTxHash/PubkeyHash trip the same wire on
`tag = "$formatVersion"` deserialization through ContentDeserializer.

Once that PR lands and we bump the dashcore dependency, drop the
`#[ignore]`s on the two value tests.

Note: `ValidatorSetV0::members` is `BTreeMap<ProTxHash, ValidatorV0>`
(not `BTreeMap<ProTxHash, Validator>`), so members are the bare V0
struct on the wire without their own `$formatVersion` tag — the test
documents this inline.

dpp lib: 3639 passing, 10 ignored (+2 from the new value-path
`#[ignore]`s, otherwise unchanged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xdustinface pushed a commit that referenced this pull request May 6, 2026
#729)

* fix(hashes): make SerdeHash tolerant of ContentDeserializer's HR-quirk

This is the same kind of serde-tag incompatibility fixed for `OutPoint`
in #708, applied to the hash-newtype family (sha256, sha256d, hash160,
hash_x11, ripemd160, sha1, sha512 — affecting Txid, BlockHash, ProTxHash,
PubkeyHash, QuorumHash, and every other type generated by `hash_newtype!`
or `serde_impl!`).

`SerdeHash::deserialize` used two separate visitors — a string-only HR
visitor (`HexVisitor`) and a bytes-only non-HR visitor (`BytesVisitor`).
That works fine in isolation but breaks the moment a hash-bearing struct
is wrapped by an internally-tagged enum (`#[serde(tag = "...")]`),
`flatten`, or an untagged enum. Serde routes those through
`ContentDeserializer`, a format-agnostic intermediate buffer that always
reports `is_human_readable() == true` regardless of the upstream format.
A value originally written by a non-HR encoder is therefore replayed
into the HR branch as raw bytes, which the previous `HexVisitor::visit_str`
saw as "32 chars" instead of "64-char hex" and rejected with
`bad hex string length 32 (expected 64)`.

This was hit downstream in dashpay/platform when validators / validator
sets (which contain `ProTxHash`, `PubkeyHash`, `QuorumHash`) were
configured for the dpp `tag = "$formatVersion"` versioning convention.

## Fix

Rework `SerdeHash::deserialize` to use a single `AnyShapeVisitor` that
accepts every shape a hash can arrive in:

- `visit_str` / `visit_borrowed_str` — ASCII hex (canonical HR form).
- `visit_bytes` / `visit_borrowed_bytes` — disambiguated by length:
  exactly `N` bytes → raw hash, exactly `2*N` bytes → UTF-8 hex.
  Any other length is rejected.
- `visit_seq` — length-prefixed `u8` sequence (used by bincode and
  other non-self-describing formats).

Use `deserialize_any` in the HR branch so the actual content shape —
not the reported HR flag — drives dispatch. Keep `deserialize_bytes`
in the non-HR branch since bincode is non-self-describing and does not
support `deserialize_any`.

## Trade-off

Raw JSON now also accepts the byte-form (`"\x11..."` UTF-8 bytes vs.
`"11..."` hex string) because `deserialize_any` in serde_json's
self-describing mode dispatches based on the JSON token. We disambiguate
strictly by length in `visit_bytes`, so anything that's neither N bytes
nor 2*N bytes still errors. This is consistent with the OutPoint fix
in #708 — accept any shape, validate by length.

## Implementation note: no_std / no alloc

`dashcore_hashes` does not enable `serde/alloc` (it has only `serde-std`
which transitively gates that), so `Visitor::visit_byte_buf` and
`visit_string` (defined behind serde's `alloc` feature) are unavailable.
The `visit_seq` path uses a stack array sized to fit the largest hash
(64 bytes — sha512) instead of a `Vec`, keeping the crate's no-alloc
posture.

## Tests

Two new regression tests in `dash/src/hash_types.rs`:

- `serde_round_trip_through_internally_tagged_enum` — wraps a `Txid`
  in a `#[serde(tag = "type")]` enum, round-trips through
  `serde_json::Value` (which forces buffering through
  `ContentDeserializer`), and asserts the round-trip is identity. Also
  verifies the canonical hex-string form still deserializes and that
  bincode round-trip still succeeds via the byte/seq path.
- `serde_round_trip_through_internally_tagged_enum_pubkey_hash` —
  same shape with `PubkeyHash` (20-byte hash) to exercise the
  smaller-length disambiguation path.

`bincode` dev-dep updated to `features = ["serde"]` (same change as #708)
so the bincode regression assertion compiles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(test): rustfmt + correct comment about N being in bytes

* review: address PR feedback — real visit_seq test + debug_assert overflow

Two in-scope fixes from review:

1. The Txid round-trip test had an abandoned `raw_txid_bytes` literal
   followed by `let _ = raw_txid_bytes; // documentation only` — leftover
   exploration that misled readers into thinking the bytes were used.
   Replace with a real assertion that constructs a `serde_json::Value::Array`
   of u8 numbers, wraps it in a `#[serde(tag = "type")]` enum, and
   round-trips through `serde_json::from_value`. This now actually
   exercises the new `visit_seq` path through `ContentDeserializer` —
   the security review noted that the prior test only hit `visit_str`,
   leaving `visit_bytes`/`visit_seq` regression coverage thin.

2. The `MAX_HASH_BYTES = 64` overflow check in `visit_seq` was returning
   a runtime error with a debug-prose string ("recompile with larger
   MAX") that leaked an internal type name to user error logs. Convert
   to `debug_assert!` — failure mode is now a test panic in debug builds
   (caught at CI time when adding a wider hash type), zero overhead in
   release. The condition is unreachable in any release build that
   compiled at all, since adding a wider digest would require updating
   `serde_impl!` invocations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@xdustinface xdustinface merged commit cab6d7d into v0.42-dev May 6, 2026
62 of 63 checks passed
@xdustinface xdustinface deleted the fix/outpoint-serde-content-deserializer branch May 6, 2026 08:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-review CodeRabbit has approved this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants