fix(hashes): make SerdeHash tolerant of ContentDeserializer's HR-quirk by shumkov · Pull Request #729 · dashpay/rust-dashcore

shumkov · 2026-05-05T07:07:27Z

Summary

Same kind of serde-tag incompatibility as #708, but in a different macro family. #708 fixed OutPoint's serde_struct_human_string_impl!. This PR fixes the hash-newtype family: SerdeHash::deserialize (in hashes/src/serde_macros.rs), used by every hash_newtype! / serde_impl!-generated type — Txid, BlockHash, ProTxHash, PubkeyHash, QuorumHash, all the sha256/sha256d/hash160/hash_x11 wrappers.

SerdeHash::deserialize used two separate visitors: a string-only human-readable (HR) visitor (HexVisitor) and a bytes-only non-HR visitor (BytesVisitor). That works in isolation but breaks as soon as a hash-bearing struct is wrapped by an internally-tagged enum (#[serde(tag = "...")]), flatten, or an untagged enum: serde routes those through ContentDeserializer, a format-agnostic intermediate buffer that always reports is_human_readable() == true regardless of the upstream format. A value originally written by a non-HR encoder is therefore replayed into the HR branch as raw bytes, which the previous HexVisitor::visit_str saw as "32 chars" instead of "64-char hex" and rejected with `bad hex string length 32 (expected 64)`.
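The failure mode can be reproduced without serde. The sketch below is a simplified model, not the crate's code: `Content`, `decode_hex` and `flag_driven_deserialize` are hypothetical names, and the model assumes the HR branch treats whatever payload it receives as hex text.

```rust
// Hypothetical model for illustration; not the code in hashes/src/serde_macros.rs.

/// What a buffering layer may replay to the hash deserializer.
enum Content<'a> {
    Str(&'a str),
    Bytes(&'a [u8]),
}

/// Decode exactly 2*N ASCII hex characters into N bytes.
fn decode_hex<const N: usize>(text: &[u8]) -> Result<[u8; N], String> {
    if text.len() != 2 * N {
        return Err(format!("bad hex string length {} (expected {})", text.len(), 2 * N));
    }
    let mut out = [0u8; N];
    for (i, pair) in text.chunks(2).enumerate() {
        let hi = (pair[0] as char)
            .to_digit(16)
            .ok_or_else(|| "invalid hex character".to_string())?;
        let lo = (pair[1] as char)
            .to_digit(16)
            .ok_or_else(|| "invalid hex character".to_string())?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Ok(out)
}

/// Flag-driven dispatch: the HR flag alone decides how the payload is read.
fn flag_driven_deserialize<const N: usize>(
    is_human_readable: bool,
    content: Content<'_>,
) -> Result<[u8; N], String> {
    let payload: &[u8] = match content {
        Content::Str(s) => s.as_bytes(),
        Content::Bytes(b) => b,
    };
    if is_human_readable {
        // Everything is assumed to be hex text, even replayed raw bytes.
        decode_hex::<N>(payload)
    } else if payload.len() == N {
        let mut out = [0u8; N];
        out.copy_from_slice(payload);
        Ok(out)
    } else {
        Err(format!("bad byte length {} (expected {})", payload.len(), N))
    }
}
```

With N = 32, passing 32 raw bytes together with `is_human_readable = true` fails with `bad hex string length 32 (expected 64)`, while the same bytes with the flag set to `false` decode as expected.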

This was hit downstream in dashpay/platform when Validator / ValidatorSet (which contain ProTxHash, PubkeyHash, QuorumHash) were configured for the dpp `tag = "$formatVersion"` versioning convention.

Fix

Rework SerdeHash::deserialize to use a single AnyShapeVisitor that accepts every shape a hash can arrive in:

visit_str / visit_borrowed_str — ASCII hex (canonical HR form).
visit_bytes / visit_borrowed_bytes — disambiguated by length: exactly N bytes → raw hash, exactly 2*N bytes → UTF-8 hex. Any other length errors.
visit_seq — length-prefixed u8 sequence (bincode and similar).

Use deserialize_any in the HR branch so the actual content shape — not the reported HR flag — drives dispatch. Keep deserialize_bytes in the non-HR branch since bincode is non-self-describing.
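The length rule listed for visit_bytes can be sketched in plain Rust. This is a minimal illustration under assumptions, not the macro-generated visitor: `hash_from_bytes`, `hash_from_str` and `decode_hex` are hypothetical helpers, and errors are plain strings rather than serde errors.

```rust
// Illustrative model only; the real visitor implements serde's Visitor trait.

/// Decode exactly 2*N ASCII hex characters into N bytes.
fn decode_hex<const N: usize>(text: &[u8]) -> Result<[u8; N], String> {
    if text.len() != 2 * N {
        return Err(format!("bad hex string length {} (expected {})", text.len(), 2 * N));
    }
    let mut out = [0u8; N];
    for (i, pair) in text.chunks(2).enumerate() {
        let hi = (pair[0] as char)
            .to_digit(16)
            .ok_or_else(|| "invalid hex character".to_string())?;
        let lo = (pair[1] as char)
            .to_digit(16)
            .ok_or_else(|| "invalid hex character".to_string())?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Ok(out)
}

/// Byte-shaped input: N bytes is a raw hash, 2*N bytes is UTF-8 hex,
/// and any other length is rejected.
fn hash_from_bytes<const N: usize>(bytes: &[u8]) -> Result<[u8; N], String> {
    if bytes.len() == N {
        let mut out = [0u8; N];
        out.copy_from_slice(bytes);
        Ok(out)
    } else if bytes.len() == 2 * N {
        decode_hex::<N>(bytes)
    } else {
        Err(format!(
            "invalid length {} (expected {} raw bytes or {} hex characters)",
            bytes.len(),
            N,
            2 * N
        ))
    }
}

/// String-shaped input is always treated as hex.
fn hash_from_str<const N: usize>(text: &str) -> Result<[u8; N], String> {
    decode_hex::<N>(text.as_bytes())
}
```

Because the input's shape and length select the decoder, the result no longer depends on what the deserializer reports for is_human_readable().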

This is the well-established serde workaround documented in third-party crates that hit the same wall (e.g. BinaryData, Identifier, Bytes32 in rs-platform-value, and now OutPoint after #708).

Trade-off

Raw JSON now also accepts the byte form ("\x11..." UTF-8 bytes vs. "11..." hex string) because `deserialize_any` in serde_json's self-describing mode dispatches on the JSON token. We disambiguate strictly by length in `visit_bytes`, so anything that's neither `N` nor `2*N` bytes still errors. This is consistent with the OutPoint fix.

Implementation note: no_std / no alloc

dashcore_hashes does not enable serde/alloc (only serde-std which transitively does), so Visitor::visit_byte_buf and visit_string (gated behind serde's alloc feature) are unavailable. The `visit_seq` path uses a stack array sized to fit the largest hash (64 bytes — sha512) instead of a Vec, keeping the crate's no-alloc posture.
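A rough sketch of the stack-buffer idea, assuming the sequence arrives as an iterator of `u8`. `collect_hash_seq` and its error strings are hypothetical; the real visit_seq reads from serde's SeqAccess instead.

```rust
/// Largest digest size named in the description above (sha512).
const MAX_HASH_BYTES: usize = 64;

/// Collect a `u8` sequence into a fixed-size stack buffer, requiring exactly
/// `expected_len` elements. No heap allocation is involved.
/// The caller reads the hash from `&buf[..expected_len]`.
fn collect_hash_seq<I>(seq: I, expected_len: usize) -> Result<[u8; MAX_HASH_BYTES], &'static str>
where
    I: IntoIterator<Item = u8>,
{
    if expected_len > MAX_HASH_BYTES {
        return Err("hash wider than the stack buffer");
    }
    let mut buf = [0u8; MAX_HASH_BYTES];
    let mut filled = 0usize;
    for byte in seq {
        if filled == expected_len {
            return Err("too many elements in sequence");
        }
        buf[filled] = byte;
        filled += 1;
    }
    if filled != expected_len {
        return Err("too few elements in sequence");
    }
    Ok(buf)
}
```

The buffer is sized for the widest digest, so a 20-byte or 32-byte hash uses only a prefix of it and the remaining bytes stay zero.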

Tests

Two regression tests in dash/src/hash_types.rs:

serde_round_trip_through_internally_tagged_enum: wraps a Txid in a #[serde(tag = "type")] enum, round-trips through serde_json::Value (which forces buffering through ContentDeserializer), and asserts identity. Also verifies the canonical hex-string form still deserializes and bincode round-trip still succeeds via the byte/seq path.
serde_round_trip_through_internally_tagged_enum_pubkey_hash: same shape with PubkeyHash (20-byte hash) to exercise the smaller-length disambiguation path.

bincode dev-dep updated to features = ["serde"] (same change as #708) so the bincode regression assertion compiles.

Local test results

dashcore_hashes: 7 passed, 0 failed.
dashcore --features serde: 551 passed, 0 failed (17 pre-existing ignores), including the two new tests.

Related

fix(dashcore): make OutPoint serde tolerant of ContentDeserializer's HR-quirk #708: same root cause, different macro (OutPoint / serde_struct_human_string_impl!).
Downstream in dashpay/platform: this PR unblocks the value-side round-trip tests on Validator / ValidatorSet, which were marked #[ignore] while waiting for this fix.

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Added regression tests validating serialization and deserialization of core data types across multiple encoding formats (JSON, binary, and hex variants)
Chores
- Updated development dependencies
- Improved deserialization robustness to handle multiple input formats

This is the same kind of serde-tag incompatibility fixed for `OutPoint` in #708, applied to the hash-newtype family (sha256, sha256d, hash160, hash_x11, ripemd160, sha1, sha512, affecting Txid, BlockHash, ProTxHash, PubkeyHash, QuorumHash, and every other type generated by `hash_newtype!` or `serde_impl!`).

`SerdeHash::deserialize` used two separate visitors: a string-only HR visitor (`HexVisitor`) and a bytes-only non-HR visitor (`BytesVisitor`). That works fine in isolation but breaks the moment a hash-bearing struct is wrapped by an internally-tagged enum (`#[serde(tag = "...")]`), `flatten`, or an untagged enum. Serde routes those through `ContentDeserializer`, a format-agnostic intermediate buffer that always reports `is_human_readable() == true` regardless of the upstream format. A value originally written by a non-HR encoder is therefore replayed into the HR branch as raw bytes, which the previous `HexVisitor::visit_str` saw as "32 chars" instead of "64-char hex" and rejected with `bad hex string length 32 (expected 64)`.

This was hit downstream in dashpay/platform when validators / validator sets (which contain `ProTxHash`, `PubkeyHash`, `QuorumHash`) were configured for the dpp `tag = "$formatVersion"` versioning convention.

## Fix

Rework `SerdeHash::deserialize` to use a single `AnyShapeVisitor` that accepts every shape a hash can arrive in:

- `visit_str` / `visit_borrowed_str`: ASCII hex (canonical HR form).
- `visit_bytes` / `visit_borrowed_bytes`: disambiguated by length. Exactly `N` bytes → raw hash, exactly `2*N` bytes → UTF-8 hex. Any other length is rejected.
- `visit_seq`: length-prefixed `u8` sequence (used by bincode and other non-self-describing formats).

Use `deserialize_any` in the HR branch so the actual content shape, not the reported HR flag, drives dispatch. Keep `deserialize_bytes` in the non-HR branch since bincode is non-self-describing and does not support `deserialize_any`.

## Trade-off

Raw JSON now also accepts the byte-form (`"\x11..."` UTF-8 bytes vs. `"11..."` hex string) because `deserialize_any` in serde_json's self-describing mode dispatches based on the JSON token. We disambiguate strictly by length in `visit_bytes`, so anything that's neither N bytes nor 2*N bytes still errors. This is consistent with the OutPoint fix in #708: accept any shape, validate by length.

## Implementation note: no_std / no alloc

`dashcore_hashes` does not enable `serde/alloc` (it has only `serde-std` which transitively gates that), so `Visitor::visit_byte_buf` and `visit_string` (defined behind serde's `alloc` feature) are unavailable. The `visit_seq` path uses a stack array sized to fit the largest hash (64 bytes, sha512) instead of a `Vec`, keeping the crate's no-alloc posture.

## Tests

Two new regression tests in `dash/src/hash_types.rs`:

- `serde_round_trip_through_internally_tagged_enum`: wraps a `Txid` in a `#[serde(tag = "type")]` enum, round-trips through `serde_json::Value` (which forces buffering through `ContentDeserializer`), and asserts the round-trip is identity. Also verifies the canonical hex-string form still deserializes and that bincode round-trip still succeeds via the byte/seq path.
- `serde_round_trip_through_internally_tagged_enum_pubkey_hash`: same shape with `PubkeyHash` (20-byte hash) to exercise the smaller-length disambiguation path.

`bincode` dev-dep updated to `features = ["serde"]` (same change as #708) so the bincode regression assertion compiles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-05T07:07:41Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 77f04166-a6d0-4bb1-87b8-e226724836b7

📥 Commits

Reviewing files that changed from the base of the PR and between d648444 and 3e74bd8.

📒 Files selected for processing (2)

dash/src/hash_types.rs
hashes/src/serde_macros.rs

📝 Walkthrough

Walkthrough

This PR refactors serde deserialization for hash types by introducing a unified AnyShapeVisitor to replace separate hex and bytes visitors, handles multiple input formats (hex strings, raw bytes, sequences), enables the bincode serde feature in dev-dependencies, and adds regression tests for the new deserialization logic.

Changes

Serde Deserialization Refactor

| Layer / File(s) | Summary |
|---|---|
| Core Deserialization Logic `hashes/src/serde_macros.rs` | Replaces `HexVisitor` and `BytesVisitor` with a single `AnyShapeVisitor` that handles ASCII hex strings, raw byte slices, hex-encoded UTF-8 bytes, and length-prefixed `u8` sequences via `visit_seq`. Updates `SerdeHash::deserialize` to route human-readable cases through `deserialize_any` and non-human-readable through `deserialize_bytes`. |
| Test Dependencies `dash/Cargo.toml` | Enables `bincode` `serde` feature in `dev-dependencies` to support serde round-trip testing. |
| Regression Tests `dash/src/hash_types.rs` | Adds gated `#[cfg(all(test, feature = "serde"))]` tests validating `Txid` and `PubkeyHash` round-trip through `serde_json` with internally tagged enums, including non-human-readable (byte) and human-readable (hex string) deserialization paths, plus bincode compatibility for `Txid`. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 The desert hashes now flow through a single path so true,
AnyShape catches what comes through—hex, bytes, sequences too!
Tests whisper serde songs in tags internally bound,
Where round-trips and bincode now safely abound. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'fix(hashes): make SerdeHash tolerant of ContentDeserializer's HR-quirk' directly and specifically summarizes the main change: fixing a serde deserialization incompatibility in the SerdeHash trait when wrapped by internally-tagged enums. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


…e PR

dashcore PR #729 (dashpay/rust-dashcore#729) is the companion to #708: same `ContentDeserializer` HR-quirk root cause, but for the separate `hashes::serde_macros::SerdeHash` macro family that generates `Txid` / `BlockHash` / `ProTxHash` / `PubkeyHash` / `QuorumHash` etc. (vs. #708, which fixed `OutPoint` via `serde_struct_human_string_impl!`).

Update the two `#[ignore]` notes on `Validator::value_round_trip` and `ValidatorSet::value_round_trip` to reference #729 instead of the vague "follow-up PR" phrasing. When #729 lands and we bump dashcore, drop the `#[ignore]`s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (1)

dash/src/hash_types.rs (1)

435-453: ⚡ Quick win

The byte-shape regression path is documented but not actually tested.

This block defines raw_txid_bytes and then discards it, so the test still doesn’t assert the exact failure mode (bytes replayed through ContentDeserializer in a tagged context). Please convert this into an executable deserialization assertion to lock the bugfix in.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dash/src/hash_types.rs` around lines 435 - 453, The test currently defines
raw_txid_bytes then drops it; change it to perform an actual
serialization/deserialization round-trip that exercises the tagged-enum +
ContentDeserializer path and assert the resulting Txid/newtype equals the
original bytes. Concretely, construct the same tagged enum JSON/serde Value that
would produce Value::Bytes32 (e.g., wrap raw_txid_bytes as the
non-human-readable bytes form used by platform_value), feed it through the same
deserialization path used in this test (invoking ContentDeserializer /
serde_json round-trip or serde_test bincode-like raw bytes), deserialize into
the Txid/newtype type used in this file, and add an
assert_eq!(deserialized_txid.as_bytes(), &raw_txid_bytes). This ensures the
previous "bad hex string length 32 (expected 64)" regression is covered.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8d7b29f5-9a72-45ed-b977-1bd5458f31e2

📥 Commits

Reviewing files that changed from the base of the PR and between d67cc03 and d648444.

📒 Files selected for processing (3)

dash/Cargo.toml
dash/src/hash_types.rs
hashes/src/serde_macros.rs

codecov · 2026-05-05T07:12:31Z

Codecov Report

❌ Patch coverage is 90.47619% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 71.00%. Comparing base (d67cc03) to head (3e74bd8).
⚠️ Report is 1 commits behind head on v0.42-dev.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| hashes/src/serde_macros.rs | 80.48% | 8 Missing ⚠️ |

Additional details and impacted files

@@              Coverage Diff              @@
##           v0.42-dev     #729      +/-   ##
=============================================
+ Coverage      70.96%   71.00%   +0.03%     
=============================================
  Files            319      319              
  Lines          68387    68457      +70     
=============================================
+ Hits           48531    48605      +74     
+ Misses         19856    19852       -4

| Flag | Coverage Δ | |
|---|---|---|
| core | `75.92% <90.47%> (+0.09%)` | ⬆️ |
| ffi | `45.49% <ø> (ø)` | |
| rpc | `20.00% <ø> (ø)` | |
| spv | `87.52% <ø> (-0.01%)` | ⬇️ |
| wallet | `69.61% <ø> (ø)` | |

| Files with missing lines | Coverage Δ | |
|---|---|---|
| dash/src/hash_types.rs | `63.87% <100.00%> (+13.87%)` | ⬆️ |
| hashes/src/serde_macros.rs | `85.91% <80.48%> (+20.00%)` | ⬆️ |

... and 3 files with indirect coverage changes

…flow

Two in-scope fixes from review:

1. The Txid round-trip test had an abandoned `raw_txid_bytes` literal followed by `let _ = raw_txid_bytes; // documentation only`, leftover exploration that misled readers into thinking the bytes were used. Replace with a real assertion that constructs a `serde_json::Value::Array` of u8 numbers, wraps it in a `#[serde(tag = "type")]` enum, and round-trips through `serde_json::from_value`. This now actually exercises the new `visit_seq` path through `ContentDeserializer`; the security review noted that the prior test only hit `visit_str`, leaving `visit_bytes`/`visit_seq` regression coverage thin.
2. The `MAX_HASH_BYTES = 64` overflow check in `visit_seq` was returning a runtime error with a debug-prose string ("recompile with larger MAX") that leaked an internal type name to user error logs. Convert to `debug_assert!`: the failure mode is now a test panic in debug builds (caught at CI time when adding a wider hash type), with zero overhead in release. The condition is unreachable in any release build that compiled at all, since adding a wider digest would require updating `serde_impl!` invocations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai Bot reviewed May 5, 2026

coderabbitai Bot previously approved these changes May 5, 2026

fix(test): rustfmt + correct comment about N being in bytes

ba771ed

shumkov dismissed coderabbitai[bot]’s stale review via ba771ed May 5, 2026 07:27

shumkov self-assigned this May 5, 2026

xdustinface approved these changes May 6, 2026

xdustinface merged commit 56fe09d into v0.42-dev May 6, 2026
57 of 58 checks passed

xdustinface deleted the fix/hashes-serde-content-deserializer branch May 6, 2026 08:35

xdustinface merged 3 commits into v0.42-dev from fix/hashes-serde-content-deserializer