New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Witness human-readable serde #899
Conversation
This also removes tests for JSON backward-compatible encoding. Human-readable encoding will be changed in the next commit and this will break backward compatibility, thus that part of the test is removed.
Vary strange that CI reports |
You might of forgotten The test failure seems to be because with this PR applied serde roundtrip fails for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two unit tests could be split up. They each contain a serde regression test and a serde roundtrip test.
Warning project management discussion: I'm confused, what does it mean that this PR got added to 0.28 milestone but does not use the RC-fix label, do we have two things now that each describe the same thing? Perhaps I should never have added the RC-fix label? |
My understanding that labels are useful for classification (priority, module, type) and not workflow (milestone, number of reviews required, stage of review) purposes. For the later, github provides milestones and project boards. So my proposal is to use appropriate tools and not to overload labels with all these responsibilities :) |
@tcharding thank you for the help with clarifying the source of the error. In fact I found out that error is not happening during PS: the actual error is not related to JSON serialization, but somehow to |
It really seems we have a The reason of the bug is the fact that |
Ok, completed my investigation and fixed the issue. While it is indeed a rust-bitcoin/src/blockdata/witness.rs Line 309 in c1d2104
Pushed the fixed version which made an explicit type: rust-bitcoin/src/blockdata/witness.rs Line 309 in 90d220d
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just two superficial suggestions :) Out of interest did you happen to audit all the serde impls we have to see if any others do not use is_human_readable
? (Just asking because I'll add it to my todo list if not.)
I looked into other places using |
Previous implementations of Witness (and Vec<Vec<u8>>) serde serialization didn't support human-readable representations. This resulted in long unreadable JSON/YAML byte arrays, which were especially ugly when pretty-printed (a line per each byte).
Is this needed for 0.28.0? |
i'd be a mild NACK here -- the best thing to do IMO is to have code on the other end that can parse a witness and represent it. i personally wouldn't use this in Sapio since i think for PSBTs i try to serialize them to encoded strings first almost universally (maybe i don't do this when it's just a internal-to-internal thing, but almost certainly i do it for internal/external since matching rust's choice of json for psbt is undesirable) |
@dr-orlovsky, I think this is also a feature request and not a bug introduced in 0.28.0. We never had human-readable serde for |
@apoelstra it is desirable for |
One option is to expose the new serialization in a minor release in 0.28 under a feature flag (improved-human-json) and then fully break in 0.29. |
or another -- just do an 0.29 in a month with the breaking changes if feature flag is bad. we shouldn't slow down releases just because we don't like 'ngu' on release version, semver is supposed to be used to help people pick, not moralize 0.28 as worse than 0.29. |
I personally see no harm of adding this functionality here as we did for all other types. Yes, it is possible to create wrappes upstream and manually do all the serde serializaiton, but since this is the only issue preventing from normal PSBT human readable serde, I do not see why we just can have it here and save a lot of work upstream. |
the harm is that it's a late phase substantial breaking change which would take more time to actually review. "human readable" is a attribute, not a purpose. Display is for "human readable" but not machine parsable. serde serializations are machine parsable. This breaks, in a non-trivial way, the representations of PSBTs late in the phase of a software release. after a rc.1, i don't think we should do breaking changes beyond something being newly broken, less os a feature change. e.g., if we had to change a method name, i wouldn't think that a huge deal as the fix is trivial. but changing the "AJI" of our serializations has ramifications for people building up the stack. this is not an argument against the change in general, but just that if you wanted it done a week before a release is not the time for it. |
Interestingly enough the argument is reversed compared to recent case of PSBT serde. Here we also have a consensus encoding for transaction data which is preferable to use - as was proposed by you in #898 (comment) Are you aware of people relying in their projects on serde human_readable serialization format for bitcoin Transaction-related data to whom this change will introduce harm, taking into account the amount of breaking changes we have in this release anyway? @shesek @romanz @RCasatta @Kixunil will this break anything in your projects? |
the difference is that change is for unreleased stuff, chiefly the newly introduced PsbtSigHashType variants. This would only affect, afaiu, people using PSBTs with taproot support. In contrast, the change to the PSBT type as a whole affects users from prior releases. |
Yes, the whole |
Seems like fair points from both of you, perhaps we have to accept that this 0.28 release is super unusual, is not going to be perfect, and should not set precedence on how we do releases? I'm starting to think that we should just use the Benevolent Dictator method for selection of PRs to merge for this release. Also I think we should spend a little time going over all the Taproot stuff once we release and do some consecutive major releases before rushing headlong into the next shiny new thing (MSRV/2018 bump). |
It doesn't break anything for me and I don't understand the opposition. The type was added in 0.28, so there can not be any break (Besides type being added but that's already a break in a different part.) |
The type wasn't added in 0.28, the Vec<Vec<>> would have the same json/yaml repr as Witness since Witness is a newtype wrapper. |
|
Yes, still I think hex represenation would be better/it's a bug not to. People concerned with perfect compatibility should probably use consensus encoding anyway - that one is guaranteed to be compatible and is also more compact. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO incorrect size hint is quite annoying as it could cause memory blow up so I'd like to see it fixed. The rest looks correct.
} | ||
seq.end() | ||
} else { | ||
let vec: Vec<_> = self.to_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note #942 applies, not urgent.
if serializer.is_human_readable() { | ||
let mut seq = serializer.serialize_seq(Some(self.serialized_len()))?; | ||
for elem in self.iter() { | ||
seq.serialize_element(&elem.to_hex())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allocation should be avoidable, not urgent but we could create an issue if this gets merged.
use hashes::hex::ToHex; | ||
use serde::ser::SerializeSeq; | ||
if serializer.is_human_readable() { | ||
let mut seq = serializer.serialize_seq(Some(self.serialized_len()))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
serialized_len
looks wrong - it's number of bytes, not number of items.
{ | ||
use hashes::hex::FromHex; | ||
let mut ret = Vec::new(); | ||
while let Some(elem) = a.next_element::<String>()? { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-critical performance stuff: reserve()
, serde_str_helpers::DeserBorrowStr
instead of String
, try appending hex bytes to existing Witness
instead of allocating a new Vec
for each element.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to get @apoelstra approval for another external dependency :) But I can confirm that that crate was created by @Kixunil and managed by him and me only with the most strict requirements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also NIH it, it's fairly simple. I mainly brin-dumped all opts I could see.
let mut ret = Vec::new(); | ||
while let Some(elem) = a.next_element::<String>()? { | ||
let vec = Vec::<u8>::from_hex(&elem).map_err(|_| { | ||
serde::de::Error::invalid_value(serde::de::Unexpected::Str(&elem), &"hex byte representation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not too important: I think this could be even more detailed - unexpected len and unexpected char.
@@ -282,8 +292,35 @@ impl<'de> serde::Deserialize<'de> for Witness { | |||
where | |||
D: serde::Deserializer<'de>, | |||
{ | |||
let vec: Vec<Vec<u8>> = serde::Deserialize::deserialize(deserializer)?; | |||
Ok(Witness::from_vec(vec)) | |||
struct Visitor; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: suggest calling it HRVisitor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can it help in hiring? :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely do not want HR visitors around here :)
I cleared the 0.28.0 milestone from this PR (as well as closing the milestone) because 0.28 is released. Flagging it so you can add a milestone back to this PR if you wish to. |
@tcharding I can't edit milestones anymore |
Damn, this probably should've been RC fix :( |
What is the status of this PR please? Are you working on it @dr-orlovsky, should I pick it up, or should we close it? Thanks. |
Feel free to pick it up |
Closing in favour of #1068 |
a1df62a Witness human-readable serde test (Dr Maxim Orlovsky) 68577df Witness human-readable serde (Dr Maxim Orlovsky) 93b66c5 Witness serde: test binary encoding to be backward-compatible (Dr Maxim Orlovsky) b409ae7 witness: Refactor import statements (Tobin C. Harding) e23d3a8 Remove unnecessary whitespace (Tobin C. Harding) ac55b10 Add whitespace between functions (Tobin C. Harding) Pull request description: This is dr-orlovsky's [PR](#899) picked up at his permission in the discussion thread. I went through the review comments and implemented everything except the perf optimisations. Also includes a patch at the front of the PR that adds a unit test that can be run to see the "before and after", not sure if we want it in, perhaps it should be removed before merge. This PR implicitly fixes 942. To test this PR works as advertised run `cargo test display_transaction --features=serde -- --nocapture` after creating a unit test as follows: ```rust // Used to verify that parts of a transaction pretty print. // `cargo test display_transaction --features=serde -- --nocapture` #[cfg(feature = "serde")] #[test] fn serde_display_transaction() { let tx_bytes = Vec::from_hex( "02000000000101595895ea20179de87052b4046dfe6fd515860505d6511a9004cf12a1f93cac7c01000000\ 00ffffffff01deb807000000000017a9140f3444e271620c736808aa7b33e370bd87cb5a078702483045022\ 100fb60dad8df4af2841adc0346638c16d0b8035f5e3f3753b88db122e70c79f9370220756e6633b17fd271\ 0e626347d28d60b0a2d6cbb41de51740644b9fb3ba7751040121028fa937ca8cba2197a37c007176ed89410\ 55d3bcb8627d085e94553e62f057dcc00000000" ).unwrap(); let tx: Transaction = deserialize(&tx_bytes).unwrap(); let ser = serde_json::to_string_pretty(&tx).unwrap(); println!("{}", ser); } ``` Fixes: #942 ACKs for top commit: apoelstra: ACK a1df62a Kixunil: ACK a1df62a Tree-SHA512: d0ef5b8cbf1cf8456eaaea490a793f1ac7dfb18067c4019a2c3a1bdd9627a231a4dd0a0151a4df9af2b32b909d4b384a5bec1dd3e38d44dc6a23f9c40aa4f1f9
…for `Witness` a1df62a Witness human-readable serde test (Dr Maxim Orlovsky) 68577df Witness human-readable serde (Dr Maxim Orlovsky) 93b66c5 Witness serde: test binary encoding to be backward-compatible (Dr Maxim Orlovsky) b409ae7 witness: Refactor import statements (Tobin C. Harding) e23d3a8 Remove unnecessary whitespace (Tobin C. Harding) ac55b10 Add whitespace between functions (Tobin C. Harding) Pull request description: This is dr-orlovsky's [PR](rust-bitcoin/rust-bitcoin#899) picked up at his permission in the discussion thread. I went through the review comments and implemented everything except the perf optimisations. Also includes a patch at the front of the PR that adds a unit test that can be run to see the "before and after", not sure if we want it in, perhaps it should be removed before merge. This PR implicitly fixes 942. To test this PR works as advertised run `cargo test display_transaction --features=serde -- --nocapture` after creating a unit test as follows: ```rust // Used to verify that parts of a transaction pretty print. // `cargo test display_transaction --features=serde -- --nocapture` #[cfg(feature = "serde")] #[test] fn serde_display_transaction() { let tx_bytes = Vec::from_hex( "02000000000101595895ea20179de87052b4046dfe6fd515860505d6511a9004cf12a1f93cac7c01000000\ 00ffffffff01deb807000000000017a9140f3444e271620c736808aa7b33e370bd87cb5a078702483045022\ 100fb60dad8df4af2841adc0346638c16d0b8035f5e3f3753b88db122e70c79f9370220756e6633b17fd271\ 0e626347d28d60b0a2d6cbb41de51740644b9fb3ba7751040121028fa937ca8cba2197a37c007176ed89410\ 55d3bcb8627d085e94553e62f057dcc00000000" ).unwrap(); let tx: Transaction = deserialize(&tx_bytes).unwrap(); let ser = serde_json::to_string_pretty(&tx).unwrap(); println!("{}", ser); } ``` Fixes: #942 ACKs for top commit: apoelstra: ACK a1df62a Kixunil: ACK a1df62a Tree-SHA512: d0ef5b8cbf1cf8456eaaea490a793f1ac7dfb18067c4019a2c3a1bdd9627a231a4dd0a0151a4df9af2b32b909d4b384a5bec1dd3e38d44dc6a23f9c40aa4f1f9
Previous implementations of Witness (and Vec<Vec>) serde serialization didn't support human-readable representations. This resulted in long unreadable JSON/YAML byte arrays, which were especially ugly when pretty-printed (a line per each byte).
an example of that ugliness
This breaks backward compatibility for human-readable types, but I assume that previous representation was actually "non-human readable" and it was a bug, not a feature. Waiting for confirmation from @JeremyRubin who I know uses JSON serialization a lot.