-
Notifications
You must be signed in to change notification settings - Fork 636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Witness struct to improve ser/de perfomance #672
Conversation
Have you seen https://github.com/rust-bitcoin/rust-bitcoin/pull/668/files#diff-c55d0186f3d5990b5e223c1a8aa1bbe60988201c452dd13b3324e892bc33119eR749-R825 ? It should be very complimentary to this PR |
I didn't but I've looked it now, I am not sure to grasp the complementarity since one it's address witness and the other is the witness stack in the inputs, could you elaborate? |
af9711e
to
8dcf3d6
Compare
By "complementing" I mean they solve strong types issue for segwit related structures. I gave you link just to prevent double work from you if you will occasionally do that type which I already did |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome! I've historically always wanted to do this on a broader scale - think a similar "store it in serialized form" for an entire transaction/block and just store pointers to the start of various datastructures to provide accessors. Of course that could come later, but this is awesome by itself.
|
||
/// contains the witness Vec<Vec<u8>> serialization without the initial varint indicating the | ||
/// number of elements (which is stored in len) | ||
content: Vec<u8>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going down this path, does it make sense to also consider something like Bitcoin Core's "prevec" for the witness contents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I mean such a thing is super-duper trivial to write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a tinyvec like structure could probably be applicable broadly (like also for inputs/outpus) and as an addition to this, probably better to address in another PR
|
||
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug, Hash)] | ||
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] //TODO manually implement? | ||
pub struct Witness { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to have a "pointers to the start of each witness element" vec as well? I guess given most witnesses are like 2-5 elements long maybe its not worth it, but would be cool to add a benchmark for iterating a lightning HTLC transaction witness and just test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pointers to the start of each witness element would allow efficient index access but require another Vec.
I think for witness is not worth it because, as you said usually the witness is a few elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe have offsets in SmallVec
?
We could also optimistically assume that transactions are smaller than 4M bytes, so we could cram offsets into struct U24(u8, u8, u8);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe have offsets in SmallVec?
#672 (comment)
Also, having offsets would allow efficient indexed access, but mutable access it's another story... In the end I don't think there are really use case requiring that. I think I would not add offsets here even with tinyvec/smallvec.
We could also optimistically assume that transactions are smaller than 4M bytes, so we could cram offsets into struct U24(u8, u8, u8);
the overall weight of the struct would not change because of padding (on most common 64 bits arch), I would leave this optimization in case we need other fields in the struct
One allocation improves cache locality so even ignoring allocations, it's faster to go through one chunk of linear memory than jump between many. |
8dcf3d6
to
8ccc278
Compare
Some updates:
|
for reference, I found this unmaintained implementation that stores blockchain data structures in slices while providing walker to access structured data |
Newtype for witness structure - yes please! It always felt strange to me having just Storing block in serialized form with pointers - this may help us with electrs performance. We're troubleshooting/brainstorming right now: romanz/electrs#547 cc @romanz I think it'd require a bit of reasonable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. Didn't do as deep review as I do usually. Maybe I will look at it more later.
|
||
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord, Debug, Hash)] | ||
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] //TODO manually implement? | ||
pub struct Witness { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe have offsets in SmallVec
?
We could also optimistically assume that transactions are smaller than 4M bytes, so we could cram offsets into struct U24(u8, u8, u8);
src/blockdata/witness.rs
Outdated
/// Support structure to allow efficient and convenient iteration over the Witness elements | ||
pub struct WitnessIterator<'a> { | ||
witness: &'a Witness, | ||
cursor: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just core::slice::Iter<'a, u8>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you can't get slices from it and should not work as Read?
Or maybe I don't get what you have in mind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can, it has as_slice()
method. I used it in #662 and #673
Was also thinking of doing take_slice_or_kill
via extension trait, so it could be reused. You seem to be doing something similar so maybe worth revisiting that idea.
(Was also considering adding it to core
as it seems widely applicable.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in 691179d
src/blockdata/witness.rs
Outdated
where | ||
S: serde::Serializer, | ||
{ | ||
let vec: Vec<_> = self.iter().map(|e| e.to_vec()).collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be faster but for initial code probably good. (Manual serde impls are annoying as hell.)
Not sure how applicable it is, but in |
@dpc yes, also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest update looks great!
self.cursor = end; | ||
Some(&vec[start..end]) | ||
let varint = VarInt::consensus_decode(self.0.as_slice()).ok()?; | ||
self.0.nth(varint.len() - 1)?; // VarInt::len returns at least 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see VarInt::consensus_decode_iter()
one day but good enough for this PR.
Needs a rebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without acessor methods to items inside Witness
(Index
impl or get
) it is literally impossible to use this structure in many cases (like in miniscript etc)
It is also required to change type of the witness data in PSBT inputs.
Can you also pls stick in impl Display for Witness {
fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
for item in self.iter() {
writeln!(f, "{}", item.to_hex())?;
}
Ok(())
}
} |
Here are all the improvements I'd like to have in this PR: RCasatta#4 |
I don't think it's literally impossible to access elements without
Also, I can't find extensive indexed access in the miniscript lib (please point me to the code if I am wrong). So I am not convinced to implement
I am open to changing my mind, what do other people think? |
Why |
This PR seems to be skipping deserializing witness, so not everyone has to pay for it, and code that do need it, can pay for it how they want. Seems like a win in most cases, and a small inconvenience for other. |
Well, I'd like to have in command-line pro wallets (I have one myself) printing tx witness like
And not
|
@RCasatta replied in that PR, but briefly, we need things like "accessing second to last element" in many cases, including parsing witness data for taproot and segwit v0, so why not to have a index map as a part of the structure and const-time |
Adding an index map would defeat the main [1] purpose of this PR [2] because it would require another allocation for the map. If "accessing second to last element" is a common use case (and also the last, I guess), we can add two [1]: main because I think deserialization happens more than operating on the witness |
I think that is a good solution! |
If there are multiple strong usecases here, how about having two (or even multiple)
Edit. One could even have Edit. I guess some types don't need to implement both decode and encode. If any is missing it will just make them un-encodeable/decodeabe. A |
src/blockdata/witness.rs
Outdated
cursor: usize, | ||
} | ||
|
||
impl From<Vec<Vec<u8>>> for Witness { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In c342adb:
I think the From
trait is inappropriate here because you're doing a new allocation and the conversion is nontrivial. Can you pull this into an independent from_vec
or something method? (Also, could you add a to_vec
method which is ust a convenience wrapper over iterate-and-collect?)
Relatedly, in your serde Deserialize
impl, I'm sure we can do something more efficient than decoding a Vec<Vec<u8>>
and then converting -- though for this PR that's fine, we can do optimizations later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added 8d23404
If the commit is what you expected I guess I'll squash the last 4 commits in the first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not going to squash since it received the ACK it looks it's fine like it is
Done reviewing 00b2dd5. Looks great to me, just one comment. Thanks for adding the last and second-to-last accessors. I think these are pretty commonly useful, since in a complete witness stack the last element is the witness script, while the second-to-last is the actual "bottom" of the stack. |
231d784
to
8d23404
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 8d23404
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a detailed code review. Looks good, left one comment about clear function.
I was going to ask for a pop function, but I think for verifying and interpreting scripts witnesses, we should convert it to Vec<Vec>.
src/blockdata/witness.rs
Outdated
self.witness_elements = 0; | ||
} | ||
|
||
/// Push a new element on the witness, require an allocation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: requires an allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
src/blockdata/witness.rs
Outdated
assert_eq!(witness_serialized, serialize(&witness)); | ||
|
||
//assert_eq!(32, std::mem::size_of::<Witness>()); | ||
//assert_eq!(24, std::mem::size_of::<Option<Vec<u8>>>()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove comments here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They were removed in latest commit, but present in old one, after the squash they are removed completely
@@ -27,23 +28,35 @@ pub struct Witness { | |||
/// Number of elements in the witness. | |||
/// It is stored separately (instead of as VarInt in the initial part of content) so that method | |||
/// like [`Witness::push`] doesn't have case requiring to shift the entire array | |||
witness_elements: u64, | |||
witness_elements: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
last
and second_to_last
are not reset when calling the clear function
…eeping most usability. Witness struct is in place of the Vec<Vec<u8>> we have before this commit. from_vec() and to_vec() methods are provided to switch between this type and Vec<Vec<u8>> Moreover, implementation of Default, Iterator and others allows to have similar behaviour but using a single Vec prevent many allocations during deserialization which in turns results in better performance, even 20% better perfomance on recent block. last() and second_to_last() allows to access respective element without going through costly Vec transformation
ba224cf
to
106acdc
Compare
Fixed the clear method. I squashed all but the fuzz commit into one since they were changing the same code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 2fd0125
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 106acdc
Do you mean the tip commit? |
@dr-orlovsky could you remove the "changes requested" if everything is ok? I cannot merge otherwise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 106acdc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -657,10 +656,10 @@ impl<R: DerefMut<Target = Transaction>> SigHashCache<R> { | |||
/// let prevout_script = Script::new(); | |||
/// let _sighash = sig_hasher.segwit_signature_hash(inp, &prevout_script, 42, EcdsaSigHashType::All); | |||
/// // ... sign the sighash | |||
/// sig_hasher.witness_mut(inp).unwrap().push(Vec::new()); | |||
/// sig_hasher.witness_mut(inp).unwrap().push(&Vec::new()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
/// sig_hasher.witness_mut(inp).unwrap().push(&Vec::new()); | |
/// sig_hasher.witness_mut(inp).unwrap().push(&[]); |
#[cfg(feature = "serde")] | ||
use serde; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this seems redundant (we always have access to the crate without use
ing it)
|
||
impl Witness { | ||
/// Creates [`Witness`] object from an array of byte-arrays | ||
pub fn from_vec(vec: Vec<Vec<u8>>) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: why not impl From<Vec<Vec<u8>> for Witness
instead?
And probably pub fn from<T: AsRef<[u8]>>(bytes: &[T])
will be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because From
is for cheap/trivial conversions. This is neither.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From
is for cheap/trivial conversions
Not really, impl From<&str> for String
exists for instance and that conversion is not usually considered "cheap".
@dr-orlovsky I consider AsRef<[u8]>
a footgun since it would accept strings without explicit as_bytes()
} | ||
|
||
/// Returns a struct implementing [`Iterator`] | ||
pub fn iter(&self) -> Iter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the name here is confusing: it is not clear do we iterate over witness stack elements - or witness byte sequence. So I will prefer fn byte_iter(&self)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We iterate over witness stack elements. It wouldn't be meaningful to iterate over bytes. I don't think there is any ambiguity here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I see in the code we do iterate over bytes here: self.content.iter()
where content: Vec<u8>
will iterate over bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.content
is used internally to provide the witness stack elements iterator via iter()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @dr-orlovsky here, chars()
and bytes()
exist on &str
for this reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is clearly meaningful to iterate over either the characters or bytes of a string. It is not clear to me what it means to iterate over the bytes of a witness, or even in what order they would be returned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, if it seems completely obvious that it means stack elements fine for me. I'd just like the docs to be more descriptive.
Just note that non-obvious iterators can be a footgun. I once got burned by &std::path::Path
implementing IntoIterator
(Can you guess what it iterates over without looking it up?)
…arking 247a14f Use test big block for bench_stream_reader instead of making one (Riccardo Casatta) b92dfbb exclude test_data when publishing the crate (Riccardo Casatta) f5a9681 include a big block in test_data, use it for ser/de benchmark (Riccardo Casatta) 09dada5 Move bip158 test vectors to test_data (Riccardo Casatta) 06d1a82 Remove testnet block hex from tests, use test_data with include_bytes! (Riccardo Casatta) Pull request description: In the first two commits I moved some data from source files to the newly introduced `test_data` dir, including it with `include_[str|bytes]!` macro. The second-to-last commit introduces a big block in test_data which is very handy in ser/de benchmark (I used it for #672) because with smaller blocks you may not notice performance improvements. Since I don't want to pollute the package the last commit excludes the `test_data` dir from the published package. I think it's fine to do it because dependent packages don't run dependencies tests. ACKs for top commit: apoelstra: ACK 247a14f Kixunil: tACK 247a14f Tree-SHA512: a2beb635b0a358737d0b57d3e7205b1ddf87652b9a8c889ce63e2867659a8eaf7e43a5b87a453345d56d953745913f40b58596f449e5fbc87340e0dd2aef0727
…ex), and get_tapscript 3c0d5ae Add get_tapscript to Witness (junderw) 4226d60 Add Index<usize> and nth(index) to Witness (junderw) Pull request description: Ref: #672 (comment) [Add Index<usize> and nth(index) to Witness](4226d60) [4226d60](4226d60) Arbitrary indexing into Witness fixes the API of last and second_to_last to be more flexible. This patch started off as an addition of third_to_last, but ended up evolving into arbitrary indexing to allow for future use cases. A list of the indices of the start byte for each witness element is stored as an ordered contiguous group of u32s represented as 4 bytes each in the Vec<u8> contents. The bytes are stored using to_ne_bytes for performance reasons. A helper function is added to the tests to allow for easier contruction of the contents Vec in test vectors. u32 was chosen because 22 bits are needed to store 4,000,000 which is the maximum weight limit for a block. This might need to be reworked in the event of consensus limits increasing, but u32 can hold 1000x the current limit, so it should be fine for the forseeable future. The push and consensus_deserialize functions utilize rotate_left and rotate_right to move the indices to the end of the new allocation. Depending on the size of the data, this might be more of a performance hit than just allocating a new temporary Vec to store the indices and append them after parsing is completed. However, for a majority of cases rotating the indices should be faster. Suggestions to use VecDeque instead of Vec for contents necessitate other considerations, since it is not a public facing change, those optimizations can be dealt with in future patches. The Index<usize> trait is implemented using the new nth method with expect. The Iter struct is reworked to make use of the new data representation. This new data structure makes it trivial to implement DoubleEndedIterator and other such traits, but I have decided to leave this as out of scope for this patch. --- [Add get_tapscript to Witness](a7501d9) [a7501d9](a7501d9) This new method will check the last witness element to see if it starts with 0x50, and depending on the result it will return the second to last or third to last witness element according to BIP341. In its current state, Witness can not know what type of script it is fulfilling, so it is up to the caller to verify if the previous output is a taproot output or not. --- Edit: This is the previous PR body: > In a taproot script payment with annex, quick access to the 3rd to last element (which is the actual script in this case) is convenient. > > This feels like kicking the can down the road again, but I think it's a nice to have method. > > RCasatta dr-orlovsky were discussing this issue. I would like to ask if they have any thoughts on the addition of this. ACKs for top commit: tcharding: ACK 3c0d5ae apoelstra: ACK 3c0d5ae Kixunil: ACK 3c0d5ae Tree-SHA512: 0038eed6ad56786b8dd6d98db0d1846753b8b25de0bc1089cdc75d5850d0ccc66dde9a10be7fe09589ad7db118fd50ee9f7993695968df5c389457ccfcdaa761
At the moment the Witness struct is
Vec<Vec<u8>>
, the vec inside a vec cause a lot of allocations, specifically:The proposed Witness struct contains the serialized format of the witness. This reduces the allocations to:
The inconvenience is having slightly less comfortable access to the witness, but the iterator is efficient (no allocations) and you can always collect the iteration to have a Vec of slices. If you collect the iteration you end up doing allocation anyway, but the rationale is that it is an operation you need to do rarely while ser/de is done much more often.
I had to add a bigger block to better see the improvement (ae86024), these are the results of the benches on my machine:
With an increased performance to deserialize a block of about 21% and to serialize a block of about 16% (seems even higher than expected, need to do more tests to confirm, I'll appreciate tests results from reviewers)