Add buffering during encode #138

tcharding · 2023-10-16T23:08:40Z

Currently we write one char at a time to both writers (fmt::Write and std::io::Write). We can improve performance by first collecting the ASCII bytes of the encoded bech32 string into a buffer on the stack, then writing the buffer out in a single call.

This is purely an optimization.
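A minimal sketch of the idea, not the code from this PR: the function names and the `MAX_LEN` constant are hypothetical, and the input is assumed to be an iterator of ASCII `char`s.

```rust
use core::fmt;

/// Hypothetical upper bound on the encoded length, used only for this sketch.
const MAX_LEN: usize = 90;

/// Writes `chars` one at a time (the pattern described above).
fn write_unbuffered<W: fmt::Write>(w: &mut W, chars: impl Iterator<Item = char>) -> fmt::Result {
    for c in chars {
        w.write_char(c)?;
    }
    Ok(())
}

/// Collects ASCII chars into a stack buffer, then writes them with one call.
fn write_buffered<W: fmt::Write>(w: &mut W, chars: impl Iterator<Item = char>) -> fmt::Result {
    let mut buf = [0u8; MAX_LEN];
    let mut len = 0;
    for c in chars {
        // Fail rather than overflow the buffer or truncate a non-ASCII char.
        if len == MAX_LEN || !c.is_ascii() {
            return Err(fmt::Error);
        }
        buf[len] = c as u8; // lossless because `c` was just checked to be ASCII
        len += 1;
    }
    // The buffer holds only ASCII bytes, so it is valid UTF-8.
    let s = core::str::from_utf8(&buf[..len]).map_err(|_| fmt::Error)?;
    w.write_str(s)
}
```

Both functions produce the same output for ASCII input up to `MAX_LEN` characters; the buffered version makes a single `write_str` call on the underlying writer.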

The second patch is surprising, at least to me; I would love to learn what I'm doing wrong. I thought this would be valid:

let mut buf = [0_u8; Ck::CODE_LENGTH];

But the compiler does not like the usage of an associated const.
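A small sketch of the limitation, assuming a stand-in `Checksum` trait that only carries the associated const (the real trait in this crate has more items, and `BUF_LEN`, `scratch_len` and `concrete_len` are hypothetical names). On stable Rust, as far as I know, an array length inside a generic function cannot depend on the generic parameter, so the rejected line is left as a comment.

```rust
/// Stand-in for the crate's checksum trait; only the associated const matters here.
trait Checksum {
    const CODE_LENGTH: usize;
}

struct Bech32;

impl Checksum for Bech32 {
    const CODE_LENGTH: usize = 1023;
}

/// A fixed size chosen to be at least as large as any `CODE_LENGTH` we expect.
const BUF_LEN: usize = 1024;

fn scratch_len<Ck: Checksum>() -> usize {
    // Rejected by the compiler: the array length would depend on the generic parameter `Ck`.
    // let mut buf = [0_u8; Ck::CODE_LENGTH];

    // Accepted: the length is a plain constant, and the associated const
    // can still be used as an ordinary value.
    let buf = [0_u8; BUF_LEN];
    assert!(Ck::CODE_LENGTH <= buf.len());
    buf.len()
}

fn concrete_len() -> usize {
    // Also accepted: with a concrete type the constant can be evaluated right here.
    let buf = [0_u8; Bech32::CODE_LENGTH];
    buf.len()
}
```

The usual workaround is therefore a fixed-size buffer whose length does not mention the generic parameter.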

tcharding · 2023-10-17T01:13:53Z

Fuzzing for the win, uncovered #140

tcharding · 2023-10-17T03:52:51Z

Needs redoing on top of #142

apoelstra · 2023-11-01T17:44:36Z

I'm not thrilled with the casts from char to u8. There are a couple approaches we could take:

- Add a bytes iterator alongside the chars iterator that yields u8s directly.
- Copy the approach from hex-conservative#41 ("Remove arbitrary padding limit"), which calls as_bytes and len on a char to turn it into a byte slice that can be copied.

Alternately I can just ACK this as-is. What do you think?
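A rough sketch of the second approach, assuming it refers to `char::encode_utf8` followed by `as_bytes` and `len` on the resulting string slice. The helper name `push_char` is hypothetical and this is not the code from hex-conservative.

```rust
/// Appends `c` to `buf` at `pos` without any `as u8` cast and returns the new position.
/// Returns `None` if the buffer does not have enough room.
fn push_char(buf: &mut [u8], pos: usize, c: char) -> Option<usize> {
    let mut tmp = [0u8; 4]; // a `char` encodes to at most 4 UTF-8 bytes
    let bytes = c.encode_utf8(&mut tmp).as_bytes();
    let end = pos.checked_add(bytes.len())?;
    buf.get_mut(pos..end)?.copy_from_slice(bytes);
    Some(end)
}
```

For ASCII input each `char` contributes exactly one byte, so the result matches the cast, but the code stays correct without the reader having to re-check that the input is ASCII.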

apoelstra · 2023-11-01T17:44:58Z

ACK a957f7f other than that.

tcharding · 2023-11-01T20:35:51Z

> I'm not thrilled with the casts from char to u8.

Why? The HRP is checked to be valid ASCII and the fe32s are valid ASCII also, so it's safe to cast.

apoelstra · 2023-11-01T20:50:23Z

@tcharding because every time I read the code I need to mentally re-check those facts. I agree it's safe, but it's a code smell.

tcharding · 2023-11-01T21:14:56Z

Ah yes, I guess I did it in some places without code comments. I agree it's smelly; I'll see if I can come up with something better.

Add an iterator adapter that is identical to `encode::CharIter` but yields `u8` byte values instead. This makes encoding cleaner because users do not need to cast `char`s to `u8`.
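A minimal sketch of such an adapter, assuming it simply wraps a `char` iterator. The type name `AsciiBytes` is hypothetical and this is not the crate's `ByteIter` implementation.

```rust
/// Hypothetical adapter: wraps an iterator of ASCII `char`s and yields `u8`s.
struct AsciiBytes<I: Iterator<Item = char>> {
    inner: I,
}

impl<I: Iterator<Item = char>> Iterator for AsciiBytes<I> {
    type Item = u8;

    fn next(&mut self) -> Option<u8> {
        // The one conversion lives here, so callers never need to cast.
        self.inner.next().map(|c| {
            debug_assert!(c.is_ascii());
            c as u8
        })
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.inner.size_hint()
    }
}
```

With this shape, encoding code can iterate over bytes directly, and the ASCII invariant is documented and checked in a single place.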

Currently we call `fmt::Write::write_char` and `io::Write::write_all` in a loop while iterating over the encoded characters of a bech32 string. This is inefficient. Add a buffer and copy ASCII bytes into it while looping, then write the whole buffer out in a single call. The buffer has a size of 90 bytes because we know the bech32 string is guaranteed to be 90 chars or less.

As we did when writing in the `segwit` module, add a buffer to cache characters and do writes a buffer at a time. Although we do know the maximum length of the string we are encoding, and we enforce this limit using `encoded_length::<Ck>()`, we use an unrelated fixed buffer size and loop over it as many times as needed. This is because, surprisingly, the following is not valid Rust: `let mut buf = [0_u8; Ck::CODE_LENGTH];`. We use a buffer of length 1024, which is larger than the 1023 code length of the two `Checksum` implementations we provide with this crate (`Bech32` and `Bech32m`).
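The "fixed buffer, loop as many times as needed" pattern can be sketched as follows. The function name `write_chunked` and the constant `BUF_LEN` are hypothetical, and the input is assumed to be an iterator of already-encoded bytes.

```rust
use std::io;

/// Fixed buffer size, independent of any checksum's code length.
const BUF_LEN: usize = 1024;

/// Copies bytes into a fixed-size stack buffer and flushes it whenever it fills up.
fn write_chunked<W: io::Write>(w: &mut W, bytes: impl Iterator<Item = u8>) -> io::Result<()> {
    let mut buf = [0u8; BUF_LEN];
    let mut pos = 0;
    for b in bytes {
        buf[pos] = b;
        pos += 1;
        if pos == BUF_LEN {
            // Buffer is full: write it out and start refilling from the front.
            w.write_all(&buf)?;
            pos = 0;
        }
    }
    // Write whatever is left over (possibly nothing).
    w.write_all(&buf[..pos])
}
```

When the input fits within `BUF_LEN` bytes the loop never flushes early, so the writer sees exactly one `write_all` call.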

tcharding · 2023-11-05T23:28:39Z

Added primitives::encode::ByteIter and used that to remove the casts in crate::segwit.

apoelstra · 2023-11-06T13:30:43Z

Perfect, thanks! I'm a tiny bit disappointed that ByteIter goes through CharIter rather than directly converting fes to bytes and pulling bytes out of the Hrp, but I see that this would be a ton of complexity, and anyway it's just an optimization that's unlikely to be measurable.

apoelstra

ACK c215c3d

apoelstra · 2024-01-23T14:58:48Z

I wonder if we could stick a .buffered(buffer_size) adaptor onto CharIter (or whatever our "final" encoding iterator is), which wouldn't be an iterator anymore but would be fmt::Write and io::Write.

tcharding · 2024-01-31T04:57:16Z

Flagging this, is the comment above actionable before #168 merges?

apoelstra · 2024-01-31T13:24:17Z

@tcharding no. The more I think about this the less clear it is to me what we actually want to do here, buffering-wise. Let's just do 0.10.0 and then I think we'll be "done" our part (except for the error correction stuff which I continue to have on my todo list) and can ask Kix to come take a look.

tcharding mentioned this pull request Oct 17, 2023

bech32 string maximum length is not enforced #140

Closed

tcharding force-pushed the 10-17-buffered branch from c2316bd to 0d3216b Compare October 19, 2023 01:29

tcharding marked this pull request as draft October 19, 2023 01:29

tcharding mentioned this pull request Oct 22, 2023

Enforce maximum length #142

Merged

tcharding force-pushed the 10-17-buffered branch 3 times, most recently from da82e4e to a957f7f Compare October 30, 2023 20:11

tcharding marked this pull request as ready for review October 30, 2023 20:31

tcharding added 3 commits November 6, 2023 10:27

Add encode::ByteIter

596f1f7


tcharding force-pushed the 10-17-buffered branch from a957f7f to c215c3d Compare November 5, 2023 23:28

apoelstra approved these changes Nov 6, 2023


apoelstra merged commit 2b3f42f into rust-bitcoin:master Nov 6, 2023
12 checks passed

apoelstra deleted the 10-17-buffered branch November 6, 2023 13:38

apoelstra mentioned this pull request Jan 23, 2024

Buffererd encoding #171

Closed
