Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use hex from internals rather than hashes #1476

Merged
merged 1 commit into from Dec 31, 2022

Conversation

Kixunil
Copy link
Collaborator

@Kixunil Kixunil commented Dec 15, 2022

bitcoin-internals contains a more performant implementation of hex encoding than what bitcoin_hashes uses internally. This switches the implementations for formatting trait implementations as a step towards moving over completely.

The public macros are also changed to delegate to inner type which is technically a breaking change but we will break the API anyway and the consuers should only call the macro on the actual hash newtypes where the inner types already have the appropriate implementations.

Apart from removing reliance on internal hex from public API this reduces duplicated code generated and compiled. E.g. if you created 10 hash newtypes of SHA256 the formatting implementation would be instantiated 11 times despite being the same.

To do all this some other changes were required to the hex infrastructure. Mainly modifying put_bytes to accept iterator (so that iter().rev() can be used) and adding a new DisplayArray type. The iterator idea was invented by Tobin C. Harding, this commit just adds a bound check and generalizes over u8 and &u8 returning iterators.

While it may seem that DisplayByteSlice would suffice it'd create and initialize a large array even for small arrays wasting performance. Knowing the exact length DisplayArray fixes this.

Another part of refactoring is changing from returning impl Display to return impl LowerHex + UpperHex. This makes selecting casing less annoying since the consumer no longer needs to import Case without cluttering the API with convenience methods.

@Kixunil Kixunil added code quality Makes code easier to understand and less likely to lead to problems perf Performance optimizations/issues labels Dec 15, 2022
@Kixunil Kixunil added the Blocked Some other work is required to make this possible label Dec 15, 2022
@Kixunil
Copy link
Collaborator Author

Kixunil commented Dec 15, 2022

Blocked on #1477

Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that perhaps the Hash trait should require fmt::UpperHex a well?

This review is for the hashes::internal_macros and hashes::util changes only, I didn't review deeply the new stuff in internals::hex yet, will wait til there is a clean build so I can play with it locally to fully understand the improvements.

Comment on lines 15 to 17
#[macro_export]
/// Adds hexadecimal formatting implementation of a trait `$imp` to a given type `$ty`.
macro_rules! hex_fmt_impl(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is stale.

@@ -113,7 +121,7 @@ macro_rules! hash_newtype {
#[repr(transparent)]
pub struct $newtype($hash);

$crate::hex_fmt_impl!($newtype);
$crate::hex_fmt_impl!($reverse, $newtype);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have $reverse after $newtype so the args are ordered the same as for other macros.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generics have to be at the end and I think it'd be a bit weird to have $reverse between type name and generics. Agree this one is not great either. 🤷‍♂️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we had a separate macro lpmi_tmf_xeh which was the reversed version, rather than having a parameter for $reverse.

ducks

@@ -21,14 +68,32 @@
/// `internal_engine` is required to initialize the engine for given hash type.
macro_rules! hash_trait_impls {
($bits:expr, $reversed:expr $(, $gen:ident: $gent:ident)*) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated but at some stage it would be nice to use a single term, currently we have $reversed and $reverse - I think we should favour $reverse because it matches DISPLAY_BACKWARD (i.e., $reversed pairs with DISPLAYED_BACKWARD).

hashes/src/util.rs Outdated Show resolved Hide resolved
`bitcoin-internals` contains a more performant implementation of hex
encoding than what `bitcoin_hashes` uses internally. This switches the
implementations for formatting trait implementations as a step towards
moving over completely.

The public macros are also changed to delegate to inner type which is
technically a breaking change but we will break the API anyway and the
consuers should only call the macro on the actual hash newtypes where
the inner types already have the appropriate implementations.

Apart from removing reliance on internal hex from public API this
reduces duplicated code generated and compiled. E.g. if you created 10
hash newtypes of SHA256 the formatting implementation would be
instantiated 11 times despite being the same.

To do all this some other changes were required to the hex
infrastructure. Mainly modifying `put_bytes` to accept iterator (so that
`iter().rev()` can be used) and adding a new `DisplayArray` type. The
iterator idea was invented by Tobin C. Harding, this commit just adds a
bound check and generalizes over `u8` and `&u8` returning iterators.

While it may seem that `DisplayByteSlice` would suffice it'd create and
initialize a large array even for small arrays wasting performance.
Knowing the exact length `DisplayArray` fixes this.

Another part of refactoring is changing from returning `impl Display` to
return `impl LowerHex + UpperHex`. This makes selecting casing less
annoying since the consumer no longer needs to import `Case` without
cluttering the API with convenience methods.
Copy link
Member

@tcharding tcharding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 3e520f9

@tcharding
Copy link
Member

Legend, making hex progress now!

Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 3e520f9

@apoelstra apoelstra merged commit d06bb22 into rust-bitcoin:master Dec 31, 2022
@Kixunil Kixunil deleted the hex-refactor branch December 31, 2022 19:54
@tcharding tcharding mentioned this pull request Mar 21, 2023
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code quality Makes code easier to understand and less likely to lead to problems perf Performance optimizations/issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants