hashes: Epic re-write #2770
Overall 6c7e143 looks good. The duplication of the merkle root code I don't like (but I don't think it's necessary ... in fact I suspect we could pull it into the We should have some public type aliases for HMAC-SHA256 and HMAC-SHA512 since those are the overwhelmingly most common forms of HMAC. As for "hash newtypes should not be generic hash types" I am leaning strongly into implementing this by adding visibility specifiers to the |
Leave it with me, I'm going to go for the least tricky solution that does exactly what we want.
Will look into this. |
lol, double lol - look at the diff required to get this functionality, 5 changed lines.
diff --git a/bitcoin/src/blockdata/block.rs b/bitcoin/src/blockdata/block.rs
index 1a660fb8..98074e10 100644
--- a/bitcoin/src/blockdata/block.rs
+++ b/bitcoin/src/blockdata/block.rs
@@ -26,7 +26,7 @@ hashes::hash_newtype! {
/// A bitcoin block hash.
pub struct BlockHash(sha256d);
/// A hash of the Merkle tree branch or root for transactions.
- pub struct TxMerkleNode(sha256d);
+ pub struct TxMerkleNode(pub(crate) sha256d);
/// A hash corresponding to the Merkle tree root for witness data.
pub struct WitnessMerkleNode(sha256d);
/// A hash corresponding to the witness structure commitment in the coinbase transaction.
diff --git a/bitcoin/src/blockdata/transaction.rs b/bitcoin/src/blockdata/transaction.rs
index 4916e43a..2834be04 100644
--- a/bitcoin/src/blockdata/transaction.rs
+++ b/bitcoin/src/blockdata/transaction.rs
@@ -44,10 +44,10 @@ hashes::hash_newtype! {
/// versions of the Bitcoin Core software itself, this and other [`sha256d::Hash`] types, are
/// serialized in reverse byte order when converted to a hex string via [`std::fmt::Display`]
/// trait operations. See [`hashes::Hash::DISPLAY_BACKWARD`] for more details.
- pub struct Txid(sha256d);
+ pub struct Txid(pub(crate) sha256d);
/// A bitcoin witness transaction ID.
- pub struct Wtxid(sha256d);
+ pub struct Wtxid(pub(crate) sha256d);
}
impl_hashencode!(Txid);
impl_hashencode!(Wtxid);
diff --git a/hashes/src/macros.rs b/hashes/src/macros.rs
index 45b53ae1..6a32c935 100644
--- a/hashes/src/macros.rs
+++ b/hashes/src/macros.rs
@@ -128,11 +128,11 @@ macro_rules! hash_newtype {
/// Constructs a new engine.
#[inline]
- pub fn engine() -> $hash::Engine { $hash::Hash::engine() }
+ $field_vis fn engine() -> $hash::Engine { $hash::Hash::engine() }
/// Produces a hash froam the current state of a given engine.
#[inline]
- pub fn from_engine(e: $hash::Engine) -> Self { Self($hash::Hash::from_engine(e)) }
+ $field_vis fn from_engine(e: $hash::Engine) -> Self { Self($hash::Hash::from_engine(e)) }
/// Copies a byte slice into a hash object.
#[inline] |
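For readers who have not used a `vis` fragment in `macro_rules!` before, here is a minimal, self-contained sketch of the idea in the diff above: the visibility written on the newtype's field is captured and reused for the engine methods. Everything in it (`toyhash`, `toy_hash_newtype!`, `types`, `block_hash_of`) is a hypothetical stand-in, not the real `hashes` macro, and the "hash" is a non-cryptographic XOR fold.

```rust
/// Toy stand-in for a general purpose hash module such as `sha256d`.
/// Hypothetical and NOT cryptographic: it only XOR-folds the input.
pub mod toyhash {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Hash(pub [u8; 4]);

    #[derive(Default)]
    pub struct Engine(Vec<u8>);

    impl Engine {
        /// Buffers more data to be hashed.
        pub fn input(&mut self, data: &[u8]) { self.0.extend_from_slice(data); }
    }

    impl Hash {
        pub fn engine() -> Engine { Engine::default() }
        pub fn from_engine(e: Engine) -> Self {
            let mut out = [0u8; 4];
            for (i, b) in e.0.iter().enumerate() {
                out[i % 4] ^= *b;
            }
            Hash(out)
        }
    }
}

/// Hypothetical newtype macro: `$field_vis` may be empty, `pub(crate)`, `pub`, etc.
macro_rules! toy_hash_newtype {
    ($(#[$attr:meta])* pub struct $name:ident($field_vis:vis $hash:ident);) => {
        $(#[$attr])*
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub struct $name($hash::Hash);

        impl $name {
            /// Constructs a new engine.
            $field_vis fn engine() -> $hash::Engine { $hash::Hash::engine() }

            /// Produces a hash from the current state of a given engine.
            $field_vis fn from_engine(e: $hash::Engine) -> Self {
                Self($hash::Hash::from_engine(e))
            }
        }
    };
}

pub mod types {
    use super::toyhash;

    toy_hash_newtype! {
        /// Engine methods are usable anywhere in the crate.
        pub struct Txid(pub(crate) toyhash);
    }

    toy_hash_newtype! {
        /// Engine methods are private to this module.
        pub struct BlockHash(toyhash);
    }

    /// The private engine methods are still usable inside the defining module.
    pub fn block_hash_of(data: &[u8]) -> BlockHash {
        let mut engine = BlockHash::engine();
        engine.input(data);
        BlockHash::from_engine(engine)
    }
}
```

With this shape, code elsewhere in the crate can call `types::Txid::engine()`, while `types::BlockHash::engine()` only resolves inside `types`.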
I've used public functions in the
/// Returns an engine for computing `HMAC-SHA256`.
///
/// Equivalent to `hmac::Engine::<sha256::Engine>::new(key)`.
pub fn new_engine_sha256(key: &[u8]) -> Engine<sha256::Engine> {
Engine::<sha256::Engine>::new(key)
}
/// Returns an engine for computing `HMAC-SHA512`.
///
/// Equivalent to `hmac::Engine::<sha512::Engine>::new(key)`.
pub fn new_engine_sha512(key: &[u8]) -> Engine<sha512::Engine> {
Engine::<sha512::Engine>::new(key)
} |
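To make the "generic engine plus convenience constructor plus type alias" shape concrete, here is a rough sketch of an HMAC engine that is generic over its inner hash engine, following the RFC 2104 construction. The names `HashEngine`, `ToyEngine64`, `HmacEngine`, `HmacToy64` and `new_engine_toy64` are all assumptions made up for this example, and the inner hash is 64-bit FNV-1a with an arbitrary 16-byte block size, which is NOT suitable for real HMAC use.

```rust
/// Minimal interface the HMAC wrapper needs from an inner engine (hypothetical trait).
pub trait HashEngine: Default {
    /// Block size in bytes used for key padding.
    const BLOCK_SIZE: usize;
    type Output: AsRef<[u8]>;
    fn input(&mut self, data: &[u8]);
    fn finalize(self) -> Self::Output;
}

/// Toy engine: 64-bit FNV-1a. NOT cryptographic, illustration only.
pub struct ToyEngine64(u64);

impl Default for ToyEngine64 {
    fn default() -> Self { ToyEngine64(0xcbf2_9ce4_8422_2325) }
}

impl HashEngine for ToyEngine64 {
    const BLOCK_SIZE: usize = 16; // arbitrary choice for the sketch
    type Output = [u8; 8];

    fn input(&mut self, data: &[u8]) {
        for b in data {
            self.0 ^= u64::from(*b);
            self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    fn finalize(self) -> [u8; 8] { self.0.to_be_bytes() }
}

/// HMAC engine tied to a specific inner engine type.
pub struct HmacEngine<E: HashEngine> {
    inner: E,
    outer: E,
}

impl<E: HashEngine> HmacEngine<E> {
    pub fn new(key: &[u8]) -> Self {
        // Pad the key to one block; keys longer than a block are hashed first.
        let mut block = vec![0u8; E::BLOCK_SIZE];
        if key.len() > E::BLOCK_SIZE {
            let mut e = E::default();
            e.input(key);
            let digest = e.finalize();
            let d = digest.as_ref();
            block[..d.len()].copy_from_slice(d);
        } else {
            block[..key.len()].copy_from_slice(key);
        }
        let ipad: Vec<u8> = block.iter().map(|b| *b ^ 0x36).collect();
        let opad: Vec<u8> = block.iter().map(|b| *b ^ 0x5c).collect();

        let mut inner = E::default();
        inner.input(&ipad);
        let mut outer = E::default();
        outer.input(&opad);
        HmacEngine { inner, outer }
    }

    /// Feeds message data; may be called any number of times.
    pub fn input(&mut self, data: &[u8]) { self.inner.input(data); }

    /// Computes H((K ^ opad) || H((K ^ ipad) || message)).
    pub fn finalize(self) -> E::Output {
        let HmacEngine { inner, mut outer } = self;
        outer.input(inner.finalize().as_ref());
        outer.finalize()
    }
}

/// Alias for the common instantiation, analogous to an HMAC-SHA256 alias.
pub type HmacToy64 = HmacEngine<ToyEngine64>;

/// Convenience constructor, equivalent to `HmacEngine::<ToyEngine64>::new(key)`.
pub fn new_engine_toy64(key: &[u8]) -> HmacToy64 { HmacEngine::new(key) }
```

Because the inner engine is a type parameter, an `HmacEngine<ToyEngine64>` cannot be confused with an HMAC engine over some other hash at the type level.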
I agree it's ugly, but before we fix it we should ask how much change we are willing to accept to de-duplicate something that is two exact copies and likely never changes? AFAICT there is no way of doing this generically without an invasive change, hence the question. Said another way, there is no way of going from a generic type (we only have The quick obvious thing is to use a macro but I don't know if it's possible or the implications of calling a macro recursively. |
f8c33e0 to 5f57240
This isn't going to pass CI because it has a patched manifest to use rust-bitcoin/hex-conservative#90. In case it gets past you, this PR is hot as f**k. Check out the Massive props @apoelstra for sticking to your guns and refusing to accept the split out of |
I'll play with it. Fine for now if we have to duplicate the code. But I think/hope we can somehow be generic over the engine.
I'd rather add a dummy trait before I added a macro. (Or a new |
Well that was pretty easy with a fresh set of eyes, I just threw this into the
/// Trait used for types that can be hashed into a merkle tree.
pub trait Hash {
/// Constructs a new [`sha256d`] hash engine.
fn engine() -> sha256d::Engine { sha256d::Engine::new() }
/// Produces a hash from the current state of a given engine.
fn from_engine(e: sha256d::Engine) -> Self;
}
impl Hash for Txid {
fn from_engine(e: sha256d::Engine) -> Self { Self::from_engine(e) }
}
impl Hash for Wtxid {
fn from_engine(e: sha256d::Engine) -> Self { Self::from_engine(e) }
}
And changed the trait bounds on all the functions to Voilà, the original |
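A sketch of how a merkle root function could be bounded on such a trait, so that one implementation serves both ID types. This is not the code from the PR: `MerkleHash`, the `toy` module and the 8-byte `Txid`/`Wtxid` newtypes are assumptions for illustration, the engine is 64-bit FNV-1a (not `sha256d`, not cryptographic), and returning the leaf type is a simplification.

```rust
/// Toy stand-in for `sha256d` (hypothetical): 64-bit FNV-1a, NOT cryptographic.
pub mod toy {
    pub struct Engine(u64);

    impl Engine {
        pub fn new() -> Self { Engine(0xcbf2_9ce4_8422_2325) }

        pub fn input(&mut self, data: &[u8]) {
            for b in data {
                self.0 ^= u64::from(*b);
                self.0 = self.0.wrapping_mul(0x0000_0100_0000_01b3);
            }
        }

        pub fn finalize(self) -> [u8; 8] { self.0.to_be_bytes() }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Txid(pub [u8; 8]);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Wtxid(pub [u8; 8]);

/// Trait for types that can be hashed into a merkle tree (hypothetical name).
pub trait MerkleHash: Copy {
    /// Constructs a new engine; the default is shared by all implementors.
    fn engine() -> toy::Engine { toy::Engine::new() }
    /// Produces a hash from the current state of a given engine.
    fn from_engine(e: toy::Engine) -> Self;
    /// Returns the raw bytes to feed into the parent node's engine.
    fn to_byte_array(self) -> [u8; 8];
}

impl MerkleHash for Txid {
    fn from_engine(e: toy::Engine) -> Self { Txid(e.finalize()) }
    fn to_byte_array(self) -> [u8; 8] { self.0 }
}

impl MerkleHash for Wtxid {
    fn from_engine(e: toy::Engine) -> Self { Wtxid(e.finalize()) }
    fn to_byte_array(self) -> [u8; 8] { self.0 }
}

/// Computes a merkle root over `leaves`, or `None` if there are no leaves.
pub fn merkle_root<T: MerkleHash>(leaves: &[T]) -> Option<T> {
    if leaves.is_empty() {
        return None;
    }
    let mut level: Vec<T> = leaves.to_vec();
    while level.len() > 1 {
        level = level
            .chunks(2)
            .map(|pair| {
                let left = pair[0];
                // Bitcoin-style: an unpaired last node is hashed with itself.
                let right = pair.get(1).copied().unwrap_or(left);
                let mut engine = T::engine();
                engine.input(&left.to_byte_array());
                engine.input(&right.to_byte_array());
                T::from_engine(engine)
            })
            .collect();
    }
    Some(level[0])
}
```

The same `merkle_root` body then works for `&[Txid]` and `&[Wtxid]` without a second copy of the loop.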
0a10c43
to
4622bf8
Compare
I'm happy with this now, I'll just leave it sitting here until the I had an idea - although the PR title is |
d187b12
to
442239d
Compare
I don't understand what you mean by this. Associated types are associated with the type. |
Let's just leave the const thing aside, I can't explain it anyways but I can guarantee that Rust does not implement associated consts to behave how one would think they should behave (at least in my opinion). Stepping back, I believe the problem is that HMAC (and HKDF) have different requirements of the various hash types than either a normal user or an advanced user [0]. We removed the I believe what we really want is no traits whatsoever for normal users and a single trait for HMAC that defines "a hash that can be used in HMAC-X (and HKDF)" - even though this is all the hash types. Does that make sense or am I mad? [0] "advanced" meaning create engine, hash data, hash more data, finalize engine. |
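To illustrate the "normal" versus "advanced" split with inherent methods only (no trait import needed), here is a small sketch. `Engine`, `Hash` and `hash_chunks` are hypothetical names, and the digest is 64-bit FNV-1a, chosen only because it is short; it is not cryptographic and not what the crate uses.

```rust
/// Toy streaming engine: 64-bit FNV-1a. Illustration only, NOT cryptographic.
pub struct Engine {
    state: u64,
}

impl Engine {
    /// Step 1 of the advanced flow: create an engine.
    pub fn new() -> Self { Engine { state: 0xcbf2_9ce4_8422_2325 } }

    /// Steps 2..n: feed data, in as many chunks as the caller likes.
    pub fn input(&mut self, data: &[u8]) {
        for b in data {
            self.state ^= u64::from(*b);
            self.state = self.state.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    /// Final step: consume the engine and produce the hash.
    pub fn finalize(self) -> Hash { Hash(self.state.to_be_bytes()) }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Hash(pub [u8; 8]);

impl Hash {
    /// "Normal user" path: hash one buffer in a single call.
    pub fn hash(data: &[u8]) -> Self {
        let mut engine = Engine::new();
        engine.input(data);
        engine.finalize()
    }
}

/// "Advanced user" path: hash data that arrives in pieces.
pub fn hash_chunks<'a, I>(chunks: I) -> Hash
where
    I: IntoIterator<Item = &'a [u8]>,
{
    let mut engine = Engine::new();
    for chunk in chunks {
        engine.input(chunk);
    }
    engine.finalize()
}
```

Since the engine processes input byte by byte, `hash_chunks` over the pieces of a message gives the same result as `Hash::hash` over the whole message.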
Problem, if a hash is to be used in a merkle tree (eg |
Does the HMAC trait differ from the "can be used in a Merkle structure" trait or the "can be used to tweak an EC key" trait?
This is ok I think -- |
Same trait, we need a name for it, on master it is
|
So |
Let's go with
It shouldn't implement |
Ok, I am going to risk my neck here, I believe you have in your mind associated consts working how a sane person would think they work. They don't, I'll try dig up the stuff I read on it to convince you (and better understand it myself). The |
Why not?
What is wrong with that? |
I pasted my proposed |
Ok, it looks like trigger traits don't actually work the way I want them to:
So I guess we do need macros to define inherent methods. That is super annoying. |
Where does that leave us?
|
Yeah :/.
Yes, but we shouldn't do either. Here is what I think we should do:
Then take a breather and we'll revisit I got about halfway through step 1 of this and it was quite cathartic to see how different parts of the codebase became explicit about whether they wanted to hash arbitrary data or whether they just wanted to deal with prepackaged hashes. |
The resulting codebase will let people keep using HKDF and HMAC and merkle trees with the "base hashes" but they won't be able to use these constructions with |
Aight, sounds like a plan. I need to take some time away from |
Chunked feeding of data to an engine is pretty common though. Perhaps at least those methods could be exposed.
IIRC we have non-cryptographic hashes in the crate and using HMAC with them is at least suspicious. If you think it's not worth a marker trait I can understand that.
IIRC I did see a case when there was a good reason to have it but I think it was old taproot stuff that currently uses sane strong types. IMO the casts are reasonably explicit.
That seems excessive to me, but not strictly wrong. You can still accidentally feed some data into wrong engine but at least you can't assign an engine to an incompatible one. I don't think people are assigning engines around too much. They just use them to compute the hashes. Though I could see a benefit of having
I believe inherent
Just add a bound?
I'd like to see that but
Not really. Hashing two txids into a merkle node doesn't produce a
Yes, and they can still do exotic things explicitly by going through byte arrays. That's what I wanted. |
Yes, pretty common but it still seems uncommon enough that it's okay for users to need to import a trait to do it. But agreed, just like everything we should have inherents for this.
Yeah :/ agreed it may be strictly better to have a marker trait, but I continue to think it's not worth it. I think siphash is the only non-cryptographic hash in this crate (other than sha1 which needs to be used in cryptographic contexts for legacy/compatibility reasons, even though I would definitely not recommend its use as a crypto hash).
Yeah, exactly. I'd like to have a
Yeah :( earlier in this thread I thought there was a way with "trigger traits" that we could do this without macros. It turns out that you can use these to implement inherent methods on
Yeah, great points. Need to think about what our Merkle tree computation function should look like then.. unfortunately we need to resolve this as step 1 of this project (well, maybe step 2, after just splitting |
I think we should open a new PR since github has started hiding comments on this one. |
I also think that "new PR" should be one that just splits up |
I think we can provide a generic merkle tree over non-wrapped hashes (maybe in |
I think you're right. That's the approach I'd try first. |
Thanks for the input fellas, just wanted to say that I read everything (multiple times) but the PRs that fell out today may not seem like I did. There are six hundred things going on, not exactly sure how to pull everything together. (I did read and grok your list in #2770 (comment) @apoelstra, but I went a different way - no disrespect intended). |
There has been so much discussion on the I believe the observation from @Kixunil that a past state did not correctly tie the Hmac to the hash engine was correct (we used to only have the (Note that this PR should be re-done on top of whatever is decided for the merkle tree stuff.) |
There are a few niggling problems with the current `hashes` API that we would like to improve:

- It's not that discoverable
- It is annoying to have to import the `Hash` trait all the time.
- New hash types (created with `hash_newtype`) are general purpose but we'd like to hide this from users of `rust-bitcoin`.
- The API of the `hmac` module is different to the other hash types.
- The `hash_newtype` and `sha256t_hash_newtype` macros are a bit hard to read and work on

This re-write of the `hashes` API is an attempt to address all these issues. Note that the public API surface of `hashes` does not change all that much.
It needs to be generic over the hash type. If it is only generic over If it's now generic over both |
If you can get rid of it then I'll be happy to see how that was achieved, as far as I can tell you are going to hit all the associated const problems I've been hitting - it's a language limitation. But by all means please do try to solve it if you think I've missed something.
So we've been having API design discussions, with you giving me directives of things to explore, and ways you want it done, and you never looked at the latest state. That is pretty frustrating man. Even when I repeatedly said that we are hitting language limitations and that associated consts don't "just work" as one would expect. There are so many problems with |
Re-write the `hashes` API for fun and profit.

Please note, this removes the ability to display tagged hashes backward.
Resolves the following issues:
Close: #1481
Close: #1552
Close: #1689
Close: #2197
Close: #2377
Close: #2665
Close: #2811
TODO
QUESTIONS
in hashes/examples/*.rs