Add support for hashing from a reader #159
base: feat/no-digest
Remove the per-hasher digest type. Instead, store hash digests inside the hashers and "borrow" them. In all cases, we're going to copy the digest into a `Multihash<S>` anyway.
This:
1. Removes a bunch of code.
2. Means that hashers don't need to be generic over the size (unless they actually support multiple sizes). This fixes the UX issue introduced in the const generics PR.
3. Avoids some copying.
BREAKING CHANGE
1. `Hasher.digest` no longer exists. Users should use `Code::SomeCode.digest` where possible.
2. The per-hasher digest types no longer exist.
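To make the "borrowed digest" idea concrete, here is a minimal sketch. It is not the crate's actual code: the trait shape and the toy `XorHasher` are assumptions made purely for illustration. The hasher owns its output buffer and `finalize` hands out a slice into it, so no per-hasher digest type is needed.

```rust
/// Hypothetical hasher trait (an assumption, not the crate's real trait):
/// the digest is stored inside the hasher and only borrowed by the caller.
pub trait Hasher {
    fn update(&mut self, input: &[u8]);
    /// Returns a borrowed view of the digest held by the hasher.
    fn finalize(&mut self) -> &[u8];
    fn reset(&mut self);
}

/// Toy one-byte XOR "hash", for illustration only (not cryptographic).
#[derive(Default)]
pub struct XorHasher {
    state: [u8; 1],
}

impl Hasher for XorHasher {
    fn update(&mut self, input: &[u8]) {
        for b in input {
            self.state[0] ^= *b;
        }
    }

    fn finalize(&mut self) -> &[u8] {
        &self.state
    }

    fn reset(&mut self) {
        self.state = [0];
    }
}

/// Copies a borrowed digest into a fixed-size buffer, standing in for the
/// copy into `Multihash<S>`. Returns `None` if the digest does not fit.
pub fn copy_digest<const S: usize>(digest: &[u8]) -> Option<[u8; S]> {
    if digest.len() > S {
        return None;
    }
    let mut buf = [0u8; S];
    buf[..digest.len()].copy_from_slice(digest);
    Some(buf)
}
```

Because the hasher is not generic over the output size here, only the destination buffer carries the const parameter `S`.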
df6b56f
to
3954ba5
Compare
quote!(Self::#ident => {
    let mut hasher = #hasher::default();
    io::copy(reader, &mut hasher)?;
    Ok(Multihash::wrap(#code, hasher.finalize()).unwrap())
Can you please add a comment explaining why it is OK to use `unwrap()` here (or use an `expect()` that makes it clear)?
fn digest(&self, input: &[u8]) -> Multihash<S>;
fn digest(&self, input: &[u8]) -> Multihash<S> {
    let mut input = input;
    self.digest_reader(&mut input).unwrap()
Same here, please add a comment or an `expect()` to make clear why it is OK to use `unwrap()` here.
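As a rough sketch of what the reviewer is asking for, the example below hashes from any `std::io::Read` by implementing `std::io::Write` for a toy hasher, then builds the slice-based `digest` on top of it with an `expect()` that states why it cannot fail. `SumHasher`, `digest_reader`, and `digest` are hypothetical stand-ins, not the crate's API, and the "hash" is just a wrapping byte sum.

```rust
use std::io::{self, Read, Write};

/// Toy hasher (illustration only): a wrapping sum of all bytes.
#[derive(Default)]
pub struct SumHasher {
    state: [u8; 1],
}

impl SumHasher {
    pub fn finalize(&mut self) -> &[u8] {
        &self.state
    }
}

impl Write for SumHasher {
    // Feeding bytes into the hasher never fails, so this always returns Ok.
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        for b in buf {
            self.state[0] = self.state[0].wrapping_add(*b);
        }
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Streams the reader into the hasher; only the reader can produce an error.
pub fn digest_reader<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut hasher = SumHasher::default();
    io::copy(reader, &mut hasher)?;
    Ok(hasher.finalize().to_vec())
}

/// Hashes an in-memory slice by reusing `digest_reader`.
pub fn digest(input: &[u8]) -> Vec<u8> {
    let mut input = input;
    digest_reader(&mut input)
        .expect("reading from a byte slice and writing to the hasher are both infallible")
}
```

The `expect()` message documents the invariant: `&[u8]` implements `Read` without ever returning an error, and the hasher's `write` always succeeds.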
src/multihash.rs
Outdated
@@ -49,7 +69,9 @@ pub trait MultihashDigest<const S: usize>:
/// let hash = Code::Sha3_256.wrap(&hasher.finalize());
/// println!("{:02x?}", hash);
/// ```
fn wrap(&self, digest: &[u8]) -> Multihash<S>;
fn wrap(&self, digest: &[u8]) -> Multihash<S> {
    Multihash::wrap((*self).into(), digest).unwrap()
Same here, please add a comment or an `expect()` to make clear why it is OK to use `unwrap()` here.
Regarding the io imports, this seems to work:
+++ b/derive/src/multihash.rs
@@ -238,7 +238,8 @@ pub fn multihash(s: Structure) -> TokenStream {
let code_into_u64 = hashes.iter().map(|h| h.code_into_u64(&params));
let code_from_u64 = hashes.iter().map(|h| h.code_from_u64());
let code_digest = hashes.iter().map(|h| h.code_digest());
- let code_reader = hashes.iter().map(|h| h.code_reader());
+ let code_reader_core = hashes.iter().map(|h| h.code_reader());
+ let code_reader_std = hashes.iter().map(|h| h.code_reader());
quote! {
/// A Multihash with the same allocated size as the Multihashes produces by this derive.
@@ -253,11 +254,22 @@ pub fn multihash(s: Structure) -> TokenStream {
}
}
- fn digest_reader<R: #io_path::Read>(&self, reader: &mut R) -> #io_path::Result<Multihash> {
- use #io_path;
+ #[cfg(feature = "std")]
+ fn digest_reader<R: std::io::Read>(&self, reader: &mut R) -> std::io::Result<Multihash> {
+ use std::io;
use #mh_crate::Hasher;
match self {
- #(#code_reader,)*
+ #(#code_reader_core,)*
+ _ => unreachable!(),
+ }
+ }
+
+ #[cfg(not(feature = "std"))]
+ fn digest_reader<R: core2::io::Read>(&self, reader: &mut R) -> core2::io::Result<Multihash> {
+ use core2::io;
+ use #mh_crate::Hasher;
+ match self {
+ #(#code_reader_std,)*
_ => unreachable!(),
}
}
@vmx so, that brings up an interesting question. The first two are "safe" (except for the identity hash) because we know the digests will fit. But the last one isn't as a digest could have an arbitrary size. This isn't a new issue, but it would be nice to fix it while we're changing these APIs. Options:
Thoughts?
The multihash digest may not fit.
3954ba5
to
f002173
Compare
For now I'm returning a result.
I'm not sure about this. In the default table we would know. But if someone defines their own table (and people really should), then they could define an allocation size that is smaller than what the hasher returns. This behaviour might be fine, but we need documentation for the panic cases.
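The size mismatch discussed here can be illustrated with a small sketch. `FixedHash` and `InvalidSize` are hypothetical names standing in for a fixed-capacity multihash and its error type; the point is only that `wrap` can report a digest that exceeds the allocated size `S` instead of panicking.

```rust
/// Hypothetical error for a digest that exceeds the allocated size.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidSize {
    pub got: usize,
    pub max: usize,
}

/// Hypothetical fixed-capacity hash container with `S` bytes of storage.
#[derive(Debug, PartialEq, Eq)]
pub struct FixedHash<const S: usize> {
    code: u64,
    size: usize,
    digest: [u8; S],
}

impl<const S: usize> FixedHash<S> {
    /// Copies `input` into the container, or returns an error if it is too long.
    pub fn wrap(code: u64, input: &[u8]) -> Result<Self, InvalidSize> {
        if input.len() > S {
            return Err(InvalidSize { got: input.len(), max: S });
        }
        let mut digest = [0u8; S];
        digest[..input.len()].copy_from_slice(input);
        Ok(Self { code, size: input.len(), digest })
    }

    pub fn code(&self) -> u64 {
        self.code
    }

    /// Returns only the bytes that were actually written.
    pub fn digest(&self) -> &[u8] {
        &self.digest[..self.size]
    }
}
```

A custom code table that allocates, say, 2 bytes for a 3-byte digest would hit the `Err` branch at run time; calling `unwrap()` on that result is the panic case that would need documenting.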
You know what, I think I have a good solution. I'm going to codegen tests.
So, actually, what I can do is:
@Stebalien what exactly do you mean with "statically assert"? At compile time or at run-time via
I mean at compile time. I can't remember exactly what I was planning, but I think it was something like: when deriving But I can't remember exactly how I intended to do it. Something like:
But I'm not sure if there's a way to get a better error.
@mriise On one hand I don't want to drag you into yet another const generics related discussion, on the other hand I'm pretty sure you would have a good idea on how to nicely check at compile time whether one size is smaller than another :)
Here is a playground link to the best solution right now IMO: https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=de22846e5857a2b607e884e0589e9aae The error it puts out is a bit confusing at first glance but it's not too terrible, and it might be even better with different ident names. As for shrinking arrays, since it can fail at runtime assuming we are shrinking it to the
Use config flags instead of introducing the new attribute `io_path`.
With some compile-time asserts we can make sure that certain `unwrap()` calls won't panic.
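A minimal sketch of such a compile-time assert follows. It is not the code from the linked playground or from this PR: `SizedHasher`, `Sha256Like`, and `ALLOC_SIZE` are hypothetical names, and it relies on `assert!` being usable in const contexts, which requires Rust 1.57 or newer.

```rust
/// Hypothetical trait exposing a hasher's digest size as a constant.
pub trait SizedHasher {
    const SIZE: usize;
}

/// Stand-in for a hasher that produces a 32-byte digest.
pub struct Sha256Like;

impl SizedHasher for Sha256Like {
    const SIZE: usize = 32;
}

/// Allocated digest size of the (hypothetical) code table.
pub const ALLOC_SIZE: usize = 64;

/// Const helper so the same check can be reused for several hashers.
pub const fn fits(digest_size: usize, alloc_size: usize) -> bool {
    digest_size <= alloc_size
}

// Evaluated at compile time: if the digest did not fit, the build would fail
// with this message instead of an `unwrap()` panicking at run time.
const _: () = assert!(
    fits(<Sha256Like as SizedHasher>::SIZE, ALLOC_SIZE),
    "hasher digest does not fit into the allocated multihash size"
);
```

Lowering `ALLOC_SIZE` below 32 in this sketch turns the mismatch into a compile error, which is what makes the corresponding `unwrap()` safe to keep.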
I've pushed two new commits (I intentionally didn't rebase this PR for easier review. Once reviewed, I'll rebase and squash it into a single commit). The first one is about the The second one adds those compile-time asserts. Thanks @mriise, your playground link was really useful.
@mriise I've added you a few times as a reviewer, when I'd appreciate your input. Though please don't feel obliged to do a review.
derive/src/multihash.rs
Outdated
@@ -253,11 +245,22 @@ pub fn multihash(s: Structure) -> TokenStream {
}
}

fn digest_reader<R: #io_path::Read>(&self, reader: &mut R) -> #io_path::Result<Multihash> {
use #io_path;
#[cfg(feature = "std")]
This gets expanded in the importing crate, not in the multihash-derive crate. That means if the importing crate doesn't have an `std` feature, we'll use `core2`.
I changed it so that it is expanded at the multihash-derive level. Would that work?
While testing the std/core2 expansion, I found out that Perhaps we should also check examples being compiled without the
Hm. Yes, it is. Especially because the unstable API is being replaced. Maybe upstream would accept a stable version? There's no good reason it can't be supported on stable.
I've opened technocreatives/core2#17, let's see what happens.
Solving this properly will probably take a while. What do you folks think about releasing current master and leaving this one for the release after (especially since this change is just an addition, without breaking APIs)?
Yeah, that makes sense.
Add support for hashing from a reader please. Thanks
fixes #141
The io import stuff is a bit annoying, but I can't find a better way to do this.