Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add non-copying version of murmur3_32 that reads directly from a byte buffer #19

Merged
merged 5 commits into from
Feb 16, 2023

Conversation

eamsden
Copy link
Contributor

@eamsden eamsden commented Feb 2, 2023

I'm using murmur3 on very large in-memory buffers. My initial approach was to create a wrapper type around my buffers that implements read(), but the overhead of the call to this function for each 4 bytes in very large buffers made the hashing unreasonably slow.

This PR adds an implementation which directly hashes a byte buffer, without the necessity of copying into a local 4-byte buffer. It immediately and significantly improved the performance of my application.

@stusmall
Copy link
Owner

This looks fantastic. I appreciate you putting this together. It should knock two items off my list, better performance and an easy path to no_std support. I've got two asks for it:

  1. The new method should get added to the unit tests, proptests and bench.rs. That's needed to merge.
  2. Maybe name it murmur3_32_of_slice? I'm not hung up on that name so I can easily be talked into leaving it as is or to something else.

Once those are taken care of and CI turns green I'll merge it.

I'd love to get this added to all the other hashing methods. I'm happy to merge this without the others but probably won't be releasing a new version of the crate until I can find some time to do that. If you can help with that I'd appreciate it.

@eamsden
Copy link
Contributor Author

eamsden commented Feb 14, 2023

@stusmall the name change is fine. And sure I can add it to tests and benchmarks.

I am seeing a clippy failure in CI but can't replicate it locally. Perhaps I'm doing something wrong? I passed the same flags to cargo clippy that CI passed to ~/.cargo/bin/clippy

@eamsden
Copy link
Contributor Author

eamsden commented Feb 14, 2023

I haven't looked at the other hashing methods (we only use the 32 bit version) but if they're as easy as this was I can knock out the _of_slice versions for them into a short time.

@stusmall
Copy link
Owner

@stusmall the name change is fine. And sure I can add it to tests and benchmarks.

I am seeing a clippy failure in CI but can't replicate it locally. Perhaps I'm doing something wrong? I passed the same flags to cargo clippy that CI passed to ~/.cargo/bin/clippy

What version are you running? Clippy regularly adds checks with new releases. I'd try updating to the newest clippy.

Side note: it seems like this lint failure existed before your change. I'm guessing it must have been adding very recently. CI always uses the newest stable toolchain and the last run didn't have this warning but it is appearing on master now

@eamsden
Copy link
Contributor Author

eamsden commented Feb 14, 2023

I'm using nix for my rust environment so I have clippy 1.60. I'll fiddle with it later today and get it all the way up to date.

@eamsden
Copy link
Contributor Author

eamsden commented Feb 14, 2023

Also notably the benchmarks only build with rust nightly because of the testing feature, so I'll need to get that toolchain going.

@eamsden
Copy link
Contributor Author

eamsden commented Feb 14, 2023

image

@eamsden
Copy link
Contributor Author

eamsden commented Feb 16, 2023

@stusmall latest commit adds unit/prop tests. I'd like to get this passing CI and merged, and I can open and self-assign an issue to make the other of_slice variants in a separate PR.

@stusmall stusmall merged commit 07e7a1a into stusmall:master Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants