Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Uses FXHash, rather than Rust's default hasher #27

Closed
wants to merge 1 commit into from

Conversation

sts10
Copy link
Contributor

@sts10 sts10 commented Apr 23, 2022

Uses FxHash rather than Rust's default hasher, in an attempt to improve performance.

My tests have been a bit inconclusive, so would appreciate it if others compared these changes to the main branch in whatever way they think best. FxHash might even be a touch slower in some cases? I'm not sure.

There are definitely other hashers that may be worth exploring, like ahash. Maybe alternatives to try can be discussed in the comments here.

@sirwart
Copy link
Owner

sirwart commented Apr 24, 2022

Hey Sam, I would be very surprised if the choice of hash map had an effect on overall performance. Most of the time should be spent looking for regex matches in files, since you're looking for a small number of needles in a very large haystack. If the code for determining whether a needle is actually a secret or not is running slow then I would guess that it's caused by the algorithm being slow (i.e something being accidentally exponential instead of linear). If that's the case, I think the algorithm would need to be fixed rather than the hash map implementation.

Given the built-in hash map has reasonably fast performance, I would prefer to stick with the built-in hash map and not add an additional dependency.

While benchmarks are useful for seeing whether an optimization worked or not, I think profiling is a good way to highlight which parts of the system are worth optimizing. However before even doing that, I would wait until there's a case where ripsecrets runs unreasonably slowly for some input (which I'm sure there are edge cases where it does) since that will likely be a case where optimization is helpful.

@sts10
Copy link
Contributor Author

sts10 commented Apr 24, 2022

Ah, thanks for the explanation! Think I was a bit overeager to make more contributions.

@sts10 sts10 closed this Apr 24, 2022
@sirwart
Copy link
Owner

sirwart commented Apr 24, 2022

No problem, your contributions have been greatly appreciated! I'm mostly waiting for more feedback for folks in terms of what improvements are needed, but there are still a few small improvements that I'm aware of. I filed them all in the issue tracker https://github.com/sirwart/ripsecrets/issues in case any of them peaked your interest.

@sts10 sts10 deleted the fxhash branch September 6, 2022 03:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants