provide compiled binaries for b3sum #78

Closed
oconnor663 opened this issue Apr 3, 2020 · 5 comments

Comments

@oconnor663
Member

No description provided.

@sergeevabc

At last!

@oconnor663
Member Author

Done! See the releases page. I've sanity checked these by running them on my own machines, and they seem to work, but I'd love to have other folks kick the tires too. Also I haven't done anything fancy with the binaries apart from building with --release (I haven't even stripped symbols), so if there are any Best Practices I'm missing please let me know.

@sergeevabc

@oconnor663, it kinda works, but slow as hell.

@oconnor663
Member Author

oconnor663 commented May 2, 2020

@sergeevabc I'd like to help you figure out the problem, but I can only do that if you give me the details of what you're measuring. (There's an unfortunate pattern in online tech discussions, where A asks a question without any details, and B asks for details, and then later A comes back with details that make the question answerable. That makes answering the question take days instead of minutes. If the "more details" cycle needs to repeat a few times, it could take weeks. The only way to break this cycle is to give way more detail than you think you need to in the very first step.)

One known performance problem is if you have a large file (too large to fit in memory) on a spinning disk. In that case, the default behavior of b3sum will be to read many different parts of the file at the same time, and that'll thrash the disk. If b3sum file performs poorly, but b3sum < file or b3sum file --num-threads=1 performs better, that could be the problem you're seeing. Unfortunately, I don't know of a portable way to automatically detect "file cached in memory or not".
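For readers following along, here is a rough sketch of the sequential access pattern that reading from a pipe implies: one reader walking the file front to back, holding only a single buffer in memory. This is not b3sum's code. The function name and chunk size are made up for illustration, and hashlib.blake2b is only a stand-in, because BLAKE3 is not in the Python standard library.

```python
import hashlib

# Illustrative buffer size (1 MiB); an assumption, not a value taken from b3sum.
CHUNK_SIZE = 1 << 20


def hash_file_sequentially(path, chunk_size=CHUNK_SIZE):
    """Hash a file front to back, holding at most one chunk in memory.

    Hypothetical helper for illustration only. blake2b stands in for
    BLAKE3, which the standard library does not provide.
    """
    hasher = hashlib.blake2b()
    with open(path, "rb") as f:
        while True:
            # Each read continues where the previous one stopped, so the
            # disk sees a single forward-moving stream of requests.
            chunk = f.read(chunk_size)
            if not chunk:
                break
            hasher.update(chunk)
    return hasher.hexdigest()
```

Because each read picks up where the last one ended, a spinning disk never has to seek between distant offsets, and memory use stays near chunk_size regardless of file size. The cost is that a single reader cannot spread the hashing work across cores.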

@sergeevabc

sergeevabc commented Mar 17, 2021

Almost a year has passed, b3sum has grown from 0.3.5 to 0.3.7, and I tried to calculate a hash again, this time using a 4.4 GiB rip of “City Lights” by Charlie Chaplin. I was especially eager to see how the SSE2 tweaks had improved performance. Alas, in less than 30 seconds Process Hacker showed 99% memory consumption (b3sum’s working set took it all), then disk thrashing began and I lost control for a few minutes. This is unacceptable, as no other hashing app (digest, gohash, rhash, to name a few), whether it uses a conservative or a contemporary algorithm, has caused such a disaster (or even a memory leak!). As a last-ditch attempt I tried your advice about < and --num-threads=1: only the former finished the job with fewer troubles, but that mode does not allow *.*

Err, and how do we proceed from here? I’m not ready to evangelize this implementation, each time adding “use that switch”.
On the other hand, RapidCRC, FileHash, Hashsum, and the HashCheck fork have no such drawback.
