Excessive futex syscalling with no-mmap #54
Comments
Possibly related #32 (should ask OP about CPU usage) |
Note that 8 KiB is the default buffer size of |
What size is your test file? And how does that compare to your available RAM? |
@oconnor663 My test file is about 100 GB and my RAM is 128 GB. Page cache or not, b3sum will still do thousands of syscalls per second, destroying performance. Perhaps put a simple https://doc.rust-lang.org/std/io/struct.BufReader.html in between the file and hasher? |
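For illustration, here is a minimal sketch of the `BufReader` suggestion above. It is not b3sum's code: `CountingReader`, `ByteCounter`, and `hash_with_buffer` are hypothetical names, and `ByteCounter` is only a stand-in for a real hasher. The counting wrapper approximates how many `read(2)` calls would reach a real file for a given buffer capacity.

```rust
use std::io::{self, BufReader, Read, Write};

/// Hypothetical wrapper that counts calls to `read` on the underlying
/// reader, as a proxy for read(2) syscalls on a real file.
struct CountingReader<R> {
    inner: R,
    calls: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.calls += 1;
        self.inner.read(buf)
    }
}

/// Hypothetical stand-in for a hasher: it only counts the bytes it is fed.
struct ByteCounter(u64);

impl Write for ByteCounter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0 += buf.len() as u64;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Feeds `input` to the stand-in hasher through a `BufReader` with the
/// given capacity. Returns (bytes consumed, reads on the underlying reader).
fn hash_with_buffer<R: Read>(input: R, capacity: usize) -> io::Result<(u64, usize)> {
    let counting = CountingReader { inner: input, calls: 0 };
    let mut reader = BufReader::with_capacity(capacity, counting);
    let mut hasher = ByteCounter(0);
    // Small caller-side chunks, so every read goes through the buffer.
    let mut chunk = vec![0u8; 1024];
    loop {
        let n = reader.read(&mut chunk)?;
        if n == 0 {
            break;
        }
        hasher.write_all(&chunk[..n])?;
    }
    Ok((hasher.0, reader.into_inner().calls))
}
```

Calling `hash_with_buffer` on the same input with an 8 KiB capacity and then a 1 MiB capacity should show far fewer underlying reads for the larger buffer. Whether that matters in practice depends on where the time actually goes, which is what the rest of this thread works out.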
Can you point me to the source? I was looking for an issue like this. imo this should be filed as a Rust issue, no stdlib should behave that badly |
@oconnor663 Sorry, seems like my diagnosis was wrong. 8KB reads are totally fine (no significant CPU load)
-- I think the big problem is the |
I'm not seeing any futex calls at all on my machine, using a file that's bigger than RAM (though not as big as yours):
I do see futex calls if I re-enable mmapping, though there are very few system calls overall:
Could you give me a complete series of commands that gives the result you saw? Does it only happen for gigantic files, or can you get a repro with something smaller? |
I think I found something: I don't get any
Summary of environments with confirmed futex spam:
Test commands fyi:
EDIT: Benchmarking ... Seems like there is a performance improvement, too
|
The default |
Hmm, normally with |
Hardcoding |
That must be it. I think the right answer here is to figure out a solution to #25 and then to make the |
This is the one thing I didn't consider :D No, neither of the machines tested support AVX2. |
No worries. This is a dumb bug, and I'm ashamed of it :-D |
This is a new interface that allows the caller to provide a multi-threading implementation. It's defined in terms of a new `Join` trait, for which we provide two implementations, `SerialJoin` and `RayonJoin`. This lets the caller control when multi-threading is used, rather than the previous all-or-nothing design of the "rayon" feature. Although existing callers should keep working, this is a compatibility break, because callers who were relying on automatic multi-threading before will now be single-threaded. Thus the next release of this crate will need to be version 0.2. See #25 and #54.
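To make the shape of that design concrete, here is a simplified sketch of a `Join`-style trait. It is an assumption-laden illustration, not the crate's actual definition: the trait signature is a guess at the general idea, `ThreadJoin` is a hypothetical stand-in for `RayonJoin` built on `std::thread::scope`, and `sum_tree` is a toy recursive workload in place of hashing.

```rust
use std::thread;

/// Simplified, hypothetical `Join`-style trait: run two closures and
/// return both results. The implementation decides whether they run
/// serially or in parallel.
trait Join {
    fn join<A, B, RA, RB>(a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send;
}

/// Runs both closures on the calling thread, one after the other.
enum SerialJoin {}

impl Join for SerialJoin {
    fn join<A, B, RA, RB>(a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        (a(), b())
    }
}

/// Hypothetical parallel implementation using std scoped threads
/// (a stand-in for a Rayon-based implementation).
enum ThreadJoin {}

impl Join for ThreadJoin {
    fn join<A, B, RA, RB>(a: A, b: B) -> (RA, RB)
    where
        A: FnOnce() -> RA + Send,
        B: FnOnce() -> RB + Send,
        RA: Send,
        RB: Send,
    {
        thread::scope(|s| {
            // Run `b` on a new thread while `a` runs on the current one.
            let handle_b = s.spawn(b);
            let ra = a();
            let rb = handle_b.join().expect("worker thread panicked");
            (ra, rb)
        })
    }
}

/// Toy divide-and-conquer workload, generic over the join strategy.
fn sum_tree<J: Join>(data: &[u64]) -> u64 {
    if data.len() <= 1024 {
        return data.iter().sum();
    }
    let (left, right) = data.split_at(data.len() / 2);
    let (l, r) = J::join(|| sum_tree::<J>(left), || sum_tree::<J>(right));
    l + r
}
```

The caller picks the strategy at the call site, for example `sum_tree::<SerialJoin>(&data)` versus `sum_tree::<ThreadJoin>(&data)`, which mirrors the point of the commit message: multi-threading becomes an explicit choice rather than an all-or-nothing feature flag.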
I just landed fc219f4 above. It adds a generic way to control multi-threading, and it also fixes
Tests are failing on the latest nightly compiler at the moment because of rust-lang/rust#68905. Hopefully that'll get fixed soon. In the meantime, you might need to stick to stable or beta. |
@oconnor663 No more |
Awesome. This change will be part of our 0.2 release, and there's some unrelated work to finish before we ship it, but hopefully sometime in the next few days. |
Problem:
`b3sum --no-mmap` is using a tiny read buffer size (8KB) and issues `read(2)` and `futex(2)` syscalls aggressively. I find that hashing at only 110 MB/s maxes out all 24 cores, doing about 10k syscalls per second (`tail -f strace.log | pv -l >/dev/null`).
top
strace.log
Environment:
`cargo install b3sum` (as of 2020-02-03)
cargo 1.41.0 (626f0f40e 2019-12-03)
5.4.0-9-generic