New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement zstd:threads=0
#2083
Implement zstd:threads=0
#2083
Conversation
The bsdtar manual page claims that setting zstd:threads to 0 tells zstd to use as many threads as there are cores in the system, but it actually disables multi-threading. Replace 0 with the number of configured processors. While here, add a previously missing overflow check.
d27a4a0
to
a490d7c
Compare
@@ -223,7 +226,10 @@ archive_compressor_zstd_options(struct archive_write_filter *f, const char *key, | |||
if (string_to_number(value, &threads) != ARCHIVE_OK) { | |||
return (ARCHIVE_WARN); | |||
} | |||
if (threads < 0) { | |||
if (threads == 0) { | |||
threads = sysconf(_SC_NPROCESSORS_CONF); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think sysconf exists on windows.
Also note that this differs from the behaviour of the zstd command line utility (this returns vcores, the zstd util uses physical cores), which might be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using physical cores is wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The zstandard developers seem to disagree- they use physical cores by default and there's another separate option to change that to use logical cores. I figured this code should map to zstd options as much as possible, to avoid surprises and keep things simple in case we decide to add the same choice (physical vs logical) later.
https://man.freebsd.org/cgi/man.cgi?query=zstd&sektion=1&format=html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The zstandard developers are wrong, feel free to take it up with them, I don't care.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I agree that using physical cores might actually give a better result given the memory bandwidth that zstd is using, I don't think the difference justifies the extra complexity here. That said, I think this needs a feature test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
“using physical cores might actually give a better result” assumes things about physical vs logical cores that might or might not be true today and might or might not be true tomorrow, using sysconf()
at least gives the administrator some measure of control.
Whether to use _SC_NPROCESSORS_ONLN
or _SC_NPROCESSORS_CONF
is up for debate, they may or may not return different values depending on the platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's some discussion where the main zstandard developer explains why they prefer a more conservative number-of-physical-cores default (and there's a linked PR that added the option to choose between physical/logical core count):
facebook/zstd#2071 (comment)
Choosing either default makes assumptions.
If users switch from using the external zstd cli util with tar to using the library with bsdtar and we use the logical core count, this could double the memory usage which is significant for zstandard at higher levels.
Re _SC_NPROCESSORS_ONLN
vs _SC_NPROCESSORS_CONF
, I think the former makes most sense (on linux: don't count offline-but-hot-pluggable CPUs - not sure about other platforms).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the requirements for this option are subjective- what does the libarchive project want here?
Personally, my only really key concern is portability: we need to support Windows, macOS, Linux, FreeBSD, etc, and we need to run well on a variety of processors.
Apart from that, simplicity is really nice.
Excessive memory use is a concern: bsdtar is the major consumer of this code, but someone might well want to keep a dearchiving pipeline open in a server for a few hours.
I'm really not worried about getting this exactly right. Anything from 50% to 200% of the "theoretical ideal" value seems fine:
- An under-estimate just leaves some room for other things running on the system.
- A modest over-estimate will add a minor amount of thread-switch overhead.
- The real goal of this default is convenience; people who need to fine-tune these behaviors have the tools to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case, feel free to copy+paste this implementation for windows from my earlier PR:
#if defined(_WIN32)
SYSTEM_INFO sysinfo;
GetSystemInfo(&sysinfo);
if (sysinfo.dwNumberOfProcessors > 0) {
cores = sysinfo.dwNumberOfProcessors;
return cores;
}
#endif
IIUC that will return at most 32 or 64 depending on the windows bitness, but might be sufficient if you want the code to be simple? (I guess that should be mentioned in a comment.)
And if you wrap the sysconf usage in #if defined(HAVE_UNISTD_H) && defined(_SC_NPROCESSORS_ONLN) ... #endif
then this will compile on windows and probably work on most current unix setups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a better API to use for windows (confirmed to work for more than 64 logical cpu cores), updated #2071 along with the manpage changes + extra bounds checking from this PR.
@dag-erling Can you resolve the conflicts that this brought in? |
The bsdtar manual page claims that setting zstd:threads to 0 tells zstd to use as many threads as there are cores in the system, but it actually disables multi-threading. Replace 0 with the number of configured processors. While here, add a previously missing overflow check. Co-authored-by: Martin Matuska <martin@matuska.de>
Merged with modifications in 3efcadf |
This is the correct solution for what #2071 tries to do, and also improves the manual page somewhat.