Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement zstd:threads=0 #2083

Closed
wants to merge 2 commits into from

Conversation

dag-erling
Copy link
Contributor

This is the correct solution for what #2071 tries to do, and also improves the manual page somewhat.

The bsdtar manual page claims that setting zstd:threads to 0 tells zstd to use as many threads as there are cores in the system, but it actually disables multi-threading.  Replace 0 with the number of configured processors.

While here, add a previously missing overflow check.
@@ -223,7 +226,10 @@ archive_compressor_zstd_options(struct archive_write_filter *f, const char *key,
if (string_to_number(value, &threads) != ARCHIVE_OK) {
return (ARCHIVE_WARN);
}
if (threads < 0) {
if (threads == 0) {
threads = sysconf(_SC_NPROCESSORS_CONF);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think sysconf exists on windows.

Also note that this differs from the behaviour of the zstd command line utility (this returns vcores, the zstd util uses physical cores), which might be confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using physical cores is wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The zstandard developers seem to disagree- they use physical cores by default and there's another separate option to change that to use logical cores. I figured this code should map to zstd options as much as possible, to avoid surprises and keep things simple in case we decide to add the same choice (physical vs logical) later.

https://man.freebsd.org/cgi/man.cgi?query=zstd&sektion=1&format=html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The zstandard developers are wrong, feel free to take it up with them, I don't care.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that using physical cores might actually give a better result given the memory bandwidth that zstd is using, I don't think the difference justifies the extra complexity here. That said, I think this needs a feature test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“using physical cores might actually give a better result” assumes things about physical vs logical cores that might or might not be true today and might or might not be true tomorrow, using sysconf() at least gives the administrator some measure of control.

Whether to use _SC_NPROCESSORS_ONLN or _SC_NPROCESSORS_CONF is up for debate, they may or may not return different values depending on the platform.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's some discussion where the main zstandard developer explains why they prefer a more conservative number-of-physical-cores default (and there's a linked PR that added the option to choose between physical/logical core count):
facebook/zstd#2071 (comment)

Choosing either default makes assumptions.

If users switch from using the external zstd cli util with tar to using the library with bsdtar and we use the logical core count, this could double the memory usage which is significant for zstandard at higher levels.

Re _SC_NPROCESSORS_ONLN vs _SC_NPROCESSORS_CONF, I think the former makes most sense (on linux: don't count offline-but-hot-pluggable CPUs - not sure about other platforms).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the requirements for this option are subjective- what does the libarchive project want here?

Personally, my only really key concern is portability: we need to support Windows, macOS, Linux, FreeBSD, etc, and we need to run well on a variety of processors.

Apart from that, simplicity is really nice.

Excessive memory use is a concern: bsdtar is the major consumer of this code, but someone might well want to keep a dearchiving pipeline open in a server for a few hours.

I'm really not worried about getting this exactly right. Anything from 50% to 200% of the "theoretical ideal" value seems fine:

  • An under-estimate just leaves some room for other things running on the system.
  • A modest over-estimate will add a minor amount of thread-switch overhead.
  • The real goal of this default is convenience; people who need to fine-tune these behaviors have the tools to do so.

Copy link
Contributor

@mostynb mostynb Mar 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, feel free to copy+paste this implementation for windows from my earlier PR:

#if defined(_WIN32)
	SYSTEM_INFO sysinfo;
	GetSystemInfo(&sysinfo);
	if (sysinfo.dwNumberOfProcessors > 0) {
		cores = sysinfo.dwNumberOfProcessors;
		return cores;
	}
#endif

IIUC that will return at most 32 or 64 depending on the windows bitness, but might be sufficient if you want the code to be simple? (I guess that should be mentioned in a comment.)

And if you wrap the sysconf usage in #if defined(HAVE_UNISTD_H) && defined(_SC_NPROCESSORS_ONLN) ... #endif then this will compile on windows and probably work on most current unix setups.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a better API to use for windows (confirmed to work for more than 64 logical cpu cores), updated #2071 along with the manpage changes + extra bounds checking from this PR.

@kientzle
Copy link
Contributor

@dag-erling Can you resolve the conflicts that this brought in?

mmatuska added a commit to mmatuska/libarchive that referenced this pull request Apr 23, 2024
The bsdtar manual page claims that setting zstd:threads to 0 tells zstd
to use as many threads as there are cores in the system, but it actually
disables multi-threading.  Replace 0 with the number of configured
processors.

While here, add a previously missing overflow check.

Co-authored-by: Martin Matuska <martin@matuska.de>
@mmatuska
Copy link
Member

Merged with modifications in 3efcadf

@mmatuska mmatuska closed this Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants