Make xz compression use all available cores #11
Compression takes a while. This should make it faster. (Tested on my local machine)
Since one thread is implied to be running, we need to forcefully ignore one of the cores so that the number of compression threads matches the number of physical cores.
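To see what thread count this commit would hand to xz on a given machine, the substitution can be run on its own. This is a minimal sketch that assumes GNU coreutils `nproc` is installed; the variable names `total` and `threads` are illustrative and do not come from the script being patched.

```shell
#!/bin/sh
# Illustrative only: `total` and `threads` are not names from the patched script.
total=$(nproc)                 # processing units available to this process
threads=$(nproc --ignore=1)    # same count minus one; nproc never prints less than 1
echo "available: $total, xz threads: $threads"

# The command string the patch would build on this machine.
COMPRESS="xz --threads=${threads} -c -"
echo "$COMPRESS"
```

On a single-core machine both values come out as 1, so the command degrades to `xz --threads=1 -c -`.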
@@ -78,7 +78,7 @@ for i in $FILES; do
 COMPRESS="bzip2 -c -"
 NEWFILE="${BASENAME#_service:}.bz2"
 elif [ "$MYCOMPRESSION" == "xz" ]; then
-COMPRESS="xz -c -"
+COMPRESS="xz --threads=$(nproc --ignore=1) -c -"
 NEWFILE="${BASENAME#_service:}.xz"
Why not simply set threads to 0? From man xz:
Setting threads to a special value 0 makes xz use as many threads as there are CPU cores on the system.
Setting threads to 0 works (just tested it) and is simpler; it just did not cross my mind at the time. I'll change it.
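The simplified selection could look roughly like the sketch below. `compress_cmd` is a hypothetical helper, not a function from the script, and the `bz2` key is an assumption; only the `xz` comparison and the two command strings are visible in the diff.

```shell
#!/bin/sh
# Hypothetical helper: maps a compression name to the filter command.
# Only the two branches visible in the diff are covered; the "bz2" key is assumed.
compress_cmd() {
    case "$1" in
        bz2) echo "bzip2 -c -" ;;
        xz)  echo "xz --threads=0 -c -" ;;   # 0 = let xz use as many threads as cores
        *)   echo "unsupported compression: $1" >&2; return 1 ;;
    esac
}

MYCOMPRESSION="xz"
COMPRESS=$(compress_cmd "$MYCOMPRESSION")
echo "$COMPRESS"
```

Compared with `$(nproc --ignore=1)`, this drops the dependency on `nproc` and leaves the core detection to xz itself.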
Simpler solution
Hmm, 3 commits for 1 line changed - could you squash them?
Yes,

Note that the same option can be used for
I found yesterday that our b4 package build became unreproducible because xz output differs when only 1 thread is used. The thread at https://lists.archlinux.org/pipermail/arch-dev-public/2019-March/029520.html also noticed this. And here is a reproducer: Could we use
PR openSUSE#11 introduced compression depending on the number of available CPUs, but xz then produces different output on 1-core VMs:

    for n in 2 3 ; do echo | taskset $n xz --threads=0 -c - | md5sum ; done

See https://reproducible-builds.org/ for why this matters.
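The same comparison can be sketched without `taskset` by passing the thread count explicitly. This assumes, as the xz man page describes, that `--threads=1` selects the single-threaded encoder while higher values select the multi-threaded one; `xz_sum` is a hypothetical helper name, and the result may vary between xz versions.

```shell
#!/bin/sh
# Hypothetical helper: checksum of a fixed one-line input compressed with N threads.
xz_sum() {
    echo | xz --threads="$1" -c - | md5sum | cut -d' ' -f1
}

if command -v xz >/dev/null 2>&1; then
    single=$(xz_sum 1)   # single-threaded encoder
    multi=$(xz_sum 2)    # multi-threaded encoder
    if [ "$single" = "$multi" ]; then
        echo "same output for 1 and 2 threads"
    else
        echo "output differs: $single (1 thread) vs $multi (2 threads)"
    fi
else
    echo "xz not installed, skipping"
fi
```

If the two checksums differ, a build that lets the thread count follow the CPU count cannot be bit-for-bit reproducible across machines with one and with several cores.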
It's a known limitation of

the need to avoid threads=1 is not a reason to create larger files IMHO...