Make xz compression use all available cores #11

Merged: 3 commits into openSUSE:master on Jul 27, 2020

Contributor

@XenonPK XenonPK commented Apr 6, 2017

Compression takes a while. This should make it faster. (Tested on my local machine)

Since one thread is implied to be running, we need to forcefully ignore one of the cores so that the number of compression threads matches the number of physical cores.
@@ -78,7 +78,7 @@ for i in $FILES; do
 COMPRESS="bzip2 -c -"
 NEWFILE="${BASENAME#_service:}.bz2"
 elif [ "$MYCOMPRESSION" == "xz" ]; then
-COMPRESS="xz -c -"
+COMPRESS="xz --threads=$(nproc --ignore=1) -c -"
 NEWFILE="${BASENAME#_service:}.xz"
Collaborator

Why not simply set threads to 0? From man xz:

Setting threads to a special value 0 makes xz use as many threads as there are CPU cores on the system.

Contributor Author

Setting threads to 0 works (I just tested it) and is simpler; it just did not cross my mind at the time. I'll change it.

Simpler solution
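
A minimal sketch of how the revised xz branch could look when factored into a helper. The function name compress_command and the "bz2" key are hypothetical; only the two command strings follow the diff and the review suggestion above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): map a compression
# name to a command line that compresses stdin to stdout.
compress_command() {
    case "$1" in
        # "bz2" as the key is an assumption; the diff only shows the command.
        bz2) echo "bzip2 -c -" ;;
        # --threads=0 asks xz to use as many threads as there are CPU cores.
        xz)  echo "xz --threads=0 -c -" ;;
        *)   echo "unsupported compression: $1" >&2
             return 1 ;;
    esac
}
```

A caller could then run something like COMPRESS=$(compress_command xz) and pipe data through $COMPRESS, much as the script does with its COMPRESS variable.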
@M0ses
Collaborator

M0ses commented Jul 27, 2017

Hmm, 3 commits for 1 line changed - could you squash them?

@marxin

marxin commented Jul 26, 2020

Yes, -T0 seems to me like a logical option.

@marxin

marxin commented Jul 26, 2020

Note that the same option can be used for zstd compression.
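
As a rough illustration of that note, the sketch below builds the same style of command for either tool, since both xz and zstd accept -T0 to choose the thread count from the detected cores. The helper name threaded_command is hypothetical and not taken from the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a multi-threaded compress-to-stdout command.
# Both xz and zstd accept -T0, meaning "pick the thread count automatically".
threaded_command() {
    case "$1" in
        xz|zstd) echo "$1 -T0 -c -" ;;
        *)       echo "no -T0 handling sketched for: $1" >&2
                 return 1 ;;
    esac
}
```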

@adrianschroeter adrianschroeter merged commit b460544 into openSUSE:master Jul 27, 2020
@bmwiedemann
Member

I found yesterday that our b4 package build became unreproducible because xz output differs when only 1 thread is used. The thread at https://lists.archlinux.org/pipermail/arch-dev-public/2019-March/029520.html reports the same issue.

And here is a reproducer:
for n in 2 3 ; do echo | taskset $n xz --threads=0 -c - | md5sum ; done

Could we use --threads=$(n=$(nproc); [[ $n -gt 1 ]] || n=2; echo $n) instead?
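
The suggestion above can be sketched as a small helper. The name thread_count and the optional argument (used only to make the function easy to try with different core counts) are hypothetical. Inside [[ ]], the > operator compares strings, so this sketch uses arithmetic comparison instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return the thread count to pass to xz, never below 2,
# so that a 1-core machine does not fall back to single-threaded output.
# The optional first argument overrides the detected core count.
thread_count() {
    local n="${1:-$(nproc)}"
    # Arithmetic comparison; raise anything at or below 1 to 2.
    (( n > 1 )) || n=2
    echo "$n"
}
```

It could then be used as xz --threads="$(thread_count)" -c - in place of --threads=0.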

bmwiedemann added a commit to bmwiedemann/obs-service-recompress that referenced this pull request Aug 1, 2020
PR openSUSE#11 introduced compression depending on number of available CPUs,
but xz then produces different output on 1-core-VMs:
for n in 2 3 ; do echo | taskset $n xz --threads=0 -c - | md5sum ; done

See https://reproducible-builds.org/ for why this matters.
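
For machines where taskset is not convenient, a check along the following lines might help compare two compressor invocations on the same input. This is only a sketch: compare_outputs is a hypothetical name, and cksum is used here instead of md5sum purely as an assumption about what is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: feed the same input to two command lines and report
# whether their outputs have the same checksum.
compare_outputs() {
    local input="$1" cmd_a="$2" cmd_b="$3" sum_a sum_b
    # The commands are intentionally unquoted so they split into words,
    # in the same way a COMPRESS variable holding a command line would.
    sum_a=$(printf '%s\n' "$input" | $cmd_a | cksum)
    sum_b=$(printf '%s\n' "$input" | $cmd_b | cksum)
    if [ "$sum_a" = "$sum_b" ]; then
        echo "same"
    else
        echo "different"
    fi
}
```

For example, compare_outputs test "xz --threads=1 -c -" "xz --threads=2 -c -" would, going by the report above, be expected to show a difference between single-threaded and multi-threaded output.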
@marxin

marxin commented Aug 4, 2020

It's a known limitation of xz compression, and that's why we want to use zstd, which doesn't suffer from this problem.

@adrianschroeter
Member

The need to avoid threads=1 is not a reason to create larger files IMHO...
