Does not the rpmbuild command execute (de)compression of files in parallel? #113

Closed
leemgs opened this Issue Jan 2, 2017 · 7 comments

@leemgs

commented Jan 2, 2017

It seems that the 'rpmbuild' command only uses a single CPU core. Can we decompress and compress files in parallel while packaging with the 'rpmbuild' command?

  • The log messages while executing 'rpmbuild' command
 
[   46s] + exec rpmbuild --define '_srcdefattr (-,root,root)' --nosignature --target=armv7l-tizen-linux --define '_build_create_debug 1' -ba /home/abuild/rpmbuild/SOURCES/chromium-efl.spec
[   46s] Building target platforms: armv7l-tizen-linux
[   46s] Building for target armv7l-tizen-linux
[   46s] Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.nfcsUa
[   46s] + umask 022
[   46s] + cd /home/abuild/rpmbuild/BUILD
[   46s] + cd /home/abuild/rpmbuild/BUILD
[   46s] + rm -rf chromium-efl-47.2526.69.49
[   46s] + /bin/bzip2 -dc /home/abuild/rpmbuild/SOURCES/chromium-efl-47.2526.69.49.tar.bz2
[   46s] + /bin/tar -xf -
[  161s] + STATUS=0
[  161s] + '[' 0 -ne 0 ']'
[  161s] + cd chromium-efl-47.2526.69.49
[  161s] + /bin/chmod -Rf a+rX,u+w,g-w,o-w .
[  164s] + exit 0
[  164s] Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.JOaBJs

As we can see, rpmbuild uses the single-threaded bzip2 command rather than the parallel pbzip2 command, and unpacking the source tarball alone takes about 115 seconds (from 46s to 161s in the log).
Has anyone had a similar experience on multi-core CPUs?

In the past, we sometimes used the "--threads" option of the xz package, for example "$ tar -cf - source | xz --threads=0 > foo-destination.tar.xz". Currently, according to the rpm mailing list (lists.rpm.org), it seems that multi-threaded operation can be enabled only for the xz (LZMA) compression type when building rpm packages on multi-core systems.

@leemgs

Author

commented Jan 2, 2017

As a workaround, how about the following approach? For now, let's assume that the source tarball of the rpm package is compressed as "*.tar.bz2".

u$> cat ./sample.spec
Summary: The "Hello World" program from GNU
Name: hello
Version: 2.8
Release: 1%{?dist}
# Source0 below is the bzip2-compressed tarball in question.
Source0: %{name}-%{version}.tar.bz2
License: GPLv3+
Group: Development/Tools

There are a lot of multi-core aware compression utilities.

sudo apt-get install plzip
sudo apt-get install pigz
sudo apt-get install pbzip2
sudo apt-get install lbzip2
sudo apt-get install lrzip
sudo apt-get install pxz
sudo apt-get install p7zip

Now let's see how to use a parallel bzip2 (e.g., pbzip2). First, replace the existing bzip2 command with the pbzip2 command as a workaround on Ubuntu 16.04 LTS (x86_64).

u$> sudo apt-get install rpm pigz pbzip2 lbzip2 
u$> sudo mv /bin/bzip2 /bin/bzip2.disable
u$> sudo cp /usr/bin/pbzip2 /bin/bzip2

And then, let's execute the "rpmbuild -bb sample.spec" command. Is that okay?
Alternatively, I can use the 'alias' command as follows:

$ sudo vi /etc/profile
# Enable parallelism to all compression utilities for multi-core systems.
alias bsdtar='bsdtar --use-compress-program=pbzip2'
alias tar='tar --use-compress-program=pbzip2'
alias gzip='/usr/bin/pigz'
alias gunzip='/usr/bin/unpigz'
alias bzip2='/usr/bin/pbzip2'
alias bunzip2='/usr/bin/pbunzip2'
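
One caveat with the alias approach: the rpmbuild log above shows the decompressor being run as /bin/bzip2 by absolute path, which a shell alias named bzip2 would not intercept. A less invasive option is to pick a parallel tool explicitly and fall back to the stock one when it is missing. The following is only a minimal sketch under that assumption; pick_compressor and extract_tarball are hypothetical helper names, not part of rpm.

```shell
#!/bin/sh
# Hypothetical helper: print the first available tool for a given format,
# preferring the parallel variants mentioned in this thread.
pick_compressor() {
    case "$1" in
        gzip)  candidates="pigz gzip" ;;
        bzip2) candidates="pbzip2 lbzip2 bzip2" ;;
        *)     echo "unsupported format: $1" >&2; return 1 ;;
    esac
    for tool in $candidates; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "no tool found for format: $1" >&2
    return 2
}

# Hypothetical helper: unpack a tarball the same way the %prep log does
# (decompressor -dc piped into tar -xf -), but with the chosen tool.
extract_tarball() {
    archive=$1
    dest=$2
    case "$archive" in
        *.tar.gz|*.tgz) format=gzip ;;
        *.tar.bz2)      format=bzip2 ;;
        *) echo "unsupported archive: $archive" >&2; return 1 ;;
    esac
    tool=$(pick_compressor "$format") || return 1
    mkdir -p "$dest" && "$tool" -dc "$archive" | tar -xf - -C "$dest"
}
```

For example, "extract_tarball hello-2.8.tar.bz2 ./BUILD" would use pbzip2 when it is installed and plain bzip2 otherwise, without touching /bin/bzip2.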

leemgs added a commit to leemgs/rpm that referenced this issue Jan 13, 2017

[WIP] Replace gzip/bzip2/xz with pigz/pbzip2/pxz for multicore environment

This PR fixes issue rpm-software-management#113. It can also be used as an alternative to PR rpm-software-management#117.
Nowadays, most developers use machines based on a multi-core architecture.
From now on, let's prepare to support parallelism.

pigz is compatible with the existing gzip because it links against zlib (libz).
$ ldd /usr/bin/pigz | grep -e 'libz\|pthread'
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe318f23000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fe318d09000)

pbzip2 is compatible with the existing bzip2 because pbzip2 uses bzip2's library (libbz2).
$ ldd /usr/bin/pbzip2 | grep -e 'libbz2\|pthread'
	libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007fa9d089a000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fa9d067d000)

pxz is compatible with the existing xz because pxz uses xz's library (liblzma).
$ ldd /usr/bin/pxz |grep -e 'liblzma\|pthread'
	liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f27d3fd5000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f27d3b95000)

Signed-off-by: Geunsik Lim <geunsik.lim@samsung.com>
@ffesti

Contributor

commented Feb 1, 2017

The main problem here is that rpm uses compression libraries for creating and unpacking the package content. Unfortunately, the number of libraries offering parallel compression is very limited. There is now support for multi-threaded xz compression if rpm is compiled against xz > 5.2.0. You can set %_source_payload and %_binary_payload to "w9T16.xzdio", where the first number is the compression level and the second is the number of threads.
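
To illustrate the payload string format described above, here is a minimal shell sketch that assembles it from a compression level and a thread count. The helper name xz_payload is hypothetical; only the "w<level>T<threads>.xzdio" layout comes from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: build an xz payload string such as "w9T16.xzdio".
#   $1 = compression level (single digit 0-9)
#   $2 = number of threads (non-negative integer)
xz_payload() {
    level=$1
    threads=$2
    case "$level" in
        [0-9]) ;;
        *) echo "level must be a single digit 0-9" >&2; return 1 ;;
    esac
    case "$threads" in
        ''|*[!0-9]*) echo "threads must be a non-negative integer" >&2; return 1 ;;
    esac
    printf 'w%sT%s.xzdio\n' "$level" "$threads"
}
```

One possible use is passing the result on the command line, for example: rpmbuild --define "_binary_payload $(xz_payload 9 16)" -bb sample.spec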

External tools are all configurable by macro. Switching to more modern variants is not something that rpm upstream is likely to do soon. It is the decision of the distributions to support them and change the macros accordingly. Upstream may follow the consensus of the distributions later.
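
As a rough sketch of what "configurable by macro" can look like in practice, the helper below prints a macro definition that points one of rpm's external tool macros (%__gzip, %__bzip2, %__xz) at a different binary. The name tool_macro_line is hypothetical, and whether a given rpm version consults these macros at every unpacking step is an assumption to verify against its macros file.

```shell
#!/bin/sh
# Hypothetical helper: print a macro-file line that overrides an external
# tool macro, e.g. "%__bzip2 /usr/bin/pbzip2".
#   $1 = tool name (gzip, bzip2 or xz)
#   $2 = absolute path of the replacement binary
tool_macro_line() {
    case "$1" in
        gzip|bzip2|xz) ;;
        *) echo "unsupported tool: $1" >&2; return 1 ;;
    esac
    case "$2" in
        /*) ;;
        *) echo "replacement must be an absolute path: $2" >&2; return 1 ;;
    esac
    printf '%%__%s %s\n' "$1" "$2"
}
```

A per-user override could then be written with: tool_macro_line bzip2 /usr/bin/pbzip2 >> ~/.rpmmacros. A one-off override is also possible with rpmbuild --define "__bzip2 /usr/bin/pbzip2" -bb sample.spec.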

@leemgs

Author

commented Feb 2, 2017

It is the decision of the distributions to support them and change the macros accordingly.
Upstream may follow the consensus of the distributions later.

Right. However, I hope that rpm upstream will enable multi-threaded xz compression by default for distributions. :) For example,

# https://fedoraproject.org/wiki/Features/XZRpmPayloads
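%define _compression_level    7
# _smp_mflags expands to "-jN", so the bare processor count is kept in a
# separate macro (_xz_threads is an illustrative name, not an rpm built-in).
%define _xz_threads %(/usr/bin/getconf _NPROCESSORS_ONLN)
%define _source_payload   w%{_compression_level}T%{_xz_threads}.xzdio
%define _binary_payload   w%{_compression_level}T%{_xz_threads}.xzdio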
@pmatilai

Contributor

commented Feb 2, 2017

Multithreaded compression is not some magic holy grail. Sure, it can make the compression phase faster, but it requires considerably more memory and, more importantly, the compression ratio degrades, so for distros it might be a bad choice. Also, AFAICT deltarpm requires bit-by-bit equivalent compression everywhere to operate correctly, and that's not the case if compression depends on things like the number of processors on the system.

Point is, anybody considering enabling it needs to consider whether it's an actual win for a particular use-case - there are cases like CI builds where speed can be far more important than compression ratio.
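
Anyone weighing this tradeoff can check the reproducibility concern directly on their own inputs. The sketch below compares the bytes xz emits with one thread against the bytes it emits with N threads; compare_xz_threads is a hypothetical helper, and it assumes xz 5.2 or later (for --threads) and cksum are available. No particular result is implied here, since it depends on the xz version and the input.

```shell
#!/bin/sh
# Hypothetical helper: print "identical" when xz produces the same output
# with 1 thread and with N threads for the given file, else "different".
#   $1 = input file
#   $2 = thread count to compare against the single-threaded run
compare_xz_threads() {
    input=$1
    threads=$2
    # cksum prints a CRC and a byte count, enough for a quick comparison.
    single=$(xz -6 --threads=1 --stdout "$input" | cksum)
    multi=$(xz -6 --threads="$threads" --stdout "$input" | cksum)
    if [ "$single" = "$multi" ]; then
        echo identical
    else
        echo different
    fi
}
```

Running it over a representative payload file (for example, compare_xz_threads payload.cpio 4, where payload.cpio is a placeholder name) gives a quick indication of whether thread count affects the compressed bytes in a given setup.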

@ffesti

Contributor

commented Feb 2, 2017

Two ways to drive the adoption of multi threaded compressors could be
a) Talk to distributions to make those compressors part of their core set of packages and change the macros in their rpm package
b) Offer patches upstream to add those compressors as alternatives in the macros.in or configure file to make it easier for distributions to switch.

@Conan-Kudo

Member

commented Feb 3, 2017

I would hazard to say I would almost never recommend switching to multithreaded stuff by default, after accounting for how it would potentially break DeltaRPM. Unless there's a way to make that deterministic, I see no pathway that would allow for widespread usage of multithreaded compression.

@pmatilai

Contributor

commented Feb 17, 2017

So...

I've yet to see any evidence that parallel decompression speeds up anything at all. In fact, that none of the p* variants implement that seems to suggest it doesn't. So speculating on how to best change the macros to use the parallel versions is totally moot because they're only used for decompression.

As for parallel compression, we already have that for XZ, which is the most widely used format by distros these days since it offers the best compression. Switching to XZ as the upstream default could be discussed (in a separate ticket/PR), but enabling parallel compression by default is not going to happen because it's a tradeoff for the worse in many common use cases.

@pmatilai pmatilai closed this Feb 17, 2017
