postbuild rpm install fails on different machine with: cpio: MD5 sum mismatch #262

Open
dferrante opened this Issue Sep 19, 2012 · 33 comments


after building an rpm with binaries, the rpm install fails like this:

[dferrant@server ~]$ sudo rpm -i --force feedback-1.0-671.x86_64.rpm
error: unpacking of archive failed on file /var/www/feedback/bin/python;5059d80a: cpio: MD5 sum mismatch

the rpm was built on a build server and then installed on a system with an extremely similar architecture. the package contained linux binaries for python installed via virtualenv.

one fix for this is to remove the prelink package from the build server. running prelink -u against any binaries you are packaging may fix it as well. also, adding the --nomd5 flag to the rpm installation lets it install without error.

Owner

jordansissel commented Feb 26, 2015

Is this still a problem for folks? I've never run into this problem myself, but if it's still occurring, we can fix it.

This happened to me too on RedHat 5
# rpm -i python2.7.9-piksel-rhel5-x86_64.rpm
error: unpacking of archive failed on file /usr/local/python-2.7.9/lib/libbz2.so;55284b0b: cpio: MD5 sum mismatch

opearo commented Jun 25, 2015

Yes. It's still a problem for us. We cannot remove prelink on our build server and have had to resort to using Maven's RPM plugin to get this to work. Would like to use fpm in the future when this is resolved.

I have run into this problem, too, with a number of closed source applications.

Contributor

djhaskin987 commented Oct 5, 2016 edited

We seem to be running into this:
https://access.redhat.com/solutions/148653

The jury is out on what to do with it.

This guy says "Just unprelink first": http://stackoverflow.com/questions/17948184/install-rpm-on-centos

and that might indeed work on EL 5, but not (so far for me) on EL 6. I'll keep digging.

Owner

jordansissel commented Oct 5, 2016

hold the phone. is rpmbuild running prelink on binaries at package-time or something?

Contributor

djhaskin987 commented Oct 5, 2016

Solutions I've seen:

  • Uninstall prelink from the build machine. This is not useful and appears to only work on EL 5.
  • Remove the '/etc/rpm/macros.prelink' file (or whatever it is). Same problem as above.
  • Define the macro __prelink_undo_cmd as /bin/cat prelink library. Seems to be of questionable value and may only work on EL 5.
  • in the %install section, run prelink -u <so-file-with-the-problem>. This seems to be the best so far, but I haven't tried it out yet.
  • Optionally, add a file to the RPM telling prelink not to touch this file (echo '-b /usr/bin/name' > $RPM_BUILD_ROOT/etc/prelink.conf.d/%{name}.conf) so that rpm -V still works on the target machine (from here: https://lists.fedoraproject.org/pipermail/devel/2011-December/160415.html )

Owner

jordansissel commented Oct 5, 2016

On Fedora 23, I grepped /usr/lib/rpm/** for prelink and only found something commented-out. On Fedora 23 I can't find prelink available for install at all. Maybe prelink was given-up on?

Contributor

djhaskin987 commented Oct 5, 2016

I should hope so, but I grepped also on CentOS 6. It seems to be a problem still on EL 6 and EL 5. Happily, though, the CentOS 7 build I'm currently working on has no problems.

Contributor

djhaskin987 commented Oct 5, 2016

@jordansissel post the commented out block? It may help find a solution :)

Contributor

djhaskin987 commented Oct 5, 2016 edited

Ah, I see the commented out block.

[bamboo-agent@cm-cent6-taatra84 centre]$ grep -r 'prelink' /usr/lib/rpm/
Binary file /usr/lib/rpm/rpmdb_recover matches
Binary file /usr/lib/rpm/rpmdb_archive matches
Binary file /usr/lib/rpm/rpmdb_load matches
Binary file /usr/lib/rpm/rpmdb_deadlock matches
Binary file /usr/lib/rpm/rpmdeps matches
Binary file /usr/lib/rpm/rpmdb_upgrade matches
Binary file /usr/lib/rpm/rpmdb_stat matches
Binary file /usr/lib/rpm/rpmdb_dump matches
Binary file /usr/lib/rpm/debugedit matches
Binary file /usr/lib/rpm/javadeps matches
/usr/lib/rpm/macros:# XXX rpm-4.1 verifies prelinked libraries using a prelink undo helper.
/usr/lib/rpm/macros:#   Normally this macro is defined in /etc/rpm/macros.prelink, installed
/usr/lib/rpm/macros:#   with the prelink package. If the macro is undefined, then prelinked
/usr/lib/rpm/macros:#   than MD5 verifying the output of the prelink undo helper.
/usr/lib/rpm/macros:#%__prelink_undo_cmd     /usr/sbin/prelink prelink -y library
Binary file /usr/lib/rpm/rpmdb_verify matches
Binary file /usr/lib/rpm/rpmdb_printlog matches

With this in my /usr/lib/rpm/macros, I can confirm that:

  1. I am un-prelinking every SO file in the RPM I'm making.
  2. After I create the RPM, the SO file is prelinked again. I verified this by running the following after FPM finished:
prelink -u <fpm-work-directory>/BUILD/<path-to-the-problem-so-file>.so

The command worked, so the file was indeed prelinked. So I have a file which I know wasn't prelinked, but the copy of that SO file in the BUILD directory is prelinked.

Hope this helps

Owner

jordansissel commented Oct 5, 2016

Yeah that's what mine looks like also. Maybe it's not part of rpm macros but somehow still is executed during packaging?

Owner

jordansissel commented Oct 5, 2016

I'm on CentOS 6.7 with prelink installed.

I tested this like so:

% echo 'int main() { return 7; }' > test.c
% gcc -o foo ./test.c

# Checksum original.
% sha1sum ./foo
800d59431340ea1bc96031d08afb19622eb3cbfb  ./foo

# Prelink manually to see what happens.
% prelink -C /tmp/cache ./foo
% sha1sum ./foo
759e512ffb3f408fd0ae53b1983f0cf7328b8ebf  ./foo

# Unprelink so we can package it and see what happens:
% prelink -u -C /tmp/cache ./foo
% sha1sum ./foo
800d59431340ea1bc96031d08afb19622eb3cbfb  ./foo

# Package it
% fpm -s dir -t rpm -n example ./foo=/opt/fancy/test
Created package {:path=>"example-1.0-1.x86_64.rpm"}

# Test the files in the package
% rpm2cpio example-1.0-1.x86_64.rpm | cpio -i --make-directories
% sha1sum ./foo opt/fancy/test
800d59431340ea1bc96031d08afb19622eb3cbfb  opt/fancy/test

# ^^^ so packaging didn't modify it. 

# try installing it:
% sudo yum install example-1.0-1.x86_64.rpm
...
% sha1sum /opt/fancy/test
800d59431340ea1bc96031d08afb19622eb3cbfb  /opt/fancy/test

I am not seeing prelink involved at all in the packaging step with fpm+rpmbuild.

Hmm :|

Owner

jordansissel commented Oct 5, 2016

I tried with a .so in my rpm package also and it was also unchanged in the rpm

Contributor

djhaskin987 commented Oct 5, 2016

I'm on 6.6. I don't know if that matters, but it might?

Owner

jordansissel commented Oct 5, 2016

No idea, haha. Can you strace -fe trace=file,execve -o /tmp/strace.out fpm ... when invoking fpm? Maybe we can see how/when it invokes prelink during rpmbuild?
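
A log captured this way can get large, so a small filter may help narrow it down. The following is only a sketch: the function name `find_prelink_calls` is made up for illustration, and it assumes the log was written with `-o /tmp/strace.out` as in the command above.

```shell
# Hypothetical helper: print strace lines for execve/open/openat calls
# that mention "prelink" anywhere on the line (for example the prelink
# binary being executed, or an rpm macros file being opened).
# $1: path to a log produced by `strace -f -o <file> ...`
find_prelink_calls() {
    grep -E '(execve|open|openat)\(.*prelink' "$1"
}

# Example: find_prelink_calls /tmp/strace.out
```

The pattern is deliberately loose. It will also match an execve line whose environment merely contains the word "prelink", so the hits still need a manual look.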

Contributor

djhaskin987 commented Oct 6, 2016

Good idea, I'll do that today :)

Contributor

djhaskin987 commented Oct 6, 2016 edited

Found it!

From the strace log:

9353  open("/etc/rpm/macros.prelink", O_RDONLY) = 8

Which has:

%__prelink_undo_cmd     /usr/sbin/prelink prelink -y library

uncommented in it. So it's in etc, not usr/lib. Inconsistency (╯°□°)╯︵ ┻━┻
Also, we have this line:

14393 execve("/usr/sbin/prelink", ["prelink", "-y", "/home/<build-user>/<git-repo>/<project>/<project>-<app>/fpm_work/package-rpm-build20161006-14385-1kjlt5p/BUILD/opt/<company-name>/<long-app-name>/bin/libpython3.5m.so.1.0"], ["HOSTNAME=<buildagent-hostname>", "TERM=screen.rxvt", "SHELL=/bin/bash", "HISTSIZE=1000", "OLDPWD=/home/<build-user>/<git-repo>", "QTDIR=/usr/lib64/qt-3.3", "QTINC=/usr/lib64/qt-3.3/include", "USER=<build-user>", "LS_COLORS=", "POSIXLY_CORRECT=1", "VIRTUAL_ENV=/home/<build-user>/<git-repo>/<project>/<project>-<app>/.env", "PATH=/home/<build-user>/<git-repo>/<project>/<project>-<app>/.env/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/<build-user>/bin", "MAIL=/var/spool/mail/<build-user>", "PIP_CONFIG_FILE=/etc/pip.conf", "PWD=/home/<build-user>/<git-repo>/<project>/<project>-<app>", "LANG=en_US.UTF-8", "PS1=(.env) ", "HISTCONTROL=ignoredups", "HOME=/home/<build-user>", "SHLVL=2", "LOGNAME=<build-user>", "CVS_RSH=ssh", "QTLIB=/usr/lib64/qt-3.3/lib", "LESSOPEN=||/usr/bin/lesspipe.sh %s", "G_BROKEN_FILENAMES=1", "_=/home/<build-user>/<git-repo>/<project>/<project>-<app>/.env/bin/python", "TMP=/home/<build-user>/<git-repo>/<project>/<project>-<app>/fpm_work"]) = 0

So prelink is happening.

Strace attached. I search-and-replaced company-specific stuff. You'll see a directory in there called 'fpm_work', that's where I told fpm to put its temporary directory stuff so I can look at it later :) let me know if this helps, I'm still digging through it

Contributor

djhaskin987 commented Oct 6, 2016 edited

I can positively confirm it.

I'm running this:

[bamboo-agent@cm-cent6-taatra84 ~]$ fpm --debug-workspace -s tar -n fo -v 4 -t rpm fo/fo.tar.gz 
no value for epoch is set, defaulting to nil {:level=>:warn}
no value for epoch is set, defaulting to nil {:level=>:warn}
Created package {:path=>"fo-4-1.x86_64.rpm"}
plugin directory {:plugin=>"tar", :path=>"/tmp/package-tar-staging20161006-9296-vno81l", :pathtype=>:staging_path}
plugin directory {:plugin=>"tar", :path=>"/tmp/package-tar-build20161006-9296-13mnmli", :pathtype=>:build_path}
plugin directory {:plugin=>"rpm", :path=>"/tmp/package-tar-staging20161006-9296-vno81l", :pathtype=>:staging_path}
plugin directory {:plugin=>"rpm", :path=>"/tmp/package-rpm-build20161006-9296-19pu9a6", :pathtype=>:build_path}

When prelink is installed on my system, I get this dump on the RPM file:

[bamboo-agent@cm-cent6-taatra84 ~]$ rpm -qp --dump fo-4-1.x86_64.rpm 
/libpython3.5m.so.1.0 2743928 1475795707 00000000000000000000000000000000 0100755 root root 0 0 0 X

When it isn't, I get this:

[bamboo-agent@cm-cent6-taatra84 ~]$ rpm -qp --dump fo-4-1.x86_64.rpm 
/libpython3.5m.so.1.0 2743928 1475795707 ba63a27f85afb22cd9ba52c4645d8a99 0100755 root root 0 0 0 X

When prelink is present, the md5 sum of the file is miswritten in the RPM and this causes it to fail on install.

It is called as part of the function open_dso, found in the file rpmio/rpmfileutils.c in the 4.8.0 version of the RPM source code. This function is in turn called by the rpmDoDigest function of the same file. Basically, when the file is opened for reading via open_dso, it is first filtered through the prelink executable, if the executable is available.
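
The all-zero digest in the dump above can be checked for mechanically before a package is shipped. This is a minimal sketch, not part of fpm or rpm: the function name `find_zero_digests` is hypothetical, and it assumes the field order shown in the dump output above (path, size, mtime, digest, mode, ...) and paths without whitespace.

```shell
# Hypothetical helper: read `rpm -qp --dump` output on stdin and print the
# path of every regular file whose digest field consists only of zeros.
# Field 4 is the digest and field 5 the octal mode; only modes of the form
# 010xxxx (regular files) are considered, so that directories, which are
# assumed to carry a zero digest legitimately, are not reported.
find_zero_digests() {
    awk '$4 ~ /^0+$/ && $5 ~ /^0?10[0-7][0-7][0-7][0-7]$/ { print $1 }'
}

# Example: rpm -qp --dump fo-4-1.x86_64.rpm | find_zero_digests
```

Any path it prints is a candidate for the install-time mismatch described in this thread. Other legitimate causes of a zero digest may exist, so treat the output as a hint rather than a verdict.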

RPM version:

[bamboo-agent@cm-cent6-taatra84 ~]$ rpm --version
RPM version 4.8.0

CentOS version:

[bamboo-agent@cm-cent6-taatra84 ~]$ cat /etc/redhat-release 
CentOS release 6.8 (Final)

Currently working on a solution.

Contributor

djhaskin987 commented Oct 7, 2016 edited

OK.

Found it.

Sometimes you copy a shared object file from the current system's file system into the directory that you are fpm-ing. On CentOS, this shared object file is pre-linked, since it used to live on the file system, and prelink runs via cron every so often to keep its libraries freshly pre-linked for your use and pleasure. However, copying this SO file into the directory that you hope to turn into a package via FPM invalidates its prelinking cache, which is housed in the ELF headers of the file itself. In my case, this was the file libpython3.5m.so.1.0, graciously copied from the filesystem by cx_Freeze (a python packaging library/system).

If RPM finds a pre-linked shared object file to package, it tries to un-prelink the file before computing its digest. To do so, it uses the command found in the RPM macro __prelink_undo_cmd. That command is supposed to pipe the un-prelinked file (libpython3.5m.so.1.0) back to RPM so that RPM can compute its md5 digest for use in the CPIO archive. The macro is usually set to /usr/sbin/prelink prelink -y library, which in my case resolved to the command prelink -y libpython3.5m.so.1.0. This command returned an error (prelink: libpython3.5m.so.1.0: at least one of file's dependencies has changed since prelinking) and closed the pipe it had open between itself and RPM.

RPM, in turn, detected the error in rpmDoDigest, the function responsible for computing a digest of the file. When that function detected the error, it copied zeros into the buffer where it was to write the digest and returned a return code of 1. Inside the file build/files.c, the function which called rpmDoDigest then happily ignored this error code and went right on going, assuming the digest was computed correctly. This probably seemed okay to the programmer at the time, since he wrapped the call in a check to make sure the file was a regular file. "Nothing can go wrong now", he likely thought, "since the only reason this file would throw an error is if the file was a directory or something." NOTE TO ALL C PROGRAMMERS EVERYWHERE: ALWAYS CHECK YOUR ERROR CODES.

This explains the solutions people use.

Setting __prelink_undo_cmd to /bin/cat prelink library would have resolved in my case to /bin/cat libpython3.5m.so.1.0, which would have happily piped the contents of the library un-changed into RPM's digest computation function, resulting in a (hopefully) correct digest.

Another solution is un-prelinking the file first, like this:

cd fo/
prelink -u libpython3.5m.so.1.0
cd ..
fpm -s dir -C fo -n fo -v 4 -t rpm

Using this solution, the problem is avoided because RPM detects that the file isn't prelinked to begin with, so instead of "undoing" the prelink by piping it through prelink -y, it simply opens the file and uses the open file handle to compute the digest hash.

The last solution, that of uninstalling prelink, also works because prelink isn't installed. RPM figures this out, and since it's not installed RPM falls back to simply opening the file, rather than piping it through a non-existent command line program.

I found that the best solution for me is to un-prelink before anything happens to start with.
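
Un-prelinking a whole staging directory, rather than one named file, could look roughly like this. It is a sketch under assumptions: the function names `is_elf` and `unprelink_tree` are invented for illustration, file names are assumed not to contain newlines, and a failing `prelink -u` is taken to mean "this file was not prelinked", which is how the command is used as a check elsewhere in this thread.

```shell
# is_elf FILE: succeed if FILE starts with the 4-byte ELF magic
# (0x7f 'E' 'L' 'F').
is_elf() {
    [ "$(head -c 4 "$1" 2>/dev/null | od -An -tx1 | tr -d ' \n')" = "7f454c46" ]
}

# unprelink_tree DIR: run `prelink -u` on every ELF file below DIR.
# Does nothing when prelink is not installed. Failures of `prelink -u`
# are ignored on the assumption that they mark files that were never
# prelinked in the first place.
unprelink_tree() {
    command -v prelink >/dev/null 2>&1 || return 0
    find "$1" -type f | while IFS= read -r f; do
        if is_elf "$f"; then
            prelink -u "$f" >/dev/null 2>&1 || :
        fi
    done
}

# Example, using the directory from the commands above:
#   unprelink_tree fo && fpm -s dir -C fo -n fo -v 4 -t rpm
```

Swallowing every `prelink -u` error keeps the loop simple but also hides genuine failures, so a stricter version might log them instead.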

Contributor

djhaskin987 commented Oct 7, 2016 edited

The best solution, thinking about it more, is something fpm might be able to help with :)
It seems to be the case that this error is caused by prelinking caches which are out of date. The BEST thing to do is this at the end of the %install section in the SPEC file ERB template:

if which prelink >/dev/null 2>&1
then
<% for each library that fpm has detected has been previously prelinked %>
    prelink -u %{buildroot}/<%= location of prelinked file %>
<% end for each %>
fi

Syntax wrong, but you get the idea.

This seems to fix my problem. I have to effect it using fpm --edit, though :( it would be totally awesome if this got into FPM.

Owner

jordansissel commented Oct 7, 2016

I agree fpm can help. I'm trying to figure out when prelink is actually invoked on systems.

So far, I have only found where prelink is invoked by cron (daily). However, I see small hints that prelink is somehow invoked when rpms are installed (or upgraded?), but I'm not having much luck finding exactly what triggers that...

Owner

jordansissel commented Oct 7, 2016

I'll rephrase a bit, what I think I know:

  • rpm's verify code is built to be aware of prelink.
  • I presume this is so rpm -V works on systems where prelink is running daily (default on some/most centos 5/6's?)

What I don't know:

  • where/how prelink is being executed during rpmbuild.

Owner

jordansissel commented Oct 7, 2016

@djhaskin987 Does rpm -qa --triggers | grep prelink show anything for you? (On my centos 6.7 box I see nothing, but also I cannot reproduce the prelink-during-rpmbuild problem)

Owner

jordansissel commented Oct 7, 2016

@djhaskin987 in addition to checking triggers (rpm -qa --triggers), does rpmbuild --showrc show any hints that it might be doing prelink?

Owner

jordansissel commented Oct 7, 2016

Oooooh verified! I think my shared library I was testing with was somehow ignored by rpmbuild as something to prelink:

% fpm -fs dir -t rpm -n example /usr/lib64/libpython2.6.so.1.0=/opt/example.so
% sudo rpm -i example-1.0-1.x86_64.rpm
error: unpacking of archive failed on file /opt/example.so;57f662ef: cpio: Digest mismatch

Owner

jordansissel commented Oct 7, 2016

I'm still not sure what the bug here is, but I have a workaround.

If we add this macro, the rpm is happy:

%__prelink_undo_cmd     /bin/cat cat library

For example, if I put the above in ~/.rpmmacros, then run fpm:

% fpm -fs dir -t rpm -n example ./example.so=/opt/example.so
% sudo rpm -iv example-1.0-1.x86_64.rpm
Preparing packages for installation...
example-1.0-1

Success. ?

Feels like there's a bug somewhere in the digest computation in rpm itself.

Owner

jordansissel commented Oct 7, 2016

Short conclusion:

There is a bug in rpm that causes it to incorrectly calculate the digest during installation of a file that has already had prelink run on it.

Workarounds:

  1. Run prelink -u on all shared libraries and dynamic binaries before packaging with fpm (rpmbuild)
  2. or, a worse workaround: trick rpmbuild into using the file's digest, not the unprelinked digest, by putting %__prelink_undo_cmd /bin/cat cat library in ~/.rpmmacros
  3. or, maybe we hack fpm to, as proposed, unprelink everything before packaging it, since this rpm bug only affects files prelinked before packaging.
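
For the second workaround, the macro can be added to a macros file without duplicating it on repeated runs. This is a rough sketch: the function name `set_prelink_undo_cat` is hypothetical, and the macro line is the one quoted above. The thread itself calls this the worse workaround, so it is shown only to make the mechanics concrete.

```shell
# Hypothetical helper: append the cat-based __prelink_undo_cmd override to
# an rpm macros file, unless that file already defines the macro at the
# start of a line.
# $1: macros file to edit (defaults to ~/.rpmmacros)
set_prelink_undo_cat() {
    macros=${1:-"$HOME/.rpmmacros"}
    if grep -q '^%__prelink_undo_cmd' "$macros" 2>/dev/null; then
        return 0
    fi
    printf '%s\n' '%__prelink_undo_cmd     /bin/cat cat library' >> "$macros"
}

# Example: set_prelink_undo_cat   # edits ~/.rpmmacros for the build user
```

An existing definition is left untouched even if it points at prelink, so the file still needs checking by hand in that case.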

Here's my detailed conclusion:

  1. During rpmbuild, rpmDoDigest is called on each file needing digestion
  2. rpmDoDigest is prelink aware and calls prelink -y so it can compute the digest.
  3. Other places inside rpm call rpmDoDigest -- sweeet.
  4. ... except when unpacking the cpio inside the rpm (during installation). not sweet.

Breaking out my rusty gdb skills... I find that inside expandRegular is where this is failing.

% sudo gdb =rpm --args =rpm -iv example-1.0-1.x86_64.rpm
...
(gdb) break expandRegular
(gdb) break fdFiniDigest
(gdb) run
...
Breakpoint 2, fdFiniDigest (fd=0x20af6d0, hashalgo=PGPHASHALGO_MD5, datap=0x7ffc3f1e9c80, lenp=0x0, asAscii=0) at rpmio.c:2033
(gdb) step
<run 'step' many many times stepping through the code>
... 
(gdb) step
216      digestlen = HASH_ResultLenContext(ctx->hashctx);
...
(gdb) step
217      digest = xmalloc(digestlen);
(gdb)
221      HASH_End(ctx->hashctx, digest, (unsigned int *) &digestlen, digestlen);

## This looks promising, hash end, eh? Let's see what it is:
(gdb) print digestlen
$7 = 16

(gdb) x/16xb digest
0x20afe00:      0x50    0xa0    0xc2    0x66    0xba    0xb8    0x11    0x3d
0x20afe08:      0x20    0xa2    0x54    0xf9    0x2c    0x99    0x8d    0xb1

Taking the digest above into a single hex string:

  • 50a0c266bab8113d20a254f92c998db1

Ok, so what is the original file?

% md5sum example.so
50a0c266bab8113d20a254f92c998db1  example.so

INTERESTING.

Now, what is the "unprelinked" file md5?

% prelink -C /tmp/cache -y example.so| md5sum
2597429e87780b6342bbb13891a9a691  -

And the final piece, what md5 is listed in the rpm's header?

% rpm -qp example-1.0-1.x86_64.rpm --qf "%{FILEMD5S}\n"
2597429e87780b6342bbb13891a9a691
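
The three-way comparison above can be wrapped in a small helper when checking several packages. This is an illustrative sketch only: the function name `classify_header_digest` and its output labels are made up here, and the commands in the usage comment are the ones already used in this comment.

```shell
# Hypothetical helper: given the digest stored in the rpm header, the md5 of
# the file as it sits on disk, and the md5 of the `prelink -y` output, report
# which of the two the header agrees with.
classify_header_digest() {
    header=$1 on_disk=$2 undone=$3
    case $header in
        *[!0]*) ;;                      # contains a character other than 0
        *) echo "zeroed"; return ;;     # all zeros (or empty)
    esac
    if [ "$header" = "$on_disk" ]; then
        echo "matches-file"
    elif [ "$header" = "$undone" ]; then
        echo "matches-unprelinked"
    else
        echo "matches-neither"
    fi
}

# Example:
#   classify_header_digest \
#     "$(rpm -qp example-1.0-1.x86_64.rpm --qf '%{FILEMD5S}\n')" \
#     "$(md5sum example.so | cut -d' ' -f1)" \
#     "$(prelink -C /tmp/cache -y example.so | md5sum | cut -d' ' -f1)"
```

With the digests shown above, the header agrees with the un-prelinked digest while the unpacking code hashes the file as packaged, which is the mismatch this comment is tracking down.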

Owner

jordansissel commented Oct 7, 2016

I tested the same failing rpm above (built on centos 6 and included example.so as prelinked) on my Fedora 23 workstation. It appears the file unpacking is still not aware of prelink:

% cat /etc/fedora-release
Fedora release 23 (Twenty Three)

% sudo rpm -i example-1.0-1.x86_64.rpm
error: unpacking of archive failed on file /opt/example.so;57f72722: cpio: Digest mismatch
error: example-1.0-1.x86_64: install failed

Owner

jordansissel commented Oct 7, 2016 edited

Rebuilding the rpm with fpm on Fedora 23 works around it though, in case this helps:

% cat /etc/fedora-release
Fedora release 23 (Twenty Three)

% sudo rpm -i example-1.0-1.x86_64.rpm
error: unpacking of archive failed on file /opt/example.so;57f7289f: cpio: Digest mismatch
error: example-1.0-1.x86_64: install failed

% fpm -s rpm -t rpm --iteration 2 example-1.0-1.x86_64.rpm
Created package {:path=>"example-1.0-2.x86_64.rpm"}

# Installing works.
% sudo rpm -iv example-1.0-2.x86_64.rpm
Preparing packages...
example-1.0-2.x86_64

And copying the rebuilt rpm to my centos 6 box? Installs... but rpm -V fails :(

# Copying to my centos box.
% scp example-1.0-2.x86_64.rpm 192.168.1.200:
example-1.0-2.x86_64.rpm    100%  634KB 633.6KB/s   00:00

% cat /etc/centos-release
CentOS release 6.7 (Final)
% sudo rpm -iv example-1.0-2.x86_64.rpm
Preparing packages for installation...
example-1.0-2

% rpm -V example
S.5......    /opt/example.so

Contributor

djhaskin987 commented Oct 7, 2016

Thanks for all your work @jordansissel ! That really helps me on my end.

I wanted to mention something about how I fixed my problem.

I tried to un-prelink (prelink -u) all the files before I ran FPM. I thought this would be the best way to do things. I was able to verify that the directory I pointed FPM at had un-prelinked files, but for some reason, by the time rpmbuild's %install section is running, they're prelinked again :(

I hope I can use the research you've done to figure out why it's prelinked after I unprelinked it. I like that better than patching FPM for this.

I can also report that this isn't a problem on CentOS 7. I didn't need to do this wicked hack to get things running there; things just worked. Maybe this is only a problem on rpm 4.8 or less. Certainly I've only seen stuff online pertaining to EL 5 and EL 6, not EL 7. :)

Thanks again for digging deep!

Contributor

djhaskin987 commented Oct 7, 2016 edited

Another update.

I WAS wrong when I did it. When I actually un-prelink every SO file which was previously pre-linked in the directory at which I point FPM, (instead of not that directory :P) FPM and rpmbuild work fine. No need to use --edit in my case :) total hack avoided.

If FPM were to be improved, it might try to un-prelink SO files in the staging directory in the output section in rpm.rb.

shahzebsiddiqui referenced this issue in easybuilders/easybuild-easyconfigs Apr 4, 2017

Open

Intel Advisor RPM issue #4443
