Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Section re-ordering confuses strip #10

Closed
staticfloat opened this issue Jan 17, 2014 · 31 comments
Closed

Section re-ordering confuses strip #10

staticfloat opened this issue Jan 17, 2014 · 31 comments

Comments

@staticfloat
Copy link

Running patchelf to enlarge my RUNPATH entry causes my section headers to change from this:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400238  00000238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             0000000000400254  00000254
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             0000000000400274  00000274
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000400298  00000298
       0000000000000064  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           0000000000400300  00000300
       00000000000011b8  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           00000000004014b8  000014b8
       0000000000000868  0000000000000000   A       0     0     1
...

To this:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .dynsym           DYNSYM           00000000003ffb00  001ffb00
       00000000000011b8  0000000000000018   A       2     1     8
  [ 2] .dynstr           STRTAB           00000000003ff270  00000270
       0000000000000890  0000000000000000   A       0     0     8
  [ 3] .gnu.hash         GNU_HASH         0000000000400cb8  00001cb8
       0000000000000064  0000000000000000   A       1     0     8
  [ 4] .interp           PROGBITS         0000000000400d20  00001d20
       000000000000001c  0000000000000000   A       0     0     8
  [ 5] .note.ABI-tag     NOTE             0000000000400d40  00001d40
       0000000000000020  0000000000000000   A       0     0     8
  [ 6] .note.gnu.build-i NOTE             0000000000400d60  00001d60
       0000000000000024  0000000000000000   A       0     0     8
...

This confuses tools such as strip because they use a header offset of 0 as an error code; here's the relevant function assign_file_positions_for_non_load_sections in elf.c from binutils. Running e.g. gdb --args strip -g my_binary, and setting a breakpoint at that position shows that we get a valid section, but since its filepos is equal to zero, binutils ignores it and continues on, getting confused. Note that even readelf gets a little confused by this, as it doesn't even list the .dynsym segment:

$ readelf -l my_binary
...
 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   03     .init_array .fini_array .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 
   04     .dynamic 
   05     .note.ABI-tag .note.gnu.build-id 
   06     .eh_frame_hdr 
   07     
   08     .init_array .fini_array .jcr .data.rel.ro .dynamic .got 
$ readelf my_patched_binary
...
 Section to Segment mapping:
  Segment Sections...
   00     
   01     
   02     .dynstr 
   03     .gnu.hash .interp .note.ABI-tag .note.gnu.build-id .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   04     
   05     .interp 
   06     .eh_frame_hdr 
   07     .init_array .fini_array .jcr .data.rel.ro .dynamic .got .got.plt .data .bss 
   08     .init_array .fini_array .jcr .data.rel.ro .dynamic .got 
   09     .dynamic 

I theorize that if we keep the .interp section at the beginning of the file, we might be able to still use tools such as strip, but I don't actually know. Is there an easy way to permute the order of headers that patchelf outputs? If you can just point me in the direction of how to do this, I can submit a PR.

@chenz
Copy link

chenz commented Jul 16, 2014

ldconfig gets confused as well, but only in combination with strip:

lib.so -> strip -> patchelf -> ldconfig

/sbin/ldconfig.real: file ./libstdc++.so.6.0.19 is truncated

lib.so -> patchelf -> strip

BFD: stas0up1: Not enough room for program headers, try linking with -N
strip:stas0up1[.hash]: Bad value

lib.so -> patchelf -> ldconfig

OK

@sjackman
Copy link

I'm seeing this error as well, which makes it impossible to run strip after patchelf.

@davidedelvento
Copy link

FWIW this problem affects me too, via https://github.com/schmir/bbfreeze

@mitchblank
Copy link

Yeah, I'm seeing something similar. On CentOS 7 strip (2.23.52.0.1-30.el7_1.2) I don't get that exact error, but patchelf 0.9 followed by strip does produce a broken binary.

In the post-patchelf'ed binary, DT_STRTAB seems to end up in the first page of the executable, which includes references to things like needed libraries.

$ dd bs=4k count=1 if=after-patchelf 2>/dev/null | strings | head
__libc_csu_fini
_start
__tls_get_addr
GLIBC_2.3
ld-linux-x86-64.so.2
[...]

but after running strip on it , they're no longer in that position of the file:

$ dd bs=4k count=1 if=after-patchelf-and-strip 2>/dev/null | strings | wc -l
0

However, when I've been walking through the dynamic linker code in gdb, it seems that in _dl_check_map_versions the strtab points into this page and is all \0 bytes, breaking its ability to find the libcs:

(gdb) x/40c 0x3ff580
0x3ff580:   0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'
0x3ff588:   0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'
0x3ff590:   0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'
0x3ff598:   0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'
0x3ff5a0:   0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'    0 '\000'

I'm still digging a bit, hoping to find at least a workaround. Unfortunately I need to produce unstripped binaries to hand to our release engineering process where do the final strip before shipping, so right now trying to use patchelf to fix the rpath inside is resulting in binaries that can't be stripped.

@mitchblank
Copy link

OK, I dug into this more. The problem doesn't seem specific to the re-ordering, but that the PT_LOAD program sections that patchelf set up are...dubious. Here is some of the information from readelf -Wl on my original executable (with a little manual reformatting for clarity. I'm only including the LOAD segments since that's what controls what the kernel will actually map into our image.

  Type  Offset    VirtAddr           PhysAddr           FileSiz   MemSiz    Flg Align
  LOAD  0x0000000 0x0000000000400000 0x0000000000400000 0x1c3faa0 0x1c3faa0 R E 0x1000
  LOAD  0x1c3faa0 0x0000000002040aa0 0x0000000002040aa0 0x000bfc8 0x005b170 RW  0x1000

   02     .interp .note.ABI-tag .note.gnu.build-id .dynsym .dynstr .gnu.hash .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .gcc_except_table .stapsdt.base .eh_frame .eh_frame_hdr
   03     .dynamic .got .got.plt .data .jcr .tm_clone_table .fini_array .init_array .data.rel.ro.local .data.rel.ro .bss

The .dynstr is one of the many segments in the first LOAD section... so far so unremarkable.

Now after running patchelf 0.9 things get odd:

  Type  Offset    VirtAddr           PhysAddr           FileSiz   MemSiz    Flg Align
  LOAD  0x0000000 0x00000000003ff000 0x00000000003ff000 0x0001000 0x0001000 RW  0x1000
  LOAD  0x0001000 0x0000000000400000 0x0000000000400000 0x1c3faa0 0x1c3faa0 R E 0x1000
  LOAD  0x1c40aa0 0x0000000002040aa0 0x0000000002040aa0 0x000bfc8 0x005b170 RW  0x1000

   02     .dynamic
   04     .dynsym .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .gcc_except_table .stapsdt.base .eh_frame .eh_frame_hdr
   08     .got .got.plt .data .jcr .tm_clone_table .fini_array .init_array .data.rel.ro.local .data.rel.ro .bss

so now there are three LOAD segments. The first rw- one covers just one page (just under address 0x400000) and then is immediately followed by a larger r-x one. But if you look at the dynstr section...

  [Nr] Name                   Type       Address            Off      Size   ES Flg Lk Inf Al
  [ 2] .dynstr                STRTAB     00000000003ff580   000580   2fc53d 00   A  0   0  8

...it starts in the first PT_LOAD section and continues onto the second one! That definitely looks wrong: having a segment cross section boundaries can't really be safe can it?

Anyway, then when we run strip on this binary it unsurprisingly doesn't survive. You still get the three LOAD sections:

  Type  Offset    VirtAddr           PhysAddr           FileSiz   MemSiz    Flg Align
  LOAD  0x0000000 0x00000000003ff000 0x00000000003ff000 0x0000580 0x0000580 RW  0x1000
  LOAD  0x0001000 0x0000000000400000 0x0000000000400000 0x1c3faa0 0x1c3faa0 R E 0x1000
  LOAD  0x1c40aa0 0x0000000002040aa0 0x0000000002040aa0 0x000bfc8 0x005b170 RW  0x1000

...notice that the FileSiz/MemSiz has now shrunk and now doesn't cover the whole page. strip has moved the dynstr and now it's no longer covered in the LOAD sections.

  [Nr] Name                   Type       Address          Off     Size    ES Flg Lk Inf Al
  [ 3] .dynstr                STRTAB     00000000003ff580 1dff580  2fc53d 00   A  0   0  8

...and thus ld-linux.so can no longer find our dependencies. The will appear to be there if you look at the file with objdump since they're present in the file, but they're no longer in a section of the file that gets memory mapped in at execve() time.

@Shonallein
Copy link

I'm having a hard time with this issue as well. @mitchblank Did you find a work around the issue since you narrowed it down ? Or do you just don't strip your executable after patchelf ?

@mitchblank
Copy link

No, I didn't find a workaround. For now I'm pursuing other options.

@ogrisel
Copy link

ogrisel commented Jan 7, 2017

Any idea how to fix this issue? The problem is also affecting binary wheel packages for Python using the manylinux1 specification (patchelf is internally used by the auditwheel tool to graft third party dynamic libs into those Python packages when necessary).

@staticfloat
Copy link
Author

staticfloat commented Jan 9, 2017

I made some progress on this tonight; here are the results of my investigations:

  • strip on libraries was, (in the test case I added here) failing on this line in bfd/elf.c inside of the function assign_file_positions_for_load_sections(). Essentially, the destination virtual memory address of the section being loaded in was lower than the current file offset, which means it's overlapping the program header table, which strip() didn't like. So, I added in a fix to relocate sections that were overlapping to the end of the library, and it seems to have worked! You can try out my branch by checking out the sf/pr_10 branch on my fork, building the patchelf executable and using it in your workflows. If you have a library that continues to give the Not enough room for program headers, try linking with -N, post it here, and I'll take a look while this code is still in my brain's cache.

  • I'm having trouble reproducing the original error I reported, (namely, errors when running strip on executables (not libraries) that have been modified by patchelf) if anyone has an executable that causes an error with the latest versions of patchelf and strip (which is a part of binutils) I'd like to be brought to its attention so that I can add in a test case for it to patchelf and hopefully start chipping away at that one too!

@ogrisel
Copy link

ogrisel commented Jan 9, 2017

Thanks @staticfloat, I will try to test your branch in the manylinux1 / Python workflow sometimes this week (if nobody does it first).

@sjackman
Copy link

sjackman commented Jan 9, 2017

Thanks, Elliot! That's great news.

I see this error message Not enough room for program headers, try linking with -N when trying to build GHC (GNU Haskell Compiler) with Linuxbrew using a recent binary snapshot of GHC to compile a recent source distribution.
See https://github.com/Linuxbrew/homebrew-core/pull/1378
I'm not sure if it's caused by a library or an executable without looking into it further. I can give more information if you'd like to try to reproduce this error yourself.

@ogrisel
Copy link

ogrisel commented Jan 10, 2017

I tried to build a manylinux1 wheel (fixed by your version of patchelf) for the lxml project (that uses external libraries lxml2 and libxslt).

Here is the kind of errors I got prior to using your fix when trying to strip the resulting .so files:

$ find envs/teststrip/ -name "*.so*" | xargs strip
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/stQQUQun: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/stQQUQun[.note.gnu.build-id]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/stQQUQun: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/stQQUQun: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/stutiC3s: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/stutiC3s[.note.gnu.build-id]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/stutiC3s: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/stutiC3s: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stoA4ECy: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stoA4ECy[.note.gnu.build-id]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stoA4ECy: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stoA4ECy: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stuEcLbE: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stuEcLbE[.note.gnu.build-id]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stuEcLbE: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stuEcLbE: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stB5xbLJ: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stB5xbLJ[.gnu.hash]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stB5xbLJ: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stB5xbLJ: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stA69EkP: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stA69EkP[.gnu.hash]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stA69EkP: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stA69EkP: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stzlGcUU: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stzlGcUU[.gnu.hash]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stzlGcUU: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stzlGcUU: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stlI8Nt0: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stlI8Nt0[.note.gnu.build-id]: Bad value
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stlI8Nt0: Not enough room for program headers, try linking with -N
strip:envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stlI8Nt0: Bad value
(teststrip) ogrisel@is146148:~$ du -hs envs/teststrip/
46M	envs/teststrip/

with your fix I get the following output:

(teststrip) ogrisel@is146148:~$ find envs/teststrip/ -name "*.so*" | xargs strip
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/stwFheJb: section .dynamic lma 0x277268 adjusted to 0x2e8228
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/stwFheJb: section .dynstr lma 0x2774a8 adjusted to 0x2e8468
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/st8HVQ4R: section .dynamic lma 0x42a268 adjusted to 0x640228
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/st8HVQ4R: section .dynstr lma 0x42a4a8 adjusted to 0x640468
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stxmfwsy: section .dynstr lma 0x21c2a0 adjusted to 0x234260
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stysfiQe: section .dynstr lma 0x3711c0 adjusted to 0x4e1180
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stzaYReV: section .dynstr lma 0x274188 adjusted to 0x2e8148
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stybfEDB: section .dynstr lma 0x214150 adjusted to 0x228110
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stybfEDB: section .gnu.hash lma 0x2145e8 adjusted to 0x2285a3
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stybfEDB: section `.gnu.hash' can't be allocated in segment 5
LOAD: .dynstr .gnu.hash
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stjXKv2h: section .dynstr lma 0x203150 adjusted to 0x206110
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stjXKv2h: section .gnu.hash lma 0x2032c0 adjusted to 0x20627d
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stjXKv2h: section `.gnu.hash' can't be allocated in segment 5
LOAD: .dynstr .gnu.hash
BFD: envs/teststrip/lib/python3.6/site-packages/lxml/.libs/stkLHqrY: section .dynstr lma 0x2451f8 adjusted to 0x2861b8
(teststrip) ogrisel@is146148:~$ du -hs envs/teststrip/
30M	envs/teststrip/

the section .gnu.hash can't be allocated in segment 5 is suspicious but at least we no longer get the originally reported Not enough room for program headers error message and the resulting stripped files are indeed smaller.

I can put the resulting binaries (after patchelf but before strip) online if that helps.

@staticfloat
Copy link
Author

If I could get the original .so files (before patchelf, and then the patchelf command that you're running on them so I can see how the binary is effected by patchelf) that would be great.

@staticfloat
Copy link
Author

Oh, also, if you're using a docker image to do your development inside of, I'd like to use that as well, as there can be a lot of different pieces in this puzzle (version of binutils, etc..) and I'd like to eliminate as many variables as possible.

@ogrisel
Copy link

ogrisel commented Jan 11, 2017

The docker image is quay.io/pypa/manylinux1_x86_64. A previous version of patchelf (without your fix) has been installed in /usr/local/bin. Here are the build scripts and Dockerfile for that image:

https://github.com/pypa/manylinux

Today I don't have access on the computer I used yesterday. Let me generate the binaries again.

@ogrisel
Copy link

ogrisel commented Jan 11, 2017

Here are the precise patchelf commands I executed:

Repairing lxml-3.7.2-cp36-cp36m-linux_x86_64.whl
Grafting: /usr/local/lib/libexslt.so.0.8.17 -> lxml/.libs/libexslt-ffbaa1db.so.0.8.17
Setting RPATH: lxml/.libs/libexslt-ffbaa1db.so.0.8.17 to "$ORIGIN/."
patchelf --replace-needed libexslt.so.0 libexslt-ffbaa1db.so.0.8.17 lxml/etree.cpython-36m-x86_64-linux-gnu.so
Grafting: /usr/local/lib/libxml2.so.2.9.3 -> lxml/.libs/libxml2-8d315a96.so.2.9.3
patchelf --replace-needed libxml2.so.2 libxml2-8d315a96.so.2.9.3 lxml/etree.cpython-36m-x86_64-linux-gnu.so
Grafting: /usr/local/lib/libxslt.so.1.1.29 -> lxml/.libs/libxslt-655d9087.so.1.1.29
Setting RPATH: lxml/.libs/libxslt-655d9087.so.1.1.29 to "$ORIGIN/."
patchelf --replace-needed libxslt.so.1 libxslt-655d9087.so.1.1.29 lxml/etree.cpython-36m-x86_64-linux-gnu.so
Grafting: /lib64/libz.so.1.2.3 -> lxml/.libs/libz-a147dcb0.so.1.2.3
patchelf --replace-needed libz.so.1 libz-a147dcb0.so.1.2.3 lxml/etree.cpython-36m-x86_64-linux-gnu.so
Setting RPATH: lxml/etree.cpython-36m-x86_64-linux-gnu.so to "$ORIGIN/.libs"
patchelf --replace-needed libexslt.so.0 libexslt-ffbaa1db.so.0.8.17 lxml/objectify.cpython-36m-x86_64-linux-gnu.so
patchelf --replace-needed libxml2.so.2 libxml2-8d315a96.so.2.9.3 lxml/objectify.cpython-36m-x86_64-linux-gnu.so
patchelf --replace-needed libxslt.so.1 libxslt-655d9087.so.1.1.29 lxml/objectify.cpython-36m-x86_64-linux-gnu.so
patchelf --replace-needed libz.so.1 libz-a147dcb0.so.1.2.3 lxml/objectify.cpython-36m-x86_64-linux-gnu.so
Setting RPATH: lxml/objectify.cpython-36m-x86_64-linux-gnu.so to "$ORIGIN/.libs"
patchelf --replace-needed libxslt.so.1 libxslt-655d9087.so.1.1.29 lxml/.libs/libexslt-ffbaa1db.so.0.8.17
patchelf --replace-needed libxml2.so.2 libxml2-8d315a96.so.2.9.3 lxml/.libs/libexslt-ffbaa1db.so.0.8.17
patchelf --replace-needed libxml2.so.2 libxml2-8d315a96.so.2.9.3 lxml/.libs/libxslt-655d9087.so.1.1.29

The original binaries can be found in this .whl file (you can just unzip it to get all the .so files in it:

https://dl.dropboxusercontent.com/u/5743203/tmp/lxml-3.7.2-cp36-cp36m-linux_x86_64.whl

Here are the original system libs (from /usr/local/lib) that are being grafted into the package using patchelf:

https://dl.dropboxusercontent.com/u/5743203/tmp/libexslt.so.0
https://dl.dropboxusercontent.com/u/5743203/tmp/libxml2.so.2
https://dl.dropboxusercontent.com/u/5743203/tmp/libxslt.so.1

Here is the resulting package (with your fix):

https://dl.dropboxusercontent.com/u/5743203/tmp/after-patchelf-7f6fcb43-lxml-3.7.2-cp36-cp36m-manylinux1_x86_64.whl

and here is the copy of the same results without your fix in patchelf:

https://dl.dropboxusercontent.com/u/5743203/tmp/before-patchelf-7f6fcb43-lxml-3.7.2-cp36-cp36m-manylinux1_x86_64.whl

staticfloat added a commit to staticfloat/manylinux_patchelf_debugging that referenced this issue Jan 12, 2017
@staticfloat
Copy link
Author

staticfloat commented Jan 12, 2017

Thanks @ogrisel, that was really helpful!

When patchelf relocates the program header table to the end of the file, everything gets wonked out. (That is the technical term) It appears that libbfd has very strong opinions about the allowable segment offsets in a segment that contains the program headers. That is to say, it enforces that they must be at the beginning of the file. This is very strange to me, but I'm willing to roll with it. I cannot say I understand libbfd well enough to determine whether this is a bug, but I can say that simply telling patchelf to not put the program header table at the end of the file and to keep it where it is seems to have worked out when put in conjunction with my earlier changes. That is to say, the test suite passes, and strip doesn't crap out when processing libexslt.so as provided by @ogrisel.

I'm testing this out using this docker repo, the scripts in there will download everything you need to play around with this stuff, in case for some reason you want to. I don't have an easy way to test that the libraries that strip is outputting actually work, so be sure to test these really thoroughly! The test suite of patchelf is pretty light.

@ogrisel
Copy link

ogrisel commented Jan 16, 2017

I cannot say I understand libbfd well enough to determine whether this is a bug

I think it's worth asking the libbfd maintainers.

@ogrisel
Copy link

ogrisel commented Jan 16, 2017

I don't have an easy way to test that the libraries that strip is outputting actually work, so be sure to test these really thoroughly! The test suite of patchelf is pretty light.

@matthew-brett do you think you could adapt your manylinux1 testing infrastructure to ensure that this patchelf fix does not introduce a regression on all the manylinux1 packages you tested in the past?

@matthew-brett
Copy link

Olivier - just to clarify - you want me to try building and testing a few wheels with the patchelf fix you refer to here : pypa/manylinux#78 (comment) ?

@ogrisel
Copy link

ogrisel commented Jan 17, 2017

Yes I was thinking about leveraging https://github.com/matthew-brett/manylinux-builds to check that the branch https://github.com/staticfloat/patchelf/tree/sf/pr_10 does not cause a regression on a variety of packages at once. However know I realize that the manylinux-builds does not run the tests for those projects (contrary to the individual multibuild-based CI configs of those projects).

I guess some manual testing is required then.

@matthew-brett
Copy link

Hmm - I guess we could modify the upload container for the manylinux-builds .travis.yml, and then run manylinux-testing, reading from that new container...

@darealshinji
Copy link
Contributor

Is this issue fixed by #117 ?

@JonathonReinhart
Copy link

As @darealshinji mentioned, I believe this is fixed by #117. Would a maintainer mind closing it if this is indeed true?

Also, as we've seen, there are a lot of tools depending on this new, correct behavior. Are there plans to cut a new release of patchelf anytime soon? Thanks.

@sjackman
Copy link

Is a new release of patchelf upcoming?

@lukateras
Copy link
Member

This is not fixed. With the latest version, even with #127 applied, when I create a binary with rpath larger than it was before (e.g. via patchelf --set-rpath $(patchelf --print-rpath $f):foo $f) and then strip it, the binary gets corrupted.

lukateras added a commit to lukateras/nixpkgs that referenced this issue Nov 9, 2017
This is based on Jan Tojnar's work in NixOS#31228.

When patchelf has to grow rpath beyond original capacity, it sets
dontStrip, to work around NixOS/patchelf#10.
domenkozar pushed a commit to domenkozar/patchelf that referenced this issue Jan 5, 2018
startPage is adjusted unconditionally for all executables.
This results in incorrect addresses assigned to INTERP and LOAD
program headers, which breaks patched executable.

Adjusting startPage variable only when startOffset > startPage
should fix this.

This change is related to the issue NixOS#10

Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com>
@sjackman
Copy link

Cool work around:

When patchelf has to grow rpath beyond original capacity, it sets dontStrip, to work around #10.

See lukateras/nixpkgs@aebc2b8

@sjackman
Copy link

@staticfloat Has this issue been resolved? Is there a related PR that fixes this issue?

@staticfloat
Copy link
Author

It looks to me like #117 fixed this.

@sjackman
Copy link

Excellent! Are there plans for a new stable release of patchelf that includes this fix?

@staticfloat
Copy link
Author

It doesn't look like there's been a release for quite a while; we use patchelf as part of the fundamental infrastructure of our BinaryBuilder.jl system to build binary dependencies for the Julia language project. I just recently built redistributable binaries of the latest patchelf for all the major operating systems, hosted here if that's helpful to anyone looking for an easy way to get at them. (Just download and extract the tarball)

cmatsuoka pushed a commit to cmatsuoka/patchelf that referenced this issue Mar 28, 2019
The current approach to changing sections in ET_DYN executables is to move
the INTERP section to the end of the file. +This means changing PT_PHDR to
add an extra PT_LOAD section so that the new section is mmaped into memory
by the elf loader in the kernel. In order to extend PHDR, this means moving
it to the end of the file.

Its documented in BUGS there is a kernel 'bug' which means that if you have holes
in memory between the base load address and the PT_LOAD segment that contains PHDR,
it will pass an incorrect PHDR address to ld.so and fail to load the binary, segfaulting.

To avoid this, the code currently inserts space into the binary to ensure that when
loaded into memory there are no holes between the PT_LOAD sections. This inflates the
binaries by many MBs in some cases. Whilst we could make them sparse, there is a second
issue which is that strip can fail to process these binaries:

$ strip fixincl
Not enough room for program headers, try linking with -N
[.note.ABI-tag]: Bad value

This turns out to be due to libbfd not liking the relocated PHDR section either
(NixOS#10).

Instead this patch implements a different approach, leaving PHDR where it is but extending
it in place to allow addition of a new PT_LOAD section. This overwrites sections in the
binary but those get moved to the end of the file in the new PT_LOAD section.

This is based on patches linked from the above github issue, however whilst the idea
was good, the implementation wasn't correct and they've been rewritten here.

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit c4deb5e)
Signed-off-by: Claudio Matsuoka <claudio.matsuoka@canonical.com>
sergiusens pushed a commit to sergiusens/patchelf that referenced this issue Apr 10, 2019
startPage is adjusted unconditionally for all executables.
This results in incorrect addresses assigned to INTERP and LOAD
program headers, which breaks patched executable.

Adjusting startPage variable only when startOffset > startPage
should fix this.

This change is related to the issue NixOS#10

Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com>
cmatsuoka pushed a commit to cmatsuoka/patchelf that referenced this issue Apr 10, 2019
startPage is adjusted unconditionally for all executables.
This results in incorrect addresses assigned to INTERP and LOAD
program headers, which breaks patched executable.

Adjusting startPage variable only when startOffset > startPage
should fix this.

This change is related to the issue NixOS#10

Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com>
cmatsuoka pushed a commit to cmatsuoka/patchelf that referenced this issue Apr 11, 2019
startPage is adjusted unconditionally for all executables.
This results in incorrect addresses assigned to INTERP and LOAD
program headers, which breaks patched executable.

Adjusting startPage variable only when startOffset > startPage
should fix this.

This change is related to the issue NixOS#10

Signed-off-by: Ed Bartosh <ed.bartosh@linux.intel.com>
(cherry picked from commit 1cc234f)
Signed-off-by: Claudio Matsuoka <claudio.matsuoka@canonical.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests