Library corruption when setting RPATH #466

Closed
pablogsal opened this issue Sep 19, 2020 · 11 comments

@pablogsal

Describe the bug
Some shared libraries are corrupted when LIEF sets the RPATH to a new value.

To Reproduce

Consider this shared library:

https://www.dropbox.com/s/n47wla3tgw4ny56/libfreebl3.so?dl=0

If you create a very simple executable that links against it, everything works:

echo "int main(){return 0;}" > lel.c
gcc lel.c -L . -lfreebl3
./a.out

but now, changing the RPATH makes the program segfault:

import lief
elf = lief.parse("./libfreebl3_old.so")
elf += lief.ELF.DynamicEntryRunPath("$ORIGIN")
elf.write("libfreebl3.so")

Running the program now segfaults:

❯ ./a.out
[1]    19792 segmentation fault (core dumped)  ./a.out

Apparently, the segfault happens when the dynamic linker calls the library's finalizer (fini):

LD_DEBUG=all ./a.out
...
     37836:     symbol=_dl_signal_exception;  lookup in file=/usr/lib/libc.so.6 [0]
     37836:     binding file /lib64/ld-linux-x86-64.so.2 [0] to /usr/lib/libc.so.6 [0]: normal symbol `_dl_signal_exception' [GLIBC_PRIVATE]
     37836:     symbol=_dl_signal_error;  lookup in file=./a.out [0]
     37836:     symbol=_dl_signal_error;  lookup in file=libfreebl3.so [0]
     37836:     symbol=_dl_signal_error;  lookup in file=/usr/lib/libc.so.6 [0]
     37836:     binding file /lib64/ld-linux-x86-64.so.2 [0] to /usr/lib/libc.so.6 [0]: normal symbol `_dl_signal_error' [GLIBC_PRIVATE]
     37836:     symbol=_dl_catch_error;  lookup in file=./a.out [0]
     37836:     symbol=_dl_catch_error;  lookup in file=libfreebl3.so [0]
     37836:     symbol=_dl_catch_error;  lookup in file=/usr/lib/libc.so.6 [0]
     37836:     binding file /lib64/ld-linux-x86-64.so.2 [0] to /usr/lib/libc.so.6 [0]: normal symbol `_dl_catch_error' [GLIBC_PRIVATE]
     37836:
     37836:     calling init: /lib64/ld-linux-x86-64.so.2
     37836:
     37836:
     37836:     calling init: /usr/lib/libc.so.6
     37836:
     37836:
     37836:     calling init: /usr/lib/libdl.so.2
     37836:
     37836:
     37836:     calling init: libfreebl3.so
     37836:
     37836:
     37836:     initialize program: ./a.out
     37836:
     37836:
     37836:     transferring control: ./a.out
     37836:
     37836:
     37836:     calling fini: ./a.out [0]
     37836:
     37836:
     37836:     calling fini: libfreebl3.so [0]
     37836:
[1]    37836 segmentation fault (core dumped)  LD_DEBUG=all ./a.out
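To pin down where a trace like the one above stops, a small helper can pull out the last "calling init"/"calling fini" line. This is only a sketch: the function name `last_init_fini` is made up for illustration, and it assumes the LD_DEBUG output (which the dynamic linker writes to stderr by default) has been captured into a string.

```python
import re

# Matches lines such as "     37836:     calling fini: libfreebl3.so [0]".
_CALL_RE = re.compile(r"^\s*\d+:\s+calling (init|fini): (\S+)", re.MULTILINE)


def last_init_fini(log_text):
    """Return (phase, object) for the last 'calling init/fini' line, or None."""
    matches = _CALL_RE.findall(log_text)
    return matches[-1] if matches else None
```

For the log in this report, the last such line is the fini call for libfreebl3.so, which is consistent with the crash happening while that library's finalizers run.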

Expected behavior
The library is not corrupted and running the previous example works correctly without segfaulting.

Environment (please complete the following information):

@romainthomas
Member

I can confirm this issue; it seems that the crash happens in an ELF destructor of libfreebl3.so. Strangely, when forcing the resolution of imported functions with LD_BIND_NOW=1, it does not crash.
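A quick way to compare the two behaviours side by side is to run the same binary with and without LD_BIND_NOW=1 and report how it exited. The sketch below uses only the standard library; `run_with_env` is a hypothetical helper name, and the path to the test program is whatever you built (./a.out in the report).

```python
import os
import signal
import subprocess


def run_with_env(cmd, extra_env):
    """Run cmd with extra environment variables and describe how it exited."""
    env = dict(os.environ, **extra_env)
    proc = subprocess.run(
        cmd, env=env, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exit code %d" % proc.returncode


# Example (not executed here):
#   run_with_env(["./a.out"], {})                    # lazy binding
#   run_with_env(["./a.out"], {"LD_BIND_NOW": "1"})  # eager binding
```

If the observation above holds, the first call would be expected to report a SIGSEGV and the second a normal exit.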

@pablogsal
Author

pablogsal commented Sep 23, 2020

I guess the problem is that the address of that deallocator is not correct after LIEF writes the new ELF, because the program counter there is not reflected in the DWARF information.

The address is also absent from the .eh_frame section.

@romainthomas
Member

Well, actually it seems that there are two pointers at the beginning of the .got.plt section that are not valid after the write operation.
Do you have more details about how libfreebl3.so was compiled?

@pablogsal
Author

pablogsal commented Sep 23, 2020

Do you have more details about how libfreebl3.so was compiled?

Unfortunately no, but I can tell you that it comes from the system libraries of a RHEL 6.8 system. I noticed some particularities; one of the clear ones is that the offset and the address in the section headers are not equal (which is totally valid but uncommon).
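The offset/address difference can be checked directly from the section header table. The following is a minimal sketch that assumes a little-endian ELF64 file and only looks at allocated sections (non-allocated ones have no load address); the function name is invented for illustration, and the field layout follows the standard Elf64_Ehdr and Elf64_Shdr structures.

```python
import struct

SHF_ALLOC = 0x2  # section occupies memory at run time


def sections_with_addr_offset_mismatch(data):
    """Return (name, sh_addr, sh_offset) for allocated sections where the two differ."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)

    # Elf64_Shdr: name, type, flags, addr, offset, size, link, info, addralign, entsize
    headers = [
        struct.unpack_from("<IIQQQQIIQQ", data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    strtab_offset = headers[e_shstrndx][4]

    def name_at(index):
        start = strtab_offset + index
        end = data.index(b"\0", start)
        return data[start:end].decode("ascii", "replace")

    result = []
    for sh_name, _sh_type, sh_flags, sh_addr, sh_offset, *_rest in headers:
        if sh_flags & SHF_ALLOC and sh_addr != sh_offset:
            result.append((name_at(sh_name), sh_addr, sh_offset))
    return result
```

Calling it on the bytes of libfreebl3.so (for example, read with `open(path, "rb").read()`) would list the sections this comment refers to.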

@romainthomas
Member

Yes, I also noticed that it uses _GLOBAL_OFFSET_TABLE_[1], which is also uncommon and is inconsistent in the binary that crashes.

@romainthomas
Member

I think I found the fix! I still need to check some things but it should be pushed in the next couple of days.

@pablogsal
Author

I think I found the fix! I still need to check some things but it should be pushed in the next couple of days.

Fantastic! I am curious, where did the problem reside in LIEF?

@romainthomas
Member

From what I understood, the beginning of the GOT is structured as follows,
according to the x86-64 ABI:

  1. When first creating the memory image of the program, the dynamic linker
    sets the second and the third entries in the global offset table to special
    values. Steps below explain more about these values.
    ...
  2. After pushing the relocation index, the program then jumps to .PLT0, the
    first entry in the procedure linkage table. The pushq instruction places the
    value of the second global offset table entry (GOT+8) on the stack, thus
    giving the dynamic linker one word of identifying information. The program
    then jumps to the address in the third global offset table entry (GOT+16),
    which transfers control to the dynamic linker.

Which is implemented in the linker as follows:

// dl-machine.h from glibc 2.32
static inline int
elf_machine_runtime_setup (struct link_map *l, int lazy)
{
  extern void _dl_runtime_resolve (Elf32_Word);

  if (lazy)
    {
      /* The GOT entries for functions in the PLT have not yet been filled
         in.  Their initial contents will arrange when called to push an
         offset into the .rel.plt section, push _GLOBAL_OFFSET_TABLE_[1],
         and then jump to _GLOBAL_OFFSET_TABLE[2].  */
      Elf32_Addr *got = (Elf32_Addr *) D_PTR (l, l_info[DT_PLTGOT]);
      got[1] = (Elf32_Addr) l;	/* Identify this shared object.  */

      /* This function will get called to fix up the GOT entry indicated by
         the offset on the stack, and then jump to the resolved address.  */
      got[2] = (Elf32_Addr) &_dl_runtime_resolve;
    }

  return lazy;
}
...
/* If a library is prelinked but we have to relocate anyway,
   we have to be able to undo the prelinking of .got.plt.
   The prelinker saved us here address of .plt + 0x16.  */
if (got[1]) {
  l->l_mach.plt = got[1] + l->l_addr;
  l->l_mach.gotplt = (ElfW(Addr)) &got[3];
}

In the case of libfreebl3.so, got[1] is set, which is unusual, and LIEF did not support this case. The patch will fix this issue.

Reference: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf
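To make the got[1] case concrete, here is a rough sketch of how a rewriting tool could inspect the three reserved entries at the start of .got.plt and adjust got[1] when it is non-zero. This is not LIEF's actual patch: both function names are hypothetical, and the idea that got[1] should be shifted by the same delta as .plt is an assumption drawn from the glibc comment above (the prelinker stores the address of .plt + 0x16 there).

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def read_reserved_got_entries(got_bytes):
    """Decode the three reserved 8-byte entries of .got.plt (x86-64, little-endian)."""
    return struct.unpack_from("<3Q", got_bytes, 0)


def rebase_got1(got_bytes, plt_delta):
    """Return a copy of got_bytes with got[1] shifted by plt_delta when it is non-zero.

    Assumption: a zero got[1] is the common case and needs no change; a non-zero
    value is treated as a prelink-saved .plt address that must follow .plt.
    """
    _got0, got1, _got2 = read_reserved_got_entries(got_bytes)
    if got1 == 0:
        return bytes(got_bytes)
    patched = bytearray(got_bytes)
    struct.pack_into("<Q", patched, 8, (got1 + plt_delta) & MASK64)
    return bytes(patched)
```

Only the second entry is touched; the remaining entries are copied unchanged, since at run time the dynamic linker overwrites got[1] and got[2] itself when lazy binding is in effect.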

romainthomas pushed a commit that referenced this issue Sep 25, 2020
@romainthomas
Member

@pablogsal I'm waiting for the CI but the fix is here: c4a44d5#diff-c0ab473ba9cb221f36850d63e9f329c6R702

romainthomas pushed a commit that referenced this issue Sep 26, 2020
romainthomas pushed a commit that referenced this issue Sep 26, 2020
@romainthomas
Member

See: 1a416ea

@pablogsal
Author

Thank you very much @romainthomas !

romainthomas pushed a commit that referenced this issue Jan 17, 2022