Last entry in .data section content is not updated to new offset when segment is added #418

Closed
pdreiter opened this issue May 21, 2020 · 5 comments

@pdreiter
Contributor

pdreiter commented May 21, 2020

Describe the bug
The content of the .data section is not fully updated when a new segment is added: when the last entry in the section is an .rodata address, that entry keeps its old value.

To Reproduce
I do not have a simple input binary that demonstrates this issue, but the following steps reproduce it:

  1. parse AIS-Lite from https://github.com/trailofbits/cb-multios, compiled with gcc (CC=gcc CXX=g++ build.sh)
  2. add new segment to ./AIS-Lite and generate output binary ./added_seg.bin => contents of ./added_seg.bin have mostly been shifted by 0x1000
  3. crudely evaluate the last 8 bytes of the .data section content in each binary:
>>> import lief
>>> x=lief.parse("./AIS-Lite")
>>> y=lief.parse("./added_seg.bin")
>>> lx=len(x.get_section(".data").content)
>>> ly=len(y.get_section(".data").content)
>>> print(x.get_section(".data").content[lx-8:])
[43, 65, 0, 0, 52, 65, 0, 0]
>>> print(y.get_section(".data").content[ly-8:])
[43, 81, 0, 0, 52, 65, 0, 0]

These bytes are the little-endian addresses 0x412b and 0x4134, which point into .rodata in the original binary.
In ./added_seg.bin, the corresponding .rodata addresses for these symbols are 0x512b and 0x5134, but the actual .data contents of ./added_seg.bin are 0x512b and 0x4134, respectively. The symbols are fine; only the last .data entry has not been updated to the new offset.
I'm not sure if this information is relevant, but the problematic global symbol is a const array of char*.
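To make the byte lists above easier to read, here is a minimal sketch that decodes them as 32-bit little-endian pointers. The helper name `decode_pointers` is made up for illustration; it only uses the standard `struct` module and the byte values printed above.

```python
import struct

def decode_pointers(content, width=4):
    """Decode a list of byte values into little-endian unsigned pointers.

    `width` is the pointer size in bytes (4 for a 32-bit ELF, 8 for 64-bit).
    """
    raw = bytes(content)
    code = {4: "I", 8: "Q"}[width]
    return list(struct.unpack("<" + code * (len(raw) // width), raw))

# Last 8 bytes of .data, copied from the session above.
original = decode_pointers([43, 65, 0, 0, 52, 65, 0, 0])
patched = decode_pointers([43, 81, 0, 0, 52, 65, 0, 0])

print([hex(p) for p in original])  # ['0x412b', '0x4134']
print([hex(p) for p in patched])   # ['0x512b', '0x4134']
```

Only the first of the two pointers has moved by 0x1000 in the patched binary.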

Expected behavior
I'm expecting that if some .data content is updated to the new offset, then all content is updated.
That means this:

>>> print(y.get_section(".data").content[ly-8:])
[43, 81, 0, 0, 52, 65, 0, 0]

should be:

>>> print(y.get_section(".data").content[ly-8:])
[43, 81, 0, 0, 52, **81**, 0, 0]

Environment (please complete the following information):

  • Ubuntu 19.04
  • Target format: ELF
  • LIEF commit version: 0.10.1-bfe5415

Additional context

@romainthomas
Member

I think I understand the issue, but the modification does not seem generic. Also, did you check the relocations?

@pdreiter
Contributor Author

I did check the relocation sections, but I will check again.
Anything in particular you want me to report?
I will also update with more info about the problematic global symbol.

I’m rephrasing what I have seen in case my first attempt lacked clarity:
The global .data symbol whose addresses are only partially converted is a 9-element array of const char*. LIEF updated EPFD[7] to the new virtual address in .rodata, but EPFD[8] still contains the unmodified address.

Update: the test passed after manually updating .data’s EPFD[8] to the new virtual address.

@pdreiter
Contributor Author

Dynamic relocations look fine - no symbols are associated with the RELATIVE relocation entry, but this relocation entry's address and EPFD symbol value are equivalent.

But the issue is the relative pointers in the EPFD array. Both EPFD and STATUS are arrays of 9 const char*, so 36 bytes are associated with each symbol.

Here is the .rodata section information from the original binary ./AIS-Lite (x) and from the binary with the LIEF-inserted segment, ./added_seg.bin (y):

>>> xro=x.get_section(".rodata")
>>> yro=y.get_section(".rodata")
>>> print("{:x} {:x} {:x}".format(xro.virtual_address,xro.offset,xro.size))
4000 4000 24c
>>> print("{:x} {:x} {:x}".format(yro.virtual_address,yro.offset,yro.size))
5000 5000 24c

Here's some information for EPFD and STATUS symbols from the original ./AIS-Lite binary:

>>> i=x.get_symbol("STATUS")
>>> j=x.sections[i.shndx]
>>> status_offset=i.value-j.virtual_address
>>> for o in range(0,i.size):
...    print("{:x}".format(j.content[status_offset+o]),end=', ')
... 
9, 40, 0, 0, 1f, 40, 0, 0, 29, 40, 0, 0, 3b, 40, 0, 0, 56, 40, 0, 0, 6d, 40, 0, 0, 74, 40, 0, 0, 7c, 40, 0, 0, 84, 40, 0, 0, >>>
>>> i=x.get_symbol("EPFD")
>>> j=x.sections[i.shndx]
>>> epfd_offset=i.value-j.virtual_address
>>> for o in range(0,i.size):
...    print("{:x}".format(j.content[epfd_offset+o]),end=', ')
... 
d4, 40, 0, 0, de, 40, 0, 0, e2, 40, 0, 0, ea, 40, 0, 0, ff, 40, 0, 0, 7, 41, 0, 0, e, 41, 0, 0, 2b, 41, 0, 0, 34, 41, 0, 0, >>> 

Here is the same information from the ./added_seg.bin:

>>> i=y.get_symbol("STATUS")
>>> j=y.sections[i.shndx]
>>> status_offset=i.value-j.virtual_address
>>> for o in range(0,i.size):
...    print("{:x}".format(j.content[status_offset+o]),end=', ')
... 
9, 50, 0, 0, 1f, 50, 0, 0, 29, 50, 0, 0, 3b, 50, 0, 0, 56, 50, 0, 0, 6d, 50, 0, 0, 74, 50, 0, 0, 7c, 50, 0, 0, 84, 50, 0, 0, >>> 
>>> i=y.get_symbol("EPFD")
>>> j=y.sections[i.shndx]
>>> epfd_offset=i.value-j.virtual_address
>>> for o in range(0,i.size):
...    print("{:x}".format(j.content[epfd_offset+o]),end=', ')
... 
d4, 50, 0, 0, de, 50, 0, 0, e2, 50, 0, 0, ea, 50, 0, 0, ff, 50, 0, 0, 7, 51, 0, 0, e, 51, 0, 0, 2b, 51, 0, 0, 34, 41, 0, 0, >>> 
>>>
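The comparison between the two EPFD dumps can also be automated. The following is a rough sketch, not part of LIEF: `find_stale_pointers` is a hypothetical helper that flags every entry that pointed into the old .rodata range but was not moved by the expected shift. The pointer values are the dumps above decoded as 32-bit little-endian words, and the .rodata base and size come from the earlier output (0x4000, 0x24c).

```python
def find_stale_pointers(old_ptrs, new_ptrs, old_base, size, shift):
    """Return indices of pointers into [old_base, old_base + size) that
    were not relocated by `shift` in the rewritten binary."""
    stale = []
    for index, (old, new) in enumerate(zip(old_ptrs, new_ptrs)):
        points_into_region = old_base <= old < old_base + size
        if points_into_region and new != old + shift:
            stale.append(index)
    return stale

# EPFD contents from ./AIS-Lite and ./added_seg.bin, decoded from the dumps.
epfd_old = [0x40d4, 0x40de, 0x40e2, 0x40ea, 0x40ff, 0x4107, 0x410e, 0x412b, 0x4134]
epfd_new = [0x50d4, 0x50de, 0x50e2, 0x50ea, 0x50ff, 0x5107, 0x510e, 0x512b, 0x4134]

stale = find_stale_pointers(epfd_old, epfd_new, old_base=0x4000, size=0x24c, shift=0x1000)
print(stale)  # [8]
```

Only index 8, the last element of EPFD, is reported as stale.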

Since all STATUS elements (pointers to .rodata) have been updated to the new .rodata offset, perhaps the problem is that the last element of the EPFD array occupies the last valid memory location in the section?

>>> print (epfd_offset+i.size)
164
>>> print(j.size)
164
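One way such a symptom can arise is an off-by-one bounds check while patching pointers. The sketch below is purely illustrative and is not LIEF's code: `patch_pointers` and its `inclusive` flag are invented here to contrast a strict `<` check with `<=` when a pointer slot ends exactly at the end of the buffer, as EPFD[8] does (offset + size == section size).

```python
def patch_pointers(content, offsets, shift, ptr_size=4, inclusive=True):
    """Add `shift` to each little-endian pointer stored at `offsets`.

    With inclusive=False the bounds check uses `<`, so a pointer whose
    last byte is the last byte of `content` is silently skipped.
    """
    patched = bytearray(content)
    size = len(patched)
    for off in offsets:
        end = off + ptr_size
        in_bounds = end <= size if inclusive else end < size
        if not in_bounds:
            continue  # slot treated as out of range, left untouched
        value = int.from_bytes(patched[off:end], "little")
        patched[off:end] = (value + shift).to_bytes(ptr_size, "little")
    return bytes(patched)

# Last two EPFD entries from the original binary, treated as a tiny "section".
tail = [0x2b, 0x41, 0, 0, 0x34, 0x41, 0, 0]

buggy = patch_pointers(tail, offsets=[0, 4], shift=0x1000, inclusive=False)
fixed = patch_pointers(tail, offsets=[0, 4], shift=0x1000, inclusive=True)

print(list(buggy))  # [43, 81, 0, 0, 52, 65, 0, 0]
print(list(fixed))  # [43, 81, 0, 0, 52, 81, 0, 0]
```

The strict check reproduces the bytes observed in ./added_seg.bin, and the inclusive check gives the expected ones. Whether this is the actual root cause in LIEF is an assumption.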

@pdreiter
Contributor Author

I was able to generate a VERY simple scenario that duplicates this bug.
Please note that I found this with 32-bit ELF executables.

contents of test_418.c

#include <stdio.h>
//--------------------------------------------------------
// basic scenario to test out lief bug 418 filed by pdreiter
//--------------------------------------------------------

static char* words[] = {
"hello","my","baby",         // # 0-2
"hello","my","honey",      // # 3-5
"hello","my","ragtime","gal" // # 6-9
};


void main(){

for (int i=0;i<10;i++) {
   printf(words[i]);
   printf(" ");
}

}

COMPILE:
gcc -m32 test_418.c -o lief_test_418

lief manipulation:

import lief
x=lief.parse("lief_test_418")
x.add(lief.ELF.Segment())
x.write("lief_bug_418")

output of "./lief_test_418":
hello my baby hello my honey hello my ragtime gal
output of "./lief_bug_418"
hello my baby hello my honey hello my ragtime

I cannot reproduce this error with a 64-bit ELF binary input:
gcc test_418.c -o lief64_test_418

@pdreiter
Contributor Author

@romainthomas - I hope that this is good enough for you to root cause!

pdreiter added a commit to pdreiter/LIEF that referenced this issue Apr 1, 2021
addresses boundary scenario for relocations whose
relative offsets abut segment_size
romainthomas added a commit that referenced this issue Apr 3, 2021
xhochy pushed a commit to xhochy/LIEF that referenced this issue May 25, 2021
addresses boundary scenario for relocations whose
relative offsets abut segment_size
romainthomas pushed a commit that referenced this issue Jan 17, 2022
addresses boundary scenario for relocations whose
relative offsets abut segment_size