ASSERT_CURIOSITY "crashed while walking dynamic header" on AArch64 #3385
I've uploaded output from readelf for libarmflang.so: The compiler is available from https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc
The fundamental issue is that DR locates .dynstr using a virtual offset (DT_STRTAB) and then accesses .dynstr after the loader's first flat mmap, when .dynstr's segment may not in fact be properly mapped. Most libraries have .dynstr in their first +rx segment, so its virtual offset matches its file offset and everything works at flat-file-mapping time. The reason libarmflang.so is unusual is that .dynstr is in its last +rw segment: The first mmap of the whole file may not reach as far as the later-mapped final segment, resulting in SIGBUS from reading off the end of the file on the mmap's final page. If the file is instead long enough, DR may just read some bogus value as the string.

It looks like /data/arm_hpc_compiler/opt/arm/gcc-8.2.0_Generic-AArch64_Ubuntu-16.04_aarch64-linux/lib64/libstdc++.so.6.0.25 also has this property, with .dynstr at the end, and DR also reads bogus values for soname there.

There are existing checks for not reading off the end of the mapping, but they do not look at the file size.

Unlike on Windows, where a single system call maps in all the segments, the multi-step segment loading process here makes it harder to figure out when library loading is finished. This is why DR likes to look for and process libraries on the first mmap. This has caused other problems in the past: xref #884.

I can see three possible solutions:

A) Just try to get .dynstr reading to work at the current flat-mmap point by looking at the PT_LOAD entries and computing the file offset for .dynstr.

B) Do not try to read .dynstr at the flat mmap: instead delay until some later point, either the same 1st-execution point as #884 or something else. Maybe evaluate who relies on it to decide.

C) Change DR's entire library analysis scheme to shift later, after the segment mmaps. This may open up corner cases of not properly handling non-standard code file mapping. This is certainly the largest and most complex change of the three.
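A minimal sketch of the address translation that option A relies on, not DR's actual code: find the loadable segment whose file-backed range contains the DT_STRTAB virtual address, rebase it onto that segment's file offset, and reject the result if it lies beyond the file size. The `load_segment_t` struct and `vaddr_to_file_offset()` are hypothetical names that stand in for the relevant `Elf64_Phdr` fields and for whatever helper DR would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the fields of a loadable
 * program header (Elf64_Phdr) that matter for this translation. */
typedef struct {
    uint64_t p_offset; /* file offset of the segment */
    uint64_t p_vaddr;  /* virtual address of the segment */
    uint64_t p_filesz; /* number of segment bytes backed by the file */
} load_segment_t;

/* Translates a virtual address (e.g. the DT_STRTAB value) to a file offset.
 * Returns 0 and sets *file_off on success; returns -1 if no segment's
 * file-backed range contains vaddr or if the offset is beyond file_size. */
static int
vaddr_to_file_offset(const load_segment_t *segs, size_t num_segs,
                     uint64_t vaddr, uint64_t file_size, uint64_t *file_off)
{
    assert(segs != NULL && file_off != NULL);
    for (size_t i = 0; i < num_segs; i++) {
        if (vaddr >= segs[i].p_vaddr &&
            vaddr - segs[i].p_vaddr < segs[i].p_filesz) {
            uint64_t off = segs[i].p_offset + (vaddr - segs[i].p_vaddr);
            if (off >= file_size)
                return -1; /* would read off the end of the file */
            *file_off = off;
            return 0;
        }
    }
    return -1;
}
```

For a library whose .dynstr sits in the first segment, `p_offset` and `p_vaddr` are typically equal and the translation is a no-op; the two values only diverge for later segments, which is the case that matters here.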
Thanks Derek!
Hi Derek, I've tried to fix this based on your suggestions A) and B).

For A, I couldn't figure out what other way there is of computing the offset of .dynstr. Using p_offset (0x510000) rather than p_vaddr (0x540000) makes no difference.

For B, waiting until the end of os_module_area_init() to access soname, after module_walk_program_headers(), makes no difference either:

I think it would be more productive for us to work out a fix on a PR referring to real code rather than in this discussion. To that end I've created #3419 with a simple update which delays accessing soname until all program headers have been processed in module_walk_program_headers().

Thanks
Are you sure? What is the file offset of the segment, in case it's different (maybe your version has multiple sections in that final segment)?

Using the file offset works in my case, which is running the app you provided and looking at this library:

DR originally read the wrong string here (it didn't crash because the values happened to stay within the library; it just read the wrong data):

The maps file at that point:

So we have this offset: 0x00007fffa8e9bd8b - 0x7fffa8ae5000 = 0x3b6d8b

And here's the string DR read from the flat-mapped file:

Now if we look at that segment we see the load offset of 0x3a0000, which was used by DR, vs the file offset of 0x4e0000:

Using 0x4e0000 instead of 0x3a0000 we get to the right place:
I just saw this comment in os_add_new_app_module():

If it's really not mapping the full file size the first time, then option A is not going to be safe. I suppose the fact that we saw SIGBUS in the first place corroborates this.
So for A we'd have to do our own mmap of the whole file (or read from disk).
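A rough sketch of the read-from-disk variant, assuming the file offset of the string has already been computed: it reads the string straight from the file, so it does not depend on how much of the file the loader has mapped so far. `read_string_from_path()` is a hypothetical helper, not an existing DR function, and it treats a short read or a missing terminator as failure.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads a NUL-terminated string that starts at file offset off of the file
 * at path into buf. Returns the string length, or -1 if the file cannot be
 * opened, the offset is at or past the end of the file, or no terminator is
 * found within the bytes read (at most buf_size). */
static ssize_t
read_string_from_path(const char *path, off_t off, char *buf, size_t buf_size)
{
    assert(path != NULL && buf != NULL && buf_size > 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* pread() returns 0 at or beyond end of file, so this never reads
     * past the file size. */
    ssize_t got = pread(fd, buf, buf_size, off);
    close(fd);
    if (got <= 0)
        return -1;
    char *nul = memchr(buf, '\0', (size_t)got);
    if (nul == NULL)
        return -1;
    return nul - buf;
}
```

Compared with a private mmap of the whole file, this costs a system call per lookup but cannot fault, which may be an acceptable trade for a one-off soname read.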
I think that is still at the first mmap? I don't think it's late enough: the segments have not been loaded yet. |
My bad, I was not using the p_offset correctly! Using the correct offset in the correct way, od shows the right string:
It looks like it is mapping the full size the first time. I will post a fix shortly. The fix isn't going to be as simple as I thought because the
Update in PR #5947 - I moved dynamic_info initialization from
AFAIK this issue with a late .dynstr section is very rare, so late initialization alone may be enough? This pattern is already used for handling the Android loader, and I re-used the same function in the PR.
This issue has been raised as a consequence of https://groups.google.com/forum/#!topic/dynamorio-users/UTWiYoc9TvA
The library load which exposes this failure in the ELF loader is libarmflang.so shipped as part of the Arm HPC compiler 19.0. I think it may be related to #1589 as the library is quite large (5.2M).
A SIGBUS (not SIGSEGV!) is caused by the strlen() call in the soname check in core/unix/module_elf.c:
It was reproduced using a simple Fortran test case built with armflang:
ldd shows which libraries are linked:
The size of each library:
Adding a SYSLOG_INTERNAL_INFO() call to look at *soname for all libraries loaded by hw shows a corrupt soname for libarmflang:
Interestingly, when run under GDB, a corrupt string appears in the SYSLOG_INTERNAL_INFO() output, but not when looking at the *soname pointer after the SIGBUS:
With a development version of libarmflang.so (not yet released), which is bigger than the released version, GDB says *soname hasn't been mapped:
AFAICT there's nothing wrong with the arithmetic of *dynstr and soname_index, but I could be wrong.
However, the fact that the failure is a SIGBUS rather than a SIGSEGV suggests that it's an alignment, cross-page mapping or memory map sync error.
Some of the comments in module_elf.c imply that the relevant mapping(s) may not be in memory at the time soname is accessed.
Running with strace shows libarmflang.so is mapped to ffff99d94000-ffff9a16c000 with soname pointing to 0x0000ffff9a16ad81: