ASSERT_CURIOSITY "crashed while walking dynamic header" on AArch64 #3385

Open
AssadHashmi opened this issue Feb 21, 2019 · 11 comments
@AssadHashmi
Contributor

This issue has been raised as a consequence of https://groups.google.com/forum/#!topic/dynamorio-users/UTWiYoc9TvA

The library load which exposes this failure in the ELF loader is libarmflang.so shipped as part of the Arm HPC compiler 19.0. I think it may be related to #1589 as the library is quite large (5.2M).

A SIGBUS (not SIGSEGV!) is caused by the strlen() call in the soname check in core/unix/module_elf.c:

                /* test string readability while still in try/except
                 * in case we screwed up somewhere or module is
                 * malformed/only partially mapped */
                if (*soname != NULL && strlen(*soname) == -1) {
                    ASSERT_NOT_REACHED();
                }

It was reproduced using a simple Fortran test case built with armflang:

$ cat hw.f
       program hello
          print *, "Hello World!"
       end program hello
$
$ armflang hw.f -o hw
$
$ ./hw
 Hello World!
$

ldd shows which libraries are linked:

$ ldd hw
        linux-vdso.so.1 (0x0000ffffb0649000)
        libarmflang.so => /opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_SUSE-12_aarch64-linux/lib/libarmflang.so (0x0000ffffb0243000)
        libomp.so => /opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_SUSE-12_aarch64-linux/lib/libomp.so (0x0000ffffb0165000)
        libm.so.6 => /lib64/libm.so.6 (0x0000ffffb0092000)
        librt.so.1 => /lib64/librt.so.1 (0x0000ffffb0071000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000ffffb003c000)
        libc.so.6 => /lib64/libc.so.6 (0x0000ffffafec6000)
        libstdc++.so.6 => /opt/arm/gcc-8.2.0_Generic-AArch64_SUSE-12_aarch64-linux/lib64/libstdc++.so.6 (0x0000ffffafccd000)
        libgcc_s.so.1 => /opt/arm/gcc-8.2.0_Generic-AArch64_SUSE-12_aarch64-linux/lib64/libgcc_s.so.1 (0x0000ffffafc9c000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000ffffafc7b000)
        /lib/ld-linux-aarch64.so.1 (0x0000ffffb061b000)
$

The size of each library:

5.1M    /opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_SUSE-12_aarch64-linux/lib/libarmflang.so
1.1M    /opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_SUSE-12_aarch64-linux/lib/libomp.so
4.0K    /lib64/libm.so.6
4.0K    /lib64/librt.so.1
4.0K    /lib64/libpthread.so.0
4.0K    /lib64/libc.so.6
4.0K    /opt/arm/gcc-8.2.0_Generic-AArch64_SUSE-12_aarch64-linux/lib64/libstdc++.so.6
844K    /opt/arm/gcc-8.2.0_Generic-AArch64_SUSE-12_aarch64-linux/lib64/libgcc_s.so.1
4.0K    /lib64/libdl.so.2

Adding a SYSLOG_INTERNAL_INFO() call to look at *soname for all libraries loaded by hw shows a corrupt soname for libarmflang:

$ git diff
diff --git a/core/unix/module_elf.c b/core/unix/module_elf.c
index 8d4db788..e2efca55 100644
--- a/core/unix/module_elf.c
+++ b/core/unix/module_elf.c
@@ -454,6 +454,8 @@ module_fill_os_data(ELF_PROGRAM_HEADER_TYPE *prog_hdr, /* PT_DYNAMIC entry */
                 /* test string readability while still in try/except
                  * in case we screwed up somewhere or module is
                  * malformed/only partially mapped */
+                SYSLOG_INTERNAL_INFO("DEBUG accessing *soname=[%p]", *soname);
+                SYSLOG_INTERNAL_INFO("DEBUG accessing *soname=[%s]\n", *soname);
                 if (*soname != NULL && strlen(*soname) == -1) {
                     ASSERT_NOT_REACHED();
                 }

$ drrun -- ./hw
<Starting application /path/to/hw (28477)>
<Initial options = -no_dynamic_options -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -early_inject -emulate_brk -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<DEBUG accessing *soname=[0x0000000071009e8c]>
<DEBUG accessing *soname=[libdynamorio.so]
>
<DEBUG accessing *soname=[0x0000000071009e8c]>
<DEBUG accessing *soname=[libdynamorio.so]
>
<Paste into GDB to debug DynamoRIO clients:
set confirm off
add-symbol-file '/path/to/dynamorio/build/lib64/debug/libdynamorio.so' 0x0000000071013560
>
<DEBUG accessing *soname=[0x0000ffff838d0252]>
<DEBUG accessing *soname=[linux-vdso.so.1]
>
<DEBUG accessing *soname=[0x0000ffff838f7929]>
<DEBUG accessing *soname=[ld-linux-aarch64.so.1]
>
<DEBUG accessing *soname=[0x0000ffff838cdd81]>
<DEBUG accessing *soname=[^E^S^F<82>^E^KJ^E^U^FK^E^M^FJ^E^S^FK^E^K^Fº^D^K^E]   <-- should be libarmflang.so
>
<DEBUG accessing *soname=[0x0000ffff8342385a]>
<DEBUG accessing *soname=[libomp.so]
>
<DEBUG accessing *soname=[0x0000ffff8336225f]>
<DEBUG accessing *soname=[libm.so.6]
>
<DEBUG accessing *soname=[0x0000ffff83338280]>
<DEBUG accessing *soname=[librt.so.1]
>
<(1+x) Handling our fault in a TRY at 0x00000000712e9274>
<DEBUG accessing *soname=[0x0000ffff83305ffc]>
<DEBUG accessing *soname=[libpthread.so.0]
>
<DEBUG accessing *soname=[0x0000ffff831a302b]>
<DEBUG accessing *soname=[libc.so.6]
>
<DEBUG accessing *soname=[0x0000ffff83184173]>
<DEBUG accessing *soname=[]
>
<DEBUG accessing *soname=[0x0000ffff82f64135]>
<DEBUG accessing *soname=[libgcc_s.so.1]
>
<DEBUG accessing *soname=[0x0000ffff82f418c4]>
<DEBUG accessing *soname=[libdl.so.2]
>
 Hello World!
<Stopping application /path/to/hw (28477)>

Interestingly, when run under GDB, the SYSLOG_INTERNAL_INFO() output shows a corrupt string, but examining the *soname pointer after the SIGBUS shows a valid one:

. . .
Missing separate debuginfo for /lib/ld-linux-aarch64.so.1
Try: zypper install -C "debuginfo(build-id)=e8104675ba94d7c698d02558d81423b4fe5bff11"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=9a94d7a23a8802dd0f216f3a3cea0eb29d703de0"
process 29487 is executing new program: /path/to/dynamorio/build/lib64/debug/libdynamorio.so
<Starting application /path/to/hw (29487)>
<Initial options = -no_dynamic_options -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -early_inject -emulate_brk -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<DEBUG accessing *soname=[0x0000000071009e8c]>
<DEBUG accessing *soname=[libdynamorio.so]
>
<DEBUG accessing *soname=[0x0000000071009e8c]>
<DEBUG accessing *soname=[libdynamorio.so]
>
<Paste into GDB to debug DynamoRIO clients:
set confirm off
add-symbol-file '/path/to/dynamorio/build/lib64/debug/libdynamorio.so' 0x0000000071013560
>
<DEBUG accessing *soname=[0x0000ffffbf6a7252]>
<DEBUG accessing *soname=[linux-vdso.so.1]
>
<DEBUG accessing *soname=[0x0000ffffbf6ce929]>
<DEBUG accessing *soname=[ld-linux-aarch64.so.1]
>
<DEBUG accessing *soname=[0x0000ffffbf6a4d81]>
<DEBUG accessing *soname=[
JK                        JK

  ]
>
<DEBUG accessing *soname=[0x0000ffffbf1fa85a]>
<DEBUG accessing *soname=[libomp.so]
>
<DEBUG accessing *soname=[0x0000ffffbf13925f]>
<DEBUG accessing *soname=[libm.so.6]
>
<DEBUG accessing *soname=[0x0000ffffbf10f280]>
<DEBUG accessing *soname=[librt.so.1]
>

Program received signal SIGBUS, Bus error.
safe_read_asm_pre () at /path/to/dynamorio/core/arch/aarch64/aarch64.asm:400
400             strb     w3, [ARG1]
(gdb)
0xffffbf6ce929 <myvals+6377>:   "ld-linux-aarch64.so.1"
(gdb) x /bs 0x0000ffffbf6a4d81
0xffffbf6a4d81: "libarmflang.so"                           <-- valid soname !
(gdb)

With a development version of libarmflang.so (not yet released), which is bigger than the released version, GDB says *soname hasn't been mapped:

$ gdb drrun
GNU gdb (GDB; openSUSE Tumbleweed) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "aarch64-suse-linux".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://bugs.opensuse.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /path/to/dynamorio/build/bin64/drrun...Reading symbols from /path/to/dynamorio/build/bin64/drrun.debug...done.
done.
(gdb) set confirm off
(gdb) add-symbol-file '/path/to/dynamorio/build/lib64/debug/libdynamorio.so' 0x0000000071013560
add symbol table from file "/path/to/dynamorio/build/lib64/debug/libdynamorio.so" at
        .text_addr = 0x71013560
Reading symbols from /path/to/dynamorio/build/lib64/debug/libdynamorio.so...Reading symbols from /path/to/dynamorio/build/lib64/debug/libdynamorio.so.debug...done.
done.
(gdb) r ./hw
Starting program: /path/to/dynamorio/build/bin64/drrun ./hw
Missing separate debuginfo for /lib/ld-linux-aarch64.so.1
Try: zypper install -C "debuginfo(build-id)=e8104675ba94d7c698d02558d81423b4fe5bff11"
Missing separate debuginfo for /lib64/libc.so.6
Try: zypper install -C "debuginfo(build-id)=9a94d7a23a8802dd0f216f3a3cea0eb29d703de0"
process 29516 is executing new program: /path/to/dynamorio/build/lib64/debug/libdynamorio.so
<Starting application /path/to/hw (29516)>
<Initial options = -no_dynamic_options -code_api -stack_size 56K -signal_stack_size 32K -max_elide_jmp 0 -max_elide_call 0 -early_inject -emulate_brk -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<DEBUG accessing *soname=[0x0000000071009e8c]>
<DEBUG accessing *soname=[libdynamorio.so]
>
<DEBUG accessing *soname=[0x0000000071009e8c]>
<DEBUG accessing *soname=[libdynamorio.so]
>
<Paste into GDB to debug DynamoRIO clients:
set confirm off
add-symbol-file '/path/to/dynamorio/build/lib64/debug/libdynamorio.so' 0x0000000071013560
>
<DEBUG accessing *soname=[0x0000ffffbf6a7252]>
<DEBUG accessing *soname=[linux-vdso.so.1]
>
<DEBUG accessing *soname=[0x0000ffffbf6ce929]>
<DEBUG accessing *soname=[ld-linux-aarch64.so.1]
>
<DEBUG accessing *soname=[0x0000ffffbf6a5adf]>

Program received signal SIGBUS, Bus error.
0x00000000711cbe54 in our_vsnprintf (s=0x48ad5f61 "0x0000ffffbf6a5adf]", max=2048, fmt=0x713b99d8 "DEBUG accessing *soname=[%s]\n", ap=...)
    at /path/to/dynamorio/core/iox.h:685
685                     while (*str) {
(gdb) x /bs 0x0000ffffbf6ce929
0xffffbf6ce929 <myvals+6377>:   "ld-linux-aarch64.so.1"
(gdb) x /bs 0x0000ffffbf6a5adf
0xffffbf6a5adf: <error: Cannot access memory at address 0xffffbf6a5adf>      <--  Not mapped!?
(gdb)

AFAICT there's nothing wrong with the arithmetic of *dynstr and soname_index, but I could be wrong.
However, the fact that the failure is a SIGBUS rather than a SIGSEGV suggests that it's an alignment, cross-page mapping, or memory-map synchronization error.
Some of the comments in module_elf.c imply that the relevant mapping(s) may not be in memory at the time soname is accessed.

Running with strace shows libarmflang.so is mapped to ffff99d94000-ffff9a16c000 with soname pointing to 0x0000ffff9a16ad81:

. . .
29725 mmap(NULL, 4026480, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 4</opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_SUSE-12_aarch64-linux/lib/libarmflang.so>, 0) = 0xffff99d94000
29725 openat(AT_FDCWD, "/proc/29725/maps", O_RDONLY) = 5</proc/29725/maps>
29725 read(5</proc/29725/maps>, "
00400000-00401000 r-xp 00000000 00:37 3893600                            /path/to/hw
00401000-0041f000 ---p 00000000 00:00 0
0041f000-00421000 rw-p 0000f000 00:37 3893600                            /path/to/hw
00421000-00422000 ---p 00000000 00:00 0
00422000-00423000 rw-p 00000000 00:00 0
4f1fc000-4f1fd000 ---p 00000000 00:00 0
4f1fd000-4f203000 rw-p 00000000 00:00 0
4f203000-4f205000 ---p 00000000 00:00 0
4f205000-4f213000 rw-p 00000000 00:00 0
. . .
4f2c3000-4f2d6000 ---p 00000000 00:00 0
4f2d6000-4f2d7000 rwxp 00000000 00:00 0
4f2d7000-4f2e6000 ---p 00000000 00:00 0
4f2e6000-4f2e9000 rwxp 00000000 00:00 0
4f2e9000-5f1fc000 ---p 00000000 00:00 0
71000000-713bc000 r-xp 00000000 00:37 3894297                            /path/to/dynamorio/build/lib64/debug/libdynamorio.so
713bc000-713db000 ---p 00000000 00:00 0
713db000-713fe000 rw-p 003cb000 00:37 3894297                            /path/to/dynamorio/build/lib64/debug/libdynamorio.so
713fe000-71434000 rw-p 00000000 00:00 0
71434000-71435000 ---p 00000000 00:00 0
ffff99d94000-ffff9a16c000 r-xp 00000000 00:39 1315912                    /opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_SUSE-12_aarch64-linux/lib/libarmflang.so
ffff9a16c000-ffff9a16d000 r--p 00000000 00:00 0                          [vvar]
ffff9a16d000-ffff9a16e000 r-xp 00000000 00:00 0                          [vdso]
ffff9a192000-ffff9a194000 rw-p 00000000 00:00 0
ffff9a194000-ffff9a1b3000 r-xp 00000000 00:2c 4322882                    /lib64/ld-2.27.so
ffff9a1b3000-ffff9a1c3000 ---p 00000000 00:00 0
ffff9a1c3000-ffff9a1c5000 rw-p 0001f000 00:2c 4322882                    /lib64/ld-2.27.so
ffff9a1c5000-ffff9a1c6000 rw-p 00000000 00:00 0
ffff9a1c6000-ffff9a1c7000 ---p 00000000 00:00 0
ffffca2d1000-ffffca2f3000 rw-p 00000000 00:00 0                          [stack]
", 4103) = 3155
29725 close(5</proc/29725/maps>)        = 0
29725 mprotect(0x4f25e000, 4096, PROT_READ|PROT_WRITE) = 0
29725 rt_sigprocmask(SIG_SETMASK, NULL, [], 8) = 0
29725 write(2<pipe:[42719551]>, "<DEBUG accessing *soname=[0x0000ffff9a16ad81]>\n", 47) = 47
29725 write(2<pipe:[42719551]>, "<DEBUG accessing *soname=[\5\23\6\202\5\vJ\5\25\6K\5\r\6J\5\23\6K\5\v\6\272\4\v\5]\n>\n", 56) = 56
. . .
@AssadHashmi
Contributor Author

I've uploaded output from readelf for libarmflang.so:
libarmflang.elfdump.txt

The compiler is available from https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc

@derekbruening
Contributor

The fundamental issue is that DR is locating .dynstr using a virtual offset (DT_STRTAB) and then accessing .dynstr after the first flat mmap by the loader, when .dynstr's segment may not in fact be properly mapped. Most libraries have .dynstr in their first +rx segment, and so its virtual offset matches its file offset, and everything works at flat-file-mapping time.

The reason libarmflang.so is unusual is that .dynstr is in its last +rw segment:

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x00000000002ede6c 0x00000000002ede6c  R E    0x10000
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_EH_FRAME   0x00000000002aee40 0x00000000002aee40 0x00000000002aee40
                 0x000000000000d87c 0x000000000000d87c  R      0x4
  LOAD           0x00000000002f1588 0x0000000000301588 0x0000000000301588
                 0x000000000002c3a4 0x0000000000061cb0  RW     0x10000
  GNU_RELRO      0x00000000002f1588 0x0000000000301588 0x0000000000301588
                 0x000000000000ea78 0x000000000000ea78  R      0x1
  DYNAMIC        0x00000000002ff968 0x000000000030f968 0x000000000030f968
                 0x0000000000000240 0x0000000000000240  RW     0x8
  LOAD           0x00000000004b0000 0x0000000000370000 0x0000000000370000
                 0x0000000000021b68 0x0000000000021b68  RW     0x10000
  LOAD           0x00000000004e0000 0x00000000003a0000 0x00000000003a0000
                 0x0000000000017098 0x0000000000017098  RW     0x10000

 Section to Segment mapping:
  Segment Sections...
   00     .dynsym .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame 
   01     
   02     .eh_frame_hdr 
   03     .init_array .fini_array .data.rel.ro .dynamic .got .got.plt .data .bss 
   04     .init_array .fini_array .data.rel.ro .dynamic .got 
   05     .dynamic 
   06     .gnu.hash 
   07     .dynstr 

The first mmap of the whole file may not reach as far as the later-mapped final segment, resulting in SIGBUS from reading off the end of the file on the mmap's final page. If the file is instead long enough, DR may just read some bogus value as the string.
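To make the SIGBUS case concrete, here is a minimal sketch (not DR code) that maps two pages of a one-byte file and probes the page lying wholly beyond end-of-file. The helper names `probe_read` and `signal_past_eof` are made up for illustration, and the behavior relied on is the one documented in mmap(2) for Linux, where such an access raises SIGBUS rather than SIGSEGV.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf probe_env;
static volatile sig_atomic_t fault_sig;

/* Records which signal fired and unwinds back to the probe. */
static void
on_fault(int sig)
{
    fault_sig = sig;
    siglongjmp(probe_env, 1);
}

/* Hypothetical helper: reads one byte at addr.
 * Returns 0 on success or the number of the signal that was raised. */
static int
probe_read(const volatile char *addr)
{
    struct sigaction sa, old_bus, old_segv;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old_bus);
    sigaction(SIGSEGV, &sa, &old_segv);
    fault_sig = 0;
    if (sigsetjmp(probe_env, 1) == 0)
        (void)*addr; /* the access under test */
    sigaction(SIGBUS, &old_bus, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    return fault_sig;
}

/* Hypothetical helper: creates a 1-byte file at path, maps two pages of it,
 * and probes the page that lies wholly past end-of-file.
 * Returns the signal raised, 0 if the read succeeded, or -1 on setup failure. */
static int
signal_past_eof(const char *path)
{
    assert(path != NULL);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int result = -1;
    if (write(fd, "x", 1) == 1) {
        char *map = mmap(NULL, 2 * (size_t)page, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map != MAP_FAILED) {
            if (probe_read(map) == 0) /* first page is backed by the file */
                result = probe_read(map + page);
            munmap(map, 2 * (size_t)page);
        }
    }
    close(fd);
    unlink(path);
    return result;
}
```

Calling `signal_past_eof()` with a scratch path on a Linux system is expected to return SIGBUS, which is the same class of fault the TRY block catches here.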

It looks like /data/arm_hpc_compiler/opt/arm/gcc-8.2.0_Generic-AArch64_Ubuntu-16.04_aarch64-linux/lib64/libstdc++.so.6.0.25 also has this property, with .dynstr at the end, and DR also reads bogus values for soname here.

There are existing checks for not reading off the end of the mapping, but they are not looking at the file size.
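A file-size-aware check could look roughly like the sketch below. It is only an illustration: `file_range_is_backed` is a hypothetical name rather than an existing DR routine, and `file_size` is assumed to come from something like `fstat()`'s `st_size` for the mapped file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of DynamoRIO: returns true only if the
 * byte range [offset, offset + length) lies entirely inside a file of
 * file_size bytes. file_size is assumed to come from fstat()'s st_size. */
static bool
file_range_is_backed(uint64_t file_size, uint64_t offset, uint64_t length)
{
    /* Compare against the remaining bytes instead of computing
     * offset + length, which could wrap around for hostile inputs. */
    if (offset > file_size)
        return false;
    return length <= file_size - offset;
}

/* Sanity check of the overflow handling, usable from a unit test. */
static void
file_range_self_check(void)
{
    assert(file_range_is_backed(10, 0, 10));
    assert(!file_range_is_backed(10, 1, UINT64_MAX));
}
```

A caller would run this in addition to the existing end-of-mapping checks before touching a string through the flat mapping.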

Unlike on Windows where a single system call maps in all the segments, with the multi-step segment loading process it is not as simple to figure out when library loading is finished. This is why DR likes to look for and process libraries on the first mmap. This has caused other problems in the past: xref #884.

I could see three possible solutions:

A) Just try to get .dynstr reading to work at the current flat-mmap point by looking at the PT_LOAD entries and computing the file offset for .dynstr.

B) Do not try to read .dynstr at the flat-mmap: instead delay until some later point, either the same 1st-execution as #884 or something else. Maybe evaluate who relies on it to decide.

C) Change DR's entire library analysis scheme to shift later after the segment mmaps. This may open up corner cases of not properly handling non-standard code file mapping. Certainly this is the largest and most complex change of the 3.
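For option A, the virtual-address-to-file-offset translation might be sketched as follows. This is not DR's code: `vaddr_to_file_offset` is a hypothetical helper, it assumes a 64-bit ELF and the `Elf64_Phdr` definitions from the system `<elf.h>`, and it deliberately rejects addresses that fall in the zero-filled (.bss-like) tail of a segment, since those have no bytes in the file.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a DynamoRIO function: translates a virtual
 * address (relative to the module's load base) into an offset in the
 * on-disk file by finding the PT_LOAD segment that contains it. */
static bool
vaddr_to_file_offset(const Elf64_Phdr *phdrs, size_t count, uint64_t vaddr,
                     uint64_t *file_off)
{
    assert(phdrs != NULL && file_off != NULL);
    for (size_t i = 0; i < count; i++) {
        const Elf64_Phdr *ph = &phdrs[i];
        if (ph->p_type != PT_LOAD)
            continue;
        /* Only the first p_filesz bytes of a segment exist in the file;
         * anything between p_filesz and p_memsz is zero-fill. */
        if (vaddr >= ph->p_vaddr && vaddr - ph->p_vaddr < ph->p_filesz) {
            *file_off = ph->p_offset + (vaddr - ph->p_vaddr);
            return true;
        }
    }
    return false;
}
```

With the final LOAD segment shown above (file offset 0x4e0000, virtual address 0x3a0000, file size 0x17098), a virtual offset inside that segment maps to the same distance past 0x4e0000 in the file.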

@AssadHashmi
Contributor Author

Thanks Derek!
C seems quite risky. I think I understand A and will look at using that as a fix.

@AssadHashmi
Contributor Author

Hi Derek, I've tried to fix this based on your suggestions A) and B).

For A, I couldn't figure out what other way there is of computing the offset of .dynstr.
AIUI .dynstr is the string table of the dynamic section, because .dynstr is of type STRTAB:

  [36] .dynstr           STRTAB           0000000000540000  00510000
       0000000000017dcc  0000000000000000   A       0     0     8

Using p_offset (0x510000) rather than p_vaddr (0x540000) makes no difference.

For B, waiting until the end of os_module_area_init() to access soname, after module_walk_program_headers(), makes no difference either:

      ma->names.inode = inode;
      if (soname == NULL)
          ma->names.module_name = NULL;
      else
          ma->names.module_name = dr_strdup(soname HEAPACCT(which));

I think it would be more productive for us to work out a fix on a PR referring to real code rather than this discussion.

To that end I've created #3419 with a simple update which delays accessing soname until all program headers have been processed in module_walk_program_headers().

Thanks

@derekbruening
Contributor

> Using p_offset (0x510000) rather than p_vaddr (0x540000) makes no difference.

Are you sure? What is the file offset of the segment, in case it's different (maybe your version has multiple sections in that final segment)?

Using the file offset works in my case, which is running the app you provided and looking at this library:

/data/arm_hpc_compiler/opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_Ubuntu-16.04_aarch64-linux/lib/libarmflang.so

DR originally read the wrong string here (it didn't crash b/c the values happened to stay within the library; it just reads the wrong data):

<*soname=[0x00007fffa8e9bd8b]=[pc-compiler-19.0/flang/runtime/libpgmath/lib/generic/around.c]>

The maps file at that point:

7fffa8ae5000-7fffa8e9d000 r-xp 00000000 08:11 9437437                    /data/arm_hpc_compiler/opt/arm/arm-hpc-compiler-19.0_Generic-AArch64_Ubuntu-16.04_aarch64-linux/lib/libarmflang.so
7fffa8e9d000-7fffa8e9f000 rw-p 00000000 00:00 0 

So we have this offset: 0x00007fffa8e9bd8b-0x7fffa8ae5000 = 0x3b6d8b

And here's the string DR read from the flat-mapped file:

$ od -t c -A x -j 0x3b6d8b libarmflang.so | head -4
3b6d8b   p   c   -   c   o   m   p   i   l   e   r   -   1   9   .   0
3b6d9b   /   f   l   a   n   g   /   r   u   n   t   i   m   e   /   l
3b6dab   i   b   p   g   m   a   t   h   /   l   i   b   /   g   e   n
3b6dbb   e   r   i   c   /   a   r   o   u   n   d   .   c  \0   _   _

Now if we look at that segment we see the load offset of 0x3a0000, which was used by DR, vs the file offset of 0x4e0000:

  LOAD           0x00000000004e0000 0x00000000003a0000 0x00000000003a0000
                 0x0000000000017098 0x0000000000017098  RW     0x10000

  [35] .dynstr           STRTAB           00000000003a0000  004e0000
       0000000000017097  0000000000000000   A       0     0     8

Using 0x4e0000 instead of 0x3a0000 we get to the right place:

$ od -t c -A x -j 0x4f6d8b libarmflang.so | head -4
4f6d8b   l   i   b   a   r   m   f   l   a   n   g   .   s   o  \0   X

@derekbruening
Contributor

I just saw this comment in os_add_new_app_module():

    /* Mapping in a new module.  From what we've observed of the loader's
     * behavior, it first maps the file in with size equal to the final
     * memory image size (I'm not sure how it gets that size without reading
     * in the elf header and then walking through all the program headers to
     * get the largest virtual offset).  This is necessary to reserve all the

If it's really not mapping the full file size the first time, then this option A is not going to be safe. I suppose the fact that we saw SIGBUS in the first place should corroborate this.

@derekbruening
Contributor

So for A we'd have to do our own mmap of the whole file (or read from disk).
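Reading from disk could be as small as the sketch below, which uses plain POSIX `open()`/`pread()` calls. `read_string_from_file` is a hypothetical helper name; DR itself would go through its own file I/O wrappers rather than libc, and `file_off` is assumed to have already been translated from the virtual address via the PT_LOAD entries.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not a DynamoRIO function: copies the NUL-terminated
 * string stored at file_off in the file at path into buf. Returns false on
 * I/O error, on a read at or past end-of-file, or if no terminator was
 * found within the bytes actually read. */
static bool
read_string_from_file(const char *path, uint64_t file_off, char *buf,
                      size_t buflen)
{
    assert(path != NULL && buf != NULL && buflen > 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return false;
    /* pread() does not move the file position and returns 0 at EOF,
     * so an out-of-range offset fails cleanly instead of faulting. */
    ssize_t got = pread(fd, buf, buflen, (off_t)file_off);
    close(fd);
    if (got <= 0)
        return false;
    /* Accept only a string whose terminator really came from the file. */
    return memchr(buf, '\0', (size_t)got) != NULL;
}
```

The trade-off against a private whole-file mmap is an extra system call per lookup in exchange for never taking a fault on a short or truncated file.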

@derekbruening
Contributor

> For B waiting until the end of os_module_area_init() to access soname after module_walk_program_headers() makes no difference either:

I think that is still at the first mmap? I don't think it's late enough: the segments have not been loaded yet.

@AssadHashmi
Contributor Author

> Are you sure? What is the file offset of the segment, in case it's different (maybe your version has multiple sections in that final segment)?

My bad, I was not using the p_offset correctly! Using the correct offset in the correct way, od shows the right string:

$ od -t c -A x -j 0x527adf /opt/arm/arm-hpc-compiler-19.1_Generic-AArch64_SUSE-12_aarch64-linux/lib/libarmflang.so | head -4
527adf   l   i   b   a   r   m   f   l   a   n   g   .   s   o  \0   G
527aef   L   I   B   C   _   2   .   1   7  \0   G   C   C   _   4   .
527aff   0   .   0  \0   G   C   C   _   3   .   0  \0   V   E   R   S
527b0f   I   O   N  \0   X   X   X   X   X   X   X   X   X   X   X   X

> If it's really not mapping the full file size the first time, then this option A is not going to be safe.

It looks like it is mapping the full size the first time. I will post a fix shortly.

The fix isn't going to be as simple as I thought because the PT_DYNAMIC header processing which sets dynstr currently happens before the PT_LOAD header processing which has the correct offset for the STRTAB. You highlighted the late occurrence of .dynstr which is an unusual feature of this library. Thanks.
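One way to picture the reordering is a two-phase walk: note DT_STRTAB and DT_SONAME while processing PT_DYNAMIC without touching the string table, then resolve the address only once every PT_LOAD header has been seen. The sketch below is an assumption about the shape of such a fix, not the actual patch; `pending_soname_t` and both functions are hypothetical names, and it uses the `Elf64_Dyn` definitions from the system `<elf.h>`.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, not a DynamoRIO type: values noted while walking
 * PT_DYNAMIC, to be resolved only after every PT_LOAD has been seen. */
typedef struct {
    bool have_strtab;
    bool have_soname;
    uint64_t strtab_vaddr; /* DT_STRTAB: virtual address of .dynstr */
    uint64_t soname_index; /* DT_SONAME: byte offset into .dynstr */
} pending_soname_t;

/* Phase 1: record the two tags. The string table itself is not read. */
static void
collect_soname_tags(const Elf64_Dyn *dyn, size_t max_entries,
                    pending_soname_t *out)
{
    assert(dyn != NULL && out != NULL);
    out->have_strtab = false;
    out->have_soname = false;
    out->strtab_vaddr = 0;
    out->soname_index = 0;
    for (size_t i = 0; i < max_entries && dyn[i].d_tag != DT_NULL; i++) {
        if (dyn[i].d_tag == DT_STRTAB) {
            out->strtab_vaddr = dyn[i].d_un.d_ptr;
            out->have_strtab = true;
        } else if (dyn[i].d_tag == DT_SONAME) {
            out->soname_index = dyn[i].d_un.d_val;
            out->have_soname = true;
        }
    }
}

/* Phase 2: compute the soname's virtual address (relative to the load
 * base). The caller would then translate it through the PT_LOAD entries. */
static bool
pending_soname_vaddr(const pending_soname_t *p, uint64_t *vaddr)
{
    assert(p != NULL && vaddr != NULL);
    if (!p->have_strtab || !p->have_soname)
        return false;
    *vaddr = p->strtab_vaddr + p->soname_index;
    return true;
}
```

Keeping phase 1 free of string accesses means the header walk order no longer matters; only phase 2 has to wait for the PT_LOAD information.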

@derekbruening
Contributor

PR #5947 has a partial solution delaying (re-trying actually) initialization until a later point: so along the lines of solution B). See also further analysis and discussion in the now-closed-as-duplicate issue #5946 which will continue here.

@apach301
Contributor

apach301 commented Apr 5, 2023

Update on PR #5947: I moved the dynamic_info initialization from the dr_get_proc_address routine to instrument_module_load_trigger, which is called just once for every module, just before instrumenting it with user callbacks.

AFAIK this issue with a late .dynstr section is very rare, so might late initialization alone be enough? This pattern is already used for handling the Android loader, and I re-used the same function in the PR.
