WIP: More failing AArch32 tests #48

Closed
wants to merge 5 commits into from

langston-barrett
Contributor

Some very exciting error messages!

RefurbishTests
  Rewriting LayoutStrategy {allocator = Parallel, grouping = BlockGrouping, trampolines = AlwaysTrampoline}
    tests/arm/test-indirect-calls.opt.nostdlib.aarch32.exe:   FAIL
      Exception: Empty block at address ConcreteAddress 0x10138
    tests/arm/linked-list.opt.nostdlib.aarch32.exe:           FAIL
      Exception: RewrittenTextSectionSizeMismatch 504 4260352
    tests/arm/test-reporting.opt.stdlib.aarch32.exe:          FAIL
      Exception: user error (No unallocated virtual address space within jumping range of the text section is available for use as a new extratext section.)
    tests/arm/linked-list.noopt.nostdlib.aarch32.exe:         FAIL
      Exception: RewrittenTextSectionSizeMismatch 596 4260444
    tests/arm/test-direct-calls.opt.nostdlib.aarch32.exe:     OK (1.48s)
    tests/arm/test-just-exit.noopt.nostdlib.aarch32.exe:      OK (1.43s)
    tests/arm/test-just-exit.opt.nostdlib.aarch32.exe:        OK (1.44s)
    tests/arm/test-direct-calls.noopt.nostdlib.aarch32.exe:   OK (1.54s)
    tests/arm/test-conditional.opt.nostdlib.aarch32.exe:      OK (1.44s)
    tests/arm/test-reporting.noopt.stdlib.aarch32.exe:        FAIL
      Exception: user error (No unallocated virtual address space within jumping range of the text section is available for use as a new extratext section.)
    tests/arm/test-conditional.noopt.nostdlib.aarch32.exe:    OK (1.52s)
    tests/arm/test-indirect-calls.noopt.nostdlib.aarch32.exe: OK (1.55s)
  Rewriting LayoutStrategy {allocator = Parallel, grouping = BlockGrouping, trampolines = WholeFunctionTrampoline}
    tests/arm/test-indirect-calls.opt.nostdlib.aarch32.exe:   FAIL
      Exception: Empty block at address ConcreteAddress 0x10138
    tests/arm/linked-list.opt.nostdlib.aarch32.exe:           FAIL
      Exception: RewrittenTextSectionSizeMismatch 504 4260352
    tests/arm/test-reporting.opt.stdlib.aarch32.exe:          FAIL
      Exception: user error (No unallocated virtual address space within jumping range of the text section is available for use as a new extratext section.)
    tests/arm/linked-list.noopt.nostdlib.aarch32.exe:         FAIL
      Exception: RewrittenTextSectionSizeMismatch 596 4260444
    tests/arm/test-direct-calls.opt.nostdlib.aarch32.exe:     OK (1.45s)
    tests/arm/test-just-exit.noopt.nostdlib.aarch32.exe:      OK (1.50s)
    tests/arm/test-just-exit.opt.nostdlib.aarch32.exe:        OK (1.40s)
    tests/arm/test-direct-calls.noopt.nostdlib.aarch32.exe:   OK (1.50s)
    tests/arm/test-conditional.opt.nostdlib.aarch32.exe:      OK (1.38s)
...skipping...
      Exception: RewrittenTextSectionSizeMismatch 596 4260444
    tests/arm/test-direct-calls.opt.nostdlib.aarch32.exe:     OK (1.48s)
    tests/arm/test-just-exit.noopt.nostdlib.aarch32.exe:      OK (1.41s)
    tests/arm/test-just-exit.opt.nostdlib.aarch32.exe:        OK (1.42s)
    tests/arm/test-direct-calls.noopt.nostdlib.aarch32.exe:   OK (1.47s)
    tests/arm/test-conditional.opt.nostdlib.aarch32.exe:      OK (1.43s)
    tests/arm/test-reporting.noopt.stdlib.aarch32.exe:        FAIL
      Exception: user error (No unallocated virtual address space within jumping range of the text section is available for use as a new extratext section.)
    tests/arm/test-conditional.noopt.nostdlib.aarch32.exe:    OK (1.52s)
    tests/arm/test-indirect-calls.noopt.nostdlib.aarch32.exe: OK (1.55s)
  Rewriting LayoutStrategy {allocator = Compact SortedOrder, grouping = FunctionGrouping, trampolines = WholeFunctionTrampoline}
    tests/arm/test-indirect-calls.opt.nostdlib.aarch32.exe:   FAIL
      Exception: Empty block at address ConcreteAddress 0x10138
    tests/arm/linked-list.opt.nostdlib.aarch32.exe:           FAIL
      Exception: RewrittenTextSectionSizeMismatch 504 4260352
    tests/arm/test-reporting.opt.stdlib.aarch32.exe:          FAIL
      Exception: user error (No unallocated virtual address space within jumping range of the text section is available for use as a new extratext section.)
    tests/arm/linked-list.noopt.nostdlib.aarch32.exe:         FAIL
      Exception: RewrittenTextSectionSizeMismatch 596 4260444
    tests/arm/test-direct-calls.opt.nostdlib.aarch32.exe:     OK (1.47s)
    tests/arm/test-just-exit.noopt.nostdlib.aarch32.exe:      OK (1.43s)
    tests/arm/test-just-exit.opt.nostdlib.aarch32.exe:        OK (1.48s)
    tests/arm/test-direct-calls.noopt.nostdlib.aarch32.exe:   OK (1.54s)
    tests/arm/test-conditional.opt.nostdlib.aarch32.exe:      OK (1.44s)
    tests/arm/test-reporting.noopt.stdlib.aarch32.exe:        FAIL
      Exception: user error (No unallocated virtual address space within jumping range of the text section is available for use as a new extratext section.)
    tests/arm/test-conditional.noopt.nostdlib.aarch32.exe:    OK (1.44s)
    tests/arm/test-indirect-calls.noopt.nostdlib.aarch32.exe: OK (1.54s)
  Injecting LayoutStrategy {allocator = Parallel, grouping = BlockGrouping, trampolines = AlwaysTrampoline}
    tests/injection-base/injection-base.ppc64.exe:            OK (1.48s)
  Injecting LayoutStrategy {allocator = Compact SortedOrder, grouping = BlockGrouping, trampolines = AlwaysTrampoline}
    tests/injection-base/injection-base.ppc64.exe:            OK (1.45s)

25 out of 62 tests failed (57.66s)
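
One pattern worth noting in the RewrittenTextSectionSizeMismatch failures: the two numbers reported differ by the same amount for both linked-list binaries. A quick Python check follows; the log does not say what each number represents, so the names `small` and `large` are placeholders rather than the exception's real field names.

```python
# Pairs of numbers copied from the RewrittenTextSectionSizeMismatch
# exceptions in the log above. Which one is "expected" and which is
# "actual" is not stated in the log, so no meaning is assumed here.
mismatches = [(504, 4260352), (596, 4260444)]

for small, large in mismatches:
    delta = large - small
    print(f"{large} - {small} = {delta} ({delta:#x})")
```

Both lines print a difference of 4259848 (0x410008), which hints that the larger value may include a fixed offset rather than an unrelated size.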

@langston-barrett langston-barrett self-assigned this Jun 10, 2020
@travitch
Contributor

I'll take a look. I'm pretty sure I know what is going on with the address space issue. The empty block one is really interesting and might actually be possible...

This range is much larger now that we implemented long jumps by storing pointers
in the instruction stream.

For the x86 and PowerPC backends this didn't matter - we constructed the partial
blocks and then threw them away since we couldn't rewrite them.  This became a
problem with the AArch32 backend, which actually inspects block terminators and
panics when it can't figure out how to interpret failures.

This commit just changes the logic to avoid even trying to translate incomplete
blocks, which was useless work anyway.

For some reason, the mac build is using a version of the base package from ghc 8.10...
@langston-barrett
Contributor Author

langston-barrett commented Jun 12, 2020

There's still something going on with the biggest binary in the bunch:

RefurbishTests
  Rewriting LayoutStrategy {allocator = Parallel, grouping = BlockGrouping, trampolines = AlwaysTrampoline}
-- snip
    tests/binaries/test-reporting.opt.stdlib.aarch32.exe:          FAIL (577.38s)
      tests/Identity.hs:24:
      Exit code
      expected: ExitFailure 3
       but got: ExitFailure 139

Interestingly, it's failing before it gets to user code:

      0x10908 <__renovate___libc_setup_tls+56> add    r3,  r3,  #32
      0x1090c <__renovate___libc_setup_tls+60> cmp    r3,  r1
      0x10910 <__renovate___libc_setup_tls+64> bcs    0x1097c <__renovate___libc_setup_tls+172>
 →    0x10914 <__renovate___libc_setup_tls+68> ldr    r2,  [r3]
      0x10918 <__renovate___libc_setup_tls+72> cmp    r2,  #7
      0x1091c <__renovate___libc_setup_tls+76> bne    0x10908 <__renovate___libc_setup_tls+56>
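
The loop above advances r3 by 32 bytes and compares the loaded word against 7. 32 is the size of an Elf32_Phdr and 7 is PT_TLS, so this looks like the program-header scan in __libc_setup_tls. Exit code 139 is 128 + 11, i.e. SIGSEGV, so the fault on the `ldr` may mean the program-header pointer being walked is bad. A minimal Python sketch of the same scan, assuming a little-endian 32-bit table; `find_tls_phdr` and `make_phdr` are hypothetical helpers, not glibc or renovate functions:

```python
import struct

PT_LOAD, PT_NOTE, PT_TLS = 1, 4, 7
PT_ARM_EXIDX = 0x70000001   # PT_LOPROC + 1
PHDR_SIZE = 32              # sizeof(Elf32_Phdr)


def find_tls_phdr(table: bytes, count: int):
    """Return the index of the first PT_TLS entry, or None if absent."""
    for i in range(count):
        # p_type is the first 32-bit field of each Elf32_Phdr.
        (p_type,) = struct.unpack_from("<I", table, i * PHDR_SIZE)
        if p_type == PT_TLS:
            return i
    return None


def make_phdr(p_type: int) -> bytes:
    # Only p_type is filled in; the other seven 32-bit fields are zero.
    return struct.pack("<8I", p_type, 0, 0, 0, 0, 0, 0, 0)


# Same ordering of types as the first five headers of the original binary.
types = [PT_ARM_EXIDX, PT_LOAD, PT_LOAD, PT_NOTE, PT_TLS]
table = b"".join(make_phdr(t) for t in types)
print(find_tls_phdr(table, len(types)))
```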

A diff of the functions shows almost nothing:

 diff ./files/brittle/tls-*.txt
40c40
<    10968:       eb013713        bl      5e5bc <__renovate___udivsi3>
---
>    10968:       eb013713        bl      5e5bc <__udivsi3>
65c65
<    109cc:       eb006436        bl      29aac <__renovate___sbrk>
---
>    109cc:       eb006436        bl      29aac <__sbrk>
82c82
<    10a10:       eb005a32        bl      272e0 <__renovate_memcpy>
---
>    10a10:       eb005a32        bl      272e0 <memcpy>
113c113
<    10a8c:       eb0136ca        bl      5e5bc <__renovate___udivsi3>
---
>    10a8c:       eb0136ca        bl      5e5bc <__udivsi3>
149,161c149,161

Here are the binaries in question:
reporting.zip

Here's the source for the function in question: https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/csu/libc-tls.c#L106

@langston-barrett
Contributor Author

langston-barrett commented Jun 12, 2020

@travitch said:

I've seen some issues like that in the x86 backend due to TLS initialization being dependent on the order of sections or segments in the binary, with some orderings causing TLS to not be initialized.
I started adding printk calls in the kernel ELF loader until I figured out what was wrong, and then re-ordered segments until it went away, which really sucked.
It would be really nice if we could figure out how to disable TLS for now... I'm not sure how practical that is.
The real bummer would be if there were different requirements between x86 and ARM such that there was no single layout that worked.

Here's readelf -a on the original binary:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  EXIDX          0x0660b8 0x000760b8 0x000760b8 0x005a0 0x005a0 R   0x4
  LOAD           0x000000 0x00010000 0x00010000 0x6665c 0x6665c R E 0x10000
  LOAD           0x066b7c 0x00086b7c 0x00086b7c 0x01364 0x02374 RW  0x10000
  NOTE           0x000114 0x00010114 0x00010114 0x00044 0x00044 R   0x4
  TLS            0x066b7c 0x00086b7c 0x00086b7c 0x00010 0x00030 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  GNU_RELRO      0x066b7c 0x00086b7c 0x00086b7c 0x00484 0x00484 R   0x1

and on the rewritten binary:

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x0a4000 0x00900000 0x00900000 0x00160 0x00160 R   0x1000
  LOAD           0x0a4000 0x00900000 0x00900000 0x00160 0x00160 R   0x1000
  EXIDX          0x0660b8 0x000760b8 0x000760b8 0x005a0 0x005a0 R   0x4
  LOAD           0x000000 0x00010000 0x00010000 0x6665c 0x6665c R E 0x10000
  LOAD           0x066b7c 0x00086b7c 0x00086b7c 0x02384 0x02384 RW  0x10000
  NOTE           0x000114 0x00010114 0x00010114 0x00044 0x00044 R   0x4
  TLS            0x066b7c 0x00086b7c 0x00086b7c 0x00010 0x00030 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x8
  GNU_RELRO      0x066b7c 0x00086b7c 0x00086b7c 0x00484 0x00484 R   0x1
  LOAD           0x07f000 0x00089000 0x00089000 0x03f6c 0x03f6c R E 0x1000
  LOAD           0x099000 0x00030000 0x00030000 0x01000 0x01000 RW  0x1000

@travitch
Contributor

After a bit of digging, it looks like the failure occurs because the code in __libc_setup_tls walks through the first loadable segment to gather the metadata it needs to set up TLS. In the original binary, that is EXIDX. In the rewritten binary, it is PHDR (we had to ensure PHDR came first on x86 for the same reason). ARM does not appear to use PHDR as such; EXIDX seems to be the equivalent. We'll have to adapt the ELF rewriter to ensure that EXIDX is first on ARM. We probably don't need PHDR on ARM at all: the original binary in this case does not have it, and I suspect the two serve similar roles.

@travitch
Contributor

Update: the EXIDX segment contains exception unwind tables. It is described in this ABI document. That document says that the other ARM-specific segment, PT_ARM_ARCHEXT, must come before any PT_LOAD segment. It isn't clear whether the PT_ARM_EXIDX segment must also come before PT_LOAD, though that would be consistent with what we observed.
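
To make the ordering difference concrete, here is a small Python sketch that checks whether a given segment type appears before the first LOAD entry. The type lists are copied from the two readelf dumps above; `precedes_first_load` is a hypothetical helper, and the rule it tests for EXIDX is only the hypothesis discussed here, not something the ABI document is confirmed to require.

```python
def precedes_first_load(types, wanted):
    """True if `wanted` appears before the first LOAD entry in `types`."""
    first_load = types.index("LOAD")
    return wanted in types[:first_load]


# Program header order from readelf on the original binary.
original = ["EXIDX", "LOAD", "LOAD", "NOTE", "TLS", "GNU_STACK", "GNU_RELRO"]

# Program header order from readelf on the rewritten binary.
rewritten = ["PHDR", "LOAD", "EXIDX", "LOAD", "LOAD", "NOTE", "TLS",
             "GNU_STACK", "GNU_RELRO", "LOAD", "LOAD"]

for name, types in [("original", original), ("rewritten", rewritten)]:
    print(name, precedes_first_load(types, "EXIDX"))
```

The original binary satisfies the check and the rewritten one does not, since a LOAD entry now sits between PHDR and EXIDX.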

@langston-barrett
Contributor Author

Merged in #55

@langston-barrett langston-barrett deleted the lb/aarch-32-tests-more branch July 21, 2020 02:43