fix: add capability to glibc heap commands for bruteforcing the main_arena #932

Merged
merged 13 commits into from Mar 21, 2023

@theguy147 theguy147 commented Mar 12, 2023

Description/Motivation/Screenshots

Fixes #927 by adding the capability of bruteforcing the main_arena from the .data section of the glibc if all other methods (finding symbols and offset to __malloc_hook) fail.

EDIT:
Also, the PR fixes a regression in which heap chunks only ever displayed the chunks of the main_arena for every arena in the binary. To fix this I also had to fix and extend the GlibcHeapInfo struct. A test that makes sure each arena has distinct chunks is included.

Against which architecture was this tested?

  • x86-32
  • x86-64
  • ARM
  • AARCH64
  • MIPS
  • POWERPC
  • SPARC
  • RISC-V

Checklist

  • My PR was done against the dev branch, not main.
  • My code follows the code style of this project.
  • My change includes a change to the documentation, if required.
  • If my change adds new code, adequate tests have been added.
  • I have read and agree to the CONTRIBUTING document.

theguy147 commented Mar 12, 2023

FYI: This PR is still a work in progress, but because I haven't had much time to work on it lately, I wanted to put it out there so others can take a look.

On my amd64 machine two of the tests are still failing:

  • tests/commands/heap.py::HeapCommand::test_cmd_heap_bins_tcache
  • tests/commands/heap.py::HeapCommand::test_cmd_heap_bins_tcache_all

I also have not yet gotten around to testing it on different architectures.

Also I have not yet added tests for the PR:

  • e.g. for updated heap API

EDIT:
I believe that the failing tcache tests on my machine are not related to any changes from this PR but due to another change in current glibc versions. As these tests do not fail on the CI machines this should probably be addressed in a separate PR.

theguy147 commented Mar 14, 2023

The bruteforce method explained in the linked issue assumes an alignment of 0x20 for the main_arena symbol in the .data section of the glibc. Unfortunately, on a qemu aarch64 machine the actual alignment was 0x8, which quadruples the search space and thereby the execution time. But at least it does still work.

For testing I used qemu-system-aarch64 with a Debian 11 Bullseye image (8 cores, 4G memory). Finding the main_arena by bruteforce took approx 7 seconds on the test system.

Currently, I check whether there are multiple possible candidates for the main_arena. If the search loop instead exited as soon as the first candidate is found, the execution time could be greatly improved (but false positives might happen and might not be easy to debug when they do).

Using the early exit approach, finding the main_arena on the aarch64 test system took approx 3 seconds.
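
To illustrate the trade-off described above, here is a minimal sketch of an aligned scan with an optional early exit. It is not the code from this PR: `scan_candidates`, `candidate_count`, `MARKER` and the toy validity check `looks_like_marker` are hypothetical names invented for the example, and a real check would inspect the fields of a candidate arena structure instead of comparing against a marker.

```python
from typing import Callable, List

# Hypothetical stand-in for "this looks like a valid main_arena".
MARKER = b"ARENA!!!"


def looks_like_marker(data: bytes, offset: int) -> bool:
    """Toy validity check: the 8 bytes at `offset` equal MARKER."""
    return data[offset:offset + len(MARKER)] == MARKER


def scan_candidates(
    data: bytes,
    alignment: int,
    is_candidate: Callable[[bytes, int], bool],
    early_exit: bool = True,
) -> List[int]:
    """Return the aligned offsets in `data` accepted by `is_candidate`.

    With early_exit=True the scan stops at the first hit, which is faster
    but cannot detect that several candidates exist.
    """
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    hits: List[int] = []
    for offset in range(0, len(data), alignment):
        if is_candidate(data, offset):
            hits.append(offset)
            if early_exit:
                break
    return hits


def candidate_count(section_size: int, alignment: int) -> int:
    """Number of aligned offsets in [0, section_size)."""
    return (section_size + alignment - 1) // alignment
```

For a section of 0x1000 bytes, `candidate_count` gives 128 offsets at an alignment of 0x20 and 512 at 0x8, which is the factor of four mentioned above. A structure placed at an offset such as 0x28 is only found with the finer alignment.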

EDIT:
It is worth noting that bruteforcing the main_arena is only necessary the first time a heap command is executed for the currently debugged binary. It therefore does not heavily affect the user experience, and it at least makes it possible to use the heap commands at all.

Also, the bruteforcing is only used in case the main_arena symbol is missing and the hack of finding it through an offset from __malloc_hook is not working.

@theguy147

I now added two new configuration options:

  • gef.bruteforce_main_arena: users need to explicitly opt in to the bruteforcing, in order to prevent the confusion that false positives might create
  • gef.main_arena_offset: if the user knows the offset of the main_arena from the libc base address, it can be set here, so that no bruteforcing is necessary and the user has better control. This is especially useful when bruteforcing takes longer than acceptable. In general the offset should not change between debugging sessions as long as the system stays the same and no glibc updates have been installed.
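
As a rough sketch of how the two options could fit into the lookup, the following shows one possible fallback chain. The function `resolve_main_arena` and its callable parameters are hypothetical, and the ordering (configured offset first, then the symbol, then the `__malloc_hook` offset, then the opt-in bruteforce) is an assumption made for the example, not a statement about the merged code.

```python
from typing import Callable, Optional

Lookup = Callable[[], Optional[int]]


def resolve_main_arena(
    libc_base: int,
    from_symbol: Lookup,
    from_malloc_hook: Lookup,
    bruteforce: Lookup,
    main_arena_offset: int = 0,
    bruteforce_main_arena: bool = False,
) -> Optional[int]:
    """Return the main_arena address, or None if every method fails."""
    # A user-supplied offset (assumed to mean "unset" when 0) wins outright.
    if main_arena_offset:
        return libc_base + main_arena_offset

    # Cheap methods: the symbol itself, then the offset from __malloc_hook.
    for method in (from_symbol, from_malloc_hook):
        address = method()
        if address is not None:
            return address

    # The slow and possibly ambiguous method only runs when opted in.
    if bruteforce_main_arena:
        return bruteforce()
    return None
```

Keeping the bruteforce behind an explicit flag means a stripped glibc produces a clear "not found" result by default instead of a silently wrong arena.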

I also changed the algorithm to exit on the first main_arena candidate for now - but I can easily revert that change if other people have other opinions about this.

@theguy147

I am not quite sure how to write additional tests for these changes, because our CI's glibc versions are older than 2.34 and so the whole bruteforce path is never needed there. Any suggestions?

@theguy147 theguy147 marked this pull request as ready for review March 14, 2023 11:39
@theguy147 theguy147 requested a review from hugsy March 14, 2023 11:40
@theguy147

For testing on powerpc (ppc64el) I used qemu-system-ppc64 with a Debian 11 Bullseye image (4G memory).
The commands affected by this PR work fine, but there are some oddities: for heap-multiple-heaps.out, a few chunks (3) in the non-main arena are not parsed correctly in my test. gef thinks the "non-main-arena" flag is not set for them even though it is.
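
For context, glibc stores the chunk flags in the three low bits of the chunk's size field. The sketch below decodes them; the bit values are the ones glibc's malloc defines (PREV_INUSE, IS_MMAPPED, NON_MAIN_ARENA), while the helper `decode_size_field` is a hypothetical name and not part of gef.

```python
from typing import Dict, Union

# Flag bits kept in the low bits of a malloc chunk's size field.
PREV_INUSE = 0x1
IS_MMAPPED = 0x2
NON_MAIN_ARENA = 0x4
SIZE_BITS = PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA


def decode_size_field(raw_size: int) -> Dict[str, Union[int, bool]]:
    """Split a raw size field into the chunk size and its three flags."""
    return {
        "size": raw_size & ~SIZE_BITS,
        "prev_inuse": bool(raw_size & PREV_INUSE),
        "is_mmapped": bool(raw_size & IS_MMAPPED),
        "non_main_arena": bool(raw_size & NON_MAIN_ARENA),
    }
```

A raw size of 0x25, for example, decodes to a 0x20-byte chunk with PREV_INUSE and NON_MAIN_ARENA set, so comparing the raw value read from memory against the flags gef reports is one way to narrow down where the parsing goes wrong.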

Here is a list of unrelated tests that failed on ppc64:

  • tests/commands/format_string_helper.py::FormatStringHelperCommand::test_cmd_format_string_helper
  • tests/commands/heap_analysis.py::HeapAnalysisCommand::test_cmd_heap_analysis
  • tests/commands/pcustom.py::PcustomCommand::test_cmd_pcustom_show
  • tests/config/__init__.py::TestGefConfigUnit::test_config_show_opcodes_size
  • tests/functions/elf_sections.py::ElfSectionGdbFunction::test_func_got


@hugsy hugsy left a comment

some minor stuff to change (from me and graz) but I think we can roll it

gef.py Outdated
if self.ar_ptr - self.address < 0x60:
    # special case: first heap of non-main-arena
    arena = GlibcArena(f"*{self.ar_ptr:#x}")
    return arena.heap_addr()
type

Suggested change
-    return arena.heap_addr()
+    return arena.heap_addr() or 0
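
One plausible reading of the one-word "type" comment is that `heap_addr()` can return `None` while the surrounding method is expected to return an `int`. That is an assumption, since the signatures are not shown here. Under that assumption, the `or 0` idiom normalizes the result, as this standalone sketch with a hypothetical helper shows:

```python
from typing import Optional


def heap_addr_or_zero(heap_addr: Optional[int]) -> int:
    """Collapse a missing address (None) to 0 so callers always get an int."""
    # `or` returns the right operand for any falsy left operand,
    # so both None and 0 map to 0 here.
    return heap_addr or 0
```

The caller then has to treat 0 as "no heap address found", which works as long as 0 is never a valid heap address.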

@theguy147

I have addressed all suggestions, apart from the caching of find_libc_version and the explanation of the hardcoded value of 0x60 (see my comments above). When you have time, take a look at the changes and see whether the threads can be resolved.

@hugsy hugsy merged commit 0cf291d into hugsy:dev Mar 21, 2023
3 checks passed
Development

Successfully merging this pull request may close these issues.

heap commands do not currently work for stripped glibc versions >= 2.34
3 participants