Emulate PLT for better symbols #899

Merged
merged 2 commits into from
Feb 15, 2017

Conversation

zachriggle
Member

@zachriggle zachriggle commented Feb 14, 2017

As pointed out by #886, it is not possible to have accurate PLT symbols for PIE/RELRO binaries without parsing or emulating the instructions in the PLT.

Since Unicorn Engine exists, just use that.
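To make the problem concrete, here is a minimal Python 3 sketch of the "parsing" alternative for one common x86-64 stub shape, `jmp qword ptr [rip+rel32]`. It is not code from this PR. The function names and the `stubs` / `got_names` dictionaries are hypothetical, and real PLTs come in other layouts and architectures that this does not handle, which is the argument for emulating instead.

```python
import struct


def decode_x86_64_plt_stub(stub, stub_addr):
    """Return the GOT slot read by `jmp qword ptr [rip+rel32]`, or None."""
    if len(stub) < 6 or stub[:2] != b"\xff\x25":
        return None  # some other stub layout; this sketch does not handle it
    (rel32,) = struct.unpack_from("<i", stub, 2)
    # RIP-relative operands are relative to the end of the 6-byte instruction.
    return stub_addr + 6 + rel32


def name_plt_entries(stubs, got_names):
    """Map symbol names to PLT addresses.

    stubs:     {plt_address: stub_bytes}   (hypothetical input)
    got_names: {got_slot_address: name}    (hypothetical input)
    """
    plt = {}
    for addr, code in stubs.items():
        slot = decode_x86_64_plt_stub(code, addr)
        if slot in got_names:
            plt[got_names[slot]] = addr
    return plt
```

A decoder like this would be needed per architecture and per stub variant, whereas emulation only needs to observe which address the stub reads.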

@zachriggle zachriggle added this to the Someday milestone Feb 14, 2017
@zachriggle zachriggle self-assigned this Feb 14, 2017
@zachriggle zachriggle changed the title Parse all relocation sections for PLT symbols Emulate all relocation sections for PLT symbols Feb 15, 2017
@zachriggle
Member Author

zachriggle commented Feb 15, 2017

@nneonneo Can you take a look at this and see if it meets your scenario now? Instead of a hand-rolled parser for each architecture, we just emulate a few instructions with @aquynh's Unicorn Engine.

@idolf Are you okay with adding another dependency? Unicorn is pip-installable, so it shouldn't be a problem. We end up emulating a very small number of instructions, so the performance hit should be unnoticeable.

There is currently some debugging code left in which prints out:

  1. Each emulated instruction, and the memory address it ends up accessing
  2. The in-memory-order PLT addresses

To see the debug output, add DEBUG to the command line (e.g. checksec binary.elf DEBUG).

To assuage any concerns of running untrusted code, the emulator is only given access to the PLT, stops after 4 instructions, stops on any memory accesses, and doesn't have e.g. a stack allocated, or any syscall handlers installed. That said, Unicorn Engine used to be buggy af.
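The restrictions described above can be sketched roughly as follows with Unicorn's Python bindings. This is a simplified illustration for x86-64 only, not the code in this PR: the function name `first_memory_access` and the `max_insns` parameter are made up for the example, and it assumes the `unicorn` package is installed.

```python
def first_memory_access(code, address, max_insns=4):
    """Emulate a few x86-64 instructions; return the first address read, or None."""
    # Imported lazily so that loading this module does not require Unicorn.
    from unicorn import (Uc, UcError, UC_ARCH_X86, UC_MODE_64,
                         UC_HOOK_MEM_READ, UC_HOOK_MEM_READ_UNMAPPED)

    uc = Uc(UC_ARCH_X86, UC_MODE_64)

    # Map only the page(s) holding the stub: no stack, no GOT, nothing else.
    page = address & ~0xFFF
    size = ((address + len(code) + 0xFFF) & ~0xFFF) - page
    uc.mem_map(page, size)
    uc.mem_write(address, code)

    accessed = []

    def on_read(uc, access, addr, size, value, user_data):
        # Record the address and stop at the first data read of any kind.
        accessed.append(addr)
        uc.emu_stop()
        return False  # do not try to recover from an unmapped read

    uc.hook_add(UC_HOOK_MEM_READ | UC_HOOK_MEM_READ_UNMAPPED, on_read)

    try:
        # `count` caps the number of emulated instructions.
        uc.emu_start(address, address + len(code), count=max_insns)
    except UcError:
        pass  # expected: the read hit unmapped memory and was refused

    return accessed[0] if accessed else None


# Example stub: jmp qword ptr [rip+0x2ffa]
STUB = bytes.fromhex("ff25fa2f0000")
```

Calling `first_memory_access(STUB, 0x1000)` should report the GOT slot the stub dereferences, which can then be matched against the relocation entries to name the PLT entry.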

@zachriggle zachriggle changed the title Emulate all relocation sections for PLT symbols Emulate PLT for better symbols Feb 15, 2017
@zachriggle
Member Author

Also @psifertex this adds an explicit GPL dependency (Unicorn Engine). The problem really isn't solvable for RELRO, PIE, Intel binaries without emulating the instructions (or parsing them manually). Capstone Engine (the disassembler) is BSD 3-Clause and could replace this code, but only with a lot of work that I don't want to do. That approach might be more accurate, but it would also be less flexible.

This adds a dependency on Unicorn
@nneonneo

Unicorn seems like a really heavyweight dependency to add. Have you profiled the ELF parsing speed before and after this change? What about for a program with a lot of PLT entries (like a C++ binary)?

@zachriggle
Member Author

zachriggle commented Feb 15, 2017 via email

@zachriggle
Member Author

zachriggle commented Feb 15, 2017

A quick test on my laptop shows that this is significantly slower on e.g. Google Chrome (110MB binary after extraction) than the current dev branch, but on par with #886.

| branch | time  |
|--------|-------|
| dev    | 1.3s  |
| #899   | 27.7s |
| #886   | 27.5s |

It seems that most of the time is spent parsing relocations for the GOT, not actually emulating any of the code. If the GOT parsing is short-circuited to add a single fake symbol and exit immediately, parse-time of this PR (#899) goes down to ~2.3 seconds. Adding a single second of parse-time for use of an emulator on a 110MB binary seems reasonable.

The GOT slowdown is because we parse all of the relocation sections, rather than just the one which corresponds to the .plt. The code on dev misses GOT entries which correspond to data-only use cases (e.g. &srand).

A caching layer might speed up follow-up parsing time. We already do this for e.g. the pwnlib.rop module for gadget caching.
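As a rough idea of what such a caching layer could look like, here is a small Python 3 sketch that keys the cache on a hash of the binary's contents. It is an assumption-laden illustration, not how pwnlib.rop caches gadgets: `cached_symbols`, the `parse` callback, and the JSON-on-disk layout are all hypothetical.

```python
import hashlib
import json
import os


def cached_symbols(binary_path, parse, cache_dir):
    """Return parse(binary_path), reusing a cached result for identical files.

    parse must return something JSON-serializable, e.g. {name: address}.
    """
    # Key on file contents so a rebuilt binary never gets a stale result.
    with open(binary_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    cache_path = os.path.join(cache_dir, digest + ".json")

    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)

    symbols = parse(binary_path)  # the slow path: full relocation parsing
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "w") as f:
        json.dump(symbols, f)
    return symbols
```

Hashing a file of this size has its own cost, so this only pays off when the parse itself is much slower than reading the file once.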

@zachriggle
Member Author

zachriggle commented Feb 15, 2017

Looks like we can greatly cut down GOT parsing time by checking symbol indices for validity before attempting to resolve them. That brings total parse time down to ~8.5s for 120MB Chrome; most of that is still spent looking up GOT symbol names.

This seems pretty reasonable, IMHO. I'm not sure there are any additional performance improvements that can be made to the GOT, without writing custom STRTAB-extraction routines, which seems excessive.
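The index check mentioned above might look something like this Python 3 sketch over raw little-endian `Elf64_Rela` entries. It is only an illustration of the idea, not the code in this PR: `got_symbols` and the `symbol_names` list (standing in for the comparatively expensive symbol-table lookup) are hypothetical.

```python
import struct

# Elf64_Rela: r_offset (u64), r_info (u64), r_addend (s64), little-endian.
RELA64 = struct.Struct("<QQq")


def got_symbols(rela_bytes, symbol_names):
    """Map relocation target addresses to symbol names, skipping invalid indices."""
    got = {}
    for r_offset, r_info, _addend in RELA64.iter_unpack(rela_bytes):
        sym_index = r_info >> 32  # ELF64_R_SYM
        # Cheap validity check first: index 0 is the undefined symbol, and
        # anything past the end of the table cannot be resolved at all.
        if sym_index == 0 or sym_index >= len(symbol_names):
            continue
        got[r_offset] = symbol_names[sym_index]  # stand-in for the slow lookup
    return got
```

Relocations such as `R_X86_64_RELATIVE` carry symbol index 0, so in a large PIE binary this filter can discard most entries before any name lookup happens.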

@nneonneo

nneonneo commented Feb 15, 2017 via email

@zachriggle
Member Author

Got an OK from Idolf. Merging in its current state; we can make performance improvements later.

@zachriggle zachriggle merged commit a786a58 into Gallopsled:dev Feb 15, 2017
@aquynh

aquynh commented Feb 15, 2017 via email

@nneonneo

@zachriggle: I tested this, it doesn't work for any big-endian ARM architecture:

test-aarch64-big
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/checksec", line 11, in <module>
    load_entry_point('pwntools==3.6.0.dev0', 'console_scripts', 'checksec')()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/commandline/common.py", line 32, in main
    pwnlib.commandline.main.main()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/commandline/main.py", line 47, in main
    commands[args.command](args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/commandline/checksec.py", line 37, in main
    e = ELF(f.name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/elf/elf.py", line 290, in __init__
    self._populate_plt()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/elf/elf.py", line 720, in _populate_plt
    got_addrs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/elf/plt.py", line 53, in emulate_plt_instructions
    uc = U.Uc(arch, mode)
  File "build/bdist.macosx-10.6-intel/egg/unicorn/unicorn.py", line 255, in __init__
unicorn.unicorn.UcError: Invalid mode (UC_ERR_MODE)

@nneonneo

It also fails to locate a canary in test-{arm|thumb}{-relro|}-pie, which means it isn't parsing PLT properly in those cases.

@TethysSvensson TethysSvensson modified the milestones: 3.6.0, Someday Feb 16, 2017
@zachriggle
Member Author

Hmm, it might be the last-minute change to the PR to assume PLT entries are 8-byte aligned.
I'll check the big-endian issue, too.
