Emulate PLT for better symbols #899

zachriggle · 2017-02-14T04:09:12Z

As pointed out by #886, it is not possible to have accurate PLT symbols for PIE/RELRO binaries without parsing or emulating the instructions in the PLT.

Since Unicorn Engine exists, just use that.

zachriggle · 2017-02-15T12:05:36Z

@nneonneo Can you take a look at this and see if it meets your scenario now? Instead of a hand-rolled parser for each architecture, we just emulate a few instructions with @aquynh's Unicorn Engine.

@idolf Are you okay with adding another dependency? Unicorn is pip install-able, so it shouldn't be a problem. We end up emulating a very small number of instructions, so the performance hit should be unnoticeable.

There is currently some debugging code left in which prints out:

- Each emulated instruction, and the memory address it ends up accessing
- The in-memory-order PLT addresses

To see the debug code, add DEBUG to the command line (e.g. checksec binary.elf DEBUG).

To assuage any concerns of running untrusted code, the emulator is only given access to the PLT, stops after 4 instructions, stops on any memory accesses, and doesn't have e.g. a stack allocated, or any syscall handlers installed. That said, Unicorn Engine used to be buggy af.
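A minimal sketch of the sandboxing described above, not the code in this PR: only the PLT bytes are mapped, emulation is capped at a few instructions, and the first read of unmapped memory stops the run and is reported. The function name `plt_got_target` and its parameters are made up for illustration, and x86-64 is assumed.

```python
def plt_got_target(plt_bytes, plt_base, entry, max_insns=4):
    """Emulate up to max_insns instructions starting at `entry`.

    Returns the first unmapped address the code tries to read
    (for a PLT stub, the GOT slot), or None if no such read happens.
    """
    # Deferred import, so loading this module stays cheap.
    import unicorn as U

    page = 0x1000
    map_base = plt_base & ~(page - 1)
    map_end = (plt_base + len(plt_bytes) + page - 1) & ~(page - 1)

    uc = U.Uc(U.UC_ARCH_X86, U.UC_MODE_64)   # assumption: x86-64 only
    uc.mem_map(map_base, map_end - map_base)  # nothing but the PLT is mapped
    uc.mem_write(plt_base, plt_bytes)

    hits = []

    def on_unmapped_read(uc, access, address, size, value, user_data):
        hits.append(address)
        return False  # do not recover; emulation stops with an error

    uc.hook_add(U.UC_HOOK_MEM_READ_UNMAPPED, on_unmapped_read)

    try:
        # No stack, no syscall hooks; stop after max_insns instructions.
        uc.emu_start(entry, plt_base + len(plt_bytes), count=max_insns)
    except U.UcError:
        pass  # expected when the GOT read faults, or on any other fault

    return hits[0] if hits else None
```

Given a typical x86-64 stub at `0x400410` that begins with `jmp [rip+0x200c02]`, the RIP-relative target is `0x400416 + 0x200c02 = 0x601018`, so that is the GOT slot the sketch should report.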

zachriggle · 2017-02-15T12:11:59Z

Also @psifertex this adds an explicit GPL dependency (Unicorn Engine). The problem really isn't solvable for RELRO, PIE, Intel binaries without emulating the instructions (or parsing them manually). Capstone Engine (the disassembler) is BSD 3-Clause and could replace this code with a lot of work that I don't want to do, and perhaps with more accuracy, but also less flexibility.

This adds a dependency on Unicorn

nneonneo · 2017-02-15T17:10:18Z

Unicorn seems like a really heavyweight dependency to add. Have you profiled the ELF parsing speed before and after this change? What about for a program with a lot of PLT entries (like a C++ binary)?

zachriggle · 2017-02-15T17:15:01Z

I haven't, but generally in Pwntools performance isn't critical. The only things anybody has ever really cared about are start-up time (thus the deferred import) and total throughput of the tubes module. There's more compute power than developer hours, so this seems a perfect use of Unicorn, rather than manually implementing a disassembler / emulator.


zachriggle · 2017-02-15T17:26:08Z

A quick test on my laptop shows that this is significantly slower on e.g. Google Chrome (110MB binary after extraction) than the current dev branch, but on par with #886.

| branch | time |
|--------|-------|
| `dev` | 1.3s |
| #899 | 27.7s |
| #886 | 27.5s |

It seems that most of the time is spent parsing relocations for the GOT, not actually emulating any of the code. If the GOT parsing is short-circuited to add a single fake symbol and exit immediately, parse-time of this PR (#899) goes down to ~2.3 seconds. Adding a single second of parse-time for use of an emulator on a 110MB binary seems reasonable.

The GOT slowdown is because we parse all of the relocation sections, rather than just the one which corresponds to the .plt. The code on dev misses GOT entries which correspond to data-only use cases (e.g. &srand).

A caching layer might speed up follow-up parsing time. We already do this for e.g. the pwnlib.rop module for gadget caching.
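One way such a caching layer could look, as a rough sketch only: results are keyed by the SHA-256 of the binary and stored as JSON. The name `cached_symbols`, the `parse` callback, and the cache location are all hypothetical; this is not how `pwnlib.rop` caches gadgets.

```python
import hashlib
import json
import os
import tempfile


def cached_symbols(path, parse, cache_dir=None):
    """Return parse(path), reusing a JSON cache keyed by the file's hash.

    `parse` is any callable that maps a path to a JSON-serialisable dict
    (for example, symbol name -> address).
    """
    if cache_dir is None:
        # Hypothetical default location, chosen only for this sketch.
        cache_dir = os.path.join(tempfile.gettempdir(), "plt-cache-sketch")
    os.makedirs(cache_dir, exist_ok=True)

    # Key on content, so a rebuilt binary never hits a stale entry.
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    cache_file = os.path.join(cache_dir, digest + ".json")

    if os.path.exists(cache_file):
        with open(cache_file) as f:
            return json.load(f)

    symbols = parse(path)
    with open(cache_file, "w") as f:
        json.dump(symbols, f)
    return symbols
```

Hashing the whole file costs one sequential read, which should be small next to a multi-second parse, but that trade-off is an assumption and has not been measured here.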

zachriggle · 2017-02-15T18:38:29Z

Looks like we can greatly cut down GOT parsing time by checking symbol indices for validity before attempting to resolve them. Down to ~8.5s total parse time for 120MB chrome; most of that is still spent looking up GOT symbol names.

This seems pretty reasonable, IMHO. I'm not sure there are any additional performance improvements that can be made to the GOT, without writing custom STRTAB-extraction routines, which seems excessive.
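A sketch of the symbol-index check mentioned above, under the assumption that "valid" means "not index 0" (`STN_UNDEF`). It reads raw little-endian `Elf64_Rela` entries with `struct` rather than going through the ELF library the PR uses, and the function name is made up.

```python
import struct

# Elf64_Rela: r_offset (u64), r_info (u64), r_addend (s64), little-endian.
RELA64 = struct.Struct("<QQq")


def relocations_with_symbols(rela_data):
    """Yield (r_offset, sym_index, r_type) for entries that name a symbol."""
    for r_offset, r_info, _addend in RELA64.iter_unpack(rela_data):
        sym_index = r_info >> 32       # ELF64_R_SYM
        r_type = r_info & 0xFFFFFFFF   # ELF64_R_TYPE
        if sym_index == 0:             # STN_UNDEF: no name to look up
            continue
        yield r_offset, sym_index, r_type
```

Skipping these entries before the name lookup avoids a string-table access for every relocation that has no symbol attached.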

nneonneo · 2017-02-15T21:10:11Z

Yeah, I suspect this is because the majority of the relocations are internal relocations, not PLT or external linkage relocations. In theory it's easy to identify these by their relocation types, which would provide another appreciable boost in speed.
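The relocation-type filter suggested here might look like the following for x86-64. The numeric values are the standard x86-64 psABI relocation types; the choice of which types count as "external" and the helper name are assumptions made for this sketch.

```python
# x86-64 psABI relocation type numbers.
R_X86_64_GLOB_DAT = 6    # GOT entry for a symbol (data-style use, e.g. &srand)
R_X86_64_JUMP_SLOT = 7   # GOT entry used by a PLT stub
R_X86_64_RELATIVE = 8    # internal, base-relative fixup with no symbol

# Assumption: only these two types matter for PLT/GOT symbol naming.
EXTERNAL_TYPES = frozenset({R_X86_64_GLOB_DAT, R_X86_64_JUMP_SLOT})


def external_relocations(relocs):
    """Keep only relocations with external linkage.

    `relocs` is an iterable of (r_offset, sym_index, r_type) tuples.
    """
    return [r for r in relocs if r[2] in EXTERNAL_TYPES]
```

Other architectures use different type numbers, so a real implementation would need one such set per machine type.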


zachriggle · 2017-02-15T21:57:16Z

Got an OK from Idolf; merging in its current state. We can make performance improvements later.

aquynh · 2017-02-15T23:12:31Z

awesome job, guys! linked to your fantastic project now from our website: http://www.unicorn-engine.org/showcase/ keep it up, cheers.

nneonneo · 2017-02-16T07:19:01Z

@zachriggle: I tested this, it doesn't work for any big-endian ARM architecture:

test-aarch64-big
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/bin/checksec", line 11, in <module>
    load_entry_point('pwntools==3.6.0.dev0', 'console_scripts', 'checksec')()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/commandline/common.py", line 32, in main
    pwnlib.commandline.main.main()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/commandline/main.py", line 47, in main
    commands[args.command](args)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/commandline/checksec.py", line 37, in main
    e = ELF(f.name)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/elf/elf.py", line 290, in __init__
    self._populate_plt()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/elf/elf.py", line 720, in _populate_plt
    got_addrs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/pwntools-3.6.0.dev0-py2.7.egg/pwnlib/elf/plt.py", line 53, in emulate_plt_instructions
    uc = U.Uc(arch, mode)
  File "build/bdist.macosx-10.6-intel/egg/unicorn/unicorn.py", line 255, in __init__
unicorn.unicorn.UcError: Invalid mode (UC_ERR_MODE)

nneonneo · 2017-02-16T07:20:40Z

It also fails to locate a canary in test-{arm|thumb}{-relro|}-pie, which means it isn't parsing PLT properly in those cases.

zachriggle · 2017-02-16T17:05:44Z

Hmm, it might be the last-minute change to the PR to assume PLT entries are 8-byte aligned.
I'll check the big-endian issue, too.

zachriggle added the enhancement label Feb 14, 2017

zachriggle added this to the Someday milestone Feb 14, 2017

zachriggle self-assigned this Feb 14, 2017

zachriggle requested a review from TethysSvensson February 14, 2017 04:09

zachriggle mentioned this pull request Feb 14, 2017

Fix MIPS ELF bit-ness by checking masks in increasing order #901

Merged

zachriggle force-pushed the plt2 branch 4 times, most recently from 0d0a940 to 58c105d Compare February 15, 2017 11:58

zachriggle changed the title ~~Parse all relocation sections for PLT symbols~~ Emulate all relocation sections for PLT symbols Feb 15, 2017

zachriggle force-pushed the plt2 branch from 58c105d to 3230114 Compare February 15, 2017 12:02

zachriggle force-pushed the plt2 branch 2 times, most recently from cbb9664 to 4da4049 Compare February 15, 2017 12:08

zachriggle changed the title ~~Emulate all relocation sections for PLT symbols~~ Emulate PLT for better symbols Feb 15, 2017

zachriggle force-pushed the plt2 branch 3 times, most recently from 59f7656 to 067ce69 Compare February 15, 2017 12:22

Emulate all relocation sections for PLT symbols

94b7601

This adds a dependency on Unicorn

zachriggle force-pushed the plt2 branch from 067ce69 to 94b7601 Compare February 15, 2017 12:34

Don't look up empty symbols

a84b7f9

zachriggle mentioned this pull request Feb 15, 2017

Fix checksec nx, execstack, relro reporting #904

Merged

zachriggle added the feature label Feb 15, 2017

zachriggle merged commit a786a58 into Gallopsled:dev Feb 15, 2017

zachriggle mentioned this pull request Feb 15, 2017

Parse PLT instructions #886

Closed

TethysSvensson modified the milestones: 3.6.0, Someday Feb 16, 2017

zachriggle deleted the plt2 branch February 16, 2017 19:58

zachriggle mentioned this pull request Feb 21, 2017

Fix ARM PLT parsing, and big-endian ARM emulation #910

Merged

zachriggle mentioned this pull request Mar 4, 2017

Find the relocation section with .rela.dyn #838

Closed
