Emulate PLT for better symbols #899
Conversation
Force-pushed from 0d0a940 to 58c105d
@nneonneo Can you take a look at this and see if it meets your scenario now? Instead of a hand-rolled parser for each architecture, we just emulate a few instructions with @aquynh's Unicorn Engine.

@idolf Are you okay with adding another dependency? Unicorn is

There is currently some debugging code left in which prints out:

To see the debug code, add

To assuage any concerns about running untrusted code, the emulator is only given access to the PLT, stops after 4 instructions, stops on any memory access, and doesn't have e.g. a stack allocated or any syscall handlers installed. That said, Unicorn Engine used to be buggy af.
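For illustration, here is a minimal sketch of that kind of locked-down Unicorn setup, assuming x86-64 and using hypothetical `plt_addr`/`plt_data` placeholders for the PLT address and bytes; the hooks in the actual PR may be wired differently:

```python
# Minimal sketch of a sandboxed Unicorn run over a PLT stub.
# Assumptions: x86-64, and plt_addr/plt_data are hypothetical placeholders
# for the PLT address and bytes pulled out of the ELF.
from unicorn import (Uc, UcError, UC_ARCH_X86, UC_MODE_64,
                     UC_HOOK_MEM_UNMAPPED, UC_PROT_READ, UC_PROT_EXEC)

PAGE = 0x1000

def emulate_plt_stub(plt_addr, plt_data):
    uc = Uc(UC_ARCH_X86, UC_MODE_64)

    # Map *only* the PLT: read+execute, no stack, no heap, no libc.
    base = plt_addr & ~(PAGE - 1)
    size = (len(plt_data) + (plt_addr - base) + PAGE - 1) & ~(PAGE - 1)
    uc.mem_map(base, size, UC_PROT_READ | UC_PROT_EXEC)
    uc.mem_write(plt_addr, plt_data)

    state = {'fault_addr': None}

    def stop_on_unmapped(uc, access, address, length, value, user_data):
        # For a stub like `jmp [got_slot]`, this fires when the stub tries
        # to read its (unmapped) GOT slot; record where it looked and stop.
        state['fault_addr'] = address
        uc.emu_stop()
        return False

    uc.hook_add(UC_HOOK_MEM_UNMAPPED, stop_on_unmapped)

    try:
        # count=4 bounds the run to a handful of instructions.
        uc.emu_start(plt_addr, plt_addr + len(plt_data), count=4)
    except UcError:
        pass  # the stub left the mapped region, which is expected

    return state['fault_addr']
```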
Force-pushed from cbb9664 to 4da4049
Also @psifertex, this adds an explicit GPL dependency (Unicorn Engine). The problem really isn't solvable for RELRO, PIE Intel binaries without emulating the instructions (or parsing them manually). Capstone Engine (the disassembler) is BSD 3-Clause and could replace this code with a lot of work that I don't want to do, perhaps with more accuracy, but also with less flexibility.
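For comparison, a rough sketch of what the Capstone route could look like for one architecture (an x86-64 `jmp qword ptr [rip + disp]` stub); this is not what the PR does, just an illustration of the manual-parsing alternative being weighed here:

```python
# Sketch of the manual (Capstone) route for a single architecture:
# decode `jmp qword ptr [rip + disp]` and compute the GOT slot it reads.
from capstone import Cs, CS_ARCH_X86, CS_MODE_64
from capstone.x86 import X86_OP_MEM, X86_REG_RIP

def got_slot_for_plt_stub(code, address):
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    md.detail = True  # needed to inspect operands

    for insn in md.disasm(code, address):
        if insn.mnemonic != 'jmp':
            continue
        op = insn.operands[0]
        if op.type == X86_OP_MEM and op.mem.base == X86_REG_RIP:
            # RIP-relative: the slot is relative to the *next* instruction.
            return insn.address + insn.size + op.mem.disp
    return None
```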
Force-pushed from 59f7656 to 067ce69
> This adds a dependency on Unicorn
Unicorn seems like a really heavyweight dependency to add. Have you profiled the ELF parsing speed before and after this change? What about for a program with a lot of PLT entries (like a C++ binary)?
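For reference, a simple way to take that measurement, assuming pwntools is installed and `/path/to/chrome` stands in for whatever large binary is being tested:

```python
# Rough timing harness for ELF parse time (the path is a placeholder;
# substitute any large C++ binary with many PLT entries).
import time
from pwn import ELF, context

context.log_level = 'error'  # silence pwntools' own output while timing

start = time.time()
elf = ELF('/path/to/chrome')
print('parse time: %.2fs, %d PLT entries' % (time.time() - start, len(elf.plt)))
```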
I haven't, but generally in Pwntools performance isn't critical. The only things anybody has ever really cared about are start-up time (hence the deferred import) and total throughput of the tubes module.

There's more compute power than developer hours, so this seems like a perfect use of Unicorn, rather than manually implementing a disassembler / emulator.
A quick test on my laptop shows that this is significantly slower on e.g. Google Chrome (a 110MB binary after extraction) than the current implementation.

It seems that most of the time is spent parsing relocations for the GOT, not actually emulating any of the code. If the GOT parsing is short-circuited to add a single fake symbol and exit immediately, the parse time of this PR (#899) goes down to ~2.3 seconds. Adding a single second of parse time for the use of an emulator on a 110MB binary seems reasonable.

The GOT slowdown is because we parse all of the relocation sections, rather than just the one which corresponds to the PLT. A caching layer might speed up follow-up parsing time. We already do this for e.g. the
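As a sketch of the "only the relevant relocation section" idea, something along these lines with pyelftools (which pwntools already uses); the section names and overall shape are an assumption, not the code in this PR:

```python
# Sketch: resolve names only for the PLT-related relocation sections
# (.rel.plt / .rela.plt) instead of iterating every relocation section.
from elftools.elf.elffile import ELFFile
from elftools.elf.relocation import RelocationSection

def plt_got_symbols(path):
    symbols = {}
    with open(path, 'rb') as f:
        elf = ELFFile(f)
        for section in elf.iter_sections():
            if not isinstance(section, RelocationSection):
                continue
            if section.name not in ('.rel.plt', '.rela.plt'):
                continue  # skip .rela.dyn and friends entirely
            symtab = elf.get_section(section['sh_link'])
            for reloc in section.iter_relocations():
                sym = symtab.get_symbol(reloc['r_info_sym'])
                if sym.name:
                    symbols[sym.name] = reloc['r_offset']  # GOT slot address
    return symbols
```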
Looks like we can greatly cut down GOT parsing time by checking symbol indices for validity before attempting to resolve them. Down to ~8.5s total parse time for the 120MB chrome binary; most of that is still spent looking up GOT symbol names.

This seems pretty reasonable, IMHO. I'm not sure there are any additional performance improvements that can be made to the GOT without writing custom STRTAB-extraction routines, which seems excessive.
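A minimal sketch of that validity check, assuming pyelftools relocation and symbol-table section objects like the ones in the sketch above:

```python
# Sketch: skip relocations whose symbol index is out of range (or 0, the
# reserved "no symbol" entry) before paying for a string-table lookup.
# reloc_section and symtab are pyelftools section objects.
def resolve_valid_symbols(reloc_section, symtab):
    names = {}
    num_symbols = symtab.num_symbols()
    for reloc in reloc_section.iter_relocations():
        idx = reloc['r_info_sym']
        if idx == 0 or idx >= num_symbols:
            continue  # invalid index: don't touch the STRTAB at all
        sym = symtab.get_symbol(idx)
        if sym.name:
            names[reloc['r_offset']] = sym.name
    return names
```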
Yeah, I suspect this is because the majority of the relocations are internal relocations, not PLT or external-linkage relocations. In theory it's easy to identify these by their relocation types, which would provide another appreciable boost in speed.
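A hedged sketch of that relocation-type filter, assuming x86-64 and the standard `GLOB_DAT`/`JUMP_SLOT` type values; other architectures would need their own allow-lists:

```python
# Sketch: keep only relocation types that name external/PLT symbols and
# skip internal ones (e.g. R_X86_64_RELATIVE) before resolving anything.
# Values are the standard x86-64 relocation type numbers.
R_X86_64_GLOB_DAT  = 6   # data references resolved through the GOT
R_X86_64_JUMP_SLOT = 7   # PLT entries

EXTERNAL_TYPES = {R_X86_64_GLOB_DAT, R_X86_64_JUMP_SLOT}

def external_relocations(reloc_section):
    # reloc_section is a pyelftools RelocationSection
    for reloc in reloc_section.iter_relocations():
        if reloc['r_info_type'] in EXTERNAL_TYPES:
            yield reloc
```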
Got an OK from Idolf; merging in its current state. We can make performance improvements later.
Awesome job, guys! Linked to your fantastic project now from our website:
http://www.unicorn-engine.org/showcase/
Keep it up, cheers.
@zachriggle: I tested this; it doesn't work for any big-endian ARM architecture:
It also fails to locate a canary in
Hmm, it might be the last-minute change to the PR that assumes PLT entries are 8-byte aligned.
As pointed out by #886, it is not possible to have accurate PLT symbols for PIE/RELRO binaries without parsing or emulating the instructions in the PLT.
Since Unicorn Engine exists, just use that.