Adds frida script for gathering code coverage #17
Implements a frida script that gathers code coverage information and saves it using drcov format. Probably should be considered experimental, as it's the first thing I've written with frida and I'm sure I've screwed something up =p
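For readers unfamiliar with drcov, here is a minimal sketch of what writing such a file could look like on the Python side. It assumes version 2 of the text header as I understand it, followed by packed 8-byte basic-block entries (u32 module-relative start, u16 size, u16 module id). The function name `build_drcov` and the tuple shapes are hypothetical, not taken from this PR.

```python
import struct


def build_drcov(modules, blocks):
    """Return the bytes of a drcov-style log.

    modules: list of (base, end, path) tuples; the list index is the module id.
    blocks:  iterable of (start_offset, size, module_id) tuples.
    Assumption: drcov version 2 header layout, little-endian block entries.
    """
    lines = [
        "DRCOV VERSION: 2",
        "DRCOV FLAVOR: frida",
        "Module Table: version 2, count %d" % len(modules),
        "Columns: id, base, end, entry, checksum, timestamp, path",
    ]
    for mod_id, (base, end, path) in enumerate(modules):
        # entry, checksum and timestamp are unknown here, so they are zeroed.
        lines.append(
            "%3d, 0x%016x, 0x%016x, 0x%016x, 0x%08x, 0x%08x, %s"
            % (mod_id, base, end, 0, 0, 0, path)
        )
    blocks = list(blocks)
    lines.append("BB Table: %d bbs" % len(blocks))
    header = ("\n".join(lines) + "\n").encode("utf-8")
    # Each entry: u32 start offset, u16 size, u16 module id.
    body = b"".join(struct.pack("<IHH", s, n, m) for s, n, m in blocks)
    return header + body
```

The result can be written to disk with `open(path, "wb").write(build_drcov(mods, bbs))`.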
Of course right after submitting this I figured out some other stuff to add/change. Will try to get it updated tonight.
This allows frida to filter inside the target by thread id. This is probably only useful if you have other introspection into the process to see which thread you're interested in. However, on larger targets, if you can use this it improves results significantly.
Going to create a new PR with additional features, maybe better perf.
I'm bad at git.
I started making some perf fixes before realizing some stuff about frida. So, the good news: these perf fixes do in fact speed up the client by ~10% or so. The bad news: this tracer is still 2 orders of magnitude slower than native execution. From memory, DR is approx 33% slower than native, and pin is ~50%. So this is quite a bit worse.

Intuitively, this makes sense, as all coverage data has to be IPC'd out to the `recv()`ing process. There are ways this could be improved:

* Cache the bbs in frida's address space and only IPC them out on client detach. However, if the client `exit()`s or crashes, all coverage data would be lost.
* Reduce allocations and copies inside the frida script. This is complicated by JavaScript's lack of native u64s. I don't really know JavaScript well enough to effectively optimize this.
* The one that's probably a good idea: right now the JavaScript creates the drcov objects and sends them to the Python, which is just responsible for uniquing them. If IPC is the big bottleneck, uniquing them on the JavaScript side would reduce that and in theory improve performance. Some initial tests of this were not successful, so I'm going to have to play around with this more.

For what it's worth, frida offers two modes for tracing at the granularity we care about: `compile` and `block`. `compile` will fire an event the first time a bb is seen, whereas `block` will fire an event every time a bb is seen. Switching to `compile` improves perf by an order of magnitude; however, any blocks that were hit before you attach and that you then re-execute will not show up in your trace. It's pretty easy to toggle between them, and if people think making this an option has value, I'm not opposed. My thoughts were that in either case it's still really slow in comparison, so we might as well be slow, accurate, and easy to use.

So yeah, perf improvements, but they don't really matter \o/
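The Python side's job as described here (uniquing the entries it receives) could be sketched roughly as follows. This assumes each message payload is a run of packed 8-byte drcov entries (u32 start, u16 size, u16 module id). The names `BB_ENTRY` and `unique_blocks` are hypothetical and not taken from the PR.

```python
import struct

# Assumed layout of one drcov basic-block entry:
# u32 module-relative start, u16 size, u16 module id (little-endian).
BB_ENTRY = struct.Struct("<IHH")


def unique_blocks(payload, seen):
    """Add every complete entry in payload to the set seen.

    Returns the number of entries that were not already present.
    A trailing partial entry (fewer than 8 bytes) is ignored.
    """
    usable = len(payload) - (len(payload) % BB_ENTRY.size)
    new = 0
    for offset in range(0, usable, BB_ENTRY.size):
        entry = bytes(payload[offset:offset + BB_ENTRY.size])
        if entry not in seen:
            seen.add(entry)
            new += 1
    return new
```

Keeping the entries as raw bytes means the set can be dumped straight into the BB table at the end without re-packing. The same set-based idea would apply if the uniquing were moved into the script.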
We'll use the standard hackertext indicators, because this is a hacker tool.
Frida has been picking up traction lately, so this is an awesome contribution. I'll test and review this soon :-)
This looks like an awesome use-case for Stalker, great job! 👍 Just a few notes on the Frida-specific bits:
After some optimizations a few months back, the code generated by Stalker should only be a little bit slower. I wrote a benchmark that measured it on LZMA compression, and the ratio is currently somewhere between 1.x and 2.x slowdown. This is however after it has "warmed up" and has all the basic blocks cached, and has applied the back-patching optimizations and warmed up the inline caches. It is careful not to re-use blocks in case of self-modifying code, though, so you pay performance overhead every time it context-switches into the runtime to look up the target of a branch, followed by checking if the original code changed since it was compiled. Stalker achieves high performance by back-patching branches and updating inline caches once they're considered stable. You can configure the
The recommended way to deal with this is to hook
This will require additional hooks, per OS. Each thread will be recompiling the code, though, so the warm-up cost isn't shared between all of them. (Though this won't matter if the threads happen to execute wildly different code – so this really depends on the application.)
Stalker has been designed with this in mind, but it depends on how you configure
Did you run into issues with the ModuleMap API? It supports providing a filter function if you only care about certain modules and want faster lookups.
You can safely use
Awesome, thanks @oleavr, that is insanely helpful. I've implemented the non-OS-specific changes, and it results in a roughly 60x speedup on my little toy benchmark. My guess is that the remaining latency is due to me streaming the events as opposed to caching and flushing. One of the potential use cases for this might be tracing applications that crash, in which case streaming events should be a bit better, albeit at the cost of speed and with the same coverage dropping near termination as

For the OS-specific things, I'm not really able to test them on all the platforms Frida supports, so I'm inclined to leave it OS agnostic and keep it as a general limitation, as it's not too onerous and the script is still quite useful without them.

Thanks again, that's exactly the type of feedback I was hoping for.
Incorporate @oleavr's feedback. Three changes:

* Instead of instrumenting on 'block' events, we now instrument on 'compile' events. This dramatically improves the performance. Since we're only doing block coverage anyway, and since all the blocks get uniqued, we lose nothing from this change.
* Set the Stalker trust threshold to 0. This means we're completely punting on self-modifying code in favor of speed, but IDA/Lighthouse can't visualize self-modifying code anyway, so again, we lose nothing.
* Use frida's ModuleMap API instead of making a worse version ourselves. The first time reading the docs I misunderstood this API, but @oleavr helpfully cleared it up!
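To illustrate what a module map lookup buys us, here is a small Python sketch of resolving an absolute block address to a (module id, offset) pair, which is the form drcov entries need. This is not Frida's ModuleMap implementation; the class name `ModuleIndex` and its interface are made up for illustration, and it assumes modules do not overlap.

```python
import bisect


class ModuleIndex:
    """Illustrative address-to-module lookup over non-overlapping ranges."""

    def __init__(self, modules):
        # modules: iterable of (base, end, path), with end exclusive.
        self._mods = sorted(modules)
        self._bases = [m[0] for m in self._mods]

    def find(self, address):
        """Return (module_id, offset) for address, or None if unmapped.

        module_id is the module's position in base-address order.
        """
        i = bisect.bisect_right(self._bases, address) - 1
        if i < 0:
            return None
        base, end, _path = self._mods[i]
        if address >= end:
            return None
        return i, address - base
```

A binary search like this keeps each lookup logarithmic in the number of modules, and addresses outside any tracked module are simply dropped.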
Just a bit cleaner, missed a reference when I was editing it earlier
And delete pointless variable.
Add a few more comments, change a few names so when I have to edit it in three years I hate myself a little less.
So one of the big benefits of frida is non-local tracing. We should allow the user to do that.
Wow, well we couldn't have gotten better eyes on it than that! Thanks for taking a look @oleavr :-)