Can you give a sample example for extracting IR #16

Closed
bsmondal opened this issue Sep 13, 2015 · 7 comments

@bsmondal

Hi,

Which instruction should I use to extract the IR for a whole x86 binary? A simple example would be very helpful.
Thank you.

@rhelmot
Member

rhelmot commented Sep 14, 2015

Hello!

There's a bit of a problem here: VEX itself doesn't really have a representation of a full-program IR. The best we have right now is using the main angr module to construct a control-flow graph via its CFG analysis.

We do keep seeing this problem where we're using an IR that was meant to be used for single blocks but trying to apply it to full-program analysis...
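
To make the single-block point concrete, here is a minimal sketch of lifting one basic block with pyvex. It assumes `pyvex` and `archinfo` are installed and that the bytes are 64-bit x86 code. The helper names `lift_block` and `statement_kinds` and the sample bytes in the comment are made up for illustration, not part of pyvex.

```python
def lift_block(code_bytes, addr):
    """Lift the single basic block starting at `addr` into a VEX IRSB."""
    # Imported lazily so statement_kinds() below works without pyvex installed.
    import archinfo
    import pyvex

    # Assumption: the bytes are AMD64 code; pick another archinfo.Arch* otherwise.
    return pyvex.lift(code_bytes, addr, archinfo.ArchAMD64())


def statement_kinds(irsb):
    """Return the class name of each IR statement in the block, in order."""
    return [type(stmt).__name__ for stmt in irsb.statements]


# Example (hypothetical bytes: push rbp; mov rbp, rsp; ret):
#   irsb = lift_block(b"\x55\x48\x89\xe5\xc3", 0x400000)
#   irsb.pp()                     # pretty-print the IR of this one block
#   print(irsb.jumpkind)          # how the block exits
#   print(statement_kinds(irsb))
```

Each call covers exactly one block. Anything spanning several blocks needs something on top to decide which addresses to lift next, which is what a CFG recovery does.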

@bsmondal
Author

Hi,
Thanks for your prompt reply. I had never heard of angr before your reply. It looks like exactly what I am looking for. My goal is to extract C data types from the IR. Could you tell me whether angr is a good choice for that?

Thanks for your support and time.

@rhelmot
Member

rhelmot commented Sep 14, 2015

I may be a little biased, as I work on the dev team for the project, but I think that angr is an excellent choice for that. In order to do any sort of full-program analysis with the VEX IR, you're going to need a control flow graph that's at least a little bit smart (i.e. has any hope of resolving indirect jumps), which angr provides. Even though it's not 100% accurate, it's pretty good.

Data type recovery is an open field of research at the moment, but so far the best results available have been produced with static analysis tools like angr. I'm actually working on my own datatype recovery analysis using angr, but my needs are probably sufficiently different than yours that you shouldn't wait for me :)

@bsmondal
Author

Wow, that's great. You are doing excellent work. I have designed a selective execution engine that can execute each function inside a binary individually, but it needs the exact function prototype, i.e. the argument and return types of each function. That's why I am now interested in type inference on assembly. Do you know of any tools that can execute each function individually without knowing its exact prototype? As far as I know, only XForce and MicroX can execute code fragments by providing random input, but they are not open source.
:D

@bsmondal
Author

One more thing: for datatype recovery, could you point me to the part of angr I should focus on? I hope this will speed up my task. Thank you.

@rhelmot
Member

rhelmot commented Sep 14, 2015

If by selective execution you mean nothing more than using the original binary as a shared library and jumping into the appropriate function with arguments set, angr can actually do that without knowing the prototype! The execution itself is a little slow (... between ten thousand and a million times slowdown depending on the content) but it also works with symbolic values. The part of angr that can do this is the callable, half-assedly documented here. You can also look at one of our testcases that uses callable here. (it provides prototypes, but they're optional.)
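
A rough sketch of the callable idea, assuming a project has already been loaded and the target function has a symbol. The helper name `call_by_name` is made up for illustration. Whether a prototype can really be left out may depend on the angr version, so treat the prototype-free call as an assumption.

```python
def call_by_name(project, func_name, *args):
    """Look up `func_name` in the loaded binary and call it through angr."""
    symbol = project.loader.find_symbol(func_name)
    if symbol is None:
        raise KeyError("no symbol named %r in the loaded objects" % func_name)

    # Assumption: no prototype is passed, so angr has to guess how to place
    # the arguments. Pass a prototype to callable() if the guess is wrong.
    func = project.factory.callable(symbol.rebased_addr)
    return func(*args)


# Example (hypothetical binary and function name):
#   import angr
#   p = angr.Project('path/to/binary', load_options={'auto_load_libs': False})
#   result = call_by_name(p, 'some_function', 1, 2)
```

The call runs inside angr's emulated state rather than natively, which is where the slowdown mentioned above comes from.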

To get started real quick in angr, just make a python virtual environment (`mkvirtualenv angr`) and then install it from the python package index (`pip install angr`). Then, in a python interpreter, you can load your binary into a project object. For your purposes you don't want to consider shared libraries, so use `import angr; p = angr.Project('path/to/binary', load_options={'auto_load_libs': False})`. Then, you can construct a control-flow graph with `cfg = p.analyses.CFG()`. Actually using the resulting CFG object is documented here.
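
Putting those steps together, a minimal sketch of dumping the VEX IR of every block the CFG recovers might look like the following. The helper names `collect_block_ir` and `dump_vex_ir` are made up for illustration. It uses `CFGFast`, the static CFG analysis in current angr releases, where the text above says `CFG`, and it assumes CFG nodes expose `addr` and `block` attributes as they do in the angr versions I know of.

```python
def collect_block_ir(nodes):
    """Map block address -> printable VEX IR for each CFG node with a block."""
    ir_by_addr = {}
    for node in nodes:
        block = getattr(node, "block", None)
        if block is None:  # nodes with no lifted code (e.g. hooked stubs)
            continue
        ir_by_addr[node.addr] = str(block.vex)
    return ir_by_addr


def dump_vex_ir(binary_path):
    """Load a binary, recover a CFG, and collect the IR of every block."""
    import angr  # third-party: pip install angr

    project = angr.Project(binary_path, load_options={'auto_load_libs': False})
    cfg = project.analyses.CFGFast()
    return collect_block_ir(cfg.graph.nodes())


# Example (hypothetical path):
#   for addr, ir in sorted(dump_vex_ir('path/to/binary').items()):
#       print(hex(addr))
#       print(ir)
```

The result is still a collection of per-block IR, not one whole-program IR. The edges in `cfg.graph` are what tie the blocks together.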

@bsmondal
Author

Thank you very much for your valuable time and suggestions. I appreciate your quick support. You have saved me a lot of time.

@rhelmot rhelmot closed this as completed Sep 15, 2015
shaymargolis pushed a commit to shaymargolis/pyvex that referenced this issue Jul 16, 2024