Using pyvex for transformations/instrumentation #47

Closed
drone29a opened this issue Oct 31, 2016 · 8 comments

@drone29a
Contributor

Hello, currently pyvex makes a call into C code to create an IRSB. That IRSB and its components (IRStmt, etc) are exposed as Python objects, but there doesn't seem to be a way to create an empty IRSB or construct IRStmt instances not backed by guest machine instructions.

I'd like to modify IRSBs, IRStmts, and IRExprs generated from code as well as programmatically generating instances without backing code. Is this functionality within the scope of pyvex? I'm happy to work on it and submit a patch, if so.

@rhelmot
Member

rhelmot commented Oct 31, 2016

This is a longstanding TODO for pyvex, unfortunately... it shouldn't be any
harder than just shuffling around the constructors for all the IR objects,
but it'll be some work...


@drone29a
Contributor Author

Glad to hear it's a TODO.

There are a bunch of ways we can support creating IR component instances from the corresponding existing C structs (the current implementation) and from sub components (this is what the constructors in libvex look like).

We probably don't want to break the pyvex API at this point (maybe in a major version bump?), so the current implementation of __init__ would remain. Each IR component class can also have a static method which mirrors the underlying libvex constructor. Eventually we could move the current __init__ to a static method and then move the libvex-style constructor into __init__.

Does adding support for libvex-style constructors via static methods attached to respective IR component classes sound reasonable to you all?
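
To make the proposal concrete, here is a minimal sketch of the pattern. The class name `U32Const`, the `create` name, and the `SimpleNamespace` object standing in for a C struct are all assumptions made for illustration; none of this is actual pyvex code.

```python
from types import SimpleNamespace


class U32Const:
    """Hypothetical stand-in for one IR constant class."""

    def __init__(self, c_const):
        # Existing style: unpack an already-built C struct
        # (faked here with a SimpleNamespace).
        self.value = c_const.value

    @staticmethod
    def create(value):
        # Proposed addition: build an instance from its sub-components,
        # mirroring the shape of a libvex-style constructor, while leaving
        # __init__ untouched for existing callers.
        obj = U32Const.__new__(U32Const)
        obj.value = value
        return obj


from_struct = U32Const(SimpleNamespace(value=0x1000))
from_parts = U32Const.create(0x1000)
```

Both paths end up with an equivalent Python object, so existing callers that pass a struct keep working.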

@rhelmot
Member

rhelmot commented Oct 31, 2016

Yes, that was exactly what I was thinking of! If you want to just do this,
a PR would be totally welcome :)

Unfortunately, we have to keep all the angr components' version numbers in
lockstep, otherwise dependency hell becomes a serious problem... One way to
keep everything totally compatible would be to have only one constructor
that accepts either a raw struct pointer as a keyword argument or the
normal construction parameters. Might be a little harder to set up, though.
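
A rough sketch of that single-constructor idea, assuming a hypothetical class `DemoConst` and a keyword named `c_const` (both invented for illustration; the struct pointer is faked with a plain object):

```python
from types import SimpleNamespace


class DemoConst:
    """Hypothetical IR constant with a single, dual-purpose constructor."""

    def __init__(self, value=None, *, c_const=None):
        # Exactly one source of data must be supplied.
        if (value is None) == (c_const is None):
            raise TypeError("pass either value or c_const, not both or neither")
        # c_const stands in for a raw struct pointer; here it is a plain object.
        self.value = c_const.value if c_const is not None else value


from_parts = DemoConst(42)
from_struct = DemoConst(c_const=SimpleNamespace(value=42))
```

The extra argument checking is the "harder to set up" part: every class has to decide which of the two inputs it was given.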


@drone29a
Contributor Author

drone29a commented Nov 5, 2016

Ok, I'm going forward with a create staticmethod attached to all the IR component classes (we can change the name to whatever) that will mirror the libvex constructors.

Is it OK for us to use "ABI mode" with cffi for calling the libvex constructors? It looks like you all wrote your own wrapper (pyvex.c) which gets statically linked with libvex. Since all the libvex functions, structs, etc are included in the generated vex_ffi.py module we can reference them without making any changes.

It sounds like the preferred approach is to include the current contents of pyvex.c with a C extension module generated by cffi (see "API mode"). I'm leaning towards going ahead and leaving pyvex.c alone and referencing the constructors via ABI mode, at least for now.

More on the cffi modes:
http://cffi.readthedocs.io/en/latest/overview.html#overview

@rhelmot
Member

rhelmot commented Nov 5, 2016

Yeah, please leave the pyvex_c stuff alone. We need to have it compiled as
a standalone shared object so that a separate shared library can link
against it.

Interestingly enough, we decided recently that there actually would be a
major version number bump in angr pretty soon, so if you want to make your
changes so that __init__ takes the raw components and there's a separate
static method for unpacking a raw libvex struct, we can merge those into an
in-progress branch that should get merged in a while.


@drone29a
Contributor Author

drone29a commented Nov 7, 2016

Here's a commit showing example changes to the IRConst classes.

drone29a@cbc81d4

There's now a from_c static method attached to each IRConst subclass which takes an IRConst* and extracts the value. The value is then used to construct the Python representation of the same IRConst type. The __init__ method for each class just takes a value. There's no validation of the value happening, we assume the correct type is being provided.

I also added a tag class variable to every IRConst instead of having the tag passed to the IRConst base class. There's already a type and it felt like tag might as well be defined similarly.
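
For readers who don't want to open the commit, the shape described above looks roughly like the sketch below. It is not the code from the commit: the class names, the string tags, and the `SimpleNamespace` object imitating an `IRConst*` (with a `tag` field and an `Ico` union-like member) are assumptions for illustration only.

```python
from types import SimpleNamespace


class IRConstSketch:
    """Hypothetical base class; tag and type live on the class, not the instance."""

    tag = None
    type = None

    def __init__(self, value):
        # No validation: the caller is trusted to pass a suitable value.
        self.value = value


class U8Sketch(IRConstSketch):
    tag = "Ico_U8"   # class variable instead of an argument to the base __init__
    type = "Ity_I8"

    @staticmethod
    def from_c(c_const):
        # Extract the value from the (fake) C struct, then construct normally.
        return U8Sketch(c_const.Ico.U8)


# Stand-in for a struct that would normally come from libvex.
fake_struct = SimpleNamespace(tag="Ico_U8", Ico=SimpleNamespace(U8=0x7F))

from_struct = U8Sketch.from_c(fake_struct)
from_value = U8Sketch(0x7F)
```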

We may later want to have a method complementary to _translate which would create the corresponding C struct from a Python IR component instance. But I'm not sure we currently have a need for this?

Please let me know if there are any tweaks/critiques I should take into account and I'll then go ahead and finish making similar changes for the other IR components. Thanks!

@drone29a
Contributor Author

Hey @rhelmot, just checking in about the proposed changes so I can go ahead and wrap this up. Thanks for any feedback. Heard there was a plague going around your all's lab, hope you escaped it. =)

@rhelmot
Member

rhelmot commented Nov 16, 2016

Yes, this is good!

Reports of a plague have been highly exaggerated. Yeah, everyone is sick and there's been a nasty deadline, but nobody is dead yet.

It also just occurred to me that we actually have pretty much zero possibility of breaking any interfaces as long as pyvex is internally consistent - really, the only way anyone ever interacts with pyvex right now is by constructing an IRSB and then accessing properties on it and its children, which you're not planning on touching. You should be able to just fiddle with the constructors and make everything work the way it should be and none of the rest of angr should have to care.
