[engine] Move to new-style CFFI callbacks. #4324

Merged
merged 6 commits into from Mar 13, 2017

Conversation

Projects
None yet
3 participants
@kwlzn
Member

kwlzn commented Mar 11, 2017

Fixes #4036

Problem

We currently use "old-style callbacks" with CFFI. These are known to be slower and cause more complicated stack traces during debugging/performance analysis.

Solution

Switch to "new-style callbacks" using the following approach:

  • Bootstrap CFFI C sources pre-pants execution by way of a simple, venv-backed entrypoint fed by native.py and integrated into bootstrap.sh.
  • Compile and link the generated CFFI C sources into the rust binary at build time with cargo, by way of a build script + the gcc-rs package.
    • On OSX, we instruct cargo to use a linker flag (via .cargo/config) that enables weak linking to avoid missing python symbols during linking (this behavior is the default on Linux). The weakly linked symbols (e.g. _PyImport_ImportModule) are then dynamically resolved within the address space of the parent binary (python) at runtime.
  • Rename the native-engine rust binary to native_engine.so to be python import compatible.
  • Dynamically load the native_engine.so binary as a python module by injecting the binaryutil dir into sys.path at runtime and initializing the new-style callbacks with the closed-loop ffi object.

This results in a single native-engine binary that can be loaded both as a python module (import) and as a C module (dlopen) and is not statically or dynamically linked to python.

Result

As tested on my Sierra laptop, this is good for a >10% speedup as measured by ./pants --enable-v2-engine list :: in the pants repo.

@baroquebobcat

It looks pretty good to me. I've got a couple of things I'd like addressed / clarified.

+ return self.ffi.dlopen(self.binary)
+
+ @memoized_property
+ def ffi(self):

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 13, 2017

Contributor

I'm a little confused about the use of _ffi vs ffi. Why do some methods use one vs the other?

It looks like ffi is the ffi for the loaded externs while _ffi is the generic cffi interface, which only allows for calling new, new_handle and from_handle. Is that right?

@baroquebobcat

baroquebobcat Mar 13, 2017

Contributor

I'm a little confused about the use of _ffi vs ffi. Why do some methods use one vs the other?

It looks like ffi is the ffi for the loaded externs while _ffi is the generic cffi interface, which only allows for calling new, new_handle and from_handle. Is that right?

This comment has been minimized.

@kwlzn

kwlzn Mar 13, 2017

Member

the diff mightve cut off the class context, but there are two different classes here that are both holding a reference to the same CompiledCFFI object (the equivalent of from _native_engine import ffi):

  • Native (the subsystem), which exposes the self.ffi property
  • ExternContext, which gets a self._ffi attribute at creation time (which is just a reference to Native.ffi)

the loaded externs are in the ffi_lib property, which is the equivalent of from _native_engine import lib and is where the loaded externs hang off of (e.g. lib.extern_invoke_runnable would be the callable).

@kwlzn

kwlzn Mar 13, 2017

Member

the diff mightve cut off the class context, but there are two different classes here that are both holding a reference to the same CompiledCFFI object (the equivalent of from _native_engine import ffi):

  • Native (the subsystem), which exposes the self.ffi property
  • ExternContext, which gets a self._ffi attribute at creation time (which is just a reference to Native.ffi)

the loaded externs are in the ffi_lib property, which is the equivalent of from _native_engine import lib and is where the loaded externs hang off of (e.g. lib.extern_invoke_runnable would be the callable).

@@ -95,17 +103,21 @@ function bootstrap_native_code() {
# Bootstraps the native code and overwrites the native_engine_version to the resulting hash
# version if needed.
local native_engine_version="$(calculate_current_hash)"

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 13, 2017

Contributor

I think this hash should include the native.py file, since it's part of the binary now.

I'd also like to have tests for these bootstrap bits, but I won't block on it.

@baroquebobcat

baroquebobcat Mar 13, 2017

Contributor

I think this hash should include the native.py file, since it's part of the binary now.

I'd also like to have tests for these bootstrap bits, but I won't block on it.

This comment has been minimized.

@kwlzn

kwlzn Mar 13, 2017

Member

great call - had intended to, but forgot to do this. fixed!

@kwlzn

kwlzn Mar 13, 2017

Member

great call - had intended to, but forgot to do this. fixed!

This comment has been minimized.

@kwlzn

kwlzn Mar 13, 2017

Member

and regarding tests - this is pretty well covered as a fundamental dependency of basically all v2 engine integration tests (all the tests break when this is broken). but would be happy to add tests if there's a specific area you think would benefit from it - maybe we can discuss more offline.

@kwlzn

kwlzn Mar 13, 2017

Member

and regarding tests - this is pretty well covered as a fundamental dependency of basically all v2 engine integration tests (all the tests break when this is broken). but would be happy to add tests if there's a specific area you think would benefit from it - maybe we can discuss more offline.

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 13, 2017

Contributor

hm. I'd like there to be tests around how the bootstrap could fail. eg asserting that changes to files triggers recompiling, or failing to compile should result in a missing binary / early exit. Things are still pretty early for this part of the code, so it might make sense to leave them unspecified. But, it's something I'm always thinking about.

@baroquebobcat

baroquebobcat Mar 13, 2017

Contributor

hm. I'd like there to be tests around how the bootstrap could fail. eg asserting that changes to files triggers recompiling, or failing to compile should result in a missing binary / early exit. Things are still pretty early for this part of the code, so it might make sense to leave them unspecified. But, it's something I'm always thinking about.

This comment has been minimized.

@stuhood

stuhood Mar 13, 2017

Member

That's a good idea, but orthogonal to this change probably.

@stuhood

stuhood Mar 13, 2017

Member

That's a good idea, but orthogonal to this change probably.

@stuhood

Awesome. Thanks a lot for tackling this Kris!

@@ -0,0 +1,2 @@
+[target.x86_64-apple-darwin]
+rustflags = ["-C", "link-args=-undefined dynamic_lookup"]

This comment has been minimized.

@stuhood

stuhood Mar 13, 2017

Member

Can you leave a comment here explaining this?

@stuhood

stuhood Mar 13, 2017

Member

Can you leave a comment here explaining this?

This comment has been minimized.

@kwlzn

kwlzn Mar 13, 2017

Member

sure, fixed!

@kwlzn

kwlzn Mar 13, 2017

Member

sure, fixed!

build-support/bin/native/bootstrap.sh
- git ls-files -c -o --exclude-standard "${NATIVE_ROOT}" | \
- git hash-object -t blob --stdin-paths | fingerprint_data
+ git ls-files -c -o \
+ --exclude-standard "${NATIVE_ROOT}" \

This comment has been minimized.

@stuhood

stuhood Mar 13, 2017

Member

AFAIK, --exclude-standard doesn't take an argument, so you should only need to include it once.

@stuhood

stuhood Mar 13, 2017

Member

AFAIK, --exclude-standard doesn't take an argument, so you should only need to include it once.

This comment has been minimized.

@kwlzn

kwlzn Mar 13, 2017

Member

ah, my bad - fixed.

@kwlzn

kwlzn Mar 13, 2017

Member

ah, my bad - fixed.

@@ -0,0 +1,7 @@
+extern crate gcc;
+

This comment has been minimized.

@stuhood

stuhood Mar 13, 2017

Member

This is fairly clear, but a snippet/blurb (basically, some of the review description) would be good inline.

@stuhood

stuhood Mar 13, 2017

Member

This is fairly clear, but a snippet/blurb (basically, some of the review description) would be good inline.

This comment has been minimized.

@kwlzn

kwlzn Mar 13, 2017

Member

sure, added!

@kwlzn

kwlzn Mar 13, 2017

Member

sure, added!

@kwlzn kwlzn merged commit 8e24991 into pantsbuild:master Mar 13, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

lenucksi added a commit to lenucksi/pants that referenced this pull request Apr 25, 2017

[engine] Move to new-style CFFI callbacks. (#4324)
Fixes #4036

Problem

We currently use "old-style callbacks" with CFFI. These are known to be slower and cause more complicated stack traces during debugging/performance analysis.

Solution

Switch to "new-style callbacks" using the following approach:

Bootstrap CFFI C sources pre-pants execution by way of a simple, venv-backed entrypoint fed by native.py and integrated into bootstrap.sh.
Compile and link the generated CFFI C sources into the rust binary at build time with cargo, by way of a build script + the gcc-rs package.
On OSX, we instruct cargo to use a linker flag (via .cargo/config) that enables weak linking to avoid missing python symbols during linking (this behavior is the default on Linux). The weakly linked symbols (e.g. _PyImport_ImportModule) are then dynamically resolved within the address space of the parent binary (python) at runtime.
Rename the native-engine rust binary to native_engine.so to be python import compatible.
Dynamically load the native_engine.so binary as a python module by injecting the binaryutil dir into sys.path at runtime and initializing the new-style callbacks with the closed-loop ffi object.
This results in a single native-engine binary that can be loaded both as a python module (import) and as a C module (dlopen) and is not statically or dynamically linked to python.

Result

As tested on my Sierra laptop, this is good for a >10% speedup as measured by ./pants --enable-v2-engine list :: in the pants repo.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment