Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider switching to "new style" cffi callbacks #4036

Closed
stuhood opened this issue Nov 6, 2016 · 4 comments · Fixed by #4324
Closed

Consider switching to "new style" cffi callbacks #4036

stuhood opened this issue Nov 6, 2016 · 4 comments · Fixed by #4324
Assignees

Comments

@stuhood
Copy link
Sponsor Member

stuhood commented Nov 6, 2016

To avoid needing a compile step, we currently use "old style" cffi callbacks. The new style are purportedly faster and less error prone, but they require an additional build step, and so we would want to confirm the speedup before switching.

@stuhood
Copy link
Sponsor Member Author

stuhood commented Feb 1, 2017

It looks like switching to the new callbacks makes the entire execution about 10% faster overall (meaning the individual CFFI calls were probably 50%ish faster). And the profile got about 10x more usable, which is great.

Additionally, it looks like it would probably replace the way we ship the native engine binary: instead, you'd have two *.so files: one for the cffi generated extern stub python module, and one for the rust binary. Loading the generated stub module *.so triggers loading of the second *.so in the same directory.

The only issue (and it seems like a significant one to me) is that it links against python. It feels like that would complicate things. If that is surmountable, then I think we should absolutely switch.

cc @kwlzn

@stuhood stuhood self-assigned this Feb 1, 2017
@stuhood
Copy link
Sponsor Member Author

stuhood commented Feb 22, 2017

I've posted some partial work on this one here... it's working, but manually integrated in the form of two checked-in *.so files.

@stuhood
Copy link
Sponsor Member Author

stuhood commented Feb 22, 2017

cc @kwlzn

@kwlzn
Copy link
Member

kwlzn commented Mar 12, 2017

have an implementation out in #4324 that solves for the deployability concerns mentioned above. ptal!

kwlzn added a commit that referenced this issue Mar 13, 2017
Fixes #4036

Problem

We currently use "old-style callbacks" with CFFI. These are known to be slower and cause more complicated stack traces during debugging/performance analysis.

Solution

Switch to "new-style callbacks" using the following approach:

Bootstrap CFFI C sources pre-pants execution by way of a simple, venv-backed entrypoint fed by native.py and integrated into bootstrap.sh.
Compile and link the generated CFFI C sources into the rust binary at build time with cargo, by way of a build script + the gcc-rs package.
On OSX, we instruct cargo to use a linker flag (via .cargo/config) that enables weak linking to avoid missing python symbols during linking (this behavior is the default on Linux). The weakly linked symbols (e.g. _PyImport_ImportModule) are then dynamically resolved within the address space of the parent binary (python) at runtime.
Rename the native-engine rust binary to native_engine.so to be python import compatible.
Dynamically load the native_engine.so binary as a python module by injecting the binaryutil dir into sys.path at runtime and initializing the new-style callbacks with the closed-loop ffi object.
This results in a single native-engine binary that can be loaded both as a python module (import) and as a C module (dlopen) and is not statically or dynamically linked to python.

Result

As tested on my Sierra laptop, this is good for a >10% speedup as measured by ./pants --enable-v2-engine list :: in the pants repo.
lenucksi pushed a commit to lenucksi/pants that referenced this issue Apr 25, 2017
Fixes pantsbuild#4036

Problem

We currently use "old-style callbacks" with CFFI. These are known to be slower and cause more complicated stack traces during debugging/performance analysis.

Solution

Switch to "new-style callbacks" using the following approach:

Bootstrap CFFI C sources pre-pants execution by way of a simple, venv-backed entrypoint fed by native.py and integrated into bootstrap.sh.
Compile and link the generated CFFI C sources into the rust binary at build time with cargo, by way of a build script + the gcc-rs package.
On OSX, we instruct cargo to use a linker flag (via .cargo/config) that enables weak linking to avoid missing python symbols during linking (this behavior is the default on Linux). The weakly linked symbols (e.g. _PyImport_ImportModule) are then dynamically resolved within the address space of the parent binary (python) at runtime.
Rename the native-engine rust binary to native_engine.so to be python import compatible.
Dynamically load the native_engine.so binary as a python module by injecting the binaryutil dir into sys.path at runtime and initializing the new-style callbacks with the closed-loop ffi object.
This results in a single native-engine binary that can be loaded both as a python module (import) and as a C module (dlopen) and is not statically or dynamically linked to python.

Result

As tested on my Sierra laptop, this is good for a >10% speedup as measured by ./pants --enable-v2-engine list :: in the pants repo.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants