New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consider switching to "new style" cffi callbacks #4036
Comments
It looks like switching to the new callbacks makes the entire execution about 10% faster overall (meaning the individual CFFI calls were probably 50%ish faster). And the profile got about 10x more usable, which is great. Additionally, it looks like it would probably replace the way we ship the native engine binary: instead, you'd have two The only issue (and it seems like a significant one to me) is that it links against python. It feels like that would complicate things. If that is surmountable, then I think we should absolutely switch. cc @kwlzn |
I've posted some partial work on this one here... it's working, but manually integrated in the form of two checked-in |
cc @kwlzn |
have an implementation out in #4324 that solves for the deployability concerns mentioned above. ptal! |
Fixes #4036 Problem We currently use "old-style callbacks" with CFFI. These are known to be slower and cause more complicated stack traces during debugging/performance analysis. Solution Switch to "new-style callbacks" using the following approach: Bootstrap CFFI C sources pre-pants execution by way of a simple, venv-backed entrypoint fed by native.py and integrated into bootstrap.sh. Compile and link the generated CFFI C sources into the rust binary at build time with cargo, by way of a build script + the gcc-rs package. On OSX, we instruct cargo to use a linker flag (via .cargo/config) that enables weak linking to avoid missing python symbols during linking (this behavior is the default on Linux). The weakly linked symbols (e.g. _PyImport_ImportModule) are then dynamically resolved within the address space of the parent binary (python) at runtime. Rename the native-engine rust binary to native_engine.so to be python import compatible. Dynamically load the native_engine.so binary as a python module by injecting the binaryutil dir into sys.path at runtime and initializing the new-style callbacks with the closed-loop ffi object. This results in a single native-engine binary that can be loaded both as a python module (import) and as a C module (dlopen) and is not statically or dynamically linked to python. Result As tested on my Sierra laptop, this is good for a >10% speedup as measured by ./pants --enable-v2-engine list :: in the pants repo.
Fixes pantsbuild#4036 Problem We currently use "old-style callbacks" with CFFI. These are known to be slower and cause more complicated stack traces during debugging/performance analysis. Solution Switch to "new-style callbacks" using the following approach: Bootstrap CFFI C sources pre-pants execution by way of a simple, venv-backed entrypoint fed by native.py and integrated into bootstrap.sh. Compile and link the generated CFFI C sources into the rust binary at build time with cargo, by way of a build script + the gcc-rs package. On OSX, we instruct cargo to use a linker flag (via .cargo/config) that enables weak linking to avoid missing python symbols during linking (this behavior is the default on Linux). The weakly linked symbols (e.g. _PyImport_ImportModule) are then dynamically resolved within the address space of the parent binary (python) at runtime. Rename the native-engine rust binary to native_engine.so to be python import compatible. Dynamically load the native_engine.so binary as a python module by injecting the binaryutil dir into sys.path at runtime and initializing the new-style callbacks with the closed-loop ffi object. This results in a single native-engine binary that can be loaded both as a python module (import) and as a C module (dlopen) and is not statically or dynamically linked to python. Result As tested on my Sierra laptop, this is good for a >10% speedup as measured by ./pants --enable-v2-engine list :: in the pants repo.
To avoid needing a compile step, we currently use "old style" cffi callbacks. The new style are purportedly faster and less error prone, but they require an additional build step, and so we would want to confirm the speedup before switching.
The text was updated successfully, but these errors were encountered: