A C++ library that embeds a single Python interpreter, providing thread‑pooled, asynchronous, and synchronous invocation of Python functions from C++ with optimized GIL usage.
Optimized as an execution engine for Machine Learning pipelines.
- Thread Safe Implementation
On Python versions before 3.13, ensures that only one Python interpreter is ever initialized per process.
- Pre Python 3.13
- Python 3.13+ no-gil sub interpreter
- Thread‑Pooled Execution
Uses a high‑performance round robin queue to minimize latency during high throughput workloads
- Optimized GIL Management
Acquires/releases the Global Interpreter Lock only around actual Python execution.
- Synchronous & Asynchronous APIs
Easily call Python functions synchronously or schedule them with callbacks returning std::future.
- Opportunistic Batching
Batches similar workloads together to minimize up-call latencies into Python.
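The calling pattern described in the feature list can be illustrated with a minimal, standard-library-only sketch. It is not this library's API: `SketchPool`, `submit`, `call`, `fake_gil`, and `sketch_demo` are hypothetical names, a plain `std::mutex` stands in for the GIL, a single shared queue stands in for the round robin queue, and ordinary C++ lambdas stand in for Python functions.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the GIL: one lock shared by all workers.
static std::mutex fake_gil;

// Hypothetical worker pool, for illustration only (NOT this library's API).
class SketchPool {
public:
    explicit SketchPool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }

    ~SketchPool() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Asynchronous style: enqueue the work and return a future immediately.
    std::future<int> submit(std::function<int()> fn) {
        auto task = std::make_shared<std::packaged_task<int()>>(std::move(fn));
        std::future<int> fut = task->get_future();
        {
            std::lock_guard<std::mutex> lk(m_);
            q_.push([task] {
                // Hold the stand-in "GIL" only around the simulated Python call.
                std::lock_guard<std::mutex> gil(fake_gil);
                (*task)();
            });
        }
        cv_.notify_one();
        return fut;
    }

    // Synchronous style: submit, then block until the result is ready.
    int call(std::function<int()> fn) { return submit(std::move(fn)).get(); }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return stop_ || !q_.empty(); });
                if (q_.empty()) return;  // stop requested and queue drained
                job = std::move(q_.front());
                q_.pop();
            }
            job();  // the queue lock is released before the job runs
        }
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> q_;
    std::vector<std::thread> workers_;
    bool stop_ = false;
};

// Exercises both call styles on a two-worker pool.
inline int sketch_demo() {
    SketchPool pool(2);
    std::future<int> pending = pool.submit([] { return 7; });  // asynchronous
    assert(pending.valid());
    int now = pool.call([] { return 41 + 1; });                // synchronous
    return now + pending.get();
}
```

In this sketch the queue lock and the stand-in GIL are separate, so threads can enqueue work while another job is executing, and the "GIL" is taken only for the duration of the call itself. Opportunistic batching is not modeled here.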
System Dependencies
- CMake ≥ 3.14
- C++17 (or later)
- Python ≥ 3.x development headers
- CUDAToolkit
Package Dependencies
Can be installed on Ubuntu using:
sudo apt install pybind11-dev libconcurrentqueue-dev libdlpack-dev
- Initialize submodules
git submodule update --init --recursive
- We provide a simple configure.sh script to invoke CMake configuration. Use the -h | --help flag to see the possible options.
# Configure a Release build, including examples
./configure.sh -t Release -e
- Install
sudo cmake --install build
- Uninstall
sudo xargs rm < build/install_manifest.txt
The configure script also supports a custom install prefix. You can either use the -p | --prefix flag, or specify the PYTHON_UDL_INTERFACE_PREFIX environment variable.