This project is a code generation tool for the Arrow-DataFusion project.
With JIT codegen, we could generate code specialized for each query, reducing the branching overhead of generalized, interpreted execution. Furthermore, we could reduce the memory footprint during execution by chaining multiple Arrow compute kernels together and reusing the intermediate vectors.
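To make the memory-footprint point concrete, here is a minimal hand-written sketch of evaluating `a * b + c` two ways. It is not generated code and does not use the Arrow or DataFusion APIs: plain `i64` slices stand in for Arrow arrays, and all function names (`mul_kernel`, `add_kernel`, `eval_kernel_at_a_time`, `eval_fused`) are hypothetical, chosen only for illustration.

```rust
/// Stand-in for one compute kernel: element-wise multiply into a new vector.
fn mul_kernel(a: &[i64], b: &[i64]) -> Vec<i64> {
    a.iter().zip(b).map(|(x, y)| x * y).collect()
}

/// Stand-in for a second compute kernel: element-wise add into a new vector.
fn add_kernel(a: &[i64], b: &[i64]) -> Vec<i64> {
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Kernel-at-a-time evaluation of `a * b + c`.
/// The product is materialized as a full intermediate vector before the add runs.
fn eval_kernel_at_a_time(a: &[i64], b: &[i64], c: &[i64]) -> Vec<i64> {
    let tmp = mul_kernel(a, b); // intermediate allocation, same length as the inputs
    add_kernel(&tmp, c)
}

/// Roughly what code specialized for this one expression might do:
/// a single pass over the inputs, with no intermediate vector.
fn eval_fused(a: &[i64], b: &[i64], c: &[i64]) -> Vec<i64> {
    a.iter()
        .zip(b)
        .zip(c)
        .map(|((x, y), z)| x * y + z)
        .collect()
}
```

Both functions return the same result for equal-length inputs; the fused version simply skips the temporary. An actual JIT would emit the fused loop at runtime from the query's expression tree rather than have it written by hand.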
I've just finished the proof of concept and will first try to accelerate the transformation between row and columnar data introduced in apache/datafusion#1782.
TODOs:
- Function registration and reuse
- Hook JIT codegen with DataFusion RuntimeEnv
- Support unsigned int types
- ...