JIT Code Generation for Arrow-DataFusion

This project is a code generation tool for the Arrow-DataFusion project.

With JIT codegen, we could generate code specialized for each query, reducing the branching overhead of generalized, interpreted execution. Furthermore, we could reduce the memory footprint during execution by chaining multiple Arrow compute kernels together and reusing intermediate vectors.
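To make the contrast concrete, here is a minimal sketch in plain Rust. It is not this project's API and does not use Arrow or a JIT backend: the `Expr` enum, `eval_interpreted`, `eval_fused`, and `sample_expr` are hypothetical names invented for illustration. The interpreted evaluator branches on the expression kind at every node and allocates an intermediate vector per node, while the fused function stands in for what code generated for the single expression `(a + b) * 2` could look like.

```rust
// Hypothetical expression tree for illustration; not DataFusion's actual types.
enum Expr {
    Col(usize),
    Lit(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Interpreted mode: every node dispatches on the expression kind and
/// materializes its own intermediate vector.
fn eval_interpreted(expr: &Expr, cols: &[Vec<i64>], len: usize) -> Vec<i64> {
    match expr {
        Expr::Col(i) => cols[*i].clone(),
        Expr::Lit(v) => vec![*v; len],
        Expr::Add(l, r) => {
            let a = eval_interpreted(l, cols, len);
            let b = eval_interpreted(r, cols, len);
            a.iter().zip(&b).map(|(x, y)| x + y).collect()
        }
        Expr::Mul(l, r) => {
            let a = eval_interpreted(l, cols, len);
            let b = eval_interpreted(r, cols, len);
            a.iter().zip(&b).map(|(x, y)| x * y).collect()
        }
    }
}

/// Stand-in for code specialized to `(a + b) * 2`: a single fused loop with
/// no per-node dispatch and no intermediate vectors.
fn eval_fused(a: &[i64], b: &[i64]) -> Vec<i64> {
    a.iter().zip(b).map(|(x, y)| (x + y) * 2).collect()
}

/// Builds the tree for `(col0 + col1) * 2`.
fn sample_expr() -> Expr {
    Expr::Mul(
        Box::new(Expr::Add(Box::new(Expr::Col(0)), Box::new(Expr::Col(1)))),
        Box::new(Expr::Lit(2)),
    )
}
```

Calling `eval_interpreted(&sample_expr(), &cols, len)` and `eval_fused(&cols[0], &cols[1])` on the same two columns yields the same values. The interpreted path allocates a fresh vector at each of the five tree nodes, and the fused path allocates only the output.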

I've just finished the proof of concept and will first try to accelerate row and columnar data transformation introduced in apache/datafusion#1782.

Development

TODOs:

Function registration and reuse
Hook JIT codegen into the DataFusion RuntimeEnv
Support unsigned int types
...

Repository: yjshen/df-codegen
