A quick and dirty way to link to Velox and experiment with it!
Currently, there's one program here: `from_substrait`. It takes a path to a protobuf file containing a Substrait plan as its argument and attempts to run the plan using Velox. There are also Python bindings that run Substrait plans from Python using Velox.
- Follow the instructions on Velox's homepage to compile it (using the Makefile). Make sure that Substrait and Parquet are enabled!
- By default, the Makefile in this repository assumes that Velox's root directory is at `~/velox`. If you downloaded it elsewhere, be sure to set the environment variable `VELOX_ROOT` to the appropriate directory.
- Run `make`.
- Run `./from_substrait plan.json` and `./from_substrait plan_simple.json`! The program writes its results to stdout, so the output can be long. You can run `./from_substrait plan.json > results.txt` to get them in a file.
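The build and run steps above can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `build_env`, `build`, and `run_plan` are hypothetical, and it assumes only what the steps describe (a Makefile that reads `VELOX_ROOT`, and a `from_substrait` binary that takes a plan path and prints to stdout).

```python
import os
import subprocess
from pathlib import Path


def build_env(velox_root=None):
    """Return a copy of the environment with VELOX_ROOT set.

    Precedence: explicit argument, then an existing VELOX_ROOT, then ~/velox
    (the default the Makefile is described as assuming).
    """
    env = dict(os.environ)
    root = velox_root or env.get("VELOX_ROOT") or str(Path.home() / "velox")
    env["VELOX_ROOT"] = str(Path(root).expanduser())
    return env


def build(velox_root=None):
    """Run `make` in the current directory with VELOX_ROOT set (hypothetical helper)."""
    return subprocess.run(["make"], env=build_env(velox_root)).returncode


def run_plan(plan_path, out_path, binary="./from_substrait"):
    """Run `binary plan_path` and write its stdout to out_path.

    Equivalent in spirit to `./from_substrait plan.json > results.txt`.
    Returns the exit code of the program.
    """
    with open(out_path, "w") as out:
        return subprocess.run([binary, str(plan_path)], stdout=out).returncode
```

For example, calling `build()` followed by `run_plan("plan.json", "results.txt")` from the repository directory would mirror the manual steps. Passing the environment explicitly keeps the `VELOX_ROOT` override local to the build instead of changing the shell's settings.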
- After running `make`, there should be a file `velox.so`.
- If you run `python` from the repository directory, you should be able to `import velox`.
- Take your plan's JSON and feed it to `velox.from_json`!
import velox
import pyarrow as pa

# Read the plan's Substrait JSON and feed it to Velox.
# It returns a VeloxResult.
with open('plan.json', 'r') as f:
    result = velox.from_json(f.read())

# Iterate through the result to get VeloxVectors.
for vec in result:
    # VeloxVectors are convertible to pyarrow record batches!
    rb = vec.to_arrow()
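If you want all the results in one place rather than handling each batch inside the loop, the batches can be collected and combined. This is a sketch under assumptions: `collect_record_batches` and `to_table` are hypothetical helpers, not part of the `velox` module, and they rely only on each vector exposing `to_arrow()` as shown above. `pa.Table.from_batches` is the standard pyarrow call for combining record batches.

```python
def collect_record_batches(result):
    """Return a list with one Arrow record batch per vector in `result`.

    `result` is any iterable whose items expose a to_arrow() method, as the
    VeloxVectors in the example above are described as doing.
    """
    return [vec.to_arrow() for vec in result]


def to_table(batches):
    """Combine record batches into a single pyarrow.Table.

    Table.from_batches needs at least one batch (or an explicit schema),
    so an empty list is rejected up front. pyarrow is imported lazily so
    that collect_record_batches has no hard dependency on it.
    """
    if not batches:
        raise ValueError("no record batches to combine")

    import pyarrow as pa

    return pa.Table.from_batches(batches)
```

With a `result` obtained from `velox.from_json`, `to_table(collect_record_batches(result))` would give a single table, assuming all batches share the same schema.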