Contributing to Daft

Reporting Issues

To report a bug or other issue with Daft, please include the following details:

Operating system
Daft version
Python version
Runner that your code is using
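If it helps, most of these details can be gathered with a short script. The following is a minimal sketch, not an official Daft tool: the helper name collect_report_details is hypothetical, the distribution names it looks up ("getdaft" and "daft") are assumptions about how Daft was installed, and it needs Python 3.8 or newer for importlib.metadata. It only reads the DAFT_RUNNER environment variable, so a runner selected in code (for example with daft.context.set_runner_ray()) must be reported by hand.

```python
import os
import platform
from importlib import metadata


def collect_report_details(package_names=("getdaft", "daft")):
    """Gather the details requested in a bug report as a dict of strings."""
    daft_version = "not installed"
    # Assumed distribution names; adjust to match how Daft was installed.
    for name in package_names:
        try:
            daft_version = metadata.version(name)
            break
        except metadata.PackageNotFoundError:
            continue
    return {
        "Operating system": platform.platform(),
        "Daft version": daft_version,
        "Python version": platform.python_version(),
        # Only reflects the environment variable, which defaults to py.
        "Runner": os.environ.get("DAFT_RUNNER", "py"),
    }


for key, value in collect_report_details().items():
    print(f"{key}: {value}")
```

Paste the printed lines into your report alongside a description of the problem.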

Proposing Features

Please start a GitHub Discussion in our Ideas channel. Once the feature is clarified, fleshed out and approved, the corresponding issue(s) will be created from the GitHub discussion.

When proposing features, please include:

Feature Summary (no more than 3 sentences)
Example usage (pseudo-code to show how it is used)
Corner-case behavior (how this code should behave in various corner-case scenarios)

Contributing Code

Development Environment

To set up your development environment:

Ensure that your system has a suitable Python version installed (>=3.7, <=3.11)
Install the Rust compilation toolchain
Clone the Daft repo: git clone git@github.com:Eventual-Inc/Daft.git
Run make .venv from your newly cloned Daft repository to create a new virtual environment with all of Daft's development dependencies installed
Run make hooks to install pre-commit hooks: these will run tooling on every commit to ensure that your code meets Daft development standards
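Before running the steps above, you may want to confirm that your interpreter falls in the supported range. This is a small sketch under one assumption: it reads "<=3.11" as "any 3.11.x release", comparing only the major and minor version. The helper name is_supported is hypothetical.

```python
import sys

MIN_VERSION = (3, 7)   # lower bound stated in this guide
MAX_VERSION = (3, 11)  # upper bound, assumed to include every 3.11.x release


def is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version is within the supported range."""
    return MIN_VERSION <= tuple(version_info[:2]) <= MAX_VERSION


status = "supported" if is_supported() else "outside the supported range"
print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is {status}")
```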

Developing

make build: recompile your code after modifying any Rust code in src/
make test: run tests
DAFT_RUNNER=ray make test: set the runner to the Ray runner and run tests (DAFT_RUNNER defaults to py)
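To run the test suite under both runners in one go, the DAFT_RUNNER variable can be set programmatically. The sketch below assumes make is on your PATH and that you call it from the repository root; the helper names runner_env and run_tests_on_all_runners are hypothetical and not part of Daft.

```python
import os
import subprocess

RUNNERS = ("py", "ray")  # the two runner values named in this guide


def runner_env(runner, base_env=None):
    """Return a copy of the environment with DAFT_RUNNER set to `runner`."""
    if runner not in RUNNERS:
        raise ValueError(f"unknown runner: {runner!r}")
    env = dict(os.environ if base_env is None else base_env)
    env["DAFT_RUNNER"] = runner
    return env


def run_tests_on_all_runners():
    """Run `make test` once per runner, stopping at the first failure."""
    for runner in RUNNERS:
        print(f"Running tests with DAFT_RUNNER={runner}")
        subprocess.run(["make", "test"], env=runner_env(runner), check=True)
```

Call run_tests_on_all_runners() from the repository root after make build.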

Developing with Ray

To run a development version of Daft on a local Ray cluster, call daft.context.set_runner_ray() in your Python script, then build and run it as usual.

To use a remote Ray cluster, run the following steps on the same operating system version as your Ray nodes, in order to ensure that your binaries are executable on Ray.

mkdir wd: create the working directory, which will hold all the files to be submitted to Ray for a job
ln -s ../daft wd/daft: create a symbolic link in the working directory that points to the Python module (a relative link target is resolved from wd/, hence ../daft)
make build-release: create an optimized build so that the module is small enough to be uploaded to Ray. Run this again after modifying any Rust code in src/
ray job submit --working-dir wd --address "http://<head_node_host>:8265" -- python script.py: submit wd/script.py to be run on Ray
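If you submit jobs often, the last step can be wrapped in a small helper. This is only a sketch: build_submit_command is a hypothetical name, head.example.com is a placeholder host, and the port default simply mirrors the 8265 used in the command above.

```python
import shlex


def build_submit_command(head_node_host, script="script.py",
                         working_dir="wd", port=8265):
    """Build the argument list for the `ray job submit` step shown above."""
    return [
        "ray", "job", "submit",
        "--working-dir", working_dir,
        "--address", f"http://{head_node_host}:{port}",
        "--",
        "python", script,
    ]


# Placeholder host; replace it with the address of your Ray head node.
command = build_submit_command("head.example.com")
print(" ".join(shlex.quote(part) for part in command))
```

Pass the returned list to subprocess.run(command, check=True) to submit the job for real.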

Benchmarking

Benchmark tests are located in tests/benchmarks. Before running benchmarks, run make build-release instead of make build so that you compile an optimized build of Daft.

pytest tests/benchmarks/[test_file.py] --benchmark-only: run all benchmarks in a file
pytest tests/benchmarks/[test_file.py] -k [test_name] --benchmark-only: run a specific benchmark in a file

More information about writing and using benchmarks can be found in the pytest-benchmark docs.
