Guidelines for contributing developers
This page explains the principles and development process that we ask contributing developers to follow.
Any contributions you make will be under the Apache 2.0 Software License.
In short, when you submit code changes, your submissions are understood to be under the same Apache 2.0 License that covers the Kedro project. You should have permission to share the submitted code.
You don't need to contribute code to help the Kedro project. See our list of other ways you can contribute to Kedro.
This guide is a practical description of:
- How to set up your development environment to contribute to Kedro.
- How to prepare a pull request against the Kedro repository.
To work on the Kedro codebase, you will need to be set up with Git and Make.
You will also need to create and activate a virtual environment. If this is unfamiliar to you, read through our prerequisites documentation.
Next, you'll need to fork the Kedro source code from the GitHub repository:
- Fork the project by clicking Fork in the top-right corner of the Kedro GitHub repository
- Choose your target account
If you need further guidance, consult the GitHub documentation about forking a repo.
You are almost ready to go. In your terminal, navigate to the folder into which you cloned your fork of the Kedro code.
Run these commands to install everything you need to work with Kedro:
make install-test-requirements
make install-pre-commit
Once the above commands have executed successfully, do a sanity check to ensure that kedro works in your environment:
make test
Note: If you are working on kedro-datasets, you may need additional system dependencies. If you find unrelated tests failing, you can either install all the dependencies, or simply go to the PR and check whether CI is failing.
Once you are ready to contribute, a good place to start is to take a look at the good first issues and help wanted issues on GitHub.
We focus on two areas for contribution: core and plugin:
- core refers to the primary Kedro library. Read the core contribution process for details.
- plugin refers to new functionality that requires a Kedro CLI command, e.g. adding in Airflow functionality and adding a new dataset to the kedro-datasets package. The plugin development documentation contains guidance on how to design and develop a Kedro plugin.
Typically, we only accept small contributions to the core Kedro library.
To contribute:
- Create a feature branch on your forked repository and push all your local changes to that feature branch.
- Is your change non-breaking and backwards-compatible? Your feature branch should branch off from:
  - main if you intend for it to be a non-breaking, backwards-compatible change.
  - develop if you intend for it to be a breaking change.
- Before you submit a pull request (PR), please ensure that unit tests, end-to-end (E2E) tests and linters are passing for your changes by running make test, make e2e-tests and make lint locally; see the development set up section above.
- Open a PR:
  - For backwards compatible changes, open a PR against the kedro-org:main branch from your feature branch.
  - For changes that are NOT backwards compatible, open a PR against the kedro-org:develop branch from your feature branch.
- Await reviewer comments.
- Update the PR according to the reviewer's comments.
- Your PR will be merged by the Kedro team once all the comments are addressed.
We will work with you to complete your contribution, but we reserve the right to take over abandoned PRs.
Give your pull request a descriptive title. Before you submit it, consider the following:
- You should aim for cross-platform compatibility on Windows, macOS and Linux
- We use Semantic Versioning for versioning
- We have designed our code to be compatible with Python 3.7 onwards and our style guidelines are (in cascading order):
  - PEP 8 conventions for all Python code
  - Google docstrings for code comments
  - PEP 484 type hints for all user-facing functions/class methods; e.g.
    from typing import Any, List

    def count_truthy(elements: List[Any]) -> int:
        return sum(1 for elem in elements if elem)
- Ensure that your PR builds cleanly before you submit it, by running the CI/CD checks locally, as follows:
  - make lint: PEP-8 Standards (ruff, black)
  - make test: unit tests, 100% coverage (pytest, pytest-cov)
  - make e2e-tests: end-to-end tests (behave)
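As an illustration of the style guidelines above, here is a minimal sketch of a function that combines PEP 8 layout, a Google-style docstring and PEP 484 type hints. It reuses the count_truthy example from the list; the docstring wording and the sample call are assumptions added for illustration, not code from the Kedro repository.

```python
from typing import Any, List


def count_truthy(elements: List[Any]) -> int:
    """Count the truthy items in a list.

    Args:
        elements: The items to inspect.

    Returns:
        The number of items that evaluate to ``True``.
    """
    return sum(1 for elem in elements if elem)


# Sample call: only 1 and "kedro" are truthy here.
print(count_truthy([0, 1, "", "kedro", None]))
```

The typing.List and typing.Any imports keep the annotation valid on Python 3.7, which does not support the built-in list[...] generic syntax.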
We place conftest.py files in some test directories to make fixtures reusable by any tests in that directory. To see which test fixtures are available and where they come from, run the following command:
pytest --fixtures path/to/the/test/location.py
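To show how a conftest.py fixture becomes available to tests in the same directory, here is a minimal sketch. The names build_sample_rows, sample_rows and test_sample_rows_have_ids are hypothetical and do not come from the Kedro test suite; the sketch assumes pytest is installed.

```python
# conftest.py (hypothetical example, not taken from the Kedro repository)
import pytest


def build_sample_rows():
    """Return a small list of records that tests can share."""
    return [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]


@pytest.fixture
def sample_rows():
    """Expose the shared records to every test in this directory."""
    return build_sample_rows()


# In a test module in the same directory, pytest injects the fixture
# by matching the argument name; no import of conftest.py is needed.
def test_sample_rows_have_ids(sample_rows):
    assert [row["id"] for row in sample_rows] == [1, 2]
```

In a real layout the test function would live in its own test file; it is shown in the same block here only to keep the sketch short.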
The Kedro repository requires that you squash and merge your pull request commits, and, in most cases, the merge message for a squash merge then defaults to the pull request title.
For clarity, your pull request title should be descriptive, and we ask you to follow some guidelines suggested by Chris Beams in his post How to Write a Git Commit Message. In particular, for your pull request title, we suggest that you:
- Limit the length to 50 characters
- Capitalise the first letter of the first word
- Omit the period at the end
- Use the imperative mood
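The mechanical parts of these title guidelines can be checked with a few lines of code. The following is a minimal sketch; check_pr_title is a hypothetical helper, not part of Kedro or its CI, and it does not attempt to verify the imperative mood, which needs human judgement.

```python
from typing import List


def check_pr_title(title: str, max_length: int = 50) -> List[str]:
    """Return a list of guideline violations for a pull request title."""
    problems = []
    if len(title) > max_length:
        problems.append(f"longer than {max_length} characters")
    # A title starting with a digit or symbol is also reported here.
    if title and not title[0].isupper():
        problems.append("first letter is not capitalised")
    if title.endswith("."):
        problems.append("ends with a period")
    return problems


print(check_pr_title("Add support for nested parameters"))
print(check_pr_title("fixed the bug."))
```

An empty list means the title passes the three automated checks.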
pre-commit hooks run checks automatically on all the changed files on each commit but can be skipped with the --no-verify or -n flag:
git commit --no-verify <...>
All checks will run during the CI build, so skipping checks on commit will not allow you to merge your code with failing checks. You can uninstall the pre-commit hooks by running:
make uninstall-pre-commit
pre-commit will still be used by make lint, but the git hooks will not be installed.
We require that all contributions comply with the Developer Certificate of Origin (DCO). This certifies that the contributor wrote or otherwise has the right to submit their contribution.
All commits must be signed off by including a Signed-off-by line in the commit message:
This is my commit message
Signed-off-by: Random J Developer <random@developer.example.org>
The sign-off can be added automatically to your commit message using the -s option:
git commit -s -m "This is my commit message"
To avoid needing to remember the -s flag on every commit, you might like to set up a git alias for git commit -s. Alternatively, run make sign-off to set up a commit-msg Git hook that automatically signs off all commits (including merge commits) you make while working on the Kedro repository.
If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
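If you want to check a commit message for a sign-off before pushing, the idea can be sketched as below. This is only an approximation under stated assumptions: has_sign_off and SIGN_OFF_PATTERN are hypothetical names, and the regular expression is a simplified guess at the trailer format, not the logic used by the actual DCO check.

```python
import re

# Simplified pattern: "Signed-off-by: Name <email>" on a line of its own.
SIGN_OFF_PATTERN = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE
)


def has_sign_off(commit_message: str) -> bool:
    """Return True if the message contains a Signed-off-by line."""
    return SIGN_OFF_PATTERN.search(commit_message) is not None


signed = (
    "This is my commit message\n\n"
    "Signed-off-by: Random J Developer <random@developer.example.org>"
)
print(has_sign_off(signed))
print(has_sign_off("This is my commit message"))
```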
Working on your first pull request? You can learn how from these resources:
Previous Q&A on GitHub discussions and the searchable archive of Slack discussions. You can ask new questions about the development process on Slack too!