Contributing to PyKale

Light involvements (viewers/users) | Medium involvements (contributors) | Heavy involvements (maintainers)

Ask questions | Report bugs | Suggest improvements | Branch, fork & pull | Coding style | Test | Review & merge | Release & management

Thank you for your interest! You can contribute to the PyKale project in the wide range of ways listed above, from light to heavy involvement. You can also reach us via email if needed. Participation in this open source project is subject to the Code of Conduct.

Light involvements (viewers/users)

See the ReadMe for installation instructions. Your contribution can start as light as asking questions.

Ask questions

Ask any questions about PyKale on PyKale's GitHub Discussions tab and we will discuss and answer them there. Questions help us identify blind spots in our development and can greatly improve the code quality.

Report bugs

Search current issues to see whether the bug has already been reported. If not, report it by creating an issue using the provided template. Even better, if you know how to fix it, suggest a fix and/or propose changes with a pull request.

Suggest improvements

Suggest possible improvements such as new features or code refactoring by creating issues using the respective templates. Even better, you are welcome to propose such changes with pull requests.

Medium involvements (contributors)

Following PyTorch, we use US English spelling and recommend spell checking via Grazie in PyCharm or Code Spell Checker in VS Code, with the US English setting.

Branch, fork and pull

A maintainer with write access can create a branch directly here in pykale to make changes under the shared repository model, following the steps below while skipping the fork step.

Anyone can use the fork and pull model to contribute code to PyKale:

  • Fork pykale (also see the guide on forking projects).
    • Keep the fork main branch synced with pykale:main by syncing a fork.
    • Install pre-commit to enforce style via pip install pre-commit and pre-commit install at the root.
  • Create a branch based on the latest main in your fork with a descriptive name on what you plan to do, e.g. to fix an issue, starting with the issue ticket number.
    • Make changes to this branch using detailed commit messages and following the coding style below. In particular, do frequent commits and small-scale pull requests to make them more focused and easier to review.
    • Sync your branch with the main frequently so that potential problems can be identified earlier.
    • Document the update in Google Style Python Docstrings. Update docs following docs update steps. Build docs via make html and verify the locally built documentation under docs\build\html.
    • Build tests and do tests (not enforced yet, to be done).
  • Create a draft pull request or pull request from the task branch above to the main branch pykale:main using a template, explaining the changes, adding one or more appropriate labels, and choosing one or more reviewers.
    • If merged, the title of your PR (typically starting with a verb) will automatically become part of the changelog in the next release, where the label of your PR will be used to group the changes into categories. Make the title and label precise and descriptive.
    • A draft pull request helps start a conversation with collaborators in a draft state. It will not be reviewed or merged until you change the status to “Ready for review” near the bottom of your pull request.
    • View the continuous integration (CI) status checks and fix any problems found. Some test actions may fail or be canceled for server-side reasons, particularly Test (macos-latest, ...). In such cases, re-running the test workflow later usually passes. Also, when the check messages say files are changed, they mean changes in the simulated environment, NOT on the branch.
    • You need to address merge conflicts if they arise. Resolve the conflicts locally.
    • After passing all CI checks and resolving the conflicts, you should request a review. If you know who is appropriate or like the suggested reviewers, request/assign that person. Otherwise, we will assign one shortly.
    • A reviewer will follow the review and merge guidelines. The reviewer may discuss with you and request explanations/changes before merging.
    • Merging to the main branch requires ALL checks to pass AND at least one approving review.
    • Small pull requests are preferred for easier review. In exceptional cases of a long branch with a large number of commits in a PR, you may consider breaking it into several smaller branches and PRs, e.g. via git-cherry-pick, for which a video is available to help.
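
The branch naming advice above (a descriptive name starting with the issue ticket number) can be sketched as a small helper. This is only an illustration: the function name `make_branch_name`, the hyphen-separated format, and the issue number 123 are assumptions, not a convention enforced by PyKale.

```python
import re


def make_branch_name(issue_number: int, description: str) -> str:
    """Build a branch name that starts with the issue number (hypothetical format)."""
    # Lowercase the description and collapse every run of characters that is
    # not a letter or digit into a single hyphen, then trim stray hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{issue_number}-{slug}"


# Hypothetical issue number and description, for illustration only.
print(make_branch_name(123, "Fix docstring typos in kale.embed"))
```

The resulting string can be passed to `git checkout -b` when creating the branch in your fork.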

Before pull requests: pre-commit hooks

We set up several pre-commit hooks to ensure code quality, including flake8 for linting, black and isort for formatting, and a check for added large files.

You need to install pre-commit and the hooks from the root directory via

pip install pre-commit
pre-commit install

This will install the pre-commit hooks at pykale\.git\hooks, to be triggered by each new commit to automatically run them over the files you commit. In this way, problems can be detected and fixed early. Several important points to note:

  • Pre-commit hooks are configured in .pre-commit-config.yaml. Only administrators should modify it.
  • These hooks, e.g., black and isort, will automatically fix some problems for you by changing the files, so please check the changes after you trigger commit.
  • If your commits cannot pass the above checks, read the error message to see what has been automatically fixed and what needs a manual fix, e.g. flake8 errors. Some flake8 errors may be fixed by other hooks, so you can rerun pre-commit (e.g. re-commit to trigger it) or just run flake8 to see the updated flake8 errors.
  • If your commits cannot pass the check for added large files and you see the error message json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0), try upgrading git to a version >= 2.29.2 to fix it.

Manual checks and fixes (be CAREFUL)

Required libraries will be automatically installed but if you wish, you may install them manually and run them from the root directory (so that the PyKale configurations are used). For example,

pip install black # The first time
black ./kale/embed/new_module.py # "black ." do it for all files
pip install isort # The first time
isort ./kale/embed/new_module.py # "isort ." do it for all files
pip install flake8 # The first time
flake8 ./kale/embed/new_module.py # "flake8 ." do it for all files

Running black and isort fixes the problems they find automatically by modifying the files, but the pre-commit hooks already run both, so you do not need to run them manually. Remaining flake8 or other errors need to be fixed manually.

Important: Run these commands from the root directory so that the PyKale configuration files (setup.cfg, pyproject.toml, and .pre-commit-config.yaml) are used for these tools. Otherwise, the default configurations will be used, which differ from the PyKale configurations and give inconsistent results.

IDE integration: flake8 linting can be set up for both VS Code and PyCharm, but you must use setup.cfg to configure it. This lets you fix linting errors as you go.

Automated GitHub workflows (continuous integration)

For continuous integration (CI) and continuous deployment (CD), we use several GitHub workflows (actions) that are triggered upon a push or pull request, as specified at pykale/.github/workflows/:

  • Build: install Python dependencies (set up)
  • Linting: run flake8 and pre-commit
  • Tests: unit and regression tests (in progress)

We will make the above more complete and rigorous, e.g. with more tests and code coverage analysis etc.

Pull request template

We have a pull request template. Please use it for all pull requests and mark the status of your pull requests.

  • Ready: ready for review and merge (if no problems found). Reviewers will be assigned.
  • Work in progress: for core team's awareness of this development (e.g. to avoid duplicated efforts) and possible feedback (e.g. to find problems early, such as linting/CI issues). Not ready to merge yet. Change it to Ready when ready to merge.
  • Hold: not for attention yet.

Coding style

We aim to design the core kale modules to be highly reusable, generic, and customizable, and follow these guidelines:

  • Follow the continuous integration practice of making small changes and committing frequently, with clear descriptions so that others can understand what you have done. This detects errors sooner, reduces the need for debugging, makes changes easier to merge, and saves time overall.
  • Use highly readable names for variables, functions, and classes. Using verbs is preferred when feasible for compactness. Use spell check with the US English setting, e.g., Grazie in PyCharm and Code Spell Checker in VS Code.
  • Use logging instead of print to log messages. Users can choose the level via, e.g., logging.getLogger().setLevel(logging.INFO). See the benefits.
  • Include detailed docstrings in code for generating documentation, following the Google Style Python Docstrings.
  • Highly reusable modules should go into kale. Highly data/example-specific code goes into Examples.
  • Configure learning systems using YAML following YACS. See our examples.
  • Use PyTorch and PyTorch Lightning (Video) as much as possible.
  • If high-quality existing code from other sources is used, add credit and license information at the top of the file.
  • Use pre-commit hooks to enforce consistent styles (via flake8, black, and isort), with common PyKale configuration files.
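
As a minimal sketch of two of the guidelines above (logging instead of print, and Google-style docstrings), consider the function below. The function `normalize` and its parameters are hypothetical and exist only for illustration; they are not part of the kale API.

```python
import logging

# Module-level logger named after the module, so users can filter its output.
logger = logging.getLogger(__name__)


def normalize(values, eps=1e-8):
    """Scale a list of numbers to zero mean and (approximately) unit variance.

    Hypothetical example function, not part of PyKale.

    Args:
        values (list[float]): Input numbers. Must not be empty.
        eps (float, optional): Small constant added to the standard deviation
            to avoid division by zero. Defaults to 1e-8.

    Returns:
        list[float]: The normalized values, in the same order as the input.

    Raises:
        ValueError: If ``values`` is empty.
    """
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    # Log through the logger rather than print, so users control verbosity.
    logger.info("Normalizing %d values (mean=%.3f, std=%.3f)", len(values), mean, std)
    return [(v - mean) / (std + eps) for v in values]


if __name__ == "__main__":
    # Users choose the level; INFO makes the message above visible.
    logging.basicConfig()
    logging.getLogger().setLevel(logging.INFO)
    print(normalize([1.0, 2.0, 3.0]))
```

With the root logger left at its default level (WARNING), the INFO message is suppressed, which is the main practical advantage over print in library code.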

Recommended development software

Testing

All new code should be covered by tests following the pykale test guidelines.

Heavy involvements (maintainers)

Review and merge pull requests

A maintainer assigned to review a pull request should follow GitHub guidelines on how to review changes in pull requests and incorporate changes from a pull request to review and merge the pull requests. Merging can be automated (see Automation), in which case an approving review will trigger the merging. You should NOT approve the changes if they are not ready to merge.

If you think you are not the right person to review, let the administrator (haipinglu) know for a reassignment. If multiple reviewers are assigned, anyone can approve and merge unless more approvals are explicitly required.

For simple problems, such as typos or broken hyperlinks, reviewers can fix them directly and push the changes rather than commenting and waiting for the author to fix them. This will speed up development.

Release and management

The release will be done manually in GitHub, but with automatic upload to PyPI.

Versions

We follow the Semantic Versioning guidelines. Given a version number MAJOR.MINOR.PATCH, increment the:

  • MAJOR version when you make incompatible API changes,
  • MINOR version when you add functionality in a backwards compatible manner, and
  • PATCH version when you make backwards compatible bug fixes.

Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
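
The increment rules above can be sketched in a few lines of Python. This is an illustrative helper only: `bump_version` is a hypothetical name, not a PyKale release tool, and it handles plain MAJOR.MINOR.PATCH strings without pre-release or build metadata labels. Resetting the lower-order numbers to zero follows the Semantic Versioning specification.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next version for a "major", "minor", or "patch" change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":
        # Incompatible API change: reset MINOR and PATCH to zero.
        return f"{major + 1}.0.0"
    if change == "minor":
        # Backwards compatible new functionality: reset PATCH to zero.
        return f"{major}.{minor + 1}.0"
    if change == "patch":
        # Backwards compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


# Version numbers below are made up for illustration.
print(bump_version("1.4.2", "major"))  # 2.0.0
print(bump_version("1.4.2", "minor"))  # 1.5.0
print(bump_version("1.4.2", "patch"))  # 1.4.3
```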

Project boards

We set up project boards to manage the progress of development. A single default project contains all active and planned work, with automation.

Automation

We have adopted the following GitHub automations:

  • Automerge: merges automatically when 1) one approving review is completed; 2) all CI checks have passed; and 3) one maintainer has enabled the auto-merge for a PR.
  • Auto branch deletion: deletes the head branches automatically after pull requests are merged. Deleted branches can be restored if needed.
  • Project board automation: automates project board card management.

References

The following libraries from the PyTorch ecosystem are good resources to learn from:

  • PyTorchLightning: a lightweight PyTorch wrapper for high-performance AI research
  • GPyTorch: a highly efficient and modular implementation of Gaussian processes in PyTorch
  • Kornia: computer vision library for PyTorch by the OpenCV team
  • MONAI: deep learning-based healthcare imaging workflows
  • PyTorch_Geometric: deep learning library for graphs
  • TensorLy: a library for tensor learning in Python
  • Torchio: medical image pre-processing and augmentation toolkit for deep learning