From 2f49d709ba07c655539f0397ebe6269ee09abc02 Mon Sep 17 00:00:00 2001
From: Ralf Gommers
Date: Wed, 28 Mar 2012 22:48:19 +0200
Subject: [PATCH] DOC: add document on how to contribute to SciPy.

---
 doc/HOWTO_CONTRIBUTE.rst.txt | 246 +++++++++++++++++++++++++++++++++++
 1 file changed, 246 insertions(+)
 create mode 100644 doc/HOWTO_CONTRIBUTE.rst.txt

diff --git a/doc/HOWTO_CONTRIBUTE.rst.txt b/doc/HOWTO_CONTRIBUTE.rst.txt
new file mode 100644
index 000000000000..d659d58fc57c
--- /dev/null
+++ b/doc/HOWTO_CONTRIBUTE.rst.txt
@@ -0,0 +1,246 @@
+Contributing to SciPy
+=====================
+
+This document aims to give an overview of how to contribute to SciPy. It
+tries to answer commonly asked questions, and provide some insight into how the
+community process works in practice. Readers who are familiar with the SciPy
+community and are experienced Python coders may want to jump straight to the
+`git workflow`_ documentation.
+
+
+Contributing new code
+---------------------
+
+If you have been working with the scientific Python toolstack for a while, you
+probably have some code lying around of which you think "this could be useful
+for others too". Perhaps it's a good idea then to contribute it to SciPy or
+another open source project. The first question to ask is then, where does
+this code belong? That question is hard to answer here, so we start with a
+more specific one: *what code is suitable for putting into SciPy?*
+Almost all of the new code added to scipy has in common that it's potentially
+useful in multiple scientific domains and it fits in the scope of existing
+scipy submodules. In principle new submodules can be added too, but this is
+far less common. For code that is specific to a single application, there may
+be an existing project that can use the code. Some scikits (`scikit-learn`_,
+`scikits-image`_, `statsmodels`_, etc.) are good examples here; they have a
+narrower focus and because of that more domain-specific code than SciPy.
+
+Now if you have code that you would like to see included in SciPy, how do you
+go about it? The first step is to discuss it on the scipy-dev mailing list. All
+new features, as well as changes to existing code, are discussed and decided on
+there. You can, and probably should, already start this discussion before your
+code is finished.
+
+Assuming the outcome of the discussion on the mailing list is positive and you
+have a function or piece of code that does what you need it to do, what next?
+Before code is added to SciPy, it at least has to have good documentation, unit
+tests and correct code style.
+
+1. Unit tests
+    In principle you should aim to create unit tests that exercise all the code
+    that you are adding. This gives some degree of confidence that your code
+    runs correctly, also on Python versions and hardware or OSes that you don't
+    have available yourself. An extensive description of how to write unit
+    tests is given in the NumPy `testing guidelines`_.
+
+2. Documentation
+    Clear and complete documentation is essential in order for users to be able
+    to find and understand the code. Documentation for individual functions
+    and classes -- which includes at least a basic description, type and
+    meaning of all parameters and return values, and usage examples -- is put
+    in docstrings. Those docstrings can be read within the interpreter, and
+    are compiled into a reference guide in html and pdf format. Higher-level
+    documentation for key (areas of) functionality is provided in tutorial
+    format and/or in module docstrings. A guide on how to write documentation
+    is given in `how to document`_.
+
+3. Code style
+    Uniformity of style in which code is written is important to others trying
+    to understand the code. SciPy follows the standard Python guidelines for
+    code style, `PEP8`_. In order to check that your code conforms to PEP8,
+    you can use the `pep8 package`_ style checker.
+    Most IDEs and text editors have settings that can help you follow PEP8,
+    for example by replacing tabs with four spaces. Using `pyflakes`_ to
+    check your code is also a good idea.
+
+At the end of this document a checklist is given that may help to check if your
+code fulfills all requirements for inclusion in SciPy.
+
+Another question you may have is: *where exactly do I put my code*? To answer
+this, it is useful to understand how the SciPy public API is defined. For most
+modules the API is two levels deep, which means your new function should appear
+as ``scipy.submodule.my_new_func``. ``my_new_func`` can be put in an existing
+or new file under ``/scipy//``, its name is added to the ``__all__``
+list in that file (which lists all public functions in the file), and those
+public functions are then imported in ``/scipy//__init__.py``. Any
+private functions/classes should have a leading underscore (``_``) in their
+name. A more detailed description of the public API of SciPy is given
+in `SciPy API`_.
+
+Once you think your code is ready for inclusion in SciPy, you can send a pull
+request (PR) on Github. We won't go into the details of how to work with git
+here; this is described well in the `git workflow`_ section of the NumPy
+documentation and in the Github help pages. When you send the PR for a new
+feature, be sure to also mention this on the scipy-dev mailing list. This can
+prompt interested people to help review your PR. Assuming that you already got
+positive feedback before on the general idea of your code/feature, the purpose
+of the code review is to ensure that the code is correct, efficient and meets
+the requirements outlined above. In many cases the code review happens
+relatively quickly, but it's possible that it stalls. If you have addressed
+all feedback already given, it's perfectly fine to ask on the mailing list
+again for review (after a reasonable amount of time, say a couple of weeks, has
+passed).
+Once the review is completed, the PR is merged into the "master"
+branch of SciPy.
+
+The above describes the requirements and process for adding code to SciPy. It
+doesn't yet answer the question of how exactly decisions are made, and who
+makes them. The basic answer is: decisions are made by consensus, by everyone
+who chooses to participate in the discussion on the mailing list. This
+includes developers, other users and yourself. Aiming for consensus in the
+discussion is important -- SciPy is a project by and for the scientific Python
+community. In those rare cases that agreement cannot be reached, the
+`maintainers`_ of the module in question can decide the issue.
+
+
+Contributing by helping maintain existing code
+----------------------------------------------
+
+The previous section talked specifically about adding new functionality to
+SciPy. A large part of that discussion also applies to maintenance of existing
+code. Maintenance means fixing bugs, improving code quality or style,
+documenting existing functionality better, keeping build scripts up-to-date,
+etc. The SciPy `Trac`_ bug tracker contains all reported bugs,
+build/documentation issues, etc. Fixing issues described in Trac tickets helps
+improve the overall quality of SciPy, and is also a good way of getting
+familiar with the project. You may also want to fix a bug because you ran into
+it and need the function in question to work correctly.
+
+The discussion on code style and unit testing above applies equally to bug
+fixes. It is usually best to start by writing a unit test that shows the
+problem, i.e. it should pass but doesn't. Once you have that, you can fix the
+code so that the test does pass. That should be enough to send a PR for this
+issue. Unlike when adding new code, discussing this on the mailing list may
+not be necessary - if the old behavior of the code is clearly incorrect, no one
+will object to having it fixed.
+It may be necessary to add some warning or
+deprecation message for the changed behavior. This should be part of the
+review process.
+
+
+Other ways to contribute
+------------------------
+
+There are many ways to contribute other than contributing code. Participating
+in discussions on the scipy-user and scipy-dev *mailing lists* is a contribution
+in itself. The `scipy.org`_ *website* contains a lot of information on the
+SciPy community and can always use a new pair of hands. A redesign of this
+website is ongoing, see `scipy.github.com`_. The redesigned website is a
+static site based on Sphinx; the sources for it are
+also on Github at `scipy.org-new`_.
+
+The SciPy documentation is constantly being improved by many developers and
+users. You can send PRs that improve the documentation, but there's also a
+`documentation wiki`_ that is very convenient for making edits to docstrings
+(and doesn't require git knowledge). Anyone can register a username on that
+wiki, ask on the scipy-dev mailing list for edit rights and make edits. The
+documentation there is updated every day with the latest changes in the SciPy
+master branch, and wiki edits are regularly reviewed and merged into master.
+
+Code that doesn't belong in SciPy itself or in another package but helps users
+accomplish a certain task is valuable. `SciPy Central`_ is the place to share
+this type of code (snippets, examples, plotting code, etc.).
+
+
+Useful links, FAQ, checklist
+----------------------------
+
+Checklist before submitting a PR
+````````````````````````````````
+
+ - Are there unit tests with good code coverage?
+ - Do all public functions have docstrings including examples?
+ - Is the code style correct (PEP8, pyflakes)?
+ - Is the new functionality tagged with ``.. versionadded:: X.Y.Z``?
+ - Is the new functionality mentioned in the release notes of the next release?
+ - Is the new functionality added to the reference guide?
+ - In case of larger additions, is there a tutorial or more extensive
+   module-level description?
+ - In case compiled code is added, is it integrated correctly via setup.py
+   (and preferably also Bento/Numscons configuration files)?
+ - If you are a first-time contributor, did you add yourself to THANKS.txt?
+   Please note that this is perfectly normal and desirable - the aim is to
+   give every single contributor credit, and if you don't add yourself it's
+   simply extra work for the reviewer (or worse, the reviewer may forget).
+
+
+Useful SciPy documents
+``````````````````````
+
+ - The `how to document`_ guidelines
+ - NumPy/SciPy `testing guidelines`_
+ - `SciPy API`_
+ - SciPy `maintainers`_
+ - NumPy/SciPy `git workflow`_
+
+
+FAQ
+```
+
+*Can I use a programming language other than Python to speed up my code?*
+
+Yes. The languages used in SciPy are Python, Cython, C, C++ and Fortran. All
+of these have their pros and cons. If Python really doesn't offer enough
+performance, one of those languages can be used. Important concerns when
+using compiled languages are maintainability and portability. For
+maintainability, Cython is clearly preferred over C/C++/Fortran. Cython and C
+are more portable than C++/Fortran. A lot of the existing C and Fortran code
+in SciPy is older, battle-tested code that was only wrapped in (but not
+specifically written for) Python/SciPy. Therefore the basic advice is: use
+Cython. If there are specific reasons why C/C++/Fortran should be preferred,
+please discuss those reasons first.
+
+
+*There's overlap between Trac and Github, which do I use for what?*
+
+Trac_ is the bug tracker, Github_ the code repository. Before the SciPy code
+repository moved to Github, the preferred way to contribute code was to create
+a patch and attach it to a Trac ticket. The overhead of this approach is much
+larger than sending a PR on Github, so please don't do this anymore. Use Trac
+for bug reports, Github for patches.
+
+
+
+.. _scikit-learn: http://scikit-learn.org
+
+.. _scikits-image: http://scikits-image.org/
+
+.. _statsmodels: http://statsmodels.sourceforge.net/
+
+.. _testing guidelines: https://github.com/numpy/numpy/blob/master/doc/TESTS.rst.txt
+
+.. _how to document: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt
+
+.. _PEP8: http://www.python.org/dev/peps/pep-0008/
+
+.. _pep8 package: http://pypi.python.org/pypi/pep8
+
+.. _pyflakes: http://pypi.python.org/pypi/pyflakes
+
+.. _SciPy API: http://docs.scipy.org/doc/scipy/reference/api.html
+
+.. _git workflow: http://docs.scipy.org/doc/numpy/dev/gitwash/index.html
+
+.. _maintainers: https://github.com/scipy/scipy/blob/maintainers/doc/MAINTAINERS.rst.txt
+
+.. _Trac: http://projects.scipy.org/scipy/timeline
+
+.. _Github: https://github.com/scipy/scipy
+
+.. _scipy.org: http://scipy.org/
+
+.. _scipy.github.com: http://scipy.github.com/
+
+.. _scipy.org-new: https://github.com/scipy/scipy.org-new
+
+.. _documentation wiki: http://docs.scipy.org/scipy/Front%20Page/
+
+.. _SciPy Central: http://scipy-central.org/