Rework the contributor guide

biotite-dev · Apr 30, 2024 · 48b41ed
1 parent 30a6bb4
Showing 9 changed files with 579 additions and 433 deletions.
diff --git a/doc/contribute.rst b/doc/contribute.rst
diff --git a/doc/contribution/deployment.rst b/doc/contribution/deployment.rst
@@ -0,0 +1,79 @@
+Deployment of a new release
+===========================
+This section describes how to create and deploy a release build of the *Biotite*
+package and documentation.
+Therefore, this section primarily addresses the maintainers of the project.
+
+CCD update
+----------
+:mod:`biotite.structure.info` bundles selected information from the
+`Chemical Component Dictionary <https://www.wwpdb.org/data/ccd>`_ (CCD).
+From time to time, this dataset needs an update to include new components
+added to the CCD.
+This is achieved by running ``setup_ccd.py``.
+
+To keep the size of the repository small, the original commit from the initial
+script run should be rewritten, if the formats of the affected files are
+compatible with the original ones.
+
+Version bump
+------------
+A version bump requires changes in multiple locations:
+
+- ``src/biotite/__init__.py``: The main source of the version number.
+- ``doc/static/switcher.json``: The current version needs to be added and set
+  as the preferred one. This allows the documentation website to switch between
+  different versions of the documentation.
+
+The version bump is conducted by running the ``bump_version.yml`` CI job.
+It can be triggered via the GitHub Action ``Bump version``.
+This action creates a new pull request with the required changes.
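
As a rough illustration of the ``switcher.json`` change, the following sketch adds a new version entry and marks it as the preferred one. It assumes the entry layout of the PyData Sphinx Theme version switcher (``name``, ``version``, ``url``, ``preferred``). The helper name ``add_version``, the URL pattern and the version numbers are placeholders and are not taken from the actual ``bump_version.yml`` job.

```python
import json


def add_version(entries, version, url_template="https://example.org/{version}/"):
    """Return a new switcher list with ``version`` added as preferred entry."""
    # Copy all other entries and drop their 'preferred' flag,
    # so that only the new version carries it
    updated = [
        {key: val for key, val in entry.items() if key != "preferred"}
        for entry in entries
        if entry["version"] != version
    ]
    updated.append({
        "name": version,
        "version": version,
        "url": url_template.format(version=version),  # placeholder URL
        "preferred": True,
    })
    return updated


# Placeholder content standing in for the existing 'switcher.json'
old_entries = [
    {
        "name": "1.0.0",
        "version": "1.0.0",
        "url": "https://example.org/1.0.0/",
        "preferred": True,
    }
]
new_entries = add_version(old_entries, "1.1.0")
print(json.dumps(new_entries, indent=4))
```

The input list is left unmodified, so the result can be compared against the old content before it is written back to the file.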
+
+Creating a new release
+----------------------
+When a new *GitHub* release is created, the CI jobs building the distributions
+and documentation in ``test_and_deploy.yml`` are triggered.
+After the successful completion of these jobs, the artifacts are added to the
+release.
+The distributions for different platforms and Python versions are automatically
+uploaded to *PyPI*.
+
+Conda release
+-------------
+Some time after the release on GitHub, the ``conda-forge`` bot will also create
+an automatic pull request for the new release of the
+`Conda package <https://github.com/conda-forge/biotite-feedstock>`_.
+If no dependencies changed, this pull request can usually be merged without
+further effort.
+
+Documentation website
+---------------------
+The final step of the deployment is putting the directory containing the built
+documentation onto the server hosting the website.
+
+The document root of the website should look like this:
+
+.. code-block::
+
+   ├─ .htaccess
+   ├─ latest -> x.y.z/
+   ├─ x.y.z/
+   │  ├─ index.html
+   │  └─ ...
+   └─ a.b.c/
+      ├─ index.html
+      └─ ...
+
+``x.y.z/`` and ``a.b.c/`` represent the documentation directories for two
+different *Biotite* release versions.
+
+``.htaccess`` should have the following content:
+
+.. code-block:: apache
+
+   RewriteBase /
+   RewriteEngine On
+   # Redirect if page name does not start with 'latest' or version identifier
+   RewriteRule ^(?!latest|\d+\.\d+\.\d+|robots.txt)(.*) latest/$1 [R=301,L]
+
+   ErrorDocument 404 /latest/404.html
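
To see which requests the rewrite rule would redirect, the pattern can be tried out with Python's ``re`` module, which supports the same negative lookahead syntax. This is only a sketch: it assumes that the path is matched without its leading slash, as is usual for rules in a per-directory context, and the function name ``rewrite`` is made up for this illustration.

```python
import re

# Pattern taken from the RewriteRule in the '.htaccess' example
PATTERN = re.compile(r"^(?!latest|\d+\.\d+\.\d+|robots.txt)(.*)")


def rewrite(path):
    """Return the redirect target or None, if the path is served as it is."""
    match = PATTERN.match(path)
    if match is None:
        # Path starts with 'latest', a version identifier or 'robots.txt'
        return None
    return "latest/" + match.group(1)


for path in [
    "index.html",
    "latest/index.html",
    "1.2.3/index.html",
    "robots.txt",
]:
    print(path, "->", rewrite(path))
```

Only the first path is redirected (to ``latest/index.html``); the other three already start with one of the excluded prefixes.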
diff --git a/doc/contribution/development.rst b/doc/contribution/development.rst
@@ -0,0 +1,187 @@
+Writing source code
+===================
+
+Scope
+-----
+The scope of *Biotite* comprises methods that make up the backbone of
+computational molecular biology. Thus, new functionalities added to
+*Biotite* should be relatively general and well established.
+
+Code whose purpose is too specialized can be published as an
+:ref:`extension package <extension_packages>` instead.
+
+Consistency
+-----------
+New functionalities should act on the existing central classes where
+applicable, to keep the code as uniform as possible.
+Specifically, these include
+
+- :class:`biotite.structure.AtomArray`,
+- :class:`biotite.structure.AtomArrayStack`,
+- :class:`biotite.structure.BondList`,
+- :class:`biotite.sequence.Sequence` and its subclasses,
+- :class:`biotite.sequence.Alphabet`,
+- :class:`biotite.sequence.Annotation`,
+  including :class:`biotite.sequence.Feature`
+  and :class:`biotite.sequence.Location`,
+- :class:`biotite.sequence.AnnotatedSequence`,
+- :class:`biotite.sequence.Profile`,
+- :class:`biotite.sequence.align.Alignment`,
+- :class:`biotite.application.Application` and its subclasses,
+- and in general :class:`numpy.ndarray`.
+
+If you think that the currently available classes miss a central *object*
+in bioinformatics, consider opening an issue on *GitHub* or reaching
+out to the maintainers.
+
+Small *helper classes* for a functionality (for example an :class:`Enum` for a
+function parameter) are also permitted, as long as they do not introduce
+redundancy with the classes mentioned above.
+
+Python version and interpreter
+------------------------------
+The package supports the three most recent versions of Python.
+In consequence, language features that were introduced after the oldest
+supported Python version are not allowed.
+
+This time span balances support for older Python versions against
+the ability to use more recent features of the programming language.
+Furthermore, this package is currently made for usage with CPython.
+Official support for PyPy might be added someday.
+
+Code style
+----------
+*Biotite* is in compliance with PEP 8.
+The maximum line length is 79 for code lines and 72 for docstring and
+comment lines.
+An exception is made for docstring lines, if it is not possible to use a
+maximum of 72 characters (e.g. tables), and for
+`doctest <https://docs.python.org/3/library/doctest.html>`_ lines,
+where the actual code may take up to 79 characters.
+
+Dependencies
+------------
+*Biotite* aims to rely only on a few dependencies to keep the installation
+small.
+However, optional dependencies for a specific functionality are also allowed
+if necessary.
+In this case add your special dependency to the list of extra
+requirements in ``install.rst``.
+The import statement for the dependency should be located directly inside the
+function or class, rather than at module level, to ensure that the package is
+not required for any other functionality or for building the API documentation.
+
+An example of this approach is the support for trajectory files in
+:mod:`biotite.structure.io`, which requires `MDTraj <http://mdtraj.org/>`_.
+The usage of these packages is not only allowed but even encouraged.
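
A minimal sketch of such a function-level import is shown below. The package ``hypothetical_trajectory_lib``, its ``load()`` function and the function ``count_frames()`` are invented for this illustration and do not refer to real *Biotite* or *MDTraj* API.

```python
def count_frames(file_name):
    """Hypothetical function that relies on an optional dependency."""
    try:
        # Imported inside the function instead of at module level:
        # importing the surrounding module never requires the package
        import hypothetical_trajectory_lib as traj_lib
    except ImportError as err:
        raise ImportError(
            "count_frames() requires the optional package "
            "'hypothetical_trajectory_lib'"
        ) from err
    # Hypothetical API of the optional package
    return len(traj_lib.load(file_name))


# Defining the function works without the package being installed,
# only calling it raises the error
try:
    count_frames("trajectory.xtc")
except ImportError as err:
    print(err)
```

Wrapping the import in ``try``/``except`` is optional; it merely gives the user a clearer hint about which package is missing.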
+
+Code efficiency
+---------------
+The central aims of *Biotite* are to be both convenient and fast.
+Therefore, the code should be vectorized as much as possible using *NumPy*.
+In cases where the problem cannot be reasonably or conveniently solved this way,
+writing modules in `Cython <https://cython.readthedocs.io/en/latest/>`_ is the
+preferred way to go.
+Writing extensions directly in C/C++ is discouraged due to their poor readability.
+Writing extensions in other programming languages
+(e.g. in *Rust* via `PyO3 <https://pyo3.rs>`_) is currently not permitted to
+keep the build process simple.
+
+Docstrings
+----------
+*Biotite* uses
+`numpydoc <https://numpydoc.readthedocs.io/en/latest/format.html>`_
+formatted docstrings for its documentation.
+These docstrings can be interpreted by *Sphinx* via the ``numpydoc`` extension.
+All publicly accessible attributes must be fully documented.
+This includes functions, classes, methods, instance and class variables and the
+``__init__`` modules:
+
+The ``__init__`` module documentation summarizes the content of the entire
+subpackage, since the single modules are not visible to the user.
+In the class docstring, the class itself is described and the constructor is
+documented.
+The publicly accessible instance variables are documented under the
+`Attributes` headline, while class variables are documented in their separate
+docstrings.
+Methods do not need to be summarized in the class docstring.
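
The following sketch shows how these rules could look for a small class. ``Interval`` is a made-up example and not part of *Biotite*; the section names follow the numpydoc format.

```python
class Interval:
    """
    A closed interval of integer positions.

    Parameters
    ----------
    first, last : int
        The inclusive start and end position of the interval.

    Attributes
    ----------
    first, last : int
        The inclusive start and end position of the interval.
    """

    def __init__(self, first, last):
        self.first = first
        self.last = last

    def length(self):
        """
        Get the number of positions covered by the interval.

        Returns
        -------
        length : int
            The number of covered positions.
        """
        return self.last - self.first + 1
```

The constructor parameters are documented in the class docstring, so ``__init__()`` itself carries no docstring, and ``length()`` is documented only in its own docstring.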
+
+Module imports
+--------------
+In *Biotite*, the user imports packages rather than single modules
+(similar to *NumPy*).
+In order for that to work, the ``__init__.py`` file of each *Biotite*
+subpackage needs to import all of its modules whose content is publicly
+accessible, using relative imports.
+
+.. code-block:: python
+
+   from .module1 import *
+   from .module2 import *
+
+Import statements should be the only statements in an ``__init__.py`` file.
+
+In case a module needs functionality from another subpackage of *Biotite*,
+use a relative import.
+This import should target the module directly and not the package to avoid
+circular imports and thus an ``ImportError``.
+Import statements like the following are therefore fine:
+
+.. code-block:: python
+
+   from ...package.subpackage.module import foo
+
+In order to prevent namespace pollution, all modules must define the `__all__`
+variable with all publicly accessible attributes of the module.
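
The effect of ``__all__`` on a star import can be checked with a throwaway module that is built in memory. This is only a demonstration: the module name ``module1`` and its functions are placeholders, and in a real package the source would live in its own file.

```python
import sys
import types

# Source of a placeholder module, which imports 'os' for internal use
# and exposes only 'foo' as public attribute
source = '''
import os

__all__ = ["foo"]

def foo():
    return "public"

def helper():
    return "internal"
'''
module = types.ModuleType("module1")
exec(source, module.__dict__)
sys.modules["module1"] = module

# A star import, as used in '__init__.py', only picks up names
# listed in '__all__'
namespace = {}
exec("from module1 import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Without the ``__all__`` definition, ``helper`` and the imported ``os`` module would end up in the namespace of the subpackage as well.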
+
+Versioning
+----------
+*Biotite* adopts `Semantic Versioning <https://semver.org>`_ for its releases.
+This means that the version number is composed of three parts:
+
+- Major version: Incremented when incompatible API changes are made.
+- Minor version: Incremented when a new functionality is added in a backwards
+  compatible manner.
+- Patch version: Incremented when backwards compatible bug fixes are made.
+
+Note that backwards-incompatible changes in minor/patch versions are only
+disallowed with regard to the *public API*.
+This means that names and types of parameters and the type of the return value
+must not be changed in any function/class documented in the API reference.
+However, behavioral changes (especially small ones) are allowed.
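
The three-part scheme can be sketched as a small helper that computes the next version number for a given kind of change. The function ``bump()`` is purely illustrative and is not the logic used by the project's CI.

```python
def bump(version, part):
    """Return the version following ``version`` for the given kind of change."""
    major, minor, patch = (int(number) for number in version.split("."))
    if part == "major":
        # Incompatible API change: reset minor and patch version
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards compatible new functionality: reset patch version
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part '{part}'")


for part in ["major", "minor", "patch"]:
    print(part, "->", bump("1.2.3", part))
```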
+
+Although minor versions may not remove existing functionalities, they can
+deprecate them by
+
+- marking them as deprecated via a notice in the docstring and
+- raising a `DeprecationWarning` when a deprecated functionality is used.
+
+This gives the user a heads-up that the functionality will be removed soon.
+In the next major version, deprecated functionalities can be removed entirely.
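
A deprecation along these lines could look like the sketch below, using the standard ``warnings`` module. The function names ``old_function()`` and ``new_function()`` are placeholders.

```python
import warnings


def new_function(x):
    """Double the input."""
    return x * 2


def old_function(x):
    """
    Double the input.

    DEPRECATED: Use :func:`new_function()` instead.
    """
    warnings.warn(
        "'old_function()' is deprecated, use 'new_function()' instead",
        DeprecationWarning,
        # Attribute the warning to the caller of 'old_function()'
        stacklevel=2,
    )
    return new_function(x)


# Record the warning to show that it is actually raised
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_function(21)
print(result, caught[0].category.__name__)
```

The deprecated function keeps working by delegating to its replacement, so existing user code is not broken before the next major version.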
+
+.. _extension_packages:
+
+Extension packages
+------------------
+*Biotite* extension packages are Python packages that provide further
+functionality for *Biotite* objects (:class:`AtomArray`, :class:`Sequence`,
+etc.) or offer objects that build upon them.
+
+There can be good reasons to publish code as an extension
+package instead of contributing it directly to the *Biotite* project:
+
+   - Independent development
+   - An incompatible license
+   - The code's use cases are too specialized
+   - Unsuitable dependencies
+   - Extensions written in a non-permitted programming language
+
+If your code fulfills the following conditions
+
+   - extends *Biotite* functionality
+   - is documented
+   - is well tested
+
+you can open an issue to ask for addition of the package to the
+:doc:`extension package page <../extensions>`.