Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
Requirements for code to be included into MDAnalysis
The aim of this guide is to have a uniform and well-tested code base for MDAnalysis in order to improve code quality and maintainability. The Style Guide prescribes various aspects that new code has to conform to; old code should be refactored to conform to these requirements.
(For background and history see Issue #404, which contains the original discussion on the Style Guide.)
MDAnalysis is a project with a long history and many contributors and hasn't used a consistent coding style. Since version 0.11.0 we are trying to update all the code to conform with PEP8. Our strategy is to update the style every time we touch an old function and thus switch to PEP8 continuously.
- keep line length to 79 characters or less; break long lines sensibly
- indent with spaces and use 4 spaces per level
- naming (follows PEP8):
- classes: CapitalClasses (i.e. capitalized nouns without spaces)
- methods and functions: underscore_methods (lower case, with underscores for spaces)
Python 2/3 compatibility
MDAnalysis strives to be compatible with python 2 and 3 at the same time. To deal with the differences
we use six which takes care of loading the appropriate functions for each version of python. So instead of
iterzip use the
range function provided by six like this.
from six.moves import zip, range for i,j in zip(range(3), range(3)): print(i, j)
We recommend that you either use a Python IDE (PyCharm and others). You can also use external tools like flake8. For integration of external tools with emacs and vim check out elpy (emacs) and python-mode (vim).
To apply the code formatting in an automated way you can also use code formatters. As external tools there are autopep8 and yapf. Most IDE's either have their own code formatter or will work with one of the above through plugins.
We are striving to keep module dependencies small and lightweight (i.e., easily installable with
General rules for importing
- Imports must all happen at the start of a module (not inside classes or functions).
- Modules must be imported in the following order:
from __future__ import absolute_import, print_function, division)
- Compatibility imports (eg six)
- global imports (installed packages)
- local imports (MDAnalysis modules)
- future (
- use absolute imports in the library (i.e. relative imports must be explicitly indicated), e.g.,
from __future__ import absolute_import from six.moves import range import numpy as np import .core import ..units
Module imports in
MDAnalysis.analysis, all imports must be at the top level (as in the General Rule) — see #666
- Optional modules can be imported.
- No analysis module is imported automatically at the
MDAnalysis.analysislevel to avoid breaking the installation when optional dependencies are missing.
Module imports in the test suite
- In the test suite, use the order above, but import
- Do not use relative imports (e.g.
import .datafiles) in the test suite because it breaks running the tests from inside the test directory (see #189)
- Never import the
MDAnalysismodule from the
MDAnalysisTestsor from any of its plugins (it's ok to import from test files). Importing
MDAnalysisfrom the test setup code will cause severe coverage underestimation.
List of core module dependencies
Any module from the standard library can be used, as well as the following nonstandard libraries:
because these packages are always installed.
If you must depend on a new external package, first discuss its use on the developer mailing list or as part of the issue/PR.
Modules in the "core"
The core of MDAnalysis contains all packages that are not in
MDAnalysis.visualization. Only packages in the List of core module dependencies can be imported.
MDAnalysis.analysis are considered independent. Each can have its own set of dependencies. We strive to keep them small as well but module authors are in principle free to import what they need. These dependencies beyond the core dependencies are optional dependencies (and should be listed in
setup.py under analysis). (See also #1159 for a discussion.)
A user who does not have a specific optional package installed must still be able to import everything else in MDAnalysis. An analysis module may simply raise an
ImportError if a package is missing but it is recommended that it should print and log an error message notifying the user that a specific additional package needs to be installed to use this module.
If a large portion of the code in the module does not depend on a specific optional module then you should guard the import at the top level with a
try/except, print and log a warning, and only raise an
ImportError in the specific function or method that would depend on the missing module.
Package and test suite
The source code is distributed over the
package directory (the library and documentation) and the
testsuite (test code and data files). Commits can contain code in both directories, e.g. a new feature and a test case can be committed together.
MDAnalysis contains compiled code (cython, C, C++) in addition to pure Python. With #444 we agreed on the following layout:
Place all source files for compiled shared object files into the same directory as the final shared object file.
*.pyx files and cython-generated
*.c would be in the same directory as the
*.so. External dependent C/C++/Fortran libraries are in dedicated
include folders. See the following tree as an example.
MDAnalysis |--lib | |-- _distances.so | |-- distances.pyx | |-- distances.c |-- coordinates |-- _dcdmodule.so |-- src |-- dcd.c |-- include |-- dcd.h
We use git for version control. The DevelopmentWorkflow uses pull requests against the DevelopmentBranch (named develop) followed by automated testing and code review by project members. Successful PRs are merged into develop.
Follow git commits.
- Tests are mandatory for all new contributions.
- Functional tests of individual methods of classes or functions are preferred but "integration tests" that test most of the functionality in a single test are also acceptable.
- We strive for test coverage > 90% — check the coverage of testing when the PR is automatically tested.
- Try to test exceptions (i.e. that your code fails in predictable ways, see #597).
- Tests must follow the Style Guide: Tests
See Writing Tests (but that needs to be cleaned up) on more background and details on how to structure tests and how to include data files.
Tests for the core
The core (everything except
MDAnalysis.visualization) is critical and special scrutiny is applied to all changes and additions to the core. Good tests are absolutely required and code will not be merged unless extensive and comprehensive tests are also provided.
#743 outlines the testing requirements for analysis code:
New code contributions
- New code for
MDAnalysis.analysismust come with unit tests. All old tests and the new tests must pass before code is merged into develop.
- Tests for analysis classes and functions should at a minimum perform regression tests, i.e., run on input and compare to values generated when the code was added so that we know when the output changes in the future. (Even better are tests that test for absolute correctness of results but regression tests are the minimum requirement.)
Existing code in
Any code in
MDAnalysis.analysis that does not have substantial testing (at least 70% coverage) will be moved to a special
MDAnalysis.analysis.legacy module (by release 1.0.0) that will come with its own warning that this is essentially unmaintained functionality that is still provided because there's no alternative. Legacy packages that receive sufficient upgrades in testing can come back to the normal
MDAnalysis.analysis name space.
No consensus has emerged yet how to best test visualization code. At least minimal tests that run the code are typically requested.