Over time, we have made some significant performance improvements to PlasmaPy's continuous integration (CI) tests and documentation build. Rapid CI is incredibly helpful! When tests are quick and easy to run, we run tests more often and get more immediate feedback. We've gotten to the point that our GitHub workflow for performing tests takes ∼50 s (when we exclude slow tests) and our documentation build takes ∼250 s. The current times are improvements by a factor of ∼3–4 compared to the longest durations in years past. I wanted to summarize the changes we made, with the highest effort-to-impact ratio changes listed first, along with some other possibilities for speeding things up that we've considered.
The main takeaway points are:
If you use tox, enable tox-uv. If you use nox, enable the uv backend.
Use pytest-xdist to perform tests in parallel.
Have a test run that skips all tests that take ≳ 0.3 s.
Adopt pre-commit hooks to find and autofix issues without having to wait for CI to finish.
Minimize the size of the rst_prolog and rst_epilog configuration variables for Sphinx.
Changes that made a significant improvement
Using tox-uv as an extension for tox
PlasmaPy uses tox as its test runner. uv was recently released by Astral as a drop-in replacement for pip, pip-tools, and virtualenv. The tox-uv extension to tox makes it so that uv creates the virtual environment during tox runs. This extension can be enabled by installing tox-uv in the Python environment that tox is run from, which is done in #2584. Similarly, nox has options to use uv as its backend.
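For nox, the uv backend can be selected in the noxfile. A minimal sketch, assuming a nox version with uv support (the session name and installed packages are illustrative):

```python
# noxfile.py — minimal sketch; the "tests" session is illustrative
import nox

# Create session virtual environments with uv instead of virtualenv
nox.options.default_venv_backend = "uv"


@nox.session
def tests(session):
    session.install("pytest")
    session.run("pytest", "-m", "not slow")
```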
I recorded some timings. When using the previous toolchain (pip, etc.), the overhead for running pytest in a GitHub workflow was 155 s. With uv, the overhead dropped to 16 s: an order of magnitude improvement! (These timings exclude caching .tox as described below, as well as actually running pytest.) Because tox is used for both tests and documentation builds, using tox-uv significantly speeds up several of our workflows compared to when we were using pip to recreate the environment. Using uv will not speed everything up, though (e.g., compiling code will take just as long).
Using pytest-xdist
pytest-xdist is a pytest extension that allows tests to be run in parallel. If pytest-xdist is installed, then tests can be run on the available processors with pytest -n auto. Each GitHub workflow can currently use up to two processes, so using pytest-xdist can provide a significant improvement. Running tests locally, where more processors are typically available, can see even greater speedups.
When we first attempted to use pytest-xdist a pandemic timescale ago, we ran into problems because our test order was not deterministic (#750) due to set operations. Making the test order deterministic resolved this problem.
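As a sketch of the kind of fix involved (the particle names here are hypothetical): iterating over a set during test parametrization yields a different order in each process, which pytest-xdist workers cannot reconcile, while sorting restores determinism.

```python
import pytest

# A set has no guaranteed iteration order across processes, so
# parametrizing over it directly can break pytest-xdist collection.
IONS = {"p+", "e-", "alpha"}


@pytest.mark.parametrize("particle", sorted(IONS))  # deterministic order
def test_particle_symbol_is_nonempty(particle):
    assert particle
```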
Skipping slow tests in at least one test run
We have decorated nearly all tests that take ≳ 0.3 s with @pytest.mark.slow, found using pytest --durations=25. We can skip these tests with pytest -m "not slow". With my current setup, this command runs ∼96% of PlasmaPy's tests in ∼30–40 s. On pull requests, we skip slow tests in all but one of our testing workflows. In the remaining workflow, we still perform all tests so that we can get accurate code coverage. We also perform all tests for multiple platforms and versions of Python as a weekly cron job.
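A minimal sketch of marking a slow test (the test body is illustrative; the slow marker should also be registered, e.g. under the markers option of [tool.pytest.ini_options], to avoid warnings):

```python
import time

import pytest


@pytest.mark.slow  # skipped when running: pytest -m "not slow"
def test_expensive_computation():
    time.sleep(0.5)  # stands in for a genuinely slow computation
    assert True
```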
Caching .tox between GitHub workflows
In #2552, we made it so that the .tox directory is cached between runs. In doing this, we made it so that the Python environment does not need to be recreated every time we run tests. Caching .tox saved more time than using tox-uv, but not by that much (as shown in these timings).
How to set up and invalidate the cache can be tricky. Probably the easiest approach is to have the cache be valid only on the day it was created.
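A sketch of what this could look like in a GitHub Actions workflow, assuming actions/cache (the step names, cache key scheme, and hashed files are illustrative):

```yaml
- name: Get current date
  id: date
  run: echo "date=$(date +%F)" >> "$GITHUB_OUTPUT"

- name: Cache .tox
  uses: actions/cache@v4
  with:
    path: .tox
    # Including the date makes the cache valid only on the day it was created
    key: tox-${{ runner.os }}-${{ steps.date.outputs.date }}-${{ hashFiles('tox.ini') }}
```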
Because enabling tox-uv is significantly less effort than setting up a cache and provides almost as much of a speed-up, I suggest trying out tox-uv first. Caching .tox may still be worth it for active repositories. For us, caching .tox saves ∼10 s on top of using tox-uv.
Adopting ruff and ruff-format for linting and formatting
We recently adopted ruff and ruff-format to replace flake8, black, and isort. While these tools are very fast, the biggest impact has been that ruff performs autofixes for ∼1/3 of its ∼700 linter rules. If pre-commit.ci has been enabled and ruff is included as a pre-commit check, then autofixes can be applied automatically by commenting pre-commit.ci autofix on a pull request.
Adopting pre-commit hooks to lint documentation
PlasmaPy's documentation takes ∼4–5 minutes to build. However, several pre-commit hooks can find common issues without waiting for the full documentation build. These include sphinx-lint; rst-directive-colons and rst-inline-touching-normal from pygrep-hooks; and ruff with most of its pydocstyle rule set.
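A sketch of the corresponding .pre-commit-config.yaml fragment (the rev values are placeholders and should be pinned to current releases):

```yaml
repos:
  - repo: https://github.com/sphinx-contrib/sphinx-lint
    rev: v1.0.0  # placeholder — pin to a real release
    hooks:
      - id: sphinx-lint

  - repo: https://github.com/pre-commit/pygrep-hooks
    rev: v1.10.0  # placeholder — pin to a real release
    hooks:
      - id: rst-directive-colons
      - id: rst-inline-touching-normal
```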
Minimizing usage of rst_prolog and rst_epilog with Sphinx
The rst_prolog and rst_epilog configuration variables contain text that gets added to every reStructuredText source file processed by Sphinx. PlasmaPy formerly defined dozens of lines of reStructuredText substitutions in one of these variables. Because the contents of these variables were added to every reStructuredText source, they slowed down PlasmaPy's documentation build by several minutes.
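One way to avoid this bloat is the sphinxcontrib-globalsubs extension, which makes substitutions available everywhere without injecting them into every file. A conf.py sketch (the extension module path and the substitution shown are assumptions based on that extension's documentation):

```python
# conf.py sketch, assuming sphinxcontrib-globalsubs is installed
extensions = [
    "sphinxcontrib.globalsubs",
    # ... other extensions ...
]

# Substitutions defined here do not get appended to every
# reStructuredText source file.
global_substitutions = {
    "minpython": "Python 3.10",  # illustrative substitution
}

rst_epilog = ""  # keep this (and rst_prolog) as small as possible
```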
We sped up our doc builds by a factor of ∼2–3 by greatly shortening rst_epilog by:
Moving substitutions that were used in ≲ 2 files from rst_epilog to the files they were used in.
Using sphinxcontrib-globalsubs to define global substitutions that replaced the substitutions and links that were formerly in rst_epilog (#2281). This approach also sped up Astropy's doc build from ∼1038 s to ∼878 s (a 15% improvement; see astropy/astropy#16162).
Changes that we've considered
Skipping execution of example notebooks in some doc builds
Several example Jupyter notebooks are executed during Sphinx builds. Skipping execution of notebooks in some documentation builds would save ∼30 s per doc build (#2563). (We've also pre-executed some of the notebooks.) The doc build done via a GitHub workflow could skip notebook execution to provide quick feedback (except if notebooks are modified), while Read the Docs could execute notebooks to provide slower but more complete feedback, along with a more accurate preview.
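Assuming the notebooks are executed via nbsphinx, execution could be gated on an environment variable in conf.py (the SKIP_NOTEBOOKS variable name is hypothetical):

```python
# conf.py sketch, assuming nbsphinx executes the example notebooks
import os

# Hypothetical SKIP_NOTEBOOKS variable, set only in the quick-feedback
# GitHub workflow; Read the Docs would leave it unset and execute notebooks.
nbsphinx_execute = "never" if os.environ.get("SKIP_NOTEBOOKS") else "auto"
```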
Building docs in parallel
Sphinx allows documentation to be built in parallel (e.g., by passing -j auto to sphinx-build). We had previously been doing this for doc builds with GitHub Actions. I did some timings and found that building PlasmaPy's docs in parallel was slightly slower than building them in a single process. I don't know why, but I'm guessing it could be due to one of our Sphinx extensions. If you're considering this for another repository, I suggest doing a test build to make sure that building docs in parallel actually speeds things up for you.
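A quick way to run such a check, sketched below (the source and output paths are illustrative):

```shell
# Compare a fresh serial build against a fresh parallel build
time sphinx-build -b html docs/ docs/_build/html-serial
time sphinx-build -j auto -b html docs/ docs/_build/html-parallel
```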
Caching a doc build for GitHub workflows
Sphinx rebuilds should generally be much quicker than fresh builds. Caching a doc build should speed up doc builds in PRs (astropy/astropy#16161). However, I found that Sphinx rebuilds of PlasmaPy's docs took about as long as fresh builds after a file was modified. I'm not sure why this is, but I suspect it's specific to PlasmaPy (again, maybe a Sphinx extension?). Doing this for other repositories would probably be more successful, though cache invalidation would remain tricky.
Removing tests and documentation
I'm sort of joking and sort of serious. If portions of docs are no longer relevant, removing them would shorten doc build times. If a particular feature has a lot of redundant tests, some of those tests could (cautiously!) be removed.