# Powerful Python Packaging for Scientific Codes
**Henry Schreiner • Princeton University • July 8, 2021 @ PyHEP 2021**

Current best practices for building sharable libraries.

# Links

* Scikit-HEP Developer Guidelines @ https://scikit-hep.org/developer
* ISciNumPy @ https://iscinumpy.gitlab.io
* Scikit-HEP / Cookie @ https://github.com/scikit-hep/cookie

# Intro

## Questions

* How do you ensure your code works for everyone?
* How do you maintain something long term?
* How do you encourage high quality contributions?

Assume you are not the only user of your code, and that you will not be its only contributor.

## Topics
* Using environments (venv/virtualenv, py, conda/mamba)
    * Applications (pipx)
    * Task runners (nox)
* Defining packages (several)
    * Including extensions (pybind11, cibuildwheel)
* Static checking and QA (pre-commit)
* Using CI (GitHub Actions)
* Distributing packages (PyPI, conda-forge)
* Pulling it all together in a cookie!
* Different systems including Scikit-build

# Environments

Don't install globally. You will break things. With the new pip resolver in pip 20.3+, this will probably happen even more often.

* Days since a global install broke something for me: 3

This includes user installs.

```bash
pip install package  # bad
pip install --user package # horrible
```

## Venv/virtualenv

```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install package
```

Everything is installed in the local directory `.venv`. Virtualenv requires installing, but is faster and updates more often.

Best practice: if you just have one per folder, then name it `.venv`.
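
If you are unsure whether a given `python` is the global one or lives in an environment, you can ask the interpreter itself. This is a minimal sketch; the helper name `in_virtual_environment` is made up for illustration, and the check only covers venv/virtualenv (it does not detect conda environments).

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when running inside a venv (or a recent virtualenv).

    Inside a virtual environment, sys.prefix points at the environment,
    while sys.base_prefix still points at the Python it was created from.
    """
    return sys.prefix != sys.base_prefix


if in_virtual_environment():
    print(f"Using the environment at {sys.prefix}")
else:
    print("This is a global interpreter - think twice before installing!")
```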

### Python launcher for Unix

A Python core developer has built a copy of the Python launcher for Windows in Rust for Unix systems, and it includes a fantastic extra feature: it can use `.venv` directly without activation! Less typing _and_ no activation step.

```bash
python -m virtualenv .venv
py -m pip install package
```

## Pipx

If you want to install or use an _application_ (anything you don't need to import), then use `pipx`, which lives next to `pip` now in the PyPA. This will make an isolated virtual environment for you for each application, and will only link in the console/gui scripts! `pipx install app` is safe. You can even do both at the same time:

```bash
pipx run build --sdist
```

This will do both install and run in one step, and will cache the venv for a week.

### Implications

* Pipx is provided on all GHA and Azure images. No Python setup required.
* You have all of PyPI available in a single line if you have pipx!

### Examples

These are the last few things I've run based on my fish history looking for `pipx run`.

* `build`
* `twine`
* `cibuildwheel`
* `monolens`
* `jupyter-book`
* `bumpversion`
* `lastversion`
* `cmake`
* `cookiecutter`

### Other ideas

These have other ways to run normally or can be obtained from brew, but also are in my `pipx run` history.

* `auditwheel`
* `nox`
* `tox`
* `mypy`
* `black`
* `setup-cfg-version`
* `pyupgrade`
* `isort`
* `pytest`

## Conda

You can also create environments in conda or mamba. I recommend always creating an `environment.yml` file, then you can `conda env create` and `conda env update` with it. (You can replace `conda` with `mamba` and get a large speedup for create, and a massive speedup for update.)

```yaml
# environment.yml
name: myenv
channels:
  - conda-forge
dependencies:
  - python ==3.9
  - root >=6.20.0
  - boost-histogram >=1
```

This is also picked up by mybinder.org.

## Task runners

https://scikit-hep.org/developer/tasks

### Nox

I used to avoid task runners, but `nox` is different. Let's look at a little example:

```python
# noxfile.py
import nox


@nox.session
def example(session):
    session.install("pytest")
    session.run("pytest", "--version")
```
```console
pipx run nox -s example
⚠️  nox is already on your PATH and installed at /usr/local/bin/nox.
    Downloading and running anyway.
nox > Running session example
nox > Creating virtual environment (virtualenv) using python in .nox/example
nox > python -m pip install pytest
nox > pytest --version
pytest 6.2.4
nox > Session example was successful.
```


Unlike tox, the classic statically configured solution, nothing is assumed about your package, and you even see the exact commands as it runs, teaching you how to run the commands yourself. It's also possible to do anything in Python in your nox file. Notice it looks very much like pytest.

### Where to Use?
* Encourages new contributors with minimal requirements to run testing and linting
* Can automate challenging rare tasks (docs, bumping versions, etc)
* Can be used in CI
* Supported via `pipx` in the manylinux 2010+ images
* Scripts run via nox should work standalone - use it for environments
* Supports conda (partially) too! :)

You are not likely to use it for tasks _you_ work on daily, but it still helps contributors. Use `nox -l` to see all sessions.

## Packaging

https://scikit-hep.org/developer/packaging

We will focus mostly on setuptools, since it supports extensions. If you don't have extensions, there are more options, though setuptools is not too bad. Flit is likely the best alternative, though Poetry _might_ be a good option, if you are careful.

### pyproject.toml (PEP 517/518, also 621)

**RULE: all packages need to have a pyproject.toml file.**

This _is_ the future of Python packaging. Always include at least this (setuptools):

```toml
[build-system]
requires = [
    "setuptools>=42",
    "wheel"
]
build-backend = "setuptools.build_meta"
```

The packages listed here will be temporarily installed in a venv when running "build" tasks, like making the SDist (source distribution) with `pipx run build --sdist` or when building the wheel (a built distribution that does not run custom code when installing - fast, simple, and can contain binaries).

Most libraries (except flake8 and setuptools) now support configuration in the `pyproject.toml` file, making it the "One file to rule them all" for Python packaging! Well, at least once setuptools starts supporting PEP 621.

### src structure

**RECOMMENDATION: all packages should be in `/src`, especially if they have binary parts.**

Place your code in `src/<package_name>`, not `<package_name>`. Why? Python likes to run things from the current directory. _So you may not be testing your installed package when you run your tests!_ This is especially important if you have compiled extensions.
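
The shadowing problem can be seen with a small experiment. This is only a sketch: it builds two throwaway projects in a temporary directory (the names `example_pkg`, `make_project`, and `importable_from` are made up for illustration) and asks Python's import machinery whether the package can be found from the project root alone, which is what happens when the current directory ends up on the import path.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def make_project(root: Path, layout: str, name: str = "example_pkg") -> Path:
    """Create an empty package in a flat or src layout under root."""
    base = root / "src" if layout == "src" else root
    package = base / name
    package.mkdir(parents=True)
    (package / "__init__.py").write_text("")
    return root


def importable_from(root: Path, name: str) -> bool:
    """Return True if `name` is found when only `root` is searched."""
    return PathFinder.find_spec(name, [str(root)]) is not None


with tempfile.TemporaryDirectory() as tmp:
    flat = make_project(Path(tmp) / "flat", "flat")
    src = make_project(Path(tmp) / "src_layout", "src")
    flat_shadowed = importable_from(flat, "example_pkg")
    src_shadowed = importable_from(src, "example_pkg")

print(f"flat layout importable from project root: {flat_shadowed}")  # True
print(f"src layout importable from project root:  {src_shadowed}")   # False
```

With the flat layout, the source tree can silently win over the installed copy; with the `src` layout, tests can only find the package if it is really installed.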

### setup.cfg configuration

**ALMOST RULE: Static configuration should be in setup.cfg.**

Almost anything static in `setup.py` should be in `setup.cfg`.

#### Exceptions:

* Binary extensions have to be in `setup.py`
* Package extras _can_ be in `setup.py` to have self-dependencies.

#### Benefits:
    
* Easier to convert to PEP 621 pyproject.toml (or run any automated tooling)
* Can be linted and formatted automatically via setup-cfg-fmt
* Can be read by other tools (like cibuildwheel), while `setup.py` cannot be reliably parsed
* Clarifies what's "special" or non-static about your package build
* Keeps you from having to add "helper" files that are imported in `setup.py`
* Empty `setup.py`'s can be deleted completely now
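
To illustrate the "can be read by other tools" point: `setup.cfg` is an INI file, so the standard library can read it without running any project code. This is a minimal sketch with a made-up `setup.cfg` for a hypothetical `example_pkg`; real tools do more validation than this.

```python
import configparser

# A made-up setup.cfg for illustration
SETUP_CFG = """\
[metadata]
name = example_pkg

[options]
python_requires = >=3.6

[options.extras_require]
test =
  pytest >=6.0
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)

# Single-valued options are plain strings
python_requires = parser["options"]["python_requires"]

# Multi-line values come back as one string; split them into requirement lists
extras = {
    name: [line.strip() for line in value.splitlines() if line.strip()]
    for name, value in parser["options.extras_require"].items()
}

print(python_requires)  # >=3.6
print(extras)           # {'test': ['pytest >=6.0']}
```

Nothing comparable is possible for `setup.py` without executing it.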

### Use Extras

You can add extras, which allow optional dependencies to be requested. For example:

```ini
[options.extras_require]
test =
  pytest >=6.0
```

Now you can add `[test]` to the pip install line or requirement listing and `pytest` will be included.

Some suggested extras include `[test]`, `[dev]`, `[docs]`, `[all]`.

### Never forget `python_requires`!

```ini
[options]
python_requires = >=3.6
```

The most important line in your setup.cfg (or setup.py) might be `python_requires`. This allows you to set minimum version(s). Always set this, always have at least one release out with `python_requires` before increasing it, and never set a maximum. When pip considers releases, it will check the `Requires-Python` metadata slot (which this fills), and if it doesn't match, it will look at the next oldest version.

_Never_ be "loose" with this variable. As soon as you might not support a version anymore (such as dropping it from CI), change this variable immediately. Users on old versions of Python will just get old, working versions of your package. Dropping a Python version doesn't mean you are ruining use of your package on that version, you just aren't developing for it anymore.
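
The fallback behaviour described above can be sketched in a few lines. This is a deliberately simplified model, not pip's implementation: it only understands comma-separated `>=X.Y` clauses, and the release table and helper names (`parse_version`, `satisfies`, `pick_release`) are made up for illustration.

```python
from typing import Dict, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '3.6' or '3.6.8' into a tuple padded to three parts."""
    parts = [int(part) for part in text.strip().split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(requires_python: str, python_version: str) -> bool:
    """Check a Python version against '>=X.Y' clauses (the only form handled here)."""
    for clause in requires_python.split(","):
        clause = clause.strip()
        if not clause.startswith(">="):
            raise ValueError(f"This sketch only handles '>=' clauses, got {clause!r}")
        if parse_version(python_version) < parse_version(clause[2:]):
            return False
    return True


def pick_release(releases: Dict[str, str], python_version: str) -> Optional[str]:
    """Return the newest release whose Requires-Python accepts this Python."""
    for version in sorted(releases, key=parse_version, reverse=True):
        if satisfies(releases[version], python_version):
            return version
    return None


# Hypothetical release history: 2.0 dropped Python 3.6
releases = {"1.0": ">=3.6", "1.1": ">=3.6", "2.0": ">=3.7"}

print(pick_release(releases, "3.9.5"))  # 2.0
print(pick_release(releases, "3.6.8"))  # 1.1
print(pick_release(releases, "3.5.2"))  # None
```

Because 2.0 declares `>=3.7`, a Python 3.6 user quietly gets 1.1 instead of a broken install.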

# Static checking and QA

https://scikit-hep.org/developer/style

## pre-commit

All checks can be run through pre-commit. While it's somewhat of a task runner, and overlaps with `nox`, it has a specific goal: things it runs are intended to be quick checks that you should run before every commit. You can even install it as a pre-commit git hook if you want to.

Anything can be run in pre-commit, including checks based on `python`, `docker`, `ruby`, and many more. It is popular enough to get first and second party support for most common checks. This is what the configuration looks like:

```yaml
repos:
- repo: https://github.com/psf/black
  rev: 21.5b2
  hooks:
  - id: black
```

This sets up a black check. _Always_ use an unmoving revision here - use `pre-commit autoupdate` to bring them up to date. Pre-commit caches environments on the rev.

Pre-commit has two run modes. The default "fast" mode checks only the changed files, and respects your staging area:

```bash
pre-commit run
```

The second method is to run on all git tracked files:

```bash
pre-commit run -a
```

If you want it to always run when you type `git commit`, then run `pre-commit install` and it will become a git pre-commit hook.  You can even set this up to always happen when you clone a repo. If you ever need to skip all hooks, add `-n` to the `git commit` command.

### Tips:

* There are fantastic "common" hooks in pre-commit/pre-commit-hooks
* You can trivially turn any repo into a pre-commit hook
* You can tell pre-commit.ci to skip certain checks
* You can make checks only run in a manual stage. Or any git stage.

## pre-commit.ci

There is a CI service for pre-commit: pre-commit.ci. While you could run pre-commit in GitHub Actions, and this is still useful if you have docker checks or want to "require" pre-commit to pass before running tests or building things, there are several benefits to (also) having pre-commit.ci:

* Weekly (or monthly) updates to all pre-commit rev's.
* Changes made by modifying checks are immediately committed directly to PRs.
* Ultra fast with global caching.

## mypy

Here's an example of a more involved check:

```yaml
- repo: https://github.com/pre-commit/mirrors-mypy
  rev: v0.910
  hooks:
  - id: mypy
    files: src  # Can control what the check runs on
    args: []    # The default two args here are not great, we can remove or replace
    additional_dependencies: # Full control over the environment it runs in!
    - numpy==1.20.*
    - uhi
    - types-dataclasses
```

# CI

https://scikit-hep.org/developer/gha_basic

## GitHub Actions

GitHub Actions is the most popular CI system, and very easy and elegant to use due to a beautiful modular design that avoids assumptions.


```yaml
on:
  pull_request:
  push:
    branches:
    - main

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v2
    - uses: pre-commit/action@v2.0.3
```

```yaml
  tests:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.6, 3.9]
    name: Test 🐍 ${{ matrix.python-version }}
```

```yaml
    steps:
    - uses: actions/checkout@v2
    - uses: excitedleigh/setup-nox@v2.0.0  # Sets up all installed Python versions
    - name: Test package
      run: pipx run nox -s test-${{ matrix.python-version }}
```

### Features

- Artifacts can be uploaded and accessed from UI or other jobs
- Actions can be defined in any GitHub repo or locally, in shell, JS, or Docker
- Linux, macOS, Windows all supported, and docker images
- Very well updated

## Dependabot

Can update pinned Python versions or GitHub action tags!

```yaml
# .github/dependabot.yml
version: 2
updates:
  # Maintain dependencies for GitHub Actions
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "daily"
    ignore:
      # Official actions have moving tags like v1
      # that are used, so they don't need updates here
      - dependency-name: "actions/*"
```

# Distributing packages

## build

Building SDists and pure Python wheels should be done by pypa/build. Example:

```console
pipx run build
```

This will build an SDist, then use that to build a wheel. You can also build just the SDist:

```console
pipx run build --sdist
```

This builds via PEP 517, so any project that pip can install is supported, with much the same infrastructure. SDists are sometimes broken, though; use `--wheel` to build a wheel directly from the source tree.

## cibuildwheel

This is a tool for building wheels for all platforms on CI (or locally for Linux, especially in the upcoming 2.0).

* Handles manylinux docker images (shared maintainership)
* Handles repairing on Linux, macOS, and experimentally/optionally Windows
* Handles python.org Python (10.9+) downloads on macOS
* Can build Universal2/Arm64 wheels for Apple Silicon 3.8+
* Handles PyPy (collaboration with PyPy devs)
* Handles special architectures (32-bit, Aarch64, PowerPC, s390x)
* Used by large (scikit-learn, scikit-image, mypy, matplotlib) and small projects alike
* Can test your wheels in a new environment
* Powerful configuration system and selection
* Pins all starting dependencies (selectable)
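
Each wheel built this way records its target in the filename, which is how pip decides whether a wheel fits the current interpreter and platform. Here is a rough sketch of splitting a wheel filename into its parts, following the wheel naming convention (`name-version[-build]-python-abi-platform.whl`). The helper `parse_wheel_filename` and the example filenames are made up for illustration; real tools should use a dedicated library instead.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"Not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:  # The optional build tag sits after the version
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"Unexpected number of components in {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


# Hypothetical filenames: one compiled wheel, one pure Python wheel
binary = parse_wheel_filename("example_pkg-0.1.0-cp39-cp39-manylinux2010_x86_64.whl")
pure = parse_wheel_filename("example_pkg-0.1.0-py3-none-any.whl")

print(binary.python_tag, binary.platform_tag)  # cp39 manylinux2010_x86_64
print(pure.python_tag, pure.platform_tag)      # py3 any
```

A compiled package needs one wheel per Python version and platform, which is exactly the matrix cibuildwheel fills in for you; a pure Python package needs only the single `py3-none-any` wheel.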

### Example
```yaml
  build_wheels:
    name: Wheels on ${{ matrix.os }}
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-20.04, windows-2019, macos-10.15]

    steps:
    - uses: actions/checkout@v1

    - uses: pypa/cibuildwheel@v1.12.0

    - uses: actions/upload-artifact@v2
      with:
        path: wheelhouse/*.whl
```

### Version 2.0 (upcoming)

* Dropped Python < 3.6, using latest manylinux images again, auditwheel 4, etc.
* Optional `pypa/build` support (hits a Windows pip bug in 2.0.0a4)
* Supports Python 3.10 (optional until ABI stable)
* Better PyPy support

```yaml
# Environment variables (1.x or dynamic)
CIBW_SKIP: cp36-*
CIBW_TEST_EXTRAS: test
CIBW_TEST_COMMAND: pytest {project}/tests
CIBW_BUILD_VERBOSITY: 1
CIBW_ARCHS_MACOS: auto universal2
CIBW_TEST_SKIP: "*universal2:arm64"
```

Now you can place config in your pyproject.toml!
```toml
[tool.cibuildwheel]
skip = ["cp36-*"]
test-extras = ["test"]
test-command = "pytest {project}/tests"
build-verbosity = 1

[tool.cibuildwheel.macos]
archs = ["auto", "universal2"]
test-skip = ["*universal2:arm64"]
```

And you don't have to be tied to your CI config; you even can run it locally:

```console
pipx run --spec cibuildwheel==2.0.0a4 cibuildwheel --platform linux
```
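
The `skip` and `test-skip` values above are glob-style patterns matched against build identifiers such as `cp39-manylinux_x86_64`. As a rough mental model (a simplified sketch, not cibuildwheel's actual implementation), the selection works like this; the `select` helper and the short identifier list are made up for illustration.

```python
from fnmatch import fnmatchcase
from typing import List


def select(identifiers: List[str], skip: str) -> List[str]:
    """Drop every identifier matching any of the space-separated skip patterns."""
    patterns = skip.split()
    return [
        identifier
        for identifier in identifiers
        if not any(fnmatchcase(identifier, pattern) for pattern in patterns)
    ]


# A small, hand-picked sample of build identifiers
identifiers = [
    "cp36-manylinux_x86_64",
    "cp39-manylinux_x86_64",
    "cp39-macosx_universal2",
    "cp39-win_amd64",
]

selected = select(identifiers, "cp36-*")
print(selected)

# Test selectors can carry an architecture suffix after a colon
skips_arm_test = fnmatchcase("cp39-macosx_universal2:arm64", "*universal2:arm64")
print(skips_arm_test)  # True
```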

## conda-forge feedstocks

You can create recipes that are submitted to the conda-forge organization on GitHub, then their CI manages the building and updating. See almost any Scikit-HEP package for examples.

# Quick package construction

All the above can be set up in seconds for a new package with `scikit-hep/cookie`!

```console
pipx run cookiecutter gh:scikit-hep/cookie
```

(Live demo)

# Alternate packaging systems

## flit PEP 621

Flit supports PEP 621, albeit secretly at the moment. Take a look at this:

```toml
# pyproject.toml
[build-system]
requires = ["flit_core>=3.2"]
build-backend = "flit_core.buildapi"

[project]
name = 'example_pkg'
version = '0.1.0'
description = 'Something interesting'
readme = 'README.md'
requires-python = '>=3.7'
license = { file = 'LICENSE' }
authors = [
  { name = 'Me Myself', email = 'me@myself.com' },
]
```

Combine this with a `src/example_pkg/__init__.py` or `example_pkg/__init__.py` file, and a README.md and LICENSE file, and you have a working package. You can `pipx run build` it and `pip install` it.

## Poetry

This replaces all other tools with an "all-in-one" solution. You can work on packages without `venv`, `virtualenv`, or `pipenv`, and you don't need `pip` or `build` (though it does support them via PEP 517). You don't need `setuptools`. It can even bump versions (sort of).

It makes some choices that seem to be made largely out of a "we can do everything better" mindset that can be problematic, especially for libraries (it seems to be better aligned to applications). You should never set an upper limit on libraries you don't use heavily - it can create impossible solves, which are much, much harder to deal with than telling users that they have to install an older package for a while. It's also rather slow in supporting macOS 11 and Apple Silicon. (So is pipenv.)

Other build systems:
* trampolim: Very young PEP 621 build system with arbitrary hook support
* whey: Very young PEP 621 build system that's a bit buggy currently
* enscons: The only binary PEP 517 (and not 621) build system, uses SCons

## Scikit-build

This is an "extension" (or hack) that integrates with distutils / setuptools. It has some huge benefits, though:

* Native CMake builder from the makers of CMake
* The project packages `cmake` and `ninja` for Python on PyPI, too
* Significant userbase
* Supports C, C++, Fortran, Cython
* Runs circles around setuptools when it comes to library support

See [pybind/scikit_build_example](https://github.com/pybind/scikit_build_example) (and [pybind/cmake_example](https://github.com/pybind/cmake_example), and [pybind/python_example](https://github.com/pybind/python_example), too)

Problems:
* Some known bugs with MSVC 2019 and Apple Silicon (currently being fixed)
* Development has been slow, but there are several more part-time maintainers now
* Tied to Distutils/setuptools internals
* Configuration is not PEP 517/621 or setup.cfg friendly yet.

Completed:

* New release of `cmake` for every supported platform of cibuildwheel (added Apple Silicon, PowerPC, etc.)

Short-term goals:
* New release somewhat soon with better platform support

Mid-term goals:
* Support newer CMake features natively, like FindPython
* Drop 2.7/3.5 on or before Jan 1, 2022
* Fix caching issues with editable installs (might depend on the recently accepted editable install PEP adoption)
* Drop distutils before Python 3.12 drops it
* Refactor, document, expand, check, better tests
* Add cookiecutter (based on Scikit-HEP/cookie)

Long-term goals (needs funding):
* Support setup.cfg
* Support PEP 517 directly without setuptools
* Support PEP 621 configuration
* Add an extension system, and add support in pybind11
* Look at developing a Poetry plugin