# Distribute Python software (in June 2023)

## 0. Introduction

### 0.1 - Objectives

Build, install, package and distribute Python code

References:

* https://packaging.python.org/tutorials/packaging-projects/
* https://setuptools.readthedocs.io/en/latest/
* https://github.com/pypa/sampleproject

Suppose you have a Python package, or a collection of scripts, that you want to share with others (e.g., beamline users).

> -- Alright, I send you my scripts by mail. Can you try `python mypackage/script.py` ?  
> -- It doesn't work  
> -- Oh, yes, I forgot: you have to first install numpy: `pip install numpy`  
> -- Done, still doesn't work  
> -- Mmh, lemme check. OK, you also have to install these: `pip install scipy matplotlib`  



The objective of this tutorial is to end up with something like

> -- To install and use my package, do `pip install mypackage` and then run the command `process_data`  

### 0.2 - Warning

When it comes to packaging and distribution, **python has too many ways of doing things**.  

Python packaging has a long and chaotic history (see section 7)

Consequently:
  - You **will** find obsolete/incorrect advice all around the web - prefer [packaging.python.org](https://packaging.python.org) whenever possible
  - There **will** still be a new cool packaging tool every few months

### 0.3 - What is a package ?

A package is a file (or set of files) that **encapsulates your software in a given state** (version!), and possibly data/metadata.

It can be delivered as a pre-compiled binary or as source code.

Ideally, a package can be distributed on various operating systems (Windows, macOS, countless Linux distributions) and processor architectures (x86_64, PowerPC, ARM).

A package usually comes with **dependencies**: it needs other packages to work.

Packages are installed/removed by a **package manager**.

### 0.4 - OS package vs Python package

An Operating System (OS) package is  
  - Installed system-wide, i.e., for all users - needs admin privileges
  - Usually accessible by default without issuing any command (e.g., `module load`)
  - Handled by the OS package manager
    - Debian/Ubuntu: apt + dpkg
    - RHEL/CentOS/Rocky: yum/dnf + rpm
    - Windows: MSI

A python package is  
  - Installed within a user directory - does not need admin privileges
  - Usually available after some activation command (e.g., `source myvenv/bin/activate`)
  - Handled by a python package manager: pip, conda


### 0.5 - Package manager

A package manager 
  * Keeps track of installed packages
  * Manages dependencies (and sometimes conflicts)
  * Provides access to a package repository

### 0.6 - Standalone packages

Packages can also be standalone (self-contained, i.e., almost no dependencies):
  - Linux: appimage, snap
  - Python "fat binaries" (might be preferable for GUI on Windows and MacOS)
  

In this tutorial, we will use **setuptools**. Many alternatives exist.

Ensure your environment is up to date:

```bash
pip install --upgrade pip setuptools wheel build
```

## 1. Python package skeleton 

Let's create the directory structure of our python package.  
The directories should be arranged as:

```
project/
    src/
        package/
            __init__.py
            module.py
            subpackage/
                __init__.py
        ...
```

As a series of commands:
```bash
mkdir -p project/src/package/subpackage
echo 'version = "0.0.1"' > project/src/package/__init__.py
touch project/src/package/module.py
touch project/src/package/subpackage/__init__.py
```

NB: `project` and `package` usually have the same name

So that, from the `project/src` directory:

```python
>>> import package
>>> from package import module
>>> import package.subpackage
```

In the `module.py` file, write a function that prints something:

```python
def say_hello():
    print("Hello!")
```

Test that it works, from within the `project/src` directory:

```bash
python -c "from package.module import say_hello; say_hello()"
``` 

---

## 2. Configuring a package with `pyproject.toml` 

We use [setuptools](https://setuptools.pypa.io/en/latest) to build the package. It can be configured in several ways:
  - `pyproject.toml` (preferred, works with other backends like hatch, flit, poetry, pdm)
  - `setup.cfg` (specific to setuptools)
  - `setup.py`

https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html


The project configuration files (`pyproject.toml` or `setup.cfg`) have several purposes:
  - Declare project metadata: name, version, description, author names/emails
  - Configure how the project is built
  - Define the project dependencies (requirements): build requirements follow [PEP 518](https://www.python.org/dev/peps/pep-0518), project metadata follows [PEP 621](https://www.python.org/dev/peps/pep-0621)

### 2.1 - Basic `pyproject.toml` file

Add a `pyproject.toml` file with the following content:

```toml
[build-system]
# Minimum requirements for the build system to execute.
requires = ["setuptools>=61.0", "wheel"]  # PEP 508 syntax
build-backend = "setuptools.build_meta"
```

### 2.2 - Dependencies

Dependencies let your package declare the other packages it requires, so that the installer can fetch them.
There are two kinds of dependencies:

- build dependencies: packages needed to **build** your package
- runtime dependencies: packages needed to **run** your package

For example, your scientific software package probably won't need `numpy` to be built, but will probably need it to run.

In `pyproject.toml`, build dependencies are specified with the `requires` keyword in the `[build-system]` section:

```toml
[build-system]
requires = ["setuptools>=61.0", "wheel"]
```

Runtime dependencies are specified with the `dependencies` key in the `[project]` section:

```toml
[project]
dependencies = [
    "numpy",
    "scipy >= 1.10.1",
]
``` 


**Optional dependencies** can also be specified in the `[project.optional-dependencies]` section.

```toml
[project.optional-dependencies]
gui = ["PyQt5"]
cli = [
  "rich",
  "click",
]
``` 

That way, you can choose to install only the required components:
  - Install the package and its minimal required dependencies: `pip install "mypackage"`
  - Install the package and the dependencies associated with GUI: `pip install "mypackage[gui]"`


Complete the `pyproject.toml` file with various metadata - see also https://packaging.python.org/en/latest/specifications/declaring-project-metadata/

```toml
[project]
name = "package"
version = "0.0.1"
authors = [
  { name="Example Author", email="author@example.com" },
]
description = "A small example package"
readme = "README.md"
requires-python = ">=3.8"
classifiers = [
    "Programming Language :: Python :: 3",
    "License :: OSI Approved :: MIT License",
    "Operating System :: OS Independent",
]
license = {text = "MIT"}
dependencies = [
    "numpy",
    "scipy >= 1.10.1",
    'importlib-metadata; python_version<"3.8"',
]

[project.urls]
"Homepage" = "https://github.com/myself/myproject"
"Bug Tracker" = "https://github.com/myself/myproject/issues"
```

### 2.3 -  Entry points

Entry points are commands made available to the user once the package is installed. They are useful for launching applications in a user-friendly way.

For example, the `pip` command itself is an entry point declared by the `pip` package: typing `pip install ...` in a shell runs a Python function of that package (`install` is merely an argument passed to it).

- https://packaging.python.org/specifications/entry-points/
- https://packaging.python.org/specifications/declaring-project-metadata/#entry-points
- https://setuptools.pypa.io/userguide/entry_point.html


To add an entry point to the `say_hello` function within `module.py`, add this to the `pyproject.toml` file:

```toml
[project.scripts]
say-hello = "package.module:say_hello"

``` 
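
To make the `module:function` syntax concrete, here is a minimal sketch of how such a specification can be resolved to a callable. It is a simplification of what the generated launcher does, and the helper name `resolve_entry_point` is hypothetical. It is demonstrated on a standard-library function so that it runs without installing anything.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a "module.path:function" specification to the object it names."""
    module_name, _, attribute_path = spec.partition(":")
    target = importlib.import_module(module_name)
    # The part after the colon may itself be dotted (e.g. "Class.method")
    for name in filter(None, attribute_path.split(".")):
        target = getattr(target, name)
    return target


# With the package installed, the spec would be "package.module:say_hello"
basename = resolve_entry_point("os.path:basename")
print(basename("/tmp/data/scan_0001.h5"))  # prints: scan_0001.h5
```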

### 2.4 - Dynamic metadata

Some metadata can be automatically generated at build time using [dynamic metadata](https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#dynamic-metadata).

For example, to avoid defining the version number at multiple places:

```toml
# ...
[project]
name = "package"
dynamic = ["version", "readme"]
# ...
[tool.setuptools.dynamic]
version = {attr = "package.version"} # defined in src/package/__init__.py
readme = {file = ["README.rst"]}
``` 
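
Once the package is installed, the version stored in its metadata can also be read back at runtime, which avoids hard-coding it in yet another place. A minimal sketch, assuming Python 3.8+ (`importlib.metadata` is in the standard library); the helper name `installed_version` and the `"unknown"` default are hypothetical choices:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name, default="unknown"):
    """Return the version recorded in the metadata of an installed distribution."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        # Not installed (e.g. running from a plain source checkout)
        return default


print(installed_version("package"))
```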



### 2.5 - Misc.

#### Version numbering scheme

Stay compatible with Python's "Version Identification and Dependency Specification" [PEP 440](https://www.python.org/dev/peps/pep-0440/#version-scheme):

Some common versioning:
- `major.minor[.micro][{a|b|rc}N]`: 1.0, 1.1.1b1
- `year.month`: 2021.10


#### Additional information files

Good practice (almost mandatory for distribution):

- `LICENSE`: Contract for using the package
- `README`: Abstract

---

## 3. Alternative: configure the package with `setup.cfg` or `setup.py`

`setup.cfg` describes the project metadata and build configuration for setuptools. See: https://setuptools.pypa.io/en/latest/userguide/declarative_config.html

The equivalent of the previous `pyproject.toml` file is:

```cfg
[metadata]
name = package
# dynamic metadata: reads `version` from src/package/__init__.py
version = attr: package.version
author = me
author_email = my.email@esrf.fr
description = My package description
long_description = file: README.rst, CHANGELOG.rst, LICENSE.rst
license = MIT
classifiers =
    Programming Language :: Python :: 3

[options]
zip_safe = False
include_package_data = True
# automatic package discovery, looking inside src/ (see [options.packages.find])
packages = find:
package_dir =
    =src
python_requires = >=3.8
# runtime dependencies
install_requires =
    numpy
    scipy >= 1.10.1
    importlib-metadata; python_version<"3.8"
# build dependencies (deprecated here: prefer `requires` in the [build-system] section of pyproject.toml)
setup_requires =
    build

[options.packages.find]
where = src
```
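
Since `setup.cfg` uses the INI syntax, its content can be inspected with the standard-library `configparser` module. The sketch below only shows how a multi-line option such as `install_requires` maps to a list; it is not how setuptools itself processes the file, and the helper name `read_list_option` is hypothetical.

```python
import configparser

# Hypothetical excerpt of a setup.cfg
SETUP_CFG = """
[metadata]
name = package

[options]
python_requires = >=3.8
install_requires =
    numpy
    scipy >= 1.10.1
"""


def read_list_option(config_text, section, option):
    """Return a multi-line option of an INI file as a list of stripped lines."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    raw_value = parser.get(section, option)
    return [line.strip() for line in raw_value.splitlines() if line.strip()]


print(read_list_option(SETUP_CFG, "options", "install_requires"))
# prints: ['numpy', 'scipy >= 1.10.1']
```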

For a single `module.py` file project:

```cfg
[metadata]
name = package
version = 0.0.1

[options]
py_modules = module
```

#### Optional requirements

`setup.cfg`'s `[options.extras_require]` section lets you define optional dependencies:

```cfg
[options.extras_require]
dev =
    pytest
    sphinx
```

It is then possible to install those extra dependencies with:

```
pip install "package[dev]"
```

#### Alternative: `setup.py` 

It used to be the central place for building and packaging Python projects (and it still is for many projects), but it is now optional.

Useful for backward compatibility:
```python
# coding: utf-8
import setuptools

if __name__ == "__main__":
    setuptools.setup()
```

It is also the place for defining specific commands and C extensions.

## 4. Building and installing the package

At this stage, we have a project skeleton and a `pyproject.toml` file (or `setup.cfg` or `setup.py`).

Now it's time to build/install it, as a first step toward distribution to the outside world.

#### Build packages

With the [build](https://pypi.org/project/build/) package (`pip install build`):
```
python -m build
```
generates a source tarball (`*.tar.gz`) and a wheel (`*.whl`) file in `dist/`:
- The source tarball needs to be built on the target machine
- The wheel file (`.whl`) is a zip file containing an already built Python package

#### Install from source

- Install your package:
  ```
  pip install .
  ```
- Install in editable mode (a.k.a. development mode):
  ```
  pip install -e .
  ```
- Previous way (deprecated):
  ```
  python setup.py install
  ```

Check that your package is installed: execute the entry point!

In a shell:

```bash
say-hello
``` 

## 5. Distribution

Distributing means publishing your package in a standard format (wheel or source distribution) to a public place (PyPI or a conda channel).

We will use the Python Package Index [PyPI](https://pypi.org).

For this, you need to create an account on pypi.org.

![](PyPI.png)


Use the `twine` package to publish to [pypi.org](https://pypi.org):

- First, create an account on [pypi.org](https://pypi.org/account/register/) (or the test instance: [test.pypi.org](https://test.pypi.org/account/register/)) 
- Generate the packages you want to provide (check the version number, and tag it in git):

  `python -m build  # or python setup.py sdist bdist_wheel`
- Upload the project with `twine` (`pip install twine`):

  `twine upload dist/*`

  or for [test.pypi.org](https://test.pypi.org): `twine upload --repository-url https://test.pypi.org/legacy/ dist/*`
- Install from [pypi.org](https://pypi.org/): `pip install package`

  or for [test.pypi.org](https://test.pypi.org): `pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple package`

**Important note**: uploading a package version to PyPI is irreversible.
  - A release can be deleted, but its version number cannot be reused: you cannot re-upload files with the same version number.

Ensure your package works before uploading it to PyPI!

To test real-world deployment, use test.pypi.org

---

## 6. Advanced features

### `requirements.txt` vs `setup.cfg`

It may look contradictory to define dependencies in different places, but it is not: see https://packaging.python.org/discussions/install-requires-vs-requirements/.

* `setup.cfg` provides abstract minimal dependency requirements (e.g., `numpy`)
* `requirements.txt` provides concrete implementation
  (with hard coded versions and URL to download wheels from).
  This provides a way to specify an environment: `numpy==1.12.0`

  Usage: `pip install -r requirements.txt`
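
To illustrate what "concrete" requirements look like, the sketch below lists the distributions visible to the current interpreter in the pinned `name==version` form, in the spirit of `pip freeze`. It is a simplification (the real command handles editable installs, URLs, etc.), and the function name `freeze` is chosen here for illustration only.

```python
from importlib.metadata import distributions


def freeze():
    """Return 'name==version' lines for the distributions visible to this interpreter."""
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        version = dist.metadata["Version"]
        if name and version:  # skip distributions with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


print("\n".join(freeze()))
```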

### `MANIFEST.in`: Source package content

A default set of files (`*.py`, `LICENSE`, `README.md`) is included in source packages produced by:
```
python -m build  # or python setup.py sdist
```

It is possible to include additional files by declaring them in a `MANIFEST.in` file at the project's top-level:

```
include CONTRIBUTE.txt
recursive-include package *.dat
graft example
```

See [documentation](https://packaging.python.org/guides/using-manifest-in/)

### Additional resources

Install non-Python files within the package (e.g., reference data).

#### Automatic

Add needed files to `MANIFEST.in` and add the following to `setup.cfg`:
```cfg
[options]
include_package_data = True
```

#### Manual

```cfg
[options]
package_data =
    * = *.txt  # * Applies to all packages
    package.subpackage = *.dat 
```

See [Data Files Support](https://setuptools.pypa.io/en/latest/userguide/datafiles.html) documentation.
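
Once data files are installed with the package, they should be located through the package itself rather than through hard-coded paths. A minimal sketch, assuming Python 3.9+ (for `importlib.resources.files`); the helper name `read_package_text` and the file name in the comment are hypothetical:

```python
from importlib.resources import files  # available since Python 3.9


def read_package_text(package_name, resource_name):
    """Read a text data file shipped inside an importable package."""
    return files(package_name).joinpath(resource_name).read_text(encoding="utf-8")


# Hypothetical usage, once `package` is installed with its data files:
#     content = read_package_text("package.subpackage", "reference.dat")
```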

### Alternative package folder

Store the package source in a dedicated folder, for example the `src/` layout used in this tutorial:
```
project/
  ...
  src/
    package/
      __init__.py
      ...
```

`setup.cfg`:
```cfg
[options]
package_dir=
    =src
```

### `setup.cfg` vs. `setup.py`

`setup.cfg`:
```cfg
[metadata]
name = package
version = 0.0.1

[options]
packages = find:
```
is equivalent to `setup.py`:
```python
import setuptools

setuptools.setup(
    name="package",
    version="0.0.1",
    packages=setuptools.find_packages(),
)
```

Of these two, `setup.cfg` is the recommended one, but `setup.py` still works; `pyproject.toml` remains the preferred option overall (see section 2).

### Compiled extensions

It is possible to compile modules written in C, C++, Cython as part of the build process.

```python
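from setuptools import Extension, setup

setup(
    ext_modules=[
        # C source, compiled with the C compiler found at build time
        Extension("package.cmodule", ["package/cmodule.c"]),
        # Cython source: requires Cython as a build dependency
        Extension("package.cythonmodule", ["package/cythonmodule.pyx"]),
    ],
)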
```

This adds the requirement of having the proper compiler available and puts many more constraints on packaging and distribution (e.g., one wheel per operating system and per architecture, built within a specific environment).

### A word on automated deployment

It is possible to automate the release process with "continuous integration" services:

- Ease version number handling and git tag: [bump2version](https://pypi.org/project/bump2version/)
- Ease generation of compiled wheels: [cibuildwheel](https://cibuildwheel.readthedocs.io/en/stable/)
- Possible to set continuous integration service to publish release on pypi.org (using a token from pypi): [Example here](https://github.com/silx-kit/h5grove/blob/f44bc762ebcf02e1db2e51e442552c469f95f586/.github/workflows/release.yml#L39-L46)

---

## 7. Some notes on python packaging

Python packaging has come a long way.

![](xkcd_standards.png)

setuptools - distutils - numpy.distutils - hatch - flit - pdm - poetry - ... conda


Before: 
  - A handful of packaging tools (setuptools/easy_install and (numpy.)distutils)
  - The setup procedure was **specified programmatically** (via `setup.py`)
  - Complicated/cumbersome calls to the setuptools API => copy-paste from other projects or Stack Overflow
  - Messy: if one part of the packaging tool changes, then everything changes

Now:
  - (Too) many packaging tools
  - The setup procedure is **specified declaratively** (via a configuration file).
    - """standard""": `pyproject.toml` (PEP 518) + `MANIFEST.in`
    - Dedicated: setup.cfg (setuptools), meson.build (meson-python), ...
  - Simpler - "make the easy things easy"
  - Hopefully conceptually cleaner (PEP 518 and 517). The user interface (frontend) and the actual build backend are decoupled, so either can be changed with limited impact


The **build frontend** is a tool exposed to the user (via a command-line interface) that drives the build (example: `pip wheel .`).  
It calls the actual building code, "the build backend", under the hood.  
Until "recently", frontend and backend were usually part of the same codebase.  
PEP 517 specifies how the two can be decoupled.

The **build backend** is the code that does the actual package build (example: `setuptools.build_meta`).

### Wheels: [PEP427](https://www.python.org/dev/peps/pep-0427/)


Wheels are the current standard of Python distribution through [pypi.org](https://pypi.org/).

#### Advantages

1. Avoids arbitrary code execution for installation (no `setup.py` executed).
1. Does not require a compiler on the user side for binary extensions.
1. Faster installation, especially for binary extensions.
1. Creates `*.pyc` files at installation, matching the Python interpreter used.
1. More consistent installs across platforms and machines.

Wheels provide binary packages and a decent installer (`pip`).
It is a very convenient way to install up-to-date versions.
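
The name of a wheel file tells which interpreters and platforms it targets, following the PEP 427 convention `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits such a name into its fields; the helper name `parse_wheel_filename` is hypothetical and the second file name is only illustrative.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into the fields defined by PEP 427."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {filename!r}")
    # The optional build tag, when present, sits between version and python tag
    build = parts[2] if len(parts) == 6 else None
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# A pure-Python wheel: any Python 3, no ABI constraint, any platform
print(parse_wheel_filename("package-0.0.1-py3-none-any.whl"))
# A wheel with compiled code is tied to an interpreter, an ABI and a platform
print(parse_wheel_filename("example-1.2.3-cp311-cp311-manylinux_2_17_x86_64.whl"))
```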

#### Pitfalls

- For compiled extensions, a specific compilation environment is required (e.g., the [manylinux](https://github.com/pypa/manylinux) Docker images under Linux). See the [Building binary extensions doc](https://packaging.python.org/guides/packaging-binary-extensions/#building-binary-extensions).
- External shared libraries need to be bundled in the wheel.
  You can use utility software to check which libraries your package is linked against:

  - macOS: [delocate](https://github.com/matthew-brett/delocate)
  - Windows: [depends](http://www.dependencywalker.com/)
  - Linux: ldd, [auditwheel](https://github.com/pypa/auditwheel)

### Debian/Ubuntu packages

Useful tools to create Debian packages from Python packages:

- [stdeb](https://pypi.python.org/pypi/stdeb/): Takes a source Python project as input.
- [wheel2deb](https://pypi.org/project/wheel2deb/): Takes wheels as input.

You might need to edit the generated Debian packaging configuration to change dependencies.

### Fat binaries

Standalone self-contained applications or installers.

- Include Python interpreter and all dependencies.
- Fits Windows and macOS application distribution, as unlike Linux they lack a dependency management tool.

Beware:

- Fat binaries are fat (~150 MB for projects involving GUIs).
- You are redistributing many other people's work, so take care about licenses.

#### Freezing

There are a number of tools to 'freeze' a Python application for distribution from an installation on a computer.

Principle:

- Analyze a script to find its dependencies (i.e., its imports).
- Collect all dependencies and python interpreter in a directory.
- Add a launcher and optionally bundle everything in a single file or installer.
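
The first step (finding imports) can be sketched with the standard-library `ast` module. This is a toy version of the static analysis, not the algorithm of any particular freezing tool, and the function name `find_imported_modules` is hypothetical. It also shows why dynamic imports are invisible to such an analysis.

```python
import ast


def find_imported_modules(source):
    """Return the top-level names of the modules imported by Python source code."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level > 0 means a relative import, i.e. not an external dependency
            modules.add(node.module.split(".")[0])
    return modules


SCRIPT = """
import numpy as np
from scipy import signal
import os.path
plugin = __import__("hidden_plugin")  # dynamic import: invisible to this analysis
"""
print(sorted(find_imported_modules(SCRIPT)))  # prints: ['numpy', 'os', 'scipy']
```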

#### Freezing issues

- Those tools rely on rules specific to each package (`matplotlib`, `numpy`), which need to be updated when packages evolve.
- Analysis can miss some hidden imports.
- All runtime dependencies must be included (including external libraries wrapped by Python packages).
- Data files cannot be guessed and need to be explicitly added.

You must make sure it is stand-alone and includes everything required.
Test the result on a different computer than the one used for packaging.

#### Tools

[PyInstaller](http://www.pyinstaller.org/): Cross-platform

But also
[cx_Freeze](http://cx-freeze.readthedocs.org/) (cross-platform),
[py2app](https://pythonhosted.org/py2app/) (macOS),
[pynsist](https://pypi.python.org/pypi/pynsist) (Windows),
[py2exe](https://pypi.python.org/pypi/py2exe/) (Windows),
[pex](https://github.com/pantsbuild/pex) (Linux, macOS)

On Windows, you can create an installer with a tool such as [NSIS](http://nsis.sourceforge.net/).