# Python Package: Distribution

To illustrate the process we continue using the `company_package` that we developed in 
[part 4](../part4/notebook.html#Working-example).


In principle, the code of your package should be developed with `git` and uploaded on GitHub 
(or GitLab, or Bitbucket, etc.).

If you want to keep your code private you can of course do so, while developing it and sharing it with your chosen colleagues by inviting them to your private repository.


Here, we will assume you are ready to share your code with the world and 
that you want it to be usable on macOS, Linux, and Windows machines. 

## Source and wheels


When we used 

```bash
pip install -e .
```
inside our `company_package` folder, we created a sub-folder called `company.egg-info`, which contains metadata about the package.

<div class="exercise-box">
**Exercise:** Look inside the `company.egg-info` folder and check out the contents of the files there. 
</div>

More importantly, `pip` also automatically created a new folder ending in `.dist-info` inside the `site-packages` folder of our environment, which contains the same sort of information.

This folder will typically be at 

```bash
<venvdir>/<name-of-env>/lib/python3.X/site-packages/company-<version>.dist-info
```

<div class="exercise-box">
**Exercise:** Look inside the `dist-info` folder of your site-packages folder and check out the contents of the files there. 
</div>

One of the files is `direct_url.json`. In our case it shows:

```json
{"dir_info": {"editable": true}, "url": "file:///path/to/company_package"}
```

which means we have installed the package in editable mode and the location of the **source code** is given by the `url` field.
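The same information can be read programmatically. The following is a minimal sketch using the standard-library `importlib.metadata` module; the helper names `summarize_direct_url` and `installed_direct_url` are hypothetical, and the sample JSON is the one shown above.

```python
import json
from importlib import metadata


def summarize_direct_url(raw_json: str) -> dict:
    """Extract the install location and the editable flag from direct_url.json text."""
    data = json.loads(raw_json)
    return {
        "url": data.get("url"),
        # "dir_info" is only present for installs from a local directory
        "editable": data.get("dir_info", {}).get("editable", False),
    }


def installed_direct_url(dist_name: str):
    """Return the summary for an installed distribution, or None if unavailable."""
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None  # the distribution is not installed in this environment
    raw = dist.read_text("direct_url.json")  # None when the file is absent
    return summarize_direct_url(raw) if raw else None


example = '{"dir_info": {"editable": true}, "url": "file:///path/to/company_package"}'
print(summarize_direct_url(example))
```

In an environment where the package was installed with `pip install -e .`, calling `installed_direct_url("company")` should return the same kind of dictionary; for a package installed from an index, it will typically return `None`.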


However, we have not **built** the package yet, i.e., we have not created a binary distribution of the code.

Let's do it. In a terminal, we run:

```bash
python -m build
```

from inside the package folder. The command may fail with an error like:

```
No module named build.__main__; 'build' is a package and cannot be directly executed
```

If so, you simply need to downgrade the `build` package by running the following command:

```bash
pip install 'build<0.10.0'
```

At the end of the process, a `dist` folder should be created and we should see something like:

```
Successfully built company-0.0.0b1.dev6+gfa37a84.d20241029.tar.gz and company-0.0.0b1.dev6+gfa37a84.d20241029-py3-none-any.whl
```

These two files, stored in the `dist` folder, are the source distribution and the wheel distribution of the package. 

The wheel distribution is a binary distribution of the package. (In fact, for a pure Python package, it amounts to a zip archive of the package files and their metadata. For a package with compiled extensions, it also contains the compiled files.)

Installing from the wheel can be much faster than from the source. To do so, we can run:

```bash
pip install <package-name>-<version>-<py-version>-<platform>.whl
```

(Note, however, that a wheel cannot be installed in editable mode.)

To see what's inside the wheel, we can extract it using:

```bash
unzip <package-name>-<version>-<py-version>-<platform>.whl -d <where-to-extract>
```
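Since a wheel is a zip archive, it can also be inspected from Python with the standard-library `zipfile` module. This is a minimal sketch; the function names `list_wheel_contents` and `extract_wheel` are hypothetical helpers, not part of any packaging tool.

```python
import zipfile
from pathlib import Path


def list_wheel_contents(wheel_path) -> list[str]:
    """Return the sorted names of the files stored in a wheel archive."""
    with zipfile.ZipFile(wheel_path) as archive:
        return sorted(archive.namelist())


def extract_wheel(wheel_path, destination) -> Path:
    """Extract a wheel into `destination` (created if needed) and return that path."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(wheel_path) as archive:
        archive.extractall(destination)
    return destination
```

For example, `list_wheel_contents("dist/<package-name>-<version>-<py-version>-<platform>.whl")` should list the package modules alongside the files of the `.dist-info` folder.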

<div class="exercise-box">
**Exercise:** Create, install and extract the wheel distribution of the `company` package and inspect its content.
</div>

## Docker Images and Containers

Notice that the name of our wheel file ends with `py3-none-any`. These three tags mean that the wheel is compatible with any Python 3 version (`py3`), 
does not depend on a specific Python ABI (`none`), and will work on any platform (macOS, Linux, Windows, etc.) and any architecture (x86, arm, etc.) (`any`).

This is generally the case for pure Python packages.
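To make the tag structure concrete, here is a small sketch that splits a wheel file name into its components. It assumes the standard naming scheme `name-version[-build]-python-abi-platform.whl`; the function `split_wheel_filename` and the `WheelTags` type are hypothetical names introduced for illustration. For real projects, `packaging.utils.parse_wheel_filename` from the `packaging` library is likely a more robust choice.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    name: str
    version: str
    build: Optional[str]  # optional build tag, rarely used
    python_tag: str
    abi_tag: str
    platform_tag: str


def split_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return WheelTags(name, version, build, python_tag, abi_tag, platform_tag)


tags = split_wheel_filename("company-0.0.0b1.dev6+gfa37a84.d20241029-py3-none-any.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```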

For more complex packages that involve compiled extensions, we will see how to build wheels for multiple platforms and architectures using 
[**cibuildwheel**](https://cibuildwheel.readthedocs.io/en/stable/).

[**cibuildwheel**](https://cibuildwheel.readthedocs.io/en/stable/) is a tool that relies on [**Docker**](https://en.wikipedia.org/wiki/Docker_(software)).

With [**Docker**](https://en.wikipedia.org/wiki/Docker_(software)) we can test how our package installs and behaves on different platforms.

We will cover [**cibuildwheel**](https://cibuildwheel.readthedocs.io/en/stable/) in more detail later in the course and focus on Docker for now.

Docker can be used from the command line, but also through a graphical interface known as [Docker Desktop](https://www.docker.com/products/docker-desktop/), which you are encouraged to install.

To put things simply, with Docker we create a sort of virtual Linux machine on 
our machine. This virtual machine has its own operating system and can be seen as completely isolated from the rest of our local machine.

The key step in setting up Docker is to create a so-called **Dockerfile**: a script that describes how to set up the environment for the package, including its dependencies.

For our `company_package`, a valid Dockerfile could be: 

```dockerfile
# Use an official Python image as the base image
FROM python:3.12-slim

# Install Git (setuptools_scm derives the package version from the Git history)
RUN apt-get update && apt-get install -y git


# Set the working directory in the container
WORKDIR /app

# Copy the project files to the working directory
COPY . /app

# Install required dependencies for building the package
RUN pip install --upgrade pip setuptools wheel setuptools_scm build

# Install the package together with the runtime dependencies listed in pyproject.toml
RUN pip install .

# Build the package
RUN python -m build
```

(See [here](https://github.com/borisbolliet/company_package/blob/main/Dockerfile).)


The next step is to **build the Docker image**.

First, we check that Docker is installed and active on our machine. To do so, we can run:

```bash
docker info
```

If Docker is not active, the easiest way to start it is to open Docker Desktop. The command above should then work (and print information about the Docker installation on your machine).
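The same check can be scripted, for instance at the start of a test script. The sketch below assumes that `docker info` exits with a non-zero status when the Docker daemon cannot be reached; the function name `docker_is_active` is hypothetical.

```python
import shutil
import subprocess


def docker_is_active(timeout_seconds: float = 30.0) -> bool:
    """Return True if the docker CLI is installed and `docker info` succeeds."""
    if shutil.which("docker") is None:
        return False  # the docker executable is not on the PATH
    try:
        result = subprocess.run(
            ["docker", "info"],
            capture_output=True,  # keep the (long) report out of the terminal
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


print("Docker is active:", docker_is_active())
```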


<div class="exercise-box">
**Exercise:** Install Docker on your machine and try the `docker info` command.
</div>


With Docker active, we **build** the Docker image by running:

```bash
docker build -t <name-of-image> .
```

In Docker Desktop we should be able to see the image being built.

To list the images on our machine we can run:

```bash
docker images
```

Finally, we can **run** the Docker image in a container by running:

```bash
docker run -it <name-of-image>
```

For our `company_package`, this creates a container that has its own Python environment, in which we can test the package in isolation. The `-it` option combines `-i` (interactive: keep the standard input open) and `-t` (allocate a terminal). Since the default command of the official Python base image is `python3`, the command opens a Python shell in the container (and the container will stop when you exit the Python shell). The output looks like:

```bash
docker run -it company-image
Python 3.11.10 (main, Oct 19 2024, 03:39:30) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import company as cp
Company package version: 0.0.0b1.dev7+g74c5191.d20241029
>>> 
```

Docker therefore also provides a very robust way to distribute your package. Indeed, by providing a Dockerfile, you allow other users to readily test and use your package on their own machines by running:

```bash
docker build -t <name-of-image> .
docker run -it <name-of-image>
```

With this method, users do not need to worry at all about the dependencies of your package, the Python version, etc. Everything is specified in the Dockerfile.
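The build and run steps can also be scripted for a quick, non-interactive check. The sketch below only assembles the command lines; the function names are hypothetical, and `company-image` is the image name used in the example output above. It relies on two standard Docker behaviours: `--rm` removes the container once it exits, and arguments placed after the image name replace the image's default command.

```python
import shlex


def docker_build_command(image: str, context: str = ".") -> list[str]:
    """Command that builds `image` from the Dockerfile found in `context`."""
    return ["docker", "build", "-t", image, context]


def docker_smoke_test_command(image: str, module: str = "company") -> list[str]:
    """Command that imports `module` in a throwaway container instead of opening a shell."""
    return ["docker", "run", "--rm", image, "python", "-c", f"import {module}"]


for command in (
    docker_build_command("company-image"),
    docker_smoke_test_command("company-image"),
):
    # shlex.join renders the list as a copy-pastable shell command
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` on a machine where Docker is active; a non-zero exit status then signals that the build or the import failed.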

This covers the essential aspects of Docker and hopefully conveys the idea that it is a very powerful tool for software development. It is worth noting that the Docker and Dev Containers (formerly Remote - Containers) extensions of VSCode allow you to develop and debug inside the container.

<div class="exercise-box">
**Exercise:** Use the VSCode extensions to set up and run a container based on the `company_package` Docker image.
</div>

