(03:How-to-package-a-Python)=
# How to package a Python
<hr>

In this chapter we will develop an entire example Python package from beginning-to-end to introduce and demonstrate the key steps involved in developing a package. Later chapters will explore each of these steps in further detail. This chapter forms the foundation of this book and aims to be a practical resource for you to refer to when creating Python packages in the future.

The example package we are going to create in this chapter will help us calculate word counts from a plain-text file. We'll be calling it `pycounts` and it will be useful for parsing and understanding word usage in any text, be it novels, research papers, news articles, log files, and more.

## Counting words in a text file

(03:Developing-our-code)=
### Developing our code

Before even thinking about making a package, we'll first develop the code we want to package up. The `pycounts` package we are going to create will help us calculate word counts from a plain-text file. Python provides a very useful `Counter` class (which can be imported from the `collections` module) that can be used to calculate counts of a collection of elements (like a list of words) and store them in a dictionary. We can demonstrate the functionality of `Counter` by first opening up an interactive Python interpreter:

```{prompt} bash \$ auto
$ python
```

And importing `Counter` from the `collections` module:

```{prompt} python >>> auto
>>> from collections import Counter
```

We can define a sample list of words and create a `Counter` object by passing that list of words as an input:

```{prompt} python >>> auto
>>> words = ["a", "happy", "hello", "a", "world", "happy"]
>>> counts = Counter(words)
>>> counts
```

```console
Counter({'a': 2, 'happy': 2, 'hello': 1, 'world': 1})
```

Note how the `Counter` object we created automatically calculated the count of each unique word in our list and returned the result as a dictionary of word:count pairs! So, to use `Counter` to count the words in a text file we'll need to load the file, split it up into a list of words, and then create a `Counter` object from that list of words.
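Because a `Counter` behaves like a dictionary, individual counts can be looked up by key. The short sketch below reuses the sample list from above and also shows that looking up a word that never appeared gives a count of zero rather than raising an error:

```python
from collections import Counter

words = ["a", "happy", "hello", "a", "world", "happy"]
counts = Counter(words)

# Look up counts by key, just like a dictionary
a_count = counts["a"]
hello_count = counts["hello"]

# Unlike a plain dictionary, a missing key gives 0 instead of a KeyError
missing_count = counts["goodbye"]

print(a_count, hello_count, missing_count)  # 2 1 0
```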

We need a text file to help us build this workflow. "[The Zen of Python](https://www.python.org/dev/peps/pep-0020/)" is a list of 19 aphorisms about the Python programming language which can be viewed by executing `import this` in Python:

```{prompt} python >>> auto
>>> import this
```

```console
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
...
```

Let's export "The Zen of Python" as a text file. You can do this by manually copying the output of `import this` into a *.txt* file using an editor of your choice and saving it as *`zen.txt`*, or you can do it from the command line by first exiting the Python interpreter:

```{prompt} python >>> auto
>>> exit()
```

Then running the following command:

```{prompt} bash \$ auto
$ python -c "import this" > zen.txt
```

```{tip}
In the command above, the `-c` option allows you to pass a string for Python to execute, and the `>` redirects the output of the preceding command to a file. This command will create the file in the current directory.
```
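If you would rather stay in Python, roughly the same result can be obtained with the standard-library `subprocess` module. This is only a sketch of one possible alternative, not a step the rest of the chapter relies on; it runs `import this` in a fresh Python process and writes the captured output to *`zen.txt`* in the current directory:

```python
import subprocess
import sys
from pathlib import Path

# Run "import this" in a fresh Python process and capture what it prints
result = subprocess.run(
    [sys.executable, "-c", "import this"],
    capture_output=True,
    text=True,
    check=True,
)

# Write the captured output to zen.txt in the current directory
zen_path = Path("zen.txt")
zen_path.write_text(result.stdout)
```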

Now that we have a text file to work with, we can go back to developing our word counting functionality. To open our *`zen.txt`* file in Python, we can use the `open()` function. The code below, run in a Python interpreter, will open the file for us:

```{prompt} python >>> auto
>>> with open("zen.txt") as file:
        text = file.read()
```

Let's see what that looks like:

```{prompt} python >>> auto
>>> text
```

```console
"The Zen of Python, by Tim Peters\n\nBeautiful is better
than ugly.\nExplicit is better than implicit.\nSimple is 
better than complex.\nComplex is better than complicated
..."
```

The `text` variable we have created is a Python string and each `\n` symbol indicates a newline in the string. Before we split the above text into individual words for counting with `Counter`, we should lowercase all the letters and remove punctuation so that the same word occurring with different capitalization or punctuation isn't treated as several different words. For example, we want "Better", "better", and "better!" to result in three counts of the word "better".

To lowercase all letters in a Python string, we can use the `.lower()` method:

```{prompt} python >>> auto
>>> text = text.lower()
```

To remove punctuation, we can iterate over a collection of punctuation marks and replace them with nothing using the `.replace()` method. Python provides a collection of punctuation marks in the `string` module:

```{prompt} python >>> auto
>>> from string import punctuation
>>> punctuation
```

```console
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
```

We can use a `for` loop to remove each punctuation mark from `text` by replacing it with an empty string (`""`):

```{prompt} python >>> auto
>>> for p in punctuation:
        text = text.replace(p, "")
```
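To see why both steps matter, here is a small self-contained sketch using the three example words mentioned earlier ("Better", "better", and "better!"). It counts them once without any cleaning and once after lowercasing and removing punctuation:

```python
from collections import Counter
from string import punctuation

raw_words = ["Better", "better", "better!"]

# Without any cleaning, each variant is counted as a different word
raw_counts = Counter(raw_words)

# Lowercase each word and strip punctuation before counting
cleaned_words = []
for word in raw_words:
    word = word.lower()
    for p in punctuation:
        word = word.replace(p, "")
    cleaned_words.append(word)
cleaned_counts = Counter(cleaned_words)

print(len(raw_counts))   # 3 distinct "words"
print(cleaned_counts)    # Counter({'better': 3})
```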

With punctuation removed and the letters in `text` all lowercase, we can now split it up into individual words using the `.split()` method, which by default will split a string into a list of strings using spaces, newlines (`\n`) and tabs (`\t`):

```{prompt} python >>> auto
>>> words = text.split()
>>> words
```

```console
['the', 'zen', 'of', 'python', 'by', 'tim', 'peters', 
'beautiful', 'is', 'better', 'than', 'ugly', ...]
```
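The default behavior of `.split()` can be checked on a short made-up string that mixes a double space, a newline, and a tab. Any run of whitespace is treated as a single separator, so no empty strings end up in the result:

```python
sample = "beautiful is  better\nthan\tugly"

# With no arguments, split() breaks on any run of whitespace
tokens = sample.split()
print(tokens)  # ['beautiful', 'is', 'better', 'than', 'ugly']
```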

We've now managed to load, pre-process, and split our *`zen.txt`* file up into individual words and can now determine the word counts using a `Counter` object:

```{prompt} python >>> auto
>>> from collections import Counter
>>> counts = Counter(words)
>>> counts
```

```console
Counter({'is': 10, 'better': 8, 'than': 8, 'the': 6, 
'to': 5, 'of': 3, 'although': 3, 'never': 3, ... })
```

### Turning our code into functions

In **{numref}`03:Developing-our-code`** we developed a workflow for counting words in a text file. But it would be a pain to run all that code every time we want to count the words in a file. To make things more efficient, let’s turn the above code into reusable functions called `load_text()`, `clean_text()` and `count_words()` by defining them in our Python interpreter:

```{tip}
We've added a short docstring to each function here using triple quotes. We'll talk more about docstrings in **{numref}`03:Writing-docstrings`**.
```

```{prompt} python >>> auto
>>> def load_text(input_file):
        """Load text from a text file and return as a string."""
        with open(input_file, "r") as file:
            text = file.read()
        return text
```
```{prompt} python >>> auto
>>> def clean_text(text):
        """Lowercase and remove punctuation from a string."""
        text = text.lower()
        for p in punctuation:
            text = text.replace(p, "")
        return text
```
```{prompt} python >>> auto
>>> def count_words(input_file):
        """Count unique words in a text file."""
        text = load_text(input_file)
        text = clean_text(text)
        words = text.split()
        return Counter(words)
```

We can now use our word-counting functionality as follows:

```{prompt} python >>> auto
>>> count_words("zen.txt")
```

```console
Counter({'is': 10, 'better': 8, 'than': 8, 'the': 6, 
'to': 5, 'of': 3, 'although': 3, 'never': 3, ... })
```
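One way to sanity-check these functions without relying on *`zen.txt`* is to run them on a small temporary file whose word counts are easy to verify by hand. The sketch below repeats the three functions so that it is self-contained; the sample sentence and the temporary file name are made up for illustration:

```python
import tempfile
from collections import Counter
from pathlib import Path
from string import punctuation


def load_text(input_file):
    """Load text from a text file and return as a string."""
    with open(input_file, "r") as file:
        text = file.read()
    return text


def clean_text(text):
    """Lowercase and remove punctuation from a string."""
    text = text.lower()
    for p in punctuation:
        text = text.replace(p, "")
    return text


def count_words(input_file):
    """Count unique words in a text file."""
    text = load_text(input_file)
    text = clean_text(text)
    words = text.split()
    return Counter(words)


# Write a small sample file to a temporary directory and count its words
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_file = Path(tmp_dir) / "sample.txt"
    sample_file.write_text("Insert, insert! INSERT a word.")
    sample_counts = count_words(sample_file)

print(sample_counts)  # Counter({'insert': 3, 'a': 1, 'word': 1})
```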

Unfortunately, if you quit from the Python interpreter, the functions we defined will be lost and you will have to define them again in new sessions. The whole idea of a Python package is that we can store Python code, like our `load_text()`, `clean_text()` and `count_words()` functions, in an installable package that will allow us, or others, to reuse the code at will in any project without having to re-write it. In the remainder of this chapter, we'll work towards packaging up the code we've written into a Python package called `pycounts` so that we can install it and `import` it in any Python session or new project.

## Package structure

(03:A-brief-introduction)=
### A brief introduction

To develop our `pycounts` package we first need to create an appropriate directory structure. A Python package comprises a specific directory structure of one or more Python modules (files with a *.py* extension that contain the Python code you want to package up) and instructions on how to build and install the package on a computer. Below is an example package structure containing two modules:

```
example_pkg
├── src
│   └── example_pkg
│       ├── __init__.py
│       ├── module1.py
│       └── module2.py
├── README.md
└── pyproject.toml
```

The root directory is named after the package ("example_pkg" here). It contains a *`src/example_pkg`* sub-directory that shares the package's name and contains the Python source code (*.py* files) that make up the package. The *`__init__.py`* file tells Python to treat the directory as a package (we'll talk more about this file in **{numref}`04:The-\_\_init\_\_.py-file`**). *`README.md`* is a text file that provides high-level information about the package (what it does, how it can be used, its structure, etc.) - it is not strictly needed to create a package but is highly recommended. The *`pyproject.toml`* file contains metadata about the project (who made it, how it is licensed, etc.) and instructions on how to build the package for installation and distribution, as we'll talk about in **{numref}`03:Installing-your-package`**.
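Purely to make this layout concrete, the sketch below builds the same minimal structure as empty placeholder files inside a temporary directory using `pathlib`, then lists what was created. It is an illustration only; the files contain nothing, so the result is not an installable package:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir) / "example_pkg"
    pkg_dir = root / "src" / "example_pkg"
    pkg_dir.mkdir(parents=True)

    # Empty placeholder files; real content would be added later
    for name in ["__init__.py", "module1.py", "module2.py"]:
        (pkg_dir / name).touch()
    for name in ["README.md", "pyproject.toml"]:
        (root / name).touch()

    # List every file relative to the package root
    created = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )

print(created)
```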

The above structure is suitable for a very simple package, or one intended solely for personal use. But more complex packages and/or those intended to be shared with others usually contain many more bells and whistles, such as additional documentation, examples of usage, tests that can be run to validate the functionality of the package, and more. Below is an example of a package structure intended for sharing publicly (i.e., open source):

```
example_pkg
├── CHANGELOG.md
├── CONDUCT.md
├── CONTRIBUTING.md
├── docs
│   └── ...
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── example_pkg
│       ├── __init__.py
│       ├── example_module1.py
│       └── example_module2.py
└── tests
    └── ...
```

The `pycounts` package we are going to create in this chapter will follow the latter package structure and we'll explore each element in that structure in the remainder of this chapter. Our reasoning for building a package with all the bells and whistles is to expose you to all the different elements of packaging, so that you understand them when you see them "in the wild", and can make informed choices about what content to include in your own packages in the future, depending on their intended use and audience.

(03:Creating-a-package-structure)=
### Creating a package structure

Regardless of your packaging expertise or how much content you intend to include in your package, it's efficient to use a pre-made template to set up the boilerplate directory structure. We will use `cookiecutter` (a Python package we installed in **{numref}`02:Install-packaging-software`**) to create our package structure for us.

`cookiecutter` is a tool for populating a directory structure from a pre-made template. People have developed and open-sourced many different `cookiecutter` templates for different projects, such as for creating Python packages, R packages, websites, and more. You can find these templates by, for example, searching an online hosting service like [GitHub](https://www.github.com). We have developed our own `py-pkgs-cookiecutter` template that can be used to create a Python package directory structure. The template is hosted on [GitHub](https://github.com/py-pkgs/py-pkgs-cookiecutter) and can be used from the command line by navigating to the directory where you want to create your package and running the following command:

```{prompt} bash \$ auto
$ cookiecutter https://github.com/py-pkgs/py-pkgs-cookiecutter.git
```

Upon executing the above command you will be prompted to provide information that will be used to create and customize your package file and directory structure. Below is an example of how to respond to these prompts and an explanation of what they mean.

```console
author_name [Monty Python]: Tomas Beuzen
package_name [mypkg]: pycounts
package_short_description [A package for doing great things!]: Calculate word counts in a text file!
package_version [0.1.0]: 
python_version [3.9]: 
Select open_source_license:
1 - MIT
2 - Apache License 2.0
3 - GNU General Public License v3.0
4 - Creative Commons Attribution 4.0
5 - BSD 3-Clause
6 - Proprietary
7 - None
Choose from 1, 2, 3, 4, 5, 6, 7 [1]: 
Select include_github_actions:
1 - no
2 - yes
Choose from 1, 2 [1]: 
```

Let's break down the options above:

- `author_name`, `package_name`, and `package_short_description` are self-explanatory. We provide guidance on choosing a good package name in **{numref}`04:Package-and-modules-names`**, but note that in this book we will eventually be publishing our `pycounts` package to Python's main package index [PyPI](https://pypi.org/). Package names on PyPI must be unique. **So if you plan to follow along with this tutorial you should choose a unique name for your package**. Something like `pycounts_[your initials]` might be appropriate, but you can check if a particular name is already taken by searching for it on [PyPI](https://pypi.org/).
- `package_version` is the version of your package. Most Python packages use [semantic versioning](https://semver.org) for identifying their software. In semantic versioning, a version number consists of three integers A.B.C, where A is the "major" version, B is the "minor" version, and C is the "patch" version. The first version of a software usually starts at 0.1.0 and increments from there. We'll discuss versioning in **Chapter 7: {ref}`07:Releasing-and-versioning`**.
- `python_version` is the minimum version of Python you want to support.
- `open_source_license` is the license you wish to use that dictates how your package can be used and redistributed. The `py-pkgs-cookiecutter` provides several options for commonly used licenses. We've selected the [MIT License](https://choosealicense.com/licenses/mit/) for `pycounts` which is a simple, permissive license commonly used for open source work. Note that if your project will not be open source or you wish to retain exclusive copyright, you can choose not to include a license.
- `include_github_actions` is an option to include continuous integration and continuous deployment files to help automate the building, testing and deployment of your Python package using the [GitHub Actions](https://github.com/features/actions) service. We've selected no for this option as we'll explore these topics in more detail in **Chapter 8: {ref}`08:Continuous-integration-and-deployment`**.
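The A.B.C scheme described for `package_version` above can be illustrated with a few lines of Python. The `parse_version()` helper below is hypothetical and written only for this illustration (it is not part of `cookiecutter`, `poetry`, or the `pycounts` package); it splits a version string into integers so that versions compare numerically rather than alphabetically:

```python
def parse_version(version):
    """Split an 'A.B.C' version string into (major, minor, patch) integers.

    Hypothetical helper for illustration only.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


first_release = parse_version("0.1.0")

# Tuples compare element by element, so 0.10.0 is correctly newer than 0.9.1
is_newer = parse_version("0.10.0") > parse_version("0.9.1")

# Comparing the raw strings would give the wrong answer
is_newer_as_strings = "0.10.0" > "0.9.1"

print(first_release, is_newer, is_newer_as_strings)  # (0, 1, 0) True False
```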

After responding to the `cookiecutter` prompts, we now have a new directory called `pycounts`, full of content suitable for building a fully-featured Python package! We'll explore each element of this directory structure as we develop our package throughout this chapter, but we've given a rough indication of what each file is related to below:

```
pycounts
├── .readthedocs.yml           ┐
├── CHANGELOG.md               │
├── CONDUCT.md                 │
├── CONTRIBUTING.md            │
├── docs                       │
│   ├── make.bat               │
│   ├── Makefile               │
│   ├── requirements.txt       │
│   ├── changelog.md           │
│   ├── conduct.md             │
│   ├── conf.py                │ Package documentation
│   ├── contributing.md        │
│   ├── index.md               │
│   └── usage.ipynb            │
├── LICENSE                    │
├── README.md                  ┘
├── pyproject.toml             ┐ 
├── src                        │
│   └── pycounts               │ Package source code, metadata,
│       ├── __init__.py        │ and build instructions 
│       └── pycounts.py        ┘
└── tests                      ┐
    └── test_pycounts.py       ┘ Package tests
```

(03:Put-your-package-under-version-control)=
## Put your package under version control

Before continuing to develop our package, it is generally good practice to put it under local and remote version control. This is not necessary for developing a package but is highly recommended so that you can better manage and track changes to your package over time. Version control is particularly useful and important if you plan on sharing and collaborating on your package with others. If you don't want to use version control, feel free to skip to **{numref}`03:Your-first-package-code`**. The tools we will be using for version control are Git and GitHub (which we set up in **{numref}`02:Install-Git-and-register-for-a-GitHub-account`**).

```{attention}
For this book, we assume readers have basic familiarity with Git and GitHub (or similar). To learn more about Git and GitHub, we recommend the following resources: [Happy Git and GitHub for the useR](https://happygitwithr.com) {cite:p}`bryan2021` and [Research Software Engineering with Python: Using Git at the Command Line](https://merely-useful.tech/py-rse/git-cmdline.html) {cite:p}`rsep2021b`.
```

### Set up local version control

To set up local version control, navigate to the root `pycounts` directory and initialize a Git repository:

```{prompt} bash \$ auto
$ cd pycounts
$ git init
```

```console
Initialized empty Git repository in /Users/tomasbeuzen/pycounts/.git/
```

Next, we need to tell Git which files to track (which will be all of them at this point) and then commit these changes locally:

```{prompt} bash \$ auto
$ git add .
$ git commit -m "initial package setup"
```

```console
[master (root-commit) 51795ad] initial package setup
 21 files changed, 538 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 .readthedocs.yml
 create mode 100644 CHANGELOG.md
 ...
 create mode 100644 src/pycounts/__init__.py
 create mode 100644 src/pycounts/pycounts.py
 create mode 100644 tests/test_pycounts.py
```

### Set up remote version control

Now that we have set up our local version control, we will create a repository on [GitHub](https://github.com/) and set it as the remote version control home for this project. First, create the new repository as demonstrated in {numref}`03-set-up-github-1`:

```{figure} images/03-set-up-github-1.png
---
width: 100%
name: 03-set-up-github-1
alt: Creating a new repository in GitHub.
---
Creating a new repository in GitHub.
```

To follow along with this tutorial, select the following options when setting up your GitHub repository, as shown in {numref}`03-set-up-github-2`: 

1. Give the GitHub repository the same name as your Python package and give it a short description;
2. You can choose to make your repository public or private - we'll be making ours public so we can share it with others; and,
3. Do not initialize the GitHub repository with any files (we've already created the files we need using our `py-pkgs-cookiecutter` template).

```{figure} images/03-set-up-github-2.png
---
width: 100%
name: 03-set-up-github-2
alt: Setting up a new repository in GitHub.
---
Setting up a new repository in GitHub.
```

Next, copy the remote link to your repository and then use the commands shown on GitHub, and outlined in {numref}`03-set-up-github-3`, to link your local repository with the remote repository, and push your project to GitHub:

```{figure} images/03-set-up-github-3.png
---
width: 100%
name: 03-set-up-github-3
alt: Instructions on how to link local and remote repositories.
---
Instructions on how to link local and remote repositories.
```

```{prompt} bash \$ auto
$ git remote add origin git@github.com:TomasBeuzen/pycounts.git
$ git branch -M main
$ git push -u origin main
```

```console
Enumerating objects: 26, done.
Counting objects: 100% (26/26), done.
Delta compression using up to 8 threads
Compressing objects: 100% (19/19), done.
Writing objects: 100% (26/26), 8.03 KiB | 4.01 MiB/s, done.
Total 26 (delta 0), reused 0 (delta 0)
To github.com:TomasBeuzen/pycounts.git
 * [new branch]      main -> main
Branch 'main' set up to track remote branch 'main' from 'origin'.
```

```{attention}
The commands above should be specific to your GitHub username and the name of your Python package. The example above uses SSH authentication with GitHub, which we recommend setting up. SSH is useful for connecting to GitHub without having to supply your username and password every time. If you're interested in setting up SSH, take a look at the [GitHub documentation](https://docs.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh). If you don't have SSH authentication set up, HTTPS authentication works as well and would require the use of the following URL in place of the one shown above to set the remote: `https://github.com/TomasBeuzen/pycounts.git`.
```

(03:Package-your-code)=
## Package your code

We now have our package structure set up, and are ready to populate our package with the `load_text()`, `clean_text()` and `count_words()` functions we developed at the beginning of the chapter. Where should we put these functions? Let's review the structure of our package:

```
pycounts
├── .readthedocs.yml
├── CHANGELOG.md
├── CONDUCT.md
├── CONTRIBUTING.md
├── docs
│   └── ...
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── pycounts
│       ├── __init__.py
│       └── pycounts.py
└── tests
    └── ...
```

All the code that we would like the user to run as part of our package should live in modules in the `src` directory. Our `py-pkgs-cookiecutter` template already created a Python module for us to put our code in called `src/pycounts/pycounts.py` (note that this module can be named anything, but it is common for a module to share the name of the package). We'll save our functions there. Because our functions depend on `collections.Counter` and `string.punctuation`, we should also be sure to import them at the top of the file. Here's what *`src/pycounts/pycounts.py`* should now look like:

```python
from collections import Counter
from string import punctuation


def load_text(input_file):
    """Load text from a text file and return as a string."""
    with open(input_file, "r") as file:
        text = file.read()
    return text


def clean_text(text):
    """Lowercase and remove punctuation from a string."""
    text = text.lower()
    for p in punctuation:
        text = text.replace(p, "")
    return text


def count_words(input_file):
    """Count unique words in a text file."""
    text = load_text(input_file)
    text = clean_text(text)
    words = text.split()
    return Counter(words)
```

## Test drive your package code

(03:Create-a-virtual-environment)=
### Create a virtual environment

Before we install and test our package, it is highly recommended to set up a virtual environment. As discussed previously in **{numref}`02:Installing-Python`**, a virtual environment provides a safe and isolated space for us to install our package and any other packages it depends on, without affecting other environments and projects on our computer (or vice versa). If you don't want to use a virtual environment, feel free to skip to **{numref}`03:Installing-your-package`**.

There are several options available when it comes to creating and managing virtual environments (e.g., `conda`, `virtualenv`, etc.). We will use `conda` (which we installed in **{numref}`02:Installing-Python`**) because it is a simple, commonly-used, and effective tool for managing virtual environments.

To use `conda` to create and activate a new virtual environment called `pycounts` that includes Python 3.9, run the following in your terminal:

```{prompt} bash \$ auto
$ conda create --name pycounts python=3.9 -y
```

```{note}
We are using Python 3.9 because that is the minimum version of Python we specified that our package will support in **{numref}`03:Creating-a-package-structure`**. In **Chapter 8: {ref}`08:Continuous-integration-and-deployment`**, we'll show how you can automatically test your package against different versions of Python using continuous integration (without having to set up a virtual environment for each one).
```

To use this new environment for developing and installing software, we should "activate" the environment:

```{prompt} bash \$ auto
$ conda activate pycounts
```

In most command lines, `conda` will add a prefix like `(pycounts)` to your command-line prompt to indicate which environment you are working in. Anytime you wish to work on your package, you should activate its virtual environment. You can view the packages installed in an environment using the command `conda list`. At this point, our `pycounts` environment should only have Python 3.9 and a small collection of its dependencies installed.
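If you want to confirm from inside Python which interpreter and environment are active, the standard-library `sys` module can help. The following is a minimal sketch; with the `pycounts` environment activated, `sys.prefix` would be expected to point at that environment's directory, although the exact path depends on your `conda` installation:

```python
import sys

# The running interpreter's version as a (major, minor) tuple, e.g. (3, 9)
current_version = sys.version_info[:2]

# Check against the minimum version we chose for our package
meets_minimum = current_version >= (3, 9)

# sys.prefix is the directory of the environment this interpreter belongs to
print(current_version, meets_minimum, sys.prefix)
```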

````{tip}
You can exit a `conda` virtual environment anytime using the following command:

```{prompt} bash \$ auto
$ conda deactivate
```
````

(03:Installing-your-package)=
### Installing your package

We have our package structure set up and we've populated it with some Python code. Now, how do we install and use it? There are several tools available to develop and build installable Python packages: `poetry`, `flit`, `setuptools`, and more. We compare and contrast these tools in **{numref}`04:Packaging-tools`**. In this book we will be using `poetry` (which we installed in **{numref}`02:Install-packaging-software`**), because it is a modern packaging tool that provides simple and efficient commands to develop, install, and distribute Python packages.

`poetry` uses the *`pyproject.toml`* file to configure how a package should be installed. The *`pyproject.toml`* that the `py-pkgs-cookiecutter` automatically created for our `pycounts` package looks like this:

```toml
[tool.poetry]
name = "pycounts"
version = "0.1.0"
description = "Calculate word counts in a text file!"
authors = ["Tomas Beuzen"]
license = "MIT"
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.9"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
```

Below is a brief description of each of the headings in that file (called "tables" in the TOML file jargon):

- `[tool.poetry]`: contains metadata about our package. At a minimum, you must define the `name`, `version`, `description`, and `authors` of your package. Additional options can provide more metadata about your package and are described in the `poetry` [documentation](https://python-poetry.org/docs/pyproject/#dependencies-and-dev-dependencies).
- `[tool.poetry.dependencies]`: identifies the dependencies of your package - that is, other packages required to be installed by a user to use your package. Currently our `pycounts` package only depends on Python 3.9 or higher (but we'll add more code and some dependencies to our package later in this chapter).
- `[tool.poetry.dev-dependencies]`: identifies development dependencies of our package. These are packages not required to use our package, but required for development purposes such as running tests or building documentation. We'll add development dependencies to our `pycounts` package later in this chapter.
- `[build-system]`: this identifies the build tools to be used to build your package into a single, shareable distribution, as we'll talk more about in **{numref}`03:Building-and-distributing-your-package`**.

With our *`pyproject.toml`* file already set up for us by our `py-pkgs-cookiecutter`, we can go right ahead and use `poetry` to install our package using the command `poetry install` from the root package directory:

```{prompt} bash \$ auto
$ poetry install
```

```console
Updating dependencies
Resolving dependencies... (0.1s)

Writing lock file

Installing the current project: pycounts (0.1.0)
```

```{tip}
When installing a package for the first time, `poetry` also creates a *`poetry.lock`* file, which contains the exact versions of all the packages and their dependencies you've installed. Subsequent runs of `poetry install` will install packages based on *`poetry.lock`*. This can be helpful for anyone developing your project (including you in the future) because it means they will use the exact same versions of the dependencies that you used when you created the project - even if those dependencies have since released new versions compatible with your *`pyproject.toml`* file. For readers who have used *`requirements.txt`* before with `pip` or *`environment.yaml`* with `conda`, you can think of *`poetry.lock`* as the `poetry` equivalent of those files. If new versions of your dependencies do become available, you can choose to update your installed versions of them and the *`poetry.lock`* file by running `poetry update`. We won't be focusing on *`poetry.lock`* in this book but it can be a useful tool if you're developing a package with others and you can read more about it in the `poetry` [documentation](https://python-poetry.org/docs/basic-usage/#installing-dependencies). 
```

With our package installed, we can now `import` and use it in a Python session. Before we do that, we need a text file to test our package on. Feel free to use any text file, but for now, we'll create the same "Zen of Python" text file we used earlier in the chapter by running the following at the command line:

```{prompt} bash \$ auto
$ python -c "import this" > zen.txt
```

Now we can open an interactive Python session and `import` and use the `count_words()` function from our `pycounts` module with the following code:

```{prompt} python >>> auto
>>> from pycounts.pycounts import count_words
>>> count_words("zen.txt")
```

```console
Counter({'is': 10, 'better': 8, 'than': 8, 'the': 6, 
'to': 5, 'of': 3, 'although': 3, 'never': 3, ... })
```

Looks like everything is working! We now have created and installed a simple Python package that encapsulates the functionality we want to reuse and which you could now use in different projects (by installing it in their respective virtual environments).

It's important to note that `poetry` installs packages in "editable mode", which essentially means that it installs a link to your package's code on your computer. This is common practice for developers because it means that any edits you now make to your package's source code are immediately available the next time you `import` it, without having to `poetry install` again. We'll talk more about editable installs and Python's import system in **{numref}`04:Packaging-fundamentals`**.
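One way to check where Python will load a package from is to ask the import system for its location. The sketch below uses the standard-library `json` package as a stand-in so that it runs anywhere; once `pycounts` is installed, substituting `"pycounts"` should report a path inside your project's *`src/pycounts`* directory if the install is editable, though this is an expectation rather than something verified here:

```python
import importlib.util

# "json" is a stand-in so this snippet runs anywhere;
# swap in "pycounts" once the package is installed
spec = importlib.util.find_spec("json")

# The file Python would load when this package is imported
print(spec.origin)
```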

In the next section, we'll add additional code and functionality to our package and explore `poetry` and the *`pyproject.toml`* file a bit more. For those using version control, it's a good idea to commit the changes we've made to *`src/pycounts/pycounts.py`* to local and remote version control:

```{prompt} bash \$ auto
$ git add src/pycounts/pycounts.py
$ git commit -m "feat: add word counting functions"
$ git push
```

```{tip}
Different developers use different syntax and formats when using a version control system. Here we use the [Angular style](https://github.com/angular/angular.js/blob/master/DEVELOPERS.md#-git-commit-guidelines) for commit messages. Messages take the form "type: subject", where "type" indicates the kind of change being made and "subject" contains a description of the change. In this book, we'll use the following "types" to identify our commits:
- "build": indicates a change to the build system or external dependencies.
- "docs": indicates a change to documentation.
- "feat": indicates a new feature being added to the code base.
- "fix": indicates a bug fix.
- "test": indicates a change to the testing framework.
```
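The "type: subject" form described in the tip above is simple enough to check with a regular expression. The `is_valid_commit_message()` helper below is hypothetical and covers only the five types listed in this chapter; it is not part of Git or of the Angular guidelines themselves:

```python
import re

# The commit "types" used in this book
COMMIT_TYPES = ("build", "docs", "feat", "fix", "test")

# A type from the list, then ": ", then a non-empty subject
COMMIT_PATTERN = re.compile(rf"^({'|'.join(COMMIT_TYPES)}): \S.*$")


def is_valid_commit_message(message):
    """Return True if a message follows the 'type: subject' form.

    Hypothetical helper for illustration only.
    """
    return COMMIT_PATTERN.match(message) is not None


print(is_valid_commit_message("feat: add word counting functions"))  # True
print(is_valid_commit_message("added some stuff"))                   # False
```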

(03:Adding-code-with-dependencies-to-your-package)=
## Adding code with dependencies to your package

Let's now add some new functionality to our package: a plotting function that will plot a bar chart of the top `n` words in a text file.

Imagine we've come up with the following `plot_words()` function that creates a bar chart of the top `n` words in a `Counter` object of word counts. The code itself is not overly important to our discussion, but briefly, the function uses the `.most_common()` method of the `Counter` object to find the top `n` word counts, which are returned as a list of `n` tuples of the format `(word, count)`. It then uses the Python shorthand `zip(*...)` to unpack that list of tuples into two separate sequences, `word` and `count`. Finally, the `matplotlib` package is used to plot the result (`plt.bar(...)`), which looks like {numref}`03-matplotlib-figure`.

```python
import matplotlib.pyplot as plt

def plot_words(word_counts, n=10):
    """Plot a bar chart of word counts."""
    top_n_words = word_counts.most_common(n)
    word, count = zip(*top_n_words)
    fig = plt.bar(range(n), count)
    plt.xticks(range(n), labels=word, rotation=45)
    plt.xlabel("Word")
    plt.ylabel("Count")
    return fig
```

```{figure} images/03-matplotlib-figure.png
---
width: 100%
name: 03-matplotlib-figure
alt: Example figure created from the `plot_words()` function.
---
Example figure created from the `plot_words()` function.
```
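
To make the `.most_common()` and `zip(*...)` steps of `plot_words()` concrete, here is a small sketch that runs them on a hand-made `Counter` without any plotting. The word list is made up for illustration:

```python
from collections import Counter

counts = Counter(["a", "happy", "hello", "a", "world", "happy", "a"])

# The two most frequent words, as a list of (word, count) tuples
top_2 = counts.most_common(2)
print(top_2)  # [('a', 3), ('happy', 2)]

# zip(*...) regroups the pairs: all words together, all counts together
word, count = zip(*top_2)
print(word)   # ('a', 'happy')
print(count)  # (3, 2)
```

Note that `zip()` produces tuples rather than lists, which `matplotlib` accepts just as readily.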

Where should we put this function in our package? You could certainly add all your package code into a single module (e.g., *`src/pycounts/pycounts.py`*), but as you add functionality to your package that module will quickly become overcrowded and hard to manage. Instead, as you write more code, it's a good idea to compartmentalize and organize it into multiple, logical modules. With that in mind, we'll create a new module called *`src/pycounts/plotting.py`* to house our plotting function `plot_words()`. You can create that new module in an editor of your choice, or by running the following command at the command line:

```{prompt} bash \$ auto
$ touch src/pycounts/plotting.py
```

Your package structure should now look like this:

```
pycounts
├── .gitignore
├── .readthedocs.yml
├── CHANGELOG.md
├── CONDUCT.md
├── CONTRIBUTING.md
├── docs
│   └── ...
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── pycounts
│       ├── __init__.py
│       ├── plotting.py
│       └── pycounts.py
└── tests
    └── ...
```

Open *`src/pycounts/plotting.py`* and add the `plot_words()` code above (don't forget to add the `import matplotlib.pyplot as plt` at the top of the module). After doing this, if we tried to `import` our new function we'd get an error:

```{prompt} python >>> auto
>>> from pycounts.plotting import plot_words
```

```console
ModuleNotFoundError: No module named 'matplotlib'
```

This is because `matplotlib` is not part of the Python standard library; it is an external library that we need to install and add as a dependency of our `pycounts` package. We can do this with `poetry` using the command `poetry add`. This command will install the specified package(s) into the current environment and will update the `[tool.poetry.dependencies]` section of the *`pyproject.toml`* file:

```{prompt} bash \$ auto
$ poetry add matplotlib
```

```console
Using version ^3.4.3 for matplotlib

Updating dependencies
Resolving dependencies...

Writing lock file

Package operations: 8 installs, 0 updates, 0 removals

  • Installing six (1.16.0)
  • Installing cycler (0.10.0)
  • Installing kiwisolver (1.3.1)
  • Installing numpy (1.21.1)
  • Installing pillow (8.3.1)
  • Installing pyparsing (2.4.7)
  • Installing python-dateutil (2.8.2)
  • Installing matplotlib (3.4.3)
```

If we view our *`pyproject.toml`* file we now see `matplotlib` listed as a dependency under the `[tool.poetry.dependencies]` section (which previously only contained Python 3.9 as a dependency, as we saw in **{numref}`03:Installing-your-package`**):

```toml
[tool.poetry.dependencies]
python = "^3.9"
matplotlib = "^3.4.3"
```
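
The caret in `^3.4.3` is a version constraint. In `poetry`, a caret constraint on a version with a non-zero major number allows any later version that keeps the same major number, so `^3.4.3` means `>=3.4.3,<4.0.0`. As a rough illustration only (the `satisfies_caret()` helper below is hypothetical, is not part of `poetry`, and assumes plain three-part version numbers with a non-zero major version):

```python
def satisfies_caret(version, base):
    """Simplified check of whether `version` satisfies the caret constraint `^base`."""
    v = tuple(int(part) for part in version.split("."))
    b = tuple(int(part) for part in base.split("."))
    upper = (b[0] + 1, 0, 0)  # the next major version is excluded
    return b <= v < upper


print(satisfies_caret("3.4.3", "3.4.3"))  # True
print(satisfies_caret("3.5.0", "3.4.3"))  # True
print(satisfies_caret("4.0.0", "3.4.3"))  # False
print(satisfies_caret("3.4.2", "3.4.3"))  # False
```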

We can now use our package as follows:

```{prompt} python >>> auto
>>> from pycounts.pycounts import count_words
>>> from pycounts.plotting import plot_words
>>> counts = count_words("zen.txt")
>>> fig = plot_words(counts, 10)
```

````{attention}
If running the above Python code in an interactive IPython shell or Jupyter notebook, the plot will be displayed automatically. If you're running from the Python shell, you'll need to run the `matplotlib` command `plt.show()` to display the plot:

```{prompt} python >>> auto
>>> import matplotlib.pyplot as plt
>>> plt.show()
```
````

We've made some important changes to our package in this section by adding a new module and a dependency, so those using version control should commit these changes:

```{prompt} bash \$ auto
$ git add src/pycounts/plotting.py
$ git commit -m "feat: add plotting module"
$ git add pyproject.toml poetry.lock
$ git commit -m "build: add matplotlib as a dependency"
$ git push
```

(03:Testing-your-package)=
## Testing your package

(03:Writing-tests)=
### Writing tests

At this point we have developed a package that can count words in a text file and plot the results. But how can we be certain that our package works correctly and produces reliable results? One thing we can do is write tests for our package that check the package is working as expected. This is particularly important if you intend to share your package with others (you don't want to share code that doesn't work!). But even if you don't intend to share your package, writing tests can help catch errors in your code, and help you add new code in the future without breaking any tried-and-tested existing functionality. If you don't want to write tests for your package, feel free to skip to **{numref}`03:Package-documentation`**.

Many of us already conduct informal tests of our code by running it a few times in a Python session to see if it's working as we expect, and if not, changing the code and repeating the process. This is informal testing, sometimes called "manual testing" or "exploratory testing". However, when writing software, it's preferable to define your tests in a more formal and reproducible way.

There are different kinds of tests used to test software (unit tests, integration tests, regression tests, etc.) and we discuss these in **Chapter 5: {ref}`05:Testing`**. For now, we'll write some unit tests for our `pycounts` package. Unit tests are a commonly used type of test that evaluates a single "unit" of software, such as a Python function, to check that it produces an expected result.

The Python `assert` statement is often used to create unit tests. It checks whether an expression is true; if it is, Python does nothing and continues running, but if it's false, Python raises an `AssertionError` (optionally with a user-defined error message) and the code terminates. For example, consider running the following code in a Python interpreter:

```python
ages = [32, 19, 9, 75]
for age in ages:
    assert age >= 18, "Person is younger than 18!"
    print("Age verified!")
```

```console
Age verified!
Age verified!
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
AssertionError: Person is younger than 18!
```

Note how the first two "ages" (32 and 19) are verified, with an "Age verified!" message printed to screen. But the third age of 9 fails the `assert`, so an error message is raised and the program terminates, such that the last age of 75 is not checked.
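
The `AssertionError` shown in the traceback above is an ordinary Python exception, so it can be caught and its message inspected. A small sketch, wrapping the age check in a hypothetical `verify_age()` function so that a failure does not stop the program:

```python
def verify_age(age):
    """Return a confirmation string, or raise AssertionError for ages under 18."""
    assert age >= 18, "Person is younger than 18!"
    return "Age verified!"


print(verify_age(32))  # Age verified!

try:
    verify_age(9)
except AssertionError as error:
    # The text after the comma in the assert statement becomes the error message
    message = str(error)

print(message)  # Person is younger than 18!
```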

Using the `assert` statement, let's write a unit test for the `count_words()` function of our `pycounts` package. We want to `assert` that the function produces an expected result given a particular input. Consider the following quote from Albert Einstein:

>*"Insanity is doing the same thing over and over and expecting different results."*

We can manually count the words in that quote to get the following result (ignoring capitalization and punctuation):

```python
einstein_counts = {'insanity': 1, 'is': 1, 'doing': 1,  'the': 1, 'same': 1, 'thing': 1, 'over': 2, 'and': 2, 'expecting': 1, 'different': 1, 'results': 1}
```

A unit test for `count_words()` would therefore check that the function produces the result above, given the raw quote as input. To write this test, let's first create a text file with the Einstein quote to use in our unit test. We'll add it to the *`tests`* directory of our package as a file called *`einstein.txt`* - you can make the file manually and copy the quote above, or you can create it from a Python session in the root package directory using the following code:

```{prompt} python >>> auto
>>> quote = "Insanity is doing the same thing over and over and expecting different results."
>>> with open("tests/einstein.txt", "w") as file:
        file.write(quote)
```

Now, a unit test for our `count_words()` function would look as below:

```{prompt} python >>> auto
>>> from pycounts.pycounts import count_words
>>> from collections import Counter
>>> einstein = {'over': 2, 'and': 2, 'insanity': 1, 'is': 1,
                'doing': 1, 'the': 1, 'same': 1, 'thing': 1,
                'expecting': 1, 'different': 1, 'results': 1}
>>> expected = Counter(einstein)
>>> actual = count_words("tests/einstein.txt")
>>> assert actual == expected, "Einstein quote words counted incorrectly!"
```

If the above code runs without error, our `count_words()` function is working, at least to our test specifications. We can write tests for other kinds of situations too, such as checking that two numbers are approximately equal, or that your code raises a certain error when used in a particular way. We'll explore these kinds of tests and more in **Chapter 5: {ref}`05:Testing`**.
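
As a taste of those other kinds of checks, the sketch below uses only the standard library: `math.isclose()` for approximate equality and a `try`/`except` block to confirm that an error is raised. The helper `raises_file_not_found()` and the file name are made up for illustration; testing frameworks typically offer more convenient tools for both tasks.

```python
import math

# Approximate equality: floating-point arithmetic rarely gives exact results
total = 0.1 + 0.2
assert total != 0.3, "Exact comparison unexpectedly succeeded!"
assert math.isclose(total, 0.3), "Sum is not close to 0.3!"


def raises_file_not_found(path):
    """Return True if opening `path` raises FileNotFoundError, otherwise False."""
    try:
        with open(path) as file:
            file.read()
    except FileNotFoundError:
        return True
    return False


# Expected error: opening a file that does not exist should fail
assert raises_file_not_found("this-file-does-not-exist.txt"), "No error raised!"
```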

(03:Running-tests)=
### Running tests

It would be tedious and inefficient to manually write and execute unit tests for your package's code in a Python interpreter like we did above. Instead, it's common to use a testing framework to automatically run our tests for us. `pytest` is one of the most commonly used testing frameworks for Python packages. To use `pytest`:

1. Tests are defined as functions prefixed with `test_`;
2. Tests are put in files of the form *`test_*.py`* or *`*_test.py`*, and are usually placed in a directory called *`tests`* in the package's root; and,
3. Tests are executed using the command `pytest` at the command line.

The `py-pkgs-cookiecutter` already created a *`tests`* directory and a module called *`test_pycounts.py`* for us to put our tests in:

```
pycounts
├── CHANGELOG.md
├── CONDUCT.md
├── CONTRIBUTING.md
├── docs
│   └── ...
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── ...
└── tests
    ├── einstein.txt
    └── test_pycounts.py
```

```{note}
We created the file *`tests/einstein.txt`* ourselves in **{numref}`03:Writing-tests`**; it was not created by the `py-pkgs-cookiecutter`.
```

`pytest` tests are written as functions prefixed with `test_` and which contain a checking statement like `assert` to verify some code functionality. Based on this format, let's add the unit test we created in **{numref}`03:Writing-tests`** as a test function to *`tests/test_pycounts.py`* using the below Python code:

```python
from pycounts.pycounts import count_words
from collections import Counter

def test_count_words():
    """Test word counting from a file."""
    einstein = {'over': 2, 'and': 2, 'insanity': 1, 'is': 1,
                'doing': 1, 'the': 1, 'same': 1, 'thing': 1,
                'expecting': 1, 'different': 1, 'results': 1}
    expected = Counter(einstein)
    actual = count_words("tests/einstein.txt")
    assert actual == expected, "Einstein quote words counted incorrectly!"
```

Before we can use `pytest` to run our test for us we need to add it as a development dependency of our package using the command `poetry add --dev`. A development dependency is a package that is not required by a user to use your package, but is required for development purposes (like testing):

```{prompt} bash \$ auto
$ poetry add --dev pytest
```

If you look in *`pyproject.toml`* you will see that `pytest` gets added under the `[tool.poetry.dev-dependencies]` section (which was previously empty, as we saw in **{numref}`03:Installing-your-package`**):

```toml
[tool.poetry.dev-dependencies]
pytest = "^6.2.4"
```

To use `pytest` to run our test we can use the following command from our root package directory:

```{prompt} bash \$ auto
$ pytest tests/
```

```console
============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /Users/tomasbeuzen/pycounts
collected 1 item

tests/test_pycounts.py .                                                  [100%]

============================== 1 passed in 0.01s ===============================
```

We get no error returned to us, indicating that our test passed! This suggests that the code we wrote is correct (at least to our test specifications)! We can add more tests for our package by writing more `test_*` functions. We'll discuss and write tests more in **Chapter 5: {ref}`05:Testing`**. For those using version control, commit your tests to local and remote version control:

```{prompt} bash \$ auto
$ git add pyproject.toml poetry.lock
$ git commit -m "build: add pytest as a dev dependency"
$ git add tests/*
$ git commit -m "test: add unit test for count_words"
$ git push
```

````{tip}
Tests are often run from the root package directory, which is why we hard-coded the relative file path to *`einstein.txt`* in our `test_count_words()` unit test as `count_words("tests/einstein.txt")`. But this code will fail if a user tries to run `pytest` from some other location. A more robust way to create the above test, so that it works regardless of where `pytest` is run, is to specify the location of *`einstein.txt`* relative to the test module *`test_pycounts.py`*. We can do that using the `dirname()` and `join()` functions from the `os.path` module as follows:

```python
from pycounts.pycounts import count_words
from collections import Counter
from os.path import dirname, join

def test_count_words():
    """Test word counting from a file."""
    einstein = {'over': 2, 'and': 2, 'insanity': 1, 'is': 1,
                'doing': 1, 'the': 1, 'same': 1, 'thing': 1,
                'expecting': 1, 'different': 1, 'results': 1}
    expected = Counter(einstein)
    module_path = dirname(__file__)  # path to the directory containing test_pycounts.py
    file_path = "einstein.txt"       # text file path relative to that directory
    actual = count_words(join(module_path, file_path))
    assert actual == expected, "Einstein quote words counted incorrectly!"
```
````
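
The same idea from the tip above can be written with `pathlib` from the standard library, which some developers find easier to read. In this sketch the helper name `data_file()` is an assumption made for illustration; it takes the path of the test module (normally `__file__`) and returns the path of a file sitting next to it:

```python
from pathlib import Path


def data_file(module_file, file_name):
    """Return the path to `file_name` in the same directory as `module_file`."""
    return Path(module_file).parent / file_name


# Inside tests/test_pycounts.py this would be called as
# data_file(__file__, "einstein.txt")
print(data_file("tests/test_pycounts.py", "einstein.txt"))
```

The resulting `Path` object can be passed directly to `count_words()`, because Python's built-in `open()` accepts `Path` objects as well as strings.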

(03:Test-coverage)=
### Test coverage

A good test suite will contain tests that cover the main functionality of your code, that is, your tests should run most or all of your code at least once. There are certainly exceptions to this, but the general idea is to have your tests cover the core functionality of your package. We refer to this as "coverage" and there is a useful extension to `pytest` called `pytest-cov` which we can use to automatically determine how much coverage our tests have.

Let's use `poetry` to add `pytest-cov` as a development dependency of our `pycounts` package now:

```{prompt} bash \$ auto
$ poetry add --dev pytest-cov
```

We can determine the coverage of our tests by running the following command which tells `pytest-cov` to determine the coverage our tests have of the `pycounts` package:

```{prompt} bash \$ auto
$ pytest tests/ --cov=pycounts
```

```console
============================= test session starts ==============================
platform darwin -- Python 3.9.6, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
rootdir: /Users/tomasbeuzen/pycounts
plugins: cov-2.12.1
collected 1 item

tests/test_pycounts.py .                                                  [100%]

---------- coverage: platform darwin, python 3.9.6-final-0 -----------
Name                       Stmts   Miss  Cover
----------------------------------------------
src/pycounts/__init__.py       2      0   100%
src/pycounts/plotting.py      10     10     0%
src/pycounts/pycounts.py      16      0   100%
----------------------------------------------
TOTAL                         28     10    64%

============================== 1 passed in 0.01s ===============================
```

In the output above, `Stmts` is the number of executable statements in a module, `Miss` is the number of those statements that were not executed by your tests, and `Cover` is the percentage of statements executed by your tests. From the above output, we can see that our tests currently don't cover any of the code in the `pycounts.plotting` module. We'll write more tests for our package, and discuss more advanced methods of testing, code coverage, and how to generate interactive coverage reports in **Chapter 5: {ref}`05:Testing`**.
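
The `Cover` column follows from the other two. As a simplified sketch (the function name `percent_covered()` is ours, and the rounding rules of the real coverage tool may differ in edge cases), the figures in the table above can be reproduced as:

```python
def percent_covered(statements, missed):
    """Percentage of statements executed by the tests, rounded to a whole number."""
    return round(100 * (statements - missed) / statements)


print(percent_covered(2, 0))    # 100  (src/pycounts/__init__.py)
print(percent_covered(10, 10))  # 0    (src/pycounts/plotting.py)
print(percent_covered(16, 0))   # 100  (src/pycounts/pycounts.py)
print(percent_covered(28, 10))  # 64   (TOTAL)
```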

(03:Package-documentation)=
## Package documentation

Documentation describing what your package does and how to use it is invaluable for the users of your package (including yourself). The amount of documentation needed to support your package varies depending on its complexity and the intended audience. All packages should at least have:
- A README: a text file containing high-level information about the package, e.g., what it does, how to install it, and how to use it.
- docstrings: a docstring is a string at the start of a module, class, method or function that describes what the code does and how to use it.

The above documentation might suffice for a simple, personal package. But more complex packages and/or ones that will be shared and collaborated on with a larger audience will typically contain additional documents such as:
- A license: explains who owns the copyright to your package source and how it can be used and shared.
- Contributing guidelines: explains how to contribute to the project.
- A code of conduct: defines standards for how to engage with and contribute to the project.
- A changelog: a chronologically ordered list of notable changes to your package over time, usually organized by version.
- Examples of usage: step-by-step examples showing how the package works in more detail.
- An application programming interface (API) reference: a list of the user-facing functionality of your package (i.e., functions, classes, etc.) along with a short description of what they do and how to use them.

We'll discuss these documents, and more, in detail in **Chapter 6: {ref}`06:Documentation`**. But regardless of how much documentation you intend to include in your package, it's common to develop it from a mix of manually written and automatically generated content using the documentation generator tool `sphinx`.

In this section, we will develop the documentation for our `pycounts` package. As we will be making `pycounts` open source and sharing it publicly, it will include all of the documentation listed above. We'll show how to compile this documentation and generate content automatically with `sphinx`, and how to host your documentation online using the free service [Read the Docs](https://readthedocs.org/).

(03:Writing-documentation)=
### Writing documentation

Python package documentation is typically written in a plain-text markup format such as [Markdown](https://en.wikipedia.org/wiki/Markdown) (*.md*) or [reStructuredText](https://www.sphinx-doc.org/en/master/usage/restructuredtext/index.html) (*.rst*). We'll be using Markdown in this book because it is relatively simple and commonly used (check out the [Markdown Guide](https://www.markdownguide.org) to learn more about Markdown syntax). A documentation generator like `sphinx` can then render these plain-text files into a format such as HTML or PDF for easier viewing and sharing, as we'll show later in this chapter. For now, consider the layout of our `pycounts` package:

```
pycounts
├── .gitignore
├── .readthedocs.yml
├── CHANGELOG.md
├── CONDUCT.md
├── CONTRIBUTING.md
├── docs
│   └── ...
├── LICENSE
├── README.md
├── pyproject.toml
├── src
│   └── ...
└── tests
    └── ...
```

The `py-pkgs-cookiecutter` we used to create our package structure already created and populated a lot of documentation for us, including a *`CHANGELOG.md`*, *`CONDUCT.md`*, *`CONTRIBUTING.md`*, and *`LICENSE`* file. These files are usually found in the root directory of the package because they contain important information for those interested in using and/or contributing to your package. A basic *`README.md`* was also created for us, but it contains a "Usage" section which is currently empty. Now that we've developed the basic functionality of `pycounts`, we can fill that section with Markdown text as follows:

````md
# pycounts

Calculate word counts in a text file!

## Installation

```bash
$ pip install pycounts
```

## Usage

`pycounts` can be used to count words in a text file and plot the results as follows:

```python
from pycounts.pycounts import count_words
from pycounts.plotting import plot_words
import matplotlib.pyplot as plt

file_path = "test.txt"  # path to your file
counts = count_words(file_path)
fig = plot_words(counts, n=10)
plt.show()
```

## Contributing

Interested in contributing? Check out the contributing guidelines. 
Please note that this project is released with a code of conduct. 
By contributing to this project, you agree to abide by its terms.

## Credits

This package was created with [`cookiecutter`](https://cookiecutter.readthedocs.io/en/latest/)
and the `py-pkgs-cookiecutter` [template](https://github.com/py-pkgs/py-pkgs-cookiecutter).
````

```{tip}
In the Markdown text above, the following syntax is used:
- Headers are denoted with number signs (\#). The number of number signs corresponds to the heading level.
- Code blocks are bounded by three back-ticks (\`\`\`). The name of a programming language can follow the opening back-ticks to specify how the code syntax should be highlighted.
- Links are defined using brackets \[\] to enclose the link text, followed by the URL in parentheses ().
```

So, we now have a *`CHANGELOG.md`*, *`CONDUCT.md`*, *`CONTRIBUTING.md`*, *`LICENSE`*, and *`README.md`*. In the next section, we'll explain how to document your package's Python code using docstrings.

(03:Writing-docstrings)=
### Writing docstrings

A docstring is a string, surrounded by triple-quotes, at the start of a module, class, or function in Python that provides documentation on what the object does and how to use it. General docstring convention in Python is described in [Python Enhancement Proposal (PEP) 257 - Docstring Conventions](https://www.python.org/dev/peps/pep-0257/), but there is flexibility in how you write your docstrings. 

A minimal docstring contains a single line describing what the object does, and that might be sufficient for a simple function or for when you're developing your code (we've been using this minimal style in the functions we've written in this chapter). However, for code you intend to share with others (including your future self) a more comprehensive docstring should be written. This serves the dual purpose of better documenting how to use your code, as well as providing the raw text for a documentation generator like `sphinx` to use to automatically create a searchable reference sheet for your package (which we'll do later). A typical docstring will include:

1. A one-line summary that does not use variable names or the function name;
2. An extended description;
3. Parameter types and descriptions;
4. Returned value types and descriptions;
5. Example usage; and,
6. Potentially more.

There are different "docstring styles" used in Python to organize this information, such as [numpydoc style](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard), [Google style](https://github.com/google/styleguide/blob/gh-pages/pyguide.md#38-comments-and-docstrings), and [sphinx style](https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html#the-sphinx-docstring-format). We'll be using the numpydoc style for our `pycounts` package because it is readable, commonly used, and supported by `sphinx`. The [numpydoc style](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) documentation describes the exact syntax used for the docstring; for our `count_words()` function, it looks like this (numbers in the docstring below identify items in the numbered list above, but they should not be included in your docstring):

```python
def count_words(input_file):
    """Count words in a text file. (1)

    Words are made lowercase and punctuation is removed 
    before counting. (2)

    Parameters (3)
    ----------
    input_file : str
        Path to text file.

    Returns (4)
    -------
    collections.Counter
        dict-like object where keys are words and values are their counts.

    Examples (5)
    --------
    >>> count_words("text.txt")
    """
    text = load_text(input_file)
    text = clean_text(text)
    words = text.split()
    return Counter(words)
```

You can add information to your docstrings at your discretion - you won't always need all the sections above, and in some cases you may want to include additional sections from the [numpydoc style documentation](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard). We've documented the remaining functions from our `pycounts` package as follows:

```python
def load_text(input_file):
    """Load text from a text file and return as a string.

    Parameters
    ----------
    input_file : str
        Path to text file.

    Returns
    -------
    str
        Text file contents.

    Examples
    --------
    >>> load_text("text.txt")
    """
    with open(input_file, "r") as file:
        text = file.read()
    return text

def clean_text(text):
    """Lowercase and remove punctuation from a string.

    Parameters
    ----------
    text : str
        Text to clean.

    Returns
    -------
    str
        Cleaned text.

    Examples
    --------
    >>> clean_text("Early optimization is the root of all evil!")
    'early optimization is the root of all evil'
    """
    text = text.lower()
    for p in punctuation:
        text = text.replace(p, "")
    return text

def plot_words(word_counts, n=10):
    """Plot a bar chart of word counts.

    The n most common words are plotted in descending order of count.

    Parameters
    ----------
    word_counts : collections.Counter
        Counter object of word counts.
    n : int, optional
        Plot the top n words. By default, 10.

    Returns
    -------
    matplotlib.container.BarContainer
        Bar chart of word counts.

    Examples
    --------
    >>> from pycounts.pycounts import count_words
    >>> from pycounts.plotting import plot_words
    >>> counts = count_words("text.txt")
    >>> plot_words(counts)
    """
    top_n_words = word_counts.most_common(n)
    word, count = zip(*top_n_words)
    fig = plt.bar(range(n), count)
    plt.xticks(range(n), labels=word, rotation=45)
    plt.xlabel("Word")
    plt.ylabel("Count")
    return fig
```

These docstrings can be accessed by users of our package by using the `help()` function in a Python interpreter:

```{prompt} python >>> auto
>>> from pycounts.pycounts import count_words
>>> help(count_words)
```

```console
Help on function count_words in module pycounts.pycounts:

count_words(input_file)
    Count words in a text file.
    
    Words are made lowercase and punctuation is removed 
    before counting.

    Parameters
    ----------
    input_file : str
        Path to text file.

    ...
```
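
A docstring is also stored on the object itself, in its `__doc__` attribute, and `inspect.getdoc()` from the standard library returns it with the indentation cleaned up. A minimal sketch, using a stand-in function (`shout()`, hypothetical and not part of `pycounts`) so that it runs without the package installed:

```python
import inspect


def shout(text):
    """Convert text to uppercase.

    Parameters
    ----------
    text : str
        Text to convert.

    Returns
    -------
    str
        Uppercased text.
    """
    return text.upper()


# Retrieve the docstring as a plain string and show its summary line
docstring = inspect.getdoc(shout)
print(docstring.splitlines()[0])  # Convert text to uppercase.
```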

However, for the users of our package it would be helpful to compile all of our functions and docstrings into an easy-to-navigate document, so they can access this information without having to `import` or search through our source code. Such a document is typically referred to as an API reference. We could create one by manually copying and pasting all of our function names and docstrings into a plain-text file, but that would be inefficient and not reproducible. Instead, we'll show how to use `sphinx` in **{numref}`03:Generating-documentation`** to automatically parse our source code, extract our functions and docstrings, and create an API reference for us.

(03:Creating-usage-examples)=
### Creating usage examples

Creating examples of how to use your package can be invaluable to new and existing users alike. Unlike the brief and basic "Usage" heading we wrote in our README in **{numref}`03:Writing-documentation`**, these examples are more like tutorials, including a mix of text and code that demonstrates the functionality and common workflows of your package step-by-step.

You could write examples from scratch using a plain-text format like Markdown, but this can be inefficient and prone to errors. If you change the way a function works, or what it outputs, you would have to re-write your example. Instead, in this section we'll show how to use Jupyter notebooks as a more efficient, interactive, and reproducible way to create usage examples for your users. If you don't want to create usage examples for your package, or aren't interested in learning how to use Jupyter notebooks to do so, you can skip to **{numref}`03:Generating-documentation`**.

Jupyter notebooks are documents that can contain executable code, equations, text, and visualizations. They can be effective for creating usage examples for your Python package because they can directly import and use code from your package and display its output. The utility of this approach is best seen by example. To create a usage example for our `pycounts` package using a Jupyter notebook, we first need to add `jupyter` as a development dependency:

```{prompt} bash \$ auto
$ poetry add --dev jupyter
```

Our `py-pkgs-cookiecutter` already created a notebook for us at *`pycounts/docs/example.ipynb`*. To edit that document, we can open the Jupyter Notebook application using the following command:

```{prompt} bash \$ auto
$ jupyter-notebook
```

```{tip}
If you're developing your Python packages in an IDE that supports notebooks, such as VS Code or JupyterLab, feel free to edit *`pycounts/docs/example.ipynb`* there.
```

In the interface, navigate to and open *`docs/example.ipynb`*. As explained in the [Jupyter Notebook documentation](https://jupyter-notebook.readthedocs.io/en/stable/), notebooks are composed of "cells" which can contain Python code or Markdown text. Our notebook currently looks like {numref}`03-jupyter-example-1`.

```{figure} images/03-jupyter-example-1.png
---
width: 100%
name: 03-jupyter-example-1
alt: A simple Jupyter notebook using code from `pycounts`.
---
A simple Jupyter notebook using code from `pycounts`.
```

Let's update that example with the collection of Markdown and code cells shown in {numref}`03-jupyter-example-2`.

```{figure} images/03-jupyter-example-2.png
---
width: 100%
name: 03-jupyter-example-2
alt: Jupyter notebook demonstrating an example workflow using the `pycounts` package.
---
Jupyter notebook demonstrating an example workflow using the `pycounts` package.
```

Our Jupyter notebook now contains an interactive tutorial demonstrating the basic usage of our package. What's important to note is that the outputs are generated using the actual code from our package itself; they have not been included manually. This approach ensures that if we change any code, those changes will be reflected in our examples the next time the notebook is executed. While we could share this notebook with our users so that they can execute it themselves, we'll show how to use `sphinx` to automatically execute notebooks and include their content (including the outputs of code cells) in a compiled collection of all our package's documentation that users can easily read and navigate through without having to start the Jupyter application!

(03:Generating-documentation)=
### Generating documentation

We've now written all the individual pieces of documentation needed to support our `pycounts` package. But this documentation is not very helpful in its current state because it's spread across the directory structure of our package, making it inefficient to search through.

This is where the documentation generator `sphinx` comes in. `sphinx` can be used to compile a collection of plain-text source files into user-friendly output formats such as HTML or PDF for sharing and/or hosting on the web. It also has a rich ecosystem of extensions that can be used to help automatically generate content - we'll be using some of these extensions in this section to help create an API reference sheet and to execute and render our Jupyter notebook example into our documentation.

To first give you an idea of what we're going to build, {numref}`03-documentation-1` shows the homepage of our package's documentation compiled by `sphinx` into HTML.

```{figure} images/03-documentation-1.png
---
width: 100%
name: 03-documentation-1
alt: The documentation homepage generated by `sphinx`.
---
The documentation homepage generated by `sphinx`.
```

The source and configuration files to build documentation like this using `sphinx` typically live in a *`docs`* folder in the root of your package. The `py-pkgs-cookiecutter` automatically created this for us:

```
pycounts
├── .gitignore
├── .readthedocs.yml
├── CHANGELOG.md
├── CONDUCT.md
├── CONTRIBUTING.md
├── docs
│   ├── changelog.md
│   ├── conduct.md
│   ├── conf.py
│   ├── contributing.md
│   ├── example.ipynb
│   ├── index.md
│   ├── make.bat
│   ├── Makefile
│   └── requirements.txt
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── ...
└── tests
    └── ...
```

The *`docs`* directory includes:

- *`Makefile`*/*`make.bat`*: files that contain the commands needed to build our documentation with `sphinx`; they do not need to be modified;
- *`requirements.txt`*: a list of documentation-specific dependencies required to host our docs on [Read the Docs](https://readthedocs.org/), which we'll discuss in **{numref}`03:Hosting-documentation-online`**;
- *`conf.py`*: a configuration script controlling how `sphinx` builds your documentation. You can read more about *`conf.py`* in the `sphinx` [documentation](https://www.sphinx-doc.org/en/master/usage/configuration.html) and we'll touch on it again shortly, but for now, it has been pre-populated by the `py-pkgs-cookiecutter` template and does not need to be modified;
- The remaining files in the *`docs`* directory form the content of our generated documentation, as we'll discuss in the remainder of this section.

The *`index.md`* file forms the landing page of our documentation (the one we saw earlier in {numref}`03-documentation-1`). If you open it in an editor of your choice, you'll see the following:

````
```{include} ../README.md
```

```{toctree}
:maxdepth: 1
:hidden:

example.ipynb
changelog.md
contributing.md
conduct.md
autoapi/index
```
````

The syntax we're using in this file is known as [Markedly Structured Text (MyST)](https://myst-parser.readthedocs.io/en/latest/syntax/syntax.html). MyST is based on Markdown but with additional syntax options compatible for use with `sphinx`. The `{include}` syntax specifies that we want this page to include the content of the *`README.md`* in our package's root directory.

The `{toctree}` syntax defines what documents will be listed in the table of contents (ToC) on the left-hand side of {numref}`03-documentation-1`. The argument `:maxdepth: 1` indicates how many heading levels the ToC should include, and `:hidden:` specifies that the ToC should only appear in the sidebar and not in the welcome page itself. The ToC then lists the documents we want to include in our rendered documentation. "example.ipynb" is the notebook we wrote in section **{numref}`03:Creating-usage-examples`**. "changelog.md", "contributing.md", and "conduct.md" pull in the documents we already wrote in our root directory using the `{include}` syntax from earlier. For example, *`changelog.md`* contains the following text:

````md
```{include} ../CHANGELOG.md
```
````
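
To make the structure of a `{toctree}` block concrete, here is a rough sketch of how its entries could be pulled out of a MyST file with plain Python. This is not how `sphinx` or `myst-nb` parse the file; the helper `toctree_entries` is hypothetical, and the sample text simply mirrors the *`index.md`* shown above.

```python
FENCE = "`" * 3  # three backticks, built indirectly so this example is easy to embed

# Sample text mirroring the index.md shown earlier
index_md = "\n".join([
    FENCE + "{include} ../README.md",
    FENCE,
    "",
    FENCE + "{toctree}",
    ":maxdepth: 1",
    ":hidden:",
    "",
    "example.ipynb",
    "changelog.md",
    "contributing.md",
    "conduct.md",
    "autoapi/index",
    FENCE,
])


def toctree_entries(myst_text):
    """Return the documents listed in the first {toctree} block (hypothetical helper)."""
    entries = []
    inside = False
    for line in myst_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE + "{toctree}"):
            inside = True  # the directive has opened
        elif inside and stripped.startswith(FENCE):
            break  # the closing fence ends the directive
        elif inside and stripped and not stripped.startswith(":"):
            entries.append(stripped)  # skip blank lines and ":option:" lines
    return entries


entries = toctree_entries(index_md)
print(entries)
```

Lines beginning with `:` are options of the directive, while every other non-empty line names a document to include in the ToC.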

The final document in the ToC, "autoapi/index", is an API reference sheet that will be generated automatically for us, using our package structure and docstrings, when we compile our documentation. Our documentation relies on a few extensions that need to be installed before we can compile it:

- [myst-nb](https://myst-nb.readthedocs.io/en/latest/): extension that will enable `sphinx` to parse our Markdown, MyST, and notebook files (`sphinx` only supports reStructuredText, *.rst* files, by default);
- [sphinx-autoapi](https://sphinx-autoapi.readthedocs.io/en/latest/): extension that will parse our source code to create an API reference sheet;
- [sphinx.ext.napoleon](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/): extension that enables `sphinx` to parse numpydoc-style docstrings;
- [sphinx.ext.viewcode](https://www.sphinx-doc.org/en/master/usage/extensions/viewcode.html): extension that adds a helpful link to the source code of each object in the API reference sheet; 
- [sphinx-rtd-theme](https://sphinx-rtd-theme.readthedocs.io/en/stable/): a custom theme for styling the way our documentation will look; and,
- [sphinx-copybutton](https://sphinx-copybutton.readthedocs.io/en/latest/): an extension that will add a helpful copy button to code snippets in our documentation.

None of these extensions is required to create documentation with `sphinx`, but they are all commonly used in Python packaging documentation and significantly improve the look and user experience of the generated documentation. To use these (or any) extensions, we need to add them to a list called `extensions` in the *`conf.py`* configuration file and configure them. Configuration options for each extension (if they exist) can be viewed in their respective documentation, but the `py-pkgs-cookiecutter` has already taken care of everything for us by defining the following variables within *`conf.py`*:

```python
extensions = [
    "myst_nb",
    "autoapi.extension",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "sphinx_copybutton",
]
autoapi_type = "python"
autoapi_dirs = ["../src"]
napoleon_numpy_docstring = True
html_theme = "sphinx_rtd_theme"
```

We also need to install the extensions not included with `sphinx` (the ones without the `sphinx.ext` prefix) as development dependencies of our package using `poetry`:

```{prompt} bash \$ auto
$ poetry add --dev myst-nb sphinx-autoapi sphinx-rtd-theme sphinx-copybutton
```
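
The "no `sphinx.ext` prefix" rule of thumb can be expressed in a couple of lines of Python. The sketch below copies the `extensions` list from *`conf.py`* and filters it; the function name `third_party_extensions` is an assumption made for this example.

```python
# The same list that appears in docs/conf.py
extensions = [
    "myst_nb",
    "autoapi.extension",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "sphinx_copybutton",
]


def third_party_extensions(extension_names):
    """Return the extensions that are not bundled with sphinx itself."""
    return [name for name in extension_names if not name.startswith("sphinx.ext.")]


needs_install = third_party_extensions(extensions)
print(needs_install)
```

Note that the names in `extensions` are importable module names, which can differ from the names used to install them (e.g., the module `autoapi.extension` is provided by `sphinx-autoapi`). The theme `sphinx-rtd-theme` does not appear in `extensions` at all because it is selected via `html_theme`, but it still needs to be installed.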

Now we can generate our documentation with `sphinx` using the following command from our root package directory:

```{prompt} bash \$ auto
$ make html -C docs
```

```console
Running Sphinx
making output directory... done
...
build succeeded.
The HTML pages are in _build/html.
```

```{tip}
The *`Makefile`*/*`make.bat`* files included in the *`docs`* directory are not necessary to build documentation, but they provide convenience by allowing us to build our documentation using the simple command `make html`. Without these files, you would have to type `sphinx-build -b html sourcedir builddir` to build documentation.
```
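
If `make` is not available on your system, the same build can be launched from Python. The following is a minimal sketch: the helper names and the default directories (*`docs`* and *`docs/_build/html`*, relative to the root package directory) are assumptions chosen to match the layout in this chapter.

```python
import subprocess
import sys


def sphinx_build_command(source_dir="docs", build_dir="docs/_build/html"):
    """Return the argument list for building HTML documentation with sphinx."""
    # "python -m sphinx" is another way of invoking the "sphinx-build" executable
    return [sys.executable, "-m", "sphinx", "-b", "html", source_dir, build_dir]


def build_docs(source_dir="docs", build_dir="docs/_build/html"):
    """Run the build; raises CalledProcessError if sphinx exits with an error."""
    subprocess.run(sphinx_build_command(source_dir, build_dir), check=True)


# Show the arguments without actually running a build
print(" ".join(sphinx_build_command()[1:]))
```

Calling `build_docs()` from the root package directory should then produce the same HTML files as `make html -C docs`, provided `sphinx` and the extensions above are installed in the active environment.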

If we now look inside our *`docs`* directory we see a new directory *`_build/html`* which contains our rendered HTML files. If you open *`_build/html/index.html`* you should see the page shown in {numref}`03-documentation-1`.

The `sphinx-autoapi` extension extracted the docstrings within each module and rendered them into our documentation. You can find the generated API reference sheet by clicking "API Reference" in the table of contents. For example, {numref}`03-documentation-2` shows the functions and docstrings in the `pycounts.plotting` module. The `sphinx.ext.napoleon` extension enabled `sphinx` to parse our [numpydoc style](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) docstrings, and the `sphinx.ext.viewcode` extension added the "\[source\]" link next to each function in our API reference sheet, which links readers directly to the source code of the function (if they want to view it).

```{figure} images/03-documentation-2.png
---
width: 100%
name: 03-documentation-2
alt: Documentation for the pycounts plotting module.
---
Documentation for the pycounts plotting module.
```

Finally, if we navigate to the "Example usage" page, {numref}`03-documentation-3` shows the Jupyter notebook we wrote in **{numref}`03:Creating-usage-examples`** rendered into our documentation, including the Markdown text, code input, and executed output. This was made possible using the `myst-nb` extension.

```{figure} images/03-documentation-3.png
---
width: 100%
name: 03-documentation-3
alt: Jupyter notebook example rendered into `pycounts`'s documentation.
---
Jupyter notebook example rendered into `pycounts`'s documentation.
```

Ultimately, you can easily and efficiently make beautiful and many-featured documentation with `sphinx` and its ecosystem of extensions. You can now use this documentation yourself or potentially share it with others, but it really shines when you host it on the web using a free service like [Read the Docs](https://readthedocs.org/), as we'll do in the next section. For those using version control, now is a good time to commit our work on our package's documentation:

```{prompt} bash \$ auto
$ git add README.md docs/example.ipynb
$ git commit -m "docs: updated readme and example"
$ git add src/pycounts/pycounts.py src/pycounts/plotting.py
$ git commit -m "docs: created docstrings for package functions"
$ git add pyproject.toml poetry.lock
$ git commit -m "build: added dev dependencies for docs"
$ git push
```

(03:Hosting-documentation-online)=
### Hosting documentation online

If you intend to share your package with others, it will be useful to make your documentation accessible online. It's common to host Python package documentation on the free online hosting service [Read the Docs](https://readthedocs.org/), which can automate the building, deployment, and hosting of your documentation directly from an online repository. 

The [Read the Docs](https://readthedocs.org) documentation will provide the most up-to-date steps required to host your documentation online. For our `pycounts` package, this involved the following steps:

1. Visit <https://readthedocs.org/> and click on "Sign up";
2. Select "Sign up with GitHub";
3. Click "Import a Project";
4. Click "Import Manually";
5. Fill in the project details by:
    1. Providing your package name (e.g., `pycounts`);
    2. Providing the GitHub repository URL (e.g., `https://github.com/TomasBeuzen/pycounts`); and,
    3. Specifying the default branch as `main`.
6. Click "Next" and then "Build version".

After following the steps above, your documentation should be successfully built by [Read the Docs](https://readthedocs.org/) and you should be able to access it via the "View Docs" button on the build page. For example, the documentation for `pycounts` is now available at <https://pycounts.readthedocs.io/en/latest/>. This documentation will be automatically re-built by Read the Docs each time you push changes to the specified default branch of your GitHub repository.

```{attention}
The *`.readthedocs.yml`* file that `py-pkgs-cookiecutter` created for us in the root directory of our Python package contains the configuration settings necessary for Read the Docs to properly build our documentation. It specifies what version of Python to use and tells Read the Docs that our documentation requires the extra packages specified in *`pycounts/docs/requirements.txt`* to be generated correctly.
```

(03:Tagging-a-package-release-with-version-control)=
## Tagging a package release with version control

We have now created all the source files that make up version 0.1.0 of our `pycounts` package, including Python code, documentation, and tests - well done! In the next section we'll turn all these source files into a distribution package that can be easily shared and installed by others. But for those using version control, it's helpful at this point to tag a release of your package's source. If you're not using version control, you can skip to **{numref}`03:Building-and-distributing-your-package`**.

Tagging a release means that we permanently "tag" a specific point in our repository's history, and then create a downloadable "release" of all the files in our repository in the state they were in when the tag was made. It's common to tag a release for each new version of your package, as we'll discuss more in **Chapter 7: {ref}`07:Releasing-and-versioning`**.

Tagging a release is a two-step process involving both Git and GitHub:

1. Create a tag marking a specific point in a repository's history using the command `git tag`; and,
2. On GitHub, create a release of all the files in your repository (usually in the form of a zipped archive like *.zip* or *.tar.gz*) based on your tag. Others can then download this release if they wish to view or use your package's source files as they existed at the time the tag was created.

We'll demonstrate this process by tagging a release of v0.1.0 of our `pycounts` package. First, we need to create a tag identifying the state of our repository at v0.1.0 and then push the tag to GitHub using the following `git` commands at the command line:


```{prompt} bash \$ auto
$ git tag v0.1.0
$ git push --tags
```
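
Because the tag name should match the version declared in *`pyproject.toml`*, it can help to derive one from the other rather than typing it by hand. Below is a small sketch of that idea; the helper `tag_for_version` is hypothetical and the *`pyproject.toml`* excerpt is a made-up fragment containing only the fields needed here.

```python
import re

# Made-up excerpt of a pyproject.toml; only the version line matters here
pyproject_text = '''
[tool.poetry]
name = "pycounts"
version = "0.1.0"
'''


def tag_for_version(pyproject):
    """Return the Git tag name (e.g., "v0.1.0") for the declared version."""
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return f"v{match.group(1)}"


tag = tag_for_version(pyproject_text)
print(tag)
```

The regular expression keeps the sketch dependency-free; a more robust approach would read the file with a proper TOML parser.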

Now if you go to the `pycounts` repository on GitHub and navigate to the "Releases" tab, you should see a tag like that shown in {numref}`03-tag`.

```{figure} images/03-tag.png
---
width: 100%
name: 03-tag
alt: Tag of v0.1.0 of `pycounts` on GitHub.
---
Tag of v0.1.0 of `pycounts` on GitHub.
```

To create a release from this tag, click "Draft a new release". You can then identify the tag from which to create the release and optionally add some additional details about the release as shown in {numref}`03-release-1`.

```{figure} images/03-release-1.png
---
width: 100%
name: 03-release-1
alt: Making a release of v0.1.0 of `pycounts` on GitHub.
---
Making a release of v0.1.0 of `pycounts` on GitHub.
```

After clicking "Publish release", GitHub will automatically create a release from your tag, including compressed archives of your code in *.zip* and *.tar.gz* format, as shown in {numref}`03-release-2`.

```{figure} images/03-release-2.png
---
width: 100%
name: 03-release-2
alt: The published release of v0.1.0 of `pycounts` on GitHub.
---
The published release of v0.1.0 of `pycounts` on GitHub.
```

We'll talk more about making new versions and releases of your package as you update it (e.g., modify code, add features, fix bugs, etc.) in **Chapter 7: {ref}`07:Releasing-and-versioning`**.

(03:Building-and-distributing-your-package)=
## Building and distributing your package

(03:Building-your-package)=
### Building your package

Right now, our package is a collection of files and folders that is difficult to share. If someone wanted to use our package (including ourselves wanting to use our package in a different project) they would need to have a copy of all these source files to be able to install it. The solution to this problem is to create a "distribution package". A distribution package is a single archive file containing all the files and information necessary to install a package. Distribution packages are often called "distributions" or "packages" for short; we'll use the former term in this book.

The main types of distributions in Python are source distributions (known as "sdists") and wheels. sdists are a compressed archive of all the source files, metadata, and instructions needed to construct a package that can be installed by Python. When installing a package from an sdist, all this information is used to build the package on the installer's computer before it is installed. In contrast, wheels are pre-built versions of a package for specific operating systems. They are the preferred distribution format because they only need to be copied to the location on your computer where Python searches for packages; no build step is required. We'll discuss sdists and wheels more in **{numref}`04:Package-distribution`**, but when sharing a package it's common to create both. We can easily create an sdist and wheel of a package with `poetry` using the command `poetry build`. Let's do that now for our `pycounts` package by running the command from our root package directory:

```{prompt} bash \$ auto
$ poetry build
```

```console
Building pycounts (0.1.0)
  - Building sdist
  - Built pycounts-0.1.0.tar.gz
  - Building wheel
  - Built pycounts-0.1.0-py3-none-any.whl
```

After running this command, you'll notice a new directory in your package called *`dist`*:

```
pycounts
├── .gitignore
├── .readthedocs.yml
├── CHANGELOG.md
├── CONDUCT.md
├── CONTRIBUTING.md
├── dist
│   ├── pycounts-0.1.0-py3-none-any.whl  <- built wheel distribution
│   └── pycounts-0.1.0.tar.gz            <- source distribution
├── docs
│   └── ...
├── LICENSE
├── pyproject.toml
├── README.md
├── src
│   └── ...
└── tests
    └── ...
```

Those two new files are the sdist and wheel for our `pycounts` package. A user could easily install our package now if they had one of these distributions by using `pip install`. For example, to install the wheel (the preferred distribution type), you could enter the following in a terminal:

```{prompt} bash \$ auto
$ cd dist
$ pip install pycounts-0.1.0-py3-none-any.whl
```

```console
Processing ./pycounts-0.1.0-py3-none-any.whl
...
Successfully installed pycounts-0.1.0
```
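
The name of a wheel file follows a fixed convention: the package name, version, an optional build tag, and then tags for the supported Python version, ABI, and platform, all separated by dashes. As a rough sketch (the helper `parse_wheel_filename` is hypothetical and does no validation beyond counting the parts), we can split our wheel's filename into these pieces:

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None  # the build tag is optional and usually absent
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


wheel_info = parse_wheel_filename("pycounts-0.1.0-py3-none-any.whl")
print(wheel_info)
```

For our wheel, the tags `py3`, `none`, and `any` indicate that it targets Python 3, has no ABI requirement, and can be installed on any platform, which is typical of a package containing only Python code.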

To install using the sdist, you would have to unpack the sdist archive before running `pip install`. In the command below we use the command line tool `tar` with the arguments `x` (extract the input file), `z` (gunzip the input file), and `f` (apply operations to the provided input file) to unpack the sdist, but the command on your specific operating system might be different.

```{prompt} bash \$ auto
$ tar xzf pycounts-0.1.0.tar.gz
$ pip install pycounts-0.1.0/
```

```console
Processing ./pycounts-0.1.0
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing wheel metadata ... done
...
Successfully built pycounts
Successfully installed pycounts-0.1.0
```

```{attention}
Note in the output above how installing from an sdist requires a build step prior to installation.
```
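
If `tar` is not available on your operating system, Python's built-in `tarfile` module offers a portable alternative. The function below is a minimal sketch (the name `unpack_sdist` is our own invention) that extracts an sdist and reports the top-level directory it created:

```python
import tarfile
from pathlib import Path


def unpack_sdist(sdist_path, dest_dir="."):
    """Extract a .tar.gz sdist into dest_dir and return the top-level paths created."""
    with tarfile.open(sdist_path, mode="r:gz") as archive:
        # Only unpack archives you trust: extraction writes files to disk
        archive.extractall(path=dest_dir)
        top_level = {member.name.split("/")[0] for member in archive.getmembers()}
    return sorted(Path(dest_dir) / name for name in top_level)


# Example usage from inside the dist directory (requires the sdist to exist):
# unpack_sdist("pycounts-0.1.0.tar.gz")
```

After unpacking, the extracted directory can be passed to `pip install` exactly as in the commands above.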

Creating a distribution for our package is most useful if we make it available from an online repository like [Python Package Index (PyPI)](https://pypi.org/), the official online software repository for Python. This would allow users to simply run `pip install pycounts` to install our package, without needing the sdist or wheel files locally, and we'll do this in the next section. But even if you don't intend to share your package, it can still be useful to build and install distributions for two reasons:
1. A distribution is a self-contained copy of your package's source files that's easy to move around and store on your computer. It makes it easy to retain distributions for different versions of your package, so that you can re-use or share them if you ever need to.
2. Recall that `poetry` installs a package in "editable mode", such that a link to the package's location is installed, rather than an independent distribution of the package itself. This is useful for *development purposes*, because it means that any changes to the source code will be immediately reflected when you next `import` the package, without the need to `poetry install` again. However, for *users* of your package (including yourself using your package in other projects), it is often better to install a "non-editable" version of the package (the default behavior when you `pip install` an sdist or wheel) because a non-editable installation will remain stable and immune to any changes made to the source files on your computer.
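
One quick way to tell which kind of installation you are working with is to ask Python where it would import the package from. The sketch below uses `importlib` for this; the helper name `package_location` is an assumption, and the standard library's `json` package stands in for `pycounts` so that the example runs anywhere.

```python
import importlib.util


def package_location(package_name):
    """Return the file a package would be imported from, or None if it isn't installed."""
    spec = importlib.util.find_spec(package_name)
    return None if spec is None else spec.origin


# "json" ships with Python, so it stands in for an installed package here;
# swap in "pycounts" once your package is installed.
print(package_location("json"))
```

For an editable installation, the reported path will typically point into your package's *`src`* directory, whereas a non-editable installation will typically point into the *`site-packages`* directory of your environment.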

(03:Publishing-to-TestPyPI)=
### Publishing to TestPyPI

At this point, we have distributions of `pycounts` that we want to share with the world by publishing to [PyPI](https://pypi.org/). However, it is good practice to do a "dry run" and check that everything works as expected by submitting to [TestPyPI](https://test.pypi.org/) first. `poetry` has a `publish` command that we can use to do this; however, its default behavior is to publish to PyPI. So we need to add TestPyPI to the list of repositories `poetry` knows about using the following command:

```{prompt} bash \$ auto
$ poetry config repositories.test-pypi https://test.pypi.org/legacy/
```

To publish to TestPyPI we can use `poetry publish` (you will be prompted for your TestPyPI username and password - sign up if you have not already done so):

```{prompt} bash \$ auto
$ poetry publish -r test-pypi
```

```console
Username: TomasBeuzen
Password: 
Publishing pycounts (0.1.0) to test-pypi
 - Uploading pycounts-0.1.0-py3-none-any.whl 100%
 - Uploading pycounts-0.1.0.tar.gz 100%
```

```{tip}
Rather than entering your username and password every time you want to publish a distribution to TestPyPI or PyPI, you can configure an API token as described in the PyPI [documentation](https://pypi.org/help/#apitoken).
```

Now we should be able to visit our package on TestPyPI. The URL for our `pycounts` package is: <https://test.pypi.org/project/pycounts/>. We can try installing our package using `pip` from the command line with the following command:

```{prompt} bash \$ auto
$ pip install --index-url https://test.pypi.org/simple/ pycounts
```

```{attention}
By default `pip install` will search PyPI for the named package. However, we want to search TestPyPI because that is where we uploaded our package. The argument `--index-url` points `pip` to the TestPyPI index.
```

Not all developers upload their packages to TestPyPI; some upload them directly to PyPI. If your package depends on packages that are not on TestPyPI you will have to tell `pip` to look for them on PyPI instead. To do that, you can use the argument `--extra-index-url` as below:

```{prompt} bash \$ auto
$ pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple pycounts
```
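
Besides visiting the project page, you can check that an upload succeeded by querying the JSON endpoint that PyPI and TestPyPI expose for each project. The sketch below is one way to do that with the standard library; the names `INDEX_HOSTS`, `project_json_url`, and `latest_version` are assumptions for this example, and `latest_version` needs network access, so it is defined but not called.

```python
import json
from urllib.request import urlopen

# Hypothetical lookup table for the two indexes discussed in this chapter
INDEX_HOSTS = {"pypi": "https://pypi.org", "test-pypi": "https://test.pypi.org"}


def project_json_url(project, index="test-pypi"):
    """Return the URL of the JSON metadata for a project on the chosen index."""
    return f"{INDEX_HOSTS[index]}/pypi/{project}/json"


def latest_version(project, index="test-pypi"):
    """Fetch the latest published version of a project (requires network access)."""
    with urlopen(project_json_url(project, index)) as response:
        return json.load(response)["info"]["version"]


print(project_json_url("pycounts"))
```

If the upload worked, calling `latest_version("pycounts")` should return the version you just published.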

(03:Publishing-to-PyPI)=
### Publishing to PyPI

If you were able to upload your package to TestPyPI and install it without error, you're ready to publish your package to PyPI. You can publish to PyPI using the `poetry publish` command without any arguments:

```{prompt} bash \$ auto
$ poetry publish
```

Your package will then be available on PyPI (e.g., <https://pypi.org/project/pycounts/>) and can be installed with `pip`:

```{prompt} bash \$ auto
$ pip install pycounts
```

## Summary and next steps

This chapter provided a practical overview of the key steps required to generate a fully-featured Python package. In the following chapters we'll explore each of these steps in more detail and continue to add features to our `pycounts` package. In particular, a key workflow we have yet to discuss is continuous integration and continuous deployment (CI/CD) - that is, setting up automated pipelines for running tests, building documentation, and versioning, building and deploying your package. We'll discuss CI/CD in **Chapter 8: {ref}`08:Continuous-integration-and-deployment`**.

Before moving onto the next chapter, let's summarize the steps we took to develop a Python package in this chapter:

1. Create package structure using a `cookiecutter` template (**{numref}`03:Creating-a-package-structure`**).
    ```{prompt} bash \$ auto
    $ cookiecutter https://github.com/py-pkgs/py-pkgs-cookiecutter.git
    ```
2. (Optional) Put your package under version control (**{numref}`03:Put-your-package-under-version-control`**).
3. (Optional) Create and activate a virtual environment using `conda` (**{numref}`03:Create-a-virtual-environment`**).
    ```{prompt} bash \$ auto
    $ conda create --name <your-env-name> python=3.9 -y
    $ conda activate <your-env-name>
    ```
4. Add Python code to module(s) in the *`src/`* directory (**{numref}`03:Package-your-code`**), adding dependencies as needed (**{numref}`03:Adding-code-with-dependencies-to-your-package`**).
    ```{prompt} bash \$ auto
    $ poetry add <packages>
    ```
5. Install and try out your package in a Python interpreter (**{numref}`03:Installing-your-package`**).
6. (Optional) Write tests for your package in module(s) prefixed with *`test_`* in the *`tests/`* directory. Add `pytest` as a development dependency to run your tests (**{numref}`03:Running-tests`**). Optionally add `pytest-cov` as a development dependency to calculate the coverage of your tests (**{numref}`03:Test-coverage`**).
    ```{prompt} bash \$ auto
    $ poetry add --dev pytest pytest-cov
    $ pytest tests/ --cov=<pkg-name>
    ```
7. (Optional) Create documentation source files for your package (**{numref}`03:Package-documentation`**). Optionally use `sphinx` to compile and generate an HTML render of your documentation, adding the required development dependencies (**{numref}`03:Generating-documentation`**).
    ```{prompt} bash \$ auto
    $ poetry add --dev myst-nb sphinx-autoapi sphinx-rtd-theme sphinx-copybutton
    $ make html -C docs
    ```
8. (Optional) Host documentation online with [Read the Docs](https://readthedocs.org/) (**{numref}`03:Hosting-documentation-online`**).
9. (Optional) Tag a release of your package using Git and GitHub, or equivalent version control tools (**{numref}`03:Tagging-a-package-release-with-version-control`**).
10. Build sdist and wheel distributions for your package (**{numref}`03:Building-your-package`**).
    ```{prompt} bash \$ auto
    $ poetry build
    ```
11. (Optional) Publish your distributions to [TestPyPI](https://test.pypi.org/) and try installing your package (**{numref}`03:Publishing-to-TestPyPI`**).
    ```{prompt} bash \$ auto
    $ poetry config repositories.test-pypi https://test.pypi.org/legacy/
    $ poetry publish -r test-pypi
    $ pip install --index-url https://test.pypi.org/simple/ <pkg-name>
    ```
12. (Optional) Publish your distributions to [PyPI](https://pypi.org/). Your package can now be installed by anyone using `pip` (**{numref}`03:Publishing-to-PyPI`**).
    ```{prompt} bash \$ auto
    $ poetry publish
    $ pip install <pkg-name>
    ```
    
The above workflow uses a particular suite of tools (e.g., `conda`, `poetry`, `sphinx`, etc.) to develop a Python package. While there are other tools that can be used to help build Python packages, the aim of this book is to give a high-level and practical introduction to Python packaging using modern, popular tools, and this has influenced our selection of tools in this chapter and book. However, the concepts and workflow discussed here remain relevant to the Python packaging ecosystem, regardless of the tools you use to develop your Python packages.