# Table of contents
- [Table of contents](#Table-of-contents)
- [References](#References)
- [Why?](#Why?)
- [Script](#Script)
- [Module](#Module)
  - [Exercise on module creation](#Exercise-on-module-creation)
- [Package](#Package)
  - [Exercise on package creation](#Exercise-on-package-creation)
  - [Exercise on package refactoring](#Exercise-on-package-refactoring)
  - [Different ways of importing things](#Different-ways-of-importing-things)
  - [Application Programming Interface (API)](#Application-Programming-Interface-(API))
  - [Exercise on defining the public API](#Exercise-on-defining-the-public-API)
  - [Import `as`](#Import-as)
  - [Stability of the API](#Stability-of-the-API)
  - [Imports with '.'](#Imports-with-'.')
  - [Imports with '..'](#Imports-with-'..')
- [Install package](#Install-package)
  - [Exercise: import package from different place](#Exercise:-import-package-from-different-place)
  - [Make the package installable.](#Make-the-package-installable.)
  - [Exercise on installing the package](#Exercise-on-installing-the-package)
- [Outreach: share your package with the world](#Outreach:-share-your-package-with-the-world)
  - [Git & GitHub](#Git-&-GitHub)
  - [PyPI](#PyPI)


# References

* [Introduction to modules](https://docs.python.org/3/tutorial/modules.html) from the official Python tutorial.


# Why?

"Why?" is perhaps the most important question to ask before doing anything.
The reason why we use modules and packages (in contrast to just writing all our code in one file) is that it makes our code more organized and easier to maintain.
This approach is known as ["modular programming"](https://en.wikipedia.org/wiki/Modular_programming) and is a very important concept in computer science.

In this part of the tutorial we will learn how to create modules and packages and how to use them in our code.
We will also learn how to distribute our code as a package that can be installed by others.
We will start with a single file script and will gradually convert it into a package.

Let's start with a simple example.

# Script
Up to now we have mostly seen scripts, which are sequences of statements in a single file or Jupyter cell.
Scripts are great for quick and dirty work, but they are not well suited to long-term projects, because:

* They are not well organized as they contain all the code in one place.
* They are not easy to maintain.
* They are not easy to reuse in other projects (unless you copy-paste the code).
* They are not easy to distribute (unless you consider sending the file to your colleagues as an easy way of distributing).

Let's say we want to write a script that deals with points in 2D space.

Here is a simple script that defines a class `Point` and a function `distance`:

In [None]:
class Point:
    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y

def distance(p1, p2) -> float:
    return ((p1.x - p2.x) ** 2 + (p1.y - p2.y) ** 2) ** 0.5

p1 = Point(0, 0)
p2 = Point(3, 4)
print(f"The distance between `p1` and `p2` is {distance(p1, p2)}")

# Module

The **module** is a file containing Python definitions and statements.
The file name is the module name with the suffix `.py` appended.
So our first task would be to create the module `point.py` and put the code from the previous section in it.

## Exercise on module creation

Move the `Point` class and the `distance` function to a new file called `point.py`:

1. Create a new file called `point.py` in the same directory as this notebook.
2. Copy the `Point` class and the `distance` function to the new file.
3. Save the file.

Now, since the module is in the same directory as this notebook, we can import things from it:

In [None]:
from point import Point, distance

p1 = Point(0, 0)
p2 = Point(3, 4)
print(f"The distance between `p1` and `p2` is {distance(p1, p2)}")

Very nice!
We managed to create our first module and import it in our script.

# Package

A **package** is a collection of modules that are often related to each other.

For example, the `numpy` package contains the modules `numpy.linalg`, `numpy.random`, `numpy.fft`, etc.

A package is a directory that contains a file called `__init__.py`.
This file can be empty; its presence indicates that the directory containing it is a Python package, so the directory can be imported the same way as a module.
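
To make the role of `__init__.py` concrete, here is a minimal sketch that builds a throwaway package in a temporary directory and imports it. The names `shapes_demo`, `circle`, and `area` are hypothetical and exist only for this illustration; adding the temporary directory to `sys.path` is just a shortcut for this session.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "shapes_demo"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")  # an empty file is enough to mark the package
(pkg_dir / "circle.py").write_text(
    "import math\n"
    "\n"
    "def area(radius):\n"
    "    return math.pi * radius ** 2\n"
)

# Make the parent directory importable for this session only.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import shapes_demo

# Importing the package alone does not load its submodules.
loaded_before = hasattr(shapes_demo, "circle")
print(loaded_before)

from shapes_demo.circle import area

print(area(1.0))
# After the submodule has been imported, it is reachable as an attribute of the package.
print(hasattr(shapes_demo, "circle"))
```

Note that, with an empty `__init__.py`, a submodule only becomes available as `package.submodule` once something has imported it.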

## Exercise on package creation

Let's create a package named `mypackage` that contains our `point` module:

1. Create a new directory called `mypackage` in the same directory as this notebook.
2. Create a new empty file called `__init__.py` in the `mypackage` directory.
3. Move the `point.py` file to the `mypackage` directory.

Once this is done, let's see if we can import the `Point` class and the `distance` function from the `mypackage` package:

In [None]:
from mypackage.point import Point, distance

p1 = Point(0, 0)
p2 = Point(3, 4)
print(f"The distance between `p1` and `p2` is {distance(p1, p2)}")

Congratulations, we managed to create a package and import it here.

Suppose our package grows and we need to do some refactoring.
We can move the `distance` function to a new module called `utils.py`.


## Exercise on package refactoring

Let's move the `distance` function to a new module called `utils.py`:

1.  Create a new file called `utils.py` in the `mypackage` directory.
2.  Copy the `distance` function to the new file and save it.
3.  Remove the `distance` function from the `point.py` file and save it.

If everything went well, we should be able to execute the following code:

In [None]:
from mypackage.point import Point
from mypackage.utils import distance

p1 = Point(0, 0)
p2 = Point(3, 4)
print(f"The distance between `p1` and `p2` is {distance(p1, p2)}")

## Different ways of importing things

There are several ways to import things from a module or package.
Let's see some examples:

In [None]:
# Same as above
from mypackage.point import Point
from mypackage.utils import distance

p1 = Point(0, 0)
p2 = Point(3, 4)
print(f"The distance between `p1` and `p2` is {distance(p1, p2)}")

In [None]:
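# Import the submodules by their full dotted names; a bare `import mypackage`
# does not load `point` and `utils` while `__init__.py` is still empty
import mypackage.point
import mypackage.utils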

p1 = mypackage.point.Point(0, 0)
p2 = mypackage.point.Point(3, 4)

print(f"The distance between `p1` and `p2` is {mypackage.utils.distance(p1, p2)}")

In [None]:
# Import `point` and `utils` modules from `mypackage`
from mypackage import point, utils

p1 = point.Point(0, 0)
p2 = point.Point(3, 4)

print(f"The distance between `p1` and `p2` is {utils.distance(p1, p2)}")

We can do slightly better, but first we need to introduce the concept of an Application Programming Interface (API).

## Application Programming Interface (API)

An **Application Programming Interface** (API) is the set of publicly available classes, functions, and variables that a software program can use to interact with other software components.

Quite often, when developing a package, we import things from its modules into the `__init__.py` file.
This way, when we import the package, we can access those things directly as `mypackage.Point` and `mypackage.distance`.
Everything that is imported into the topmost `__init__.py` file is considered to be part of the package's public API.

For example, in `mypackage` we can import the `Point` class and the `distance` function into the `__init__.py` file.
Let's do it as the next exercise.

## Exercise on defining the public API

Please import the `Point` class and the `distance` function into the `__init__.py` file.

Hint: often when importing things from local modules we use the following syntax:

```python
from .module import thing
```

In our case, the `.` stands for the current package, so the `.` in the above code refers to the `mypackage` directory.
We will see more examples of this syntax later.

If everything went well, we should be able to execute the following code:

In [None]:
from mypackage import Point, distance

p1 = Point(0, 0)
p2 = Point(3, 4)
print(f"The distance between `p1` and `p2` is {distance(p1, p2)}")
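
The same re-export pattern can be sketched on a throwaway package, independent of `mypackage`. The names `toolbox_demo`, `helpers`, and `double` are hypothetical; the point is only that a relative import inside `__init__.py` makes a name available directly on the package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "toolbox_demo"
pkg_dir.mkdir()
(pkg_dir / "helpers.py").write_text(
    "def double(value):\n"
    "    return 2 * value\n"
)
# The package's __init__.py re-exports `double` with a relative import.
(pkg_dir / "__init__.py").write_text("from .helpers import double\n")

# Make the parent directory importable for this session only.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import toolbox_demo

# `double` is reachable on the package itself; users need not know about `helpers`.
print(toolbox_demo.double(21))
```

Because users only touch `toolbox_demo.double`, the `helpers` module could later be renamed or split without affecting their code.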

## Import `as`

It is possible to import a package and use it as a namespace throughout the code:

In [None]:
import mypackage

p1 = mypackage.Point(0, 0)
p2 = mypackage.Point(3, 4)
print(f"The distance between `p1` and `p2` is {mypackage.distance(p1, p2)}")

But if the package name is rather long, it is possible to import it under a shorter name.
For example, we can import the `mypackage` package as `mp`:

In [None]:
import mypackage as mp

p1 = mp.Point(0, 0)
p2 = mp.Point(3, 4)
print(f"The distance between `p1` and `p2` is {mp.distance(p1, p2)}")

The last two examples are probably the nicest way to use packages.
If the package name is short, we can import it under its original name.
If the package name is long, we can import it under a shorter name.

For some common packages (like `numpy` and `pandas`) it has become a convention to import them under their aliases.
For example, we often use `np` for `numpy` and `pd` for `pandas`:

```python
import numpy as np
import pandas as pd
```

## Stability of the API

When developing a package, it is important to keep in mind that the public API is a part of the package's contract with the users.
This means, as developers, we should strive to keep the API stable and not change it too often.
It is OK to add new features to the package, but it is not OK to suddenly change the API of the existing features.

<div class="alert alert-info">
<b>Info:</b> The API of a package is considered to be stable when the changes to it are rare and well documented. If the API changes too often, users might walk away from the package.
</div>
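
One common way to change a public name without breaking users abruptly is to keep the old name for a while and have it emit a deprecation warning. The sketch below is only an illustration of that idea, not something this package does: the old name `dist` is hypothetical, and points are plain `(x, y)` tuples to keep the example self-contained.

```python
import warnings


def distance(p1, p2) -> float:
    """Current public name; points are (x, y) tuples in this sketch."""
    return ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5


def dist(p1, p2) -> float:
    """Hypothetical old name, kept temporarily so existing user code keeps working."""
    warnings.warn(
        "dist() is deprecated; use distance() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return distance(p1, p2)


# Old code still runs, but the user is told what to switch to.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = dist((0, 0), (3, 4))

print(result)
print(caught[0].category.__name__, "-", caught[0].message)
```

The old name can then be removed in a later, clearly announced release.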

Suppose we would like to introduce a new class called `Line` to our package.
Here is the code:

In [None]:
class Line:
    def __init__(self, p1, p2) -> None:
        self.p1 = p1
        self.p2 = p2

    def length(self) -> float:
        return ((self.p1.x - self.p2.x) ** 2 + (self.p1.y - self.p2.y) ** 2) ** 0.5

We would like to place it along with the `Point` class, but the name of the module is `point.py` which doesn't really make sense anymore.
We should rename the module to `geometry.py`.

## Exercise on adding new class to the package

1.  Rename the `point.py` file to `geometry.py`.
2.  Copy the `Line` class to the `geometry.py` file.
3.  Update the import statements in the `__init__.py` file to import the `Point` and `Line` classes from the `geometry` module.

If everything went well, we should be able to execute the following code:

In [None]:
import mypackage as mp

p1 = mp.Point(0, 0)
p2 = mp.Point(3, 4)

line = mp.Line(p1, p2)

print(f"The length of the line is {line.length()}")

We did a good job of maintaining the public API of our package 👍.
We added a new class to the package and changed the module name, but nothing changed for the users.
The good old `Point` class is still available under the `mypackage.Point` name.

## Imports with '.'

To import things from the neighboring modules, we can use the `.` symbol.
To test that, let's have a look at our `Line` class again.
You might notice that the body of the `length` method is essentially the same as the `distance` function we had in the `utils.py` module.
Let's import it from there.

## Exercise on importing from neighboring modules

1. Import the `distance` function from the `utils` module in the `geometry` module.
```python
    from .utils import distance
```
2. Update the `length` method of the `Line` class to use the `distance` function.
3. Save both files and run the following code:

In [None]:
import mypackage as mp

p1 = mp.Point(0, 0)
p2 = mp.Point(3, 4)

line = mp.Line(p1, p2)

print(f"The length of the line is {line.length()}")

## Imports with '..'

Sometimes we need to import things from modules that are located in the parent package.
For that we can use the `..` symbol.
The `..` symbol means the parent of the current package, so, for a module inside a subpackage of `mypackage`, `..` in the following code refers to the `mypackage` directory.

```python
from ..utils import distance
```
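
To see how Python turns relative names into absolute ones, the standard library function `importlib.util.resolve_name` can be used. The sketch below assumes a module that lives either directly in `mypackage` or in a subpackage named `mypackage.geometry` (the layout built in the next exercise); nothing needs to exist on disk for the name resolution itself.

```python
from importlib.util import resolve_name

# Relative names are resolved against the package that contains the importing module.

# Inside a module of `mypackage`, one dot means "this package".
same_level = resolve_name(".utils", "mypackage")

# Inside a module of the subpackage `mypackage.geometry`, one dot still means
# "this package", while two dots mean "the parent package".
inside_subpackage = resolve_name(".geometry", "mypackage.geometry")
from_parent = resolve_name("..utils", "mypackage.geometry")

print(same_level)
print(inside_subpackage)
print(from_parent)
```

Each additional leading dot moves one package level up.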

Let's reorganise our package a bit to see how it works.


## Exercise on importing from parent modules

1. Create a new directory called `geometry` in the `mypackage` directory.
2. Move the `geometry.py` file to the `geometry` directory.
3. Create an empty file called `__init__.py` in the `geometry` directory.
4. Add the following import statement to the `__init__.py` file:
```python
from .geometry import Point, Line
```
5. Update the import statement in the `geometry.py` file to import the `distance` function from the `utils` module to take into account the relative location of the `utils` module:
```python
from ..utils import distance
```
6. Save everything and run the following code:

In [None]:
import mypackage as mp

p1 = mp.Point(0, 0)
p2 = mp.Point(3, 4)

line = mp.Line(p1, p2)

print(f"The length of the line is {line.length()}")

Hopefully, everything went well and we were able to execute the code above.

We learned how to import things from the neighboring modules and from the parent modules.
We also learned that it is possible to create subpackages and import things from them.

# Install package

Now that we have a package, we should be able to install it with a little bit of work.

Why do we want to install it?
Well, if we want to use our package in another project, we need to install it, because currently it can only be imported from the directory where it is located.
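
The reason is that Python only looks for modules in the locations listed in `sys.path`, which typically starts with the directory of the running script or notebook. A minimal sketch for inspecting this is shown below; the helper `is_importable` is a hypothetical name introduced here, and it is meant for top-level names only.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module or package can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# The first few locations Python searches, in order.
for entry in sys.path[:3]:
    print(repr(entry))

print(is_importable("json"))       # part of the standard library
print(is_importable("mypackage"))  # depends on where this code is run from
```

Running it from the notebook's directory and from somewhere else should give different answers for `mypackage` as long as the package is not installed.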

See that for yourself.

## Exercise: import package from different place

1. Create a new directory called `tmp_dir` (or similar).
2. Open it in the File Browser.
3. Create a new notebook by right-clicking and selecting `New Notebook` from the context menu.
4. In the new notebook, try to import the `mypackage` package.

This should fail, because the `mypackage` package is not installed in your system:

```python

import mypackage


---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[1], line 1
----> 1 import mypackage

ModuleNotFoundError: No module named 'mypackage'
```

## Make the package installable.

To make the package installable, we need to do a few more things.
Typically, this is done by creating a `pyproject.toml` file.

Here is a minimal example of a `pyproject.toml` file:

```toml
[build-system]
requires = ["flit_core >=3.2,<4"]
build-backend = "flit_core.buildapi"

[project]
name = "mypackage"
version = "0.1.0"
description = "My first package"
```

The package directory should look like this:

```bash
$ tree mypackage
mypackage/
├── mypackage
│   ├── __init__.py
│   ├── geometry
│   │   ├── __init__.py
│   │   └── geometry.py
│   └── utils.py
└── pyproject.toml
```

Notice that we moved our original `mypackage` directory to `mypackage/mypackage`.
This is common practice in Python packages.

## Exercise on installing the package

1. Inside the `mypackage` directory, create another directory called `mypackage`.
2. Move the `geometry` directory together with `__init__.py` and `utils.py` files to the `mypackage` sub-directory.
3. Create a new file called `pyproject.toml` in the top-level `mypackage` directory.
4. Open the terminal and navigate to the `mypackage` directory.
```bash
    $ cd mypackage
```
5. Run the following command to install the package:
```bash
    $ pip install .
```
6. Return to the new notebook we created in the previous exercise and try to import the `mypackage` package again.


Congratulations, now your package is installed and you can import it from anywhere!
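
One way to double-check the installation from Python is to ask `importlib.metadata` for the installed version. This is a small sketch; the helper name `installed_version` is hypothetical, and it returns `None` rather than raising when the distribution is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Should report the version from `pyproject.toml` once `pip install .` has succeeded,
# and None otherwise.
print(installed_version("mypackage"))
```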

# Outreach: share your package with the world

Now that we have a package, we can share it with people around the world.
There are many ways to do that, but we will focus on two of them:

- GitHub
- PyPI

## Git & GitHub

Git is a version control system.
It allows you to track changes in your code and collaborate with other people.
GitHub is a website that hosts Git repositories.

If you want to share your package with the world, GitHub is a great place to do that.
It is free and it is easy to use.

## PyPI

PyPI is a website that hosts Python packages.
It is a great place to share your package with the world.
