# Python Packaging

**References**:
+ https://www.pyopensci.org/python-package-guide/index.html

**Content**:
+ Python packaging
    + Introduction & Motivation
    + Python packaging with `hatch`
    + Configure `hatch`
    + Python package directory structure
    + Install package locally

# Python packaging

## Before we start... set up your project structure
+ open GitHub Desktop and create a new repository
+ open Git Bash in the "python-class-25" directory (in GitHub Desktop: `Repository > Open in GitBash`)
+ activate your conda environment: `$ conda activate python-course-2025`

## Introduction
### What is a Python package?

+ A Python package is basically a directory with a specific file structure.
+ Within the package directory structure, there are *modules* which are files that end in `.py`.
+ These modules allow you to group and structure your Python code.
+ Each module contains *functions* and *classes*.
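
To make the relationship between directory, module, and function concrete, here is a minimal sketch that builds a throwaway package in a temporary folder and imports from it. The names `demo_pkg`, `greetings`, and `hello` are made up for illustration and are not part of this course's package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package called "demo_pkg" (hypothetical name) in a temp dir.
tmp_root = Path(tempfile.mkdtemp())
pkg_dir = tmp_root / "demo_pkg"
pkg_dir.mkdir()

# An (empty) __init__.py marks the directory as a package.
(pkg_dir / "__init__.py").write_text("")

# A module is just a .py file that holds functions and classes.
(pkg_dir / "greetings.py").write_text(
    "def hello(name):\n"
    "    return f'Hello, {name}!'\n"
)

# Make the parent directory importable, then import the module from the package.
sys.path.insert(0, str(tmp_root))
importlib.invalidate_caches()
greetings = importlib.import_module("demo_pkg.greetings")

message = greetings.hello("world")
print(message)  # prints "Hello, world!"
```

Editing `sys.path` by hand is only done here to keep the sketch self-contained; installing the package, as described later, makes this step unnecessary.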

### Motivation for Python packaging

+ Use your code across different projects
+ Share your code
+ Organize your code

### Elements of a Python package
+ **Code**: Functions and classes that provide functionality for a user of your package
+ **Documentation**: Installation instructions, tutorials, and examples
+ **Tests**: Make sure your code works as it should 
+ **License**: Select a license for your package so that others know how they may reuse it
+ **Infrastructure**: Automates updates, publication workflows, and test-suite runs; includes platforms like GitHub and GitLab

### The importance of version control
+ Most Python packages live in an online version control platform such as GitHub or GitLab.
+ GitHub and GitLab both run git for version control.
+ Keeping your software under version control is important because it lets you track changes over time and go back in history to undo a change that unexpectedly breaks the code base.

### Python packages and environments
+ You can install a Python package into a Python environment in the same way you might install NumPy or Pandas.
+ Installing your package into an environment allows you to access it from any code run with that specific Python environment activated.
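
As a rough illustration of "installed into an environment", the following sketch asks the running Python which environment it belongs to and whether a given package is installed there. The helper name `installed_version` is an assumption made for this example.

```python
import sys
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# sys.prefix points at the root of the currently active environment.
print("Active environment:", sys.prefix)
print("numpy:", installed_version("numpy"))
print("mypackage:", installed_version("mypackage"))
```

A result of `None` simply means the package is not installed in the environment that is currently active; it may still be installed in another one.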

## Python packaging with `hatch`

+ install `hatch` via pip (if you run into problems check out: https://www.pyopensci.org/python-package-guide/tutorials/get-to-know-hatch.html)
+ check whether `hatch` can be found

In [None]:
# install hatch via pip
$ pip install hatch

# check hatch version (should work if installation was successful)
$ hatch --version
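
If `hatch --version` fails, it can help to check from Python whether the command is on your `PATH` at all. This is a small sketch using the standard library; the helper name `find_command` is an assumption.

```python
import shutil


def find_command(name):
    """Return the full path of a command-line tool, or None if it is not on PATH."""
    return shutil.which(name)


hatch_path = find_command("hatch")
if hatch_path is None:
    print("hatch was not found on PATH; check the installation and the active environment")
else:
    print("hatch found at:", hatch_path)
```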

## Configure Hatch

+ Once you have installed Hatch, you can customize its configuration. 
+ open the directory where your config file is located and open it for editing

In [None]:
# open a directory window in which the config file is located
$ hatch config explore

The config file should look similar to the following. Notice that the default license is MIT.

In [None]:
mode = "local"
project = ""
shell = ""

[dirs]
project = []
python = "isolated"
data = "C:\\Users\\bockting\\AppData\\Local\\hatch"
cache = "C:\\Users\\bockting\\AppData\\Local\\hatch\\Cache"

[dirs.env]

[projects]

[publish.index]
repo = "main"

[template]
name = "Florence Bockting"
email = "48919471+florence-bockting@users.noreply.github.com"

[template.licenses]
headers = true
default = [
    "MIT",
]

[template.plugins.default]
tests = true
ci = false
src-layout = true

[terminal.styles]
info = "bold"
success = "bold cyan"
error = "bold red"
warning = "bold yellow"
waiting = "bold magenta"
debug = "bold"
spinner = "simpleDotsScrolling"

Save and close the config file, then run the following command in GitBash to print the contents of your `config.toml` file in your shell:

In [None]:
$ hatch config show

## Python package directory structure
To make your Python code installable you need to create a specific directory structure with the following elements:
+ `pyproject.toml`
+ specific directory structure
+ some code
+ an `__init__.py` file in your code directory

Notes:
+ Use the name of your package for the directory name (`src/mypackage/`)
+ the root directory of the package is also named after your package (`mypackage/`). This is not required but is common practice.
+ the init file (`mypackage/__init__.py`) tells Python that the directory should be treated as a Python package. The init file is usually empty.

Using the directory structure above, you can run `import mypackage` in Python once the package is installed.

The `pyproject.toml` file is:
+ Where you define your project’s metadata (including its name, authors, license, etc)
+ Where you define dependencies (the packages that it depends on, e.g. numpy, matplotlib, pandas, etc.)
+ Used to specify and configure what build backend you want to use to build your package.
+ required fields for the package to be installable include:
    + The build-backend that you want to use
    + The project name and version

### Create your first Python package
+ open the directory where you want to save your package in the file explorer (it should not be inside the `python-course-25` directory, because we need a directory that is not already a git repository)
    + right-click and select further options > open GitBash here
+ create an initial package structure with hatch.
+ You should see that hatch automatically creates the corresponding package structure for you

In [None]:
$ hatch new mypackage

mypackage
+-- src
|   `-- mypackage
|       +-- __about__.py
|       `-- __init__.py
+-- tests
|   `-- __init__.py
+-- LICENSE.txt
+-- README.md
`-- pyproject.toml
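
To double-check that the structure was created as shown, you could compare the folder against the expected file list. The helper `missing_paths` is a hypothetical name; the file list is copied from the tree above.

```python
from pathlib import Path

# Paths that `hatch new mypackage` is expected to create, relative to the project root.
EXPECTED_PATHS = [
    "src/mypackage/__about__.py",
    "src/mypackage/__init__.py",
    "tests/__init__.py",
    "LICENSE.txt",
    "README.md",
    "pyproject.toml",
]


def missing_paths(project_root, expected=EXPECTED_PATHS):
    """Return the expected files that do not exist below project_root."""
    root = Path(project_root)
    return [rel for rel in expected if not (root / rel).is_file()]


# Run this from the directory that contains the new "mypackage" folder.
print(missing_paths("mypackage"))
```

An empty list means that every expected file is present.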


### Set up the whole project structure
+ go to GitHub Desktop and create a **new git repository**
    + File > New Repository
        + Name: "mypackage"
        + folder location (same directory in which you already have your "mypackage" folder)
        + Description: "My first Python package"
        + Initialize Repo with README: No
        + License: None
        + gitignore: None
    + confirm changes using **Create repository**
    + go to the tab **Repository** and click on **Open in GitBash** (a command window should open)
    + check whether all expected files are in this directory with `$ ls`
        + expected output: `LICENSE.txt  pyproject.toml  README.md  src/  tests/`
    + and finally, you can publish your first Python package on GitHub
+ create and activate a **new conda environment**
    + `$ conda create -n mypackage-env`
    + `$ conda activate mypackage-env` 
+ select your new environment in Anaconda
+ install Spyder & JupyterLab
+ create a **new Spyder project** for your package
    + (in Spyder) Projects > New project
    + select the mypackage folder

### Copy & paste your Python modules into your package
+ A Python module refers to a `.py` file containing the code that you want your package to access and run
+ copy & paste your two modules `simulations.py` and `plotting.py` from mypackage_temp into the folder `mypackage/src/mypackage`
+ delete mypackage_temp
+ updated package structure:

In [None]:
mypackage
+-- src
|   `-- mypackage
|       +-- __about__.py
|       +-- __init__.py
|       +-- simulations.py
|       `-- plotting.py
+-- tests
|   `-- __init__.py
+-- LICENSE.txt
+-- README.md
`-- pyproject.toml
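
If you do not have the two modules at hand, the sketch below is a hypothetical stand-in so that you can still follow the remaining steps. The function names `galton` and `print_galton_board` and the parameters `num_bins` and `num_balls` match the calls used later in this tutorial, but the function bodies and the extra `seed` parameter are assumptions, not the course's actual code. In the package, `galton` would go into `simulations.py` and `print_galton_board` into `plotting.py`.

```python
import random


def galton(num_bins, num_balls, seed=None):
    """Simulate a Galton board and return the number of balls per bin.

    Hypothetical stand-in: each ball passes num_bins - 1 rows of pegs and
    moves left or right with equal probability at every row.
    """
    rng = random.Random(seed)
    counts = [0] * num_bins
    for _ in range(num_balls):
        # The final bin index equals the number of "right" moves.
        bin_index = sum(rng.random() < 0.5 for _ in range(num_bins - 1))
        counts[bin_index] += 1
    return counts


def print_galton_board(counts):
    """Print a simple text histogram with one row per bin."""
    for bin_index, count in enumerate(counts):
        print(f"{bin_index:2d} | {'o' * count}")


data = galton(num_bins=5, num_balls=20, seed=1)
print_galton_board(data)
```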

### Modify metadata in your pyproject.toml

+ open Git Bash in your `mypackage` folder 

In [None]:
# you should be in the following directory: mypackage
# open the pyproject.toml file
start pyproject.toml # MacOS users: open pyproject.toml

A file similar to the following should open.
+ Hatch by default provides a list of classifiers that define what Python versions your package supports.
+ These classifiers do not in any way impact your package’s build and are primarily intended to be used when you publish your package to PyPI.

In [None]:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "mypackage"
dynamic = ["version"]
description = ''
readme = "README.md"
requires-python = ">=3.8"
license = "MIT"
keywords = []
authors = [
  { name = "Florence Bockting", email = "48919471+florence-bockting@users.noreply.github.com" },
]
classifiers = [
  "Development Status :: 4 - Beta",
  "Programming Language :: Python",
  "Programming Language :: Python :: 3.8",
  "Programming Language :: Python :: 3.9",
  "Programming Language :: Python :: 3.10",
  "Programming Language :: Python :: 3.11",
  "Programming Language :: Python :: 3.12",
  "Programming Language :: Python :: Implementation :: CPython",
  "Programming Language :: Python :: Implementation :: PyPy",
]
dependencies = []

[project.urls]
Documentation = "https://github.com/Florence Bockting/mypackage#readme"
Issues = "https://github.com/Florence Bockting/mypackage/issues"
Source = "https://github.com/Florence Bockting/mypackage"

[tool.hatch.version]
path = "src/mypackage/__about__.py"

[tool.hatch.envs.types]
extra-dependencies = [
  "mypy>=1.0.0",
]
[tool.hatch.envs.types.scripts]
check = "mypy --install-types --non-interactive {args:src/mypackage tests}"

[tool.coverage.run]
source_pkgs = ["mypackage", "tests"]
branch = true
parallel = true
omit = [
  "src/mypackage/__about__.py",
]

[tool.coverage.paths]
mypackage = ["src/mypackage", "*/mypackage/src/mypackage"]
tests = ["tests", "*/mypackage/tests"]

[tool.coverage.report]
exclude_lines = [
  "no cov",
  "if __name__ == .__main__.:",
  "if TYPE_CHECKING:",
]

+ Delete `dynamic = ["version"]`: This entry tells the build backend to determine the version dynamically; in the template above, Hatch reads it from `src/mypackage/__about__.py` (see the `[tool.hatch.version]` table).
+ Add `version = "0.1"` in the place of `dynamic = ["version"]`, which you just deleted.
+ Fill in the description if it is still empty.
+ Remove the `[tool.hatch.version]` table further down in the file.

In [None]:
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "mypackage"
version = "0.1"  # <-- changed
description = 'This is my first package description.'  # <-- changed
readme = "README.md"
...


## Intermediate Check-list
+ git repository for your package
+ Spyder project for your package
+ python directory with a `pyproject.toml` at the root
+ package directory containing an empty `__init__.py`
+ two Python modules (`simulations.py`, `plotting.py`)

## Now, you can install your package
+ make sure that you are in the root project directory (`mypackage/`)
+ install package in **editable mode**

Note:
+ Installing your package in editable mode (`-e`) allows you to work on your code and then test the updates interactively in your favorite Python interface.
+ One important caveat of editable mode is that every time you update your code, you may need to restart Python.
+ Below, you use `python -m pip` rather than plain `pip`; this ensures that you call the version of pip installed in your currently active environment.

In [None]:
# check out current directory
$ pwd # you should get: /your-path/mypackage

# install your python package in the editable mode (-e)
$ python -m pip install -e .
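
Regarding the restart caveat of editable mode: in some cases reloading a module with `importlib.reload` can replace a full restart. The sketch below demonstrates the effect on a throwaway module (`demo_mod` is a hypothetical name); it is an illustration of the mechanism, not a guarantee that reloading is sufficient in every situation.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module "demo_mod" (hypothetical name) that we can edit.
tmp_dir = Path(tempfile.mkdtemp())
module_file = tmp_dir / "demo_mod.py"
module_file.write_text("def answer():\n    return 1\n")

# Skip bytecode caching so the quick edit below is always picked up.
sys.dont_write_bytecode = True
sys.path.insert(0, str(tmp_dir))
importlib.invalidate_caches()

import demo_mod

before = demo_mod.answer()

# Simulate editing the source file while Python is still running.
module_file.write_text("def answer():\n    return 42\n")

# Without a reload, the running session keeps the old definition.
stale = demo_mod.answer()

# importlib.reload re-executes the module's source in place.
importlib.reload(demo_mod)
after = demo_mod.answer()

print(before, stale, after)  # prints "1 1 42"
```

Note that names imported with `from module import name` still point to the old objects after a reload, so restarting Python remains the most reliable option.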

![congrats](https://cdn-icons-png.flaticon.com/256/3656/3656949.png)

Check whether your package is in the list of current package installations:

In [None]:
$ pip list
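
You can also ask Python where it would import a package from. For an editable install, the reported file is expected to lie inside your own `src/mypackage` folder rather than in the environment's `site-packages`. The helper name `package_location` is an assumption for this sketch.

```python
from importlib import util


def package_location(package_name):
    """Return the file a top-level package is imported from, or None if it is not importable."""
    spec = util.find_spec(package_name)
    if spec is None:
        return None
    return spec.origin


# For an editable install, this is expected to point into your src/mypackage folder.
print(package_location("mypackage"))
```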

## Import your library
+ create a new Jupyter notebook in your `mypackage` folder called `tutorial`
+ give it a **title** (e.g., *Welcome to the tutorial about mypackage!*)
+ import the functions from `mypackage`:

In [None]:
# load required functions and classes
from mypackage.plotting import print_galton_board
from mypackage.simulations import galton

+ describe what your code is doing next and then call the imported function `galton` 
    + Simulate data. In the following we simulate a galton board with 100 balls and 15 bins.

In [None]:
data = galton(num_bins=15, num_balls=100)
print(data)

+ describe the next step and call the imported function `print_galton_board` 
    + Visualize simulated data. We can already see the expected shape of a Gaussian...

In [None]:
print_galton_board(data)


😍 

## Including dependencies in your project

+ suppose we import functions from `numpy` that we use in the modules of our package
+ then we have to add these *dependencies* to our `pyproject.toml`
+ one way of doing this is the following:
    + first, create a `requirements.txt` file which lists all dependencies for you
        + open GitBash in the main directory of your package `mypackage` (where the `pyproject.toml` file is located)
        + install the `pipreqs` package: `$ pip install pipreqs`
        + run `$ pipreqs .`
        + a `requirements.txt` file should have been created in the current directory
    + second, open the `requirements.txt` file and add the listed packages, including their versions, to the `dependencies` field in your `pyproject.toml`
    + for example, for numpy the entry has the form `dependencies = ["numpy==<version>"]`, with the version taken from your `requirements.txt`

+ install your project again as already described
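
Copying the entries by hand works fine for a few packages. As a sketch of the same step in code, the following helper turns the lines of a `requirements.txt` into a `dependencies` entry that you can paste into `pyproject.toml`. The function name `requirements_to_dependencies` is made up, and the package versions in the example are placeholders, not recommendations.

```python
def requirements_to_dependencies(requirement_lines):
    """Turn requirements.txt lines into a pyproject.toml `dependencies` entry."""
    entries = []
    for line in requirement_lines:
        # Drop inline comments and surrounding whitespace; skip empty lines.
        requirement = line.split("#", 1)[0].strip()
        if requirement:
            entries.append(f'  "{requirement}",')
    return "\n".join(["dependencies = [", *entries, "]"])


# Hypothetical file contents; the version numbers are placeholders only.
example_lines = ["numpy==1.26.4", "", "matplotlib==3.8.0  # plotting"]
print(requirements_to_dependencies(example_lines))
```

Exact pins with `==` are strict; for a package that others install, a lower bound such as `>=` is often the more flexible choice.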