# Course material 8
## Lesson 9 (07.12.2023)

> Disclaimer: Material is taken from 
> 
> the Research Software Engineering with Python Workshop
> 
> from [The Alan Turing Institute’s Research Engineering Group](https://www.turing.ac.uk/research/research-engineering)
> 
> and [UCL Research IT Services](http://www.ucl.ac.uk/research-it-services/homepage).

# Open Source Software Licenses (some resources)
+ A short introduction (YouTube video):
    + The Linux Experiment (Username). "Free and Open Source software licenses explained." YouTube, uploaded by The Linux Experiment, 25.05.2022, https://www.youtube.com/watch?v=UMIG4KnM8xw.
+ GitHub's guide to choosing a license: https://choosealicense.com/
+ Open Source License Comparison Grid as provided by the Carnegie Mellon University: https://www.cmu.edu/cttec/forms/opensourcelicensegridv1.pdf

# Software Packaging
+ open your terminal in the directory of your example-project folder
+ pull the latest changes from your GitHub repo so that your local directory is up-to-date
    + `$ git pull` 

## Creating a virtual environment using conda
+ First, we create an empty virtual environment using conda
    + `$ conda create --name example-project`
+ activate your environment
    + `$ conda activate example-project`  (the environment's name is now displayed in brackets in the terminal)
+ install pip in conda such that we can still use pip commands in conda
    + `$ conda install pip`
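
To double-check from within Python which environment is active, a minimal sketch like the following may help. It assumes that conda sets the `CONDA_DEFAULT_ENV` variable on activation; the function name `active_environment` is made up for this illustration.

```python
import os
import sys


def active_environment():
    """Return (name, prefix) of the environment running this interpreter."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated;
    # outside of conda the variable is absent and we get None.
    name = os.environ.get("CONDA_DEFAULT_ENV")
    return name, sys.prefix


name, prefix = active_environment()
print(f"conda environment: {name}")
print(f"interpreter prefix: {prefix}")
```

If the name printed is not `example-project`, the environment has probably not been activated in this terminal.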
      
## Prepare example-project for packaging
+ in the Anaconda Navigator, change your environment to `example-project`
+ install and open Spyder and then open `politeness_data.py`
+ open a new Python file in Spyder 
    + from `politeness_data.py` cut the class `DownloadData` and paste it into the new file
    + cut & paste also the following imports
        + `import requests`, `import os`, `import pathlib`
    + save the new python file as `download_data.py` in your example-project folder
+ open another new Python file 
    + import the library "pathlib": `import pathlib`
    + cut all code snippets that are still below the class PolitenessData in the `politeness_data.py` and paste them into the new file
    + save the new python file as `usage_example.py`
+ modify in the `politeness_data.py`:
    + in the `sns.swarmplot` call, change the value of the `data` argument from `df` to `pd.DataFrame(df, columns = df.columns)` (seaborn requires a pandas data frame here, not a polars data frame)
    + install the module `pyarrow`, which is needed for the polars function `pl.from_pandas`
        + `$ pip install pyarrow` 
+ save and close all files in Spyder

+ go to your terminal and list all files in your folder including hidden files (starting with .)
    + `$ ls -a`
+ You should see the following:

## Laying out a project
+ When planning to package a project for distribution, defining a suitable project layout is essential.
+ A classical layout looks as follows (here exemplified by our example-project):

+ `example_project` (source code directory)
    + should contain an `__init__.py` file, which makes Python treat the directory as a package.
    + The `__init__.py` file can be empty.
+ `pyproject.toml`
    + To install your package you need to define a "build system", the tool that will do the work of creating the package, and to provide a configuration file to specify how your package should be built.
    + The three most common package config files are:
        + pyproject.toml (preferred)
        + setup.cfg (may be deprecated in the future)
        + setup.py (may be needed for packages with complex build requirements)
    + You’ll find a lot of projects that use setup.py (which used to be the standard), but for new projects it’s recommended to use pyproject.toml. TOML is a modern file format for configuration files.
    + There are multiple "build systems" that can interpret `pyproject.toml` files and build your package. The original and most ubiquitous is `setuptools`.
    + Note: The structure of pyproject.toml will differ depending on the tool you’re using.
+ `setup.py`
    + A minimal `setup.py` lets you install the package in “editable mode”, i.e., install it while you keep working on it. (With recent versions of pip and setuptools this file will most likely no longer be needed.)
+ `LICENSE.txt`, `README.md`, `.gitignore`
    + we discussed these files in the last session (please check out the corresponding material)
+ `examples`
    + this folder includes all code/ material that we need in order to present case studies that show the functionality of our package
    + this is not a "standard folder" but for our specific example-project useful
+ `tests`
    + includes tests of your functions/classes  
    + we will talk about testing in the upcoming session   

## Prepare the folder directory
+ prepare the `example_project` folder:
    + go to your terminal and create an `example_project` folder in your example-project directory
        + `$ mkdir example_project`
    + move `politeness_data.py` and `download_data.py` into the `example_project` folder
        + `$ mv politeness_data.py example_project`
        + `$ mv download_data.py example_project`  
    + go into the example_project folder and create an empty `__init__.py`:
        + `$ cd example_project`
        + `$ touch __init__.py`     
    + check whether everything worked: list all files:
        + `$ ls`
    + go again out of the example_project folder into the main directory of your example-project:
        + `$ cd ..`
          
+ prepare the `examples` folder: 
    + Let's do the same procedure as above for the examples folder:
        + `$ mkdir examples` (create the examples folder)
        + `$ mv data examples` (move the data folder to examples)
        + `$ mv usage_example.py examples` (move usage_example.py to examples)
          
+ prepare the `tests` folder: 
    + let's also create the tests folder with empty files as preparation for the upcoming "testing" session
        + `$ mkdir tests` and `$ cd tests` (create the tests folder and go into it)
        + `$ touch test_politeness_data.py` (create an empty test file)
        + `$ touch test_download_data.py` (create an empty test file)
        + `$ cd ..` (go back to the main directory of the example-project)

+ Let's have a look at the current structure of our example-project
    + `$ ls -Rt` 
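
For readers who prefer Python over shell commands (for instance on Windows without `touch`), the folder preparation above can be sketched with `pathlib`. This is only an illustration: it creates the folders and the empty files in a temporary directory and does not move the existing files (`politeness_data.py`, `download_data.py`, `usage_example.py`, `data`). The function name `build_layout` is hypothetical.

```python
import tempfile
from pathlib import Path


def build_layout(root):
    """Create the folders and empty files from the steps above under root."""
    root = Path(root)
    for folder in ("example_project", "examples", "tests"):
        (root / folder).mkdir(parents=True, exist_ok=True)
    for empty_file in (
        "example_project/__init__.py",
        "tests/test_politeness_data.py",
        "tests/test_download_data.py",
    ):
        (root / empty_file).touch()
    # Return all created entries relative to root, with forward slashes
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


# Try it out in a throwaway directory so nothing in the real project changes
with tempfile.TemporaryDirectory() as tmp:
    layout = build_layout(tmp)

for entry in layout:
    print(entry)
```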

## Create a pyproject.toml with setuptools
+ next, we want to create the pyproject.toml
+ go to your terminal and create the file: `$ touch pyproject.toml`
+ then open the file (Windows: `start pyproject.toml`, Mac: `open pyproject.toml`)
+ copy and paste the following text into your pyproject.toml:

+ The `[build-system]` section gives details about the tool that should be used to create the package from our code, in this case `setuptools`
+ The `[project]` section contains metadata about your package, at minimum this should include your package’s name (usually the name of your package directory) and a version number
+ Your package’s dependencies should be passed as a list in the `[project]` section
    + Now the question is: How do you get a list of all modules that you import in your package?
    + One possible way is to use the python module `pipreqs`
        + install the module: `$ pip install pipreqs`
        + make sure that you are in the main directory of your example-project
        + and then run `$ pipreqs .` (it will now take some time but finally you should see the message:  Successfully saved requirements file in .\requirements.txt)
    + open the requirements.txt (`$ start requirements.txt`)
        + for each module, look up the oldest version that should be supported and write it as a lower bound using `>=` (e.g., numpy >= 1.21.5)
        + copy the modules/versions and paste them into your pyproject.toml
    + it is important to check whether all imports that you made in the folder `example_project` were detected. We can use the terminal for that:
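        + `$ cd example_project` and `$ grep -r "import" .`
        + You see here that `pathlib` is imported but was not listed by `pipreqs`.
        + This is expected: `pathlib` is part of the Python standard library, so it is installed together with Python and does not belong in the dependencies list (the `pathlib` package on PyPI is only an outdated backport for very old Python versions)
        + go back to the main directory of your example-project afterwards: `$ cd ..`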
+ `[project.optional-dependencies]`: Sometimes a package may have extra optional features, with extra dependencies, that not all users need. A common example is development dependencies (e.g. for running tests, building documentation, checking code quality, and similar) that a normal user won’t need. `dev` is the name of an optional group of dependencies that can be passed to pip when installing the package

+ save and close the pyproject.toml
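
For orientation, the three sections described above could look roughly like the TOML text embedded in the sketch below. This is an assumed minimal configuration, not the file from the original course material: the version number is hypothetical, and the dependency list only contains the example quoted above. Wrapping the text in Python and parsing it with `tomllib` (standard library from Python 3.11; the `tomli` fallback is an assumption for older versions) is simply a way to check that the file is valid TOML.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the tomli backport is installed

# Assumed minimal pyproject.toml; adapt name, version and dependencies
PYPROJECT_SKETCH = """
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "example_project"
version = "0.1.0"          # hypothetical version number
dependencies = [
    "numpy >= 1.21.5",     # example from the text; list every third-party import
]

[project.optional-dependencies]
dev = ["pytest"]
"""

config = tomllib.loads(PYPROJECT_SKETCH)
print(config["build-system"]["build-backend"])
print(config["project"]["name"], config["project"]["version"])
print(config["project"]["dependencies"])
print(config["project"]["optional-dependencies"]["dev"])
```

The same check works on the real file by opening `pyproject.toml` in binary mode and passing the file object to `tomllib.load`.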

## Create a small setup.py

+ create a setup.py
    + `$ touch setup.py`
    + write the following two lines, save and close the file: 

```python
from setuptools import setup

setup()
```

## Install package
+ now we install our package (make sure that you are in the main directory of your example-project) 
    + `$ pip install .`
+ To install dependencies specified in `[project.optional-dependencies]`, include the name of the optional group in square brackets, like this:
    + `$ pip install -e ".[dev]"` (`-e` installs our package in editable mode; the `[dev]` group additionally installs pytest)
+ Let's have again a look into our folder structure
    + `$ ls -Rt`
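
Whether the installation succeeded can also be checked from Python. The sketch below uses `importlib.metadata` from the standard library; the distribution name `example_project` is an assumption and has to match the `name` field in your `pyproject.toml`. The helper name `installed_version` is made up for this illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Prints the version from pyproject.toml, or None if the package is not installed
print(installed_version("example_project"))
```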

## Using your package

+ open the file `usage_example.py` in Spyder
+ add the following two lines on top of the file:
    + `from example_project.download_data import DownloadData`
    + `from example_project.politeness_data import PolitenessData`
+ in this way we import the two classes from the submodules of our own package
+ change the path of the variable `target_path` (note, we moved the data folder into the folder "examples")
+ finally, let's create a Spyder project such that we don't have to modify the PATH permanently
    + click on the tab `Projects` in the Spyder toolbar
    + then use the option "existing directory" and select the "example-project" directory 
+ Open the `usage_example.py` again and run the file 
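
One way to make `target_path` independent of the directory from which the script is started is to build it relative to the script itself. The following is only a sketch under assumptions: it presumes that `target_path` should point to the `data` folder next to `usage_example.py`, and the helper `data_directory` is hypothetical. Adapt it if your `target_path` refers to a single file.

```python
import pathlib


def data_directory(script_file):
    """Return the data folder located next to the given script."""
    return pathlib.Path(script_file).resolve().parent / "data"


# Inside usage_example.py one would pass __file__ instead of a literal path
target_path = data_directory("examples/usage_example.py")
print(target_path)
```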

# Documentation (with Sphinx)

+ We’re going to document our “example-project” using docstrings with Sphinx.
+ There are various conventions for how to write docstrings; we use the conventions from NumPy and therefore the `numpydoc` Sphinx extension, which supports them.
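
To illustrate the NumPy convention, here is a small made-up function with a numpydoc-style docstring. The function `mean_pitch` is hypothetical and not part of the example-project; it only shows the typical sections (`Parameters`, `Returns`, `Raises`) and their underlines.

```python
def mean_pitch(frequencies):
    """Compute the mean of a sequence of pitch values.

    Parameters
    ----------
    frequencies : list of float
        Pitch measurements in Hz.

    Returns
    -------
    float
        Arithmetic mean of `frequencies`.

    Raises
    ------
    ValueError
        If `frequencies` is empty.
    """
    if not frequencies:
        raise ValueError("frequencies must not be empty")
    return sum(frequencies) / len(frequencies)


print(mean_pitch([200.0, 220.0]))  # 210.0
```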

## Set up sphinx
+ Invoke the sphinx-quickstart command to build Sphinx’s configuration file automatically based on questions at the command line:
    +  `$ sphinx-quickstart docs`
+ Entering this command results in the following response: 

+ now our directory contains a new folder called `docs`:
    + `$ ls`
+ let's go into the folder and have a look at the generated files:
    + `$ cd docs` and `$ ls`
    + you should see something like: Makefile  _build/  _static/  _templates/  conf.py  index.rst  make.bat
+ let's have a look at the `conf.py`:
    + `$ start conf.py`      
+ This file contains the project’s Sphinx configuration, as Python variables.
+ First, before the section "project information", we add the path to our package modules (relative to the location where the `conf.py` is stored):

```python
import os
import sys

# make the package modules importable when Sphinx builds the documentation
sys.path.insert(0, os.path.abspath("../example_project"))
```

+ Let’s add some extensions to the `extensions` field and save the file

+ we have to install the `myst_nb` module using
    + `$ pip install myst_nb` 
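
A possible `extensions` entry for this lesson is sketched below. The selection is an assumption derived from the tools named in this material (docstring extraction, NumPy-style docstrings, notebooks) and may differ from the list used in the course. `numpydoc` has to be installed as well (`$ pip install numpydoc`) if it is not yet available in your environment.

```python
# Sketch of the `extensions` variable in docs/conf.py (assumed selection)
extensions = [
    "sphinx.ext.autodoc",  # pulls docstrings from the package modules
    "numpydoc",            # renders NumPy-style docstrings
    "myst_nb",             # lets Sphinx include Jupyter notebooks
]
```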

## Automatic documentation (build .rst from .py files)
+ make sure you are in the `docs` folder 
+ `$ sphinx-apidoc -f -o source/ ../example_project` (sphinx-apidoc [optional: -f] -o <OUTPUT_PATH> <MODULE_PATH> )
+ check whether the folder `source` has been created:
    + `$ ls`
+ go into the folder `source` and list all generated files:
    + `$ cd source` and `$ ls`
    + you should see something like: example_project.rst,  modules.rst
 
## Define the root documentation page
+ Sphinx uses reStructuredText, a markup format similar to Markdown.
+ when using the command `sphinx-quickstart` a template is created in an `index.rst`, which can be edited to contain any preamble text you want.
+ the index.rst file can be found in the `docs` folder, thus let's go to the docs folder and open the file 
    + `$ cd ..` and `$ start index.rst`
+ Make the following small changes in the index.rst (and save the modified file): 

## Run sphinx
+ make sure you are in the `docs` folder 
+ and then run Sphinx using:
    + `$ sphinx-build . ./output`  ( sphinx-build <sourcedir> <outputdir> )
+ check whether in docs a folder named `output` has been created:
    + `$ ls`
    + you should see something like: Makefile  _build/  _static/  _templates/  conf.py  index.rst  make.bat  output/

## Sphinx output
+ go into the `output` folder and open the `index.html`
    + `$ cd output` and `$ start index.html`
    + you should see a simple documentation page that has been opened in your default browser  
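
The `start` command only exists on Windows. As a platform-independent alternative, the page can be opened with Python's `webbrowser` module. The following is a minimal sketch with hypothetical helper names (`index_uri`, `open_docs`); it assumes that it is run from the `docs` folder, so that `output/index.html` is the page built above.

```python
import webbrowser
from pathlib import Path


def index_uri(output_dir="output"):
    """Return a file:// URI for index.html inside the Sphinx output folder."""
    return (Path(output_dir) / "index.html").resolve().as_uri()


def open_docs(output_dir="output"):
    """Open the built documentation in the default browser."""
    return webbrowser.open(index_uri(output_dir))


uri = index_uri()
print(uri)
# Call open_docs() to actually launch the browser.
```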

## Sphinx design
+ At the moment our documentation site does not look very nice, so let us change the design a bit.
+ First, let us use `sphinx_book_theme` as the HTML theme. Open the conf.py in the docs folder:
    + (make sure you are in the docs folder) `$ start conf.py`
    + change the value of the variable `html_theme` from `alabaster` to `sphinx_book_theme` and save the file
    + Before we can use this new theme, we have to install it with: `$ pip install sphinx_book_theme`
+ Now, we can build the site again and see how the design has changed
    + (make sure you are still in the docs folder) `$ sphinx-build . ./output`
    + open the html file: `$ start output/index.html`
+ This already looks better, but we can improve the design further.
+ Open the `conf.py` again and add the following below the variable "exclude_patterns":

```python
numpydoc_show_class_members = False

autodoc_default_options = {
    "members": "var1, var2",
    "special-members": "__call__,__init__",
    "undoc-members": True,
    "exclude-members": "__weakref__",
    "member-order": "bysource"
}
```

+ furthermore add to the variable `extensions`:
    + "sphinx_design",  # For designing beautiful, view size responsive web components.
+ and add  below `html_theme`:
    + html_title = "Example Project" 
+ save the conf.py
+ before we build the site again, we have to install the module `sphinx_design` with
    + `$ pip install sphinx_design`
+ now we can have a look at the changes we just made:
    + `$ sphinx-build . ./output`
 
## Include jupyter notebooks into your site
+ Finally, let us include a jupyter notebook into our site
+ I have already created a notebook that corresponds to our "usage_example.py"
+ In order to save time you can simply download the `example_notebook.ipynb` from our course website in the material section (lesson 9)
+ create in the `example-project/docs` folder a new folder called `notebooks` 
    + (make sure you are in the docs folder) `$ mkdir notebooks`
    + save the file `example_notebook.ipynb` in the folder `notebooks`
+ open the index.rst in order to include the notebook 
    + `$ start index.rst`
    + include after `API <source/modules.rst>` the following line:
        + `Example <notebooks/example_notebook>`
    + save the index.rst
+ now open the `conf.py` and add the following lines at the end of the file:

+ save the `conf.py`
+ now let's build the site again
    + (make sure you are in the docs folder) `$ sphinx-build . ./output`
+ and open the `index.html` 