<img src='img/logo.png'>
<img src='img/title.png'>
<img src='img/py3k.png'>

# Table of Contents
* [Learning Objectives:](#Learning-Objectives:)
* [From Scripts to Packages](#From-Scripts-to-Packages)
	* [Directory structure](#Directory-structure)
	* [`__init__.py`](#__init__.py)
	* [`__main__.py`](#__main__.py)
* [Example project](#Example-project)
	* [Exercise 1](#Exercise-1)
	* [Exercise 2](#Exercise-2)
* [Python Packages](#Python-Packages)
	* [Setuptools](#Setuptools)
		* [More information](#More-information)
	* [Conda recipe](#Conda-recipe)
	* [Exercise 1](#Exercise-1)
	* [Exercise 2](#Exercise-2)


# Learning Objectives:

After completion of this module, learners should be able to:

* Distinguish a script, a module and a package in Python
* Refactor scripts into re-usable modules
* Create a package from a collection of modules
* Create a `setup.py` file for the package
* Create a conda recipe for the package
* Install and test the package in a Python environment

# From Scripts to Packages

Previously in this course we learned a great deal about parsing command line arguments, reading data from files, preparing efficient data structures, doing computation and displaying results. Generally, these actions were performed in Jupyter notebooks or script files.

* A **script** is a unit of Python code that is executed from a single file. It need not use functions.

We learned that **scripts** can also be used as **modules** in other scripts.

* A **module** is a unit of Python code that can be *imported* into other Python code files. It need not be directly executable as a script.

In this module we will learn how to transform **scripts** into **modules** and prepare a **package**.

* A **package** is a collection of Python **modules** that can be *installed* as a whole using a package manager. A package need not provide executable scripts, but should provide some testing capabilities.
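
To make the distinction concrete, here is a minimal sketch of a single file that works both as a script and as a module. The file name `stats_tools.py` and the functions `mean` and `main` are hypothetical, chosen only for illustration.

```python
# stats_tools.py (hypothetical file name, for illustration only)
import sys


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    return sum(values) / len(values)


def main(argv=None):
    """Script behaviour: average the numbers given on the command line."""
    args = sys.argv[1:] if argv is None else argv
    if not args:
        print("usage: python stats_tools.py NUMBER [NUMBER ...]")
        return
    print(mean(float(a) for a in args))


# This guard is true only when the file is executed as a script,
# for example with `python stats_tools.py 1 2 3`. On import it is skipped.
if __name__ == "__main__":
    main()
```

Used as a module, another file would write `from stats_tools import mean` and call `mean([1, 2, 3])` without triggering the command line code.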

## Directory structure

```
package/
  README
  setup.py
  
  package/
    __init__.py
    __main__.py
    module1/
      __init__.py
      classes.py
    module2/
      __init__.py
      funcs.py
      tests/
        test2.py
      
  tests/
    global_tests.py
```    

## `__init__.py`

The `__init__.py` file controls what functions get imported and any code that is run during

```python
import package
```

In many cases this file can be blank. It just informs Python that this directory contains module files that can be imported.

For packages with multiple subdirectories the `__init__.py` file can be used to perform relative imports and expose certain methods or classes in the API after import.

For the above project the `package/__init__.py` file can be written using relative imports 

```python
from . import module1
from . import module2
```

With this `__init__.py` file the submodules `module1` and `module2` are automatically imported when the user runs `import package`.

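If `function2` from `module2/funcs.py` requires the `class1` that has been defined in `module1/classes.py`, then `module2/__init__.py` can use relative imports. A single dot refers to the current package and two dots refer to its parent.

```python
from . import funcs                     # module2/funcs.py
from ..module1.classes import class1    # package/module1/classes.py
```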

Alternatively, the `package/__init__.py` file can be used to expose only certain definitions from lower-level modules.

This `__init__.py` file exposes `class2` and `function2` from `module1` and `module2` as `package.class2` and `package.function2` when the user runs `import package`.

```python
from .module1.classes import class2
from .module2.funcs import function2
```

The user will still be able to access all of the other definitions using `package.module1.classes` and `package.module2.funcs`.
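
The behaviour described above can be checked with a throwaway copy of the layout. This is only a sketch: the package name `demo_pkg` is a hypothetical stand-in for the top-level `package/` directory, the class and function bodies are placeholders, and the relative import of `class1` is placed in `funcs.py` itself for simplicity.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the directory layout in a temporary directory.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"  # hypothetical name standing in for `package/`
(pkg / "module1").mkdir(parents=True)
(pkg / "module2").mkdir()

(pkg / "module1" / "__init__.py").write_text("")
(pkg / "module1" / "classes.py").write_text(
    "class class1:\n    pass\n\n"
    "class class2:\n    pass\n"
)
(pkg / "module2" / "__init__.py").write_text("")
(pkg / "module2" / "funcs.py").write_text(
    "from ..module1.classes import class1\n\n"
    "def function2():\n"
    "    return class1()\n"
)
# Top-level __init__.py: expose selected names at the package level.
(pkg / "__init__.py").write_text(
    "from .module1.classes import class2\n"
    "from .module2.funcs import function2\n"
)

# Make the temporary directory importable and import the package.
sys.path.insert(0, str(root))
demo_pkg = importlib.import_module("demo_pkg")

print(demo_pkg.class2)                  # exposed at the top level
print(demo_pkg.function2())             # built from class1 via a relative import
print(demo_pkg.module1.classes.class1)  # still reachable by its full path
```

Only `class2` and `function2` appear directly on `demo_pkg`, while `class1` remains available through the full dotted path.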

## `__main__.py`

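The `__main__.py` file defines the Python script that runs when the package is executed with `python -m package`. It can also be installed as a stand-alone executable through an entry point. This script can import any required objects from the module files and generally defines the Command Line Interface. It is considered best practice to also include the following guard in the script itself, so that the command line code runs only when the file is executed and not when it is imported.

```python
if __name__ == '__main__':
    main()  # main() is the function in this file that implements the CLI
```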
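
As a rough sketch of how such a file might be organised, the example here keeps the command line parsing in a `main()` function so that the same function can be called from tests or registered as an entry point. The program name, the `name` argument and the `--shout` option are hypothetical and exist only to give the parser something to do.

```python
# __main__.py (sketch): runs when the user types `python -m package`
import argparse


def build_parser():
    """Define the command line interface (the options are hypothetical)."""
    parser = argparse.ArgumentParser(prog="package", description="Example CLI")
    parser.add_argument("name", nargs="?", default="world", help="who to greet")
    parser.add_argument("--shout", action="store_true", help="upper-case the output")
    return parser


def main(argv=None):
    """Parse arguments, do the work, and return an exit status."""
    # argv=None makes argparse read sys.argv[1:]; passing a list helps testing.
    args = build_parser().parse_args(argv)
    message = f"Hello, {args.name}"
    if args.shout:
        message = message.upper()
    print(message)
    return 0


if __name__ == "__main__":
    main()
```

Because `main()` accepts an optional argument list, a test can call `main(["Ada", "--shout"])` directly instead of launching a subprocess.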

# Example project

Look at the `src/qStats` directory. This is a clone of the [qStats2 project](https://github.com/AlbertDeFusco/qStats2) to generate reports of HPC cluster utilization.

This project shows a common organization of a Python package.

## Exercise 1

* Practice running `python setup.py install`. Where does the package get installed? What happens when you change conda environments?

## Exercise 2

<img src='img/topics/Advanced-Concept.png' align='left' style='padding:10px'>
Refactor the code
-----------------

1. Refactor the code in `__main__.py` into a module called `qStats/qstats/reports.py` that defines at least two separate functions (or maybe classes) to make a Group Report and a Queue Report. A report should take simple arguments and automate reading the necessary files.

2. Can you write tests for these methods?

# Python Packages

## Setuptools

Here is a simple `setup.py` file. It defines
* which directories contain packages
* where data files are stored
* where *entry points* are defined
  * entry points are command line executables

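```python
from setuptools import setup

setup(
    name='qStats',
    version='0.1',
    description='Moab queue stat report generator',
    author='Albert DeFusco',
    license='MIT',
    # directories that contain packages
    packages=['qstats'],
    # data files shipped with the package
    package_data={
        'qstats.tests': ['data/*'],
    },
    # command line executables, written as "<command> = <module>:<function>"
    entry_points={
        'console_scripts': ['qstats = qstats.__main__:main'],
    },
)
```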
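
An entry point string such as `qstats = qstats.__main__:main` names a command and the function it should call. The following is a simplified sketch of how that string maps to a callable. It is not the code setuptools actually uses, `resolve_entry_point` is a hypothetical helper, and a standard library target (`json:dumps`) is used so that the example runs without `qstats` installed.

```python
import importlib


def resolve_entry_point(spec):
    """Split '<command> = <module>:<function>' and return (command, callable)."""
    command, target = (part.strip() for part in spec.split("=", 1))
    module_name, _, attr = target.partition(":")
    module = importlib.import_module(module_name)  # import the named module
    return command, getattr(module, attr)          # look up the function in it


# Hypothetical entry point that targets the standard library.
command, func = resolve_entry_point("to_json = json:dumps")
print(command)
print(func([1, 2]))
```

With the package installed, the string `qstats = qstats.__main__:main` would be resolved in the same spirit: a `qstats` command that imports `qstats.__main__` and calls its `main` function.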

Python packages are installed using

```
> python setup.py install
```

### More information

* use `find_packages()` from `setuptools` to scan for all directories with `__init__.py` files
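
To illustrate what scanning for directories with `__init__.py` files means, here is a simplified sketch written with the standard library only. It is not the setuptools implementation: `scan_packages` is a hypothetical helper, and the directory names in the demonstration are assumptions made for this example.

```python
import os
import tempfile


def scan_packages(where):
    """Return dotted names of directories under `where` that hold __init__.py."""
    found = []
    for dirpath, dirnames, filenames in os.walk(where):
        if dirpath == where:
            continue  # the root directory itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []  # do not descend into non-package directories
            continue
        rel = os.path.relpath(dirpath, where)
        found.append(rel.replace(os.sep, "."))
    return sorted(found)


# Hypothetical project layout: two packages and one plain directory.
root = tempfile.mkdtemp()
for name in ("qstats", os.path.join("qstats", "tests"), "docs"):
    os.makedirs(os.path.join(root, name))
for name in ("qstats", os.path.join("qstats", "tests")):
    open(os.path.join(root, name, "__init__.py"), "w").close()

print(scan_packages(root))  # ['qstats', 'qstats.tests']
```

In a real `setup.py` the equivalent is `from setuptools import setup, find_packages` followed by `packages=find_packages()`, which removes the need to list every package by hand.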

## Conda recipe

The `meta.yaml` file is called a [conda build recipe](http://conda.pydata.org/docs/building/recipe.html). It defines
* the package name
* the package version
* required conda packages to build and run
* test scripts and imports

The source can be a relative path within the package or a URL, such as a PyPI or GitHub address.

```yaml
package:
  name: qstats
  version: "0.1"

source:
  path: ./

build:
  script: python setup.py install
  entry_points:
    - qstats = qstats.__main__:main

requirements:
  build:
    - python
    - setuptools
  run:
    - python

test:
  imports:
    - qstats
```

Conda packages are built using

```
> conda-build .
```

The packages are then uploaded to [Anaconda Cloud](http://anaconda.org) or your private Anaconda Repository, or installed locally.

```
> conda install qstats --use-local
```

## Exercise 1

<img src='img/topics/Exercise.png' align='left' style='padding:10px'>
<br>
In `notebooks/src/qStats` practice building and installing the package.

## Exercise 2

<img src='img/topics/Advanced-Concept.png' align='left' style='padding:10px'>
<br>
Build a package called `great_circle` from the distance code we have been working on.

Using the template `great_circle` package in `notebooks/src/great_circle/` add the following items:
* `__init__.py`
* `setup.py`
* `meta.yaml`

Use conda-build to build and install the package in a new conda environment.

<img src='img/copyright.png'>