Modules and Packaging
====

At some point, you will want to organize and distribute your code library for the whole world to share, preferably on PyPI so that it is pip installable.

## References

This notebook shows a bare-bones version of creating a project and distributing it to PyPI. Please follow the instructions in the official documentation. For convenience, you can use the sample project as a template.

- [Packaging and Distributing Projects](https://packaging.python.org/tutorials/distributing-packages/)
- [A sample Python project](https://github.com/pypa/sampleproject)

For more about how to organize the structure of your package:

- [Official tutorial on packages](https://docs.python.org/3/tutorial/modules.html#packages)

If you are still confused about what the mysterious `__init__.py` does, a blog post on the topic might help.

## Install packages we will use for packaging

In [None]:
! pip install -U pip
! pip install twine

## Modules

In Python, any `.py` file is a module in that it can be imported. Because the interpreter runs the entire file when a module is imported, it is traditional to use a guard around code that should only run when the file is executed as a script.

In [None]:
%%file foo.py
"""
When this file is imported with `import foo`,
only `useful_func1()` and `useful_func2()` are loaded,
and the test code `assert ...` is ignored. However,
when we run foo.py as a script `python foo.py`, then
the two assert statements are run.
Most commonly, the code under `if __name__ == '__main__':`
consists of simple examples or test cases for the functions
defined in the module.
"""

def useful_func1():
    pass

def useful_func2():
    pass

if __name__ == '__main__':
    assert useful_func1() is None
    assert useful_func2() is None
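
To see the guard in action from a single process, the sketch below loads one file both ways. The module source and the name `guard_demo` are invented for illustration, and `runpy.run_path(..., run_name="__main__")` is used here as a stand-in for running `python guard_demo.py` from the shell.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# Hypothetical module source, used only for this demonstration.
SOURCE = '''
events = []

def useful_func1():
    return None

events.append("loaded as " + __name__)

if __name__ == "__main__":
    events.append("guarded block ran")
'''


def load_both_ways():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "guard_demo.py"
        path.write_text(SOURCE)

        # 1. Import the file as a module: __name__ is the module name.
        spec = importlib.util.spec_from_file_location("guard_demo", path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # 2. Execute the same file as a script: __name__ is "__main__".
        script_globals = runpy.run_path(str(path), run_name="__main__")

    return module.events, script_globals["events"]


imported_events, script_events = load_both_ways()
print(imported_events)
print(script_events)
```

Only the second list should contain the entry added inside the guarded block.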

### Organization of files in a package

When the number of files you write grows large, you will probably want to organize them into their own directory structure. To make a folder a package (a module that can contain other modules), you just need to include a file named `__init__.py` in the folder. This file can be empty. For example, here is a package named `pkg` with sub-packages `sub1` and `sub2`.

```
./pkg:
__init__.py	foo.py		sub1		sub2

./pkg/sub1:
__init__.py		more_sub1_stuff.py	sub1_stuff.py

./pkg/sub2:
__init__.py	sub2_stuff.py
```


In [None]:
import pkg.foo as foo

In [None]:
foo.f1()

In [None]:
import pkg

In [None]:
pkg.foo.f1()
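
One detail worth knowing: when `__init__.py` is empty, `import pkg` on its own does not load the submodules. If `pkg/__init__.py` is empty, the `pkg.foo.f1()` call above works because `pkg.foo` was already imported earlier in the session. The sketch below shows this with a throwaway package; the name `demo_pkg` and the body of `f1` are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Create demo_pkg/ with an empty __init__.py and one module (made-up content)."""
    pkg_dir = root / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")  # an empty file is enough
    (pkg_dir / "foo.py").write_text("def f1():\n    return 'f1 called'\n")


def import_demo():
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)  # make the temporary folder importable
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module("demo_pkg")
            visible_before = hasattr(pkg, "foo")  # submodule not loaded yet
            importlib.import_module("demo_pkg.foo")
            visible_after = hasattr(pkg, "foo")  # now bound on the package
            return visible_before, visible_after, pkg.foo.f1()
        finally:
            sys.path.remove(tmp)


visible_before, visible_after, result = import_demo()
print(visible_before, visible_after, result)
```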

#### How to import a module at the same level

Within a package, use absolute import paths (starting from the top-level package name) to import other modules in the same directory. This removes any ambiguity about whether you meant a system module with the same name. For example, `pkg/sub1/more_sub1_stuff.py` imports functions from `pkg/sub1/sub1_stuff.py`:

In [None]:
! cat pkg/sub1/more_sub1_stuff.py

In [None]:
from pkg.sub1.more_sub1_stuff import g3

g3()
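
As a rough illustration of why the full path matters, the sketch below builds a throwaway package in which one module uses an absolute import and another tries a bare `import sub1_stuff`. The package name `demo_same`, the file names `absolute_style.py` and `implicit_style.py`, and the function bodies are assumptions made for this example, not the contents of the real `pkg`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Made-up module bodies; only the import statements matter here.
FILES = {
    "demo_same/__init__.py": "",
    "demo_same/sub1/__init__.py": "",
    "demo_same/sub1/sub1_stuff.py": "def g1():\n    return 'g1'\n",
    # Absolute import: the full dotted path from the top-level package.
    "demo_same/sub1/absolute_style.py": (
        "from demo_same.sub1.sub1_stuff import g1\n"
        "def g3():\n"
        "    return 'g3 uses ' + g1()\n"
    ),
    # Bare sibling import: Python 3 looks for a top-level module instead.
    "demo_same/sub1/implicit_style.py": "import sub1_stuff\n",
}


def write_tree(root, files):
    for rel_path, source in files.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)


def try_imports():
    with tempfile.TemporaryDirectory() as tmp:
        write_tree(Path(tmp), FILES)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            good = importlib.import_module("demo_same.sub1.absolute_style")
            absolute_result = good.g3()
            try:
                importlib.import_module("demo_same.sub1.implicit_style")
                missing_name = None
            except ModuleNotFoundError as exc:
                missing_name = exc.name  # the module Python could not find
            return absolute_result, missing_name
        finally:
            sys.path.remove(tmp)


absolute_result, missing_name = try_imports()
print(absolute_result, missing_name)
```

The bare import fails because `sub1_stuff` is not a top-level module on `sys.path`; it only exists inside the package.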

#### How to import a module at a different level

Again, just use absolute paths. For example, `sub2_stuff.py` in the `sub2` directory uses functions from `sub1_stuff.py` in the `sub1` directory:

In [None]:
! cat pkg/sub2/sub2_stuff.py

In [None]:
from pkg.sub2.sub2_stuff import h2

h2()
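
A minimal sketch of the cross-directory case, again with a throwaway package: the name `demo_cross` and the module bodies are invented. It also shows, as an aside beyond what this notebook recommends, that Python's explicit relative form (`from ..sub1.sub1_stuff import g1`) resolves to the same function.

```python
import importlib
import sys
import tempfile
from pathlib import Path

H2_BODY = "def h2():\n    return 'h2 uses ' + g1()\n"

# Made-up module bodies; only the import statements matter here.
FILES = {
    "demo_cross/__init__.py": "",
    "demo_cross/sub1/__init__.py": "",
    "demo_cross/sub1/sub1_stuff.py": "def g1():\n    return 'g1'\n",
    "demo_cross/sub2/__init__.py": "",
    # Absolute import, as recommended in the text above.
    "demo_cross/sub2/absolute_style.py":
        "from demo_cross.sub1.sub1_stuff import g1\n" + H2_BODY,
    # Explicit relative import: '..' climbs from sub2 up to demo_cross.
    "demo_cross/sub2/relative_style.py":
        "from ..sub1.sub1_stuff import g1\n" + H2_BODY,
}


def write_tree(root, files):
    for rel_path, source in files.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)


def compare_styles():
    with tempfile.TemporaryDirectory() as tmp:
        write_tree(Path(tmp), FILES)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            absolute = importlib.import_module("demo_cross.sub2.absolute_style")
            relative = importlib.import_module("demo_cross.sub2.relative_style")
            same_function = absolute.g1 is relative.g1
            return absolute.h2(), relative.h2(), same_function
        finally:
            sys.path.remove(tmp)


absolute_h2, relative_h2, same_function = compare_styles()
print(absolute_h2, relative_h2, same_function)
```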

## Distributing your package

Suppose we want to distribute our code as a library (for example, on PyPI so that it can be installed with `pip`). Let's create an `sta663-<username>` library (the username part is just to avoid name conflicts) containing the `pkg` package and some other files:

- `README.md`: some information about the library
- `sta663.py`: a standalone module
- `run_sta663.py`: a script (intended for use as `python run_sta663.py`)

In [None]:
! ls -R sta663

In [None]:
! cat sta663/run_sta663.py

### Using setuptools

All we need to do is to write a `setup.py` file.

In [None]:
%%file sta663/setup.py
from setuptools import setup

setup(name = "sta663-cliburn",
      version = "1.0",
      author='Cliburn Chan',
      author_email='cliburn.chan@duke.edu',
      url='http://people.duke.edu/~ccc14/sta-663-2018/',
      py_modules = ['sta663'],
      packages = ['pkg', 'pkg.sub1', 'pkg.sub2'],
      scripts = ['run_sta663.py'],
      python_requires='>=3',
      )
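
The `packages` argument expects dotted package names rather than file paths. `setuptools` provides `find_packages()` to build this list automatically; the sketch below imitates the idea with only the standard library so that the mapping from folders to dotted names is visible. The helper `discover_packages` is hypothetical and simplified: unlike the real function, it does not check that every parent folder is itself a package.

```python
import tempfile
from pathlib import Path


def discover_packages(root):
    """Return dotted names of every folder under root that has an __init__.py."""
    names = []
    for init_file in root.rglob("__init__.py"):
        relative_dir = init_file.parent.relative_to(root)
        if relative_dir.parts:  # skip root itself
            names.append(".".join(relative_dir.parts))
    return sorted(names)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Recreate the folder layout shown earlier, with empty __init__.py files.
    for folder in ("pkg", "pkg/sub1", "pkg/sub2"):
        (root / folder).mkdir(parents=True, exist_ok=True)
        (root / folder / "__init__.py").write_text("")
    found = discover_packages(root)

print(found)
```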

### Build a source archive for distribution

In [None]:
%%bash

cd sta663
python setup.py sdist
cd -

In [None]:
! ls -R sta663
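
The file produced under `dist/` is a gzipped tar archive whose contents sit under a single top-level folder. If you prefer to inspect such an archive from Python rather than the shell, the standard `tarfile` module can list its members. The sketch below builds a small stand-in archive in memory (the name `demo-1.0` and its two files are made up, and it is not a real sdist) and then lists it.

```python
import io
import tarfile


def make_stand_in_archive():
    """Build an in-memory .tar.gz that mimics the layout of a source archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        files = {
            "demo-1.0/setup.py": "from setuptools import setup\n",
            "demo-1.0/pkg/__init__.py": "",
        }
        for name, text in files.items():
            data = text.encode()
            info = tarfile.TarInfo(name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_members(archive_bytes):
    """Return the member names of a gzipped tar archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        return archive.getnames()


members = list_members(make_stand_in_archive())
print(members)
```

For a real archive on disk, `tarfile.open(path).getnames()` gives the same kind of listing.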

### Distribution

You can now distribute `sta663-1.0.tar.gz` to somebody else for installation in the usual way.

In [None]:
%%bash

cp sta663/dist/sta663-1.0.tar.gz /tmp
cd /tmp
tar xzf sta663-1.0.tar.gz
cd sta663-1.0
python setup.py install

In [None]:
import sta663

In [None]:
from sta663 import pkg

In [None]:
pkg.sub1.sub1_stuff.g1()

In [None]:
pkg.sub1.sub1_stuff.g2()

In [None]:
pkg.sub1.more_sub1_stuff.g3()

In [None]:
pkg.sub2.sub2_stuff.h1()

In [None]:
pkg.sub2.sub2_stuff.h2()
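
Besides importing the modules, one way to confirm that an installation succeeded is to ask Python for the installed distribution's version with `importlib.metadata` (Python 3.8+). Note that the distribution name given to `setup()` can differ from the names you import. The helper `installed_version` is a hypothetical convenience wrapper, not part of any library.

```python
from importlib import metadata


def installed_version(distribution_name):
    """Return the installed version string, or None if nothing by that name is installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


# The name below is the one passed to setup() in this notebook;
# the call returns None unless that distribution has been installed.
print(installed_version("sta663-cliburn"))
```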

#### Distributing to PyPI

For testing, please upload to TestPyPI, which is cleaned on a regular basis. See instructions at
https://packaging.python.org/guides/using-testpypi/#using-test-pypi

- **Note 1**: You need to confirm your email address after registration.
- **Note 2**: You can easily delete any uploaded packages by logging in at https://test.pypi.org.

When your package is ready for public release, you can upload to PyPI. See instructions at
https://packaging.python.org/tutorials/distributing-packages/#id78

In [None]:
%%bash

export TWINE_USERNAME='' 
export TWINE_PASSWORD=''
twine upload --repository-url https://test.pypi.org/legacy/ sta663/dist/*

In [None]:
%%bash

pip install --index-url https://test.pypi.org/simple/ sta663