# Packaging Projects for Distribution

- Creating a basic Python project with `setup.py` and `setup.cfg`
- Specifying dependencies
- Activating projects in a virtualenv with `setup.py develop`
- Distributing data with your project
- Using entry points to create console scripts
- Uploading source distributions to PyPI

# First, some terminology...

- a Python **module** is typically a single file ending in `.py` located somewhere along `sys.path` that you can use with the Python `import` statement
- a Python **package** is a folder located somewhere along `sys.path` that contains a "magic" `__init__.py` file and can itself be imported. When you import a package, Python actually imports the `__init__.py` *module* in that *package*. You can also import modules or subpackages from a package.
- a Python **project** is a unit of distribution of Python code (it's something you can `pip install`)
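
To make the module/package distinction concrete, here is a small sketch using two standard-library imports. The helper name `describe` is made up for this illustration; it relies on the fact that only packages carry a `__path__` attribute.

```python
import json    # stdlib package: a directory with an __init__.py
import string  # stdlib module: a single string.py file


def describe(obj):
    """Classify an imported object as a 'package' or a plain 'module'."""
    # Only packages carry a __path__ attribute, which lists the
    # directories searched for their submodules.
    kind = 'package' if hasattr(obj, '__path__') else 'module'
    return f'{obj.__name__} is a {kind}'


print(describe(json))
print(describe(string))
```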

# Creating a basic Python project with `setup.py` and `setup.cfg`

To create a project for distribution, you'll need to create a directory with:

- one or more Python packages to distribute
- a `setup.py` file
- (optionally) a `setup.cfg` file

In [None]:
%%file data/MyProject/mypackage/__init__.py
print('This is the __init__ file for mypackage')

In [None]:
%%file data/MyProject/mypackage/mymodule.py
print('This is mymodule')


def greet(name):
    print(f'Hello, {name}!')

For this demo, we'll use `setup.cfg` to provide metadata for our project, so we only need a minimal `setup.py`:

In [1]:
%%file data/MyProject/setup.py
from setuptools import setup


setup()

Writing data/MyProject/setup.py


We can create the `setup.cfg` file to specify how `setuptools` will build and distribute our project:

In [2]:
%%file data/MyProject/setup.cfg
[metadata]
name = MyProject
url = file:///
author = Some Person
author_email = somebody@example.com
version = 0.1
description = This should be a short description of our project
long_description = file: README.md
classifiers =
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
keywords = test, class

Writing data/MyProject/setup.cfg
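
`setup.cfg` is an INI-style file, so the standard `configparser` module can read it. The sketch below is only an illustration of how the `[metadata]` values parse (multi-line and comma-separated values in particular); it is not how `setuptools` itself processes the file, and the trimmed-down `SETUP_CFG` string is a stand-in for the real file.

```python
import configparser

# A shortened copy of the [metadata] section, inlined so the example
# runs without touching the filesystem.
SETUP_CFG = """
[metadata]
name = MyProject
version = 0.1
classifiers =
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
keywords = test, class
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)
metadata = parser['metadata']

# Indented continuation lines become one newline-separated value.
classifiers = [
    line for line in metadata['classifiers'].splitlines() if line.strip()
]
# Comma-separated values need to be split by hand.
keywords = [word.strip() for word in metadata['keywords'].split(',')]

print(metadata['name'], metadata['version'])
print(classifiers)
print(keywords)
```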


It's always nice to provide a README as well:

In [3]:
%%file data/MyProject/README.md
# MyProject

This project is a test setuptools project.

Writing data/MyProject/README.md


## Creating a source distribution

The entry point for all our project management commands is `setup.py`.

We can create a simple source distribution of our project by calling `python setup.py sdist`:

In [4]:
%%bash
cd data/MyProject
python setup.py sdist

running sdist
running egg_info
creating MyProject.egg-info
writing MyProject.egg-info/PKG-INFO
writing dependency_links to MyProject.egg-info/dependency_links.txt
writing top-level names to MyProject.egg-info/top_level.txt
writing manifest file 'MyProject.egg-info/SOURCES.txt'
reading manifest file 'MyProject.egg-info/SOURCES.txt'
writing manifest file 'MyProject.egg-info/SOURCES.txt'
running check
creating MyProject-0.1
creating MyProject-0.1/MyProject.egg-info
copying files to MyProject-0.1...
copying README.md -> MyProject-0.1
copying setup.cfg -> MyProject-0.1
copying setup.py -> MyProject-0.1
copying MyProject.egg-info/PKG-INFO -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/SOURCES.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/dependency_links.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/top_level.txt -> MyProject-0.1/MyProject.egg-info
Writing MyProject-0.1/setup.cfg
creating dist
Creating tar archive
removing 'MyProject-

In [5]:
!tar tzf data/MyProject/dist/MyProject-0.1.tar.gz

MyProject-0.1/
MyProject-0.1/MyProject.egg-info/
MyProject-0.1/MyProject.egg-info/PKG-INFO
MyProject-0.1/MyProject.egg-info/SOURCES.txt
MyProject-0.1/MyProject.egg-info/dependency_links.txt
MyProject-0.1/MyProject.egg-info/top_level.txt
MyProject-0.1/PKG-INFO
MyProject-0.1/README.md
MyProject-0.1/setup.cfg
MyProject-0.1/setup.py
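
The same inspection can be done from Python with the standard `tarfile` module. This is a sketch: `list_sdist`, `contains_package`, and `build_demo_archive` are hypothetical helpers, and the in-memory archive only mimics the layout of the listing above so that the example runs anywhere.

```python
import io
import tarfile


def list_sdist(fileobj):
    """Return the member names of a gzipped sdist archive."""
    with tarfile.open(fileobj=fileobj, mode='r:gz') as archive:
        return archive.getnames()


def contains_package(names, package):
    """Report whether any archive member lives inside `package/`."""
    return any(f'/{package}/' in name for name in names)


def build_demo_archive(names):
    """Build an in-memory .tar.gz holding empty files with the given names."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode='w:gz') as archive:
        for name in names:
            archive.addfile(tarfile.TarInfo(name), io.BytesIO(b''))
    buffer.seek(0)
    return buffer


demo = build_demo_archive([
    'MyProject-0.1/README.md',
    'MyProject-0.1/setup.cfg',
    'MyProject-0.1/setup.py',
])
names = list_sdist(demo)
print(names)
print(contains_package(names, 'mypackage'))
```

For a real distribution, open the file in binary mode (`open(path, 'rb')`) and pass the file object to `list_sdist`.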


## Adding our packages

As the archive listing shows, our distribution is still empty: it contains no packages or modules. We need to tell setuptools explicitly which packages to include:

In [6]:
%%file data/MyProject/setup.cfg
[metadata]
name = MyProject
url = file:///
author = Some Person
author_email = somebody@example.com
version = 0.1
description = This should be a short description of our project
long_description = file: README.md
classifiers =
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
keywords = test, class

[options]
packages = mypackage

Overwriting data/MyProject/setup.cfg
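
Listing packages by hand gets tedious once a project has subpackages; `setuptools` ships a `find_packages` helper for that. The sketch below is a standard-library approximation of the idea, not the `setuptools` implementation: `discover_packages` is a hypothetical name, and unlike the real helper it does not check that parent directories are packages too.

```python
import tempfile
from pathlib import Path


def discover_packages(root):
    """Return dotted names of directories under `root` containing __init__.py."""
    root = Path(root)
    found = []
    for init_file in sorted(root.rglob('__init__.py')):
        relative = init_file.parent.relative_to(root)
        if relative.parts:  # skip an __init__.py sitting directly in root
            found.append('.'.join(relative.parts))
    return found


# Build a throwaway project layout to run the helper against.
with tempfile.TemporaryDirectory() as project:
    for name in ('mypackage', 'mypackage/sub'):
        package_dir = Path(project, name)
        package_dir.mkdir(parents=True)
        (package_dir / '__init__.py').write_text('')
    packages = discover_packages(project)

print(packages)
```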


In [7]:
%%bash
cd data/MyProject
python setup.py sdist

running sdist
running egg_info
writing MyProject.egg-info/PKG-INFO
writing dependency_links to MyProject.egg-info/dependency_links.txt
writing top-level names to MyProject.egg-info/top_level.txt
reading manifest file 'MyProject.egg-info/SOURCES.txt'
writing manifest file 'MyProject.egg-info/SOURCES.txt'
running check
creating MyProject-0.1
creating MyProject-0.1/MyProject.egg-info
creating MyProject-0.1/mypackage
copying files to MyProject-0.1...
copying README.md -> MyProject-0.1
copying setup.cfg -> MyProject-0.1
copying setup.py -> MyProject-0.1
copying MyProject.egg-info/PKG-INFO -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/SOURCES.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/dependency_links.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/top_level.txt -> MyProject-0.1/MyProject.egg-info
copying mypackage/__init__.py -> MyProject-0.1/mypackage
copying mypackage/mymodule.py -> MyProject-0.1/mypackage
Writing MyProject-0.1/s

## Specifying dependencies

We can tell setuptools that we depend on particular versions (or version ranges) of other packages with an `install_requires` option:

In [8]:
%%file data/MyProject/setup.cfg
[metadata]
name = MyProject
url = file:///
author = Some Person
author_email = somebody@example.com
version = 0.1
description = This should be a short description of our project
long_description = file: README.md
classifiers =
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
keywords = test, class

[options]
packages = mypackage
install_requires = 
    jupyter
    flask
    numpy>=1.16.0,<1.17

Overwriting data/MyProject/setup.cfg
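
A version range such as `numpy>=1.16.0,<1.17` combines a lower and an upper bound. The sketch below shows the comparison for plain dotted numeric versions only; `parse_version` and `satisfies` are hypothetical helpers, and real installers follow the much richer PEP 440 rules (pre-releases, post-releases, and so on).

```python
def parse_version(text, width=3):
    """Turn '1.16' into (1, 16, 0) so that versions compare as tuples."""
    parts = tuple(int(part) for part in text.split('.'))
    # Pad with zeros so that '1.16' and '1.16.0' compare equal.
    return parts + (0,) * (width - len(parts))


def satisfies(version, minimum, below):
    """Check minimum <= version < below for simple numeric versions."""
    return parse_version(minimum) <= parse_version(version) < parse_version(below)


for candidate in ('1.15.4', '1.16.2', '1.17.0'):
    print(candidate, satisfies(candidate, '1.16.0', '1.17'))
```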


In [9]:
%%bash
cd data/MyProject
python setup.py sdist

running sdist
running egg_info
writing MyProject.egg-info/PKG-INFO
writing dependency_links to MyProject.egg-info/dependency_links.txt
writing requirements to MyProject.egg-info/requires.txt
writing top-level names to MyProject.egg-info/top_level.txt
reading manifest file 'MyProject.egg-info/SOURCES.txt'
writing manifest file 'MyProject.egg-info/SOURCES.txt'
running check
creating MyProject-0.1
creating MyProject-0.1/MyProject.egg-info
creating MyProject-0.1/mypackage
copying files to MyProject-0.1...
copying README.md -> MyProject-0.1
copying setup.cfg -> MyProject-0.1
copying setup.py -> MyProject-0.1
copying MyProject.egg-info/PKG-INFO -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/SOURCES.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/dependency_links.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/requires.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/top_level.txt -> MyProject-0.1/MyProject.egg-info
copyi

# Activating projects using `setup.py develop`

When we're developing our project, we probably want its packages to be importable as though it were 'installed' in our virtualenv. To do this, we can invoke `setup.py` with the `develop` command.

This creates a `MyProject.egg-link` file in a location along `sys.path`, which makes your packages importable from anywhere that uses the virtualenv.

In [10]:
%%bash
cd data/MyProject
python -m venv env
source env/bin/activate
python setup.py develop

running develop
running egg_info
writing MyProject.egg-info/PKG-INFO
writing dependency_links to MyProject.egg-info/dependency_links.txt
writing requirements to MyProject.egg-info/requires.txt
writing top-level names to MyProject.egg-info/top_level.txt
reading manifest file 'MyProject.egg-info/SOURCES.txt'
writing manifest file 'MyProject.egg-info/SOURCES.txt'
running build_ext
Creating /Users/rick446/src/arborian-classes/data/MyProject/env/lib/python3.7/site-packages/MyProject.egg-link (link to .)
Adding MyProject 0.1 to easy-install.pth file

Installed /Users/rick446/src/arborian-classes/data/MyProject
Processing dependencies for MyProject==0.1
Searching for numpy>=1.16.0<1.17
Reading https://pypi.org/simple/numpy/
Downloading https://files.pythonhosted.org/packages/a6/6f/cb20ccd8f0f8581e0e090775c0e3c3e335b037818416e6fa945d924397d2/numpy-1.16.2-cp37-cp37m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl#sha256=80a41edf64a3626e729a62df7

no previously-included directories found matching 'docs/build'
zip_safe flag not set; analyzing archive contents...
tornado.__pycache__.autoreload.cpython-37: module references __file__
tornado.__pycache__.gen.cpython-37: module references __file__
tornado.__pycache__.options.cpython-37: module references __file__
tornado.__pycache__.speedups.cpython-37: module references __file__
tornado.__pycache__.testing.cpython-37: module references __file__
tornado.test.__pycache__.gen_test.cpython-37: module references __file__
tornado.test.__pycache__.httpserver_test.cpython-37: module references __file__
tornado.test.__pycache__.iostream_test.cpython-37: module references __file__
tornado.test.__pycache__.locale_test.cpython-37: module references __file__
tornado.test.__pycache__.options_test.cpython-37: module references __file__
tornado.test.__pycache__.template_test.cpython-37: module references __file__
tornado.test.__pycache__.web_test.cpython-37: module references __file__
zip_safe flag 

In [12]:
cat data/MyProject/env/lib/python3.7/site-packages/easy-install.pth

/Users/rick446/src/arborian-classes/data/MyProject
./numpy-1.16.2-py3.7-macosx-10.14-x86_64.egg
./Flask-1.0.2-py3.7.egg
./jupyter-1.0.0-py3.7.egg
./itsdangerous-1.1.0-py3.7.egg
./Click-7.0-py3.7.egg
./Werkzeug-0.15.1-py3.7.egg
./Jinja2-2.10-py3.7.egg
./qtconsole-4.4.3-py3.7.egg
./notebook-5.7.6-py3.7.egg
./nbconvert-5.4.1-py3.7.egg
./jupyter_console-6.0.0-py3.7.egg
./ipywidgets-7.4.2-py3.7.egg
./ipykernel-5.1.0-py3.7.egg
./MarkupSafe-1.1.1-py3.7-macosx-10.14-x86_64.egg
./traitlets-4.3.2-py3.7.egg
./Pygments-2.3.1-py3.7.egg
./jupyter_core-4.4.0-py3.7.egg
./jupyter_client-5.2.4-py3.7.egg
./ipython_genutils-0.2.0-py3.7.egg
./tornado-6.0.1-py3.7-macosx-10.14-x86_64.egg
./terminado-0.8.1-py3.7.egg
./pyzmq-18.0.1-py3.7-macosx-10.14-x86_64.egg
./prometheus_client-0.6.0-py3.7.egg
./nbformat-4.4.0-py3.7.egg
./Send2Trash-1.5.0-py3.7.egg
./testpath-0.4.2-py3.7.egg
./pandocfilters-1.4.2-py3.7.egg
./mistune-0.8.4-py3.7.egg
./entrypoints-0.3-py3.7.egg
./defusedxml-0.5.0
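
The first line of `easy-install.pth` is the project directory itself. Python's `site` module reads `.pth` files in `site-packages` and appends each listed directory to `sys.path`, which is what makes the project importable. The sketch below reproduces that mechanism in a temporary directory; the names `demo_pth_package` and `demo.pth` are invented for the demonstration.

```python
import importlib
import site
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as workdir:
    # A stand-in "project" directory containing one package.
    project_dir = Path(workdir, 'project')
    package_dir = project_dir / 'demo_pth_package'
    package_dir.mkdir(parents=True)
    (package_dir / '__init__.py').write_text(
        'GREETING = "hello from the project directory"\n'
    )

    # A stand-in "site-packages" whose .pth file points at the project.
    site_dir = Path(workdir, 'site-packages')
    site_dir.mkdir()
    (site_dir / 'demo.pth').write_text(f'{project_dir}\n')

    # addsitedir() processes the .pth file and extends sys.path.
    site.addsitedir(str(site_dir))
    module = importlib.import_module('demo_pth_package')
    greeting = module.GREETING

print(greeting)
```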

In [13]:
%%bash
source data/MyProject/env/bin/activate
python -c 'import mypackage.mymodule; mypackage.mymodule.greet("class")'

This is the __init__ file for mypackage
This is mymodule
Hello, class!


## Distributing data with our project

By default, only the Python files in our packages are included with our project. To include non-Python files, we need to list them as well, using an `[options.package_data]` section:

In [14]:
%%file data/MyProject/mypackage/template.txt
This is an awesome template that greets you.

Hello, ${name}!

Writing data/MyProject/mypackage/template.txt


In [15]:
%%file data/MyProject/mypackage/mymodule.py
import os, string


def greet(name):
    with open(os.path.join(
        os.path.dirname(__file__),
        'template.txt'
    )) as f:
        template = string.Template(f.read())
    print(template.safe_substitute({'name': name}))

Overwriting data/MyProject/mypackage/mymodule.py
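
The module uses `string.Template.safe_substitute`, which leaves unknown placeholders in place instead of raising an error. A short standalone comparison with `substitute`:

```python
import string

template = string.Template('Hello, ${name}!')

# With a value for every placeholder, both methods behave the same.
filled = template.safe_substitute({'name': 'class'})

# With a missing value, safe_substitute keeps the placeholder as-is...
unfilled = template.safe_substitute({})

# ...while substitute raises KeyError.
try:
    template.substitute({})
    raised = False
except KeyError:
    raised = True

print(filled)
print(unfilled)
print(raised)
```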


In [16]:
%%file data/MyProject/setup.cfg
[metadata]
name = MyProject
url = file:///
author = Some Person
author_email = somebody@example.com
version = 0.1
description = This should be a short description of our project
long_description = file: README.md
classifiers =
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
keywords = test, class

[options]
packages = mypackage
install_requires = 
    jupyter
    flask
    numpy>=1.16.0,<1.17
    
[options.package_data]
* = *.txt

Overwriting data/MyProject/setup.cfg
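
In `[options.package_data]`, the key `*` applies to every package and the value is a glob-style pattern matched against files inside the package. As a rough illustration only (this uses the standard `fnmatch` module, which is an assumption about the matching style rather than the `setuptools` code path), here is which of our files the `*.txt` pattern would select:

```python
from fnmatch import fnmatch

# The files currently in the mypackage directory.
package_files = ['__init__.py', 'mymodule.py', 'template.txt']

# Keep the files that match the pattern from setup.cfg.
data_files = [name for name in package_files if fnmatch(name, '*.txt')]

print(data_files)
```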


In [17]:
%%bash
cd data/MyProject
python setup.py sdist

running sdist
running egg_info
writing MyProject.egg-info/PKG-INFO
writing dependency_links to MyProject.egg-info/dependency_links.txt
writing requirements to MyProject.egg-info/requires.txt
writing top-level names to MyProject.egg-info/top_level.txt
reading manifest file 'MyProject.egg-info/SOURCES.txt'
writing manifest file 'MyProject.egg-info/SOURCES.txt'
running check
creating MyProject-0.1
creating MyProject-0.1/MyProject.egg-info
creating MyProject-0.1/mypackage
copying files to MyProject-0.1...
copying README.md -> MyProject-0.1
copying setup.cfg -> MyProject-0.1
copying setup.py -> MyProject-0.1
copying MyProject.egg-info/PKG-INFO -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/SOURCES.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/dependency_links.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/requires.txt -> MyProject-0.1/MyProject.egg-info
copying MyProject.egg-info/top_level.txt -> MyProject-0.1/MyProject.egg-info
copyi

In [18]:
%%bash
source data/MyProject/env/bin/activate
python -c 'import mypackage.mymodule; mypackage.mymodule.greet("class")'

This is the __init__ file for mypackage
This is an awesome template that greets you.

Hello, class!



# Using entry_points for console_scripts

If you need to create a new command-line tool, a nice approach is to use the `entry_points` feature of `setuptools`:

In [19]:
%%file data/MyProject/setup.cfg
[metadata]
name = MyProject
url = file:///
author = Some Person
author_email = somebody@example.com
version = 0.1
description = This should be a short description of our project
long_description = file: README.md
classifiers =
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
keywords = test, class

[options]
packages = mypackage
install_requires = 
    jupyter
    flask
    numpy>=1.16.0,<1.17
    
[options.package_data]
* = *.txt

[options.entry_points]
console_scripts =
  my-greet=mypackage.mymodule:greet_main

Overwriting data/MyProject/setup.cfg
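
Each `console_scripts` line has the form `script-name = module.path:callable`. The sketch below shows how such a line can be resolved to a Python callable. `resolve_entry_point` is a hypothetical helper, the `join-path` spec points at the standard library purely for demonstration, and dotted attributes after the colon are not handled.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'name = module:attribute' and import the named callable."""
    name, _, target = spec.partition('=')
    module_name, _, attribute = target.strip().partition(':')
    module = importlib.import_module(module_name.strip())
    return name.strip(), getattr(module, attribute.strip())


# Hypothetical spec that targets a stdlib function instead of mypackage.
script_name, func = resolve_entry_point('join-path = os.path:join')

print(script_name)
print(func('data', 'MyProject'))
```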


In [20]:
%%file data/MyProject/mypackage/mymodule.py
import os, sys, string


def greet(name):
    with open(os.path.join(
        os.path.dirname(__file__),
        'template.txt'
    )) as f:
        template = string.Template(f.read())
    print(template.safe_substitute({'name': name}))
    
    
def greet_main():
    if len(sys.argv) > 1:
        name = sys.argv[1]
    else:
        name = 'unknown human'
    greet(name)

Overwriting data/MyProject/mypackage/mymodule.py
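
One possible refinement, shown here as a sketch rather than as part of the project: moving the argument handling into a function that takes `argv` as a parameter lets you exercise it without modifying `sys.argv`. The name `choose_name` is invented for this example.

```python
def choose_name(argv):
    """Return the first command-line argument, or a default if none is given."""
    # argv[0] is the script name, so the first real argument is argv[1].
    return argv[1] if len(argv) > 1 else 'unknown human'


print(choose_name(['my-greet']))
print(choose_name(['my-greet', 'class']))
```

`greet_main` could then call `greet(choose_name(sys.argv))`.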


In [21]:
%%bash
cd data/MyProject
python -m venv env
source env/bin/activate
python setup.py develop

running develop
running egg_info
writing MyProject.egg-info/PKG-INFO
writing dependency_links to MyProject.egg-info/dependency_links.txt
writing entry points to MyProject.egg-info/entry_points.txt
writing requirements to MyProject.egg-info/requires.txt
writing top-level names to MyProject.egg-info/top_level.txt
reading manifest file 'MyProject.egg-info/SOURCES.txt'
writing manifest file 'MyProject.egg-info/SOURCES.txt'
running build_ext
Creating /Users/rick446/src/arborian-classes/data/MyProject/env/lib/python3.7/site-packages/MyProject.egg-link (link to .)
MyProject 0.1 is already the active version in easy-install.pth
Installing my-greet script to /Users/rick446/src/arborian-classes/data/MyProject/env/bin

Installed /Users/rick446/src/arborian-classes/data/MyProject
Processing dependencies for MyProject==0.1
Searching for numpy==1.16.2
Best match: numpy 1.16.2
Processing numpy-1.16.2-py3.7-macosx-10.14-x86_64.egg
numpy 1.16.2 is already the active version in easy-install.pth
Installi

In [22]:
!data/MyProject/env/bin/my-greet

This is the __init__ file for mypackage
This is an awesome template that greets you.

Hello, unknown human!



In [23]:
!data/MyProject/env/bin/my-greet class

This is the __init__ file for mypackage
This is an awesome template that greets you.

Hello, class!



In [24]:
cat data/MyProject/env/bin/my-greet

#!/Users/rick446/src/arborian-classes/data/MyProject/env/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'MyProject','console_scripts','my-greet'
__requires__ = 'MyProject'
import re
import sys
from pkg_resources import load_entry_point

if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(
        load_entry_point('MyProject', 'console_scripts', 'my-greet')()
    )


# Registering with PyPI

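You'll need to create an account at https://pypi.org.

Then give the project a unique name, build a source distribution, and upload it with `twine`, which prompts for your PyPI credentials unless they are already configured: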

In [28]:
%%file data/MyProject/setup.cfg
[metadata]
;; change name to make it unique
name = ProductionalizingProject-1
url = https://github.com/DevelopIntelligence
author = Some Person
author_email = somebody@example.com
version = 0.3
description = This should be a short description of our project
long_description = file: README.md
classifiers =
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.7
keywords = test, class

[options]
packages = mypackage
install_requires = 
    jupyter
    flask
    numpy>=1.16.0,<1.17
    
[options.package_data]
* = *.txt

Overwriting data/MyProject/setup.cfg
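
When picking a unique name, keep in mind that package indexes compare names in normalized form: PEP 503 lowercases the name and collapses runs of `-`, `_`, and `.` into a single `-`. A minimal sketch of that rule (the function name `normalize` is ours):

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r'[-_.]+', '-', name).lower()


print(normalize('ProductionalizingProject-1'))
print(normalize('Productionalizing_Project.1'))
```

Two names that normalize to the same string refer to the same project on the index.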


In [29]:
%%bash
cd data/MyProject
rm dist/*   # clean up old distributions
source env/bin/activate
pip install twine
python setup.py sdist
twine upload dist/*

running sdist
running egg_info
creating ProductionalizingProject_1.egg-info
writing ProductionalizingProject_1.egg-info/PKG-INFO
writing dependency_links to ProductionalizingProject_1.egg-info/dependency_links.txt
writing requirements to ProductionalizingProject_1.egg-info/requires.txt
writing top-level names to ProductionalizingProject_1.egg-info/top_level.txt
writing manifest file 'ProductionalizingProject_1.egg-info/SOURCES.txt'
reading manifest file 'ProductionalizingProject_1.egg-info/SOURCES.txt'
writing manifest file 'ProductionalizingProject_1.egg-info/SOURCES.txt'
running check
creating ProductionalizingProject-1-0.3
creating ProductionalizingProject-1-0.3/ProductionalizingProject_1.egg-info
creating ProductionalizingProject-1-0.3/mypackage
copying files to ProductionalizingProject-1-0.3...
copying README.md -> ProductionalizingProject-1-0.3
copying setup.cfg -> ProductionalizingProject-1-0.3
copying setup.py -> ProductionalizingProject-1-0.3
copying ProductionalizingProject_1

You are using pip version 18.1, however version 19.0.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


# Lab

Open [packaging lab][packaging-lab]

[packaging-lab]: ./packaging-lab.ipynb