# Packaging and Distributing Code (for Python)

## Kevin Gullikson

# Why Package code?

- Make things easy on collaborators
  - They can install things more easily
  - They can all use the same code base as you
  - They can contribute to the code
  - Motivate them to switch to python

- Make things easy on YOU
  - Easily install on new computers
  - Documentation --> helps you remember what your two-year-old code does

- Get citations. Profit.

- Overall good experience for industry jobs (I presume)

# Setting up a Python Package

# Minimal Package Structure

```
root
+-- setup.py
+-- README
+-- LICENSE
+-- package_name/
|   +-- __init__.py
|   +-- foo.py
|   +-- bar.py

```

## What is the `__init__.py`?

 - It tells python that this directory is a package
 - Does not need to have anything in it - an empty file is fine
 - Lets you do:
 
```python
import package_name.foo  # an empty __init__.py does not import submodules for you
package_name.foo.foofunction()
```

- You CAN put some stuff in it, though. Putting this in the `__init__.py`:
```python
from .foo import foofunction  # explicit relative import, needed on Python 3
```
lets you do:
```python
import package_name
package_name.foofunction()
```
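
To see the effect of `__init__.py` without touching a real project, here is a minimal sketch that writes a throwaway package to a temporary directory and imports it. The names `demo_pkg`, `foo.py` and `foofunction` are hypothetical stand-ins for your own package.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (all names here are hypothetical).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, "foo.py"), "w") as f:
    f.write("def foofunction():\n    return 'hello from foo'\n")

# Re-export foofunction at the package level with a relative import.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from .foo import foofunction\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, root)
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

print(demo_pkg.foofunction())      # short form, enabled by __init__.py
print(demo_pkg.foo.foofunction())  # long form still works
```

If the `__init__.py` were left empty, only the long form would work, and only after an explicit `import demo_pkg.foo`.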

## What is the `setup.py`?

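- This tells Python how to install the package
- You have probably done this before (newer versions of setuptools discourage calling `setup.py` directly; running `pip install .` in the same directory does the same job):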

```bash
python setup.py install
```

- [Here](https://pythonhosted.org/an_example_pypi_project/setuptools.html) is a good tutorial on setting it up.
- An example of a setup.py I created:
  
```python
from setuptools import setup

setup(name='fitting_utilities',
      version='0.1.0',
      description='Various useful classes for fitting stuff.',
      author='Kevin Gullikson',
      author_email='kevin.gullikson@gmail.com',
      license='BSD',
      classifiers=[
          'Development Status :: 3 - Alpha',
          'Intended Audience :: Science/Research',
          'License :: OSI Approved :: BSD License',
          'Programming Language :: Python',
          'Topic :: Scientific/Engineering :: Astronomy',
          ],
      packages=['fitters'],
      install_requires=['numpy', 'astropy'])  # pip installs these; plain `requires` is metadata only
```

## `setup.py` arguments

- name: This is the name the project gets on PyPI (more on that later)
- classifiers: Think of these like the keywords you put in your abstract. You want to make your code searchable. A list of classifiers is available [here](https://pypi.org/classifiers/)
- packages: In the simple/standard case, it is just a list of the packages you are making available. **This is the name you import**:
```python
import fitters
```
NOT
```python
import fitting_utilities
```
  - Having different things for the 'name' and 'packages' fields can lead to [confusion](https://github.com/dfm/corner.py/issues/59)

## Some final thoughts about package setup



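- Always have a README
  - GitHub will initialize one for you when you make a repository
  - GitHub renders Markdown or reStructuredText (.rst files)
  - PyPI originally rendered only reStructuredText; it now also renders Markdown if your `setup.py` sets `long_description_content_type='text/markdown'`
  - Writing your README in .rst works in both places without extra configuration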

- Always have a LICENSE
> "Because I did not explicitly indicate a license, I declared an implicit copyright without explaining 
> how others could use my code. Since the code is unlicensed, I could theoretically assert copyright at 
> any time and demand that people stop using my code. Experienced developers won't touch unlicensed code
> because they have no legal right to use it. That's ironic, considering the whole reason I posted the 
> code in the first place was so other developers could benefit from that code. I could have easily 
> avoided this unfortunate situation if I had done the right thing and included a software license with my code."
> -- <cite>Jeff Atwood, [codinghorror](http://blog.codinghorror.com/pick-a-license-any-license/) </cite>

- The main choices are:
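  - BSD/MIT: Permissive. Anyone can use the code for any purpose, as long as they keep the copyright notice. The only other legalese is a disclaimer saying that I don't guarantee this will work.
  - GPL: Copyleft. Anyone can use the code, but anything they distribute that is built on it **must** be GPL-licensed as well.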
  - Choosing between the two gets nerds as riled up as vim vs emacs.
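
The README does double duty: PyPI displays whatever `setup()` receives as `long_description`, so `setup.py` usually reads the README from disk. Below is a minimal sketch of that pattern. The helper name `read_readme` is hypothetical, and the temporary file stands in for a real `README.rst`.

```python
import os
import tempfile


def read_readme(path):
    """Return the README text, or an empty string if the file is missing."""
    if not os.path.exists(path):
        return ""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Demonstration with a throwaway file standing in for the real README.rst.
demo_dir = tempfile.mkdtemp()
demo_readme = os.path.join(demo_dir, "README.rst")
with open(demo_readme, "w", encoding="utf-8") as f:
    f.write("fitting_utilities\n=================\n\nVarious useful classes.\n")

long_description = read_readme(demo_readme)
# In setup.py this would be passed on as: setup(..., long_description=long_description)
print(long_description.splitlines()[0])
```

Falling back to an empty string is a design choice for this sketch: it keeps installation from failing if the README is missing from a source distribution.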

# Documenting Your Code

# Readme
- Should contain general information about the package
- Should include how to install it (even if that is just `python setup.py install`)
- A simple usage example is a good idea too

# Docstrings

- One of the best parts of python. **USE THEM**
- From scikit-learn:

```python
class BayesianRidge(LinearModel, RegressorMixin):
    """Bayesian ridge regression
    Fit a Bayesian ridge model and optimize the regularization parameters
    lambda (precision of the weights) and alpha (precision of the noise).
    Read more in the :ref:`User Guide <bayesian_regression>`.
    Parameters
    ----------
    n_iter : int, optional
        Maximum number of iterations.  Default is 300.
    tol : float, optional
        Stop the algorithm if w has converged. Default is 1.e-3.
    alpha_1 : float, optional
        Hyper-parameter : shape parameter for the Gamma distribution prior
        over the alpha parameter. Default is 1.e-6
    alpha_2 : float, optional
        Hyper-parameter : inverse scale parameter (rate parameter) for the
        Gamma distribution prior over the alpha parameter.
        Default is 1.e-6.
    lambda_1 : float, optional
        Hyper-parameter : shape parameter for the Gamma distribution prior
        over the lambda parameter. Default is 1.e-6.
    lambda_2 : float, optional
        Hyper-parameter : inverse scale parameter (rate parameter) for the
        Gamma distribution prior over the lambda parameter.
        Default is 1.e-6
    compute_score : boolean, optional
        If True, compute the objective function at each step of the model.
        Default is False
    fit_intercept : boolean, optional
        whether to calculate the intercept for this model. If set
        to false, no intercept will be used in calculations
        (e.g. data is expected to be already centered).
        Default is True.
    normalize : boolean, optional, default False
        If True, the regressors X will be normalized before regression.
    copy_X : boolean, optional, default True
        If True, X will be copied; else, it may be overwritten.
    verbose : boolean, optional, default False
        Verbose mode when fitting the model.
    Attributes
    ----------
    coef_ : array, shape = (n_features)
        Coefficients of the regression model (mean of distribution)
    alpha_ : float
       estimated precision of the noise.
    lambda_ : array, shape = (n_features)
       estimated precisions of the weights.
    scores_ : float
        if computed, value of the objective function (to be maximized)
    Examples
    --------
    >>> from sklearn import linear_model
    >>> clf = linear_model.BayesianRidge()
    >>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
    ... # doctest: +NORMALIZE_WHITESPACE
    BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False,
            copy_X=True, fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06,
            n_iter=300, normalize=False, tol=0.001, verbose=False)
    >>> clf.predict([[1, 1]])
    array([ 1.])
    Notes
    -----
    See examples/linear_model/plot_bayesian_ridge.py for an example.
    """
```

## Docstring conventions

- Give a short description of what the class/function/method does
```python
"""Bayesian ridge regression
    Fit a Bayesian ridge model and optimize the regularization parameters
    lambda (precision of the weights) and alpha (precision of the noise).
"""
```

- Describe each parameter (both what the variable type should be and what the parameter means)
```python
"""
    Parameters
    ----------
    n_iter : int, optional
        Maximum number of iterations.  Default is 300.
    tol : float, optional
        Stop the algorithm if w has converged. Default is 1.e-3.
    ...
"""
```

- Explain what the function returns, if applicable
```python
"""
        Returns
        -------
        C : array, shape = (n_samples,)
            Returns predicted values.
"""
```

- Examples are nice, but I wouldn't say necessary
```python
"""
    Examples
    --------
    >>> from sklearn import linear_model
    >>> clf = linear_model.BayesianRidge()
    >>> clf.fit([[0,0], [1, 1], [2, 2]], [0, 1, 2])
    ... # doctest: +NORMALIZE_WHITESPACE
    BayesianRidge(alpha_1=1e-06, alpha_2=1e-06, compute_score=False,
            copy_X=True, fit_intercept=True, lambda_1=1e-06, lambda_2=1e-06,
            n_iter=300, normalize=False, tol=0.001, verbose=False)
    >>> clf.predict([[1, 1]])
    array([ 1.])
"""
```
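
Putting the conventions together on something small: the sketch below documents a hypothetical function, `rescale`, in the same numpydoc layout, and then runs the `Examples` section as a test with the standard-library `doctest` module. The function and its parameters are made up for illustration.

```python
import doctest


def rescale(values, factor=2.0):
    """Multiply every value in a sequence by a constant factor.

    Parameters
    ----------
    values : list of float
        The numbers to rescale.
    factor : float, optional
        The multiplier. Default is 2.0.

    Returns
    -------
    scaled : list of float
        A new list with each element multiplied by ``factor``.

    Examples
    --------
    >>> rescale([1.0, 2.5])
    [2.0, 5.0]
    >>> rescale([1.0, 2.5], factor=0.0)
    [0.0, 0.0]
    """
    return [v * factor for v in values]


# Extract the examples from the docstring and run them as tests.
parser = doctest.DocTestParser()
test = parser.get_doctest(rescale.__doc__, {"rescale": rescale}, "rescale", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

Examples that double as tests are one argument for writing them: they fail loudly when the code and the documentation drift apart.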

# Sphinx (+readthedocs)
- Can probably skip this step *unless you are widely publishing the code*
- Builds documentation from reStructuredText
- Similar stuff to the README on the main page
- Include tutorials
- Auto-documentation of the API

# Sphinx setup
```
root
+-- setup.py
+-- README
+-- LICENSE
+-- package_name/
|   +-- __init__.py
|   +-- foo.py
|   +-- bar.py
+-- docs/
|   +-- conf.py
|   +-- index.rst
|   +-- foo.rst
|   +-- baz.rst
```



## `index.rst`
- This is what people will open the documentation to. 
- My [index.rst](http://telfit.readthedocs.org/en/latest/):

```
Welcome to TelFit's documentation!
==================================

Contents:

.. toctree::
   :maxdepth: 2

   Intro
   Installation
   Tutorial
   API
   Updating the atmosphere profile <GDAS_atmosphere>

```

## API (application programming interface)
- Set up in the conf.py, which is mostly generated for you when you run:

```bash
sphinx-quickstart
```
- Include the autodoc extension:

```python

# -- General configuration ------------------------------------------------

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    'sphinx.ext.autodoc',
    'sphinx.ext.coverage',
    'sphinx.ext.mathjax',
]

```
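
autodoc works by importing your package and reading its docstrings, so `conf.py` has to be able to find the package. A minimal sketch, assuming `conf.py` lives in `root/docs/` as in the layout above and that Sphinx is run from that directory:

```python
import os
import sys

# Assumption: conf.py sits in root/docs/, so the package directory is one level up.
sys.path.insert(0, os.path.abspath('..'))
```

With that in place, an .rst page such as the API page can pull in the docstrings with the `.. automodule:: package_name.foo` directive and its `:members:` option.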


## Readthedocs integration
- Set up an account, connect to your github
- Add commit hooks to rebuild the documentation every time you commit to master
- Mostly will just work if the documentation builds on your own computer
- readthedocs won't install random code (understandably)
- You will almost certainly need to hack the conf.py file to make it work on readthedocs, for example by mocking the modules it cannot install:

```python

# Mock a few modules so that autodoc can import the package without them
import sys
from unittest.mock import MagicMock  # Python 3; on Python 2 use `from mock import MagicMock`


class Mock(MagicMock):
    @classmethod
    def __getattr__(cls, name):
        # Any attribute lookup on a mocked module returns another mock
        return MagicMock()

MOCK_MODULES = ['FittingUtilities', 'numpy', 'scipy', 'matplotlib', 'scipy.interpolate', 'numpy.polynomial',
                'lockfile', 'scipy.optimize', 'astropy', 'pysynphot', 'fortranformat', 'cython', 'requests',
                'scipy.linalg', 'matplotlib.pyplot']
sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)
```
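
Sphinx 1.3 and later also ship a built-in alternative to the hand-rolled mock: the `autodoc_mock_imports` setting in `conf.py`. A minimal sketch, using a shortened and purely illustrative subset of the module list above:

```python
# In docs/conf.py. Sphinx replaces these modules with mocks while autodoc runs.
# The list is an illustrative subset; use the modules your own package imports.
autodoc_mock_imports = ['numpy', 'scipy', 'matplotlib', 'astropy']
```

Whether this is enough depends on how your package uses those modules at import time, so treat it as a starting point rather than a guaranteed fix.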

# Distributing Your Code

# PyPI
- The de facto standard for installing Python packages.

```bash
pip install astropy
```

- Works great for pure-python packages, especially if they don't depend on anything too complicated.
- Works for more complicated things as well, but can get hairy...


## Tutorial (mostly stolen from [here](http://peterdowns.com/posts/first-time-with-pypi.html))
- One-time stuff:
  1. Create an account on [pypi](https://pypi.org/account/register/) *and* [pypi testing](https://test.pypi.org/account/register/)
  2. Create a .pypirc file in your home directory (to make your life easier):

```
# this tells the upload tools which package indexes you can push to
[distutils]
index-servers =
  pypi
  pypitest

# PyPI now only accepts API tokens for uploads:
# the username is literally __token__ and the password is the token itself
[pypi]
repository: https://upload.pypi.org/legacy/
username: __token__
password: your_api_token

[pypitest]
repository: https://test.pypi.org/legacy/
username: __token__
password: your_test_api_token
```

## Tutorial (continued)
- For each package
  1. Make a setup.py if you haven't already. Make sure there is a version number in there!
  2. Build a source distribution (it ends up in `dist/`). The old `python setup.py register` step is no longer needed: the first upload creates the project.

     ```bash
     rm -rf dist/   # start clean so only the current version gets uploaded
     python setup.py sdist
     ```

  3. Upload to pypitest. `python setup.py upload` no longer works against PyPI, so use twine (`pip install twine`):

     ```bash
     twine upload -r pypitest dist/*
     ```

  4. Test:

     ```bash
     # Make a new environment to isolate this from the installation you probably already have working
     conda create -n package_test python=3 numpy astropy ...

     # Switch to the new environment (older conda versions: source activate package_test)
     conda activate package_test

     # Install your new package
     pip install -i https://test.pypi.org/simple/ <package name>

     # Test that it works. At the very least, make sure you can import the package
     python -c 'import package_name'
     ```

     - If it works, move on.
     - If not, figure out what went wrong, *increment the version number*, and start again from step 2

  5. Upload to pypi (de-increment the version number if you had to update it while testing)

     ```bash
     rm -rf dist/   # drop the test builds
     python setup.py sdist
     twine upload -r pypi dist/*
     ```
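
The import check in the test step can also be written as a small script, which is handy once a package has several modules to verify. A minimal sketch: the helper name `can_import` is hypothetical, and `json` stands in for your own package name.

```python
import importlib


def can_import(module_name):
    """Return True if `module_name` imports without raising ImportError."""
    try:
        importlib.import_module(module_name)
    except ImportError as err:
        print("Could not import {}: {}".format(module_name, err))
        return False
    return True


# 'json' stands in for your package name; the second name is deliberately bogus.
print(can_import('json'))
print(can_import('surely_not_an_installed_package_xyz'))
```

Run it inside the fresh test environment, not your everyday one, or a copy of the package you already had installed will mask a broken upload.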

# Anaconda

- Quickly starting to rival pypi for installation
- Installs *binaries* rather than compiling from source
- Knows more about dependencies, can install other things in the right order
- Can easily install non-python things too (like the HDF5 library needed by h5py/pytables)

# Tutorial
- I will assume you have your package on pypi already. There are other ways to make conda packages...
- One-time stuff:
  1. Make an account on anaconda.org
  2. Log in:
  
    ```bash
    anaconda login
    # Enter username and password when prompted
    ```

# Tutorial (continued)
- For every package
  1. cd to your home directory
  2. Run `conda skeleton pypi <package_name>`. This makes a directory called package_name
  3. Look at the meta.yaml in the new package_name directory. Make sure the information, and especially the required packages, are correct
  4. If you have the "install_requires" keyword in your setup.py, you may need to edit the build.sh to have:
  
    ```bash
    $PYTHON setup.py install --single-version-externally-managed --record=/tmp/record.txt
    ```
    
    (that fix came from [this issue](https://groups.google.com/a/continuum.io/forum/#!topic/conda/ZKdP5BujriA))
  5. Build the package. The full path name to the package will be printed to the screen
    
    ```bash
    conda build <package_name>
    ```
  6. Convert to work for other platforms (this only works for pure-python packages, so I am not sure it is guaranteed to work for yours)
  
    ```bash
    conda convert -f --platform all full/path/to/package -o output_directory
    ```
  7. Upload all of the packages to anaconda.org (must be logged in)
  
    ```bash
    for f in output_directory/*/*
    do
        anaconda upload "$f"   # quoted so paths with spaces survive
    done
    ```