# Reproducibility & Continuous Integration

## Notebooks 

Notebooks offer a fairly self-contained approach to reproducibility (see the 2018 article "The Scientific Paper Is Obsolete" in The Atlantic: https://www.theatlantic.com/science/archive/2018/04/the-scientific-paper-is-obsolete/556676/).

When the first gravitational wave detections were announced in 2016, Jupyter notebooks were front and center, allowing the public to (basically) reproduce the results from raw data.

https://losc.ligo.org/s/events/GW150914/LOSC_Event_tutorial_GW150914.html

## Binder 

Letting people interact with your notebooks, not just download them, is a good next step.

Mybinder (https://mybinder.org/) is a hosted (via UC Berkeley) execution layer for your public notebooks.

<img src="imgs/binder.png">

You then add something like this to your README:

```markdown
[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/profjsb/python-seminar/master)
```
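
The badge above follows a fixed URL pattern (`https://mybinder.org/v2/gh/<user>/<repo>/<ref>`), so the README snippet can be generated for any public GitHub repository. A minimal sketch, where the helper name `binder_badge` is made up for illustration and is not part of any Binder tooling:

```python
def binder_badge(user, repo, ref="master"):
    """Return README markdown for a Binder launch badge.

    Assumes the same URL pattern as the badge shown above:
    https://mybinder.org/v2/gh/<user>/<repo>/<ref>
    """
    launch_url = f"https://mybinder.org/v2/gh/{user}/{repo}/{ref}"
    return f"[![Binder](https://mybinder.org/badge.svg)]({launch_url})"


# Reproduce the badge markdown for this seminar's repository.
print(binder_badge("profjsb", "python-seminar"))
```

Here `ref` is the branch (or tag) that Binder should build; the example keeps `master` because that is the branch used in the badge above.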

Note that we often need to declare dependencies in an `environment.yml` file:

In [None]:
# %load https://raw.githubusercontent.com/profjsb/python-seminar/master/environment.yml
channels:
  - conda-forge
  - defaults
dependencies:
  - matplotlib
  - scipy
  - numpy
  - pandas
  - numexpr
  - bokeh
  - datashader
  - seaborn
  - pillow
  - ipywidgets
  - altair
  - plotly
  - bqplot
  - pip:
    - sphinx-gallery
    - pdvega
    - vega3



We can also have Binder execute a post-build script (`postBuild`). Put these files in the top-level directory of your repository as needed.

In [None]:
# %load https://raw.githubusercontent.com/profjsb/python-seminar/master/postBuild
#!/bin/bash
jupyter nbextension install --sys-prefix --py vega3
jupyter nbextension enable vega --py --sys-prefix
jupyter nbextension enable vega3 --py --sys-prefix


## Packaging and Testing

In [30]:
!git clone https://github.com/profjsb/PyAdder

Cloning into 'PyAdder'...
remote: Counting objects: 80, done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 80 (delta 6), reused 19 (delta 3), pack-reused 55
Unpacking objects: 100% (80/80), done.


In [None]:
!brew install tree

In [32]:
!tree PyAdder/ -a -T PyAdder -C --noreport -I ".git|*.pyc"

PyAdder/
├── .coveragerc
├── .gitignore
├── .travis
│   └── run.sh
├── .travis.yml
├── CHANGES.txt
├── LICENSE.txt
├── MANIFEST.in
├── README.md
├── adder
│   ├── __init__.py
│   └── tests
│       ├── __init__.py
│       └── test_one_number.py
├── requirements.txt
├── setup.cfg
└── setup.py


For more about this structure, see the Python Packaging User Guide (https://packaging.python.org/) and the 02 lecture in `01_Versioning_Application_Building`. The meat of this Python package is `adder/__init__.py`:

```python
__version__ = "0.0.3"
__author__ = "Josh!"

from numpy import array

def run(*args):
    return array(args).sum()
```


In [33]:
cd PyAdder/

/Users/jbloom/Classes/python-seminar/DataFiles_and_Notebooks/09_Web/PyAdder


In [34]:
!python setup.py sdist bdist_wheel

running sdist
running egg_info
creating PyAdder.egg-info
writing PyAdder.egg-info/PKG-INFO
writing dependency_links to PyAdder.egg-info/dependency_links.txt
writing top-level names to PyAdder.egg-info/top_level.txt
writing manifest file 'PyAdder.egg-info/SOURCES.txt'
reading manifest file 'PyAdder.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'PyAdder.egg-info/SOURCES.txt'
running check
creating PyAdder-0.0.3
creating PyAdder-0.0.3/PyAdder.egg-info
creating PyAdder-0.0.3/adder
creating PyAdder-0.0.3/adder/tests
copying files to PyAdder-0.0.3...
copying CHANGES.txt -> PyAdder-0.0.3
copying LICENSE.txt -> PyAdder-0.0.3
copying MANIFEST.in -> PyAdder-0.0.3
copying README.md -> PyAdder-0.0.3
copying requirements.txt -> PyAdder-0.0.3
copying setup.cfg -> PyAdder-0.0.3
copying setup.py -> PyAdder-0.0.3
copying PyAdder.egg-info/PKG-INFO -> PyAdder-0.0.3/PyAdder.egg-info
copying PyAdder.egg-info/SOURCES.txt -> PyAdder-0.0.3/PyAdder.egg-info
copying PyAdder

In [35]:
ls dist/

PyAdder-0.0.3-py3-none-any.whl  PyAdder-0.0.3.tar.gz
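
The wheel's filename encodes what it is compatible with: `py3` is the Python tag, `none` means it depends on no particular ABI, and `any` means it is not tied to a platform. A small sketch of pulling those fields apart; `parse_wheel_filename` is a hypothetical helper written for this illustration, not a function from `pip` or `wheel`:

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated fields.

    Wheel names look like
    {name}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None  # the build tag is optional and usually absent
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = parse_wheel_filename("PyAdder-0.0.3-py3-none-any.whl")
print(info["name"], info["version"], info["python"], info["abi"], info["platform"])
```

A pure-Python package like this one gets a `py3-none-any` wheel; packages with compiled extensions produce wheels whose tags name a specific interpreter, ABI, and platform.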


In [36]:
!pip install dist/PyAdder-0.0.3-py3-none-any.whl

Processing ./dist/PyAdder-0.0.3-py3-none-any.whl
Installing collected packages: PyAdder
  Found existing installation: PyAdder 0.0.1
    Uninstalling PyAdder-0.0.1:
      Successfully uninstalled PyAdder-0.0.1
Successfully installed PyAdder-0.0.3


In [38]:
import adder
print(f"version: {adder.__version__}")
adder.run(1,2,10,-1,30)

version: 0.0.2


42

## Tests

We'd like to know that our code works before we deploy. Locally we can run `pytest` (also installed under its older name, `py.test`) or:

In [39]:
!python setup.py test

running pytest
running egg_info
writing PyAdder.egg-info/PKG-INFO
writing dependency_links to PyAdder.egg-info/dependency_links.txt
writing top-level names to PyAdder.egg-info/top_level.txt
reading manifest file 'PyAdder.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'PyAdder.egg-info/SOURCES.txt'
running build_ext
platform darwin -- Python 3.6.4, pytest-3.3.2, py-1.5.2, pluggy-0.6.0 -- /Users/jbloom/anaconda3/bin/python
cachedir: .cache
rootdir: /Users/jbloom/Classes/python-seminar/DataFiles_and_Notebooks/09_Web/PyAdder, inifile: setup.cfg
plugins: cov-2.5.1
collected 3 items

adder/tests/test_one_number.py::TestOneNumber::test_deplorables PASSED   [ 33%]
adder/tests/test_one_number.py::TestOneNumber::test_floats PASSED        [ 66%]
adder/tests/test_one_number.py::TestOneNumber::test_ints PASSED          [100%]



In [40]:
# %load adder/tests/test_one_number.py
from unittest import TestCase
import math
from numpy import inf, isinf, nan, isnan
import adder

class TestOneNumber(TestCase):

    def test_floats(self):
        for num in [1617161771.7650001, math.pi, math.pi**100,
                    math.pi**-100, 3.1]:
            self.assertEqual(adder.run(num), num)

    def test_ints(self):
        for num in [-1,0,1]:
            self.assertEqual(adder.run(num), num)

    def test_deplorables(self):
        self.assertTrue(isinf(adder.run(inf)))
        self.assertTrue(isnan(adder.run(nan)))
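
The tests above each pass a single number to `adder.run`. A natural extension is to check sums of several arguments. The sketch below uses plain `assert`-style test functions, which pytest also collects; to stay self-contained it defines a stand-in `run` built on Python's `sum` (an assumption made for illustration) rather than importing the installed `adder` package:

```python
import math


def run(*args):
    """Stand-in for adder.run (an assumption for this sketch); uses the
    built-in sum so the example runs without numpy or PyAdder installed."""
    return sum(args)


def test_several_ints():
    # Same call as in the notebook cell that imports adder.
    assert run(1, 2, 10, -1, 30) == 42


def test_mixed_floats():
    # Compare floats with a tolerance instead of exact equality.
    assert math.isclose(run(0.1, 0.2, 0.3), 0.6)


def test_nan_propagates():
    # A single nan anywhere in the arguments makes the sum nan.
    assert math.isnan(run(1.0, math.nan, 2.0))


if __name__ == "__main__":
    for test in (test_several_ints, test_mixed_floats, test_nan_propagates):
        test()
    print("all checks passed")
```

In the real package these functions would live in a new file under `adder/tests/` and call `adder.run` directly.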



# Continuous Integration

"Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly."

http://docs.python-guide.org/en/latest/scenarios/ci/

**Travis-CI** is a distributed CI server that builds and tests open source projects for free. It provides multiple workers to run Python tests on and integrates seamlessly with GitHub. You can even have it comment on your pull requests to report whether a particular changeset breaks the build. So if you are hosting your code on GitHub, Travis-CI is a great and easy way to get started with Continuous Integration. See https://docs.travis-ci.com/

If we're maintaining a repo, ideally we'd like to know if our tests are passing before accepting pull requests.

Let's look at the Travis CI interface: https://travis-ci.org/profjsb/PyAdder

<img src="https://www.evernote.com/l/AUUiuc2SSGNHk63FpxmZrYb2w4nSuzUry9UB/image.png">

We can add 
```html
<img src="https://travis-ci.org/profjsb/PyAdder.svg?branch=master" data-pin-nopin="true">
```

to our `README.md` to get a green badge:

<img src="https://travis-ci.org/profjsb/PyAdder.svg?branch=master" data-pin-nopin="true">

## Code Coverage

When tests are run, we can see what part of our code is touched ("covered").
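
To make "covered" concrete, here is a toy sketch that records which lines of one function actually execute during a call, using the standard library's `sys.settrace` hook. It only illustrates the idea; real coverage tools are far more thorough, and the names `classify` and `executed_offsets` are invented for this example:

```python
import sys


def classify(x):
    if x < 0:
        return "negative"
    return "non-negative"


def executed_offsets(func, *args):
    """Run func(*args) and return the set of executed line offsets,
    counted from the function's `def` line."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace any other function
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return hit


body_offsets = {1, 2, 3}  # the three statements inside classify
hit = executed_offsets(classify, 5)  # exercises only the non-negative path
missed = sorted(body_offsets - hit)
print(f"covered {len(hit)} of {len(body_offsets)} lines; missed offsets: {missed}")
```

Calling `classify` only with a non-negative number leaves the `return "negative"` line (offset 2) unexecuted, which is the kind of gap a coverage report lists in its "Missing" column.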

In [44]:
!pip install pytest-cov



In [48]:
!py.test --cov-report html --cov-report term-missing --cov=./

platform darwin -- Python 3.6.4, pytest-3.3.2, py-1.5.2, pluggy-0.6.0 -- /Users/jbloom/anaconda3/bin/python
cachedir: .cache
rootdir: /Users/jbloom/Classes/python-seminar/DataFiles_and_Notebooks/09_Web/PyAdder, inifile: setup.cfg
plugins: cov-2.5.1
collected 3 items

adder/tests/test_one_number.py::TestOneNumber::test_deplorables PASSED   [ 33%]
adder/tests/test_one_number.py::TestOneNumber::test_floats PASSED        [ 66%]
adder/tests/test_one_number.py::TestOneNumber::test_ints PASSED          [100%]

---------- coverage: platform darwin, python 3.6.4-final-0 -----------
Name                Stmts   Miss Branch BrPart  Cover   Missing
---------------------------------------------------------------
adder/__init__.py       7      1      0      0    86%   10
Coverage HTML written to dir htmlcov




In [49]:
!open htmlcov/index.html

We can keep track of code coverage using Codecov and Travis (https://docs.codecov.io/docs/about-code-coverage).

See https://codecov.io/gh/profjsb/PyAdder for this project's coverage report.

# Making Code Citeable

Archiving a GitHub release with Zenodo gives it a DOI that others can cite:

http://ivory.idyll.org/blog/2016-using-zenodo-to-archive-github.html

https://guides.github.com/activities/citable-code/

<img src="https://www.evernote.com/l/AUVFhdB6uhFC7IAz3uSz5K-L74xYniPLyQUB/image.png">
