# Prototyping to tested code

## About the author

I am a physicist by training and am now working as a data scientist. In my work, I focus on building robust software and putting code in production. Python has been my tool of choice for about 8 years.

## An introduction to ipytest

Jupyter notebooks are a great environment for prototyping solutions and exploring their design. Turning these solutions into reusable components usually requires moving them out of the notebook environment into external Python packages. Often, at this stage, the code is refactored and tests are written.

In this talk, I will demo ipytest, a small tool for running tests inside notebooks. It supports pytest as well as the standard unittest framework. With it, prototypes can start in a notebook, and the tests can be developed alongside the code in a highly interactive environment. As the code grows, it can be moved transparently out of notebooks and turned into reusable components. By bringing test support to the notebook environment, ipytest bridges the artificial gap between notebooks and reusable components.

# Prototyping to tested code

## An introduction to [`ipytest`](https://github.com/chmp/ipytest/)

<p>
    <div>Christopher Prohm (<a href="https://twitter.com/@c_prohm">@c_prohm</a>)</div>
    <div>PyCon.DE 2018, Karlsruhe</div>
</p>

<br/>

<p>
    <div>Github: <a href="https://github.com/chmp/ipytest">https://github.com/chmp/ipytest</a></div>
</p>

## Disclaimer

**The views and opinions expressed in this talk are mine and do not necessarily reflect those of my employer. The content and materials are my own.**

# About Me

<br/>
<p>
    <div>Physicist by training, turned data scientist.</div>
    <div>Working at Volkswagen Data:Lab in Munich.</div>
</p>

<br/>
<p>
    <span class="fragment fade-in">Avid user of Jupyter notebooks.</span>
    <span class="fragment fade-in">Also, conflicted about Jupyter notebooks.</span>
</p>

In [1]:
try:
    from PIL import Image, ImageDraw

except ImportError:
    pass

else:
    img1 = Image.open('../resources/Joel_Grus_Tweet.png')
    img2 = Image.open('../resources/refactoring.png')
    img3 = Image.open('../resources/state.png')

    scale = min(img1.size[0] / img2.size[0], img1.size[1] / img2.size[1])
    img2 = img2.resize((int(img2.size[0] * scale), int(img2.size[1] * scale)))

    w = img2.size[0] + img1.size[0] // 2 + 10
    h = img2.size[1] + img1.size[1] // 2

    composite = Image.new('RGB', (w, h))
    composite.paste((255, 255, 255), (0, 0, *composite.size))
    composite.paste(img3, (img1.size[0] // 2 - 260, img1.size[1] // 2 - 200))
    composite.paste(img2, (img1.size[0] // 2 - 20, img1.size[1] // 2 + 100))
    composite.paste(img1, (0, 50, img1.size[0], img1.size[1] + 50))

    draw = ImageDraw.Draw(composite)

    for i in range(3):
        draw.rectangle((635 - i, 375 - i, 870 + i, 475 + i), outline='red')


    for i in range(3):
        draw.rectangle((630 - i, 30 - i, 900 + i, 295 + i), outline='red')

    composite.save('../resources/composite.png')

# Jupyter notebooks

![](../resources/composite.png)

# My take on it

Notebooks *are* hard:

- global state is confusing to me
- git and notebooks do not mesh well in my view
- my notebooks seem to become messier over time

<div class="fragment fade-in">
<p>
    But overall, notebooks increase my <b>productivity</b> enormously:
    <ul>
        <li>rapid feedback and exploration</li>
        <li>documentation (incl. math)</li>
    </ul>
</p>
</div>

# Notebook vs. Modules?

<br/>
<p class="fragment fade-in">Use what is most efficient.</p>
<br/>
<p>
    <div class="fragment fade-in">
        <div>Combine the best of both worlds,</div>
        <div>move code progressively out of notebooks.</div>
    </div>
</p>
<br/>
<p class="fragment fade-in">Use the same libraries &amp; tooling inside and outside notebooks.</p>
<br/>

# The notebook-module continuum

<div style="display: grid; grid-template-columns: 20% 20% 20% 20% 20%; grid-template-rows: 2em 2em 0.5em 2em 2em 2em 2em;">
    <span style="grid-row: 2; grid-column-start: 1; grid-column-end: 1; text-align: left;">
        <b>notebook</b>
    </span>
    <span style="grid-row: 2; grid-column-start: 5; grid-column-end: 5; text-align: right;">
        <b>modules</b>
    </span>
    <span 
        style="grid-row: 4; grid-column-start: 1; grid-column-end: 3; text-align: left;" 
        class="fragment fade-in"
        data-fragment-index="1"
    >
        <a href="https://matplotlib.org">matplotlib</a>, 
        <a href="https://bokeh.pydata.org">bokeh</a>, 
        <a href="https://altair-viz.github.io/">altair</a>, ...
    </span>
    <span style="grid-row: 5; grid-column: 1; text-align: left;">
        <span class="fragment fade-in" data-fragment-index="2">
            <span class="fragment fade-out" data-fragment-index="3">
                <a href="https://dask.org/">dask</a>, 
                <a href="https://spark.apache.org/">pyspark</a>
            </span>
        </span>
    </span>
    <span style="grid-row: 5; grid-column: 2; text-align: center;">
        <span class="fragment fade-in" data-fragment-index="3">
            <a href="https://dask.org/">dask</a>, 
            <a href="https://spark.apache.org/">pyspark</a>
        </span>
    </span>
    <span style="grid-row: 6; grid-column: 1; text-align: left;" class="fragment fade-in" data-fragment-index="4">
        <a class="fragment fade-out" href="https://mlflow.org" data-fragment-index="5">mlflow</a>
    </span>
    <span style="grid-row: 6; grid-column: 4; text-align: center;">
        <a class="fragment fade-in" href="https://mlflow.org" data-fragment-index="5">mlflow</a>
    </span>
    <span style="grid-row: 7; grid-column-start: 1; grid-column-end: 5; text-align: left;">
        <span class="fragment fade-in" data-fragment-index="6">
            <span class="fragment fade-out" data-fragment-index="7">
                <a href="https://panel.pyviz.org">panel</a> (*)
            </span>
        </span>
    </span>
    <span style="grid-row: 7; grid-column-start: 3; grid-column-end: 5; text-align: center;">
        <span class="fragment fade-in"  data-fragment-index="7">
            <a href="https://panel.pyviz.org">panel</a> (*)
        </span>
    </span>
    <span style="grid-row: 8; grid-column-start: 1; grid-column-end: 5; text-align: center;">
        <span class="fragment fade-in">...</span>
    </span>
</div>

<p style="width: 100%; text-align: center;">
    <div class="fragment fade-in">What does such a workflow look like in practice?</div>
    <div class="fragment fade-in">How does testing fit in this picture?</div>
    <div class="fragment fade-in">How does <code>ipytest</code> support <code>pytest</code> inside notebooks?</div>
</p>

# Getting Started

<div>&nbsp;</div>

<pre>
! pip install pytest       <span style="color: darkgreen;">&lt;------- Hugely popular testing framework</span>
! pip install ipytest      <span style="color: darkgreen;">&lt;------- Integration of pytest and notebooks</span>
                           <span style="color: darkgreen;">         Full disclosure: I am the author.</span>
</pre>


In [2]:
import ipytest.magics                  # <--- enable IPython magics


import ipytest                         # <--- enable pytest's assert rewriting
ipytest.config.rewrite_asserts = True  # 


__file__ = "IPyTestIntro.ipynb"        # <--- make the notebook filename available 
                                       #      to ipytest

In [3]:
%%run_pytest[clean] -qq

def test_example():
    assert [1, 2, 3] == [1, 2, 3]

.                                                                [100%]


# The main `ipytest` API

<br/>

<pre>
%%<span class="fragment highlight-current-red" data-fragment-index="1">run_pytest</span>[<span class="fragment highlight-current-red" data-fragment-index="2">clean</span>] <span class="fragment highlight-current-red" data-fragment-index="3">-qq</span>
     ^         ^     ^
     +---------|-----|---- <span class="fragment highlight-current-red" data-fragment-index="1">execute tests with pytest</span>
               |
               +-----|---- <span class="fragment highlight-current-red" data-fragment-index="2">delete any previously defined tests</span>
                     |
                     +---- <span class="fragment highlight-current-red" data-fragment-index="3">arbitrary pytest arguments</span>
</pre>

<br/><br/>

<div class="fragment fade-in">
    Full docs at <a href="https://github.com/chmp/ipytest">https://github.com/chmp/ipytest</a>.
</div>
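
Why the `[clean]` option matters can be seen in a simplified, stdlib-only model of a notebook namespace. This is a sketch and not `ipytest`'s implementation: the helpers `run_cell`, `collect_tests` and `clean_tests` are hypothetical names, and a plain dict stands in for the notebook's globals.

```python
# A plain dict stands in for the notebook's global namespace (assumption).
namespace = {}


def run_cell(source):
    """Execute a cell's source in the shared namespace, as a notebook would."""
    exec(source, namespace)


def collect_tests():
    """Return the names of all callables that look like tests."""
    return sorted(
        name
        for name, obj in namespace.items()
        if name.startswith("test") and callable(obj)
    )


def clean_tests():
    """Hypothetical stand-in for [clean]: drop all previously defined tests."""
    for name in collect_tests():
        del namespace[name]


# First version of the cell.
run_cell("def test_old_name():\n    assert True\n")
# The cell is edited, the test is renamed, and the cell is run again.
run_cell("def test_new_name():\n    assert True\n")
stale = collect_tests()  # the old test is still defined and would still run

# Cleaning first leaves only the tests of the current cell.
clean_tests()
run_cell("def test_new_name():\n    assert True\n")
fresh = collect_tests()

print(stale)
print(fresh)
```

Without the cleaning step, a renamed or deleted test keeps running from hidden state, which is exactly the kind of confusion global state causes in notebooks.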

# `pytest` support

<div>&nbsp;</div>

`pytest` is doing all the heavy lifting 😀. Most (all?) `pytest` features work out of the box.


- `@pytest.mark.*`
- `@pytest.fixture`
- `--pdb`
- `-l`
- ...
- Assertion rewriting


# Assertion rewriting

In [4]:
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 0]
    #                           error at ^^^^^^^^^^^^^

In [5]:
%%run_pytest[clean] -qq

def test_keep_odds():
    assert keep_odds([1, 2, 3, 4]) == [1, 3]

F                                                                [100%]
____________________________ test_keep_odds ____________________________

    def test_keep_odds():
>       assert keep_odds([1, 2, 3, 4]) == [1, 3]
E       assert [2, 4] == [1, 3]
E         At index 0 diff: 2 != 1
E         Full diff:
E         - [2, 4]
E         + [1, 3]

<ipython-input-5-757929023375>:3: AssertionError


# Parametrize

In [6]:
import pytest

In [7]:
%%run_pytest[clean] -qq

@pytest.mark.parametrize('input, expected', [
    ([0.5, 1, 1.5], 3), 
    ([2, 2.5], 4.5),
])
def test_sum(input, expected):
    actual = sum(input)
    assert actual == pytest.approx(expected)

..                                                               [100%]


# Fixtures

In [8]:
%%run_pytest[clean] -qq

@pytest.fixture
def my_fixture():
    return True
    
def test_my_fixture(my_fixture):
    assert my_fixture is True

.                                                                [100%]


In [9]:
%%run_pytest[clean] -qq

def test_with_tmpdir(tmpdir):
    tmpdir.join("foo").write("bar")

.                                                                [100%]


# Debugger

In [10]:
%%run_pytest[clean] -qq -x --pdb 

def test_pdb():
    l = [1, 2, 3]
    assert l == []

F
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> traceback >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

    def test_pdb():
        l = [1, 2, 3]
>       assert l == []
E       assert [1, 2, 3] == []
E         Left contains more items, first extra item: 1
E         Full diff:
E         - [1, 2, 3]
E         + []

<ipython-input-10-26043487c447>:4: AssertionError
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> entering PDB >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> <ipython-input-10-26043487c447>(4)test_pdb()
-> assert l == []
(Pdb) q



Exit: Quitting debugger


!!!!!!!!!!!!!!! _pytest.outcomes.Exit: Quitting debugger !!!!!!!!!!!!!!!


# How does `ipytest` work?

<p>
<div>Small package.</div>
<div class="fragment fade-in">Creative use of the extension APIs of pytest and Jupyter.</div>
</p>

<div class="fragment fade-in">
<p>
<pre>
<span style="color:darkgreen;">&#35; pytest plugin to make notebooks look like modules</span>
<span style="color:darkgreen;">class</span> <span style="color: darkblue;">ModuleCollectorPlugin</span>(object):
    <span style="color:darkgreen;">def</span> <span style="color: darkblue;">pytest_collect_file</span>(self, parent, path):
        ...
</pre>
</p>
</div>

<div class="fragment fade-in">
<p>
<pre>
<span style="color:darkgreen;">&#35; ipython plugin to rewrite asserts</span>
shell = get_ipython()
shell.ast_transformers.append(...)
</pre>
</p>
</div>
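
To give a feeling for what an AST transformer can do with `assert` statements, here is a minimal sketch using only the standard library (Python 3.9+ for `ast.unparse`). It is not pytest's rewriting, which is far more thorough and does not evaluate operands a second time. The class name `AssertReporter` and the message format are assumptions made for illustration.

```python
import ast


class AssertReporter(ast.NodeTransformer):
    """Add a failure message with both operand values to simple comparisons."""

    def visit_Assert(self, node):
        test = node.test
        # Only handle `assert left <op> right` without an explicit message.
        if (
            node.msg is None
            and isinstance(test, ast.Compare)
            and len(test.comparators) == 1
        ):
            src = ast.unparse(test)
            left = ast.unparse(test.left)
            right = ast.unparse(test.comparators[0])
            # The message expression is only evaluated when the assert fails.
            template = (
                f"{src!r} + ': left=' + repr({left}) + ', right=' + repr({right})"
            )
            node.msg = ast.parse(template, mode="eval").body
        return node


source = """
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 0]  # deliberate bug

def test_keep_odds():
    assert keep_odds([1, 2, 3, 4]) == [1, 3]
"""

tree = AssertReporter().visit(ast.parse(source))
ast.fix_missing_locations(tree)

scope = {}
exec(compile(tree, "<cell>", "exec"), scope)

message = None
try:
    scope["test_keep_odds"]()
except AssertionError as exc:
    message = str(exc)

print(message)
```

An object with a `visit` method like this one is the kind of thing that can be appended to `shell.ast_transformers`, so that every cell is transformed before it is executed.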

# Prototyping to Production

Navigating the notebook / module continuum

# Directory layout

<p>
<pre>
notebooks/
notebooks/IPyTestIntro.ipynb
</pre>
</p>

<div class="fragment fade-in">
<p>
Requirements
<pre>
Pipfile          <span style="color: darkgreen">&#35; &lt;---- abstract</span>
Pipfile.lock     <span style="color: darkgreen">&#35; &lt;---- concrete</span>
</pre>
</p>
</div>

<div class="fragment fade-in">
<p>
Packaging
<pre>
setup.py
src/             <span style="color: darkgreen">&#35; &lt;---- source code</span>
</pre>
</p>
</div>

<div class="fragment fade-in">
<p>
Tests
<pre>
tests/
</pre>
</p>
</div>

# `Pipfile` & `pipenv`

<br/>
<pre>
<span style="color: darkblue">[packages]</span>
ipytest = "*"
pytest = "*"
ipytest-demo = {editable = true, path = "."}
<span style="color:darkgreen">&#35;              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
&#35;               make local module editable
&#35; ...
</span>
<span style="color: darkblue">[scripts]</span>
test = "pytest tests"
</pre>
<br/>

# `setup.py`

<br/>

<pre>
<span style="color:darkgreen">from</span> setuptools <span style="color:darkgreen">import</span> setup, PEP420PackageFinder

setup(
    name=<span style="color: darkred">'ipytest-demo'</span>,
    version=<span style="color: darkred">'0.0.0'</span>,
    py_modules=[<span style="color: darkred">"keep_odds"</span>],
    <span style="color: darkgreen">&#35; ^^^ when using modules (credit: @tmuxbee)</span>
    <span style="color: darkgreen">&#35;</span>
    <span style="color: darkgreen">&#35; alternative for packages:</span>
    <span style="color: darkgreen">&#35; packages=PEP420PackageFinder.find('src'),</span>
    package_dir={<span style="color: darkred">''</span>: <span style="color: darkred">'src'</span>},
)
</pre>

<br/>

# From notebooks to modules (1/4)

Write the code and explore it inside notebooks

In [11]:
# Write your functionality
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]

In [12]:
# Interactive Exploration
keep_odds([1, 2, 3, 4, 5, 6])

[1, 3, 5]

# From notebooks to modules (2/4)
Write tests

In [13]:
# Write your functionality
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]

In [14]:
# Interactive Exploration
keep_odds([1, 2, 3, 4, 5, 6])

[1, 3, 5]

In [15]:
%%run_pytest[clean] -qq

def test_keep_odds():
    assert keep_odds([1, 2, 3, 4, 5, 6]) == [1, 3, 5]

.                                                                [100%]
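
Once the first test passes, further input/output pairs are cheap to add. The sketch below lists a few edge cases I would consider. The test names are hypothetical, and the plain function calls at the end stand in for the test runner so that the example runs without pytest.

```python
def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]


def test_keep_odds_empty():
    assert keep_odds([]) == []


def test_keep_odds_negative():
    # In Python, -3 % 2 == 1, so negative odd numbers are kept as well.
    assert keep_odds([-3, -2, -1, 0]) == [-3, -1]


def test_keep_odds_accepts_generators():
    # Any iterable works, not only lists.
    assert keep_odds(n for n in range(5)) == [1, 3]


# Outside of pytest, calling the functions is enough to run them.
for test in (
    test_keep_odds_empty,
    test_keep_odds_negative,
    test_keep_odds_accepts_generators,
):
    test()
```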


# From notebooks to modules (3/4)

Move the code to a module, continue experimenting with tests inside the notebook

In [16]:
!cat ../src/keep_odds.py


def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]


In [17]:
%%run_pytest[clean] -qq

# reload the module
ipytest.reload('keep_odds')
from keep_odds import keep_odds


def test_keep_odds():
    assert keep_odds([1, 2, 3, 4, 5, 6]) == [1, 3, 5]

.                                                                [100%]
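
The reload step matters because Python caches imported modules. The following sketch shows the effect with the standard library's `importlib.reload`. That `ipytest.reload` builds on the same mechanism is an assumption here, and the module name `keep_odds_demo` is made up for the example.

```python
import importlib
import pathlib
import sys
import tempfile

# Skip bytecode caching so that quick successive edits are always picked up.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    module_path = pathlib.Path(tmp) / "keep_odds_demo.py"  # hypothetical module
    module_path.write_text(
        "def keep_odds(iterable):\n"
        "    return [item for item in iterable if item % 2 == 0]\n"  # buggy draft
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("keep_odds_demo")
        # Equivalent to `from keep_odds_demo import keep_odds`.
        old_keep_odds = module.keep_odds
        before = module.keep_odds([1, 2, 3, 4])

        # Fix the bug on disk, as one would in an editor.
        module_path.write_text(
            "def keep_odds(iterable):\n"
            "    return [item for item in iterable if item % 2 == 1]\n"
        )

        importlib.reload(module)
        after = module.keep_odds([1, 2, 3, 4])
        # Names imported before the reload still point to the old function.
        stale = old_keep_odds([1, 2, 3, 4])
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("keep_odds_demo", None)

print(before, after, stale)
```

This is also why the cell above repeats the `from keep_odds import keep_odds` line after reloading: a name imported earlier would otherwise keep referring to the old code.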


# From notebooks to modules (4/4)

Move everything outside the notebook

In [18]:
!cat ../src/keep_odds.py


def keep_odds(iterable):
    return [item for item in iterable if item % 2 == 1]


In [19]:
!cat ../tests/test_keep_odds.py

from keep_odds import keep_odds


def test_keep_odds():
    assert keep_odds([1, 2, 3, 4, 5, 6]) == [1, 3, 5]


In [20]:
!pytest -qq ../tests

.                                                                [100%]


# How well does it work?

<div>
    <p><span style="color: green; font-weight: bold;">&#10003;</span> Moving code out of notebooks</p>
    <ul>
        <li>Development packages &amp; reloading</li>
        <li>More and more libraries with support for both environments</li>
    </ul>
</div>

<div class="fragment fade-in">
    <p><span style="color: orange; font-weight: bold;">-</span> Development inside notebooks</p>
    <ul>
        <li>More support to reason about global state</li>
        <li>Integration into notebooks of more libraries</li>
        <li>Better tooling (type checking, completion, refactoring, ...)</li>
    </ul>
</div>


<div class="fragment fade-in">
    <p><span style="color: red; font-weight: bold;">X</span> Keeping notebooks &amp; modules in sync</p>
    <ul>
        <li>Moving code back into notebooks</li>
        <li>Regression checking for notebooks (papermill?)</li>
        <li>Notebook &amp; package aware tools</li>
        <li>...</li>
    </ul>
</div>

# Conclusion

<p>
    <div>Notebooks offer a very effective environment for rapid iteration</div>
    <div class="fragment fade-in" style="padding-left:1em; padding-top: 1em;">Testing code interactively makes it quick to create test input/output pairs</div>
</p>

<p style="padding-top: 1em;">
    <div class="fragment fade-in">Notebooks can become cumbersome for large code bases</div>
    <div class="fragment fade-in" style="padding-left:1em; padding-top: 1.0em;">&nbsp;&nbsp;Move code out of notebooks progressively</div>
    <div class="fragment fade-in" style="padding-left:1em; padding-top: 0.5em;">&nbsp;&nbsp;Use the same libraries &amp; tooling.</div>
    <div class="fragment fade-in" style="padding-left:1em; padding-top: 0.5em;">&nbsp;&nbsp;For testing: <code>ipytest</code> &amp; <code>pytest</code></div>
</p>

<p style="padding-top: 1em;">
    <div class="fragment fade-in">Caveat: Hidden state requires some care (<code>%%run_pytest[clean]</code>, <code>reload</code>)</div>
</p>

<br/>
<div class="fragment fade-in">
<p>
    <div>Install: <code>pip install pytest ipytest</code></div>
    <div>Twitter <a href="https://twitter.com/@c_prohm">@c_prohm</a></div>
    <div>Github: <a href="https://github.com/chmp/ipytest">https://github.com/chmp/ipytest</a></div>
</p>
</div>