# Beyond Unit Testing - What is Property-based Testing?
> Do you know what differentiates a PowerPoint prepared by a data scientist from one prepared by a business analyst? Your charts! But not in a good way.

- toc: true 
- badges: true
- comments: true
- categories: [python]
- hide: true

In [56]:
#collapse-hide
# https://hypothesis.readthedocs.io/en/latest/quickstart.html
!pip install hypothesis 
%load_ext ipython_pytest

The ipython_pytest extension is already loaded. To reload it, use:
  %reload_ext ipython_pytest


Unit testing is a common technique in software engineering. Even if you never write a unit test explicitly, you are still testing informally, since your function should at least work for the inputs you intended. You give an input _x_ to a function and it should return _y_, simple as that.

For example, imagine we have a function like this.

In [57]:
def add_ints(x1, x2):
    return x1 + x2

In [58]:
# Case 1
add_ints(1,1)

2

In [59]:
# Case 2
add_ints(1,'2')

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In [None]:
# Case 3
add_ints('2', '2')

The first two cases behave as expected, but the last one is a side effect of how Python works: `+` concatenates strings, so the call returns `'22'` instead of raising an error. We should probably check that the inputs are numbers and raise an error explicitly otherwise. Checking that a function behaves properly under its intended use is easy; testing the opposite is much harder. You have to cover a lot of edge cases, which makes your tests verbose.
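To make the "many edge cases" point concrete, here is a minimal sketch that probes the unchecked `add_ints` with a few hand-picked inputs. The `probe` helper and the extra inputs (lists, a float, `None`) are illustrative assumptions, not part of the original notebook.

```python
def add_ints(x1, x2):
    return x1 + x2

def probe(x1, x2):
    """Return the result of add_ints, or the string 'TypeError' if it raises."""
    try:
        return add_ints(x1, x2)
    except TypeError:
        return 'TypeError'

# Hand-picked inputs (illustrative only). Every new edge case is one more entry.
cases = [(1, 1), (1, '2'), ('2', '2'), ([1], [2]), (1.5, 2), (None, 1)]

for x1, x2 in cases:
    print(f'add_ints({x1!r}, {x2!r}) -> {probe(x1, x2)!r}')
```

Only the pairs containing mixed types or `None` raise `TypeError`; the two strings, the two lists, and the float all return a value without complaint, and each of them would need its own test.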

In this article, I will introduce a library called `Hypothesis` that does property-based testing. If none of this makes sense to you yet, please bear with me; I will explain with simple examples. I find that the names `Hypothesis` and "property-based testing" do not convey much on their own, but the ideas behind them are useful.

__Hypothesis__ comes in handy because it generates artificial inputs that try to make your test fail. Instead of specifying an exact input, you loosely specify what kind of input you want to test. For example, if you expect a number, you may want to test negative values, positive values, floating point numbers, or values outside a certain range. This list of conditions can expand quickly, and __Hypothesis__ makes covering it easier.
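As a rough preview of what "loosely specifying input" can look like, the sketch below asks Hypothesis for any integer or any finite float and checks a trivial property of `abs`. The test name and the chosen property are illustrative assumptions, not part of the original notebook.

```python
from hypothesis import given
from hypothesis import strategies as st

# Any integer or finite float: negative, zero, positive, tiny, or very large.
numbers = st.one_of(
    st.integers(),
    st.floats(allow_nan=False, allow_infinity=False),
)

@given(numbers)
def test_abs_is_never_negative(x):
    # Hypothesis calls this function many times with different generated values.
    assert abs(x) >= 0
```

Nothing here names a specific input; the strategy only describes the kind of value, and Hypothesis picks the concrete examples.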

# Start with a simple function

Let's stick with the simple `add_ints` function above. To keep it simple, let's test these 3 cases first.

1. Adding two numbers -> Expect Pass
2. Adding a number and a string -> Expect Fail
3. Adding two strings -> Expect Fail

In [64]:
%%pytest 
# ipython magic to run pytest within a cell. This whole blog is written in a Jupyter Notebook!
# https://github.com/akaihola/ipython_pytest/blob/master/ipython_pytest.py

import pytest

def add_ints(x1, x2):
    return x1 + x2

def test_add_ints():
    assert add_ints(1,1) == 2

@pytest.mark.xfail()
def test_add_ints_fail():
    assert add_ints(1,'2')

@pytest.mark.xfail(strict=True)
def test_add_ints_string():
    assert add_ints('2', '2')
    

platform win32 -- Python 3.7.4, pytest-5.2.1, py-1.8.0, pluggy-0.13.0
rootdir: C:\Users\CHANNO\AppData\Local\Temp\tmp79u2a6x6
plugins: hypothesis-5.8.1, arraydiff-0.3, doctestplus-0.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 3 items

_ipytesttmp.py .xF                                                       [100%]

____________________________ test_add_ints_string _____________________________
[XPASS(strict)] 


In `pytest`, you can use the mark `@pytest.mark.xfail` to annotate that a test is expected to fail. With `strict=True`, a test that unexpectedly passes is reported as a failure, which is the `[XPASS(strict)]` shown above. We have 1 passed, 1 xfailed, and 1 failed.

`_ipytesttmp.py .xF`
indicates that the last test failed. Let's try to fix it by raising an error if the input type is not a number.

In [65]:
%%pytest 
# ipython magic to run pytest within a cell. This whole blog is written in a Jupyter Notebook!
# https://github.com/akaihola/ipython_pytest/blob/master/ipython_pytest.py

import pytest

def add_ints(x1, x2):
    if isinstance(x1, int) and isinstance(x2, int):
        return x1 + x2
    else:
        raise TypeError(f'Make sure your input is a number x1 {type(x1)}, x2 {type(x2)}')
    

def test_add_ints():
    assert add_ints(1,1) == 2

@pytest.mark.xfail()
def test_add_ints_fail():
    assert add_ints(1,'2')

@pytest.mark.xfail(strict=True)
def test_add_ints_string():
    assert add_ints('2', '2')
    

platform win32 -- Python 3.7.4, pytest-5.2.1, py-1.8.0, pluggy-0.13.0
rootdir: C:\Users\CHANNO\AppData\Local\Temp\tmpfrh2uipy
plugins: hypothesis-5.8.1, arraydiff-0.3, doctestplus-0.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 3 items

_ipytesttmp.py .xx                                                       [100%]



Okay, now we check that the inputs are integers. In reality, this is often an iterative process. You start by coming up with test cases, then every now and then you hit an edge case and add it to your collection of test cases.
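Since the failing cases are now expected to raise a specific exception, one alternative to `xfail` is `pytest.raises`, which passes only when the named exception is actually raised. The sketch below is not from the original notebook; the test name and the extra `None` case are illustrative assumptions.

```python
import pytest

def add_ints(x1, x2):
    if isinstance(x1, int) and isinstance(x2, int):
        return x1 + x2
    else:
        raise TypeError(f'Make sure your input is a number x1 {type(x1)}, x2 {type(x2)}')

# Each tuple is one hand-collected edge case; new ones get appended over time.
@pytest.mark.parametrize('x1, x2', [(1, '2'), ('2', '2'), (None, 1)])
def test_add_ints_rejects_bad_input(x1, x2):
    # The test passes only if add_ints raises TypeError for this pair.
    with pytest.raises(TypeError):
        add_ints(x1, x2)
```

The parametrized list makes the iterative process visible: every edge case you discover becomes one more tuple.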

How can we make our test cases more robust to unexpected input? `Hypothesis` is exactly the tool you need.

# `strategy`, your auto-generated input for unit tests

A `strategy` describes the input for your unit test. Instead of specifying a particular number or string, you specify what kind of input you want, and `Hypothesis` takes care of the rest. You can even combine different `strategies` to form more complicated inputs.
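As a hedged sketch of combining strategies, `st.one_of` can merge several strategies into one that draws from any of them. Here it builds a pool of non-integer values and checks that the type-checked `add_ints` rejects each one. The `non_ints` name and the test name are assumptions made for illustration.

```python
import pytest
from hypothesis import given
from hypothesis import strategies as st

def add_ints(x1, x2):
    if isinstance(x1, int) and isinstance(x2, int):
        return x1 + x2
    else:
        raise TypeError(f'Make sure your input is a number x1 {type(x1)}, x2 {type(x2)}')

# Combined strategy: strings, floats, None, or lists of integers.
# Booleans are left out on purpose: bool is a subclass of int in Python,
# so isinstance(True, int) is True and add_ints would accept them.
non_ints = st.one_of(st.text(), st.floats(), st.none(), st.lists(st.integers()))

@given(st.integers(), non_ints)
def test_add_ints_rejects_non_ints(x1, x2):
    with pytest.raises(TypeError):
        add_ints(x1, x2)
```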

But let's keep it simple and just use integers for this demo.

In [72]:
from hypothesis import strategies as st

In [79]:
%%pytest 
# ipython magic to run pytest within a cell. This whole blog is written in a Jupyter Notebook!
# https://github.com/akaihola/ipython_pytest/blob/master/ipython_pytest.py

import pytest
from hypothesis import given
from hypothesis import strategies as st

def add_ints(x1, x2):
    if isinstance(x1, int) and isinstance(x2, int):
        return x1 + x2
    else:
        raise TypeError(f'Make sure your input is a number x1 {type(x1)}, x2 {type(x2)}')
    
@given(st.integers(), st.integers())
def test_add_ints(x1, x2):
    assert add_ints(x1, x2)

platform win32 -- Python 3.7.4, pytest-5.2.1, py-1.8.0, pluggy-0.13.0
rootdir: C:\Users\CHANNO\AppData\Local\Temp\tmpmdbi_h6f
plugins: hypothesis-5.8.1, arraydiff-0.3, doctestplus-0.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 1 item

_ipytesttmp.py F                                                         [100%]

________________________________ test_add_ints ________________________________

    @given(st.integers(), st.integers())
>   def test_add_ints(x1, x2):

_ipytesttmp.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

x1 = 0, x2 = 0

    @given(st.integers(), st.integers())
    def test_add_ints(x1, x2):
>       assert add_ints(x1, x2)
E       assert 0
E        +  where 0 = add_ints(0, 0)

_ipytesttmp.py:16: AssertionError
--------------------------------- Hypothesis ----------------------------------
Falsifying example: test_add_ints(
    x1=0, x2=0,
)


The test was simple: it should pass as long as no error is thrown. But look at what `Hypothesis` found: when both `x1` and `x2` are 0, the assertion fails, because `add_ints(0, 0)` returns 0, which evaluates as `False` in Python.
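The failure comes from Python's truthiness rules rather than from `add_ints` itself. A minimal illustration, using the unchecked version of the function for brevity:

```python
def add_ints(x1, x2):
    return x1 + x2

# `assert value` checks truthiness, and the integer 0 is falsy.
print(bool(add_ints(0, 0)))   # False
print(bool(add_ints(1, -1)))  # False, any pair summing to 0 behaves the same
print(bool(add_ints(1, 1)))   # True

# Comparing against an expected value does not depend on truthiness.
assert add_ints(0, 0) == 0
```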

Hence, I modified my test to not assert anything; it just stays silent as long as no error is thrown.


In [None]:
@given(st.integers(), st.integers())
def test_add_ints(x1, x2):
    add_ints(x1, x2)

In [80]:
%%pytest 
# ipython magic to run pytest within a cell. This whole blog is written in a Jupyter Notebook!
# https://github.com/akaihola/ipython_pytest/blob/master/ipython_pytest.py

import pytest
from hypothesis import given
from hypothesis import strategies as st

def add_ints(x1, x2):
    if isinstance(x1, int) and isinstance(x2, int):
        return x1 + x2
    else:
        raise TypeError(f'Make sure your input is a number x1 {type(x1)}, x2 {type(x2)}')
    
@given(st.integers(), st.integers())
def test_add_ints(x1, x2):
    add_ints(x1, x2)

platform win32 -- Python 3.7.4, pytest-5.2.1, py-1.8.0, pluggy-0.13.0
rootdir: C:\Users\CHANNO\AppData\Local\Temp\tmpwicd08ny
plugins: hypothesis-5.8.1, arraydiff-0.3, doctestplus-0.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 1 item

_ipytesttmp.py .                                                         [100%]



Yes, now our test finally passes.
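A test that asserts nothing only checks that no exception is raised. A possible next step, sketched here as an assumption about where the example could go rather than something from the original notebook, is to assert properties that must hold for every pair of integers, such as commutativity or agreement with the built-in `+`. The test names are hypothetical.

```python
from hypothesis import given
from hypothesis import strategies as st

def add_ints(x1, x2):
    if isinstance(x1, int) and isinstance(x2, int):
        return x1 + x2
    else:
        raise TypeError(f'Make sure your input is a number x1 {type(x1)}, x2 {type(x2)}')

@given(st.integers(), st.integers())
def test_add_ints_matches_builtin_plus(x1, x2):
    # An explicit comparison avoids the truthiness trap of `assert 0`.
    assert add_ints(x1, x2) == x1 + x2

@given(st.integers(), st.integers())
def test_add_ints_is_commutative(x1, x2):
    # Swapping the arguments must not change the result.
    assert add_ints(x1, x2) == add_ints(x2, x1)

@given(st.integers())
def test_zero_is_identity(x):
    # Adding 0 must return the other argument unchanged.
    assert add_ints(x, 0) == x
```

These are the "properties" in property-based testing: statements about the function that should be true for all generated inputs, not just for a few hand-picked ones.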