# Ruthless Testing with Hypothesis

Xavier Villaneau — PyTennessee 2020

## Who am I?

Just call me Xavier.

Pronouns: He / Him / His

Software Engineer at Pandora + Sirius XM in Atlanta GA

## 1. A Short Example

In [1]:
import contextlib
from typing import Optional

from hypothesis import settings, Verbosity
settings.register_profile('demo', verbosity=Verbosity.verbose, max_examples=5)
settings.register_profile('try_a_lot', max_examples=10_000)

### Hello, I'm “The Function”

In [2]:
def extract_id(str_id: str) -> Optional[int]:
    if str_id[:2] == "u/" and str_id[2:].isdecimal():
        return int(str_id[2:])
    return None

Let's try it out:

In [3]:
extract_id('u/12345')

12345

In [4]:
extract_id('csx8888')

### Hello Function, I'm Hypothesis

In [5]:
settings.load_profile('demo')

In [6]:
from hypothesis import given, strategies as st

@given(st.integers(min_value=0))
def test_int_always_decodes(number):
    assert extract_id(f'u/{number}') == number

In [40]:
test_int_always_decodes()

Trying example: test_int_always_decodes(
    number=0,
)
Trying example: test_int_always_decodes(
    number=0,
)
Trying example: test_int_always_decodes(
    number=0,
)
Trying example: test_int_always_decodes(
    number=18221,
)
Trying example: test_int_always_decodes(
    number=0,
)


### A harder test

In [8]:
from hypothesis import assume, given, strategies as st

@given(st.characters(), st.integers(min_value=0))
def test_not_u_never_decodes(char, number):
    assume(char != 'u')
    assert extract_id(f'{char}/{number}') is None

In [51]:
test_not_u_never_decodes()

Trying example: test_not_u_never_decodes(
    char='0', number=0,
)
Trying example: test_not_u_never_decodes(
    char='0', number=0,
)
Trying example: test_not_u_never_decodes(
    char='\U000acb0f', number=64,
)
Trying example: test_not_u_never_decodes(
    char='0', number=0,
)
Trying example: test_not_u_never_decodes(
    char='0', number=0,
)


### So far so good?

In [10]:
settings.load_profile('try_a_lot')

In [57]:
from hypothesis import assume, given, strategies as st

@given(st.text(alphabet='0123456789', min_size=1))
def test_matches_expected(text):
    str_id = f'u/{text}'
    assert f'u/{extract_id(str_id)}' == str_id

In [58]:
with contextlib.suppress(Exception):
    test_matches_expected()

Falsifying example: test_matches_expected(
    text='00',
)


### Y'all got any more of 'em bugs?

In [13]:
import re
from hypothesis import assume, given, strategies as st

@given(st.text())
def test_no_false_positives(text):
    assume(re.fullmatch('[0-9]+', text) is None)
    assert extract_id(f'u/{text}') is None

In [14]:
with contextlib.suppress(Exception):
    test_no_false_positives()

Falsifying example: test_no_false_positives(
    text='၀',
)


### wut.

In [15]:
int('७೧୨௧໒')

71212

𝕿𝖍𝖔𝖚 𝖍𝖆𝖘𝖙 𝖜𝖆𝖐𝖊𝖙𝖍 𝖙𝖍𝖊 𝖜𝖗𝖆𝖙𝖍 𝖔𝖋 𝖀𝖓𝖎𝖈𝖔𝖉𝖊
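
One possible fix, not shown in the talk: match ASCII digits explicitly instead of trusting `str.isdecimal()`. The name `extract_id_strict` and the choice to reject leading zeros are assumptions made for this sketch.

```python
import re
from typing import Optional

# In Python 3 str patterns, [0-9] only matches ASCII digits, whereas
# str.isdecimal() accepts decimal digits from every script.
# Assumption: IDs with leading zeros such as "u/00" are rejected.
_ID_PATTERN = re.compile(r"u/(0|[1-9][0-9]*)")


def extract_id_strict(str_id: str) -> Optional[int]:
    """Return the numeric ID, or None if str_id is not a canonical 'u/<digits>'."""
    match = _ID_PATTERN.fullmatch(str_id)
    return int(match.group(1)) if match else None
```

Whether `u/00` should be rejected or normalized is a spec decision; the properties under test have to follow whichever one is picked.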

## 2. The Hypothesis Manual, Abridged

In [16]:
settings.load_profile('default')

### The basics

* `@given()` → Decorate a test so Hypothesis runs it with generated inputs
* `strategies` → Data generators module

```python
@given(strategy)
def test_function(arguments):
    assert test_condition(arguments)
```

### Skipping tests

* `assume(condition)` → Discard the current example if `condition` is false
* Can also use `Strategy.filter()`

**Warning:**  
Hypothesis gives up if too many generated examples are rejected.
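
A minimal sketch of the options side by side. The test names are made up for this example; the third variant is a suggestion, not something from the slides.

```python
from hypothesis import assume, given, strategies as st


@given(st.integers())
def test_even_with_assume(n):
    # Odd numbers are discarded after they have been generated.
    assume(n % 2 == 0)
    assert (n // 2) * 2 == n


@given(st.integers().filter(lambda n: n % 2 == 0))
def test_even_with_filter(n):
    # Same rejection, but expressed on the strategy itself.
    assert (n // 2) * 2 == n


@given(st.integers().map(lambda n: n * 2))
def test_even_by_construction(n):
    # Nothing is rejected: every generated value is already even.
    assert (n // 2) * 2 == n
```

When the condition is rarely true, building valid data with `.map()` tends to be safer than rejecting most of what gets generated.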

### Forcing a test input

* `@example(*arguments)` → Forces a specific test input

This is _in addition to_ the randomly generated examples.

Useful for forcing known corner cases to be tested.

In [17]:
settings.load_profile('demo')

In [18]:
from hypothesis import example

@given(st.integers(), st.integers())
@example(71, 212)
def test_addition_commutative(x, y):
    assert x + y == y + x

In [63]:
test_addition_commutative()

Trying example: test_addition_commutative(
    x=71, y=212,
)
Trying example: test_addition_commutative(
    x=0, y=0,
)
Trying example: test_addition_commutative(
    x=0, y=0,
)
Trying example: test_addition_commutative(
    x=36, y=32,
)
Trying example: test_addition_commutative(
    x=0, y=0,
)
Trying example: test_addition_commutative(
    x=-126, y=-28799,
)


In [20]:
settings.load_profile('default')

### The Magic of Failure

When an input fails, Hypothesis **shrinks** it to the simplest case.

Failing input is remembered in the **database** and tried again.

Every test in Hypothesis is **repeatable**.

### What to test?

Look for *invariant* properties.

1. Does the function *crash* when it shouldn't? (e.g. validators)

```python
@given(strategy())
def test_wont_crash(data):
    this_must_not_crash(data)
```

2. Does the inverse function return the same input?  
   (e.g. parsers, serializers)

```python
@given(strategy())
def test_codec(data):
    assert decode(encode(data)) == data
```

3. Is the function *idempotent*?

```python
@given(strategy())
def test_idempotence(data):
    result = process(data)
    assert process(result) == result
```

4. Does the function match another known implementation?  
   (optimization, refactor)

```python
@given(strategy())
def test_refactored(data):
    assert new_thing(data) == old_thing(data)
```
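
The templates above, filled in with standard library functions as a hedged illustration. The choice of `json` and `sorted` is mine, not the talk's.

```python
import json

from hypothesis import given, strategies as st


@given(st.text())
def test_wont_crash(text):
    # No assertion needed: the test fails if this raises.
    text.encode("utf-8")


@given(st.dictionaries(st.text(), st.integers()))
def test_codec(data):
    # Round trip: decoding the encoded value gives back the input.
    assert json.loads(json.dumps(data)) == data


@given(st.lists(st.integers()))
def test_idempotence(data):
    # Sorting an already sorted list changes nothing.
    result = sorted(data)
    assert sorted(result) == result


@given(st.lists(st.integers()))
def test_refactored(data):
    # A manual loop checked against the built-in it is meant to match.
    total = 0
    for value in data:
        total += value
    assert total == sum(data)
```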

## 3. Strategic Overview

### Primitive Types

In [21]:
from hypothesis import strategies

def gimme_examples(strategy, n=5):
    for _ in range(n):
        print(strategy.example())

In [87]:
gimme_examples(strategies.floats())

0.0
3.122979806705833e+221
0.0
1.502540591884679e-52
-1.7976931348623157e+308


In [108]:
gimme_examples(strategies.text().map(repr))

'0'
'\U000f9aa0\x1b'
'\x14\x04$\U000e6833%'
'\x07'
'\x18\x15\x04'


### Collections

In [118]:
gimme_examples(strategies.lists(strategies.integers()))

[]
[-620538434, 9353]
[0]
[0]
[2300576252139297195, 19644, -6465, 13404]


Also: `sets`, `dictionaries`, `tuples`…
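
A quick sketch of those three strategies in a test. The element strategies and size limits are arbitrary choices for this example.

```python
from hypothesis import given, strategies as st


@given(
    st.sets(st.integers()),
    st.dictionaries(st.text(max_size=5), st.booleans()),
    st.tuples(st.integers(), st.text()),
)
def test_collection_shapes(a_set, a_dict, a_tuple):
    assert isinstance(a_set, set)
    assert all(isinstance(key, str) for key in a_dict)
    # tuples() has a fixed length: one element per strategy passed in.
    number, label = a_tuple
    assert isinstance(number, int) and isinstance(label, str)
```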

### Building objects

In [25]:
from decimal import Decimal
from dataclasses import dataclass
from string import ascii_letters

In [26]:
@dataclass
class Customer:
    username: str
    customer_id: int
    account_balance: Decimal

In [125]:
gimme_examples(strategies.builds(Customer))

Customer(username='', customer_id=0, account_balance=Decimal('NaN'))
Customer(username='&', customer_id=91, account_balance=Decimal('-0.6648167181'))
Customer(username='', customer_id=0, account_balance=Decimal('-sNaN'))
Customer(username='\x18\U0009a011\n\U000b784b\t\U000895d9\U000ad29f\n\U000ac5e6\x04\U0003ff13(', customer_id=14677, account_balance=Decimal('-0.0194458148'))
Customer(username='', customer_id=0, account_balance=Decimal('NaN'))
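
Empty usernames and `NaN` balances may not be valid customers. A sketch of narrowing the fields by passing explicit strategies to `builds()`; the bounds used here are assumptions, not business rules from the talk.

```python
from dataclasses import dataclass
from decimal import Decimal
from string import ascii_letters

from hypothesis import given, strategies as st


@dataclass
class Customer:
    username: str
    customer_id: int
    account_balance: Decimal


# Keyword arguments override the strategies inferred from type hints.
customers = st.builds(
    Customer,
    username=st.text(alphabet=ascii_letters, min_size=1, max_size=20),
    customer_id=st.integers(min_value=1),
    account_balance=st.decimals(
        min_value=0, max_value=1_000_000, places=2,
        allow_nan=False, allow_infinity=False,
    ),
)


@given(customers)
def test_customers_look_plausible(customer):
    assert customer.username.isalpha()
    assert customer.customer_id >= 1
    assert 0 <= customer.account_balance <= 1_000_000
```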


### Composite strategies

In [28]:
@strategies.composite
def char_ids(draw):
    prefix = draw(strategies.sampled_from(ascii_letters))
    number = draw(strategies.integers(min_value=0))
    return f'{prefix}/{number}'

In [136]:
gimme_examples(char_ids())

a/0
d/581
g/0
q/1280383754
Y/1404234951146601292
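
Composite strategies can take their own arguments after `draw`. A sketch that reuses `extract_id` from part 1; the `prefixes` parameter and the test names are additions for this example.

```python
from string import ascii_letters
from typing import Optional

from hypothesis import given, strategies as st


def extract_id(str_id: str) -> Optional[int]:
    # Copied from part 1 so the example runs on its own.
    if str_id[:2] == "u/" and str_id[2:].isdecimal():
        return int(str_id[2:])
    return None


@st.composite
def char_ids(draw, prefixes=ascii_letters):
    prefix = draw(st.sampled_from(prefixes))
    number = draw(st.integers(min_value=0))
    return f'{prefix}/{number}'


NOT_U = ascii_letters.replace('u', '')


@given(char_ids(prefixes=NOT_U))
def test_other_prefixes_never_decode(str_id):
    assert extract_id(str_id) is None


@given(char_ids(prefixes='u'))
def test_u_prefix_decodes(str_id):
    assert extract_id(str_id) == int(str_id[2:])
```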


### ...and much more!

Dates and times, composition tools, recursive data structures, functional tools...

https://hypothesis.readthedocs.io/en/latest/data.html

## 4. The Bonus Features

### Django support

* Must use `hypothesis.extra.django.TestCase` instead of `django.test.TestCase`.
* Automatic DB model creation strategy: `hypothesis.extra.django.from_model`.

```python
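from hypothesis import given
from hypothesis.extra.django import (
    TestCase, from_model
)

# CustomerRecord is assumed to be a Django model defined in the project.

class CustomerRecordsTest(TestCase):
    @given(from_model(CustomerRecord))
    def test_customer_record(self, record):
        # Relevant test with DB usage goes here
        ...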
```

### Settings profiles

```python
from hypothesis import settings
settings.register_profile(
    'ci_tests', max_examples=10_000
)
settings.load_profile('ci_tests')
```
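
A sketch of picking the profile per environment and overriding it for one test. The `dev` profile and the `HYPOTHESIS_PROFILE` variable name are conventions assumed for this example.

```python
import os

from hypothesis import given, settings, strategies as st

settings.register_profile("dev", max_examples=20)
settings.register_profile("ci_tests", max_examples=10_000)

# Assumed convention: CI exports HYPOTHESIS_PROFILE=ci_tests.
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "dev"))


# A per-test @settings decorator takes precedence over the loaded profile.
@settings(max_examples=500)
@given(st.integers())
def test_double_negation(x):
    assert -(-x) == x
```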

### pytest support

With `pytest`, Hypothesis adds CLI options to:
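* Set the test seed (`--hypothesis-seed`)
* Set the verbosity (`--hypothesis-verbosity`)
* Set a test profile (`--hypothesis-profile`)
* Collect runtime statistics (`--hypothesis-show-statistics`)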

```
test_extract_id.py::test_int_always_decodes:
  - 100 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 47%
  - Stopped because settings.max_examples=100

test_extract_id.py::test_not_c_never_decodes:
  - 100 passing examples, 0 failing examples, 0 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 63%
  - Stopped because settings.max_examples=100

test_extract_id.py::test_no_false_positives:
  - 8 passing examples, 8 failing examples, 5 invalid examples
  - Typical runtimes: < 1ms
  - Fraction of time spent in data generation: ~ 35%
  - Stopped because nothing left to do
```

### Stateful testing

> With Hypothesis’s stateful testing, \[it\] tries to generate not just data but entire tests. You specify a number of primitive actions that can be combined together, and then **Hypothesis will try to find sequences of those actions that result in a failure**.
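
A minimal sketch of what that looks like with `RuleBasedStateMachine`. The stack example is mine, not the talk's: each `@rule` is one primitive action, and the `@invariant` is checked between steps.

```python
from hypothesis import strategies as st
from hypothesis.stateful import (
    RuleBasedStateMachine, invariant, precondition,
    rule, run_state_machine_as_test,
)


class StackMachine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.stack = []       # system under test (a plain list here)
        self.model_size = 0   # simple model of what we expect

    @rule(value=st.integers())
    def push(self, value):
        self.stack.append(value)
        self.model_size += 1

    @precondition(lambda self: self.model_size > 0)
    @rule()
    def pop(self):
        self.stack.pop()
        self.model_size -= 1

    @invariant()
    def size_matches_model(self):
        assert len(self.stack) == self.model_size


# Test runners such as pytest can collect this as a unittest-style class.
TestStack = StackMachine.TestCase
```

Outside a test runner, `run_state_machine_as_test(StackMachine)` runs the same search directly.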

## Conclusion

### Conclusion

Hypothesis is good at:

* Finding the edge case bugs you forgot
* Testing mission-critical logic that **must** be reliable
* Testing bidirectional functions (e.g. parsers, serializers)
* Fuzz testing

Cons:

* Tests **must** check invariant properties
* Tests should be fast
* It's bad at generating "real" data (use `faker`)
* It should not replace sanity/smoke tests

### One last obsolete quote

> Most errors are of an obvious nature that can be easily spotted by visual inspection. […] Do not use the computer to detect this kind of thing – it is too expensive.

Dr. Winston W. Royce, _Managing the Development of Large Software Systems_. IEEE WESCON Proceedings, August 1970.

# Thank you!

Slides on:  
https://github.com/xvillaneau/talks

(This was v1.2.0 of the talk)

## Questions?

### Further reading

David R. MacIver, The Purpose of Hypothesis  
https://hypothesis.readthedocs.io/en/latest/manifesto.html

Scott W., Choosing properties for property-based testing  
https://fsharpforfunandprofit.com/posts/property-based-testing-2/

Joe "begriffs" Nelson, The Design and Use of QuickCheck  
https://begriffs.com/posts/2017-01-14-design-use-quickcheck.html