# Property-based testing with Hypothesis

Typically, when we test software we use *example*-based testing: we manually think of sets of inputs (i.e., examples) to pass into a function and check that it performs as expected. We've learnt many techniques for coming up with these examples, like boundary-value analysis, equivalence partitioning, and decision tables. The problem is that this type of testing tells you that your program works as intended for those specific test cases *only*.

Instead of relying on hand-picked test cases, *property*-based testing checks that your program works for *all data matching some specification* by generating a large number of inputs that fit it. That is strong evidence rather than a formal guarantee, but it covers far more ground than a handful of examples. So for an adder(x, y) function, instead of checking just (3,4), (0,1), (2,3), etc., you can check that your program works for any *int* inputs Hypothesis generates. This is especially useful when you can't see the code, i.e., black-box functional testing, or when you're working in a low-trust environment.

Let's see that in action:

This demo only requires the Hypothesis package; everything else is in the standard library. Uncomment and run the cell below to install Hypothesis via pip. Conda users, you'll know what to do.

In [1]:
#!pip install hypothesis

## A devious adder function

Let's say you're testing a few adder programs that each take two integers and return their sum. We'll have to pretend we can't see the code.

In [2]:
# Here's a normal adder function

def adder(x, y):
    return x + y

# And here is a malicious adder function

def bad_adder(x, y):
    if x>1000000000:
        return  "1; DROP TABLE users;"
    else:
        return x + y

The bad_adder function attempts an SQL injection attack. If the addition function were being used to dynamically fill in a query, it could cause all kinds of trouble, viz:

```
SELECT * FROM users WHERE user_id = {bad_adder(x,y)};

if x > 1000000000 then the query becomes

SELECT * FROM users WHERE user_id = 1; DROP TABLE users;

This causes the users table to be deleted!!
```
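To make the injection concrete, here is a minimal sketch of a query being built with plain string formatting. The build_query helper is hypothetical and exists only for illustration; bad_adder is copied from the cell above so the snippet runs on its own.

```python
# bad_adder copied from the cell above so this sketch is self-contained
def bad_adder(x, y):
    if x > 1000000000:
        return "1; DROP TABLE users;"
    else:
        return x + y


def build_query(user_id):
    # Hypothetical helper: naive string formatting is what makes injection possible
    return f"SELECT * FROM users WHERE user_id = {user_id};"


safe_query = build_query(bad_adder(40, 2))
malicious_query = build_query(bad_adder(1_000_000_001, 0))

print(safe_query)
print(malicious_query)
```

The first query looks up a single user, while the second carries the extra DROP TABLE statement along with it.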

With example-based testing, you'd test these functions like this:

In [3]:
import unittest

class AdderTestBase(unittest.TestCase):
    adder = None  # Placeholder for the adder function to be set by subclasses

    def test_positive_numbers(self):
        self.assertEqual(self.adder(1, 2), 3)
        self.assertEqual(self.adder(10, 5), 15)

    def test_negative_numbers(self):
        self.assertEqual(self.adder(-1, -2), -3)
        self.assertEqual(self.adder(-10, -5), -15)

    def test_mixed_numbers(self):
        self.assertEqual(self.adder(-1, 1), 0)
        self.assertEqual(self.adder(-10, 5), -5)

    def test_zero(self):
        self.assertEqual(self.adder(0, 0), 0)
        self.assertEqual(self.adder(0, 5), 5)
        self.assertEqual(self.adder(5, 0), 5)

# Testing the correct implementation
class TestSimpleAdder(AdderTestBase):
    adder = staticmethod(adder)

# Testing the incorrect implementation
class TestBadAdder(AdderTestBase):
    adder = staticmethod(bad_adder)

if __name__ == '__main__':
    # Only load subclasses, not AdderTestBase itself
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(TestSimpleAdder))
    suite.addTests(loader.loadTestsFromTestCase(TestBadAdder))

    runner = unittest.TextTestRunner()
    runner.run(suite)

........
----------------------------------------------------------------------
Ran 8 tests in 0.009s

OK


The tests passed, so things look good, right? Let's see what Hypothesis finds. To use Hypothesis:
1. Create a test function that asserts that the result of the function being tested matches a trusted implementation (in this case I'll compare the result of adder() to operator.add() from Python's standard library).
2. Using the @given decorator, tell Hypothesis what the types of the inputs are: @given(st.integers(), st.integers()) tells Hypothesis to generate integer inputs for both arguments of the function.

What is st.? It refers to Hypothesis strategies. A strategy describes how to generate values of a given type, and Hypothesis deliberately tries out awkward ones, such as very small values, huge values, zero, and both positive and negative numbers. Think of it like automated equivalence partitioning.

In [4]:
from hypothesis import given, strategies as st
import operator
import unittest

@given(st.integers(), st.integers())
def test_adder(x, y):
    assert adder(x, y) == operator.add(x, y)

# Running the hypothesis test

test_adder()

test_adder() ran without any issues, so adder() has passed the Hypothesis test. What about bad_adder()?

In [5]:
@given(st.integers(), st.integers())
def test_bad_adder(x, y):
    assert bad_adder(x, y) == operator.add(x, y)

# Running the hypothesis test

test_bad_adder()

AssertionError: 

```
Falsifying example: test_bad_adder(
    x=1_000_000_001,
    y=0,  # or any other generated value
)
Explanation:
    These lines were always and only run by failing examples:
```

Wow! Hypothesis is telling us that the test fails when x is 1_000_000_001, whatever the value of y (the underscores have no semantic meaning; they're just digit separators for ease of reading).

How did Hypothesis figure this out? Let's look under the hood. We can set the verbosity setting so that Hypothesis reports every combination of inputs it tries:

In [7]:
# Verbosity is exported from the top-level package, so there is no need
# to reach into the private hypothesis._settings module
from hypothesis import Verbosity, given, settings

@given(st.integers(), st.integers())
@settings(verbosity=Verbosity.verbose)
def test_bad_adder(x, y):
    assert bad_adder(x, y) == operator.add(x, y)

# Running the hypothesis test

test_bad_adder()

Trying example: test_bad_adder(
    x=1_000_000_001,
    y=0,
)
Traceback (most recent call last):
  File "C:\Users\seanm\AppData\Local\Temp\ipykernel_8648\888568253.py", line 8, in test_bad_adder
    assert bad_adder(x, y) == operator.add(x, y)
AssertionError

Trying example: test_bad_adder(
    x=1_293_489_081,
    y=0,
)
Traceback (most recent call last):
  File "C:\Users\seanm\AppData\Local\Temp\ipykernel_8648\888568253.py", line 8, in test_bad_adder
    assert bad_adder(x, y) == operator.add(x, y)
AssertionError

Trying example: test_bad_adder(
    x=1_294_779_321,
    y=0,
)
Traceback (most recent call last):
  File "C:\Users\seanm\AppData\Local\Temp\ipykernel_8648\888568253.py", line 8, in test_bad_adder
    assert bad_adder(x, y) == operator.add(x, y)
AssertionError

Trying example: test_bad_adder(
    x=1_294_789_561,
    y=0,
)
Traceback (most recent call last):
  File "C:\Users\seanm\AppData\Local\Temp\ipykernel_8648\888568253.py", line 8, in test_bad_adder
    assert bad_ad

AssertionError: 

Looking at the logs, see how Hypothesis keeps trying further combinations of inputs once it has stumbled on a failing one. The real magic is that it doesn't stop at the first failure. After it realises something is wrong with a case, it tries many related x and y values to "shrink" the problematic example down to the simplest one that still fails, which is what we see in the final AssertionError of the block. The "or any other generated value" comment next to y is how it tells us the problem is the x value regardless of the y value.

This "shrinking" means you, the developer, don't have to spend time figuring out whether the bug you're seeing appears only in a specific case or only when a combination of conditions is met. Hypothesis finds out for you!
```
Falsifying example: test_bad_adder(
    x=1_000_000_001,
    y=0,  # or any other generated value
)
Explanation:
    These lines were always and only run by failing examples:
```
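To get a feel for what shrinking means, here is a toy sketch of the idea. It is emphatically not Hypothesis's actual algorithm: shrink_int and fails are hypothetical helpers, the starting y value is made up, and the bisection assumes that once a value fails every larger value fails too (which happens to be true for bad_adder).

```python
# bad_adder copied from the notebook so this sketch is self-contained
def bad_adder(x, y):
    if x > 1000000000:
        return "1; DROP TABLE users;"
    else:
        return x + y


def fails(x, y):
    # The property under test: the result should equal x + y
    return bad_adder(x, y) != x + y


def shrink_int(value, still_fails):
    """Toy shrinker: bisect towards 0 for the smallest non-negative int that still fails.

    Assumes `value` itself fails and that failures are monotonic.
    """
    if still_fails(0):
        return 0
    low, high = 0, value  # invariant: low passes, high fails
    while high - low > 1:
        mid = (low + high) // 2
        if still_fails(mid):
            high = mid
        else:
            low = mid
    return high


# Start from one of the failing x values in the verbose log; y is made up here
x, y = 1_294_789_561, 12_345
x = shrink_int(x, lambda candidate: fails(candidate, y))
y = shrink_int(y, lambda candidate: fails(x, candidate))
print(x, y)
```

Starting from a large, arbitrary failing pair, the sketch ends at x=1_000_000_001 and y=0, the same minimal example Hypothesis reports.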

You can probably guess what's wrong with bad_adder():

In [8]:
def bad_adder(x, y):
    if x>1000000000:
        return  "1; DROP TABLE users;"
    else:
        return x + y

# Test anything you can imagine!

Hypothesis has a huge range of built-in strategies for almost any primitive type, and even for complex types! There are a couple of really convenient ones too, for URLs, emails, and IP addresses, whose edge cases are hard for people to think of.

In [9]:
from hypothesis import strategies as st

# Define various strategies
int_strategy = st.integers(min_value=0, max_value=100)
float_strategy = st.floats(min_value=0.0, max_value=100.0)
text_strategy = st.text(min_size=1, max_size=10)
bool_strategy = st.booleans()
date_strategy = st.dates()
datetime_strategy = st.datetimes()
time_strategy = st.times()
timedelta_strategy = st.timedeltas()
complex_strategy = st.complex_numbers()
decimal_strategy = st.decimals(min_value=0, max_value=100)
none_strategy = st.none()
just_strategy = st.just("fixed value")  # Always produces "fixed value"
sampled_from_strategy = st.sampled_from(["apple", "banana", "cherry"])
binary_strategy = st.binary(min_size=1, max_size=10)
uuid_strategy = st.uuids()
email_strategy = st.emails()
ipv4_strategy = st.ip_addresses(v=4)
ipv6_strategy = st.ip_addresses(v=6)

# Generate and print examples
print("Integer example:", int_strategy.example())
print("Float example:", float_strategy.example())
print("Text example:", text_strategy.example())
print("Boolean example:", bool_strategy.example())
print("Date example:", date_strategy.example())
print("Datetime example:", datetime_strategy.example())
print("Time example:", time_strategy.example())
print("Timedelta example:", timedelta_strategy.example())
print("Complex number example:", complex_strategy.example())
print("Decimal example:", decimal_strategy.example())
print("None example:", none_strategy.example())
print("Just example:", just_strategy.example())
print("Sampled from example:", sampled_from_strategy.example())
print("Binary example:", binary_strategy.example())
print("UUID example:", uuid_strategy.example())
print("Email example:", email_strategy.example())
print("IPv4 address example:", ipv4_strategy.example())
print("IPv6 address example:", ipv6_strategy.example())

Integer example: 33
Float example: 4.1926306459187895e-262
Text example: ë^󺀓񊣣񅜷¯
Boolean example: True
Date example: 1068-11-26
Datetime example: 3346-04-23 08:53:18.377140
Time example: 09:19:48.672125
Timedelta example: -52655 days, 0:58:22.508196
Complex number example: (1.9+1.9j)
Decimal example: 29.056
None example: None
Just example: fixed value
Sampled from example: apple
Binary example: b'\x06\xe5\x06\x1e'
UUID example: f4c4ac74-d853-cd95-b13e-5d8ca8d50baf
Email example: Da@H.mc.gEorGe
IPv4 address example: 10.224.0.0
IPv6 address example: 7f93:9df1:3a8c:8171:55ef:d95a:e22c:2a44


You might be wondering about data structures, like a list to pass into a merge sort. For those, Hypothesis has collection strategies that you can combine with other strategies to create data in the structure and format you want.

This example produces lists of integers with between 1 and 10 elements:

In [10]:
list_strategy = st.lists(st.integers(), min_size=1, max_size=10)
print("List example:", list_strategy.example())

List example: [30, -11299]
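Since a merge sort came up, here is a rough sketch of how a list strategy could drive such a test. The merge_sort function is a hypothetical implementation written only to have something to test, and I've dropped min_size so the empty list is covered as well.

```python
from hypothesis import given, strategies as st


def merge_sort(items):
    """Hypothetical merge sort, written here only to have something to test."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    left, right = merge_sort(items[:middle]), merge_sort(items[middle:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; append whatever remains of the other
    return merged + left[i:] + right[j:]


@given(st.lists(st.integers(), max_size=10))
def test_merge_sort_matches_sorted(items):
    # Python's built-in sorted() plays the role of the trusted implementation
    assert merge_sort(items) == sorted(items)


test_merge_sort_matches_sorted()
```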


We can do the same for other data structures:

In [11]:
# Composite structures
tuple_strategy = st.tuples(st.integers(), st.text(min_size=1, max_size=5))
set_strategy = st.sets(st.integers(), min_size=1, max_size=5)
dict_strategy = st.dictionaries(keys=st.text(min_size=1, max_size=5), values=st.integers(min_value=0, max_value=100))
frozenset_strategy = st.frozensets(st.integers(), min_size=1, max_size=5)

# Recursive structures (useful for generating nested structures)
nested_list_strategy = st.recursive(st.integers(), lambda children: st.lists(children, min_size=1, max_size=3))

print("Tuple example:", tuple_strategy.example())
print("Set example:", set_strategy.example())
print("Dictionary example:", dict_strategy.example())
print("Frozenset example:", frozenset_strategy.example())
print("Nested list example:", nested_list_strategy.example())


Tuple example: (25204, '\U000966b6i')
Set example: {14065, -19916, 1475533974, 90, -30242}
Dictionary example: {'®\x911\U00080ad4,': 73, '\U00040078`e\t\xa0': 1, '0': 1, '\U0004d28c\U0005aad6\U000fb457àu': 73}
Frozenset example: frozenset({-8786})
Nested list example: [[0], 29542, [0, -80, -80]]
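Sometimes one generated value has to depend on another, for example an index that must be valid for a generated list. A possible way to express that is the @st.composite decorator; the sketch below is illustrative and the names list_and_index and test_index_is_always_valid are my own.

```python
from hypothesis import given, strategies as st


@st.composite
def list_and_index(draw):
    # Draw a non-empty list first, then an index that is valid for that list
    items = draw(st.lists(st.integers(), min_size=1, max_size=10))
    index = draw(st.integers(min_value=0, max_value=len(items) - 1))
    return items, index


@given(list_and_index())
def test_index_is_always_valid(pair):
    items, index = pair
    # Indexing never raises because the index was drawn from the list's range
    assert items[index] in items


test_index_is_always_valid()
```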


One really useful application of this is creating JSON objects for testing APIs. These are far too tedious to write by hand, especially for JSON objects with many fields. With Hypothesis it's trivial:

In [12]:
from hypothesis import strategies as st

EcommerceData = st.fixed_dictionaries(
    {
        "product": st.fixed_dictionaries(
            {
                "product_id": st.text(min_size=8, max_size=12),  # Unique identifier for the product
                "name": st.text(min_size=5, max_size=50),  # Product name
                "category": st.sampled_from(
                    ["electronics", "clothing", "home", "toys", "books", "beauty", "sports", "automotive"]
                ),
                "price": st.floats(min_value=0.0, max_value=1000.0),  # Price of the product
                "discount": st.one_of(st.none(), st.floats(min_value=0.0, max_value=0.5)),  # Discount rate (0-50%)
                "in_stock": st.integers(min_value=0, max_value=1000),  # Quantity available in stock
            }
        ),
        "user": st.fixed_dictionaries(
            {
                "user_id": st.text(min_size=8, max_size=12),  # Unique user ID
                "username": st.text(min_size=5, max_size=15),  # Username
                "email": st.emails(),  # User email
                "is_premium_member": st.booleans(),  # Whether the user has a premium membership
                "age": st.one_of(st.none(), st.integers(min_value=18, max_value=80)),  # Age of the user
                "address": st.fixed_dictionaries(
                    {
                        "street": st.text(min_size=10, max_size=100),
                        "city": st.text(min_size=3, max_size=50),
                        "zip_code": st.text(min_size=5, max_size=10),
                        "country": st.sampled_from(["US", "CA", "GB", "AU", "DE", "FR"]),
                    }
                ),
            }
        ),
        "order": st.fixed_dictionaries(
            {
                "order_id": st.text(min_size=8, max_size=12),  # Unique order identifier
                "order_date": st.datetimes(),  # Date and time of order
                "status": st.sampled_from(["pending", "shipped", "delivered", "canceled", "returned"]),
                "items": st.lists(
                    st.fixed_dictionaries(
                        {
                            "product_id": st.text(min_size=8, max_size=12),
                            "quantity": st.integers(min_value=1, max_value=5),  # Quantity of each item
                            "price": st.floats(min_value=0.0, max_value=1000.0),  # Price per item
                        }
                    ),
                    min_size=1,
                    max_size=5,
                ),
                "total_cost": st.floats(min_value=0.0, max_value=5000.0),  # Total cost of the order
                "payment_method": st.sampled_from(["credit_card", "paypal", "gift_card", "crypto"]),
                "shipping_address": st.one_of(st.none(), st.text(min_size=10, max_size=200)),
            }
        ),
    }
)

# Generate and print an example
EcommerceData.example()

{'product': {'product_id': '\x10|£ \U000435e8Æòh',
  'name': '\U00071521\U000f00a3\U0005101d\U0006143c=í«',
  'category': 'clothing',
  'price': 1.192092896e-07,
  'discount': None,
  'in_stock': 975},
 'user': {'user_id': '²G×Ï\U000d78b3R?A',
  'username': '\x85(\U000603ef\x0bû잗',
  'email': 'fQ@u.HSbc',
  'is_premium_member': True,
  'age': None,
  'address': {'street': 'ì\U000cfd04{\U000dfac5\x12\x7f\x06¹L(\U0001b8cdN\U00067d99\x85¶àød@\x93\x1d\U000c30bf\x9c',
   'city': '\U0010d509©\U0001eff4\U000e874d\U000f1a34\U00084f7eµ',
   'zip_code': '\x05\x10Î\x08±',
   'country': 'CA'}},
 'order': {'order_id': '\x03\x14JyÆA\x0e\U000dea2f',
  'order_date': datetime.datetime(6642, 6, 6, 20, 10, 22, 658176),
  'status': 'returned',
  'items': [{'product_id': '\x1a\x18kL:\U00041056\U000f7102\U0004f4d3',
    'quantity': 2,
    'price': 999.9999999999999},
   {'product_id': 'é&𱸍ÂV&]é', 'quantity': 1, 'price': 1.1754943508222875e-38},
   {'product_id': '\x05©\x93\U00064835\x0bêºd7\x08\x01á',
    '

The data might look like gibberish, but that's the point! We want to make sure the thing we're testing performs as expected given any data that meets its specifications. So the API should be able to handle it gracefully!
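As a minimal sketch of putting such a strategy to work, the test below feeds a cut-down payload through a JSON round trip. The order_item strategy is a hypothetical subset of the fields above, and the round trip is only a stand-in for a call to a real API, which I'm assuming you'd substitute in.

```python
import json

from hypothesis import given, strategies as st

# Hypothetical cut-down payload using a few of the fields from EcommerceData
order_item = st.fixed_dictionaries(
    {
        "product_id": st.text(min_size=8, max_size=12),
        "quantity": st.integers(min_value=1, max_value=5),
        "price": st.floats(min_value=0.0, max_value=1000.0),
        "discount": st.one_of(st.none(), st.floats(min_value=0.0, max_value=0.5)),
    }
)


@given(order_item)
def test_payload_survives_json_roundtrip(payload):
    # Stand-in for an API call: serialise the payload and read it back
    assert json.loads(json.dumps(payload)) == payload


test_payload_survives_json_roundtrip()
```

Fields such as order_date would need converting first (for example with isoformat()), since datetime objects are not JSON-serialisable by default.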

# Hypothesis can write test functions for you!

Earlier we saw that Hypothesis removes the need to think about which specific inputs to test. But Hypothesis can also write the *test functions* for you!

Hypothesis has a set of *ghostwriters* that help you write common forms of test functions. For example, the *fuzz* ghostwriter writes a test function that checks that, given any valid input, a function runs without errors (or raises only errors you've said are acceptable):

In [13]:
from hypothesis.extra import ghostwriter


def adder(x, y):
    return x + y
    
print(ghostwriter.fuzz(adder))

# This test code was written by the `hypothesis.extra.ghostwriter` module
# and is provided under the Creative Commons Zero public domain dedication.

from hypothesis import given, strategies as st

# TODO: replace st.nothing() with appropriate strategies


@given(x=st.nothing(), y=st.nothing())
def test_fuzz_adder(x, y):
    adder(x=x, y=y)



Notice that the test function's decorator uses the placeholder strategy st.nothing(). This is because Python is dynamically typed, so without type annotations Hypothesis cannot infer what types the arguments should be.

To help Hypothesis choose the right strategies automatically, simply add type annotations to your function:

In [14]:
def adder(x: int, y: int) -> int:
    return x + y
    
print(ghostwriter.fuzz(adder))

# This test code was written by the `hypothesis.extra.ghostwriter` module
# and is provided under the Creative Commons Zero public domain dedication.

from hypothesis import given, strategies as st


@given(x=st.integers(), y=st.integers())
def test_fuzz_adder(x: int, y: int) -> None:
    adder(x=x, y=y)



Now Hypothesis has automatically chosen the right strategies to use and you have a ready-made test function!
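To see the difference annotations make, the standard library's typing.get_type_hints shows what any tool can read off a function. This is only an illustration of the information available; I'm not claiming it is exactly how the ghostwriter inspects functions internally. The names untyped_adder and typed_adder are my own.

```python
from typing import get_type_hints


def untyped_adder(x, y):
    return x + y


def typed_adder(x: int, y: int) -> int:
    return x + y


# Without annotations there is nothing to go on
print(get_type_hints(untyped_adder))

# With annotations, each argument (and the return value) maps to a type
print(get_type_hints(typed_adder))
```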

There are other ghostwriters too! Two useful ones are idempotent() and roundtrip():

idempotent() checks whether *f*(a) == *f*(*f*(a)), e.g., in the case of sorting algorithms, sorting an already sorted array should yield that same array without any changes:

In [15]:
print(ghostwriter.idempotent(sorted))

# This test code was written by the `hypothesis.extra.ghostwriter` module
# and is provided under the Creative Commons Zero public domain dedication.

from hypothesis import given, strategies as st


@given(
    iterable=st.one_of(st.iterables(st.integers()), st.iterables(st.text())),
    key=st.none(),
    reverse=st.booleans(),
)
def test_idempotent_sorted(iterable, key, reverse):
    result = sorted(iterable, key=key, reverse=reverse)
    repeat = sorted(result, key=key, reverse=reverse)
    assert result == repeat, (result, repeat)



Here we're asking Hypothesis to create a test function for Python's built-in sorting function. Notice that it applied strategies for all the arguments of sorted(): iterable (the integers or strings you want to sort), key (which is always None), and reverse (a bool that determines ascending or descending order). Let's run it:

In [16]:
@given(
    iterable=st.one_of(st.iterables(st.integers()), st.iterables(st.text())),
    key=st.none(),
    reverse=st.booleans(),
)
@settings(verbosity=Verbosity.verbose)
def test_idempotent_sorted(iterable, key, reverse):
    result = sorted(iterable, key=key, reverse=reverse)
    repeat = sorted(result, key=key, reverse=reverse)
    assert result == repeat, (result, repeat)

test_idempotent_sorted() 

Trying example: test_idempotent_sorted(
    iterable=PrettyIter([]),
    key=None,
    reverse=False,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter([]),
    key=None,
    reverse=False,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter([]),
    key=None,
    reverse=True,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter([0]),
    key=None,
    reverse=False,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter([-13660, 3632, 2_842_157_859_296_842_360, -21210]),
    key=None,
    reverse=False,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter(['H\U00016d4e\U000b894f\U001096d0\x92L\x9c']),
    key=None,
    reverse=False,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter([]),
    key=None,
    reverse=True,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter([28783]),
    key=None,
    reverse=False,
)
Trying example: test_idempotent_sorted(
    iterable=PrettyIter([28783]),
 

As expected, the built-in sort function is idempotent. Lots of functions out there are expected to be idempotent, and this is a way to write test functions for them accurately and quickly!

The roundtrip() ghostwriter checks whether *f*(*g*(x)) == x. This is important for encode/decode functions or anything else that should be perfectly reversible.

Let's look at these simple Caesar-style encode and decode functions:
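As another sketch of the same property, here is a hand-written idempotence test for a text-cleaning function. The normalize_whitespace function is hypothetical and chosen only because cleaning functions like it are commonly expected to be idempotent.

```python
from hypothesis import given, strategies as st


def normalize_whitespace(text: str) -> str:
    # Hypothetical function: collapse runs of whitespace into single spaces
    return " ".join(text.split())


@given(st.text())
def test_normalize_whitespace_is_idempotent(text):
    once = normalize_whitespace(text)
    twice = normalize_whitespace(once)
    # Cleaning already-clean text should change nothing
    assert once == twice, (once, twice)


test_normalize_whitespace_is_idempotent()
```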

In [17]:
def encode(text: str, shift: int = 3) -> str:
    encoded_chars = [chr(ord(char) + shift) for char in text]
    return ''.join(encoded_chars)

def decode(text: str, shift: int = 3) -> str:
    decoded_chars = [chr(ord(char) - shift) for char in text]
    return ''.join(decoded_chars)


The cipher works by shifting each character by a certain number of places, so for example:

In [18]:
encode('a', shift=3)

'd'

So this function should have a roundtrip property:

In [19]:
decode(encode('a', shift=3), shift=3) == 'a'

True

Let's make sure this is actually true by getting Hypothesis to ghostwrite the test function and then running it:

In [20]:
print(ghostwriter.roundtrip(encode, decode))

# This test code was written by the `hypothesis.extra.ghostwriter` module
# and is provided under the Creative Commons Zero public domain dedication.

from hypothesis import given, strategies as st


@given(shift=st.integers(), text=st.text())
def test_roundtrip_encode_decode(shift: int, text: str) -> None:
    value0 = encode(text=text, shift=shift)
    value1 = decode(text=value0, shift=shift)
    assert text == value1, (text, value1)



In [21]:
@given(shift=st.integers(), text=st.text())
def test_roundtrip_encode_decode(shift: int, text: str) -> None:
    value0 = encode(text=text, shift=shift)
    value1 = decode(text=value0, shift=shift)
    assert text == value1, (text, value1)

test_roundtrip_encode_decode()

  + Exception Group Traceback (most recent call last):
  |   File "c:\Users\seanm\datascience\Lib\site-packages\IPython\core\interactiveshell.py", line 3577, in run_code
  |     exec(code_obj, self.user_global_ns, self.user_ns)
  |   File "C:\Users\seanm\AppData\Local\Temp\ipykernel_8648\1970084490.py", line 7, in <module>
  |     test_roundtrip_encode_decode()
  |   File "C:\Users\seanm\AppData\Local\Temp\ipykernel_8648\1970084490.py", line 2, in test_roundtrip_encode_decode
  |     def test_roundtrip_encode_decode(shift: int, text: str) -> None:
  |                    ^^^
  |   File "c:\Users\seanm\datascience\Lib\site-packages\hypothesis\core.py", line 1706, in wrapped_test
  |     raise the_error_hypothesis_found
  | ExceptionGroup: Hypothesis found 2 distinct failures. (2 sub-exceptions)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "C:\Users\seanm\AppData\Local\Temp\ipykernel_8648\1970084490.py", line 3, in test_roundtrip_encode_de

It looks like encoding fails when shift is -49 and text is '0', but why?

```
ValueError: chr() arg not in range(0x110000)
Falsifying example: test_roundtrip_encode_decode(
    shift=-49,
    text='0',
)
```

In [22]:
encode(text='0', shift=-49)

ValueError: chr() arg not in range(0x110000)

In [23]:
ord('0')

48

It's because the number that represents '0' in Unicode is 48. But 48 shifted by -49 gives us -1, which does not correspond to any Unicode character, so chr(-1) throws an error:


In [24]:
chr(-1)

ValueError: chr() arg not in range(0x110000)
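The range in that error message can be checked directly. In this small sketch, is_valid_code_point is a hypothetical helper that mirrors the range(0x110000) quoted by chr(); sys.maxunicode is the largest code point Python supports.

```python
import sys

# The largest valid code point; chr() accepts 0 through this value inclusive
print(hex(sys.maxunicode))


def is_valid_code_point(code: int) -> bool:
    # Hypothetical helper mirroring the range chr() reports in its error message
    return 0 <= code < 0x110000


# '0' shifted by -49 lands on -1, which is outside the range
print(is_valid_code_point(ord('0') - 49))

# '0' shifted by the default 3 lands on 51, which is fine
print(is_valid_code_point(ord('0') + 3))
```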

Let's fix encode by catching the ValueError and raising it again with a clearer message:

In [25]:
def better_encode(text: str, shift: int = 3) -> str:
    try:
        encoded_chars = [chr(ord(char) + shift) for char in text]
        return ''.join(encoded_chars)
    except ValueError:
        # print(f"Shift {shift} may be out of bounds")
        raise ValueError("Shift value out of bounds")

We'll also tweak the ghostwriter to ignore ValueErrors now that we have them handled in-function:

In [26]:
print(ghostwriter.roundtrip(better_encode, decode, except_=ValueError))

# This test code was written by the `hypothesis.extra.ghostwriter` module
# and is provided under the Creative Commons Zero public domain dedication.

from hypothesis import given, reject, strategies as st


@given(shift=st.integers(), text=st.text())
def test_roundtrip_better_encode_decode(shift: int, text: str) -> None:
    try:
        value0 = better_encode(text=text, shift=shift)
        value1 = decode(text=value0, shift=shift)
    except ValueError:
        reject()
    assert text == value1, (text, value1)



In [27]:
from hypothesis import reject

@given(shift=st.integers(), text=st.text())
def test_roundtrip_better_encode_decode(shift: int, text: str) -> None:
    try:
        value0 = better_encode(text=text, shift=shift)
        value1 = decode(text=value0, shift=shift)
    except ValueError:
        reject()
    assert text == value1, (text, value1)

test_roundtrip_better_encode_decode()

OverflowError: Python int too large to convert to C int

It looks like there's another problem. If the shift value is too big there is an overflow error:
```
OverflowError: Python int too large to convert to C int
Falsifying example: test_roundtrip_better_encode_decode(
    shift=2_147_483_600,
    text='0',
)
```

Let's fix that and tweak the ghostwriter again. Here we catch errors where the shift value is too large or too small and raise a useful message alongside the error.

In [123]:
def better_better_encode(text: str, shift: int = 3) -> str:
    try:
        encoded_chars = []
        for char in text:
            new_code = ord(char) + shift
            if new_code > 0x10FFFF:
                raise OverflowError(f"Shift {shift} results in an overflow for character {char}")
            encoded_chars.append(chr(new_code))
        return ''.join(encoded_chars)
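    except (ValueError, OverflowError) as e:
        # e is an exception instance, so test its type with isinstance();
        # `e is OverflowError` compares it with the class and is always False
        if isinstance(e, OverflowError):
            raise
        else:
            raise ValueError("Shift value out of bounds")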

In [116]:
def better_decode(text: str, shift: int = 3) -> str:
    try:
        decoded_chars = []
        for char in text:
            new_code = ord(char) - shift
            if new_code < 0:
                raise OverflowError(f"Shift {shift} results in an underflow for character {char}")
            decoded_chars.append(chr(new_code))
        return ''.join(decoded_chars)
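    except (ValueError, OverflowError) as e:
        # e is an exception instance, so test its type with isinstance();
        # `e is OverflowError` compares it with the class and is always False
        if isinstance(e, OverflowError):
            raise
        else:
            raise ValueError("Shift value out of bounds")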
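A note on telling the two exception types apart inside a shared except (ValueError, OverflowError) as e block: e is an exception *instance*, so comparing it to the class with "is" never matches, whereas isinstance() does. The classify helper below is hypothetical and only there to show the difference.

```python
def classify(error: Exception) -> str:
    # Hypothetical helper showing which check recognises an OverflowError instance
    identity_check = error is OverflowError            # instance vs. class: always False
    instance_check = isinstance(error, OverflowError)  # True for OverflowError instances
    return f"is: {identity_check}, isinstance: {instance_check}"


try:
    raise OverflowError("Shift results in an overflow")
except (ValueError, OverflowError) as e:
    overflow_result = classify(e)

try:
    raise ValueError("chr() arg not in range(0x110000)")
except (ValueError, OverflowError) as e:
    value_result = classify(e)

print(overflow_result)
print(value_result)
```

Only the isinstance() check lets the OverflowError through with its original message; with the identity check every error would fall into the generic ValueError branch.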

In [117]:
print(ghostwriter.roundtrip(better_better_encode, better_decode, except_=(ValueError, OverflowError)))

# This test code was written by the `hypothesis.extra.ghostwriter` module
# and is provided under the Creative Commons Zero public domain dedication.

from hypothesis import given, reject, strategies as st


@given(shift=st.integers(), text=st.text())
def test_roundtrip_better_better_encode_better_decode(shift: int, text: str) -> None:
    try:
        value0 = better_better_encode(text=text, shift=shift)
        value1 = better_decode(text=value0, shift=shift)
    except (OverflowError, ValueError):
        reject()
    assert text == value1, (text, value1)



In [118]:
@given(shift=st.integers(), text=st.text())
def test_roundtrip_better_better_encode_better_decode(shift: int, text: str) -> None:
    try:
        value0 = better_better_encode(text=text, shift=shift)
        value1 = better_decode(text=value0, shift=shift)
    except (OverflowError, ValueError):
        reject()
    assert text == value1, (text, value1)
    
test_roundtrip_better_better_encode_better_decode()

The test passes, so now we have Caesar-style encode and decode functions that work as we expect and only fail in ways that we expect and can handle!
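Rejecting bad inputs is one option. Another, sketched below, is to describe the inputs you actually intend to support directly in the strategies. The specification here (printable ASCII text and shifts from 0 to 25) is my own assumption for illustration, and encode and decode are the original, unguarded versions.

```python
from hypothesis import given, strategies as st


def encode(text: str, shift: int = 3) -> str:
    return ''.join(chr(ord(char) + shift) for char in text)


def decode(text: str, shift: int = 3) -> str:
    return ''.join(chr(ord(char) - shift) for char in text)


# Assumed input specification: printable ASCII text and shifts from 0 to 25
printable_ascii = st.text(alphabet=st.characters(min_codepoint=32, max_codepoint=126))
small_shifts = st.integers(min_value=0, max_value=25)


@given(shift=small_shifts, text=printable_ascii)
def test_roundtrip_within_spec(shift: int, text: str) -> None:
    # Within this range chr() never sees an out-of-range code point
    assert decode(encode(text, shift), shift) == text


test_roundtrip_within_spec()
```

The trade-off is that the test no longer says anything about inputs outside the stated range, so it only makes sense if callers are genuinely restricted to it.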

# Review:

Hypothesis helps you:
1. Automatically think of and write test cases you couldn't dream of!
2. Generate arbitrarily complex test payloads, from strings to JSON objects to dataframes!
3. Write test functions automatically!
4. Write better code by ensuring your programs never fail in an unexpected way!

MacIver, D. R., Hatfield-Dodds, Z., & many other contributors. (2019). Hypothesis: A new approach to property-based testing. https://doi.org/10.21105/joss.01891