# Chapter 4: Property-Based Testing - From Examples to Hypotheses

In Chapters 1–3, we moved from a quick, monolithic solver to a modular design with unit tests.
Unit tests gave us confidence, but only for cases that we handpicked.

Now, we’ll take the next step by introducing property-based testing via the Hypothesis
library:

 - Instead of manually writing individual test cases, we’ll state general properties such as conservation, symmetry, the maximum principle, or linearity.
 - Hypothesis will generate many random inputs automatically, exploring cases we might never think to test.
 - When a property fails, Hypothesis will shrink the failing case to the simplest possible counterexample, the testing equivalent of a controlled scientific experiment.

## What is a "property"?

A property is a statement about your program that you expect to hold for all valid inputs.
It’s a general rule, not just a single example.

For our 1-D heat equation solver, some key properties might include:

- **Conservation under insulated BCs:** total heat stays constant.
- **Boundary work balance:** with fixed flux BCs, the heat change matches net boundary flux over Δt.
- **Symmetry:** symmetric initial states stay symmetric under symmetric BCs.
- **Maximum principle:** diffusion shouldn’t create new temperature extremes, as long as the scheme is stable.
- **Linearity:** the discrete update should scale linearly with the initial temperature distribution.
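
To make these statements concrete, here is a minimal sketch of how three of them could be written as executable checks. The solver from the earlier chapters is not reproduced here, so `step_insulated`, its parameter `r` (standing for α·Δt/Δx²), and the tolerance `tol` are hypothetical stand-ins chosen for illustration, not the tutorial's actual code.

```python
import random

def step_insulated(u, r):
    """One explicit diffusion step in flux form with insulated (zero-flux) ends.

    Hypothetical stand-in for the chapter's solver: `u` is a list of cell
    temperatures and `r` plays the role of alpha * dt / dx**2.
    """
    n = len(u)
    # flux[j] is the heat crossing face j from cell j-1 into cell j;
    # the two end faces carry no heat because the boundaries are insulated.
    flux = [0.0] + [u[j - 1] - u[j] for j in range(1, n)] + [0.0]
    return [u[i] + r * (flux[i] - flux[i + 1]) for i in range(n)]

def conserves_heat(u, r, tol=1e-9):
    """Conservation: total heat is unchanged, up to rounding."""
    return abs(sum(step_insulated(u, r)) - sum(u)) <= tol

def respects_max_principle(u, r, tol=1e-9):
    """Max principle: no new extremes appear, up to rounding."""
    new = step_insulated(u, r)
    return min(u) - tol <= min(new) and max(new) <= max(u) + tol

def preserves_symmetry(u, r, tol=1e-9):
    """Symmetry: the updated state equals its own mirror image, up to rounding."""
    new = step_insulated(u, r)
    return all(abs(a - b) <= tol for a, b in zip(new, reversed(new)))

# Try the checks on a handful of randomly generated states.
rng = random.Random(0)
for _ in range(200):
    n = rng.randint(2, 30)
    u = [rng.uniform(0.0, 100.0) for _ in range(n)]
    r = rng.uniform(0.0, 0.5)          # precondition: stable regime, r <= 1/2
    assert conserves_heat(u, r)
    assert respects_max_principle(u, r)
    half = u[: n // 2]
    assert preserves_symmetry(half + half[::-1], r)   # symmetric initial state
```

Each check takes an arbitrary state rather than a handpicked one, which is what makes it a property. The symmetry check only makes sense when the initial state is itself symmetric, so that requirement acts as its precondition.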

## Unit tests vs. property-based tests
 - Unit tests: "for this input, expect this output"
 - Property tests: "for all inputs satisfying these preconditions, this relation should hold"

Property tests complement unit tests:
 - Unit tests are concrete and focused.
 - Property tests are broader and can reveal edge cases you never anticipated.
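
The contrast can be sketched without any library. In the example below, `clamp` is a hypothetical function invented for illustration, and the property test is hand-rolled with Python's `random` module rather than a dedicated tool.

```python
import random

def clamp(x, lo, hi):
    """Hypothetical example function: restrict x to the interval [lo, hi]."""
    return max(lo, min(x, hi))

# Unit test: for this input, expect this output.
def test_clamp_example():
    assert clamp(5, 0, 3) == 3

# Property test: for all inputs with lo <= hi, the result lies in [lo, hi].
def test_clamp_property(trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        lo = rng.randint(-100, 100)
        hi = rng.randint(lo, 100)      # precondition lo <= hi holds by construction
        x = rng.randint(-1000, 1000)
        assert lo <= clamp(x, lo, hi) <= hi

test_clamp_example()
test_clamp_property()
```

The unit test pins down one exact answer. The property test says less about any single input but covers far more of the input space.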

## How this relates to Chapter 2

In Chapter 2, we introduced preconditions, postconditions, and invariants as lightweight specifications for our solver.

Property-based testing is a natural extension of these ideas:

 - The precondition {P} now defines the input space that Hypothesis will explore.
 - The postcondition {Q} or invariant becomes the property being checked.
 - When a test fails, Hypothesis provides a counterexample, helping you determine whether:
   - The precondition was too weak (i.e., it allowed invalid inputs).
   - The postcondition was too strong (i.e., it ruled out valid outputs).
   - There’s a bug in the implementation.

Think of this as turning your Chapter 2 contracts into scientific hypotheses:

 - “If the precondition holds, the property should always be true.”

Hypothesis plays the role of the experimenter, probing your code with many randomized “experiments” to try and refute your claim.
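
As a rough illustration of this experiment loop, the sketch below searches a fixed list of candidates for an input that satisfies {P} but violates {Q}. The helper `find_counterexample` and the two postconditions are hypothetical names introduced here. Hypothesis does considerably more, but the roles of {P}, {Q}, and the counterexample are the same.

```python
import math

def find_counterexample(pre, func, post, candidates):
    """Return the first candidate that satisfies `pre` but violates `post`, or None."""
    for x in candidates:
        if pre(x) and not post(x, func(x)):
            return x
    return None

def pre(n):
    """{P}: math.isqrt is only defined for non-negative integers."""
    return n >= 0

def post_ok(n, r):
    """{Q}: r is the integer square root of n."""
    return r * r <= n < (r + 1) * (r + 1)

def post_too_strong(n, r):
    """A postcondition that wrongly demands an exact square root."""
    return r * r == n

print(find_counterexample(pre, math.isqrt, post_ok, range(-5, 1000)))          # None
print(find_counterexample(pre, math.isqrt, post_too_strong, range(-5, 1000)))  # 2
```

The second call returns 2, since `math.isqrt(2)` is 1 and 1 * 1 != 2. Here the implementation is fine and the counterexample exposes a postcondition that was too strong.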

## The `Hypothesis` library

Hypothesis is a Python library for property-based testing.
Instead of handpicking test cases, you describe the space of valid inputs, and Hypothesis generates random examples within that space.

If a failure is found, Hypothesis automatically shrinks the input to the simplest version that still causes the failure. This helps you debug by giving a clear, minimal failing example.

Let's revisit our `div` function from the previous chapters.

In [9]:
def div(x, y):
    assert y != 0           # P    (precondition)
    res = x / y             # code (implementation)
    assert res * y == x     # Q    (postcondition)
    return res

In Chapter 3, we tested this manually:

In [10]:
def test_division():
    div(7, 25)

test_division()

AssertionError: 

Running this fails because of floating-point rounding: 7 / 25 is rounded to the nearest representable value, and multiplying that value back by 25 gives 7.000000000000001 rather than 7.
But notice: we only found this case because we happened to test exactly this input.
If we had tested other pairs, such as the ones in the cell below, the test
would have passed, leaving us unaware of the issue.

In [14]:
def test_more_divisions():
    div(7, 26)
    div(7, 24)
    div(7, 23)
    div(6, 1)
    div(2, 4)
    div(0, 1)
    div(46, 7)
    div(7657, 26)
    div(1, 3)
    div(23, 23424)
    div(1000, 3)
    div(123, 456)
    print("No assertion violations encountered!")

test_more_divisions()

No assertion violations encountered!
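
To see why `div(7, 25)` trips the postcondition while a pair like `(1, 3)` does not, it helps to look at the round trip directly. The sketch below also includes `div_close`, a hypothetical variant whose postcondition tolerates rounding. Both the name and the `rel_tol` value are assumptions made for illustration, not a fix prescribed by this tutorial.

```python
import math

# The quotient is rounded once, and the product is rounded again.
print((7 / 25) * 25 == 7)   # False
print((7 / 25) * 25)        # 7.000000000000001
print((1 / 3) * 3 == 1)     # True: here the two roundings happen to cancel

def div_close(x, y, rel_tol=1e-9):
    """Variant of div whose postcondition allows a small relative error."""
    assert y != 0                                       # P
    res = x / y                                         # code
    assert math.isclose(res * y, x, rel_tol=rel_tol)    # Q, weakened
    return res

div_close(7, 25)   # no AssertionError
```

Whether the weakened postcondition holds for all valid inputs is itself a claim that a few handpicked calls cannot settle.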


### Hypothesis to the rescue

Let's use Hypothesis to generate a wide range of test cases for our division function.

In [4]:
from hypothesis import assume, given, strategies as st

@given(st.integers(), st.integers())
def test_div_property(x, y):
    assume(y != 0)
    div(x, y)

The above test definition highlights the key components of property-based testing with Hypothesis:

1. The `@given` decorator
  - This is the main entry point for defining a property-based test.
  - It takes one or more strategies as arguments, which describe how Hypothesis should generate input data for the test.

2. Strategies
  - Strategies define the space of possible inputs.
  - In our example, we used `st.integers()` to generate random integers for both `x` and `y`.

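We also used the `assume` function to enforce our precondition {P}.
Here, the precondition is `y != 0`, which rules out division by zero. When the condition passed to `assume` is false, Hypothesis discards that example instead of reporting a failure, so the property is only checked on valid inputs.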

With these pieces in place, we can now run the property-based test and let Hypothesis explore a wide range of inputs automatically:

In [5]:
test_div_property()

AssertionError: 

Running the above property-based test should reveal a pair of integers (x, y) that violates the division property. Because the inputs are generated randomly, the first failing pair that Hypothesis encounters can differ from run to run.
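
The precondition can also be moved out of the test body and into the strategy itself by using the `.filter()` method that Hypothesis strategies provide. The sketch below applies this to a different property, the integer `divmod` identity, chosen here because it is exact and therefore expected to pass. The test name is hypothetical.

```python
from hypothesis import given, strategies as st

# Nonzero integers: the precondition y != 0 is built into the strategy.
nonzero_integers = st.integers().filter(lambda y: y != 0)

@given(st.integers(), nonzero_integers)
def test_floor_division_identity(x, y):
    q, r = divmod(x, y)
    assert q * y + r == x   # exact, since only integer arithmetic is involved

test_floor_division_identity()
```

Both `assume` and `.filter()` work by rejecting unsuitable examples, so they are best suited to preconditions that most generated inputs already satisfy.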

### Shrinking

When a test fails, Hypothesis automatically *shrinks* the input, step by step, toward the simplest failing example it can find.
This makes debugging much easier, since you don’t need to wade through large random inputs.
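
One way to watch shrinking in isolation is `hypothesis.find`, which searches a strategy for a value satisfying a predicate and returns the most shrunk such value it reaches. The sketch below asks for a pair that violates the postcondition of `div`. The bounded range and the helper name `violates_postcondition` are assumptions made for illustration.

```python
from hypothesis import find, strategies as st

# Bounded integers keep the search space small for this illustration.
small_ints = st.integers(min_value=-1000, max_value=1000)

def violates_postcondition(pair):
    """True if the pair is a valid input to div that breaks res * y == x."""
    x, y = pair
    return y != 0 and (x / y) * y != x

x, y = find(st.tuples(small_ints, small_ints), violates_postcondition)
print(f"Shrunk counterexample: x={x}, y={y}")
```

The exact pair printed may depend on the Hypothesis version, but it is typically much smaller than the first failing pair the random search stumbles on.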

### Combining strategies

In the above example, we only tested division with integer inputs. However, we can also test floating-point inputs by using the `st.floats()` strategy from Hypothesis. Better yet, we can combine both strategies to test the `div` function with a wider range of inputs:

In [73]:
from hypothesis import assume, given, strategies as st

# Combine integers and floats into a single strategy:
numbers = st.one_of(st.integers(), st.floats(allow_nan=False, allow_infinity=False))

@given(numbers, numbers)
def test_div_property(x, y):
    assume(y != 0)
    div(x, y)

In [74]:
test_div_property()

AssertionError: 

### Controlling how many examples are generated

By default, Hypothesis runs each test on up to 100 generated examples. You can change this with the `max_examples` argument of the `@settings` decorator:

In [72]:
from hypothesis import given, settings, strategies as st

ctr = 0

@settings(max_examples=10)
@given(st.integers())
def test_random(x):
    global ctr
    ctr += 1
    print(f"Test no. {ctr} with random input {x}")

test_random()

Test no. 1 with random input 0
Test no. 2 with random input -8477684967185004649
Test no. 3 with random input 6386
Test no. 4 with random input -31712
Test no. 5 with random input -17165
Test no. 6 with random input 4
Test no. 7 with random input 114
Test no. 8 with random input -24931
Test no. 9 with random input -1854951769
Test no. 10 with random input -16623


In the above cell, we reduced `max_examples` from the default 100 to 10. In practice, consider
increasing `max_examples` as the property and its input space become more complex, particularly when no counterexamples have been found yet.

## What we just did

...


## Looking Ahead


---

*This notebook is a part of the "Rigor and Reasoning in Research Software" (R3Sw) tutorial, led by Alper Altuntas and sponsored by the Better Scientific Software (BSSw) Fellowship Program. Copyright © 2025*