# Overview of Hypothesis for Python

Hypothesis is a powerful and easy-to-use library for property-based testing in Python. This guide is written with the assumption that the reader is new to property-based testing.

## What is Property-Based Testing?

In its simplest form, property-based testing checks that a program, or some property of it, holds across a wide range of inputs. This contrasts with traditional example-based testing, where each test checks one specific, hand-picked input. How the inputs are generated varies between tools, but the goal is the same: a more robust way to test software. That said, traditional software testing still has its place and should be used in conjunction with property-based testing.
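To make the contrast concrete, here is a minimal sketch that uses only the standard library, not Hypothesis. The function names, the choice of `sorted` as the code under test, and the input ranges are illustrative assumptions; real property-based testing tools generate inputs far more cleverly than this loop does.

```python
import random


def test_sorted_example() -> None:
    # Traditional, example-based test: one hand-picked input.
    assert sorted([3, 1, 2]) == [1, 2, 3]


def check_sorted_properties(trials: int = 200, seed: int = 0) -> None:
    # Hand-rolled property check: many random inputs, general assertions.
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 20))]
        result = sorted(xs)
        # Property 1: sorting preserves the length.
        assert len(result) == len(xs)
        # Property 2: each element is <= its successor.
        assert all(a <= b for a, b in zip(result, result[1:]))
        # Property 3: sorting an already sorted list changes nothing.
        assert sorted(result) == result


test_sorted_example()
check_sorted_properties()
```

The example-based test says something about exactly one list, whereas the property check states what must hold for any list of integers.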

## Advantages and Disadvantages of Hypothesis

It is important to be aware of the advantages and disadvantages of Hypothesis in order to make an informed decision about whether or not to incorporate it.

### Advantages
1. **Finds Edge Cases and Hidden Bugs**: Because the test cases are randomly generated, Hypothesis can surface bugs and edge cases that were not thought of during traditional testing.
2. **Time Efficiency**: Hypothesis is easy to implement and automatic test generation saves time.
3. **Improves Test Robustness**: Due to a wide range of inputs, testing is more robust.
4. **Easy to Integrate**: Hypothesis is easy to integrate into an existing Python project.

### Disadvantages
1. **Longer Execution Time**: Hypothesis tests can take longer to run than a traditional unit test suite because Hypothesis generates and runs many more test cases.
2. **Less Control Over Test Cases**: Whereas traditional testing is precise about which inputs are exercised, property-based testing with Hypothesis is less precise and has a random element to it.
3. **Reproducibility**: Although Hypothesis does its best to narrow a failure down to the input that caused it, the random element means this is not always as straightforward or as precise as with traditional software testing.

## How to Use Hypothesis

This section focusses on the actual usage of Hypothesis through a very simple guide.

### Setting Up

This setup has been tested with the following specifications:
- Ubuntu 22.04.3 LTS
- Python 3.10.12
- pip 22.0.2 
- Hypothesis 6.92.2

However, assuming recent versions of Python and pip are installed, Hypothesis and pytest should work.


To install Hypothesis and pytest, run:

```bash
pip install hypothesis pytest
```

### Basic Test

```python
from hypothesis import given
import hypothesis.strategies as st

@given(st.integers())
def test_builtin_abs(x: int) -> None:
    assert abs(x) >= 0
    assert abs(x) == (x if x >= 0 else -x)

test_builtin_abs()
```

The test above checks Python's built-in `abs` function. The absolute value of an integer is the integer itself if it is greater than or equal to 0; otherwise the sign is flipped, so the result is always non-negative.

**The Purpose of the `given` Decorator**
- The `given` decorator is used to **specify the inputs** to the function you would like to test.
- It takes one or more **strategies** that describe how those inputs are generated

**Strategies in Hypothesis**
- A strategy defines how input should be generated
- For instance, `st.integers()` generates arbitrary Python integers: positive, negative, and zero
- Other options are `st.text()` to generate text and `st.dates()` to generate dates. Strategies for more complex types can also be built
- A strategy can be narrowed down, e.g. by specifying a range for the numbers generated by `st.integers()`
- A strategy can be thought of as part of the property. By specifying `st.integers()` we are asserting that the program should pass the test for all valid integers
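As a brief sketch of the points above, the tests below narrow strategies with keyword arguments and use a few other built-in strategies. The test names, bounds, and properties are illustrative assumptions.

```python
import datetime

from hypothesis import given
import hypothesis.strategies as st


# Integers restricted to a closed range.
@given(st.integers(min_value=1, max_value=12))
def test_month_in_range(month: int) -> None:
    assert 1 <= month <= 12


# Dates restricted to 1 January 2000 or later.
@given(st.dates(min_value=datetime.date(2000, 1, 1)))
def test_dates_from_2000(day: datetime.date) -> None:
    assert day.year >= 2000


# A more complex type: lists of integers with at most 10 elements.
@given(st.lists(st.integers(), max_size=10))
def test_reverse_twice(xs: list[int]) -> None:
    assert list(reversed(list(reversed(xs)))) == xs


test_month_in_range()
test_dates_from_2000()
test_reverse_twice()
```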


### Multiple Inputs

```python
def concat_strings(a: str, b: str) -> str:
    if not isinstance(a, str) or not isinstance(b, str):
        raise TypeError("Inputs must be of type string")

    return a + b


@given(st.text(), st.text())
def test_concat_strings(a: str, b: str) -> None:
    result = concat_strings(a, b)
    assert result == a + b
    assert len(result) == len(a) + len(b)

test_concat_strings()
```

As the example above shows, using Hypothesis with multiple inputs is as easy as passing one strategy to the `given` decorator for each argument of the test function, with each strategy matching the type of its argument.
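Strategies can also be passed to `given` as keyword arguments, which can make it easier to see which strategy feeds which parameter. The following is a small sketch; the test names and properties are chosen for illustration only.

```python
from hypothesis import given
import hypothesis.strategies as st


# Each keyword names the test parameter that the strategy feeds.
@given(a=st.text(), b=st.text())
def test_concat_keeps_prefix(a: str, b: str) -> None:
    assert (a + b).startswith(a)
    assert len(a + b) == len(a) + len(b)


# Strategies of different types can be mixed freely.
@given(s=st.text(), n=st.integers(min_value=0, max_value=5))
def test_repeat_length(s: str, n: int) -> None:
    assert len(s * n) == len(s) * n


test_concat_keeps_prefix()
test_repeat_length()
```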

### Complex Inputs

```python
from hypothesis.strategies import composite

PI = 3.14159

@composite
def custom_input_generator(draw) -> tuple[float, str]:
    decimal = draw(st.floats(max_value=PI))
    text = draw(st.text(alphabet=st.characters(whitelist_categories=['Lu']), min_size=2, max_size=5))
    return decimal, text

@given(custom_input_generator())
def test_custom_input_generator(generated_input: tuple[float, str]) -> None:
    decimal, text = generated_input
    assert decimal <= PI
    assert 2 <= len(text) <= 5
    assert text.isupper()

test_custom_input_generator()
```


**The Purpose of the `composite` Decorator**
- The `composite` decorator combines existing input generation methods (**_search strategies_**) into a single, more powerful and complex strategy.
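A composite strategy is particularly handy when one generated value depends on another. The sketch below draws a non-empty list first and then an index that is valid for that list. The name `list_and_index` and the size limits are assumptions made for illustration.

```python
from hypothesis import given
import hypothesis.strategies as st


@st.composite
def list_and_index(draw) -> tuple[list[int], int]:
    # First draw a non-empty list...
    xs = draw(st.lists(st.integers(), min_size=1, max_size=10))
    # ...then draw an index whose upper bound depends on that list.
    i = draw(st.integers(min_value=0, max_value=len(xs) - 1))
    return xs, i


@given(list_and_index())
def test_index_is_valid(pair: tuple[list[int], int]) -> None:
    xs, i = pair
    assert 0 <= i < len(xs)
    assert xs[i] in xs


test_index_is_valid()
```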

**Arguments to _Search Strategies_**
- Search strategies can receive arguments to further refine the inputs generated
- For instance, `st.floats(max_value=PI)` means all floats generated will be less than or equal to `PI`
- And `alphabet=st.characters(whitelist_categories=['Lu'])` means the generated text will consist only of uppercase letters (Unicode category `Lu`)
- Search strategy arguments can also be used outside of a `composite` function
- Alternatively, a strategy's `filter` method can discard unwanted values using a predicate such as a lambda function, e.g. `st.integers().filter(lambda x: x != 0)`
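The sketch below shows two ways of excluding unwanted inputs. The first uses `filter` on the strategy, as in the last bullet. The second uses Hypothesis's `assume` function inside the test body, which tells Hypothesis to discard the current example. The test names and properties are illustrative assumptions.

```python
from hypothesis import assume, given
import hypothesis.strategies as st


# Option 1: filter the strategy so that zero is never generated.
@given(st.integers().filter(lambda x: x != 0))
def test_square_of_nonzero_is_positive(x: int) -> None:
    assert x * x > 0


# Option 2: reject unsuitable examples inside the test with assume().
@given(st.integers(), st.integers())
def test_divmod_identity(a: int, b: int) -> None:
    assume(b != 0)  # examples with b == 0 are discarded, not failed
    q, r = divmod(a, b)
    assert q * b + r == a


test_square_of_nonzero_is_positive()
test_divmod_identity()
```

When a condition rejects a large share of the generated values, narrowing the strategy with arguments such as `min_value` is usually the cheaper option, since nothing has to be generated and thrown away.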

### Specifying Test Case Amount

Increasing the number of generated test cases can make tests more robust. A large volume of random test cases raises confidence in the code, although, unlike formal verification, it cannot prove the absence of bugs. The number of generated test cases can be changed using the `settings` decorator and its `max_examples` parameter (100 by default):


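```python
from hypothesis import given, settings
import hypothesis.strategies as st

# 100 is already the default; raise the number for a more thorough run.
@settings(max_examples=100)
@given(st.integers())
def test_builtin_abs(x: int) -> None:
    assert abs(x) >= 0
    assert abs(x) == (x if x >= 0 else -x)

test_builtin_abs()
```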

This can also be done globally, by specifying the following at the beginning of the test file:

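```python
from hypothesis import settings

# Register a profile under the name "default" and make it the active one.
settings.register_profile("default", max_examples=100)
settings.load_profile("default")
```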
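Profiles can carry other names as well, which allows, for example, a quick run during development and a more thorough run elsewhere. The sketch below is one possible arrangement: the profile names `dev` and `ci`, the example counts, and the `HYPOTHESIS_PROFILE` environment variable are assumptions chosen for illustration.

```python
import os

from hypothesis import settings

# Hypothetical profiles: a quick one for local runs, a thorough one for CI.
settings.register_profile("dev", max_examples=10)
settings.register_profile("ci", max_examples=1000)

# Choose the profile from an environment variable, falling back to "dev".
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
```

When the tests are run through pytest, the Hypothesis plugin also accepts a `--hypothesis-profile` option that selects a registered profile by name.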

### Specifying Seeds

A noteworthy capability of Hypothesis is the ability to specify a seed. With a fixed seed, the test cases produced by Hypothesis remain consistent across runs, which removes the randomness. This is particularly useful for debugging, as Hypothesis tries to report a seed that reproduces a failure in the program. Adding seeded tests to the test suite, alongside randomized ones, can improve test reliability. To set a seed, use the `seed` decorator with a seed value.


```python
from hypothesis import given, seed
import hypothesis.strategies as st

@seed(30)
@given(st.integers())
def test_builtin_abs(x: int) -> None:
    assert abs(x) >= 0
    assert abs(x) == (x if x >= 0 else -x)

test_builtin_abs()
```
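Besides seeds, Hypothesis offers two related tools for making runs predictable: the `example` decorator, which pins explicit inputs that are always tried, and the `derandomize` setting, which makes the generated cases deterministic. The sketch below combines both. The pinned values and the test name are arbitrary choices for illustration.

```python
from hypothesis import example, given, settings
import hypothesis.strategies as st


@settings(derandomize=True)  # generated cases are the same on every run
@given(st.integers())
@example(0)   # explicit inputs are always tried, in addition to
@example(-1)  # the generated ones
def test_abs_is_symmetric(x: int) -> None:
    assert abs(x) >= 0
    assert abs(-x) == abs(x)


test_abs_is_symmetric()
```

Pinning inputs this way also recovers some of the precision of traditional tests, since known tricky values are guaranteed to be covered.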

### Integration with pytest

Integrating Hypothesis with pytest is extremely easy. Simply run pytest as one normally would, either from the directory containing the tests or by explicitly naming the test file:

```bash
pytest
```

pytest should automatically pick up and run the Hypothesis tests. Normal unit and integration tests can live alongside the Hypothesis tests. This can be demonstrated as follows:

1. Writing a Hypothesis test to a Python file
2. Running the file with pytest
3. Deleting the file

As an example, the following three code blocks can be run in sequence.

```python
%%writefile test_script.py
from hypothesis import given
import hypothesis.strategies as st

@given(st.integers())
def test_builtin_abs(x: int) -> None:
    assert abs(x) >= 0
    assert abs(x) == (x if x >= 0 else -x)
```

```text
Writing test_script.py
```


```python
!pytest -q test_script.py
```

```text
.                                                                        [100%]
1 passed in 0.09s
```


Although multiple test cases were run, pytest reports only a single passed test. For more output that is specific to Hypothesis, the following command can be run:

```python
!pytest -q test_script.py --hypothesis-show-statistics
```

```text
.                                                                        [100%]
test_script.py::test_builtin_abs:

  - during reuse phase (0.00 seconds):
    - Typical runtimes: < 1ms, of which < 1ms in data generation
    - 1 passing examples, 0 failing examples, 0 invalid examples

  - during generate phase (0.03 seconds):
    - Typical runtimes: < 1ms, of which < 1ms in data generation
    - 99 passing examples, 0 failing examples, 0 invalid examples

  - Stopped because settings.max_examples=100


1 passed in 0.03s
```


```python
import os
os.remove("test_script.py")
```

## Conclusion

As seen in the examples above, Hypothesis is easy to use and allows for the creation of complex logic in the generation of inputs used. Thus it is a great addition to the testing setup of any software project.

## References

This guide is primarily based on the following resource:

Hypothesis Development Team. "Hypothesis for Python." Hypothesis, Hypothesis Development Team, n.d., https://hypothesis.readthedocs.io. Accessed December 20, 2023.