# Test Data Generation

- generating good test data can be challenging
- use the Hypothesis library - https://hypothesis.readthedocs.io/en/latest/index.html
- Hypothesis is a Python library for writing unit tests that automatically generates meaningful test data
    - helps find edge cases in your code that you would not have thought to look for
    - works with both the `pytest` and `unittest` libraries
- Hypothesis provides property-based testing
    - tests properties of the code that should hold true for every valid input
    - lets a single test cover a whole range of inputs, rather than writing a separate test (with hard-coded inputs) for every value you want to check
- lets you do **fuzz testing**
    - an automated software testing method that injects invalid, malformed, or unexpected inputs into a system to reveal software defects and security vulnerabilities
- more detailed examples: [https://semaphoreci.com/blog/property-based-testing-python-hypothesis-pytest](https://semaphoreci.com/blog/property-based-testing-python-hypothesis-pytest)
- install the Hypothesis library:

```bash
pip install hypothesis
```

- see the docs for what data you can generate and how: [https://hypothesis.readthedocs.io/en/latest/data.html#](https://hypothesis.readthedocs.io/en/latest/data.html#)
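
To make the idea of "one test, many generated inputs" concrete before bringing in Hypothesis, here is a minimal hand-rolled sketch that uses only the standard library. The helper names `check_property` and `random_int_list` and the fixed seed are assumptions made for illustration; Hypothesis does considerably more (smarter generation, shrinking of failing inputs).

```python
import random

def check_property(prop, make_input, runs=100, seed=0):
    """Call prop on `runs` generated inputs; return the first failing input, or None."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    for _ in range(runs):
        value = make_input(rng)
        if not prop(value):
            return value
    return None

def random_int_list(rng):
    # lists of 0 to 10 integers, including negatives and zero
    return [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 10))]

# Property: sorting an already sorted list changes nothing
failing = check_property(lambda xs: sorted(sorted(xs)) == sorted(xs), random_int_list)
print(failing)  # None: no counterexample found in 100 runs
```

A property that is false, such as "every list is already sorted", makes `check_property` return one of the generated lists as a counterexample.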

In [23]:
! pip install hypothesis



In [24]:
def add(nums:list[int]) -> int:
    s: int = 0
    for n in nums:
        s += n
    return s

In [25]:
# typical unit testing: hard-coded inputs are passed to functions/methods and the results are compared with the expected outputs
assert add([1, 2, 3]) == 6, '1 2 3 did NOT add to 6'
assert add([1, 3, -1, 0, -1]) == 2, '1, 3, -1, 0, -1 did NOT add to 2'
print('all tests done...')

all tests done...


In [26]:
# see settings docs: https://hypothesis.readthedocs.io/en/latest/settings.html
from hypothesis import given, settings, Verbosity
import hypothesis.strategies as some

In [29]:
# By default generates 100 random lists of integers
@given(some.lists(some.integers()))
def test_add(nums):
    print(nums) # comment this out to hide the generated lists
    assert add(nums) == sum(nums)

In [30]:
test_add()

[]
[0]
[0]
[20]
[0]
[0]
[0]
[]
[0]
[0]
[0]
[14248, 4080122076909879220, 13265, -14, 4705, 14825, -30469, -96]
[-3354187958538829209, -7628, -8081, 21958, -82, -12089, -8527183814462008243, -2064, 10691, -1277552906]
[-8579, -34084057804480655150321441586356569745, 1885473316, 7332894914154390761]
[23359, -13217, 113, -81, 18799, 4223, -18475, -28865, -27829, 63]
[16]
[17980, 1742648667, -94, -66, 9574, -12195, 12836, -11466, 28764]
[17980, 1742648667, -94, -66, 9574, -12195, 12836, -11466, 28764]
[-6609, -32, 29302, -6211, -3, 31231]
[16445]
[18653]
[15069, 19372, 20849, 11469, -12114, 86]
[15069, 19372, 20849, 11469, -12114, 86]
[-13247, 6650, 17979, 12161]
[-10]
[-138]
[-138, 79, 1515982208, 3632357360740786984, -9551]
[-138]
[-138, -138, -18080, -8758, 155281628062005949607034577926984830502, -108, 968284458, -54]
[-138, -138, -18080, -8758, 155281628062005949607034577926984830502, -108, 968284458, -54]
...

In [33]:
# more examples...
@given(some.integers(), some.integers())
# settings can control the number of examples, the example database, randomization, etc.
@settings(max_examples=100, verbosity=Verbosity.verbose, derandomize=True)
def test_ints_are_commutative(x, y):
    #print(x, y)
    assert x + y == y + x

In [35]:
test_ints_are_commutative()

Trying example: test_ints_are_commutative(
    x=0,
    y=0,
)
Trying example: test_ints_are_commutative(
    x=0,
    y=0,
)
Trying example: test_ints_are_commutative(
    x=-75595815,
    y=25546,
)
Trying example: test_ints_are_commutative(
    x=0,
    y=0,
)
Trying example: test_ints_are_commutative(
    x=-28065,
    y=-79,
)
Trying example: test_ints_are_commutative(
    x=0,
    y=0,
)
Trying example: test_ints_are_commutative(
    x=-4559,
    y=4665598009820062891,
)
Trying example: test_ints_are_commutative(
    x=0,
    y=0,
)
Trying example: test_ints_are_commutative(
    x=14153,
    y=17605,
)
Trying example: test_ints_are_commutative(
    x=0,
    y=0,
)
Trying example: test_ints_are_commutative(
    x=0,
    y=0,
)
Trying example: test_ints_are_commutative(
    x=-3375,
    y=1394,
)
Trying example: test_ints_are_commutative(
    x=1394,
    y=1394,
)
Trying example: test_ints_are_commutative(
    x=1394,
    y=1394,
)
...

In [36]:
# pass strategies as keyword arguments to name the generated data explicitly
@given(x=some.integers(), y=some.integers())
def test_ints_cancel(x, y):
    assert (x + y) - y == x

In [37]:
test_ints_cancel()
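
The cancellation property holds for Python's arbitrary-precision integers but not for floating-point numbers, so the choice of strategy matters. A quick standard-library check, with values picked by hand for illustration:

```python
x, y = 0.1, 1e16

# With ints the property holds exactly, no matter how large the values get
assert (10**40 + 7) - 7 == 10**40

# With floats, x is lost when it is added to a much larger y
print((x + y) - y == x)  # False
print((x + y) - y)       # 0.0
```

Swapping `some.integers()` for `some.floats()` in `test_ints_cancel` would therefore be expected to produce a failing example.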

In [38]:
# generate lists of arbitrary length (usually between 0 and
# 100 elements) whose elements are integers.
@given(some.lists(some.integers()))
def test_reversing_twice_gives_same_list(xs):
    ys = list(xs)
    ys.reverse()
    ys.reverse()
    assert ys == xs

In [39]:
test_reversing_twice_gives_same_list()

In [40]:
@given(some.tuples(some.booleans(), some.text()))
def test_look_tuples_work_too(t):
    # The generated tuple has the same shape as the strategies passed to tuples(),
    # with a value of the corresponding type in each position.
    assert len(t) == 2
    assert isinstance(t[0], bool)
    assert isinstance(t[1], str)

In [41]:
test_look_tuples_work_too()

In [42]:
# generate even numbers between 10 and 20:
# restrict the range with min_value and max_value, then double each value with the map method
@given(some.integers(min_value=5, max_value=10).map(lambda x: x*2))
def test_somefunc(num):
    print(num)
    # assert something about the function under test using num

In [43]:
test_somefunc()

10
18
10
10
20
14
16
12
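
An alternative to `map` is to generate the full range and discard odd values with `filter`. The sketch below checks that both approaches stay within the intended range; the names `evens_by_map` and `evens_by_filter` are assumptions for illustration.

```python
from hypothesis import given, settings, strategies as st

# map: generate 5..10, then double each value
evens_by_map = st.integers(min_value=5, max_value=10).map(lambda x: x * 2)
# filter: generate 10..20, then keep only the even values
evens_by_filter = st.integers(min_value=10, max_value=20).filter(lambda x: x % 2 == 0)

@settings(database=None)
@given(evens_by_map, evens_by_filter)
def test_both_strategies_yield_even_numbers_in_range(a, b):
    for value in (a, b):
        assert 10 <= value <= 20
        assert value % 2 == 0

test_both_strategies_yield_even_numbers_in_range()
```

`map` never throws a generated value away, whereas `filter` rejects roughly half of them here, so `map` is usually the cheaper choice when a transformation is available.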


In [48]:
# strategies can be composed for types such as list, tuple, etc.
# list of 1 to 100 integers, each with a minimum value of 1
@given(some.lists(some.integers(min_value=1), min_size=1, max_size=100))
def test_func1(nums):
    print(nums)
    

In [49]:
test_func1()

[1]
[1]
[1]
[1]
[1]
[12590]
[1]
[1996]
[1]
[1]
[1131236312, 28328, 1214599958, 7978562000870048562, 7627]
[98, 18]
[98, 18]
[1, 18]
[91, 43, 1586896285, 5324384061262269659, 19255, 5577463742941249159, 44, 63, 19708, 91, 25851, 6500001491918147879018860405130483215, 28619, 12, 76651580024789507934066175699311541696, 84, 69, 16, 108, 23638, 276601048, 67, 4755209746814401625, 4038, 5018840516472864172, 1655, 26318880382022301934792159577941957248, 18997, 9669, 25986, 114, 1853724379, 15153, 103, 17140, 32463637438918027495198403295622712545, 28, 8118, 125]
[91, 43, 1586896285, 5324384061262269659, 19255, 5577463742941249159, 44, 63, 19708, 91, 25851, 6500001491918147879018860405130483215, 28619, 12, 76651580024789507934066175699311541696, 84, 69, 16, 108, 23638, 276601048, 67, 4755209746814401625, 4038, 5018840516472864172, 1655, 26318880382022301934792159577941957248, 18997, 9669, 25986, 114, 1853724379, 15153, 103, 17140, 32463637438918027495198403295622712545, 28, 8118, 125]
...

### Software Requirement

1. Define a function that takes an integer between 1 and 10 (inclusive) as its argument
2. The function computes and returns the square root of the given integer

In [63]:
# see settings docs: https://hypothesis.readthedocs.io/en/latest/settings.html
from hypothesis import given, settings, Verbosity
import hypothesis.strategies as some

def int_sqrt(n: int) -> float:
    # Is this the correct implementation?
    assert isinstance(n, int)
    assert n >= 1 and n <=10
    return n**0.5

In [66]:
def test_int_sqrt():
    import math
    assert int_sqrt(9) == 3, 'sqrt(9) != 3'
    assert int_sqrt(4) == 2, 'sqrt(4) != 2'
    assert int_sqrt(10) == math.sqrt(10)
    #assert int_sqrt(100) == 10, 'sqrt(100) != 10'
    # any problem here...?
    print('all tests PASS...')

In [67]:
test_int_sqrt()

all tests PASS...
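
One thing the hard-coded test above leaves open is whether exact `==` on floats is a safe check. A sketch of an alternative, assuming the same `n**0.5` implementation (repeated here as `int_sqrt_sketch` so the block runs on its own): square the result and compare it with the input using a tolerance.

```python
import math

def int_sqrt_sketch(n: int) -> float:
    # same body as int_sqrt above, copied so this block is self-contained
    assert isinstance(n, int)
    assert 1 <= n <= 10
    return n ** 0.5

for n in range(1, 11):
    root = int_sqrt_sketch(n)
    # property: squaring the root gives back n, up to floating-point rounding
    assert math.isclose(root * root, n, rel_tol=1e-9)
    # property: the root of a number >= 1 lies between 1 and the number itself
    assert 1 <= root <= n

print(2 ** 0.5 * 2 ** 0.5 == 2)  # False: exact equality is too strict here
```

Checking a property of the result (its square) also avoids re-implementing the function under test inside the test.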


In [None]:
# Property-based testing using hypothesis
from dataclasses import dataclass
import hypothesis.strategies as st

@dataclass
class TestData:
    int_value: st.SearchStrategy[int]

# Generating correct input data range
test_data = TestData(int_value=st.integers(min_value=1, max_value=10))

In [53]:
@given(st.data())
def test_int_sqrt(data: st.DataObject):
    import math

    an_int = data.draw(test_data.int_value)
    root = int_sqrt(an_int)
    # comment out the print below to hide the test data
    print(an_int, root) 

    assert isinstance(an_int, int)
    assert root == math.sqrt(an_int)
    print('answer correct')

In [68]:
test_int_sqrt()

all tests PASS...


In [55]:
# What if you pass a string, a negative number, 0, a float, or a value larger than 10?

# Let's test negative values
@given(some.integers(min_value=-100000, max_value=-1))
def test_int_sqrt_negative(n: int):
    # This should raise AssertionError, but does it...?
    try:
        int_sqrt(n)
    except AssertionError:
        # expected path: printed once for every generated example
        print('assertion error thrown... PASS')
    else:
        # fail the test (instead of only printing) so Hypothesis reports the input
        raise AssertionError(f'int_sqrt({n!r}) did not raise AssertionError')

In [69]:
test_int_sqrt_negative()

assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
...

In [57]:
# Let's test values larger than 10
@given(some.integers(min_value=11, max_value=100))
def test_int_sqrt_larger_positives(n: int):
    # This should raise AssertionError, but does it...?
    try:
        int_sqrt(n)
    except AssertionError:
        # expected path: printed once for every generated example
        print('assertion error thrown... PASS')
    else:
        # fail the test (instead of only printing) so Hypothesis reports the input
        raise AssertionError(f'int_sqrt({n!r}) did not raise AssertionError')

In [70]:
test_int_sqrt_larger_positives()

assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
...

In [59]:
# Let's test with float values
@given(some.floats())
def test_int_sqrt_floats(n: float):
    # This should raise AssertionError, but does it...?
    try:
        int_sqrt(n)
    except AssertionError:
        # expected path: printed once for every generated example
        print('assertion error thrown... PASS')
    else:
        # fail the test (instead of only printing) so Hypothesis reports the input
        raise AssertionError(f'int_sqrt({n!r}) did not raise AssertionError')

In [71]:
test_int_sqrt_floats()

assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
assertion error thrown... PASS
...

In [61]:
# Let's test with some strings
@given(some.text())
def test_int_sqrt_strings(n: str):
    # This should raise AssertionError, but does it...?
    try:
        int_sqrt(n)
    except AssertionError:
        # expected path: printed once for every generated example
        print('assertion error thrown...PASS')
    else:
        # fail the test (instead of only printing) so Hypothesis reports the input
        raise AssertionError(f'int_sqrt({n!r}) did not raise AssertionError')

In [72]:
test_int_sqrt_strings()

assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
assertion error thrown...PASS
...

## Fix int_sqrt( ) so all property-based tests PASS

- since all the tests expect an AssertionError, use assert statements to check the required properties of the input data
    - assert num >= 1 and num <= 10
    - ...
- alternatively, raise your own custom error for invalid data and have the tests check for that error instead
- re-run all the property-based tests until every test PASSes
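
A minimal sketch of the custom-error option, using built-in exception types and `unittest`. The name `int_sqrt_strict` and the choice of `TypeError`/`ValueError` are assumptions for illustration, not the notebook's solution.

```python
import unittest

def int_sqrt_strict(n: int) -> float:
    """Hypothetical variant of int_sqrt that raises explicit errors."""
    # bool is a subclass of int, so reject it explicitly
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError(f"expected int, got {type(n).__name__}")
    if not 1 <= n <= 10:
        raise ValueError(f"expected 1 <= n <= 10, got {n}")
    return n ** 0.5

class TestIntSqrtStrict(unittest.TestCase):
    def test_rejects_wrong_types(self):
        for bad in (2.0, "4", None, True):
            with self.assertRaises(TypeError):
                int_sqrt_strict(bad)

    def test_rejects_out_of_range(self):
        for bad in (-1, 0, 11, 100):
            with self.assertRaises(ValueError):
                int_sqrt_strict(bad)

    def test_accepts_valid_range(self):
        for n in range(1, 11):
            self.assertAlmostEqual(int_sqrt_strict(n) ** 2, n)

if __name__ == "__main__":
    # argv and exit are set so the call also works inside a notebook cell
    unittest.main(argv=[""], exit=False)
```

Unlike `assert`, explicit exceptions are not stripped when Python runs with the `-O` flag, and the two error types let a test tell a wrong type apart from an out-of-range value.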

## Property-based testing demo

- see `src/unittesting/inventory` folder
    - a simple order processing and stock control system
    - borrowed from the book "The Pragmatic Programmer" by David Thomas and Andrew Hunt
- two classes:
    - `Warehouse` and `Order` in two separate modules
- if the Warehouse doesn't have enough inventory, you shouldn't be able to create an Order!
    - how do you check if the warehouse has enough inventory?
    - Hypothesis will find this bug and report it!
- run the provided test modules in the following order:

```bash
pytest test_order.py # no hypothesis used; doesn't find error!
pytest test_warehouse.py # no hypothesis used
pytest test_order_fail.py # <-- this property-based testing using hypothesis will find error
pytest test_order_fixed.py # use hypothesis on fixed order.py
```

- the Hypothesis-based test modules perform several property-based tests
- test data is generated automatically using `hypothesis`
- Hypothesis finds the data that causes tests to fail
    - use that data to write a separate explicit `unittest` test, which becomes your regression test
    - since the data is generated randomly, there is no guarantee that the same data will be generated on the next run
- property-based tests often surprise you!
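
Pinning a failing input as an explicit regression test could look like the sketch below. The `Warehouse` class here is a hypothetical stand-in written for this example (the real classes live in `src/unittesting/inventory` and may differ), and the pinned values are made up rather than taken from an actual Hypothesis report.

```python
import unittest

class Warehouse:
    """Hypothetical stand-in for the inventory Warehouse class."""

    def __init__(self, stock: dict[str, int]):
        self.stock = dict(stock)

    def in_stock(self, item: str, quantity: int) -> bool:
        return self.stock.get(item, 0) >= quantity

    def take_from_stock(self, item: str, quantity: int) -> None:
        if not self.in_stock(item, quantity):
            raise ValueError(f"not enough {item!r} in stock")
        self.stock[item] -= quantity

class OrderRegressionTest(unittest.TestCase):
    def test_cannot_take_more_than_stock(self):
        # Pinned input: suppose Hypothesis reported stock=1, quantity=2 (made-up values)
        warehouse = Warehouse({"hats": 1})
        with self.assertRaises(ValueError):
            warehouse.take_from_stock("hats", 2)
        # the failed attempt must leave the stock untouched
        self.assertEqual(warehouse.stock["hats"], 1)

if __name__ == "__main__":
    # argv and exit are set so the call also works inside a notebook cell
    unittest.main(argv=[""], exit=False)
```

Because the input is hard-coded, this test exercises the same case on every run, independent of what Hypothesis happens to generate next time.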

### Regression test
- a type of software testing that re-runs functional and non-functional tests to ensure that a software application still works as intended after any code changes, updates, revisions, improvements, or optimizations
- often focuses on the subset of tests that target the changed code or new feature
- exercise: change the int_sqrt( ) function to accept values from 0 to 100
- see if the existing tests still pass
    - do you need new property-based tests?
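
A sketch of how the widened range interacts with the earlier tests, assuming the change is made by adjusting the range assertion. `int_sqrt_wide` and `raises_assertion` are hypothetical names used only here.

```python
def int_sqrt_wide(n: int) -> float:
    """Hypothetical int_sqrt after the requirement changes to 0..100."""
    assert isinstance(n, int)
    assert 0 <= n <= 100
    return n ** 0.5

def raises_assertion(func, value) -> bool:
    """Return True if func(value) raises AssertionError."""
    try:
        func(value)
    except AssertionError:
        return True
    return False

# Old expectation from test_int_sqrt_larger_positives: 11..100 must be rejected
no_longer_rejected = [n for n in range(11, 101) if not raises_assertion(int_sqrt_wide, n)]
print(len(no_longer_rejected))  # 90: every one of these inputs is now accepted

# Expectations that still hold after the change
assert raises_assertion(int_sqrt_wide, -1)
assert raises_assertion(int_sqrt_wide, 101)
assert raises_assertion(int_sqrt_wide, 2.5)
```

Under this assumption, the test for values larger than 10 would need its strategy moved above 100, and the valid-range strategy would need to widen to 0..100.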