# Build a comprehensive **unit-test suite** for a "greatest common divisor" function, `gcd`

Specification: For two integers $x$ and $y$ which are not both zero, the **greatest common divisor** 
$\displaystyle \gcd(x, y)$ is the largest positive integer that divides each of the integers. For example, $\displaystyle \gcd(8, 12) = 4$. Furthermore, $\displaystyle \gcd(0, 0) = 0$.

_Adapted from Wikipedia, https://en.wikipedia.org/wiki/Greatest_common_divisor, retrieved on September 21 2023_

In [1]:
# import our gcd function from our example module
from example import gcd

## We want to check for correct functionality with **good data**

### Essential tooling: `assert` lets us throw an error if a check is false

We can use `assert` statements in Python to write tests.

The `assert` statement raises an `AssertionError` if the result of a calculation is "falsy":

In [2]:
assert True  # no Error

In [3]:
assert False  # raises an Exception

AssertionError: 

In [4]:
assert 1 == 1  # no Error

In [5]:
assert 1 > 0  # no Error

In [6]:
assert 0 > 1  # False, so raises an exception

AssertionError: 

In [7]:
assert "a"  # a string is truthy...

In [8]:
assert ""  # but an empty string is falsy

AssertionError: 

> Beware: `assert` is meant for debugging, and can be turned off by running `python` with the `-O` flag.
> Use `raise` statements and conditions if your code relies on the check being run.
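
As a minimal sketch of that advice, the hypothetical helper `require_positive` below (not part of the `example` module) validates its input with explicit `raise` statements, so the checks still run under `python -O`:

```python
def require_positive(value: int) -> int:
    """Return `value` unchanged, or raise if it is not a positive integer.

    Hypothetical helper, for illustration only.
    """
    if type(value) is not int:
        raise TypeError(f"expected int, got {type(value).__name__}")
    if value <= 0:
        raise ValueError(f"expected a positive integer, got {value}")
    return value


# An `assert value > 0` inside the function would be skipped under `python -O`;
# the explicit `raise` statements above are always executed.
checked = require_positive(8)
```

Inside test code, `assert` remains the right tool; the distinction only matters for checks which production code depends on.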

### Type test

Does it produce sensible results, like the correct datatype?

In [9]:
assert type(gcd(8, 12)) is int  

... or the correct sign (+ rather than -)?

In [10]:
assert gcd(8, 12) >= 0

### Nominal cases

Check for correct result in "normal", middle-of-the-road cases. 

In [11]:
assert gcd(7, 21) == 7
assert gcd(20, 10) == 10
assert gcd(54, 24) == 6

Ideally, you'd want to test many nominal cases. This could be through calculating them by hand, or constructing examples at random. 

See the section on "property-based testing" for examples of how to do this.
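
For hand-calculated cases, a table of inputs and expected results keeps the examples easy to extend. The sketch below is one way to lay this out; `math.gcd` from the standard library stands in for the function under test (an assumption, so that the block runs on its own), and the last two rows are extra examples not used elsewhere in this notebook.

```python
import math

# Stand-in for the function under test (an assumption for this sketch);
# in the notebook this would be `from example import gcd`.
gcd_under_test = math.gcd

# Each row: (x, y, expected gcd), worked out by hand from the prime factors.
NOMINAL_CASES = [
    (7, 21, 7),     # 21 = 3 * 7
    (20, 10, 10),   # 20 = 2 * 10
    (54, 24, 6),    # 54 = 2 * 3^3, 24 = 2^3 * 3
    (35, 49, 7),    # 35 = 5 * 7, 49 = 7^2
    (17, 13, 1),    # two distinct primes
]

for x, y, expected in NOMINAL_CASES:
    actual = gcd_under_test(x, y)
    assert actual == expected, f"gcd({x}, {y}) returned {actual}, expected {expected}"
```

Adding a new nominal case then means adding one row, and the failure message names the inputs which broke.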

### Boundaries

Check for correctness at the boundaries of the domain, or boundaries within parameters.
Checking the boundary means the value on the boundary, just above, and (if valid) just below.

The `gcd` function operates on integers and has a boundary at zero:

In [12]:
assert gcd(1, 17) == 1  # should be 1
assert gcd(0, 17) == 0  # should be 0

> Python doesn't have a bound on the size of integers, and we'll look at common errors with large values later.

### Compound boundaries

You should test the behavior of your function at places where several variables have boundaries.

In the case of the `gcd`, this is relatively simple:

In [13]:
assert gcd(0, 0) == 0

### Special cases

Check behavior at special values (if any exist):

In [14]:
assert gcd(0, 0) == 0

### Symmetries

We also know that $\gcd(x, y) = \displaystyle \gcd(y, x)$ so we should test those too:

In [15]:
# Nominal
assert gcd(21, 7) == 7
assert gcd(10, 20) == 10
assert gcd(24, 54) == 6

# Boundary
assert gcd(17, 1) == 1  # should be 1
assert gcd(17, 0) == 0  # should be 0

### Minimum and Maximum Valid Configuration

For programs which accept collections of things as arguments, we should check 
- the "minimum normal configuration", with the smallest valid dataset,
- the "maximum normal configuration", with data at least as large as the largest expected dataset for the use case.
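
Our two-argument `gcd` does not take a collection, so as an illustration only, here is a sketch with a hypothetical `gcd_many` function which reduces a list of positive integers with the standard library's `math.gcd`. The figure of 100,000 values for the "largest expected dataset" is an arbitrary assumption.

```python
import math
from functools import reduce


def gcd_many(values: list[int]) -> int:
    """Hypothetical collection-based variant: gcd of a non-empty list of positive integers."""
    if len(values) == 0:
        raise ValueError("values must not be empty")
    return reduce(math.gcd, values)


# Minimum normal configuration: the smallest valid dataset has one element.
assert gcd_many([12]) == 12

# Maximum normal configuration: assume 100_000 values is the largest
# dataset expected for the use case (an arbitrary figure for this sketch).
largest_dataset = [6 * k for k in range(1, 100_001)]
assert gcd_many(largest_dataset) == 6
```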

### Old Data

For programs which are upgraded over time and which are intended to be backwards compatible, we should pass data formatted in the "old" way, and ensure that these continue to be handled correctly.

## It is vital to test that our function also throws `Exceptions` correctly for **bad data**

### Uninitialized data

If we pass `None` (where `None` is a disallowed value), it should throw a `TypeError`:

In [16]:
import pytest

with pytest.raises(TypeError):
    gcd(1, None)

with pytest.raises(TypeError):
    gcd(None, 2)

> Of course, if your function allows `None` as a valid input, it should be included in the **good data** tests. 

### Incorrect type

If we pass in the wrong `type` of data, it should throw a `TypeError`:

In [17]:
with pytest.raises(TypeError):
    gcd(1, 2.4)

with pytest.raises(TypeError):
    gcd(1.2, 2)

with pytest.raises(TypeError):
    gcd(1.2, 2.4)

with pytest.raises(TypeError):
    gcd("one-point-two", 2)

### Too little data

If we pass in too little data, it should throw a `TypeError`:

In [18]:
with pytest.raises(TypeError):
    gcd()

In [19]:
with pytest.raises(TypeError):
    gcd(0)

### Too much data

If we pass in too much data, it should throw a `TypeError`:

In [20]:
gcd(1, 2, 3)  # throws a type error

TypeError: gcd() takes 2 positional arguments but 3 were given

In [21]:
with pytest.raises(TypeError):  # which we can catch like this
    gcd(1, 2, 3)

## **Guess errors** to focus on tests which are disproportionately likely to show problems

Some input values cause more errors than others. 

You might be able to guess which errors will crop up, and test more effectively by finding errors faster.

### Numbers: Zeros
Zeros often cause problems in numerical functions.

In [22]:
assert gcd(0, 100) == 0

### Numbers: Values at the limit of a type's definition may cause issues

The "natural" maximum size of an integer might be $2^{63} - 1$ on a 64-bit system (which I'm using for this demo), so we'll treat that as a boundary.

> As of python 3, the only size limit for an integer is the size of memory [[1]](https://docs.python.org/3/library/sys.html#sys.maxsize), but if you're using a library like Numpy which *does* impose a limit, you should check behavior below and above that limit. 

In [23]:
a = 2**63-1  # prime factors: 7, 7, 73, 127, 337, 92737, 649657, https://www.wikidata.org/wiki/Q10571632
b = 649657 * 7 * 6  # the gcd is 649657 * 7 = 4547599 by construction
assert gcd(a, b) == 4547599

a = 2**63  # prime factors: 2 by construction
b = 2**10  # 1024, for example
assert gcd(a, b) == 1024

a = 2**63+1  # prime factors: 3, 3, 3, 19, 43, 5419, 77158673929
b = 43 * 5419 * 2  # the gcd is 43 * 5419 = 233017 by construction
assert gcd(a, b) == 233017

If we do the same test with a function which casts the values to numpy 64-bit integers, we get some errors:

In [24]:
from example import gcd_numpy

a = 2**63-1  # prime factors: 7, 7, 73, 127, 337, 92737, 649657, https://www.wikidata.org/wiki/Q10571632
b = 649657 * 7 * 6  # the gcd is 649657 * 7 = 4547599 by construction
assert gcd_numpy(a, b) == 4547599

a = 2**63  # prime factors: 2 by construction
b = 2**10  # 1024, for example
assert gcd_numpy(a, b) == 1024

a = 2**63+1  # prime factors: 3, 3, 3, 19, 43, 5419, 77158673929
b = 43 * 5419 * 2  # the gcd is 43 * 5419 = 233017 by construction
assert gcd_numpy(a, b) == 233017

OverflowError: Python int too large to convert to C long
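
The reason is that a signed 64-bit integer cannot hold anything above $2^{63} - 1$. The sketch below shows the limit using the standard library's `ctypes.c_int64` as a stand-in for a fixed-width type; this is an assumption made so the block runs without NumPy. Note the difference in behavior: the NumPy conversion above raised an `OverflowError`, whereas `ctypes` wraps around silently.

```python
import ctypes

INT64_MAX = 2**63 - 1

# Python's own integers grow as needed, so going past the limit is harmless.
assert INT64_MAX + 1 == 9_223_372_036_854_775_808

# The largest signed 64-bit value survives a round trip through c_int64 ...
assert ctypes.c_int64(INT64_MAX).value == INT64_MAX

# ... but one more does not fit: ctypes does no overflow checking,
# so the value wraps around to the most negative 64-bit integer.
wrapped = ctypes.c_int64(INT64_MAX + 1).value
assert wrapped == -(2**63)
```

Either failure mode, a raised error or a silent wrap, is worth pinning down with a test just below, on, and just above the limit.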

### Strings: empty, long, unicode

In functions which operate on strings, test the behavior with strings which
- are empty,
- have a single character,
- are very long compared to the "normal" case in your use case,
- contain unicode characters.

> Strings have a length limit of $(2^{63} - 1)\,\mathrm{B}$ – around $9\,000\,000\,\mathrm{TB}$. 
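
As a sketch of what such tests can look like, the hypothetical function `shout` below simply upper-cases its input. It is not part of the `example` module and is chosen only because it has a Unicode surprise: the German "ß" upper-cases to "SS", so the output is longer than the input.

```python
def shout(text: str) -> str:
    """Hypothetical string function, for illustration only: upper-case the input."""
    return text.upper()


assert shout("") == ""                        # empty
assert shout("a") == "A"                      # single character
long_text = "ab" * 1_000_000                  # very long compared to normal use
assert shout(long_text) == "AB" * 1_000_000
# Unicode: "ß" upper-cases to "SS", so the length changes.
assert shout("straße") == "STRASSE"
assert len(shout("straße")) == len("straße") + 1
```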

### Lists, Arrays, Dictionaries: Mutable datatypes can cause very strange errors

In Python, it's easy to introduce a fault which causes a function to change its output each time you run it, even with the same inputs – check that a function returns the same output for the same input:

In [25]:
# Example of a function which displays this behavior
from example import append
help(append)

Help on function append in module example._lists:

append(value: Any, the_list: Optional[list] = [])
    Appends a value to a list, and if the list isn't given, return the value on a new list.
    
    :param value: the value to append to `list`
    :type value: Any
    
    :param the_list: the list to append to, defaults to an empty list
    :type the_list: list, optional



In [26]:
# Works fine if we give it a list to extend:
append(1, [])  # should return [1]

[1]

In [27]:
assert append(1, []) == [1]

If we don't give it a list to extend, it breaks:

In [28]:
assert append(1) == [1]

In [29]:
assert append(2) == [2]   # should return [2]!!!

AssertionError: 

What's going on? Let's try to debug this function:

In [30]:
append(2)

[1, 2, 2]

The default value of `the_list` is getting extended each time we run the function.

You might think that you can check this by running a test like this:

In [31]:
assert append(3) == append(3)  # passes unexpectedly!

... but the error is so insidious that this test passes! Both calls append to the same list and return that same list object, so the two sides always compare equal.
You actually need to store a copy of the value from the first run and compare it later:

In [32]:
import copy

first_result = copy.deepcopy(append(4))
second_result = copy.deepcopy(append(4))

assert first_result == second_result, "%s != %s" % (first_result, second_result)

AssertionError: [1, 2, 2, 3, 3, 4] != [1, 2, 2, 3, 3, 4, 4]

To fix this, we replace the mutable default list in the function signature with `None`, and create a new list inside the function when no list is given:

In [33]:
from example import append_fixed

first_result = copy.deepcopy(append_fixed(4))
second_result = copy.deepcopy(append_fixed(4))

assert first_result == second_result, "%s != %s" % (first_result, second_result)
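
The source of `append` and `append_fixed` is not shown here, so the following is only a sketch of the pattern, with hypothetical names. The faulty version uses a list as the default argument, which Python creates once when the function is defined; the fixed version uses `None` as a sentinel.

```python
from typing import Any, Optional


def append_shared_default(value: Any, the_list: list = []) -> list:
    """Faulty: the default list is created once, when the function is defined."""
    the_list.append(value)
    return the_list


def append_fresh_default(value: Any, the_list: Optional[list] = None) -> list:
    """Fixed: `None` is a sentinel and a new list is created on every call."""
    if the_list is None:
        the_list = []
    the_list.append(value)
    return the_list


# The faulty version accumulates values across calls ...
assert append_shared_default(1) == [1]
assert append_shared_default(2) == [1, 2]

# ... while the fixed version gives the same output for the same input.
assert append_fresh_default(1) == [1]
assert append_fresh_default(2) == [2]
```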

In the context of our `gcd` function, the test would be:

In [34]:
first_gcd = copy.deepcopy(gcd(32, 8))
second_gcd = copy.deepcopy(gcd(32, 8))

assert first_gcd == second_gcd, "%s != %s" % (first_gcd, second_gcd)

### Write a **regression test** for every bug

A rich source of errors is *faults which were already fixed*. If a fault re-emerges, it is called a **regression**.

So, every time you find a bug: 

- Make a test case which fails because of the bug.
- Fix the bug (so the test case passes)
- Leave the test case in your testing library.

### Testing can ensure we don't regress when we upgrade, and we can leverage **existing implementations** if we have them

When we reimplement something, we can also introduce regressions. By comparing outputs, we can ensure that the new implementation is equivalent to the old implementation.

Suppose we want to test an implementation of the Euclidean GCD algorithm which I copied from [geeksforgeeks.org](https://www.geeksforgeeks.org/euclidean-algorithms-basic-and-extended/): 

In [35]:
from example import gcd_euclidean

assert gcd_euclidean(4, 12) == 4
assert gcd_euclidean(71383, 27455) == 5491  # from an earlier test using prime factors

It looks good! It might even be a lot faster than our existing code:

In [36]:
%timeit gcd_euclidean(71383, 27455)

374 ns ± 2.12 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)


In [37]:
%timeit gcd(71383, 27455)

1.19 ms ± 4.09 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


Wow! A speed up of $\times 3000$! Amazing!

#### We can use a reference implementation to show that `gcd_euclidean`, though fast, is defective

Let's test it against our existing reference implementation, just to see if it's working for all the values we are already confident in. 

In [52]:
import random
random.seed(180)
for i in range(1000):
    a = random.randrange(0, 1000)
    b = random.randrange(0, 1000)
    assert gcd(a, b) == gcd_euclidean(a, b), f"Fails for {a=}, {b=}"

AssertionError: Fails for a=841, b=0

It's failing for at least one case. But is it failing because of `a`, or because of `b`?

## **Property-based testing** helps check more values and locate minimal failing cases

Property-based testing libraries like [`hypothesis`](https://hypothesis.readthedocs.io/): 
- Check that invariant properties of a function are fulfilled for a range of input values.
- Systematically "shrink" inputs which cause errors to find the "minimal" failing case.

You can convert existing "Example-based" tests into property-based tests.

### The basic behavior includes the output types and the symmetry between the input values:

In [161]:
from hypothesis import given, strategies, assume, settings

@given(strategies.integers(), strategies.integers())
def test_gcd_type_symmetry(a, b):
    # Set the boundaries we'll test within. Valid inputs are >= 0
    # Values > 4_000_000 took too long, so limit the upper range.
    assume(0 <= a < 4_000_000 and 0 <= b < 4_000_000)
    
    # Calculate the result
    result = gcd(a, b)

    print(f"{a=}, {b=}, {result}")

    # Check the type and sign
    assert type(result) is int
    assert result >= 0

    # Check the results are the same when we swap a and b
    result_swapped = gcd(b, a)
    assert result == result_swapped

test_gcd_type_symmetry()

a=0, b=0, 0
a=0, b=0, 0
a=0, b=0, 0
a=0, b=0, 0
a=0, b=0, 0
a=0, b=0, 0
a=0, b=0, 0
a=0, b=0, 0
a=0, b=0, 0
a=22624, b=0, 0
a=0, b=0, 0
a=16174, b=8749, 1
a=28403, b=97, 1
a=28403, b=24903, 1
a=24903, b=24903, 24903
a=24200, b=19646, 22
a=19646, b=19646, 19646
a=17285, b=1, 1
a=17285, b=17285, 17285
a=20917, b=191, 1
a=20917, b=20917, 20917
a=14, b=14, 14
a=127, b=27781, 1
a=27781, b=27781, 27781
a=67, b=67, 67
a=75, b=29757, 3
a=75, b=116, 1
a=19201, b=116, 1
a=116, b=116, 116
a=5251, b=5251, 5251
a=107, b=30209, 1
a=107, b=30209, 1
a=107, b=107, 107
a=16203, b=49, 1
a=16203, b=16203, 16203
a=2277, b=32332, 1
a=88, b=88, 88
a=20307, b=20307, 20307
a=31309, b=2898, 1
a=31309, b=2898, 1
a=31309, b=31309, 31309
a=74, b=74, 74
a=32218, b=32218, 32218
a=2248, b=39, 1
a=39, b=39, 39
a=28913, b=5584, 1
a=28913, b=5584, 1
a=28913, b=28913, 28913
a=21615, b=21615, 21615
a=84, b=21615, 3
a=84, b=84, 84
a=4689, b=18519, 3
a=18519, b=18519, 18519
a=18519, b=18519, 18519
a=7426, b=0, 0
a=7426, b=7

### We can also check the degenerate cases where one of the values is 1
$\gcd(x, 1) = 1, x \geq 1$:

In [162]:
@given(strategies.integers())
def test_gcd_type_one(a):
    # Only positive integers are valid here
    assume(1 <= a)
    assert gcd(a, 1) == 1

test_gcd_type_one()

### ... or one of the values is 0
$\gcd(x, 0) = 0$:

In [163]:
@given(strategies.integers())
def test_gcd_type_zero(a):
    # Only non-negative integers are valid here
    assume(0 <= a)
    assert gcd(a, 0) == 0

test_gcd_type_zero()

### If we have a reference implementation, we can use that to run a larger sample of tests

We can reimplement our breaking case from before, using properties instead, and see that it's the 0, not the 841, which causes us problems:

In [164]:
@given(strategies.integers(), strategies.integers())
def test_gcd_euclidean_against_reference(a, b):
    assume(0 <= a < 4_000_000 and 0 <= b < 4_000_000)
    assert gcd(a, b) == gcd_euclidean(a, b)
    
test_gcd_euclidean_against_reference()

AssertionError: 
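
The idea behind shrinking can be sketched by hand: scan small inputs in order of increasing size and look at the first disagreement. The two functions below are stand-ins written for this sketch, not the notebook's own code: `gcd_reference` is written to behave like the reference `gcd` on non-negative inputs, and `gcd_textbook_euclid` is the textbook Euclidean algorithm, which is assumed to resemble `gcd_euclidean`.

```python
def gcd_reference(x: int, y: int) -> int:
    """Brute-force search, written to behave like the reference `gcd` for inputs >= 0."""
    low, high = min(x, y), max(x, y)
    for div in range(low, 0, -1):
        if low % div == 0 and high % div == 0:
            return div
    return 0  # reached only when low == 0


def gcd_textbook_euclid(a: int, b: int) -> int:
    """Textbook Euclidean algorithm (assumed to resemble `gcd_euclidean`)."""
    while b != 0:
        a, b = b, a % b
    return a


# Scan small inputs in order of increasing a + b and collect disagreements.
mismatches = [
    (a, total - a)
    for total in range(0, 20)
    for a in range(0, total + 1)
    if gcd_reference(a, total - a) != gcd_textbook_euclid(a, total - a)
]

smallest = mismatches[0]
# Every disagreement has exactly one zero argument.
assert all((a == 0) != (b == 0) for a, b in mismatches)
```

Under these assumptions the first mismatch is `(0, 1)`: the two versions only disagree when exactly one argument is zero, which matches the failing pair `a=841, b=0` found earlier.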

### If we have a way to construct known correct results, we can use that

We can use the fact that for two numbers the $\gcd$ is the product of the intersection of their 
prime factors (or 1, if they have no matching factors).
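
As a worked example of that fact, take 54 and 24 from the nominal cases. The sketch below writes out their prime factors by hand and intersects them as multisets using `collections.Counter`; `math.gcd` is used only as an independent cross-check.

```python
import math
from collections import Counter

# Prime factorisations written out by hand, with multiplicity.
factors_of_54 = Counter([2, 3, 3, 3])   # 54 = 2 * 3 * 3 * 3
factors_of_24 = Counter([2, 2, 2, 3])   # 24 = 2 * 2 * 2 * 3

# `&` on Counters keeps each prime with the smaller of its two counts,
# which is the multiset intersection.
shared = factors_of_54 & factors_of_24  # one 2 and one 3

gcd_by_construction = math.prod(shared.elements())
assert gcd_by_construction == 6
assert gcd_by_construction == math.gcd(54, 24)
```

If the two numbers share no prime factor, the intersection is empty and `math.prod` of an empty iterable is 1, which is the "or 1" case.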

#### We make a way to generate lists of primes or ones (with replacement)

In [165]:
from functools import reduce
from sympy import primerange

list_of_primes_strategy = strategies.lists(
    strategies.sampled_from(
        [1] + list(primerange(0, 20)) 
    ), 
    min_size=0, 
    max_size=10,
    unique=False
)
 
list_of_primes_strategy.example()  # show an example

[5, 19, 19, 7, 7, 5, 17, 19, 11, 11]

#### We need to find the product of a list of integers

In [166]:
def product(x: list[int]):
    if len(x) == 0:
        result = 0
    else:
        result = reduce(lambda x, y: x * y, x, 1)
    return result

# Plausibility checks:
assert product([]) == 0
assert product([1]) == 1
assert product([1, 2]) == 2
assert product([3]) == 3
assert product([3]) == 3  # check that we get the same result with the same data
assert product([3, 3, 3]) == 27
assert product([3, 3, 3, 2]) == 54

Here is the product of an example list of primes:

In [167]:
the_primes = list_of_primes_strategy.example()
print(f"{the_primes=}, {product(the_primes)=}") 

the_primes=[7, 11, 19, 7, 5], product(the_primes)=51205


#### We also need to get the common elements from two lists

In [168]:
from collections import Counter

def get_common_elements(ai: list[int], bi: list[int]) -> list[int]:
    # From https://stackoverflow.com/a/37645155, thanks to "miradulo"
    common_elements = list((Counter(ai) & Counter(bi)).elements())
    return common_elements

# And test it
assert get_common_elements([], []) == []
assert get_common_elements([1], []) == []
assert get_common_elements([], [1]) == []
assert get_common_elements([1], [1]) == [1]
assert get_common_elements([1], [2]) == []
assert get_common_elements([1, 1], [1]) == [1]
assert get_common_elements([1, 1], [1, 1]) == [1, 1]
assert get_common_elements([1, 2], [1, 2]) == [1, 2]
assert get_common_elements([1, 2], [3]) == []
    

#### Now we can construct as many middle-of-the-road examples as we like

In [171]:
@settings(deadline=500)
@given(list_of_primes_strategy, list_of_primes_strategy)
def test_gcd_constructed_known_cases(a_prime_factors, b_prime_factors):
    
    a, b = product(a_prime_factors), product(b_prime_factors)  
    
    # Skip the testcase if the numbers are negative or too large – modified for our larger deadline
    assume(0 <= a < 100_000_000 and 0 <= b < 100_000_000)
    
    # Get the gcd by construction
    common_factors = get_common_elements(a_prime_factors, b_prime_factors)
    if len(common_factors) == 0:
        if a == 0 or b == 0:
            known_gcd = 0
        if a > 0 and b > 0:
            known_gcd = 1
    else:
        known_gcd = product(common_factors)
    
    # Calculate the result using the function
    calculated_gcd = gcd(a, b) 
    assert calculated_gcd == known_gcd
    
    # Report for debugging purposes
    print(f"gcd({a:11,}, {b:11,}) = {known_gcd:11,} == {calculated_gcd:11,}")

test_gcd_constructed_known_cases()

gcd(          0,           0) =           0 ==           0
gcd(          0,           1) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          1,           0) =           0 ==           0
gcd(          0,       8,645) =           0 ==           0
gcd(        221,   7,166,250) =          13 ==          13
gcd(          0,     332,367) =           0 ==           0
gcd(          0,          11) =           0 ==           0
gcd(          0,  24,984,050) =           0 ==           0
gcd(         19,  96,426,330) =          19 ==          

KeyboardInterrupt: 

### A more thorough search strategy is required – perhaps constructing more representative results across a whole range of values. 
Are these tests good? Maybe:

- There's a lot of them, 
- there are a lot of big numbers,
- there seem to be non-trivial results.

But are they checking everything? No:
- The sampling strategy makes composite numbers with:
  - a handful of factors - 10 at most,
  - where the prime factors are less than 20
- They aren't checking:
  - composite numbers with a small number of large prime factors
  - primes larger than 20 (certainly no primes up around the million mark)
 
A more thorough search strategy is required – perhaps constructing more representative results across a whole range of values. But this creates its own problems: how do you test the tests?
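
One possible direction, sketched below, avoids prime factorisation altogether: consecutive integers $m$ and $m + 1$ never share a factor other than 1, so $g \cdot m$ and $g \cdot (m + 1)$ have a greatest common divisor of exactly $g$, for any $g \geq 1$ and $m \geq 1$. The helper name and the bounds are assumptions for this sketch, and `math.gcd` stands in for the function under test.

```python
import math
import random


def make_pair_with_known_gcd(rng: random.Random, max_gcd: int, max_multiplier: int):
    """Return (a, b, g) where gcd(a, b) == g by construction.

    Consecutive integers m and m + 1 share no factor other than 1,
    so g * m and g * (m + 1) have greatest common divisor exactly g.
    """
    g = rng.randint(1, max_gcd)
    m = rng.randint(1, max_multiplier)
    return g * m, g * (m + 1), g


rng = random.Random(180)  # fixed seed so failures are reproducible
cases = [make_pair_with_known_gcd(rng, 10**6, 10**6) for _ in range(1_000)]

# `math.gcd` stands in for the function under test (an assumption for this sketch).
for a, b, expected in cases:
    assert math.gcd(a, b) == expected, f"Fails for {a=}, {b=}"
```

This reaches results with large prime factors, which the list-of-small-primes strategy cannot, but every pair has the same shape, so it complements rather than replaces the other strategies. The bounds would need lowering considerably for a slow brute-force implementation.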

## Once we can see the code, we can do **white-box** tests to ensure every line and every option is exercised, complementing our black-box tests


**Beware!** White-box tests check each line of code does what we intended it to do, but not that the software as a whole meets its specifications.

Only with a combination of black- and white-box testing can we be confident in software.

- All the tests up until now could, in principle, be written before the implementation of the code.
- Once the code is written, we can look at the code itself, reason about what it is doing, and be more confident that it can *only* do what we intend. This is called "white-box testing".

An equivalent definition is to ensure that every logical path through each part of the code is tested at least once.

"Basis path testing" (also called "structured basis testing", "structured testing", "structural testing") seeks a minimal set of test cases which achieve 100% coverage. 

> This is subtly different to testing *every possible path* through the *whole code* – we don't aim to run every possible scenario, because that can be prohibitively time consuming and expensive, even if automated.

### With coverage testing, we ensure that all the code written is actually tested

Our black-box tests should cover every path through our code. If they don't:
- Our black-box tests might be insufficient to test all the **intended use-cases**, boundaries, special cases and bugfixes we implemented,
- We might be doing calculations which are **unintended use-cases** and should be separate functions,
- We might have **redundant code** which makes comprehension more difficult and should be deleted. 

#### 100% branch coverage $\gt$ 100% statement coverage

You should aim for "100% branch coverage":
- ensuring that **every statement is tested** (100% statement coverage), _and_
- that every **predicate term** is tested for at least **one true** and **one false** value.

("100% statement coverage" alone is **insufficient** as it will often miss key logical faults in code.)
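
A small sketch of how this happens, using a hypothetical function `describe_sign` which is not part of the `example` module: a single test runs every statement, yet the fault only appears when the `if` condition is false.

```python
def describe_sign(x: int) -> str:
    """Hypothetical function with a fault that statement coverage can miss."""
    label = None
    if x > 0:
        label = "positive"
    return label.upper()  # fails when the `if` body was skipped


# This single test executes every statement: 100% statement coverage.
assert describe_sign(5) == "POSITIVE"

# Branch coverage also demands a case where `x > 0` is false,
# and that case exposes the fault.
try:
    describe_sign(-5)
except AttributeError:
    fault_found = True
else:
    fault_found = False

assert fault_found
```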

#### Let's work out what tests we need to exercise all the `gcd` source code.

In [48]:
import inspect 
print(inspect.getsource(gcd))

def gcd(x: int, y: int) -> int:
    """Algorithm to calculate the greatest common divisor of two integers"""

    if type(x) is not int or type(y) is not int:
        raise TypeError("x and y must be integers")
    if x < 0 or y < 0:
        raise ValueError("x and y must be ≥ 0")

    low, high = min(x, y), max(x, y)

    _gcd = 0

    div = low
    while div > 0:
        if (low % div == 0) and (high % div == 0):
            if _gcd < div:
                _gcd = div
        div -= 1

    return _gcd



#### Every statement gets a test which exercises all of its modes of execution

In [49]:
def gcd_(x: int, y: int) -> int:                     #  1: nominal case, x > y, e.g. gcd(24, 18)
    if type(x) is not int or type(y) is not int:     #  2: x is a non-integer
                                                     #  3: y is a non-integer
        raise TypeError("x and y must be integers")  #     -> expect TypeError
    if x < 0 or y < 0:                               #  4: x is a negative integer
                                                     #  5: y is a negative integer, x is positive
        raise ValueError("x and y must be ≥ 0")      #     -> expect ValueError
    low, high = min(x, y), max(x, y)                 #  _: x > y – covered by the nominal case
                                                     #  6: x < y – e.g. gcd(18, 24)
                                                     #  7: x = y - e.g. gcd(33, 33)
    _gcd = 0                                         #  _: covered by the nominal case
    div = low                                        #  _: covered by the nominal case
    while div > 0:                                   #  _: where div starts > zero, covered by the nominal case
                                                     #  8: where div starts at zero, e.g. gcd(0, 0)
        if (low % div == 0) and (high % div == 0):   #  9: low % div == 0, high % div == 0 – low is gcd, e.g. gcd(12, 4)
                                                     # 10: low % div == 0, high % div != 0 – happens if low is not gcd,
                                                     #     covered by nominal case
                                                     # 11: low % div != 0, high % div == 0 – can't happen in 1st iter,
                                                     #     in 2nd div = 2, high is even, e.g. gcd(3, 8)
                                                     # 12: low % div != 0, high % div != 0 – can't happen in 1st iter,
                                                     #     in 2nd div = 2, high is prime != 2, e.g. gcd(3, 7)
            if _gcd < div:                           #  _: _gcd < div – covered by the nominal case
                                                     #     gcd(24, 18) when div reaches 6
                                                     #  _: _gcd > div – covered by the nominal case gcd(24, 18)
                                                     #     when div reaches 3 (and _gcd is 6)
                _gcd = div                           #  _: covered by the nominal case
        div -= 1                                     #  _: covered by the nominal case
    return _gcd                                      #  _: covered by the nominal case

The tests required here are:

- The nominal case – a middle-of-the-road example with `x` > `y`, like `gcd(24, 18)`
- `if` statement about types:
  - `x` is a non-integer, expect `TypeError`
  - `y` is a non-integer, expect `TypeError`
- `if` statement about negative values:
  - `x` is a negative integer, expect `ValueError`
  - `y` is a negative integer, expect `ValueError`
- `low, high`:
  - `x` > `y` – covered by the nominal case
  - `y` > `x`
  - `x` = `y`
- `while` statement:
  - A case where `div` starts at zero, say `x` = 0 and `y` > 0
- `if` statement with remainders:
  - `low % div == 0` – covered by the nominal case, always true in the first step, where `div` = `low`
  - `low % div != 0` – a case like `gcd(3, 7)` where in the second step of the code, `div` = 2
  - `high % div == 0` – a case where the lower number is the $\gcd$, like `gcd(12, 4)`
  - `high % div != 0` – covered by the nominal case, where the lower number is not the $\gcd$, like `gcd(12, 5)`
- `if` statement with `_gcd < div`:
  - `_gcd < div`: covered by the nominal case `gcd(24, 18)` when `div` reaches 6
  - `_gcd > div`: covered by the nominal case `gcd(24, 18)` when `div` reaches 3 (and `_gcd` is 6)

#### A minimum test set would cover all of these cases

In [172]:
from example.gcd.test_reference import test_gcd_basis
print(inspect.getsource(test_gcd_basis))

def test_gcd_basis(gcd_fn=gcd):
    assert gcd_fn(24, 18) == 6

    with pytest.raises(TypeError):
        gcd_fn(float(24), 18)
    with pytest.raises(TypeError):
        gcd_fn(24, float(18))

    with pytest.raises(ValueError):
        gcd_fn(-24, 18)
    with pytest.raises(ValueError):
        gcd_fn(24, -18)

    assert gcd_fn(18, 24) == 6
    assert gcd_fn(33, 33) == 33
    assert gcd_fn(0, 0) == 0
    assert gcd_fn(0, 5) == 0
    assert gcd_fn(3, 7) == 1
    assert gcd_fn(3, 8) == 1
    assert gcd_fn(12, 4) == 4
    assert gcd_fn(12, 5) == 1



#### Manual coverage checks are time consuming and need repeating – use tooling like `coverage` to help

Manual basis testing as a technique for planning test cases:
- is extremely time consuming,
- needs to be re-done every time the code changes, as the test cases are linked to the specific implementation.

Recommendation: 
- Use something like [`coverage`](https://coverage.readthedocs.io/) with "branch coverage" to ensure that every line and every branch of the code was run when going through your black-box tests.
- Use specific white-box tests only when necessary to test particular behaviors you can't easily create in black-box tests.



In [51]:
!coverage run --branch -m pytest ../
!coverage html
#!browser htmlcov/index.html

platform darwin -- Python 3.10.12, pytest-7.4.2, pluggy-1.3.0
rootdir: /Users/jholla10/Developer/testingDSCoV
plugins: anyio-4.0.0, hypothesis-6.86.2
collected 4 items

../src/example/gcd/test_euclid.py .F                                     [ 50%]
../src/example/gcd/test_optimize.py .                                    [ 75%]
../src/example/gcd/test_reference.py .                                   [100%]

_______________________________ test_gcd_euclid ________________________________

    def test_gcd_euclid():
>       test_gcd_basis(gcd_euclidean)

../src/example/gcd/test_euclid.py:6: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

gcd_fn = <function gcd_euclidean at 0x1025eaa70>

    def test_gcd_basis