# Testing

How do we know if our code is working correctly? It is not enough that the code runs and returns some value: as seen above, there may be times when it makes sense to stop the code even when it is correct, because it is being used incorrectly. We need to test the code to check that it works.

*Unit testing* is the idea of writing many small tests that check if simple cases are behaving correctly. Rather than trying to *prove* that the code is correct in all cases (which could be very hard), we check that it is correct in a number of tightly controlled cases (which should be more straightforward). If we later find a problem with the code, we add a test to cover that case.

##### Teaching note

We intend to use unit tests to *automatically mark* student submissions of weekly work. Knowing in outline how this works will be needed to interpret the errors that students see and ask about.

At least to start, we are *not* intending to use these tests on coursework submissions.

We will write a simple function that divides two numbers:

In [2]:
def divide(x, y):
    """
    Divide two numbers
    
    Parameters
    ----------
    
    x : float
        Numerator
    y : float
        Denominator
    
    Returns
    -------
    
    x / y : float
    """
    return x / y

For now we can play with this in the console.

We want to check that it does the "right thing". How much do we need to check?

Check integers:

In [3]:
print(divide(4,2))

2.0


Check obvious fractions:

In [4]:
print(divide(3,2))

1.5


Does $a^7 / a = a^6$?

In [5]:
a = 1.234
print(divide(a**7, a), a**6)

3.5309450437774568 3.5309450437774568


What happens if you divide by zero? (What should happen?)

In [6]:
print(divide(1, 0))

ZeroDivisionError: division by zero

What happens if you divide by a really large number? (Note that `1e1000` is too large to be stored as a float, so Python reads it as `inf`.)

In [8]:
print(divide(1, 1e1000))

0.0


Each of these tests has its uses and may show different potential problems. What counts as "correct" depends on how you want your code to handle certain situations.

If our function didn't do what we wanted on one of these tests then we'd have to alter it and test again. Retyping the checks by hand each time is error prone, so it's better to write the tests as functions. We want these functions to complain loudly if something is wrong, but be quiet if all is well. To do this we can use the `assert` statement:

In [9]:
def test_integer_division():
    assert divide(4, 2) == 2

In [11]:
test_integer_division()

We see that nothing happened, as we wanted.
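The other console checks can be captured in the same way. The following is a minimal sketch rather than a definitive test suite: the test names are illustrative, `divide` is repeated so the block runs on its own, and the decision that dividing by zero *should* raise `ZeroDivisionError` is an assumption about the behaviour we want.

```python
def divide(x, y):
    """Divide x by y (repeated from above so this block is self-contained)."""
    return x / y


def test_fraction_division():
    # 3 / 2 is exactly representable in binary floating point
    assert divide(3, 2) == 1.5


def test_power_division():
    # a^7 / a should agree with a^6 up to rounding error,
    # so compare with a small tolerance rather than using ==
    a = 1.234
    assert abs(divide(a**7, a) - a**6) < 1e-12


def test_zero_division():
    # Assumption: dividing by zero should raise, not return a value
    try:
        divide(1, 0)
    except ZeroDivisionError:
        return  # the expected outcome: stay quiet
    assert False, "divide(1, 0) should raise ZeroDivisionError"


# All three calls are silent if all is well
test_fraction_division()
test_power_division()
test_zero_division()
```

If any check fails, the corresponding `assert` raises an `AssertionError` that names the line at fault.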

## Formalizing tests

This small set of tests covers most of the cases we are concerned with. However, by this point it's getting hard to remember

1. what each line is actually testing, and
2. what the correct value is meant to be.

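To formalize this, we write each test as a small function that contains this information for us. The function under test is `real_quadratic_roots`, which finds the real roots of a quadratic (its full definition appears in `quadratic.py` below). Let's start with the $x^2 - 1 = 0$ case where the roots are $\pm 1$: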

In [29]:
from numpy.testing import assert_equal, assert_allclose

def test_real_distinct():
    """
    Test that the roots of x^2 - 1 = 0 are +/- 1.
    """
    
    roots = (1.0, -1.0)
    assert_equal(real_quadratic_roots(1, 0, -1), roots,
                 err_msg="Testing x^2-1=0; roots should be 1 and -1.")

In [30]:
test_real_distinct()

This function checks that the result of the function call matches the expected value, here stored in `roots`. If it didn't match the expected value, it would raise an exception:

In [31]:
def test_should_fail():
    """
    Comparing the roots of x^2 - 1 = 0 to (1, 1), which should fail.
    """
    
    roots = (1.0, 1.0)
    assert_equal(real_quadratic_roots(1, 0, -1), roots,
                 err_msg="Testing x^2-1=0; roots should be 1 and 1."
                 " So this test should fail")

test_should_fail()

AssertionError: 
Items are not equal:
item=1
Testing x^2-1=0; roots should be 1 and 1. So this test should fail
 ACTUAL: -1.0
 DESIRED: 1.0

Testing that one floating point number equals another can be dangerous. Consider $x^2 - 2 x + (1 - 10^{-10}) = 0$ with roots $1 \pm 10^{-5}$:

In [32]:
from math import sqrt

def test_real_distinct_irrational():
    """
    Test that the roots of x^2 - 2 x + (1 - 10**(-10)) = 0 are 1 +/- 1e-5.
    """
    
    roots = (1 + 1e-5, 1 - 1e-5)
    assert_equal(real_quadratic_roots(1, -2.0, 1.0 - 1e-10), roots,
                 err_msg="Testing x^2-2x+(1-1e-10)=0; roots should be 1 +- 1e-5.")
    
test_real_distinct_irrational()

AssertionError: 
Items are not equal:
item=0
Testing x^2-2x+(1-1e-10)=0; roots should be 1 +- 1e-5.
 ACTUAL: 1.0000100000004137
 DESIRED: 1.00001

We see that the solutions match to about 12 decimal places (they differ by roughly $4 \times 10^{-13}$), but this isn't enough for them to be *exactly* the same. In this case, and in most cases using floating point numbers, we want the result to be "close enough": to match to the expected precision. There is an assertion for this as well:

In [33]:
from math import sqrt

def test_real_distinct_irrational():
    """
    Test that the roots of x^2 - 2 x + (1 - 10**(-10)) = 0 are 1 +/- 1e-5.
    """
    
    roots = (1 + 1e-5, 1 - 1e-5)
    assert_allclose(real_quadratic_roots(1, -2.0, 1.0 - 1e-10), roots,
                 err_msg="Testing x^2-2x+(1-1e-10)=0; roots should be 1 +- 1e-5.")
    
test_real_distinct_irrational()

The `assert_allclose` statement takes options controlling the precision of our test.
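As a rough illustration of those options, the sketch below uses the `rtol` (relative tolerance) and `atol` (absolute tolerance) keyword arguments of `numpy.testing.assert_allclose`. The numbers are taken from the failing test above; the particular tolerance values are illustrative choices, not recommendations.

```python
from numpy.testing import assert_allclose

computed = 1.0000100000004137   # value reported by the failing test above
expected = 1 + 1e-5

# Default relative tolerance (rtol=1e-7): passes
assert_allclose(computed, expected)

# A tighter relative tolerance combined with an absolute tolerance.
# The check is |computed - expected| <= atol + rtol * |expected|
assert_allclose(computed, expected, rtol=1e-10, atol=1e-12)

# A tolerance tighter than the actual error makes the comparison fail
try:
    assert_allclose(computed, expected, rtol=1e-14, atol=0)
except AssertionError:
    print("Too strict: the values differ by about 4e-13")
```

Choosing the tolerance is part of deciding what "correct" means for the code: it should reflect the accuracy the algorithm can reasonably deliver.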

We can now write out all our tests:

In [34]:
from math import sqrt
from numpy.testing import assert_equal, assert_allclose

def test_no_roots():
    """
    Test that the roots of x^2 + 1 = 0 are not real.
    """
    
    roots = None
    assert_equal(real_quadratic_roots(1, 0, 1), roots,
                 err_msg="Testing x^2+1=0; no real roots.")

def test_zero_roots():
    """
    Test that the roots of x^2 = 0 are both zero.
    """
    
    roots = (0, 0)
    assert_equal(real_quadratic_roots(1, 0, 0), roots,
                 err_msg="Testing x^2=0; should both be zero.")

def test_real_distinct():
    """
    Test that the roots of x^2 - 1 = 0 are +/- 1.
    """
    
    roots = (1.0, -1.0)
    assert_equal(real_quadratic_roots(1, 0, -1), roots,
                 err_msg="Testing x^2-1=0; roots should be 1 and -1.")
    
def test_real_distinct_irrational():
    """
    Test that the roots of x^2 - 2 x + (1 - 10**(-10)) = 0 are 1 +/- 1e-5.
    """
    
    roots = (1 + 1e-5, 1 - 1e-5)
    assert_allclose(real_quadratic_roots(1, -2.0, 1.0 - 1e-10), roots,
                 err_msg="Testing x^2-2x+(1-1e-10)=0; roots should be 1 +- 1e-5.")
    
def test_real_linear_degeneracy():
    """
    Test that the root of x + 1 = 0 is -1.
    """
    
    root = -1.0
    assert_equal(real_quadratic_roots(0, 1, 1), root,
                 err_msg="Testing x+1=0; root should be -1.")

In [35]:
test_no_roots()
test_zero_roots()
test_real_distinct()
test_real_distinct_irrational()
test_real_linear_degeneracy()

## `pytest`

We now have a set of tests - a *test suite*, as it is sometimes called - encoded in functions with meaningful names, which give useful error messages if a test fails. Every time the code is changed, we want to re-run all the tests to ensure that our change has not broken the code. Doing this by hand is tedious. A better way is to run a single command that runs all tests. `pytest` is that command.

The easiest way to use it is to put all tests in the same file as the function being tested. So, create a file `quadratic.py` containing

```python
from math import sqrt
from numpy.testing import assert_equal, assert_allclose
    
def real_quadratic_roots(a, b, c):
    """
    Find the real roots of the quadratic equation a x^2 + b x + c = 0, if they exist.
    
    Parameters
    ----------
    
    a : float
        Coefficient of x^2
    b : float
        Coefficient of x^1
    c : float
        Coefficient of x^0
        
    Returns
    -------
    
    roots : tuple or float or None
        The root(s) (two if a genuine quadratic, one if linear, None otherwise)
        
    Raises
    ------
    
    NotImplementedError
        If the equation has trivial a and b coefficients, so isn't solvable.
    """
    
    discriminant = b**2 - 4.0*a*c
    if discriminant < 0.0:
        return None
    
    if a == 0:
        if b == 0:
            raise NotImplementedError("Cannot solve quadratic with both a"
                                      " and b coefficients equal to 0.")
        else:
            return -c / b
    
    x_plus = (-b + sqrt(discriminant)) / (2.0*a)
    x_minus = (-b - sqrt(discriminant)) / (2.0*a)
    
    return x_plus, x_minus

def test_no_roots():
    """
    Test that the roots of x^2 + 1 = 0 are not real.
    """
    
    roots = None
    assert_equal(real_quadratic_roots(1, 0, 1), roots,
                 err_msg="Testing x^2+1=0; no real roots.")

def test_zero_roots():
    """
    Test that the roots of x^2 = 0 are both zero.
    """
    
    roots = (0, 0)
    assert_equal(real_quadratic_roots(1, 0, 0), roots,
                 err_msg="Testing x^2=0; should both be zero.")

def test_real_distinct():
    """
    Test that the roots of x^2 - 1 = 0 are +/- 1.
    """
    
    roots = (1.0, -1.0)
    assert_equal(real_quadratic_roots(1, 0, -1), roots,
                 err_msg="Testing x^2-1=0; roots should be 1 and -1.")
    
def test_real_distinct_irrational():
    """
    Test that the roots of x^2 - 2 x + (1 - 10**(-10)) = 0 are 1 +/- 1e-5.
    """
    
    roots = (1 + 1e-5, 1 - 1e-5)
    assert_allclose(real_quadratic_roots(1, -2.0, 1.0 - 1e-10), roots,
                 err_msg="Testing x^2-2x+(1-1e-10)=0; roots should be 1 +- 1e-5.")
    
def test_real_linear_degeneracy():
    """
    Test that the root of x + 1 = 0 is -1.
    """
    
    root = -1.0
    assert_equal(real_quadratic_roots(0, 1, 1), root,
                 err_msg="Testing x+1=0; root should be -1.")
```

Then, in a terminal or command window, switch to the directory containing this file and run

```
pytest quadratic.py
```

You should see output similar to the following (the platform, version and timing details will differ and are abbreviated here):

```
============================= test session starts ==============================
...
collected 5 items

quadratic.py .....                                                       [100%]

============================== 5 passed in 0.03s ===============================
```

Each dot corresponds to a passing test. If a test fails, `pytest` will report the error and move on to the next test. When given a directory rather than a single file, `pytest` automatically collects every function whose name starts with `test` from every file named `test_*.py` or `*_test.py`. [The documentation](https://docs.pytest.org/) gives more details about using `pytest` in more complex cases.
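One case the tests above do not cover is the `NotImplementedError` raised when both `a` and `b` are zero. A possible way to test this, sketched here under the assumption that `pytest` is installed, is the `pytest.raises` context manager. The test name `test_fully_degenerate` is illustrative, and the function is a condensed copy of the one in `quadratic.py` so that the block runs on its own.

```python
from math import sqrt

import pytest


def real_quadratic_roots(a, b, c):
    """Condensed copy of the function in quadratic.py (docstring shortened)."""
    discriminant = b**2 - 4.0*a*c
    if discriminant < 0.0:
        return None
    if a == 0:
        if b == 0:
            raise NotImplementedError("Cannot solve quadratic with both a"
                                      " and b coefficients equal to 0.")
        return -c / b
    x_plus = (-b + sqrt(discriminant)) / (2.0*a)
    x_minus = (-b - sqrt(discriminant)) / (2.0*a)
    return x_plus, x_minus


def test_fully_degenerate():
    """
    Test that 0 x^2 + 0 x + 1 = 0 raises NotImplementedError.
    """
    # The test passes only if the expected exception is raised inside the block
    with pytest.raises(NotImplementedError):
        real_quadratic_roots(0, 0, 1)
```

Added to `quadratic.py`, this function would be collected and run along with the other five tests.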

To summarize: when trying to get code working, tests are essential. Tests should be simple and cover as many of the easy cases and as much of the code as possible. By writing tests as functions that raise exceptions, and using a testing framework such as `pytest`, all tests can be run rapidly, saving time.

## Test Driven Development

There are many ways of writing code to solve problems. Most involve planning in advance how the code should be written. An alternative is to say in advance what tests the code should pass. This approach, *Test Driven Development* (TDD), has advantages (the code always has a detailed set of tests, features in the code are always relevant to some test, it's easy to start writing code) and some disadvantages (it can be overkill for small projects, it can lead down blind alleys). A [detailed discussion is given by Beck's book](http://www.amazon.co.uk/Driven-Development-Addison-Wesley-Signature-Series/dp/0321146530), and a [more recent discussion in this series of conversations](http://martinfowler.com/articles/is-tdd-dead/).
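A very small sketch of the TDD cycle might look as follows. The function `midpoint` is a hypothetical example chosen for illustration and does not appear elsewhere in these notes.

```python
# Step 1: write the test first. At this point midpoint() does not exist,
# so calling the test would fail with a NameError.
def test_midpoint():
    assert midpoint(0.0, 2.0) == 1.0
    assert midpoint(-1.0, 1.0) == 0.0


# Step 2: write just enough code to make the test pass.
def midpoint(a, b):
    """Return the point halfway between a and b."""
    return (a + b) / 2.0


# Step 3: re-run the test; it is now silent.
test_midpoint()
```

Each new feature then starts with a new failing test, followed by the code that makes it pass.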

Even if TDD does not work for you, testing itself is extremely important.