## Testing frameworks

### Why use testing frameworks?

Frameworks should simplify our lives:

* It should be easy to add a simple test
* It should be possible to create complex tests:
    * Fixtures
    * Setup/Tear down
    * Parameterized tests (same test logic, varying inputs)
* Find all our tests in a complicated code-base 
* Run all our tests with a quick command
* Run only some tests, e.g. ``test --only "tests about fields"``
* **Report failing tests**
* Additional goodies, such as code coverage
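
As an illustration of the features listed above, here is a minimal sketch using `unittest` from the Python standard library. The class name `FieldTests` and the field values are made up for this example; the loop with `subTest` is one simple way to emulate a parameterized test, not the only one.

```python
import unittest


class FieldTests(unittest.TestCase):
    def setUp(self):
        # Fixture: runs before every test method.
        self.field = (1.0, 1.0, 4.0, 4.0)

    def tearDown(self):
        # Runs after every test method, even a failing one.
        self.field = None

    def test_field_has_four_edges(self):
        self.assertEqual(len(self.field), 4)

    def test_edges_are_not_negative(self):
        # Parameterized test: the same check applied to several inputs.
        for edge in self.field:
            with self.subTest(edge=edge):
                self.assertGreaterEqual(edge, 0.0)


# Build the suite and run it explicitly, so the sketch also works in a notebook.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(FieldTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, "tests run,", len(result.failures), "failures")
```

The runner reports how many tests ran and which ones failed, which is the most important service a framework provides.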

### Common testing frameworks

* Language agnostic: [CTest](http://www.cmake.org/cmake/help/v2.8.12/ctest.html)
  * Test runner for executables, bash scripts, etc.
  * Great for legacy code hardening
    
* C unit-tests:
    * all C++ frameworks,
    * [Check](http://check.sourceforge.net/),
    * [CUnit](http://cunit.sourceforge.net)

* C++ unit-tests:
    * [CppTest](http://cpptest.sourceforge.net/),
    * [Boost::Test](http://www.boost.org/doc/libs/1_55_0/libs/test/doc/html/index.html),
    * [google-test](https://code.google.com/p/googletest/),
    * [Catch](https://github.com/philsquared/Catch) (best)

* Python unit-tests:
    * [nose](https://nose.readthedocs.org/en/latest/) includes test discovery, coverage, etc.
    * [unittest](http://docs.python.org/2/library/unittest.html) comes with the Python standard library
    * [py.test](http://pytest.org/latest/), the framework whose test idioms nose was modelled on

* R unit-tests:
    * [RUnit](http://cran.r-project.org/web/packages/RUnit/index.html),
    * [svUnit](http://cran.r-project.org/web/packages/svUnit/index.html) (works with [SciViews](http://www.sciviews.org/) GUI)

* Fortran unit-tests:
    * [funit](http://nasarb.rubyforge.org/funit/),
    * [pfunit](http://sourceforge.net/projects/pfunit/) (works with MPI)

### Nose framework: usage

[nose](https://nose.readthedocs.org/en/latest/) is a Python testing framework.

We can use its tools for on-the-fly tests in the notebook. This, happily, includes the negative-tests example we were looking for a moment ago.

In [1]:
def I_only_accept_positive_numbers(number):
    # Check input
    if number < 0: 
        raise ValueError("Input "+ str(number)+" is negative")

    # Do something

In [2]:
from nose.tools import assert_raises

In [3]:
with assert_raises(ValueError):
    I_only_accept_positive_numbers(-5)

But the real power comes when we write a test file alongside our code files in our homemade packages:

In [4]:
%%bash
mkdir -p saskatchewan
touch saskatchewan/__init__.py

In [1]:
%%writefile saskatchewan/overlap.py
def overlap(field1, field2):
    left1, bottom1, top1, right1 = field1
    left2, bottom2, top2, right2 = field2
    
    overlap_left=max(left1, left2)
    overlap_bottom=max(bottom1, bottom2)
    overlap_right=min(right1, right2)
    overlap_top=min(top1, top2)
    # Here's our wrong code again
    overlap_height=(overlap_top-overlap_bottom)
    overlap_width=(overlap_right-overlap_left)
    
    return overlap_height*overlap_width

Overwriting saskatchewan/overlap.py


In [4]:
%%writefile saskatchewan/test_overlap.py
from .overlap import overlap
from nose.tools import assert_equal

def test_full_overlap():
    assert_equal(overlap((1.,1.,4.,4.),(2.,2.,3.,3.)), 1.0)

def test_partial_overlap():
    assert_equal(overlap((1,1,4,4),(2,2,3,4.5)), 2.0)
                 
def test_no_overlap():
    assert_equal(overlap((1,1,4,4),(4.5,4.5,5,5)), 0.0)

Overwriting saskatchewan/test_overlap.py


In [5]:
%%bash
cd saskatchewan
nosetests

..F
FAIL: saskatchewan.test_overlap.test_no_overlap
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/jamespjh/devel/rsdt/rsd-engineeringcourse/ch03tests/saskatchewan/test_overlap.py", line 11, in test_no_overlap
    assert_equal(overlap((1,1,4,4),(4.5,4.5,5,5)), 0.0)
AssertionError: 0.25 != 0.0

----------------------------------------------------------------------
Ran 3 tests in 0.001s

FAILED (failures=1)


Note that it reported **which** test had failed, how many tests ran, and how many failed.

The symbol `..F` means there were three tests, of which the third one failed.
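
The failing test points at the bug: when the fields do not overlap, both the height and the width come out negative, and their product is a positive area (`0.25`). One possible fix, sketched here under the assumption that a negative extent simply means "no overlap", is to clamp both extents at zero. The name `overlap_clamped` is hypothetical and only used for this illustration.

```python
def overlap_clamped(field1, field2):
    # Same unpacking order as saskatchewan/overlap.py.
    left1, bottom1, top1, right1 = field1
    left2, bottom2, top2, right2 = field2

    overlap_left = max(left1, left2)
    overlap_bottom = max(bottom1, bottom2)
    overlap_right = min(right1, right2)
    overlap_top = min(top1, top2)

    # Assumption: a negative extent means the fields do not overlap,
    # so clamp it to zero before multiplying.
    overlap_height = max(0, overlap_top - overlap_bottom)
    overlap_width = max(0, overlap_right - overlap_left)

    return overlap_height * overlap_width


print(overlap_clamped((1, 1, 4, 4), (4.5, 4.5, 5, 5)))
```

With this change the three tests in `test_overlap.py` would be expected to pass.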

Nose will:

* automatically find files named ``test_*.py``
* collect all functions called ``test_*``
* run the tests and report the results
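
To make the collect-and-run steps concrete, here is a toy sketch of a test runner. It is not how nose is implemented; `run_tests` and `example_namespace` are hypothetical names, and the "module" is just a dictionary of functions.

```python
def example_namespace():
    """Stand-in for a test module: a mapping from names to functions."""
    def test_addition():
        assert 1 + 1 == 2

    def test_upper():
        assert "nose".upper() == "NOSE"

    def helper():
        # Not named test_*, so the runner must ignore it.
        raise RuntimeError("helpers are never collected")

    def test_no_overlap():
        assert 0.25 == 0.0, "0.25 != 0.0"

    functions = (test_addition, test_upper, helper, test_no_overlap)
    return {function.__name__: function for function in functions}


def run_tests(namespace):
    """Collect callables named test_* and run each one in turn."""
    symbols = []
    failures = []
    for name, candidate in namespace.items():
        if not (name.startswith("test_") and callable(candidate)):
            continue
        try:
            candidate()
            symbols.append(".")
        except AssertionError as error:
            symbols.append("F")
            failures.append((name, str(error)))
    return "".join(symbols), failures


summary, failures = run_tests(example_namespace())
print(summary)
for name, message in failures:
    print("FAIL:", name, "-", message)
```

Each passing test contributes a `.` and each failing one an `F`, which is how a summary line such as `..F` is built up.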

Some options:

* help: `nosetests --help`
* test only a given file: `nosetests test_file.py`
* compute coverage: `nosetests --with-coverage`

## Testing with floating points

### Floating points are not reals


Floating-point numbers are finite-precision approximations of real numbers:

`1.0 == 0.99999999999999999` evaluates to `True`: both literals are stored as exactly the same bits.

This can lead to numerical errors during calculations: $1000 (a - b) \neq 1000a - 1000b$

In [6]:
1000.0 * 1.0 - 1000.0 * 0.9999999999999998

2.2737367544323206e-13

In [7]:
1000.0 * (1.0 - 0.9999999999999998)

2.220446049250313e-13

*Both* results are wrong: `2e-13` is the correct answer.
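
To see where the error comes from, we can inspect the value that is actually stored. The sketch below uses `fractions.Fraction` from the standard library, which converts a float to the exact rational number it represents; the variable names are chosen for this illustration only.

```python
from fractions import Fraction

x = 0.9999999999999998

# Exact rational value of the double that Python stores for the literal.
stored = Fraction(x)
# Exact rational value of the decimal number we typed.
intended = Fraction("0.9999999999999998")

# The stored value sits exactly 2**-52 below 1, not 2e-16 below it.
gap = Fraction(1) - stored
print(gap == Fraction(1, 2**52))

# Exact arithmetic on the intended value gives the answer we expected.
exact_answer = 1000 * (Fraction(1) - intended)
print(float(exact_answer))
```

The literal is rounded to the nearest representable double before any arithmetic happens, so the error is already present in the input.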

The size of the error will depend on the magnitude of the floating-point numbers involved:

In [8]:
1000.0 * 1e5 - 1000.0 * 0.9999999999999998e5

1.4901161193847656e-08

The result should be `2e-8`.
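
The spacing between neighbouring floating-point numbers grows with their magnitude. A minimal sketch, assuming Python 3.9 or later (where `math.ulp` is available), shows the spacing at a few scales:

```python
import math

# math.ulp(x) is the gap between x and the next larger representable float.
spacings = {value: math.ulp(value) for value in (1.0, 1e3, 1e8, 1e15)}

for value, spacing in spacings.items():
    print(value, spacing)
```

At `1e8` the spacing is `2**-26`, about `1.49e-08`, which is exactly the result obtained above: the answer is off because it can only be a whole number of such steps.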

### Comparing floating points

Comparison can be absolute:

In [9]:
from nose.tools import assert_almost_equal
assert_almost_equal(0.7, 0.7 + 1e-6, delta=1e-5)

<div class="fragment roll-in">
Or relative:

In [12]:
from nose.tools import assert_almost_equal
magnitude = 0.7
assert_almost_equal(0.7, 0.7 + 1e-5, delta=magnitude * 1e-5)

AssertionError: 0.7 != 0.7000099999999999 within 7e-06 delta

The assertion fails because the difference (about `1e-5`) exceeds the tolerance `magnitude * 1e-5 = 7e-6`. Here, `magnitude` should be chosen based on the intrinsic scale of the calculations.

For instance, if the result is a difference between large numbers:

In [13]:
(1e15 + 1.4) - (1e15 + 0.7)

0.625

then `magnitude = 1e15` is reasonable. 

However, the best choice of scale for comparison in scientific floating point testing is an active area of research.
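
The standard library offers `math.isclose` for the same kind of comparison. The sketch below is one way to apply it, not a recommendation for any particular tolerance. Note that `rel_tol` scales with the larger of the two values being compared, so for the difference of large numbers we pass an explicit `abs_tol` derived from an assumed `magnitude` of `1e15`.

```python
import math
import sys

# Relative comparison: the tolerance is rel_tol times the larger value.
small_error = math.isclose(0.7, 0.7 + 1e-6, rel_tol=1e-5)
large_error = math.isclose(0.7, 0.7 + 1e-5, rel_tol=1e-5)

# Difference of large numbers: the inputs were only accurate to about
# magnitude * machine epsilon, so that is the scale to compare against.
result = (1e15 + 1.4) - (1e15 + 0.7)
magnitude = 1e15
tolerance = magnitude * sys.float_info.epsilon
scaled = math.isclose(result, 0.7, rel_tol=0.0, abs_tol=tolerance)

# A tolerance based on the size of the result alone is far too strict here.
naive = math.isclose(result, 0.7, rel_tol=1e-9)

print(small_error, large_error, scaled, naive)
```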

### Comparing vectors of floating points

Numerical vectors are best represented using [numpy](http://www.numpy.org/).

In [14]:
from numpy import array, pi

vector_of_reals = array([0.1, 0.2, 0.3, 0.4]) * pi

Numpy ships with a number of assertions (in ``numpy.testing``) to make
comparison easy:

In [15]:
from numpy import array, pi
from numpy.testing import assert_allclose
expected = array([0.1, 0.2, 0.3, 0.4, 1e-12]) * pi
actual = array([0.1, 0.2, 0.3, 0.4, 2e-12]) * pi
actual[:-1] += 1e-6

assert_allclose(actual, expected, rtol=1e-5, atol=1e-8)

`assert_allclose` checks, element by element, that `abs(actual - expected) <= atol + rtol * abs(expected)`.
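
The criterion can be spelled out in plain Python to show why both tolerances matter. This is a sketch of the formula only, not numpy's implementation, and `within_tolerance` is a hypothetical helper name.

```python
import math


def within_tolerance(actual, expected, rtol, atol):
    """Element-wise check of abs(a - e) <= atol + rtol * abs(e)."""
    return [abs(a - e) <= atol + rtol * abs(e) for a, e in zip(actual, expected)]


# Same data as the numpy example, built with plain lists.
expected = [x * math.pi for x in (0.1, 0.2, 0.3, 0.4, 1e-12)]
actual = [x * math.pi for x in (0.1, 0.2, 0.3, 0.4, 2e-12)]
actual[:-1] = [a + 1e-6 for a in actual[:-1]]

with_atol = within_tolerance(actual, expected, rtol=1e-5, atol=1e-8)
without_atol = within_tolerance(actual, expected, rtol=1e-5, atol=0.0)

print(with_atol)
print(without_atol)
```

Without the absolute tolerance, the last element fails: the expected value is so close to zero that the relative term alone permits almost no difference.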