# Linge & Langtangen, "Programming for Computations"
## Ch. 3.4 Testing

### Ch. 3.4.1 Problems with brief testing procedures

Testing of the programs for numerical integration has so far employed two strategies.

1. If we have an exact answer, we compute the error and see that increasing $n$ decreases the error.

2. When the exact answer is not available, we can (as in the comparison example in the previous section) look at the integral values and see that they stabilize as $n$ grows.
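Both strategies can be sketched in a few lines. The `trapezoidal` function below is an assumption: a plain implementation of the composite trapezoidal rule, standing in for the one imported from the `trapezoidal` module in the cells of this section.

```python
from math import exp

def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule with n subintervals (assumed implementation)."""
    h = (b - a) / n
    result = 0.5 * f(a) + 0.5 * f(b)
    for i in range(1, n):
        result += f(a + i * h)
    return h * result

v = lambda t: 3 * t**2 * exp(t**3)   # integrand
exact = exp(1) - exp(0)              # from the antiderivative exp(t**3)

# Strategy 1: with an exact answer, the error should shrink as n grows
errors = [abs(exact - trapezoidal(v, 0, 1, n)) for n in (4, 8, 16, 32)]

# Strategy 2: without an exact answer, successive values should stabilize
values = [trapezoidal(v, 0, 1, n) for n in (100, 200, 400)]
changes = [abs(values[i + 1] - values[i]) for i in range(len(values) - 1)]

print(errors)
print(changes)
```

Neither check proves that the program is correct: a decreasing error or a stabilizing value only shows that nothing is obviously wrong.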

$Unit$ $testing$

A good habit is to test small pieces of a larger code individually, one at a time. This is known as $unit$ $testing$. One identifies a (small) unit of the code, and then one makes a separate test for this unit.



The unit test should be stand-alone in the sense that it can be run without the outcome of other tests. Typically, one algorithm in a scientific program is treated as a unit. The challenge with unit tests in numerical computing is to deal with numerical approximation errors.

A fortunate side effect of unit testing is that the programmer is forced to use functions to modularize the code into smaller, logical pieces.

### Ch. 3.4.2 Proper test procedures

There are three serious ways to test the implementation of numerical methods via unit tests:

1. $Comparing$ $with$ $hand$-$computed$ $results$ in a problem with few arithmetic operations, i.e., small $n$.

2. $Solving$ $a$ $problem$ $without$ $numerical$ $errors$. We know that the trapezoidal rule must be exact for linear functions. The error produced by the program must then be zero (to machine precision).

3. $Demonstrating$ $correct$ $convergence$ $rates$. A strong test, when we can compute exact errors, is to see how fast the error goes to zero as $n$ grows. For the trapezoidal and midpoint rules it is known that the error depends on $n$ as $n^{-2}$ as $n \rightarrow \infty$.

$Hand-$$computed$ $results$.

Let us use two trapezoids to compute the integral $\int_0^1 v(t)\,dt$, where $v(t) = 3t^2 e^{t^{3}}$. With trapezoid width $h = 0.5$,

$$\frac{1}{2}h(v(0) + v(0.5)) + \frac{1}{2}h(v(0.5)+v(1)) = 2.463642041244344.$$

In [5]:
from trapezoidal import trapezoidal
import numpy as np

F = lambda t: 3*t**2*np.exp(t**3)
p = trapezoidal(F, 0, 1, 2)

print("::: Trapezoidal :::")
print(p)
print("-----------------------------")

::: Trapezoidal :::
2.463642041244344
-----------------------------


$Solving$ $a$ $problem$ $without$ $numerical$ $errors.$

The best unit tests for numerical algorithms involve mathematical problems where we know the numerical result beforehand.

Usually, numerical results contain unknown approximation errors, so knowing the numerical result implies that we have a problem where the approximation errors vanish.

This feature may be present in very simple mathematical problems.

A specific test case can be $\int_{1.2}^{4.4}(6x-4)dx$. This integral involves an "arbitrary" interval [1.2, 4.4] and an "arbitrary" linear function $f(x) = 6x -4$.

By "arbitrary" we mean expressions where we avoid the special numbers 0 and 1 since these have special properties in arithmetic operations.
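To see why the special numbers are risky, consider a minimal sketch with a deliberately broken function. The name `trapezoidal_buggy` and its bug (forgetting the factor $h$) are hypothetical, made up for illustration only.

```python
def trapezoidal_buggy(f, a, b, n):
    """Trapezoidal rule with a deliberate bug: the factor h is forgotten."""
    h = (b - a) / n
    result = 0.5 * f(a) + 0.5 * f(b)
    for i in range(1, n):
        result += f(a + i * h)
    return result          # BUG: should be h * result

f = lambda x: 6 * x - 4
F = lambda x: 3 * x**2 - 4 * x       # antiderivative of f

# On [0, 1] with n = 1 we have h = 1, so the missing factor goes unnoticed
err_special = abs(trapezoidal_buggy(f, 0, 1, 1) - (F(1) - F(0)))

# On the "arbitrary" interval [1.2, 4.4] the bug shows up at once
err_arbitrary = abs(trapezoidal_buggy(f, 1.2, 4.4, 1) - (F(4.4) - F(1.2)))

print(err_special, err_arbitrary)
```

A test on $[0, 1]$ would let this bug through, while the test on $[1.2, 4.4]$ reports a large error.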

In [25]:
from trapezoidal import trapezoidal
import numpy as np

F1 = lambda t: 6*t - 4
n = 1     # The rule is exact for linear integrands, so any n gives the same result
p1 = trapezoidal(F1, 1.2, 4.4, n)

F2 = lambda t: 3*t**2 - 4*t
p2 = F2(4.4)-F2(1.2)

print("::: Trapezoidal :::")
print(p1)
print("-----------------------------")

print("::: Exact :::")
print(p2)
print("-----------------------------")

::: Trapezoidal :::
40.96000000000001
-----------------------------
::: Exact :::
40.96000000000001
-----------------------------


$Demonstrating$ $correct$ $convergence$ $rates.$

Normally, unit tests must be based on problems where the numerical approximation errors in our implementation remain unknown. However, we often know or may assume a certain $asymptotic$ behavior of the error. We can do some experimental runs with the test problem $\int_{0}^{1}3t^2e^{t^3}dt$ where $n$ is doubled in each run: $n = 4, 8, 16$. The corresponding relative errors are then about 12%, 3% and 0.77%, respectively.

In [32]:
from trapezoidal import trapezoidal
import numpy as np

F1 = lambda t: 3*t**2*np.exp(t**3)
F2 = lambda t: np.exp(t**3)

n1 = 4
n2 = 8
n3 = 16

p1 = trapezoidal(F1, 0, 1, n1)
p2 = trapezoidal(F1, 0, 1, n2)
p3 = trapezoidal(F1, 0, 1, n3)

exact = F2(1) - F2(0)

error1 = abs(exact - p1)
error2 = abs(exact - p2)
error3 = abs(exact - p3)

rel_error1 = (error1/exact)*100
rel_error2 = (error2/exact)*100
rel_error3 = (error3/exact)*100

print("::: Error n = %4d :::" % n1)
print(rel_error1)
print("-----------------------------")

print("::: Error n = %4d :::" % n2)
print(rel_error2)
print("-----------------------------")

print("::: Error n = %4d :::" % n3)
print(rel_error3)
print("-----------------------------")

::: Error n =    4 :::
11.89763626796125
-----------------------------
::: Error n =    8 :::
3.0591090942323977
-----------------------------
::: Error n =   16 :::
0.7704993341376286
-----------------------------


These numbers indicate that the error is roughly reduced by a factor of 4 when doubling $n$. Thus, the error converges to zero as $n^{-2}$ and we say that the $convergence$ $rate$ is 2.

Numerical integration methods usually have an error that converges to zero as $n^{-p}$ for some $p$ that depends on the method. With such a result, it does not matter if we do not know what the actual approximation error is: we know at what rate it is $reduced$, so running the implementation for two or more different $n$ values will put us in a position to measure the expected rate and see if it is achieved.

Let us develop a more precise method for such unit tests based on convergence rates. We assume that the error $E$ depends on $n$ according to

$$E = Cn^{r},$$

where $C$ is an unknown constant and $r$ is the convergence rate. With this sign convention, an error that decays as $n^{-2}$ corresponds to $r = -2$. Consider a set of experiments with various $n$: $n_{0},n_{1},n_{2},\ldots,n_{q}$.

We compute the corresponding errors $E_{0},\ldots,E_{q}$. For two consecutive experiments, number $i$ and $i-1$, we have the error model

$$E_{i} = Cn_{i}^{r},$$
$$E_{i-1} = Cn_{i-1}^{r}.$$


These are two equations for two unknowns $C$ and $r$. We can easily eliminate $C$ by dividing the equations by each other. Then solving for $r$ gives

$$r_{i-1} = \frac{\ln(E_{i}/E_{i-1})}{\ln(n_{i}/n_{i-1})}. \tag{3.24}$$

We have introduced a subscript $i-1$ in $r$ since the estimated value for $r$ varies with $i$. Hopefully, $r_{i-1}$ approaches the correct convergence rate as the number of intervals increases and $i \rightarrow q$.
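As a quick check of the formula, the sketch below applies it to the three relative errors printed earlier for $n = 4, 8, 16$. Relative errors can be used in place of absolute ones here, since the common factor cancels in the ratio $E_{i}/E_{i-1}$.

```python
from math import log

# Relative errors (in percent) printed earlier for n = 4, 8, 16
n = [4, 8, 16]
E = [11.89763626796125, 3.0591090942323977, 0.7704993341376286]

# Pairwise estimates r_{i-1} = ln(E_i/E_{i-1}) / ln(n_i/n_{i-1})
rates = [log(E[i] / E[i - 1]) / log(n[i] / n[i - 1])
         for i in range(1, len(n))]

print(['%.2f' % r for r in rates])
```

The two estimates come out at about $-1.96$ and $-1.99$, moving toward the expected $-2$ as $n$ grows.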

### Ch. 3.4.3 Finite precision of floating-point numbers

The test procedures above lead to comparison of numbers for checking that calculations were correct. Such comparison is more complicated than what a newcomer might think.
Suppose we have a calculation $a + b$ and want to check that the result is what we expect. We start with $1 + 2$:

In [37]:
a = 1; b = 2; expected = 3
a + b == expected

True

Then we proceed with 0.1 + 0.2:

In [39]:
a = 0.1 ; b = 0.2 ; expected = 0.3
a + b == expected

False

The reason is that real numbers cannot in general be exactly represented on a computer.

They must instead be approximated by a floating-point number that can only store a finite amount of information, usually about $17$ digits of a real number.

Let us print $0.1$, $0.2$, $0.1 + 0.2$, and $0.3$ with $17$ decimals:

In [42]:
print('%.17f\n%.17f\n%.17f\n%.17f' % (0.1, 0.2, 0.1 + 0.2, 0.3))

0.10000000000000001
0.20000000000000001
0.30000000000000004
0.29999999999999999


In general, real numbers in Python have (at most) 16 correct decimals.
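The precision limits can also be queried directly. The sketch below uses `sys.float_info` from the standard library and assumes that Python floats are IEEE 754 double-precision numbers, which is the case on common platforms.

```python
import sys

eps = sys.float_info.epsilon   # gap between 1.0 and the next larger float
digits = sys.float_info.dig    # decimal digits that are always preserved

# The rounding error in 0.1 + 0.2 is smaller than that gap
diff = abs((0.1 + 0.2) - 0.3)

print(eps, digits, diff)
```

Under that assumption `eps` is about $2.2 \cdot 10^{-16}$, so differences of this size between two numbers close to 1 are rounding noise rather than programming errors.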

If we cannot make tests like $0.1 + 0.2 == 0.3$, what should we then do? The answer is that we must accept some small inaccuracy and make a test with a $tolerance$. Here is the recipe:

In [48]:
a = 0.1 ; b = 0.2 ; expected = 0.3
computed = a + b
diff = abs(expected - computed)
tol = 1E-15

if diff < tol:
    print("::: True :::")
    print("::: Error :::")
    print(diff)
    print("-----------------------------")

::: True :::
::: Error :::
5.551115123125783e-17
-----------------------------


Here we have set the tolerance for comparison to $10^{-15}$, but calculating $0.3 - (0.1 + 0.2)$ shows that it equals $-5.55 \cdot 10^{-17}$, so a lower tolerance could be used in this particular example.

### Ch. 3.4.4 Constructing unit tests and writing test functions

Python has several frameworks for automatically running and checking a potentially very large number of tests for parts of your software by one command.

This is an extremely useful feature during program development: whenever you have made changes to one or more files, launch the test command and make sure nothing is broken because of your edits.

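The test frameworks $nose$ and $py.test$ are particularly attractive as they are very easy to use. Note that $py.test$ is nowadays distributed under the name pytest, while $nose$ is no longer actively maintained.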


The requirements for a test function are simple:

$\diamond$ the name must start with `test_`

$\diamond$ the test function cannot have any arguments

$\diamond$ the tests inside test functions must be boolean expressions

$\diamond$ a boolean expression `b` must be tested with `assert b, msg`, where `msg` is an optional object (string or number) to be written out when `b` is false.

In [55]:
def add(a, b):
    return a + b

def test_add_integers():
    expected = 1 + 1
    computed = add(1, 1)
    assert computed == expected, '1+1=%g' % computed
    print("test_add_integers passed")   # reached only if the assert holds

def test_add_floats():
    expected = 0.3
    computed = add(0.1, 0.2)
    tol = 1E-14
    diff = abs(expected - computed)
    assert diff < tol, 'diff=%g' % diff
    print("test_add_floats passed")     # reached only if the assert holds

test_add_integers()
test_add_floats()

test_add_integers passed
test_add_floats passed
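The naming and no-argument requirements are what allow a framework to find and call the tests on its own. The sketch below is a hand-rolled illustration of that idea, not how nose or pytest are actually implemented. The function `run_tests` and the test names are hypothetical.

```python
def add(a, b):
    return a + b

def test_add_integers():
    computed = add(1, 1)
    assert computed == 2, '1+1=%g' % computed

def test_add_floats():
    diff = abs(0.3 - add(0.1, 0.2))
    assert diff < 1E-14, 'diff=%g' % diff

def run_tests(namespace):
    """Call every function in namespace whose name starts with 'test_'."""
    passed = []
    for name, obj in sorted(namespace.items()):
        if name.startswith('test_') and callable(obj):
            obj()                # no arguments; AssertionError on failure
            passed.append(name)
    return passed

# An explicit namespace; in a script one could pass dict(globals()) instead
namespace = {'add': add,
             'test_add_integers': test_add_integers,
             'test_add_floats': test_add_floats}
passed = run_tests(namespace)
print(passed)
```

The helper `add` is skipped because its name lacks the `test_` prefix, and each test can be called without knowing anything about it because it takes no arguments.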


$Hand-$$computed$ $numerical$ $results$.

In [59]:
from trapezoidal import trapezoidal

def test_trapezoidal_one_exact_result():
    """Compare one hand-computed result"""
    from math import exp
    v = lambda t: 3*(t**2)*exp(t**3)
    n = 2
    computed = trapezoidal(v, 0, 1, n)
    expected = 2.463642041244344
    error = abs(expected - computed)
    tol = 1E-14
    success = error < tol
    msg = 'error=%g > tol=%g' % (error, tol)
    assert success, msg
    print("success=%s, error=%g, tol=%g" % (success, error, tol))
    
if __name__ == '__main__':
    test_trapezoidal_one_exact_result()


success=True, error=0, tol=1e-14


Note the importance of checking $computed$ against $expected$ with a tolerance: rounding errors from the arithmetic inside $trapezoidal$ may keep the result from being exactly equal to the hand-computed one. The size of the tolerance is here set to $10^{-14}$, which is a kind of all-round value for computations with numbers not deviating much from unity.
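The restriction to numbers near unity matters. The sketch below repeats the $0.1 + 0.2$ comparison after scaling both sides by $10^{10}$. The scaling factor is an arbitrary choice for illustration, and IEEE 754 double precision is assumed.

```python
# Same rounding error as in 0.1 + 0.2, but scaled up by 1E+10
expected = 0.3 * 1E+10
computed = (0.1 + 0.2) * 1E+10

abs_error = abs(expected - computed)
rel_error = abs_error / abs(expected)
tol = 1E-14

print(abs_error < tol)   # absolute test
print(rel_error < tol)   # relative test
```

With numbers of this size the absolute error is far above $10^{-14}$ even though the computation is as accurate as floating-point arithmetic allows, so dividing the error by the size of the expected value is the safer test. The standard library function `math.isclose` performs a comparison of this relative kind.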

$Solving$ $a$ $problem$ $without$ $numerical$ $errors.$

We know that the trapezoidal rule is exact for linear integrands. Choosing the integral $\int_{1.2}^{4.4}(6x-4)dx$ as a test case, the corresponding test function for this unit test may look like

In [69]:
def test_trapezoidal_linear():
    """Check that linear functions are integrated exactly."""
    f = lambda x: 6*x - 4
    F = lambda x: 3*x**2 - 4*x # Anti-derivative
    a = 1.2; b = 4.4
    expected = F(b) - F(a)
    tol = 1E-14
    for n in 2, 20, 21:
        computed = trapezoidal(f, a, b, n)
        error = abs(expected - computed)
        success = error < tol
        msg = 'n=%d, err=%g' % (n, error)
        assert success, msg
        # The lines below are reached only if the assert holds
        print('The number of intervals : %s ' % (n))
        print('Error : %s ' % (error))
        print('Successfully operating')
        print("-----------------------------")
        
if __name__ == '__main__':
    test_trapezoidal_linear()

The number of intervals : 2 
Error : 7.105427357601002e-15 
Successfully operating
-----------------------------
The number of intervals : 20 
Error : 0.0 
Successfully operating
-----------------------------
The number of intervals : 21 
Error : 0.0 
Successfully operating
-----------------------------


$Demonstrating$ $correct$ $convergence$ $rates$.

In the present example with integration, it is known that the approximation errors in the trapezoidal rule are proportional to $n^{-2}$, $n$ being the number of subintervals used in the composite rule.

Computing convergence rates requires somewhat more tedious programming than the previous tests, but can be applied to more general integrands. The algorithm typically goes like 

$\diamond$ for $ i = 0, 1, 2, \ldots, q$

 $\circ$ $n_{i} = 2^{i+1}$

 $\circ$ Compute integral with $n_{i}$ intervals

 $\circ$ Compute the error $E_{i}$

 $\circ$ Estimate $r_{i-1}$ from (3.24) if $i > 0$

The corresponding code may look like

In [76]:
from trapezoidal import trapezoidal

def convergence_rates(f, F, a, b, num_experiments=14):
    from math import log
    from numpy import zeros
    expected = F(b) - F(a)
    n = zeros(num_experiments, dtype=int)
    E = zeros(num_experiments)
    r = zeros(num_experiments-1)
    for i in range(num_experiments):
        n[i] = 2**(i+1)
        computed = trapezoidal(f, a, b, n[i])
        E[i] = abs(expected - computed)
        if i > 0:
            r_im1 = log(E[i]/E[i-1])/log(float(n[i])/n[i-1])
            r[i-1] = float('%.2f' % r_im1) # Round to two decimals
    return r  # Return only after all experiments have been run

def test_trapezoidal_conv_rate():
    """Check empirical convergence rates against the expected -2."""
    from math import exp
    v = lambda t: 3*(t**2)*exp(t**3)
    V = lambda t: exp(t**3)
    a = 1.1; b = 1.9
    r = convergence_rates(v, V, a, b, 14)
    print(r)
    tol = 0.01
    msg = str(r[-4:]) # show last 4 estimated rates
    assert abs(r[-1] + 2) < tol, msg  # the last rate should be close to -2
        
if __name__ == '__main__':
    test_trapezoidal_conv_rate()


Making a test function is a matter of choosing $f$, $F$, $a$, and $b$, and then checking the value of $r_{i}$ for the largest $i$.

Running the test shows that all $r_{i}$, except the first one, match the target rate of 2 in magnitude (that is, $r_{i} \approx -2$) within two decimals. This observation suggests a tolerance of $10^{-2}$.
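The same kind of test carries over to the midpoint rule, whose error is also stated above to behave as $n^{-2}$. The sketch below is self-contained. The functions `midpoint` and `pairwise_rates` are assumed implementations written for this illustration, and the number of experiments is an arbitrary choice.

```python
from math import exp, log

def midpoint(f, a, b, n):
    """Composite midpoint rule with n subintervals (assumed implementation)."""
    h = (b - a) / n
    return h * sum(f(a + h / 2 + i * h) for i in range(n))

def pairwise_rates(method, f, F, a, b, num_experiments=10):
    """Estimate r in E = C*n**r from consecutive runs with n = 2, 4, 8, ..."""
    expected = F(b) - F(a)
    n = [2 ** (i + 1) for i in range(num_experiments)]
    E = [abs(expected - method(f, a, b, n_i)) for n_i in n]
    return [log(E[i] / E[i - 1]) / log(n[i] / n[i - 1])
            for i in range(1, num_experiments)]

def test_midpoint_conv_rate():
    """Check that the last estimated rate is close to the expected -2."""
    v = lambda t: 3 * t**2 * exp(t**3)
    V = lambda t: exp(t**3)          # antiderivative of v
    r = pairwise_rates(midpoint, v, V, 1.1, 1.9)
    tol = 0.01
    assert abs(r[-1] + 2) < tol, str(r[-4:])

test_midpoint_conv_rate()
```

Passing the integration method as an argument means the same rate computation can serve several methods, each with its own short test function.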