In [1]:
import numpy as np

# Testing Software

## Overview and Principles
Testing is the process by which you exercise your code to determine if it performs as expected. The code you are testing is referred to as the **code under test**. 

There are two parts to writing tests:
1. invoking the code under test so that it is exercised in a particular way;
1. evaluating the results of executing the code under test to determine if it behaved as expected.

The collection of tests performed is referred to as the **test cases**. The fraction of the code under test that is executed as a result of running the test cases is referred to as **test coverage**.

For dynamic languages such as Python, it's extremely important to have high test coverage. In fact, you should try to get 100% coverage. This is because little checking is done when the source code is read by the Python interpreter. For example, the code under test might contain a line that calls an undefined function. This would not be detected until that line of code is executed.
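
As a small illustration of this point, the sketch below uses a hypothetical function `report` whose negative branch calls a name, `undefined_helper`, that is deliberately never defined. Python accepts the definition without complaint, and the problem only surfaces when a call actually reaches that line.

```python
def report(value):
    """Hypothetical function with a bug on a rarely used branch."""
    if value < 0:
        # undefined_helper is intentionally not defined anywhere.
        return undefined_helper(value)
    return value

# The common path runs fine, so the bug goes unnoticed.
print(report(1))

# Only a call that reaches the faulty line reveals the error.
try:
    report(-1)
except NameError as err:
    print("NameError:", err)
```

A set of test cases that never calls `report` with a negative value would pass, which is why covering every branch matters.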

Test cases can be of several types. Below are listed some common classifications of test cases.
- *Smoke test*. This is an invocation of the code under test to see if there is an unexpected exception. It's useful as a starting point, but this doesn't tell you anything about the correctness of the results of a computation.
- *One-shot test*. In this case, you call the code under test with arguments for which you know the expected result.
- *Edge test*. The code under test is invoked with arguments that should cause an exception, and you evaluate if the expected exception occurs.
- *Pattern test*. Based on your knowledge of the *calculation* (not implementation) of the code under test, you construct a suite of test cases for which the results are known or there are known patterns in these results that are used to evaluate the results returned.

Another principle of testing is to limit what is done in a single test case. Generally, a test case should focus on one use of one function. Sometimes, this is a challenge since the function being tested may call other functions that you are testing. This means that bugs in the called functions may cause failures in the tests of the calling functions. Often, you sort this out by knowing the structure of the code and focusing first on failures in lower-level tests. In other situations, you may use a more advanced technique called *mocking*. A discussion of mocking is beyond the scope of this course.

A best practice is to develop your tests while you are developing your code. Indeed, one school of thought in software engineering, called **test-driven development**, advocates that you write the tests *before* you implement the code under test so that the test cases become a kind of specification for what the code under test should do.

## Examples of Test Cases
This section presents examples of test cases. The code under test is the calculation of entropy.

### Entropy of a set of probabilities
$$
H = -\sum_i p_i \log(p_i)
$$
The calculation expects that $\sum_i p_i = 1$. So, this is something that our implementation should check.

In [2]:
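# Code Under Test
def entropy(ps):
    # Convert to an array so that lists and arrays are handled alike.
    ps = np.asarray(ps, dtype=float)
    # A valid distribution must sum to 1.
    if not np.isclose(np.sum(ps), 1):
        raise ValueError("Probabilities must sum to 1.")
    items = ps * np.log(ps)
    return -np.sum(items)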

Suppose that all of the probability of a distribution is at one point. An example of this is a coin with two heads. Whenever you flip it, you always get heads. That is, the probability of a head is 1.

What is the entropy of such a distribution? From the calculation above, we see that the entropy should be $-1 \cdot \log(1)$, which is 0. This means that we have a test case where we know the result!

In [14]:
# One-shot test.
if entropy([1.0]) != 0:
    print("Bad result!")
else:
    print("Worked!")

Worked!
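
A closely related case is worth noting. Writing the two-headed coin as `[1.0, 0.0]` makes the formula evaluate $0 \cdot \log(0)$, which NumPy computes as `nan` rather than the conventional value of 0. The sketch below is one possible way to handle this and is not part of the original code: `entropy_zero_safe` is a hypothetical variant that skips zero probabilities before taking the logarithm.

```python
import numpy as np

def entropy_zero_safe(ps):
    """Hypothetical variant of entropy that treats 0 * log(0) as 0."""
    ps = np.asarray(ps, dtype=float)
    if not np.isclose(ps.sum(), 1):
        raise ValueError("Probabilities must sum to 1.")
    # Keep only the strictly positive probabilities.
    nonzero = ps[ps > 0]
    return -np.sum(nonzero * np.log(nonzero))

# One-shot test for the two-headed coin written with an explicit zero.
if entropy_zero_safe([1.0, 0.0]) != 0:
    print("Bad result!")
else:
    print("Worked!")
```

Whether zero probabilities should be accepted at all is a design choice. The point is that a one-shot test with a slightly different input can expose behavior that the first test missed.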


One edge test of interest is to provide an input that is *not* a distribution because its probabilities don't sum to 1.

In [15]:
# Edge test.
try:
    entropy([0.1, 0.5])
    print("Bad result.")
except ValueError:
    print("Worked!")

Worked!


Now let's consider a pattern test. Examining the structure of the calculation of $H$, we consider a situation in which there are $n$ equal probabilities. That is, $p_i = \frac{1}{n}$.
$$
H = -\sum_{i=1}^{n} p_i \log(p_i) 
= -\sum_{i=1}^{n} \frac{1}{n} \log(\frac{1}{n}) 
= n (-\frac{1}{n} \log(\frac{1}{n}) )
= -\log(\frac{1}{n})
$$
For example, `entropy([0.5, 0.5])` should be $-\log(0.5)$.

In [5]:
# Pattern test
def test_equal_probabilities(n):
    # n equal probabilities of 1/n should give an entropy of -log(1/n).
    prob = 1.0 / n
    ps = np.repeat(prob, n)
    if entropy(ps) != -np.log(prob):
        print("Bad result.")
    else:
        print("Worked!")

# Run a test
test_equal_probabilities(2)

Worked!
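
The derivation above suggests a second pattern: for $n$ equal probabilities the entropy is $\log(n)$, so it should grow as $n$ grows. The sketch below checks that pattern without needing the exact value for any single $n$. The helper name `check_increasing_entropy` is made up for illustration, and `entropy` is a lightly adapted copy of the code under test so that the block runs on its own.

```python
import numpy as np

def entropy(ps):
    # Lightly adapted copy of the code under test, for self-containment.
    ps = np.asarray(ps, dtype=float)
    if not np.isclose(np.sum(ps), 1):
        raise ValueError("Probabilities must sum to 1.")
    return -np.sum(ps * np.log(ps))

def check_increasing_entropy(max_n):
    """Hypothetical helper: True if uniform entropies increase with n."""
    values = [entropy(np.repeat(1.0 / n, n)) for n in range(1, max_n + 1)]
    # Compare each value with its successor.
    return all(a < b for a, b in zip(values, values[1:]))

if check_increasing_entropy(50):
    print("Worked!")
else:
    print("Bad result.")
```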


You see that there are many, many cases to test. So far, we've been writing special-purpose code for each test case. We can do better.

## Unittest Infrastructure

There are several reasons to use a test infrastructure:
- If you have a lot of test cases, you'll end up writing a lot less code.
- The infrastructure provides a uniform way to report test failures, and to report the failures of many tests, rather than just one at a time.
- A test infrastructure can tell you about coverage so you know what tests to add.

We'll be using the `unittest` framework, which is part of the Python standard library. Using this infrastructure requires the following:
1. import the `unittest` module
1. define a class that inherits from `unittest.TestCase`
1. write methods that run the code to be tested and check the outcomes.

The last item has two subparts. First, we must identify which methods in the class inheriting from unittest.TestCase are tests. You indicate that a method is to be run as a test by having the method name begin with "test".

Second, the "test methods" should communicate to the infrastructure the results of evaluating output from the code under test. This is done by calling assertion methods inherited from `unittest.TestCase`. For example, `self.assertEqual` takes two arguments. If these are objects for which `==` returns `True`, then the test passes. Otherwise, the test fails.
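
To illustrate the naming rule, the sketch below asks the loader which methods it would run. The class `NamingExample` and its methods are hypothetical. `TestLoader.getTestCaseNames` is the standard `unittest` call that lists the method names treated as tests.

```python
import unittest

class NamingExample(unittest.TestCase):
    """Hypothetical class used only to illustrate the naming rule."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def helper(self):
        # Not collected: the name does not begin with "test".
        return 42

    def check_subtraction(self):
        # Not collected either, even though it contains an assertion.
        self.assertEqual(2 - 1, 1)

# getTestCaseNames lists the methods the loader treats as tests.
names = unittest.TestLoader().getTestCaseNames(NamingExample)
print(list(names))
```

Only `test_addition` is listed, so helper methods can live in the same class without being run as tests.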

In [6]:
import unittest

# Define a class in which the tests will run
class UnitTests(unittest.TestCase):

    # Each method whose name begins with "test" is run as a test
    def test_upper(self):
        self.assertEqual(1, 1)

    def test_isupper(self):
        self.assertEqual(1, 2)


In [7]:
# Running unittests inside Jupyter requires some special code.
# This code is encapsulated in the function below. When you create
# files containing unittests, it will look simpler.
def test(test_class=UnitTests):
    # Convenience function to run tests.
    # test_class is the class containing the tests.
    suite = unittest.TestLoader().loadTestsFromTestCase(test_class)
    unittest.TextTestRunner().run(suite)

In [8]:
# The function test runs the class UnitTests.
test()

F.
FAIL: test_isupper (__main__.UnitTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-6-091b0079f427>", line 11, in test_isupper
    self.assertEqual(1, 2)
AssertionError: 1 != 2

----------------------------------------------------------------------
Ran 2 tests in 0.009s

FAILED (failures=1)


As expected, the first test passes, but the second test fails.
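
`assertEqual` is only one of the assertion methods that `unittest.TestCase` provides. The sketch below shows a few others; the class `AssertionExamples` and its test names are made up for illustration, and the suite is run with the standard loader and runner so that the block works outside the notebook as well.

```python
import unittest

class AssertionExamples(unittest.TestCase):
    """Hypothetical tests showing a few common assertion methods."""

    def test_truth(self):
        self.assertTrue(3 > 2)
        self.assertFalse(3 < 2)

    def test_membership(self):
        self.assertIn("b", ["a", "b", "c"])

    def test_comparison(self):
        self.assertGreater(len("abc"), 2)
        self.assertIsNone(None)

# Collect the test methods of the class and run them.
suite = unittest.TestLoader().loadTestsFromTestCase(AssertionExamples)
result = unittest.TextTestRunner().run(suite)
```

Choosing the most specific assertion usually gives a more informative failure message than wrapping a condition in `assertTrue`.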

Below, we test the `entropy` function using the unittest infrastructure.

In [17]:
import unittest

# Define a class in which the tests will run
class EntropyTest(unittest.TestCase):

    def test_one_shot(self):
        self.assertEqual(entropy([1.0]), 0.0)
        
test(test_class=EntropyTest)


.
----------------------------------------------------------------------
Ran 1 test in 0.003s

OK


Although there's some setup, the tests are much easier to write than what we did before:
```
ps = [1.0]  # Should have an entropy of 0
if entropy(ps) != 0:
    print("Got a bad result.")
```

Now we can add LOTS of tests.

In [10]:
import unittest

# Define a class in which the tests will run
class EntropyTest(unittest.TestCase):

    def test_one_shot(self):
        self.assertEqual(entropy([1.0]), 0.0)
        
    def _test_equal_probability(self, n):
        prob = 1.0/n
        ps = np.repeat(prob , n)
        self.assertEqual(entropy(ps), -np.log(prob))
        
    def test_equal_probability(self):
        self._test_equal_probability(2)
        self._test_equal_probability(20)
        self._test_equal_probability(200)
        
test(test_class=EntropyTest)


F.
FAIL: test_equal_probability (__main__.EntropyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython-input-10-a89793b0b452>", line 17, in test_equal_probability
    self._test_equal_probability(200)
  File "<ipython-input-10-a89793b0b452>", line 12, in _test_equal_probability
    self.assertEqual(entropy(ps), -np.log(prob))
AssertionError: 5.2983173665480372 != 5.2983173665480363

----------------------------------------------------------------------
Ran 2 tests in 0.012s

FAILED (failures=1)


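Why did this test fail? The two values differ only in their last digits, a result of floating-point rounding, so an exact equality check is too strict. How do we deal with this? One option is to compare with a tolerance using `np.isclose`.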

In [11]:
import unittest

# Define a class in which the tests will run
class EntropyTest(unittest.TestCase):

    def test_one_shot(self):
        self.assertEqual(entropy([1.0]), 0.0)
        
    def _test_equal_probability(self, n):
        prob = 1.0/n
        ps = np.repeat(prob , n)
        self.assertTrue(np.isclose(entropy(ps), -np.log(prob)))
        
    def test_equal_probability(self):
        self._test_equal_probability(2)
        self._test_equal_probability(20)
        self._test_equal_probability(200)
        
test(test_class=EntropyTest)

..
----------------------------------------------------------------------
Ran 2 tests in 0.009s

OK
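
The same tolerance-based comparison can also be written with `unittest`'s own `assertAlmostEqual`, and the repeated helper calls can be replaced by a loop using `subTest`, which reports the failing value of `n` if one occurs. This is a sketch of an alternative, not a replacement for the cell above. The class name `EntropyToleranceTest` and the choice of `places=12` are assumptions, and `entropy` is a lightly adapted copy of the code under test.

```python
import unittest
import numpy as np

def entropy(ps):
    # Lightly adapted copy of the code under test, for self-containment.
    ps = np.asarray(ps, dtype=float)
    if not np.isclose(np.sum(ps), 1):
        raise ValueError("Probabilities must sum to 1.")
    return -np.sum(ps * np.log(ps))

class EntropyToleranceTest(unittest.TestCase):

    def test_equal_probability(self):
        for n in [2, 20, 200]:
            # subTest reports which value of n failed, if any.
            with self.subTest(n=n):
                prob = 1.0 / n
                ps = np.repeat(prob, n)
                # Passes if the difference rounds to 0 at 12 decimal places.
                self.assertAlmostEqual(entropy(ps), -np.log(prob), places=12)

suite = unittest.TestLoader().loadTestsFromTestCase(EntropyToleranceTest)
result = unittest.TextTestRunner().run(suite)
```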


## Testing For Exceptions

Testing edge cases often involves handling exceptions. One approach is to code this directly.

In [12]:
import unittest

# Define a class in which the tests will run
class EntropyTest(unittest.TestCase):
        
    def test_invalid_probability(self):
        try:
            entropy([0.1, 0.5])
            self.assertTrue(False)
        except ValueError:
            self.assertTrue(True)
        
test(test_class=EntropyTest)

.
----------------------------------------------------------------------
Ran 1 test in 0.007s

OK


`unittest` provides help with testing exceptions: the `assertRaises` context manager passes only if the enclosed code raises the named exception.

In [13]:
import unittest

# Define a class in which the tests will run
class EntropyTest(unittest.TestCase):
        
    def test_invalid_probability(self):
        with self.assertRaises(ValueError):
            entropy([0.1, 0.5])
        
test(test_class=EntropyTest)

.
----------------------------------------------------------------------
Ran 1 test in 0.004s

OK
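
It can also be useful to check the exception's message, not just its type. The sketch below shows two standard ways of doing so: `assertRaisesRegex`, and the `exception` attribute of the context manager returned by `assertRaises`. The class name `EntropyMessageTest` is made up for illustration, and `entropy` is a lightly adapted copy of the code under test.

```python
import unittest
import numpy as np

def entropy(ps):
    # Lightly adapted copy of the code under test, for self-containment.
    ps = np.asarray(ps, dtype=float)
    if not np.isclose(np.sum(ps), 1):
        raise ValueError("Probabilities must sum to 1.")
    return -np.sum(ps * np.log(ps))

class EntropyMessageTest(unittest.TestCase):

    def test_invalid_probability_message(self):
        # The pattern is searched for within the exception message.
        with self.assertRaisesRegex(ValueError, "sum to 1"):
            entropy([0.1, 0.5])

    def test_exception_object(self):
        # The raised exception is available after the block exits.
        with self.assertRaises(ValueError) as context:
            entropy([0.1, 0.5])
        self.assertIn("Probabilities", str(context.exception))

suite = unittest.TestLoader().loadTestsFromTestCase(EntropyMessageTest)
result = unittest.TextTestRunner().run(suite)
```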


## Test Files
Although I presented the elements of `unittest` in a notebook, your tests should be in a file. If the name of the module with the code under test is `foo.py`, then the name of the test file should be `test_foo.py`.

The structure of the test file will be very similar to the cells above. You will import `unittest`. You must also import the module with the code under test. Take a look at `test_prime.py` in this directory to see an example.
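
As a rough sketch of that layout, a `test_foo.py` file might look like the following. The function `double` and the class `TestDouble` are hypothetical and are not taken from `test_prime.py`. In a real test file the function would be imported from `foo.py`; it is defined inline here only so that the sketch runs on its own.

```python
# test_foo.py (hypothetical file, following the naming convention above)
import unittest

# In a real test file this would be an import such as
# `from foo import double`. It is defined inline for this sketch.
def double(x):
    return 2 * x

class TestDouble(unittest.TestCase):

    def test_one_shot(self):
        self.assertEqual(double(3), 6)

    def test_invalid_argument(self):
        # None does not support multiplication, so a TypeError is expected.
        with self.assertRaises(TypeError):
            double(None)
```

Such a file can be run from the command line with `python -m unittest test_foo`, which finds and runs the test methods without the notebook-specific helper.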

## Discussion
**Question**: What tests would you write for a plotting function?

## Exercise
- Debug `prime.py`.
- Write a one-shot test, an edge test, and a pattern test for `prime.py`.
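- Try using nosetests to get coverage information (`nosetests --with-coverage test_prime.py`). You may have to install the coverage module (look at https://stackoverflow.com/questions/14488601/how-to-fix-python-nose-coverage-not-available-unable-to-import-coverage-module). If `nosetests` is not available in your environment, running `coverage run -m unittest test_prime.py` followed by `coverage report` is an alternative.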