# Python Testing and Test-Driven Development (TDD)



## Learning Objectives

- Understand the place of testing in a scientific workflow.
- Understand how to run a test suite using the pytest framework
- Understand how continuous integration speeds software development
- Understand the benefits of continuous integration
- Identify a few options for hosting a continuous integration server

*Note*: This notebook borrows from http://katyhuff.github.io/python-testing

## Basics of Testing


The first step toward getting the right answers from our programs is to assume that mistakes will happen and to guard against them. This is called **defensive programming** and the most common way to do it is to add alarms and tests into our code so that it checks itself.

**Testing** should be a seamless part of the scientific software development process. This is analogous to experiment design in the experimental sciences:

- At the beginning of a new project, tests can be used to help guide the overall architecture of the project.
- The act of writing tests can help clarify how the software should perform when you are done.
- In fact, starting to write the tests before you even write the software might be advisable. (Such a practice is called `test-driven development`.)

There are many ways to test software, such as:

- **Assertions and Exceptions**: While writing code, `exceptions` and `assertions` can be added to sound an alarm as runtime problems come up. These kinds of tests are embedded in the software itself and handle, as their name implies, exceptional cases rather than the norm.


- **Unit Tests**: Unit tests investigate the behavior of units of code (such as functions, classes, or data structures). By validating each software unit across the valid range of its input and output parameters, tracking down unexpected behavior that may appear when the units are combined is made vastly simpler.

- **Integration Tests**: Integration tests exercise more than one unit of the system. The goal is to check whether these units have been integrated correctly.
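
To make the difference between the last two kinds of test concrete, here is a minimal sketch. The helper `parse_numbers` is hypothetical and exists only for this illustration; the first two tests each check one unit in isolation, while the third checks that the two units work together.

```python
def parse_numbers(text):
    """Hypothetical unit 1: turn '1, 2, 3' into [1.0, 2.0, 3.0]."""
    return [float(token) for token in text.split(",") if token.strip()]


def mean(num_list):
    """Unit 2: arithmetic mean of a non-empty list."""
    return sum(num_list) / len(num_list)


def test_parse_numbers():
    # Unit test: only the parser is exercised.
    assert parse_numbers("1, 2, 3") == [1.0, 2.0, 3.0]


def test_mean():
    # Unit test: only the mean is exercised.
    assert mean([1.0, 2.0, 3.0]) == 2.0


def test_parse_then_mean():
    # Integration test: the output of one unit feeds the other.
    assert mean(parse_numbers("1, 2, 3")) == 2.0
```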

### Assertions

- Assertions are one-line tests embedded in code.
- Assertions are the building blocks of tests
- The `assert` keyword is used to set an assertion
- Assertions halt execution if the argument is false
- Assertions do nothing if the argument is true.


In [1]:
def mean(num_list):
    #assert len(num_list) != 0
    return sum(num_list) / len(num_list)

In [2]:
a = [1, 2, 3]
b = []

In [3]:
mean(a)

2.0

In [4]:
mean(b)

ZeroDivisionError: division by zero
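
For comparison, here is a sketch of the same function with the assertion switched on. The name `mean_checked` and the assertion message are assumptions made for this illustration. With the assertion in place, the empty list is caught by our own check, with our own message, before the division is attempted.

```python
def mean_checked(num_list):
    # Hypothetical variant of mean() with the assertion enabled.
    assert len(num_list) != 0, "num_list must not be empty"
    return sum(num_list) / len(num_list)


print(mean_checked([1, 2, 3]))  # the assertion is true, so nothing happens

try:
    mean_checked([])
except AssertionError as err:
    # The assertion halts the function before the division runs.
    print(f"AssertionError: {err}")
```

Note that Python skips `assert` statements when run with the `-O` flag, so assertions are a development aid rather than a replacement for input validation.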


- The NumPy numerical computing library has built-in functionality for tests, provided through the `numpy.testing` module: https://docs.scipy.org/doc/numpy/reference/routines.testing.html



In [5]:
import numpy as np

In [6]:
x = [1e-5, 1e-3, 1e-1]
y = np.arccos(np.cos(x))
y

array([1.00000004e-05, 1.00000000e-03, 1.00000000e-01])

In [7]:
# Raises an AssertionError if two objects are not equal up to desired tolerance.
np.testing.assert_allclose(x, y, rtol=1e-5, atol=0)
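
The call above returns silently because the arrays agree within the tolerance. As a small sketch of why a tolerance is needed and of what a failure looks like, the snippet below uses made-up values that are not part of the original example.

```python
import numpy as np

# Exact equality is fragile for floating-point results.
print(0.1 + 0.2 == 0.3)  # False

# A relative tolerance accepts the tiny rounding difference and returns silently.
np.testing.assert_allclose(0.1 + 0.2, 0.3, rtol=1e-12)

# When the difference exceeds the tolerance, an AssertionError is raised.
try:
    np.testing.assert_allclose([1.0, 2.0], [1.0, 2.1], rtol=1e-5, atol=0)
except AssertionError:
    print("arrays differ by more than the tolerance")
```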

### Exceptions

- Exceptions are more sophisticated than assertions. They are the standard error messaging system in most modern programming languages. Fundamentally, when an error is encountered, an informative exception is ‘thrown’ or ‘raised’.

For example, instead of the assertion in the case before, an exception can be used.

```python
def mean(num_list):
    assert len(num_list) != 0
    return sum(num_list) / len(num_list)
```


In [8]:
def mean(num_list):
    if len(num_list) == 0:
        raise Exception("The algebraic mean of an empty list is undefined. "
                      "Please provide a list of numbers")
    else:
        return sum(num_list) / len(num_list)

In [9]:
mean(a)

2.0

In [10]:
mean(b)

Exception: The algebraic mean of an empty list is undefined. Please provide a list of numbers

- Once an exception is raised, it is passed upward through the program scope. An exception can be used to trigger additional error messages or an alternative behavior. Rather than immediately halting code execution, the exception can be **caught** upstream with a **`try-except`** block. When wrapped in a try-except block, the exception can be intercepted before it reaches global scope and halts execution.

In [11]:
def mean(num_list):
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError as detail :
        msg = "The algebraic mean of an empty list is undefined. Please provide a list of numbers."
        #raise ZeroDivisionError(f'{detail.__str__()}\n{msg}')
        print(msg)

In [12]:
mean(a)

2.0

In [13]:
mean(b)

The algebraic mean of an empty list is undefined. Please provide a list of numbers.
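
One caveat with the version above: after printing the message it returns `None`, which a caller can easily miss. A possible alternative, sketched here with hypothetical names (`strict_mean`, `mean_or_default`), is to raise in the low-level function and let the caller catch the exception and choose the fallback.

```python
def strict_mean(num_list):
    # Hypothetical variant of mean() that raises instead of printing.
    if len(num_list) == 0:
        raise ValueError("The mean of an empty list is undefined.")
    return sum(num_list) / len(num_list)


def mean_or_default(num_list, default=float("nan")):
    # The caller decides what an empty list should mean for its use case.
    try:
        return strict_mean(num_list)
    except ValueError as err:
        print(f"warning: {err} Falling back to {default}.")
        return default


print(mean_or_default([1, 2, 3]))
print(mean_or_default([], default=0.0))  # prints the warning first
```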


## Unit Tests

Unit tests exercise the functionality of the code by interrogating individual functions and methods. Functions and methods can often be considered the atomic units of software because they are indivisible. 

To illustrate how to write and run unit tests in Python, let's create a suite of tests for our `mean` function. Once these tests are written in a file called `test_statistics.py`, we will use the [`pytest`](https://docs.pytest.org/en/latest/) library to run them all at once and report which tests fail and which succeed.

```bash
Project
.
├── my_cesm_package
│   ├── __init__.py
│   └── statistics.py
└── tests
    └── test_statistics.py
```

In [14]:
!cat my_cesm_package/statistics.py

def mean(num_list):
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError as detail :
        msg = "The algebraic mean of an empty list is undefined. Please provide a list of numbers."
        raise ZeroDivisionError(f'{detail.__str__()}\n{msg}')

In [15]:
!cat tests/test_statistics.py

from my_cesm_package.statistics import mean


def test_ints():
    num_list = [1, 2, 3, 4, 5]
    obs = mean(num_list)
    exp = 3
    assert obs == exp

def test_zero():
    num_list=[0,2,4,6]
    obs = mean(num_list)
    exp = 3
    assert obs == exp

def test_double():
    num_list=[1,2,3,4]
    obs = mean(num_list)
    exp = 2.5
    assert obs == exp

def test_long():
    big = 100000000
    obs = mean(range(1,big))
    exp = big/2.0
    assert obs == exp

def test_complex():
    # given that complex numbers are an unordered field
    # the arithmetic mean of complex numbers is meaningless
    num_list = [2 + 3j, 3 + 4j, -32 - 2j]
    obs = mean(num_list)
    exp = NotImplemented
    assert obs == exp

In [16]:
!pytest -v tests

platform darwin -- Python 3.7.3, pytest-5.1.2, py-1.8.0, pluggy-0.12.0 -- /Users/abanihi/opt/miniconda3/bin/python
cachedir: .pytest_cache
rootdir: /Users/abanihi/devel/ncar/NCAR-pangeo-tutorial/notebooks/bytopic/test-driven-development
collected 5 items

tests/test_statistics.py::test_ints PASSED                               [ 20%]
tests/test_statistics.py::test_zero PASSED                               [ 40%]
tests/test_statistics.py::test_double PASSED                             [ 60%]
tests/test_statistics.py::test_long PASSED                               [ 80%]
tests/test_statistics.py::test_complex FAILED                            [100%]

_________________________________ test_complex _________________________________

    def test_complex():
        # given that complex numbers are an unordered fi

In the above case, pytest **sniffed out** the tests in the directory, ran them together, and reported the outcome of each test function. By default it collects files named `test_*.py` or `*_test.py`.

The major benefit a testing framework provides is exactly that: a utility to find and run the tests automatically. With pytest, this is the command-line tool called `pytest`. When `pytest` is run, it searches the directories below where it was called (or the paths given on the command line), finds the Python files whose names start with `test_` or end with `_test`, imports them, and runs the functions whose names start with `test` as well as the test methods of classes whose names start with `Test`. This automatic registration of test code saves tons of human time and allows us to focus on what is important: writing more tests.
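
The suite above only compares return values. As a sketch of two further pytest helpers, `pytest.raises` and `pytest.approx`, the tests below check the error path for an empty list and compare floating-point results approximately. To keep the snippet self-contained, `mean` is redefined inline as a simplified stand-in; in the project it would be imported from `my_cesm_package.statistics`.

```python
import pytest


def mean(num_list):
    # Simplified stand-in: an empty list raises ZeroDivisionError.
    return sum(num_list) / len(num_list)


def test_empty_list_raises():
    # Passes only if the expected exception is raised inside the block.
    with pytest.raises(ZeroDivisionError):
        mean([])


def test_floats_approx():
    # 0.1 + 0.2 is not exactly 0.3 in floating point, so compare approximately.
    assert mean([0.1, 0.2]) == pytest.approx(0.15)
```

Saved in a file such as `tests/test_statistics_errors.py` (a hypothetical name), these functions would be collected by the same `pytest -v tests` command.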

## Continuous Integration (CI)

### Question: **Does Your Software Work on Your Colleague’s Computer?**


Imagine you developed software on a Linux system such as Cheyenne. Last week, you helped your office colleague build and run it on their macOS computer. You’ve made some changes since then.

1. How can you be sure it will still work if they update their repository when they come back from vacation?
2. How long will that process take?


The typical story is that, well, you don’t know whether it will work on your colleague’s machine until you try rebuilding it there. If you have a build system, it might take a few minutes to update the repository, rebuild the code, and run the tests. If you don’t have a build system, it could take all afternoon just to see if your new changes are compatible.

### Answer: **Let The Computers Do The Work**

- Continuous integration tools validate the integrity of our application by building it and running the test suite on every commit, for example on Linux, macOS, and Windows.

- Based on your instructions, a continuous integration server can:

    - check out new code from a repository
    - spin up instances of supported operating systems (e.g. various versions of macOS, Linux, Windows, etc.)
    - spin up those instances with different software versions (e.g. Python 3.6 and Python 3.7)
    - run the build and test scripts
    - check for errors
    - and report the results.
    
    
- These CI servers exist for automatically running your tests. 
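
The "run, check for errors, report" part of that list can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular CI service is implemented; the function name `run_pipeline` and the placeholder commands are assumptions.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run each (name, command) step in order and stop at the first failure."""
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"{name}: FAILED")
            print(result.stdout + result.stderr)
            return False
        print(f"{name}: ok")
    return True


# Placeholder commands. A real test step might instead be something like
# [sys.executable, "-m", "pytest", "-v", "tests"].
steps = [
    ("build", [sys.executable, "-c", "print('building')"]),
    ("test", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]
success = run_pipeline(steps)
```

A hosted CI service adds what this sketch leaves out: fresh machines for each operating system and Python version, and a trigger on every commit.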
    

**Example CI Workflow: NCAR pop-tools package (https://github.com/NCAR/pop-tools)**
![](./img/ci-workflow.png)
![](./img/ci-job.png)

**Code Coverage Report**

![](./img/coverage.png)

### Continuous Integration Hosting

Depending on your needs, you may consider continuous integration hosting services such as:

- [CircleCI](https://circleci.com/)
- [TravisCI](https://travis-ci.org/)
- [Azure Pipelines](https://azure.microsoft.com/en-us/services/devops/pipelines/)
- [AppVeyor](https://www.appveyor.com/)

All these services are **free** for open source projects (such as a public repository on GitHub). To use them, all you need is an account on one of the services; then follow the instructions on that service's website to connect your account with GitHub.

## Test-Driven Development (TDD): Write a test before writing the code

The TDD design philosophy was most strongly put forth by Kent Beck in his book [*Test-Driven Development: By Example*](https://www.eecs.yorku.ca/course_archive/2003-04/W/3311/sectionM/case_studies/money/KentBeck_TDD_byexample.pdf).

Test-driven development is very simple (easier said than done 😄) :


![TDD](./img/TDD.png)



- **Red:** Write a small unit test case. This test will naturally fail as we haven't written the implementation yet.
- **Green:** Write the code that implements the desired functionality. We just want something simple that will pass the test.
- **Refactor:** Now that the test is passing, look at the code to see whether it can be improved.

- The cycle repeats as we proceed to the next test and implement the next bit of functionality.

The most important takeaway from test-driven development is that the moment you start writing code, you should be considering how to test that code. The tests should be written and presented in tandem with the implementation. **Testing is too important to be an afterthought**.

### You Do You!


<p style="color:blue;"> Software developers who practice strict TDD will tell you that it is the best thing since sliced arrays. However, do what works for you. The choice whether to pursue classic TDD is a personal decision. </p>

### Example


The following example illustrates classic TDD for a standard deviation function, std().

To start, we write a test for computing the standard deviation from a list of numbers as follows:


#### Step 1: Write failing test

In [17]:
def test_std1():
    obs = std([0.0, 2.0])
    exp = 1.0
    assert obs == exp

In [18]:
test_std1()

NameError: name 'std' is not defined

#### Step 2: Make the test pass

Next, we write the minimal version of std() that will cause test_std1() to pass:

In [19]:
def std(vals):
    # Surely this is cheating ...
    return 1.0

In [20]:
test_std1()

As you can see, the minimal version simply returns the expected result for the sole case that we are testing. If we only ever want to take the standard deviation of the numbers 0.0 and 2.0, or 1.0 and 3.0, and so on, then this implementation will work perfectly. If we want to branch out, then we probably need to write more robust code. However, before we can write more code, we first need to add another test or two:

In [21]:
def test_std1():
    obs = std([0.0, 2.0])
    exp = 1.0
    
    assert obs == exp

def test_std2():
    # Test the fiducial case when we pass in an empty list.
    obs = std([])
    exp = 0.0
    assert obs == exp

def test_std3():
    # Test a real case where the answer is not one.
    obs = std([0.0, 4.0])
    exp = 2.0
    assert obs == exp

A simple function implementation that would make these tests pass could be as follows:

In [22]:
def std(vals):
    # a little better
    if len(vals) == 0: # Special case the empty list.
        return 0.0
    return vals[-1] / 2.0 # By being clever, we can get away without doing real work.


In [23]:
test_std1()
test_std2()
test_std3()

Are we done? No. Of course not. Even though the tests all pass, this is clearly still not a generic standard deviation function. To create a better implementation, TDD states that we again need to expand the test suite:



In [24]:
def test_std1():
    obs = std([0.0, 2.0])
    exp = 1.0
    assert obs == exp

def test_std2():
    obs = std([])
    exp = 0.0
    assert obs == exp

def test_std3():
    obs = std([0.0, 4.0])
    exp = 2.0
    assert obs == exp

def test_std4():
    # The first value is not zero.
    obs = std([1.0, 3.0])
    exp = 1.0
    assert obs == exp

def test_std5():
    # Here, we have more than two values, but all of the values are the same.
    obs = std([1.0, 1.0, 1.0])
    exp = 0.0
    assert obs == exp


In [25]:
test_std1()
test_std2()
test_std3()
test_std4()
test_std5()

AssertionError: 

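The run stops at `test_std4()`: the shortcut `vals[-1] / 2.0` returns 1.5 for `[1.0, 3.0]`, not 1.0. At this point it is easier to implement a generic standard deviation function, because we would spend more time trying to come up with clever approximations to the standard deviation than we would spend actually coding it.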
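
A minimal sketch of such a generic implementation follows. It assumes the population standard deviation (dividing by N rather than N - 1), which is what the expected values in the tests imply, and it keeps the convention from `test_std2()` that an empty list yields 0.0.

```python
import math


def std(vals):
    # Convention taken from test_std2(): an empty list has zero spread.
    n = len(vals)
    if n == 0:
        return 0.0
    mu = sum(vals) / n
    # Population variance: mean of the squared deviations from the mean.
    var = sum((x - mu) ** 2 for x in vals) / n
    return math.sqrt(var)
```

With this definition in place, all five tests from `test_std1()` to `test_std5()` pass.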

<p style="color:blue;"> It is important to note that we could improve this function by writing further tests. For example, this std() ignores the situation where infinity is an element of the values list. There is always more that can be tested. TDD prevents you from going overboard by telling you to stop testing when you have achieved all of your use cases.
</p>

## Key Takeaways

- Test-driven development is a common software development technique.
- By writing the tests first, the function requirements are very explicit.
- TDD requires vigilance for success.

## Going Further

- [An online workshop session on testing and continuous integration](https://katyhuff.github.io/python-testing/)
- [Getting Started with Testing in Python Tutorial](https://realpython.com/python-testing/#automating-the-execution-of-your-tests)
- [Pytest Documentation](https://docs.pytest.org/en/latest/contents.html)
- [Kent Beck's book: *Test-Driven Development: By Example*](https://www.eecs.yorku.ca/course_archive/2003-04/W/3311/sectionM/case_studies/money/KentBeck_TDD_byexample.pdf).