# Debugging and defensive programming

**Note**: This lesson draws heavily from, and in some parts quotes directly,
the [Python Testing](http://katyhuff.github.io/python-testing/) lesson
developed by Kathryn Huff. 

Untested code is broken code. Doing science with untested code is akin to using
an experimental device that is uncalibrated, which is generally a bad idea.
The best way to write code that works and keeps on working is to assume it's
broken, and to build yourself some alarms for when its behavior is outside of
what is expected. 

This mindset is often called **defensive programming**.

In this lesson, we will learn more about the types of bugs code can often have, in particular code that:

1. fails loudly with Python `Exception`s.
2. fails silently by producing incorrect results.

We'll then learn about how to avoid these issues more routinely by using:

1. [Assertions](http://katyhuff.github.io/python-testing/02-assertions.html)
2. [Exceptions](http://katyhuff.github.io/python-testing/03-exceptions.html)
3. [Unit Tests](http://katyhuff.github.io/python-testing/04-units.html)

and we will write some of our own, too.

#### Additional resources

* [Python Testing](http://katyhuff.github.io/python-testing/) by Kathryn Huff
* _[Effective Computation in Physics, Chapter 18](http://physics.codes/)_, A. Scopatz and K. Huff. O'Reilly Media. (2015)


## Loud bugs: Python `Exception`s

The best bugs are loud ones: they announce themselves, usually by raising an [`Exception`](https://docs.python.org/3/tutorial/errors.html). A raised `Exception` signals that Python hit a problem it could not handle while reading or running the code. We'll explore some examples below.

In [1]:
fro num in range(10):
    print(num)

SyntaxError: invalid syntax (<ipython-input-1-f8e2cbf169ec>, line 1)

A `SyntaxError` is a type of exception indicating that there is a problem with our use of language itself; it can amount to misspelling a Python keyword or bad grammar that the interpreter just can't understand.

In [2]:
for num in range(10):
    x = num**2
     print(x)
print("Done") 

IndentationError: unexpected indent (<ipython-input-2-1facdd64a496>, line 3)

Similar to a `SyntaxError`, an `IndentationError` corresponds to a problem with indentation, which Python is sensitive to (indentation is used to indicate scope in loops, if statements, function definitions, etc.).

In [3]:
x = float(input("Enter non-zero number --> "))
if x = 0:
   print("ERROR: number cannot be 0") 

SyntaxError: invalid syntax (<ipython-input-3-1250168fb68b>, line 2)

Sometimes `SyntaxError`s aren't so obvious. Here `x = 0`, which normally assigns the integer `0` to the name `x`, is used inside an `if` statement as a condition. Python does not allow assignment with `=` in that position (comparison is written `==`), so it amounts to wrong syntax.
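A minimal sketch of the corrected check, using `==` for the comparison. The fixed value of `x` stands in for the user's input (an assumption made so the snippet runs without a prompt), and the `message` variable is introduced here only for illustration.

```python
x = 0.0  # hypothetical value standing in for float(input(...))

# `==` compares two values; `=` assigns and is not valid as an `if` condition.
if x == 0:
    message = "ERROR: number cannot be 0"
else:
    message = "OK"

print(message)
```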

In [4]:
def sinc(x):
    return sin(x)/x

print(sinc(3.145))

NameError: name 'sin' is not defined

A `NameError` indicates that we are using a name, in this case `sin`, that hasn't been defined (it doesn't exist in the namespace, and so it doesn't point to any object).

In [5]:
seasons = ['Spring', 'Summer', 'Fall', 'Winter']
print('My favorite season is ', seasons[4])

IndexError: list index out of range

An `IndexError` commonly shows up when we try to index a sequence, such as a `list`, with an index that doesn't resolve to any value in the sequence. Even though this `list` has four elements, because Python uses 0-based indexing by convention, the largest index we could use is `3`.
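As a small sketch of the indexing rule described above, the largest valid index can be computed from the length of the list, and a negative index counts from the end. The name `last_index` is ours, not part of the original example.

```python
seasons = ['Spring', 'Summer', 'Fall', 'Winter']

# Valid indices run from 0 to len(seasons) - 1.
last_index = len(seasons) - 1
print('Largest valid index:', last_index)
print('Last season:', seasons[last_index])

# A negative index counts backwards from the end of the list.
print('Also the last season:', seasons[-1])
```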

## Silent (but deadly) bugs: code that executes without an `Exception` but produces unintended results

The best kind of bugs are loud ones. Bugs that raise a useful `Exception` allow us to:
1. quickly recognize something is wrong with the code.
2. quickly identify the source of the issue.
3. quickly (hopefully) fix the problem.

But many bugs aren't loud; instead, they are silent, executing without raising an `Exception` but producing a result that is either wrong or undesirable. We'll look at some examples of these next.

Say we want to create a list of values  $-10, -9.8, -9.6, …, -0.2, 0, 0.2, …, 10$. If we do:

In [6]:
h = 0.2
x = [-10 + i*h for i in range(100)]

We get:

In [7]:
x

[-10.0,
 -9.8,
 -9.6,
 -9.4,
 -9.2,
 -9.0,
 -8.8,
 -8.6,
 -8.4,
 -8.2,
 -8.0,
 -7.8,
 -7.6,
 -7.4,
 -7.199999999999999,
 -7.0,
 -6.8,
 -6.6,
 -6.4,
 -6.199999999999999,
 -6.0,
 -5.8,
 -5.6,
 -5.3999999999999995,
 -5.199999999999999,
 -5.0,
 -4.8,
 -4.6,
 -4.3999999999999995,
 -4.199999999999999,
 -4.0,
 -3.8,
 -3.5999999999999996,
 -3.3999999999999995,
 -3.1999999999999993,
 -3.0,
 -2.8,
 -2.5999999999999996,
 -2.3999999999999995,
 -2.1999999999999993,
 -2.0,
 -1.799999999999999,
 -1.5999999999999996,
 -1.4000000000000004,
 -1.1999999999999993,
 -1.0,
 -0.7999999999999989,
 -0.5999999999999996,
 -0.3999999999999986,
 -0.1999999999999993,
 0.0,
 0.20000000000000107,
 0.40000000000000036,
 0.6000000000000014,
 0.8000000000000007,
 1.0,
 1.200000000000001,
 1.4000000000000004,
 1.6000000000000014,
 1.8000000000000007,
 2.0,
 2.200000000000001,
 2.4000000000000004,
 2.6000000000000014,
 2.8000000000000007,
 3.0,
 3.200000000000001,
 3.4000000000000004,
 3.6000000000000014,
 3.8000000000000

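Notice that this is missing the very last value we wanted: `range(100)` yields `i = 0, …, 99`, so the list ends at about $9.8$, whereas covering $-10$ to $10$ in steps of $0.2$ takes 101 points. If we had written this code block into a much larger block of code, chances are very good we'd not notice this until much later, if at all.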
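One way to guard against this kind of off-by-one slip is to compute the number of points from the end points and the step, then check the result. This is only a sketch: the names `start`, `stop`, and `n_points` are ours, and it uses the `assert` statement, which is covered later in this lesson.

```python
import math

start, stop, h = -10.0, 10.0, 0.2

# Number of points = number of intervals + 1, so both end points are included.
# round() protects against floating-point error in the division.
n_points = round((stop - start) / h) + 1
x = [start + i * h for i in range(n_points)]

# Cheap alarms: check the length and both end points (within a tolerance).
assert len(x) == 101, "expected 101 grid points"
assert math.isclose(x[0], start)
assert math.isclose(x[-1], stop)
```

If the step or the limits change later, the checks fail loudly instead of letting a short list slip through.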

Let's try defining the `sinc` function again, this time using a working implementation of `sin` inside:

In [8]:
import math
def sinc(x):
    return math.sin(x)/x

Does it work?

In [27]:
sinc(math.pi)

3.8981718325193755e-17

In [15]:
sinc(2 * math.pi)

-3.8981718325193755e-17

Looks like it doesn't throw any errors. Let's try plotting this to see if it looks like what we'd expect. We can generate x values to plot against as we did above, this time using `range(101)` so that the end point $10$ is included:

In [9]:
h = 0.2
x = [-10 + i*h for i in range(101)]

Now we'll plot with `matplotlib`:

In [12]:
import matplotlib.pyplot as plt
%matplotlib inline

In [16]:
plt.plot(x, [sinc(i) for i in x])

ZeroDivisionError: float division by zero

Hmmm...so somewhere we divide by zero. Does this function not work for $x = 0$?

In [17]:
sinc(0)

ZeroDivisionError: float division by zero

Apparently not.

-----------------

### Challenge: fix the `sinc` function so that it works for $x = 0$

-----------------

Let's look at a couple more examples of silent errors.

----------------
### Challenge: find (and fix) the issue in this code block

Can you find the issue with the following block of code?

In [18]:
squares = []
s = 0
for n in range(1, 10):
    squares.append(s)
    s = n*n
    print(n, s)

sum_s = sum(squares)
print("sum of squares", sum_s)

1 1
2 4
3 9
4 16
5 25
6 36
7 49
8 64
9 81
sum of squares 204


---------------------

----------------
### Challenge: is there an issue with this calculation of a free-fall trajectory?

This block of code is supposed to calculate the position of an object in free fall as a function of time and store the time points (in 1-second intervals) and positions in two lists for plotting later. Is there something wrong with it?

In [None]:
g = -9.81
t, h, tmax = 1., 0., 10.

times, positions = [], []
while t <= tmax:
    x = 0.5 * g * t * t
    times.append(t)
    positions.append(x)
    t += h

print(times)
print(values)

If you identify a problem, fix it.

----------------------

## Defensive programming

Now that we've covered a few different types of common bugs that can show up in code we write, we're going to learn strategies for minimizing the frequency with which we write buggy code. These will help us to write code that works and keeps on working as we make changes to it.

### Using assertions

Learning objectives:
* Assertions are one-line tests embedded in code.
* Assertions can halt execution if something unexpected happens.
* Assertions are the building blocks of tests.

The ``assert`` Python keyword tests the truth value of what follows, and if what follows evaluates to ``False``, then it raises an ``AssertionError`` (a type of ``Exception``, which we'll get to):

In [1]:
assert True == True

In [2]:
assert True == False, "True is not False"

AssertionError: True is not False

We can follow an assertion with a comma and a string; that string becomes the error message shown when the assertion fails. We'll see how this is useful below.

#### Assertions as input enforcement

A common use of assertions is to check that the inputs of a function 
meet the expectations of that function; that is, that they are valid
given the assumptions that function needs to make about them. If we
write a simple function that gets the mean of a list of values:

In [3]:
def mean(num_list):
    return sum(num_list)/len(num_list)

In [4]:
mean([4, 2, 3])

3.0

Feeding it an empty list gives:

In [5]:
mean([])

ZeroDivisionError: division by zero

Let's add some assertions. We'll also add in a docstring to make clear
what we want this function to take, and what we want it to return.

In [6]:
def mean(num_list):
    """Return the mean of a list of numbers.
    
    Parameters
    ----------
    num_list : list
        List of values to get arithmetic mean for.
        
    Returns
    -------
    float
        Arithmetic mean.
    """
    assert len(num_list) > 0, "Cannot take an empty list"
    assert all([isinstance(i, (float, int)) for i in num_list]), "List must only have numbers"
    
    return sum(num_list)/len(num_list)

Now we have two assertions that check:
1. the list given is not empty.
2. all elements in the list are either floating point numbers or integers.

So now when we do:

In [7]:
mean([])

AssertionError: Cannot take an empty list

we get back an error message that's more meaningful to us than "cannot divide by zero". We can change how we're using the function appropriately, and didn't have to dig into the implementation to understand what went wrong.

Also, if we give it:

In [8]:
mean([42, "a word"])

AssertionError: List must only have numbers

we see that our second assertion catches cases where the list has non-numbers, and complains clearly why it fails.

### Using exceptions as flexible, catchable assertions

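Assertions are useful for input-checking, but in production code 
it's generally better to explicitly use **exceptions**: assertions are
skipped entirely when Python is run with the ``-O`` flag, so they cannot
be relied on to validate input.
An ``AssertionError`` is one type of exception, but we can use
others to greater effect.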

Let's change our ``mean`` function to raise a ``ValueError`` instead
of an ``AssertionError`` when we give an empty list:

In [9]:
def mean(num_list):
    if len(num_list) == 0:
        raise ValueError("The arithmetic mean of no elements makes no sense")
    return sum(num_list)/len(num_list)

In [10]:
mean([1, 2, 3])

2.0

In [11]:
mean([])

ValueError: The arithmetic mean of no elements makes no sense

Raising a ``ValueError`` more [clearly defines the type of error
indicated](https://docs.python.org/3/library/exceptions.html#ValueError): we gave a list, but it was empty. We'll see how using different types of exceptions allows us to write more flexible code below.

#### Catching exceptions

Let's rewrite our ``mean`` function yet again, only this time we'll put
the meat of the function--the actual calculation--inside a ``try-except`` block. 

In [12]:
def mean(num_list):
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError:
        return 0

Instead of raising an exception when ``mean`` is given an empty list, we
can catch the ``ZeroDivisionError`` raised by the calculation and simply
return ``0``, which sounds sensible. It's up to us how our function
behaves, but choosing sensible behavior is a good idea.

In [13]:
mean([])

0

Can we do something similar for the case where the list has non-number
elements? Yes.

In [14]:
def mean(num_list):
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError:
        return 0
    except TypeError:
        raise TypeError("Cannot get mean for non-number elements")

In [15]:
mean([1, "nothing"])

TypeError: Cannot get mean for non-number elements

In this case we caught the ``TypeError`` that results from getting the
sum of a list with non-number elements, then we raised another
``TypeError`` (since this is a good choice of exception for this
issue) with a more descriptive message that tells us what is wrong.
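Because ``mean`` now raises a specific exception type, code that calls it can choose which failures to handle. The sketch below repeats the function so that it runs on its own; the ``datasets`` dictionary and the ``results`` and ``skipped`` names are made up for illustration.

```python
def mean(num_list):
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError:
        return 0
    except TypeError:
        raise TypeError("Cannot get mean for non-number elements")

# Hypothetical input: three data series, one of them containing text.
datasets = {"run_a": [1, 2, 3], "run_b": [], "run_c": [4, "n/a"]}

results, skipped = {}, []
for name, values in datasets.items():
    try:
        results[name] = mean(values)
    except TypeError as err:
        # Only the "non-number" failure is handled here; any other
        # exception type would still propagate and stop the loop.
        skipped.append(name)
        print("skipping", name, "->", err)

print(results)
```

Here the loop keeps going past the bad series and records which one was skipped, rather than stopping at the first problem.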

### Defining expected behavior with unit tests

Assertions and exceptions give mechanisms for checking that functions
are working as expected at runtime, with the inputs given to them
at runtime. But these don't tell us how the function will behave
for inputs that it *might* get elsewhere at other times. How can we
ensure that our function behaves as we expect for different assortments
of input?

We can write **unit tests**.

Let's place the last version of our ``mean`` function into a module
called ``mean.py``; open your favorite text editor and make your ``mean.py`` look like this:

In [28]:
%cat mean.py

def mean(num_list):
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError:
        return 0
    except TypeError:
        raise TypeError("Cannot get mean for non-number elements")


We can import this mean function directly, and use it as before:

In [29]:
from mean import mean

In [30]:
mean([5])

5.0

In [31]:
mean([])

0

Now make a file in the same directory called ``test_mean.py`` in your
favorite text editor, and put a single function called ``test_ints``
inside. Don't forget to import your ``mean`` function at the top
of this new module:

In [33]:
%cat test_mean.py

from mean import mean

def test_ints():
    num_list = [1, 2, 3, 4, 5]
    obs = mean(num_list)

    assert obs == 3


Congratulations! You've written your first unit test. This simple test
function takes the mean of a known list of numbers, and then at the
end asserts that the result is what we know should be the answer. We
can run this using ``py.test`` in the shell from the same directory:

In [36]:
%%bash
py.test

platform linux -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
rootdir: /home/alter/Library/becksteinlab/ComputationalPhysics494/PHY494-resources/14_testing, inifile: 
collected 1 items

test_mean.py .



[py.test](http://pytest.org/) is one widely-used and 
actively-developed testing framework. It can do way more than we are
going to use here, but it makes complex sets of tests much easier to
build than otherwise.

If we add more tests to our test suite, such that we now have:

In [37]:
%cat test_mean.py

from mean import mean

import pytest

def test_ints():
    num_list = [1, 2, 3, 4, 5]
    obs = mean(num_list)

    assert obs == 3

def test_not_numbers():
    values = [2, "lolcats"]
    with pytest.raises(TypeError):
        out = mean(values)

def test_zero():
    num_list = [0, 2, 4, 6]
    assert mean(num_list) == 3

def test_empty():
    assert mean([]) == 0

def test_single_int():
    with pytest.raises(TypeError):
        mean(1)



Note the special **context manager** ``pytest.raises``, used to assert that the code inside the ``with`` block raises a particular exception. This is useful for making sure our function gives the expected response for
gnarly inputs.

We see that ``py.test`` finds these as well:

In [38]:
%%bash
py.test

platform linux -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
rootdir: /home/alter/Library/becksteinlab/ComputationalPhysics494/PHY494-resources/14_testing, inifile: 
collected 5 items

test_mean.py .....



For each test that passes, a `.` is printed. If a test failed, we'd get an `F` in that place instead, and ``py.test`` would tell us where the
failure occurred. Let's change our function so that it returns ``None``
instead of ``0`` for an empty list to see if this affects our tests:

In [39]:
%cat mean.py

def mean(num_list):
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError:
        return None
    except TypeError:
        raise TypeError("Cannot get mean for non-number elements")


In [40]:
%%bash
py.test

platform linux -- Python 3.5.1, pytest-2.9.1, py-1.4.31, pluggy-0.3.1
rootdir: /home/alter/Library/becksteinlab/ComputationalPhysics494/PHY494-resources/14_testing, inifile: 
collected 5 items

test_mean.py ...F.

__________________________________ test_empty __________________________________

    def test_empty():
>       assert mean([]) == 0
E       assert None == 0
E        +  where None = mean([])

test_mean.py:21: AssertionError


This small change was caught by one of our tests, and we see exactly
where. This is where unit tests become immensely useful: 
if ``mean`` was part of a larger codebase and we decided to make a tiny
change to it, we see immediately that this change affects the behavior
expected of it by our tests. We can then decide if we **want** the new
behavior (so we'd change the tests) or if the new behavior was a
mistake (so we'd fix the function). 
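The tests so far compare results with ``==``, which is fine for these small integer-valued means. For results that are not exactly representable in floating point (recall the ``-7.199999999999999`` in the list of grid points earlier), a comparison within a tolerance is safer. A minimal sketch, assuming the current version of ``mean`` (repeated so the snippet runs on its own) and a hypothetical test named ``test_floats``, using ``math.isclose`` from the standard library:

```python
import math

def mean(num_list):
    # Same implementation as the current mean.py, repeated for self-containment.
    try:
        return sum(num_list)/len(num_list)
    except ZeroDivisionError:
        return None
    except TypeError:
        raise TypeError("Cannot get mean for non-number elements")

def test_floats():
    # Hypothetical test: compare within a tolerance rather than with ==,
    # since 0.1 + 0.2 + 0.3 is not computed exactly in floating point.
    obs = mean([0.1, 0.2, 0.3])
    assert math.isclose(obs, 0.2, rel_tol=1e-9)
```

In a real test module the function definition would be replaced by ``from mean import mean``, as in ``test_mean.py`` above.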

Without tests, it is very hard to ensure a large amount of code
continues behaving as we expect while we (and maybe others) keep
working at it. The more tests we have, the more well-defined the
expected behavior of the codebase, and the less time we will spend
scratching our heads later.