# What is a bug?

A bug is a piece of code that produces an error or an incorrect result.

In [None]:
# Syntax error
x = 1; y = 2
b = x == y # Boolean variable that is true when x & y have the same value
b = 1 = 2  # Syntax error

In [None]:
# Exception - invalid operation
5/0  # Division by zero

In [None]:
# Exception - incompatible types
'44'/11  # Incompatible types for the operation
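Each of the two exceptions above has a specific type, which becomes visible when the exception is caught. A minimal sketch follows; the helper name `describe_error` is hypothetical and exists only for this illustration:

```python
def describe_error(operation):
    """Run a zero-argument callable and return the name of the exception it raises, if any."""
    try:
        operation()
    except Exception as err:  # broad catch, for demonstration only
        return type(err).__name__
    return None

print(describe_error(lambda: 5 / 0))      # ZeroDivisionError
print(describe_error(lambda: '44' / 11))  # TypeError
```

Knowing the exception type is often the first clue when isolating where a program goes wrong.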

In [None]:
# Incorrect logic
import math
x = 55
math.sin(x)**2 + math.cos(x) == 1  # Should be math.cos(x)**2
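With the missing square restored, the identity holds mathematically, but floating-point rounding means an exact `==` comparison can still fail for some inputs. A sketch of a more robust check, using `math.isclose` from the standard library:

```python
import math

x = 55
lhs = math.sin(x)**2 + math.cos(x)**2  # corrected: cos(x) is now squared
print(math.isclose(lhs, 1.0))  # compares within a small relative tolerance
```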

# Debugging a Program

Debugging is the process of finding and resolving the cause of an error.

# How Do We Find Bugs?

# Basics

Debugging has the following steps:

1. Detection of invalid results
2. Isolation of where the program causes the error
3. Resolution of how to change the code to eliminate the error

# Debugging

There are three main methods commonly used for debugging Python code.  In order of increasing sophistication, they are:

1. Inserting ``print`` statements
2. Injecting an IPython interpreter
3. Using a line-by-line debugger like ``pdb``

### The easiest method: print statements

Say we're trying to compute the **entropy** of a set of probabilities.  The
form of the equation is
$$
H = -\sum_i p_i \log(p_i)
$$
We can write the function like this:

In [2]:
import numpy as np
def entropy(ps):
    ps = np.asarray(ps)  # convert ps to array if necessary
    items = ps * np.log(ps)
    return -np.sum(items)

Say these are our probabilities:

In [3]:
ps = np.arange(5.)
ps /= ps.sum()
ps

array([ 0. ,  0.1,  0.2,  0.3,  0.4])

In [4]:
entropy(ps)



nan

We get ``nan``, which stands for "Not a Number".  What's going on here?

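Often the first thing to try is to simply print intermediate values and see what's going on.
Here we go one small step further: the function below drops into the debugger only when the first item is ``nan``, so that we can print ``items`` interactively: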

In [13]:
def entropy(ps):
    ps = np.asarray(ps)  # convert ps to array if necessary
    items = ps * np.log(ps)
    if np.isnan(items[0]):
        import pdb; pdb.set_trace()
    return -np.sum(items)

In [16]:
ps = [0, .1, .1, .3]
entropy(ps)



> <ipython-input-13-5aecfe1583b3>(6)entropy()
-> return -np.sum(items)
(Pdb) items
array([        nan, -0.23025851, -0.23025851, -0.36119184])
(Pdb) exit()


BdbQuit: 

In [None]:
np.isnan(np.nan)

By printing some of the intermediate items, we see the problem: ``0 * np.log(0)`` results in a ``nan``. Although mathematically $\lim_{x \to 0} [x \log(x)] = 0$, the computation is performed numerically: ``np.log(0)`` evaluates to ``-inf``, and ``0 * -inf`` is ``nan`` in floating-point arithmetic.

Often, inserting a few print statements can be enough to figure out what's going on.
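A print-based version of the function might look like the following sketch. The name `entropy_verbose` is hypothetical, chosen so that it does not overwrite `entropy`. The `np.errstate` context manager is there only to silence NumPy's warnings about `log(0)`, which keeps the printed values easy to read.

```python
import numpy as np

def entropy_verbose(ps):
    ps = np.asarray(ps, dtype=float)
    print("ps    =", ps)
    with np.errstate(divide="ignore", invalid="ignore"):
        items = ps * np.log(ps)
    print("items =", items)  # a nan here points at the offending entry
    return -np.sum(items)

result = entropy_verbose([0, .1, .1, .3])
print("result =", result)
```

The first entry of `items` is the only `nan`, which isolates the problem to the zero probability.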


### Using a Debugger

Python comes with a built-in debugger called [pdb](http://docs.python.org/2/library/pdb.html).  It allows you to step line-by-line through a computation and examine what's happening at each step.  Note that this should probably be your last resort in tracking down a bug.  I've probably used it a dozen times or so in five years of coding.  But it can be a useful tool to have in your toolbelt.

You can use the debugger by inserting the line
``` python
import pdb; pdb.set_trace()
```
within your script. Let's try this out:

In [None]:
def entropy(p):
    import pdb; pdb.set_trace()
    p = np.asarray(p)  # convert p to array if necessary
    items = p * np.log(p)
    return -np.sum(items)

entropy(ps)

This can be a more convenient way to debug programs and step through the actual execution.
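On Python 3.7 and later, the built-in `breakpoint()` function does the same job as `import pdb; pdb.set_trace()` by default. The sketch below uses a hypothetical function name, `entropy_debug`, and only defines the function. Calling it opens the debugger, where commands such as `n` (run the next line), `p items` (print a variable), and `c` (continue) are available.

```python
import numpy as np

def entropy_debug(ps):
    ps = np.asarray(ps)
    breakpoint()  # pauses here and opens pdb unless PYTHONBREAKPOINT=0 is set
    items = ps * np.log(ps)
    return -np.sum(items)

# entropy_debug([0.5, 0.5])  # uncomment to step through interactively
```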

# Unit Tests

Python supports systematic testing of code through the ``unittest`` module in the standard library.

Steps:

1. Put the code in a separate file with separate functions for simple features.
2. Create a unittest template.
3. Create separate test functions within the template that call the functions to be tested and check the results using assertions.
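The steps above can be sketched as follows. For brevity, the function under test is defined in the same block rather than in a separate file. `safe_entropy` is a hypothetical variant of `entropy` that treats $0 \log 0$ as 0, which is one possible resolution of the bug found earlier, not necessarily the intended one.

```python
import unittest
import numpy as np

def safe_entropy(ps):
    """Entropy that skips zero probabilities (an assumed fix for illustration)."""
    ps = np.asarray(ps, dtype=float)
    nonzero = ps[ps > 0]
    return -np.sum(nonzero * np.log(nonzero))

class TestSafeEntropy(unittest.TestCase):
    def test_uniform(self):
        # Two equally likely outcomes have entropy log(2).
        self.assertAlmostEqual(safe_entropy([0.5, 0.5]), np.log(2))

    def test_zero_probability(self):
        # A zero entry should contribute nothing rather than produce nan.
        self.assertAlmostEqual(safe_entropy([0, 0.5, 0.5]), np.log(2))

    def test_certain_outcome(self):
        # A single certain outcome carries no uncertainty.
        self.assertAlmostEqual(safe_entropy([1.0]), 0.0)

if __name__ == "__main__":
    # argv and exit=False keep this usable inside a notebook as well as a script
    unittest.main(argv=["ignored"], exit=False)
```

Each test method checks one simple property, so a failure points directly at the behavior that broke.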