# Debugging

It is normal for programs not to work the first time they run, and even when they do run they may still not do exactly what was intended. Fixing these sorts of errors is called debugging, and your ability to debug will improve as you write more programs. Despite the importance of debugging, many courses don't provide specific guidance on how to go about it, which is surprising given that it can often be difficult to determine the exact cause of a Python bug from the error output produced. It's not possible to cover every aspect of debugging here, but there are suggestions at the end of this notebook if you want to continue your learning.

Debugging will be easier if you write good comments throughout your code, if you structure your code so that it is modular (i.e. made up of small parts that can be tested independently), and if you design and implement tests for your program.

## Common errors

The most common errors are basic syntax errors, such as a missing bracket or colon. Here are some examples to illustrate the Python output you get from such errors. If we close too many brackets, it can be relatively easy to spot (and the notebook highlights it for us):

In [None]:
x=(1+2)*(3-5))

But it can be a bit harder if we open too many brackets, since Python doesn't know where we intended to close the bracket:

In [None]:
x=((1+2)*(3-5)
y=2+5

Python helps us to find the missing colon below. There is another clue when typing this in the notebook: the editor does not automatically indent the new line under the `for` statement:

In [None]:
for i in range(5)
    print(i)

In these examples it is fairly easy to spot the mistake, and the output from Python helps us to find it. But in longer pieces of code, it may not be so easy and the bug may be earlier in the code than where Python suggests it is.

Another common problem is incorrectly indenting code:

In [None]:
for i in range(24):
    if i>20:    
    print(i)

Python will produce an error if you divide by zero. This can sometimes happen by accident. For example, the following piece of code computes the mean of a list of numbers, and it will work fine provided the list of numbers is not empty:

In [None]:
def mean(numbers):
    total = 0
    count = 0
    for num in numbers:
        total += num
        count += 1
    average = total / count
    return average

print(mean([1,2,3,4,5]))

But if the list is being generated by some other code, then it may be possible to run our `mean` function on an empty list, which generates an error:

In [None]:
mean([])

In the code below there are three mistakes where we haven't defined the names used:

In [None]:
log_area = log( pi * r**2)

How could you fix these?
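One possible fix is sketched below. It assumes that `log` and `pi` were meant to come from the standard `math` module (so `log` is the natural logarithm) and that `r` is a radius we have to supply ourselves. The value `2.0` is purely illustrative.

```python
import math

# r was never defined; 2.0 is an arbitrary illustrative radius
r = 2.0

# log and pi are taken from the math module (natural logarithm assumed)
log_area = math.log(math.pi * r**2)
print(log_area)
```

Each of the three undefined names would have produced its own `NameError`, one at a time, so it can take several runs to find them all.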

It's easy to incorrectly index a list/tuple:

In [None]:
x=[1,2,3]
x[3]

And it's easy to use the wrong brackets:

In [None]:
x=(1,2,3)
x(3)
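For comparison, here is a short sketch of valid indexing for the two cases above. The variable names `last`, `also_last`, `t` and `first` are only for illustration.

```python
x = [1, 2, 3]

# Indices start at 0, so the last valid index is len(x) - 1
last = x[len(x) - 1]

# A negative index counts back from the end of the sequence
also_last = x[-1]

# Tuples are indexed with square brackets too;
# round brackets after a name are treated as a function call
t = (1, 2, 3)
first = t[0]

print(last, also_last, first)
```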

## Fixing code

Even if your code runs, you still need to test that it does what you intended it to do. Here's an example where we use some typical ways to determine what is going wrong. The following code was intended to produce an empty list that would trigger an error when the mean function was called:

In [None]:
numbers=[2,4,100,34,25,90]
for n in numbers:
    if n % 2==0 or n % 5==0:
        numbers.remove(n)
print(mean(numbers))

However, it runs and no error is triggered. First note that the `numbers` list is not empty:

In [None]:
numbers

The code loops through the elements in `numbers`, looks to see if each number is divisible by 2 or 5, then removes those numbers that are, which in this case would be all the elements in `numbers`. So why have 4, 34 and 90 been left in the list? First we might check that the `if` statement is evaluating as we expect it to:

In [None]:
4%2==0 or 4%5==0

Since this returns `True` it tells us that 4 should definitely be removed from the list. So why isn't the number 4 being removed from the list? It's often helpful to insert `print` commands to see what values the variables are taking. Below we add print statements to see what numbers are being tested:

In [None]:
numbers=[2,4,100,34,25,90]
for n in numbers:
    print(f'n = {n}')
    print('Logical test result: ',n % 2==0 or n % 5==0)
    if n % 2==0 or n % 5==0:
        numbers.remove(n)
print(mean(numbers))

This shows us what has gone wrong, though it might not be obvious at first. Looking at what has been printed, we see that the numbers that are left in the list are **not** assigned to `n` in the `for` loop. The `remove` method has shifted the positions of the remaining elements while the loop is still stepping through the list by position. After processing element 0, the loop moves on to element 1, but it skips the number 4 because this has been shifted to element 0 after the number 2 was removed. One take-away from this is that it can be dangerous to loop over lists that change during the loop. (Clearly Python does allow you to do this, and it can be less of an issue when things are being appended to the list, but if you can avoid doing this sort of thing, you're less likely to introduce unforeseen bugs.)
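Two common ways to avoid this problem are sketched below: build a new list containing only the elements to keep, or loop over a copy of the list while removing from the original. The names `kept` and `remaining` are illustrative and do not appear elsewhere in this notebook.

```python
numbers = [2, 4, 100, 34, 25, 90]

# Option 1: build a new list of the elements to keep,
# rather than removing elements while looping
kept = [n for n in numbers if not (n % 2 == 0 or n % 5 == 0)]
print(kept)

# Option 2: loop over a copy (numbers[:]) so that removals
# from the original list do not disturb the iteration
remaining = [2, 4, 100, 34, 25, 90]
for n in remaining[:]:
    if n % 2 == 0 or n % 5 == 0:
        remaining.remove(n)
print(remaining)
```

Every element is divisible by 2 or 5, so both lists end up empty, which is what the original code was intended to produce.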

When analysing the code above, we did a couple of common things. First, we tested specific examples to ensure that we understood the logic of our program. Second, we included print statements to check the values of variables while the code ran. Both of these sorts of tests can be much harder to do the longer your code is, which is why it is important to write modular code made up of small pieces that can be tested easily. If you can't easily test your code in the way we did above, it's worth considering whether you could make it more modular.

## Debugging packages

We debugged the last example by testing individual lines in the code. There are various packages available to support this sort of debugging, for example the standard-library modules `bdb` and `pdb`, which allow you to create **break points**, i.e. points in the code that, if reached, cause the code to stop, allowing you to inspect the values of the variables at that point. Other useful functionality includes the ability to create **conditional break points**, which only break when a given condition is met, and to continue on from one break point to the next, for example by stepping through loops or functions. This can sometimes be a bit cumbersome with the packages above, where it is necessary to add lines of additional code. It can be handled in a much more elegant way by the IDE that you are using. The classic Jupyter Notebook interface does not have this functionality, but JupyterLab (which also comes with Anaconda) does. Other more sophisticated IDEs, such as PyCharm, also allow you to debug more effectively.
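As a minimal sketch of a hand-written conditional break point, the code below uses the built-in `breakpoint()` function (available since Python 3.7), which opens `pdb` by default. The `DEBUG` flag and the `visited` list are hypothetical names introduced for this example. With `DEBUG = False` the code runs straight through without pausing.

```python
# Hypothetical flag: set to True to drop into the debugger interactively
DEBUG = False

numbers = [2, 4, 100, 34, 25, 90]
visited = []
for n in numbers:
    # A hand-written conditional break point: pause only when n is 25
    if DEBUG and n == 25:
        breakpoint()  # opens the pdb prompt at this line by default
    visited.append(n)

print(visited)
```

If you set `DEBUG = True`, execution pauses at the `pdb` prompt when `n` is 25. There, `p n` prints the value of `n`, `n` runs the next line, `c` continues to the next break point, and `q` quits the debugger.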

### Extension point
Experiment with the debugging packages listed above and try out an IDE that facilitates debugging. While these can be very useful and can speed up debugging, they are not necessary and debugging can be achieved with print statements and modular code.

## Unit testing

When you write a program, it is important to consider how you might test it to ensure that it works as intended. In computer science and software engineering, testing the individual parts of a program in this way may be referred to as *unit testing*. Ideally it should be easy to implement or even automate your tests. What tests should you run? Consider cases where you know what the result will be, or 'edge' cases where parameters take extreme values.
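The sketch below shows what a few simple tests for a mean function might look like. It restates a compact version of `mean` so that the block is self-contained. The test function names and the chosen cases are illustrative assumptions, not a complete test suite.

```python
import math


def mean(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)


def test_known_result():
    # A case where we know the answer in advance
    assert mean([1, 2, 3, 4, 5]) == 3


def test_single_element():
    # Edge case: the smallest non-empty list
    assert mean([7]) == 7


def test_negative_numbers():
    # Positive and negative values should cancel out
    assert mean([-2, 2, -10, 10]) == 0


def test_floats():
    # Compare floats with a tolerance rather than with ==
    assert math.isclose(mean([0.1, 0.2, 0.3]), 0.2)


# Run every test; an AssertionError is raised at the first failure
for test in (test_known_result, test_single_element,
             test_negative_numbers, test_floats):
    test()
    print(f'{test.__name__} passed')
```

Frameworks such as the standard-library `unittest` module or the third-party `pytest` package can automate the running and reporting of tests like these.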

## Error handling

When creating tests, particularly automated ones, you may want to handle errors in a way that does not cause the entire program to crash. Python has very useful error handling functionality, which we will only briefly review here. One useful way to handle errors is using the `try` and `except` keywords.

In [None]:
def test_function(f,x):
    try:
        f(x)
    except ZeroDivisionError:
        print(f'Function {f} caused a Zero Division Error.')

test_function(mean,[1,2,3])
test_function(mean,[])

Above we have defined a function that tests whether calling a function with a given argument results in a `ZeroDivisionError`. Notice how the cell above runs to completion even in the case where the function causes an error.
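For automated tests it can be more convenient to return a value than to print a message. The sketch below is one possible variant. The names `raises_zero_division` and `reciprocal` are hypothetical helpers introduced for this example only.

```python
def reciprocal(x):
    """Illustrative function that fails when x is zero."""
    return 1 / x


def raises_zero_division(f, x):
    """Return True if calling f(x) raises ZeroDivisionError, else False."""
    try:
        f(x)
    except ZeroDivisionError as err:
        # err is the exception object; printing it shows Python's own message
        print(f'{f.__name__} failed: {err}')
        return True
    else:
        # The else branch runs only if no exception was raised
        return False


print(raises_zero_division(reciprocal, 2))
print(raises_zero_division(reciprocal, 0))
```

Only `ZeroDivisionError` is caught here, so any other kind of error would still stop the program, which is usually what you want for errors you did not anticipate.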

Python also has ways to force errors, for example using `raise` and `assert`. The `raise` statement must be used with an exception, such as one of the built-in exceptions, of which we have seen `ZeroDivisionError`, `NameError`, `IndexError` and `TypeError`. You can customise the message that accompanies the error:

In [None]:
raise NameError('Whoops!')

A list of all the built-in exceptions can be found in the [Python documentation](https://docs.python.org/3/library/exceptions.html#bltin-exceptions).
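As a sketch of how `raise` might be used in practice, the function below rejects an empty list with a message that describes the real problem, rather than letting a division by zero occur. The name `checked_mean` and the choice of `ValueError` are assumptions made for this example.

```python
def checked_mean(numbers):
    """Hypothetical variant of a mean function that rejects empty input."""
    if len(numbers) == 0:
        # Raise a built-in exception with a message describing the problem
        raise ValueError('cannot compute the mean of an empty list')
    return sum(numbers) / len(numbers)


print(checked_mean([1, 2, 3, 4, 5]))

# The raised error can be caught like any other exception
try:
    checked_mean([])
except ValueError as err:
    print(f'Caught: {err}')
```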

Another useful way to trigger an error is using the `assert` command, which you can use with a logical statement:

In [None]:
x=3
assert x%2==0, 'x must be even!'
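An `assert` can also be placed inside a function to check an assumption about its input. The sketch below uses a hypothetical helper called `half`, and catches the resulting `AssertionError` so that the message can be seen without stopping the cell.

```python
def half(x):
    """Hypothetical helper that only makes sense for even integers."""
    # The message after the comma is shown if the condition is False
    assert x % 2 == 0, 'x must be even!'
    return x // 2


print(half(4))

try:
    half(3)
except AssertionError as err:
    print(f'Assertion failed: {err}')
```

Assertions are skipped when Python is run with the `-O` option, so they are better suited to internal sanity checks than to validating input that your program must always check.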