[![Py4Life](https://raw.githubusercontent.com/Py4Life/TAU2016/gh-pages/img/Py4Life-logo-small.png)](http://py4life.github.io/TAU2016/)

## Lecture 5 - 30.3.2016
### Last update: 30.3.2016
### Tel-Aviv University / 0411-3122 / Spring 2016

# Previously

- Modules
- Files I/O
- The CSV format
- File parsing
- Regular expressions

# Today

- Bugs
- Debugging
- Tests and assertions
- Exceptions

# Testing & Debugging

> Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. __- B. W. Kernighan and P. J. Plauger, [The Elements of Programming Style](http://www.amazon.com/gp/product/0070342075?ie=UTF8&tag=catv-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=0070342075).__

> Code that cannot be tested is flawed.

> Why do we never have time to do it right, but always have time to do it over?

> Fast, good, cheap: pick any two. - [Project management triangle](http://en.wikipedia.org/wiki/Project_management_triangle)

![bugs](http://assets.nydailynews.com/polopoly_fs/1.1064084!/img/httpImage/image.jpg_gen/derivatives/landscape_635/bugs01-web.jpg)

# Bug categories

## Errors

`SyntaxError`: Illegal Python code. This error will appear when the program is preparing to run.

In [None]:
x = . 5

Often the error is precisely indicated, as above, but sometimes you have to search for the error on the previous line.
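Because a `SyntaxError` is raised while Python parses the code, it can be inspected without running the program. The sketch below is an illustration, not part of the original lecture: it hands a two-line snippet to the built-in `compile` function and reads the reported line number. The `source` string and the `<example>` file name are made up for the demonstration, and `try`-`except` is explained in the "Try and except" section of this lecture.

```python
# Illustrative snippet: the second line contains the illegal expression ". 5"
source = "a = 1\nx = . 5\n"

try:
    compile(source, "<example>", "exec")  # parse only, nothing is executed
except SyntaxError as err:
    error_line = err.lineno  # line on which the parser gave up
    print("SyntaxError on line", error_line, ":", err.msg)
```

The traceback shown by the notebook carries the same line number, which is the first place to look.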

`IndentationError`: a line in the code has bad indentation.

In [None]:
a = 7
 b = 5

This can be tricky at times, because the indentation can look fine while Python still complains. This usually happens when tabs and spaces are mixed: a tab and a run of spaces may look identical in the editor, but Python treats them as different indentation.
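A small sketch of the tabs-versus-spaces problem, assuming a made-up snippet in which one line of a block is indented with a tab and the next with eight spaces. On screen the two lines may look aligned, but Python refuses the code with a `TabError`, which is a kind of `IndentationError`.

```python
# Illustrative snippet: first body line indented with a tab, second with 8 spaces
source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(source, "<example>", "exec")  # parse only, nothing is executed
except IndentationError as err:
    error_name = type(err).__name__  # the specific subclass that was raised
    print(error_name, "-", err.msg)
```

Configuring the editor to insert spaces when the Tab key is pressed is a common way to avoid this.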

The next sample of errors are _runtime_ errors - they only appear when the program is running. 
Therefore, they can be elusive: these bugs don't always appear because they depend on variable values and program flow.

`NameError`: A name (variable, function, module) is not defined.

In [None]:
b = a + 2

__`a` isn't defined since the code in which it was defined earlier crashed__

Look at the _Traceback_ to see where in the program the error occurs. The most common reasons for a `NameError` are

- a misspelled name
- a variable that is not initialized
- a function that you have forgotten to define
- a module that is not imported

Working in the IPython Notebook can introduce such errors when you forget to run a cell and use the variables from that cell in another cell.
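The message of a `NameError` states which name Python could not find, which is usually enough to tell which of the reasons above applies. A minimal sketch, using a hypothetical variable `undefined_count` that is deliberately never assigned:

```python
try:
    total = undefined_count + 1  # 'undefined_count' was never assigned
except NameError as err:
    message = str(err)  # the message contains the missing name
    print("NameError:", message)
```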

`TypeError`: An object of wrong type is used in an operation.

In [None]:
n = 1
x = '2'
product = (1.0/(n+1))*(x/(1.0+x))**(n+1)

Print out objects and their types (here: `print(x, type(x), n, type(n))`), and you will most likely get a surprise. The reason for a `TypeError` is often far away from the line where the `TypeError` occurs.
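One way to bring the error closer to its cause is to check types where values enter a computation. The sketch below wraps the expression from the cell above in a hypothetical helper, `power_term`, that rejects non-numeric input with an explicit message; the helper name and the message wording are assumptions made for illustration.

```python
def power_term(n, x):
    """Hypothetical helper: the expression from above, with an early type check."""
    if not isinstance(x, (int, float)):
        raise TypeError("x must be a number, got %s" % type(x).__name__)
    return (1.0/(n+1))*(x/(1.0+x))**(n+1)

try:
    power_term(1, '2')  # a string slips in, as in the example above
except TypeError as err:
    message = str(err)
    print("TypeError:", message)

value = power_term(1, 2.0)  # the same call with a proper number works
print(value)
```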

In [None]:
print(x, type(x))
print(n, type(n))

`ValueError`: An object has an illegal value.

In [None]:
import math
a = 5
b = 7*a
c = (a + b)*(a-2)
d = c/12
e = d-11
z = math.sqrt(e)
print(z)

Print out the value of objects that can be involved in the error (here: `print(e)`).

In [None]:
import math
a = 5
b = 7*a
c = (a + b)*(a-2)
d = c/12
e = d-11
print(e)
z = math.sqrt(e)
print(z)
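Once the offending value is known, one option is to check it before the call that would fail. A minimal sketch, assuming a hypothetical helper `safe_sqrt` that returns `None` for negative input instead of raising an error:

```python
import math

def safe_sqrt(value):
    """Hypothetical helper: square root, or None when the input is negative."""
    if value < 0:
        return None
    return math.sqrt(value)

# Same arithmetic as in the cell above
a = 5
b = 7*a
c = (a + b)*(a-2)
d = c/12
e = d-11
z = safe_sqrt(e)
print(e, z)
```

Whether returning `None` is appropriate depends on the program; in other cases it may be better to let the error stop the run.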

`IndexError`: An index in a list, tuple, or string is out of range.

In [None]:
values = [1,27,33,46,52]
n = 0
for i in range(len(values)):
    n += values[i+1]

Print out the length of the list, and the index if it involves a variable (here: `print(len(values), i+1)`).

In [None]:
values = [1,27,33,46,52]
n = 0
for i in range(len(values)):
    print(len(values), i+1)
    n += values[i+1]
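Assuming the intent of the loop was to add up all the values (the original does not say), a possible fix is to iterate over the items directly, so that no index can run past the end of the list:

```python
values = [1, 27, 33, 46, 52]
n = 0
for value in values:  # no index, so nothing can go out of range
    n += value
print(n)
```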

`KeyError`: this is `IndexError`'s cousin; it is raised when looking up non-existent keys in a `dict`.
Remember that you can use `dict.get(key, default_value)` to prevent this error.

In [None]:
d = {}
print(d['a'])

In [None]:
print(d.get('a'))

In [None]:
print(d.get('a',0))
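A typical use of `dict.get` with a default value is counting, where a key is missing the first time it is seen. The sketch below counts the letters of an example sequence; the variable names are chosen for illustration.

```python
seq = 'SKADYEK'  # example amino acid sequence
counts = {}
for aa in seq:
    # a missing key counts as 0, so no KeyError on first sight
    counts[aa] = counts.get(aa, 0) + 1
print(counts)
```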

## Exercise 1 - Errors

Let's solve the following bugs. Each notebook cell has a single program with at least one bug that may either cause an error or make the program incorrect (producing wrong results).

**Fix the code.**

In [None]:
x = '7'
y = 8
z = x + y
print(z)

In [None]:
x = 1
y = 0
while x < 4:
    y += x
print(y)

In [None]:
switch = 'on'
if switch = 'off':
    print('go home')

In [None]:
range()

In [None]:
range(2.5)

In [None]:
range(2,3,0)

In [None]:
counter = 0
while counter < 5:
    print('hello')
    counter += 1
while counter < 5:
    print('bye')
    counter += 1

## Logical bugs

Some bugs don't cause errors. These are risky because we can easily miss them. For example, consider this function, which computes the [sum of a geometric series](http://en.wikipedia.org/wiki/Geometric_series#Formula):

$$
\sum_{k \ge 0}{a r^k} = \frac{a}{1-r}, \quad |r| < 1
$$

In [None]:
def geosum(a, r):
    return a/(1 - r)

This works well for some values, causes errors for other values, and gives incorrect results for yet other values:

In [None]:
print("Correct:")
print(geosum(1,0), 1)
print(geosum(1,0.5), 2)
print(geosum(0,0.5), 0)
print(geosum(0,2), 0)

print("Incorrect:")
print(geosum(1,2), "\u221e")
print(geosum(-1,2), "-\u221e")
print(geosum(2,-1), "NaN")

print("Error:")
print(geosum(1,1))

For this kind of bug we have to write **tests**.

The simplest way to do this is using [`assert` statements](https://docs.python.org/3/reference/simple_stmts.html#the-assert-statement).
The `assert` statement checks a condition and raises an `AssertionError` if it is `False`. You can also attach a message explaining why the assertion failed:

In [None]:
assert geosum(1,0) == 1, "Bad value"
assert geosum(1,0.5) == 2, "Bad value"
assert geosum(0,0.5) == 0, "Bad value"
assert geosum(0,2) == 0, "Bad value"

assert geosum(1,2) is None, "Bad value"
assert geosum(-1,2) is None, "Bad value"
assert geosum(2,-1) is None, "Bad value"
assert geosum(1,1) is None, "Bad value"

Let's fix the function:

In [None]:
def geosum(a, r):
    if a == 0:
        return 0.0  # always return same type 
    elif abs(r) >= 1:
        return None # formula only defined for |r|<1
    return a/(1 - r)

In [None]:
assert geosum(1,0) == 1, "Bad value"
assert geosum(1,0.5) == 2, "Bad value"
assert geosum(0,0.5) == 0, "Bad value"
assert geosum(0,2) == 0, "Bad value"

assert geosum(1,2) is None, "Bad value"
assert geosum(-1,2) is None, "Bad value"
assert geosum(2,-1) is None, "Bad value"
assert geosum(1,1) is None, "Bad value"
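The assertions above compare floating-point results with `==`, which works for these inputs but can fail when rounding leaves the result slightly off the exact value. A hedged sketch of a more tolerant check uses `math.isclose` (available since Python 3.5); the input `r = 0.9` is chosen for illustration, and the function is repeated so the cell runs on its own.

```python
import math

def geosum(a, r):
    # same logic as the fixed function above
    if a == 0:
        return 0.0
    elif abs(r) >= 1:
        return None
    return a/(1 - r)

result = geosum(1, 0.9)
print(result)  # may differ from 10 in the last digits
assert math.isclose(result, 10), "Bad value"
```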

There are more sophisticated ways to write tests. 
The [unittest](https://docs.python.org/3/library/unittest.html) module is a good starting point and [nose](https://nose.readthedocs.org/en/latest/) is _nicer testing for Python_.
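As a rough sketch of what the same checks might look like with `unittest`, the cell below groups them into a test case and runs it explicitly, which tends to work better inside a notebook than `unittest.main()`. The class and method names are made up for illustration, and `geosum` is repeated so the cell runs on its own.

```python
import unittest

def geosum(a, r):
    # same logic as the fixed function above
    if a == 0:
        return 0.0
    elif abs(r) >= 1:
        return None
    return a/(1 - r)

class GeosumTest(unittest.TestCase):
    def test_converging_series(self):
        self.assertEqual(geosum(1, 0), 1)
        self.assertEqual(geosum(1, 0.5), 2)

    def test_zero_first_term(self):
        self.assertEqual(geosum(0, 2), 0)

    def test_diverging_series(self):
        self.assertIsNone(geosum(1, 2))
        self.assertIsNone(geosum(1, 1))

# Load and run the test case, then keep the result for inspection
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GeosumTest)
result = unittest.TextTestRunner().run(suite)
```

Unlike a chain of `assert` statements, the runner continues after a failure and reports every failing test.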

## Exercise 2 - test

Below is a function that calculates the length of the longest side (the hypotenuse) of a right triangle given the lengths of the other two sides using the [Pythagorean theorem](http://en.wikipedia.org/wiki/Pythagorean_theorem):

$$
a^2 + b^2 = c^2
$$

![Pythagorean theorem](http://upload.wikimedia.org/wikipedia/commons/thumb/d/d2/Pythagorean.svg/265px-Pythagorean.svg.png)

In [None]:
def pythagoras(a,b):
    return math.sqrt(a**2 + b**2)

Write a series of assertions to test the function.

In [None]:
import math  # needed by pythagoras and by the second assertion
assert pythagoras(3,4) == 5
assert pythagoras(1,2) == math.sqrt(5)

## Try and except

Errors (also called _exceptions_) can be caught and handled, if you know how to handle them.

For example, trying to open a file that does not exist raises a `FileNotFoundError`.

You can catch the error using `try`-`except` and either recover from it (if you can) or handle it differently. For example, we can alert the user to the problem without the "ugly" traceback.

#### Exception: `FileNotFoundError` trying to open non-existing file

In [None]:
filename = "myfile.txt"
with open(filename) as f:
    print(f.read())

#### Catch with `try`-`except`

In [None]:
filename = "myfile.txt"
try:
    with open(filename) as f:
        print(f.read())
except FileNotFoundError:
    print("File",filename,"not found, please try a different filename")

#### Exception: `ValueError` on parsing a number

In [None]:
number = input("Give me a number please: ")
number = int(number)
print(number)

#### Catch with `try`-`except`

In [None]:
number = input("Give me a number please: ")
try:
    number = int(number)
except ValueError:
    print("I asked for a number and you gave me:", number)
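The same pattern can be wrapped in a function so that the parsing logic can be reused and tested without typing input. A minimal sketch, assuming a hypothetical helper `parse_int` that returns a default value when the text is not a whole number:

```python
def parse_int(text, default=None):
    """Hypothetical helper: convert text to int, or return default on failure."""
    try:
        return int(text)
    except ValueError:
        return default

print(parse_int("42"))
print(parse_int("forty-two"))
print(parse_int("forty-two", 0))
```

Only `ValueError` is caught here, so unrelated problems are still reported as usual.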

## Exercise 3 - *Sabotage* and protein mass

Here's a nice little program that calculates the mass of a protein given the amino acid sequence of the protein.

In [None]:
from urllib import request
request.urlretrieve("https://raw.githubusercontent.com/Py4Life/TAU2015/master/aa_weights.txt", "aa_weights.txt")

In [None]:
with open("aa_weights.txt") as f:
    weights = {}
    for line in f:
        aa,w = line.strip().split()
        w = float(w)
        weights[aa] = w
print(weights)

In [None]:
def protein_mass(sequence):
    mass = 0
    for aa in sequence:
        if aa not in weights:
            raise ValueError("Input sequence contains an illegal aa: %s" % aa)
        mass += weights[aa]
    return mass

In [None]:
seq = 'SKADYEK'
assert round(protein_mass(seq), 3) == 821.392
print("Success")
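The `ValueError` raised by `protein_mass` can be caught by the calling code just like the built-in errors shown earlier. The sketch below repeats the function so it runs on its own and uses a two-entry `weights` table whose values are assumed for illustration; the notebook itself loads the full table from `aa_weights.txt`.

```python
# Assumed two-entry table for illustration; the notebook reads the full one from aa_weights.txt
weights = {'A': 71.03711, 'K': 128.09496}

def protein_mass(sequence):
    mass = 0
    for aa in sequence:
        if aa not in weights:
            raise ValueError("Input sequence contains an illegal aa: %s" % aa)
        mass += weights[aa]
    return mass

try:
    protein_mass('AKX')  # 'X' is not in the table
except ValueError as err:
    message = str(err)
    print("Could not compute mass:", message)
```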

Open the notebook on your computer and sabotage the program by hiding exactly 5 bugs in the code.

Now, change seats with a partner and find the bugs that your partner hid in the code.

The protein mass problem appears in [Rosalind](http://rosalind.info/problems/prtm/).
The *Sabotage* exercise is borrowed from a post in the [Teach Computing](https://teachcomputing.wordpress.com/2013/11/23/sabotage-teach-debugging-by-stealth/) blog by [Alan O'Donohoe](https://twitter.com/teknoteacher).

# References

- [Debugging in Python](http://hplgit.github.io/teamods/debugging/debug.html) by Hans Petter Langtangen. Some of the material here is borrowed from or influenced by this wonderful resource. Check it out for more debugging tips, examples and methods.

- [Sabotage: Teach Debugging By Stealth](https://teachcomputing.wordpress.com/2013/11/23/sabotage-teach-debugging-by-stealth/) by Alan O'Donohoe

## Fin
This notebook is part of the _Python Programming for Life Sciences Graduate Students_ course given at Tel-Aviv University, Spring 2016.

The notebook was written using [Python](http://python.org/) 3.5 and [IPython](http://ipython.org/) 3.1.

The code is available at https://github.com/Py4Life/TAU2016/blob/master/lecture5.ipynb.

The notebook can be viewed online at http://nbviewer.ipython.org/github/Py4Life/TAU2016/blob/master/lecture5.ipynb.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

![Python logo](https://www.python.org/static/community_logos/python-logo.png)