# Week 5 Tutorial: Debugging and Testing in Python

## POP77001 Computer Programming for Social Scientists

##### Module website: [bit.ly/POP77001](https://bit.ly/POP77001)

## Debugging with `print()`

- `print()` can be used to inspect the internal state of a program while it runs
- It can be placed at critical points in the code (before or after loops, function calls, or object loading)
- For harder cases, switch to the Python debugger (`pdb`)
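
As a minimal sketch of this approach, the hypothetical function `running_total()` below (not part of the module materials) prints its internal state on every pass through the loop, so an unexpected value can be traced to the iteration that produced it.

```python
def running_total(values):
    """Return the list of cumulative sums of `values`."""
    totals = []
    current = 0
    for i, value in enumerate(values):
        current += value
        # Temporary debugging output: show the state on every iteration
        print(f"step {i}: value={value}, current={current}")
        totals.append(current)
    return totals


result = running_total([2, 4, 6])
print("result:", result)
```

Once the problem is found, the temporary `print()` calls should be removed again.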

## Exercise: Debug function for Pearson correlation

- See the function for calculating Pearson correlation below
- What do you think is the correlation coefficient between the lists `[1, 2, 3, 4, 5]` and `[-3, -5, -7, -9, -11]`?
- Check the output of the function. Is it correct?
- Find and fix any problems that you encounter

In [1]:
def pearson(array_x, array_y):
    """Calculates Pearson correlation coefficient
    
    Takes two ordered collections as an input
    Returns the result as a float
    """
    
    mean_x = sum(array_x)/len(array_x)
    mean_y = sum(array_y)/len(array_y)
    
    numerator = sum([(x - mean_x) * (y - mean_y) for x,y in zip(array_x, array_y)])
    denominator = (
        sum([(x - mean_x)**2 for x in array_x])**1/2 *
        sum([(y - mean_y)**2 for y in array_y])**1/2
    )
    
    r = numerator/denominator
    
    # Make sure that floating point arithmetic does not
    # produce absolute values larger than 1
    r = max(min(r, 1.0), -1.0)
    
    return r

## Python debugger (`pdb`)

- `pdb` is an interactive source code debugger for Python
- It lets you
    - Step through a function as it executes
    - Check the internal state of the program
    - Run arbitrary code in that context
    - Set breakpoints where execution pauses for inspection
    
Extra: [Python documentation on pdb](https://docs.python.org/3/library/pdb.html)

In [2]:
# 'pdb' exists as a Python module
import pdb

## Running `pdb`

- `pdb.run()` runs a code snippet, passed as a string, under debugger control

In [3]:
pdb.run('x = [1, 2, 3]; n = len(x)')

> <string>(1)<module>()

ipdb> n
--Return--
None
> <string>(1)<module>()

ipdb> x
[1, 2, 3]
ipdb> len(x)
3
ipdb> n


## Common debugger commands

| Command      | Description                                                  |
|:-------------|:-------------------------------------------------------------|
| `n(ext)`     | Execute next line of the current function                    |
| `s(tep)`     | Execute next line, stepping inside the function (if present) |
| `c(ontinue)` | Continue execution, only stop when breakpoint is encountered |
| `r(eturn)`   | Continue execution until function returns                    |
| `a(rgs)`     | Print the argument list of the current function              |
| `q(uit)`     | Quit from the debugger, executed program is aborted          |

Extra: [Full list of pdb commands](https://docs.python.org/3/library/pdb.html#debugger-commands)

## Using `pdb` to debug a function

- `pdb.runcall()` calls a function under debugger control, so that you can step through it


```
pdb.runcall(<function_name>, <*args>, <**kwargs>)
```
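
The call signature above can be illustrated with a small sketch. In a notebook the debugger commands are typed at the prompt. So that this example runs unattended, it instead feeds the commands `a` and `c` to a `pdb.Pdb` instance from a string and captures the debugger output. The function `scale()` and the variable names are hypothetical and used only for illustration.

```python
import io
import pdb


def scale(values, factor=1):
    """Hypothetical helper: multiply every element by `factor`."""
    scaled = [v * factor for v in values]
    return scaled


# Debugger commands that would normally be typed at the prompt:
# 'a' prints the arguments of the current function, 'c' continues to the end
commands = io.StringIO("a\nc\n")
transcript = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=transcript, readrc=False, nosigint=True)
# Positional and keyword arguments simply follow the function name
result = debugger.runcall(scale, [1, 2, 3], factor=10)

print(transcript.getvalue())
print("result:", result)
```

`runcall()` returns whatever the debugged function returns, so the result can still be used after the debugging session ends.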

## Exercise: Use `pdb` to debug a function

- Let's look again at the problematic `calculate_median` function from the lecture
- Run `pdb` debugger and step through it
- While inside the function, print out the values of `m` and the result of the summation
- Fix the bugs

In [4]:
def calculate_median(lst):
    lst.sort()
    n = len(lst)
    m = (n + 1)//2
    if n % 2 == 1:
        median = lst[m]
    else:
        median = sum(lst[(m-1):m])/2
    return median

In [None]:
pdb.runcall(calculate_median, [1, 2, 3])

## Week 5 Exercise (unassessed)

- Create tests for the `pearson()` and `calculate_median()` functions that
    - Test whether the sign of the calculated Pearson correlation is correct
    - Test whether the median calculated on an array with an even number of elements has an absolute difference of no more than 0.0001 from the correct answer
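
As a starting point, the sketch below shows one possible shape for such tests, using plain `assert` statements on a hypothetical `mean()` function rather than on the exercise functions themselves. The function and test names are assumptions made for illustration. `math.isclose()` with `abs_tol` is one way to express an absolute-difference check.

```python
import math


def mean(values):
    """Hypothetical function under test: arithmetic mean of a sequence."""
    return sum(values) / len(values)


def test_mean_sign():
    # The mean of all-negative values must itself be negative
    assert mean([-1, -2, -3]) < 0


def test_mean_close_to_expected():
    # Floating point results are compared with an absolute tolerance
    assert math.isclose(mean([0.1, 0.2, 0.3]), 0.2, abs_tol=0.0001)


# Run the tests; a failing check raises AssertionError
test_mean_sign()
test_mean_close_to_expected()
print("All tests passed")
```

The same pattern carries over to the exercise: replace `mean()` with the function being tested and choose inputs for which the correct answer is known in advance.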