# Python Debugging

As a programmer and data engineer, you will often encounter unexpected behavior from your code, and will need to be able to diagnose the problem and find a solution. That process of finding and resolving errors (bugs) in your program is called *debugging*. Debugging is part art and part science, and correcting program errors can sometimes be surprisingly difficult. We will only discuss the basics of Python debugging here, but if you are interested in learning more, see the additional links at the bottom of this document.

## Common Errors

First, let's take a look at some of the most commonly encountered errors.

### Syntax Error

Syntax errors result from things like missing colons or parentheses, or incorrect use of white space. Here's an example of forgetting a colon at the end of a line in a function definition, which will generate a `SyntaxError`:

In [None]:
def hello_world()
    message = "Hello, world!"
    print(message)
    return message


Syntax errors are usually easy to correct, but can sometimes be tricky. In the next example, we have corrected the problem with the missing colon, but now there is a different issue with the code:

In [None]:
def hello_world():
	message = "Hello, World!!"
	print(message)
        return message

The Python interpreter helpfully tells us the problem: a `TabError` for inconsistent use of tabs and spaces in indentation. But this error is not quite as straightforward, because the location given in the traceback (the end of the `return` statement line) does not tell us which lines actually need to change. It is also not obvious where the mixed tabs and spaces are, because they look the same to the eye even though the underlying characters differ. The fallback when fixing these sorts of errors is to go through your code line by line in the region of the error, delete the leading whitespace, and redo the indentation, keeping your use of tabs or spaces (whichever you pick) consistent.
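
Because tabs and spaces are hard to tell apart by eye, a small script can report which character each line is indented with. The following is a minimal sketch, not a standard tool: `describe_indentation` is a hypothetical helper name, and `sample` simply reproduces the broken function above as a string.

```python
def describe_indentation(source):
    """Return (line_number, kind) for every indented, non-blank line."""
    report = []
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent:
            continue  # line is not indented
        if set(indent) == {"\t"}:
            kind = "tabs"
        elif set(indent) == {" "}:
            kind = "spaces"
        else:
            kind = "mixed"
        report.append((number, kind))
    return report


# The broken function from above: two tab-indented lines, one space-indented.
sample = (
    "def hello_world():\n"
    '\tmessage = "Hello, World!!"\n'
    "\tprint(message)\n"
    "        return message\n"
)

for number, kind in describe_indentation(sample):
    print(f"line {number}: indented with {kind}")
```

Here lines 2 and 3 are reported as tabs and line 4 as spaces, which shows at a glance where the indentation style changes.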


### Name Error

A `NameError` occurs when you try to use a variable that has not been defined yet, and thus does not exist in your Python namespace. Let's take a look:

In [None]:
# my_new_variable is not defined, so we get a NameError
print(my_new_variable)

#### Exercise:

- Assign a value to `my_new_variable`, run the code cell, then go back and run the cell above that threw the error. Often a `NameError` in a notebook is the result of not running all the cells, or not running them in the necessary order.

### IndexError

Another common error type is the `IndexError`. This occurs when we try to access an element in a list or other object with an index that is out of range:

In [None]:
my_list = list(range(10))  # a list of 0 through 9, so the valid indexes are 0 to 9

my_list[11]
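
To make "out of range" concrete, here is a small sketch of which indexes are valid for a list of ten items. The helper `item_or_default` is a hypothetical name introduced only for this illustration; it checks the bounds before indexing instead of letting the `IndexError` happen.

```python
my_list = list(range(10))

# Valid positive indexes run from 0 to len(my_list) - 1.
last_valid_index = len(my_list) - 1
print(last_valid_index)

# Negative indexes count back from the end, down to -len(my_list).
print(my_list[-1])


def item_or_default(items, index, default=None):
    """Return items[index] if the index is in range, otherwise default."""
    if -len(items) <= index < len(items):
        return items[index]
    return default


print(item_or_default(my_list, 11, default="no such item"))
```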

#### Exercise:

- The code below will throw an IndexError. Without changing the list itself, change the code in the 'for' loop so it prints out each item in the list.

In [None]:
num_list = [10, 2, 30, 4, 5]

for i in num_list:
    print(num_list[i])



The easiest and most common debugging method in Python does not involve any special tools, and we have been using it already throughout this tutorial. Whenever your code generates an error, the Python interpreter will automatically print out a [traceback](https://realpython.com/python-traceback/) (or *stack trace*), showing you details of the error and where exactly in your code it was produced. Let's take a look at another common error (division by zero) to see what a traceback looks like:


In [None]:
a = 0
print(1/a)

We can see that the interpreter helpfully points to the exact line of code that generated the error and gives an error message that helps us diagnose the problem.
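
A traceback is usually easiest to read from the bottom up: the last line names the exception, and the lines above it show the chain of calls that led there. The sketch below captures a traceback as text using the standard `traceback` module so we can inspect it. The `divide` function is a made-up example, and the `try`/`except` syntax it relies on is covered in the Exception Handling section later in this document.

```python
import traceback


def divide(numerator, denominator):
    return numerator / denominator


try:
    divide(1, 0)
except ZeroDivisionError:
    # format_exc() returns the traceback text the interpreter would print.
    report = traceback.format_exc()

print(report)

# The final line of the traceback names the exception type and message.
last_line = report.strip().splitlines()[-1]
print(last_line)
```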

#### Exercise:

- The code cell below is trying to append the number `3` to the end of the list, but it throws a TypeError. Read the error, and then fix the code. 

In [None]:
num_list = [10, 2, 30, 4, 5]

print(num_list + 3)

## Debugging with pdb

Python ships with an interactive debugger, `pdb`. There are a couple of ways to set a *breakpoint*, which is a point in the program where the interpreter will halt execution, allowing the programmer to check variable values or otherwise examine program behavior in more detail.

The more traditional way uses the `set_trace()` function included with pdb. The other way uses the `breakpoint()` function, which is built into Python as of version 3.7.

```python
### breakpoint using set_trace()
import pdb; pdb.set_trace()

### breakpoint using breakpoint() (the newer way)
breakpoint()
```

Notice that we do not need an import, or a prefix on the function name, to use `breakpoint()`. Python 3.7 added it as a built-in function.
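
Under the hood, `breakpoint()` calls `sys.breakpointhook()`, which by default starts pdb (this is described in PEP 553, linked under Further Reading). The sketch below swaps in a stand-in hook to make that dispatch visible. The name `record_breakpoint` and the `calls` list are hypothetical, used only for this illustration.

```python
import sys

calls = []


def record_breakpoint(*args, **kwargs):
    """Stand-in hook: record the call instead of opening the debugger."""
    calls.append("breakpoint reached")


sys.breakpointhook = record_breakpoint  # replace the default hook
try:
    breakpoint()  # dispatches to record_breakpoint, not pdb
finally:
    sys.breakpointhook = sys.__breakpointhook__  # restore the default

print(calls)
```

A related convenience: setting the environment variable `PYTHONBREAKPOINT=0` makes the default hook ignore every `breakpoint()` call, which lets you run a script without removing the calls first.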

Here's an example, given a file `ex_debug.py` which contains the following code:

```python
filename = __file__
breakpoint()
print(f'path = {filename}')
```

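If we call the Python interpreter on the `ex_debug.py` file by running `python ex_debug.py`, it will halt execution at the `breakpoint()` call and enter debugging mode. The session below comes from a Python version earlier than 3.13, where the debugger reports the next line to be executed; from Python 3.13 on, it stops on the `breakpoint()` line itself: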

```bash
> /Users/myuser/Desktop/ex_debug.py(3)<module>()
-> print(f'path = {filename}')
(Pdb) 
```

Now, if you enter `p filename` at the prompt, the debugger will print the value of the `filename` variable:

Output:
<pre>
(Pdb) p filename
'/Users/myuser/Desktop/ex_debug.py'
(Pdb)
</pre>

This is very useful for examining program behavior in detail. If you are trying to track down the source of an error, you can set breakpoints in your code and take a close look at what is going on in your program at a particular point. Especially with difficult-to-diagnose errors, this can be much more helpful than simply using `print()` to print out variable values, which can get quite messy.

There's another useful feature of pdb besides the ability to set breakpoints. Once you have set a breakpoint and stopped program execution, you can use the `n` (next) and `s` (step) commands to execute your program one line at a time:

- `n`: execute the current line and stop at the next line in the current function, 'stepping over' any function calls (i.e., the debugger does not stop inside a function that is called)
- `s`: execute the current line and 'step into' a function if one is called, stopping at its first line

This allows us to walk through our code, one line at a time. We can check our variables as we go, using `p` as before, allowing us to know the exact state of our code before the line that generates the error. This can also be very helpful if our code is not failing but _is_ producing unexpected results. We can use the debugger to see how the variables change from line to line.

Some other useful pdb commands:

- `q`: quit the debugger and stop the program
- `c`: continue execution until the next breakpoint (or until the program ends)
- `whatis <VARIABLE NAME>`: see the data type of a variable
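
To see these commands in action without typing at a prompt, the sketch below feeds them to pdb from a string and collects the debugger's output. Scripting the debugger this way is purely for illustration (normally you would type each command at the `(Pdb)` prompt), and the two-line program being debugged is a made-up example.

```python
import io
import pdb

# Commands we would otherwise type at the (Pdb) prompt, one per line.
commands = io.StringIO("n\np x\nwhatis x\nc\n")
output = io.StringIO()

# readrc=False ignores any personal .pdbrc file; nosigint=True leaves
# Ctrl-C handling untouched.
debugger = pdb.Pdb(stdin=commands, stdout=output, readrc=False, nosigint=True)

namespace = {}
# The debugger stops before the first line; 'n' runs it, 'p x' and
# 'whatis x' inspect the result, and 'c' lets the program finish.
debugger.run("x = 41 + 1\ny = x * 2", namespace)

print(output.getvalue())
print(namespace["y"])
```

The captured output includes the value of `x` (from `p x`) and its type (from `whatis x`), and because `c` let the program run to completion, `y` has been assigned as well.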

## VS Code Debugger

VS Code also has a useful debugging tool, which you can run by clicking on the bug icon in the left control panel, by selecting 'Start Debugging' from the 'Run' dropdown menu at the top, or by pressing the F5 key. A control panel will appear at the top, with options like 'step into', 'step over', and 'continue'. You will also see a new panel to the left, which allows you to watch how the values of variables change as you progress through your code. Two features to note:

- You can track the value of specific expressions by clicking the '+' icon in the 'Watch' box, then typing the expression.

- You can add breakpoints by simply clicking to the left of a line number; the breakpoint will show up as a red dot.

## Exception Handling

As a programmer, you should strive to make your code as error-free as possible. But even in the best code, there will sometimes still be unexpected behavior that causes errors. Python, like most programming languages, provides a mechanism for handling exceptions so that they do not crash our program: the `try` and `except` block.

In [None]:
# demonstration of handling an exception
try:
    with open('nonexistentfile.txt', 'r') as f:
        data = f.read()
except:
    print("The file could not be found.")

print("The program keeps executing.")

Notice that instead of halting with an unhandled exception, the interpreter handles the exception by executing the code in the `except` block, and continues to run. In a real program, we could display an error message to the user and prompt them to specify another file. Use a `try` and `except` block anywhere you have code that may throw an exception (such as file operations, OS-specific code, etc.).

We can also catch specific types of errors in our `except` clause. This is usually preferable: a bare `except:` catches every exception, including ones such as `KeyboardInterrupt` that we rarely intend to suppress, and the message it prints may then misdescribe what went wrong. For example, we could specifically catch the error raised when a file is not found:


In [None]:
# demonstration of handling a FileNotFoundError exception
try:
    with open('nonexistentfile.txt', 'r') as f:
        data = f.read()
except FileNotFoundError as E:
    print(E)
    print("The file could not be found.")
print("The program keeps executing.")
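
Building on the idea of falling back to another file, here is a minimal sketch that tries several candidate paths and returns the contents of the first one that exists. The function name `read_first_available` and the file names are hypothetical; a temporary directory is used so the example can be run anywhere.

```python
import os
import tempfile


def read_first_available(paths):
    """Return the contents of the first path that exists, or None."""
    for path in paths:
        try:
            with open(path, "r") as f:
                return f.read()
        except FileNotFoundError as error:
            # Report the problem and move on to the next candidate.
            print(f"Skipping: {error}")
    return None


with tempfile.TemporaryDirectory() as folder:
    missing_path = os.path.join(folder, "nonexistentfile.txt")
    real_path = os.path.join(folder, "data.txt")
    # Create a real file so that the second candidate succeeds.
    with open(real_path, "w") as f:
        f.write("hello")
    result = read_first_available([missing_path, real_path])

print(result)  # hello
```

Only `FileNotFoundError` is caught here, so any other problem (for example, a permissions error) would still surface as a normal traceback rather than being silently skipped.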

#### Exercises:

1. Using some code from one of the previous exercises, perform interactive debugging using pdb:
    1. Introduce an error somewhere in your program
    1. Set a breakpoint
    1. Run the program from the interpreter
    1. When program execution hits the breakpoint, use pdb to step through the code line-by-line
    1. Print out the value of one of your variables
    1. Practice exception handling by wrapping the problematic statement in a `try... except` block

### Further Reading
- [Understanding the Python Traceback](https://realpython.com/python-traceback/)
- [pdb - The Python Debugger](https://docs.python.org/3/library/pdb.html) (python.org)
- [Python debugging in VSCode](https://code.visualstudio.com/docs/python/debugging) (visualstudio.com)
- [Debugging and Profiling](https://docs.python.org/3/library/debug.html) (python.org)
- [Python Built-in Exceptions](https://docs.python.org/3/library/exceptions.html)
- [PEP 553 - Built-in breakpoint()](https://www.python.org/dev/peps/pep-0553/) (python.org)