# Debugging

> The scientific method’s central motivation is the ubiquity of error—the awareness that
> mistakes and self-delusion can creep in absolutely anywhere and that the scientist’s
> effort is primarily expended in recognizing and rooting out error.
>
> —Donoho 2009


This chapter will prepare you to recognize, diagnose, and fix bugs using various tools
and methods for “debugging” your code. It will do so by introducing:

- When, how, and by whom bugs are encountered
- Methods of diagnosing bugs
- Interactive debugging, for diagnosing bugs quickly and systematically
- Profiling tools to quickly identify performance and memory management issues
- Linting tools to catch style inconsistencies and typos


## Encountering a Bug

A bug may take the form of incorrect syntax, imperfect logic, an infinite loop, poor
memory management, failure to initialize a variable, user error, or myriad other
human mistakes. It may materialize as:
    
- An unexpected error while compiling the code
- An unexpected error message while running the code
- An unhandled exception from a linked library
- An incorrect result
- An indefinite pause or hang-up
- A full computer crash
- A segmentation fault
- Silent failure

The longer a bug exists undetected in a piece of trusted software, the more
dire the situation:
1. If a bug is found in testing, it can be fixed before the software is ever used.
2. If a bug is found before there are users, it can be fixed before it affects anyone
running the code.
3. If a bug is found when the code is run, it can be fixed before analysis is done on
the results.
4. If a bug is found when the results of the code are analyzed, it can be fixed before
the results are published in a journal article.
5. If a bug is found after the results are published, the paper may have to be
corrected or even retracted.

## Print Statements

Print statements are every developer’s first debugger, so we’ll start here. They are
not the best practice for effective computing, however, and we will cover better
methods later in the chapter. Printing is typically a check that asks one or both of
these questions:
    
- Is the bug happening before a certain line?
- What is the status of some variable at that point?

In the following example, something about the code is causing it to “hang.” That is, it
simply seems to run forever, as if stalled:

In [None]:
def mean(nums):
    bot = len(nums)
    it = 0
    top = 0
    while it < len(nums):
        top += nums[it]
    return float(top) / float(bot)

a_list = [1, 2, 3, 4, 5, 6, 10, "one hundred"]
mean(a_list)

### Exercise: Print Statement Debugging

In the code below, use print statements to:
    
1) Determine whether the code is still running at line 5. 

2) Determine the value of `top` during the while loop.

3) Determine the value of `it` during the while loop.

In [None]:
def mean(nums):
    bot = len(nums)
    it = 0
    top = 0
    print("Still Running at line 5")
    while it < len(nums):
        top += nums[it]
        print(top)
        print(it)
    return float(top) / float(bot)

a_list = [1, 2, 3, 4, 5, 6, 10, "one hundred"]
mean(a_list)  # prints without end; interrupt the kernel to stop it
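
Because the loop never terminates, the printed output scrolls indefinitely. The sketch below keeps it finite by assuming a hypothetical `max_iterations` guard, which is not part of the original function and exists only to make the diagnosis readable. The names `mean_debug` and `trace` are likewise illustrative.

```python
def mean_debug(nums, max_iterations=3):
    """Instrumented copy of the buggy loop that gives up after a few passes."""
    it = 0
    top = 0
    seen = []    # (top, it) pairs recorded on each pass
    passes = 0
    while it < len(nums):
        top += nums[it]
        print("top =", top, "it =", it)
        seen.append((top, it))
        passes += 1
        if passes >= max_iterations:  # hypothetical guard, for debugging only
            print("Stopping: `it` never changes, so the loop cannot end")
            break
    return seen


trace = mean_debug([1, 2, 3, 4, 5, 6, 10, "one hundred"])
```

The recorded pairs show `top` growing while `it` stays at 0, which points directly at the missing increment.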


One way to fix this loop is to increment `it` on each pass through the loop.
A simpler alternative, shown below, replaces the loop with the built-in `sum`.
However, it still throws an error (a `TypeError`). We'll use a bigger hammer for this later.

In [None]:
def mean(nums):
    top = sum(nums)
    bot = len(nums)
    return float(top) / float(bot)

a_list = [1, 2, 3, 4, 5, 6, 10, "one hundred"]
mean(a_list)
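
The first fix mentioned above, incrementing `it` inside the loop, might look like the following sketch. The names `numeric_list`, `numeric_mean`, and `error_name` are illustrative assumptions. The loop now terminates, but the string entry still raises a `TypeError`, just as it does in the `sum` version.

```python
def mean(nums):
    bot = len(nums)
    it = 0
    top = 0
    while it < len(nums):
        top += nums[it]
        it += 1  # advance the index so the loop can terminate
    return float(top) / float(bot)


numeric_list = [1, 2, 3, 4, 5, 6, 10]
numeric_mean = mean(numeric_list)
print(numeric_mean)

# The original list also contains a string, which the loop cannot add.
try:
    mean(numeric_list + ["one hundred"])
    error_name = None
except TypeError as err:
    error_name = type(err).__name__
    print("still fails:", err)
```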

## Interactive Debugging

Interactive debuggers offer an alternative to littering one’s code base with print
statements: they allow the user to pause during execution and jump into the code at a
certain line. As their name suggests, they allow the developer to query the state of
the code interactively and to move forward through the execution to determine the
source of the error.

Interactive debugging tools generally enable the user to:
    
- Query the values of variables
- Alter the values of variables
- Call functions
- Do minor calculations
- Step line by line through the execution and move up and down the call stack


## Debugging in Python (pdb)

We can query the situation with the mean function using pdb.

### Exercise: Set a Trace

1. Create a file called `mean.py` containing the buggy mean code, as below.

```python
def mean(nums):
    top = sum(nums)
    bot = len(nums)
    return float(top) / float(bot)

if __name__ == "__main__":
    a_list = [1, 2, 3, 4, 5, 6, 10, "one hundred"]
    mean(a_list)
```

2. Import `pdb` in that file.
3. Decide where you would like to set a trace and add a line there
   that reads `pdb.set_trace()`.
4. Save the file. If you try running it, what happens?

In [None]:
!python mean.py

## Stepping Through the Code
The first time you use a tool, you should find out how to get help. At the pdb prompt,
typing `help` provides a table of available commands. Can you guess what some of them do?

In [None]:
import pdb  # the module must be imported before help() can look it up
help(pdb)

### Exercise: Step Through the Execution

1. Run your script from the last exercise.
2. Determine the expected effects of stepping through the execution
   by one line.
3. Type s. What just happened?

In [None]:
!python mean.py

## Continuing the Execution
Rather than stepping through the rest of the code one line at a time, we can continue
the execution through to the end with the continue command. The shorthand for
this command is c. 

### Exercise: Continue the Execution to Success

1. Run the script from the previous exercise.
2. Step forward one line.
3. Change “one hundred” to 100 in `a_list`, for example by typing
   `a_list[-1] = 100` at the pdb prompt.
4. Continue execution with `c`. What happened? Was the mean of
   the list printed correctly? Why?

In [None]:
!python mean.py

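The fix was obvious in this case, but it isn't always so easy. Changing `a_list` in the
debugger repaired only that one run; the lasting fix is to correct the data in the source: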

In [None]:
def mean(nums):
    top = sum(nums)
    bot = len(nums)
    return float(top) / float(bot)

a_list = [1, 2, 3, 4, 5, 6, 10, 100]
result = mean(a_list)
print(result)
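
The debugger session from the exercise can also be replayed without typing anything, which is handy for documenting how a bug was diagnosed. The sketch below assumes the commands are fed to `pdb.Pdb` through its `stdin` argument and that its output is captured through `stdout`. The names `commands`, `session`, and `debugger` are illustrative.

```python
import io
import pdb


def mean(nums):
    top = sum(nums)
    bot = len(nums)
    return float(top) / float(bot)


a_list = [1, 2, 3, 4, 5, 6, 10, "one hundred"]

# Commands that would otherwise be typed at the (Pdb) prompt:
#   p nums            print the argument as mean() sees it
#   !nums[-1] = 100   replace the offending string in place
#   c                 continue to the end of the call
commands = io.StringIO("p nums\n!nums[-1] = 100\nc\n")
session = io.StringIO()  # captures what pdb would print to the terminal

debugger = pdb.Pdb(stdin=commands, stdout=session, nosigint=True, readrc=False)
# runcall() pauses at the first line of mean() and then reads the commands.
result = debugger.runcall(mean, a_list)

print(session.getvalue())
print("mean:", result)
```

Because `nums` and `a_list` refer to the same list object, the in-place change made inside the debugger is visible after the call returns, but the source file itself is left untouched.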

## Breakpoints

In pdb, we can set a breakpoint using the break or shorthand b syntax. We set it at a
certain line in the code by using the line number of that place in the code or the name
of the function to flag:

```
b(reak) ([file:]lineno | function)[, condition]
```

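Try setting a breakpoint. For example, `b mean` pauses on entry to the `mean` function,
and `b mean.py:3` pauses at line 3 of `mean.py`.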


## Profiling

Tools called profilers sketch a profile of the time spent in each part of the
execution stack. Profiling goes hand in hand with the debugging process. When a
slowdown or memory error is suspected, profiling is a form of debugging. When the
code is correct but merely inefficient, profiling guides optimization.

In Python, cProfile is a common way to profile a piece of code. For our
fixed_mean.py file, in which the bugs have been fixed, cProfile can be executed on the
command line, as follows:

```bash
$ python -m cProfile -o output.prof fixed_mean.py
```

The `-o` flag gives the output file a name, which typically ends in the `.prof`
extension. The final argument is the name of the Python code file to be examined.

### Exercise: Profile the mean function

1) Move the fixed mean function into its own file, called `fixed_mean.py`.

2) Execute the cProfile command above.

3) Examine the output file. It's not very readable, so we'll need a way to view it.

In [None]:
!python -m cProfile -o output.prof fixed_mean.py

## Viewing the Profile with pstats

In [None]:
import pstats
p = pstats.Stats('output.prof') 
p.print_stats()
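
The same kind of profile can also be collected from within Python rather than from the command line. This is a minimal sketch, assuming the fixed `mean` function is defined inline and using an arbitrary workload of 1000 calls so that there is something to measure. The names `profiler`, `stream`, and `report` are illustrative.

```python
import cProfile
import io
import pstats


def mean(nums):
    top = sum(nums)
    bot = len(nums)
    return float(top) / float(bot)


profiler = cProfile.Profile()
profiler.enable()
for _ in range(1000):  # arbitrary workload, chosen only for illustration
    mean(list(range(100)))
profiler.disable()

# Send the report to a string so it can be inspected or saved.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
# Drop directory names, sort by cumulative time, and keep the top five rows.
stats.strip_dirs().sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that dominate the run at the top, which is usually the first thing to check when optimizing.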

## Linting

Linting removes “lint” from source code. It’s a type of cleanup that is neither debugging
nor testing nor profiling, but can be helpful at each of these stages of the programming
process. Linting catches unnecessary imports, unused variables, potential
typos, inconsistent style, and other similar issues.

Linting in Python can be achieved with the pyflakes tool. Get it? Errors are more than
just lint, they’re flakes!

As an example of how to use a linter, recall the elementary.py file from Chapter 6. To
lint a Python program, execute the pyflakes command on it:

```bash
$ pyflakes elementary.py
```

pyflakes responds with a note indicating that a package has been imported but
remains unused throughout the code execution:

```
elementary.py:2: 'numpy' imported but unused
```
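
To make concrete the kind of check behind such a message, the sketch below finds unused imports with the standard-library `ast` module. It is an illustration only, not how pyflakes is implemented, and `SOURCE` and `unused_imports` are hypothetical names.

```python
import ast

# A small program to inspect; it is parsed, never executed.
SOURCE = '''
import os
import math

def area(radius):
    return math.pi * radius ** 2
'''


def unused_imports(source):
    """Return (line, name) pairs for imported names that are never referenced."""
    tree = ast.parse(source)
    imported = {}  # bound name -> line number of the import
    used = set()   # every bare name referenced anywhere in the code
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a"
                name = (alias.asname or alias.name).split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.lineno
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted((line, name) for name, line in imported.items()
                  if name not in used)


print(unused_imports(SOURCE))
```

A real linter handles many more cases (scopes, `__all__`, star imports), so this should be read as a toy model of the idea rather than a replacement for the tool.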

Most linting tools focus on cosmetic issues. Style-related tools such as flake8 and
pep8 check for errors, variable name misspellings, and PEP8 compatibility, while
autopep8 goes a step further and rewrites code to conform. For more on the PEP8 style
standard in Python, see Chapter 19. To use the pep8 tool (renamed pycodestyle in more
recent releases), simply call it from the command line:

```bash
$ pep8 elementary.py
```

It will analyze the Python code that you have provided and will respond with a
line-by-line listing of stylistic incompatibilities with the PEP8 standard.

### Exercise: Lint Your Project Code
Your project includes a lot of source code. 

1) Get used to pyflakes and pep8 on the command line using elementary.py and the other example files.

2) For each Python file in your project code, apply pyflakes and clean up the lint.

3) For each Python file in your project code, apply pep8 and make the code PEP8 compatible.

## Debugging Wrap-up

- Understand bugs
- Track down their cause
- Prototype solutions
- Check for success
- Use profilers and linters to optimize once you’ve fixed the bugs. 

