Debugging Python code

This is a tutorial by Christoph Deil on how to get started with debugging Python code.

We will first look at how Python executes code, and at exceptions and stack frames, and only move on to using a debugger in the second half.

Throughout the tutorial you will find short exercises marked with 👉. Usually the solution is given directly below. Please execute the examples and try things for yourself. Interrupt with questions at any time!

This is the first time I'm giving a tutorial on this topic. Please let me know if you have any suggestions to improve!

Outline

Questions

Please help me adjust the tutorial content and speed a bit:

  • How often do you debug Python code (never, last year, all the time)?
  • Do you know how Python executes code?
  • Do you know what exceptions and stack frames are and how to read a traceback?
  • Have you used pdb to debug from Python?
  • Have you used %debug or %run -d from ipython or Jupyter?
  • Have you used the PyCharm debugger?
  • Have you used any other Python debugging tool?

Prerequisites

This tutorial assumes that you have used a terminal, Python, ipython and Jupyter before. No experience with Python debugging is assumed; this tutorial will get you started and focuses on the basics.

👉 Check that you have Python (3.5 or later), ipython and jupyter installed

$ python --version
$ ipython --version
$ jupyter --version

If you don't have this, one nice option to get it is Anaconda Python.

At the end of this tutorial, I will demo how to use the visual debugger in PyCharm.

If you want to try it out, install the free community edition of PyCharm. After installing PyCharm, you need to do two things: configure your Python interpreter and execute Tools | Create Command-line Launcher.

👉 Check that you have PyCharm installed and configured.

One way to launch PyCharm is to cd into the directory for this tutorial and use the command line launcher like this:

cd python-tutorials/debug
charm .

Then right-click on analysis.py and select "run analysis". A console should appear at the bottom, showing the output "5.0" that we print from that script.

Note: there are many other editors and IDEs that have Python debugging support (either built in or via extensions), e.g. vim or emacs or Visual Studio Code. I'm not familiar with those, and in any case we will not have time to sort out installation / setup problems for those during the tutorial. If you want to use those, try them after the tutorial and try to re-do the examples from this tutorial.

1. When to debug?

Suspect result

When your program produces incorrect or at least suspect output, you have to investigate your code and data to pin down why the output is not what you expect. This is the worst case. Compared to it, the next two cases are nice, because it's obvious that there's a problem and you get a traceback with lots of info on where to start looking.

Exception

Most of the time you will be able to read the traceback and code and figure out what is wrong and not need to start a debugger. But sometimes it's not clear and you need to 'look around'; that's when you start a debugger.

Crash

The Python process can crash. This is very rare, unless you work on or use buggy Python C extensions. To debug a crash you would use gdb or lldb. There are tutorials (see e.g. here), but we won't cover it in this tutorial.

👉 Cause Python to crash.

$ python
>>> import ctypes
>>> ctypes.string_at(1)
Segmentation fault: 11
$

2. How Python executes code

To debug Python code, you need to know how Python executes code. Have a look at the Python module point.py that defines a Point class and a distance function, and the analysis.py script that does from point import Point, distance and runs a simple analysis.

Is it clear what happens when you run python analysis.py?

The short answer is that Python executes code top to bottom, line by line. When a def or class statement is encountered, a function or type object is created in the module namespace. An import point statement causes the code in point.py to be executed, and when the bottom is reached, the point module is stored in the global sys.modules dict. A second import point is therefore a no-op and does not execute the code in point.py again. You should never reload in Python; always restart the interpreter if you edit any code.
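The import caching described above can be seen in a small sketch. The module name demo_point and its contents are hypothetical stand-ins written to a temporary directory; they are not the point.py from this tutorial.

```python
import pathlib
import sys
import tempfile

# Write a tiny throwaway module (hypothetical name) to a temporary directory.
module_dir = tempfile.mkdtemp()
pathlib.Path(module_dir, "demo_point.py").write_text(
    "print('executing demo_point.py')\n"
    "class Point:\n"
    "    pass\n"
)
sys.path.insert(0, module_dir)

import demo_point  # first import: runs the module code, so the message is printed
import demo_point  # second import: found in sys.modules, nothing is executed

# The module object is cached under its name in sys.modules.
print("demo_point" in sys.modules)
print(sys.modules["demo_point"] is demo_point)
```

The message from the module body appears only once, even though the import statement runs twice.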

If you're not sure what Python does with def or class or import, please ask now, and we'll spend a few minutes to add print statements to show what is going on.

Another important concept you need to know about is how Python variables and function calls work. Superficially Python seems similar to C or C++: there are variables to store data, and function calls create stack frames. But if you look a bit closer, you'll see that it works completely differently under the hood. In Python everything is an object, and variables are entries in namespace dictionaries (globals() and locals()) pointing to objects. Python is dynamic, i.e. happy to have an integer variable data = 999 and then on the next line change it to a string variable data = 'spam'. Memory management is automatic, using a reference counting garbage collector that deletes objects with zero references. Python is both "compiled" and "interpreted": code is parsed into an AST (abstract syntax tree), compiled into bytecode, and executed by the CPython "interpreter" or "virtual machine", which executes one bytecode instruction after the other in an infinite while loop.

Most Python programmers don't know how Python works "under the hood". That's good, Python is supposed to be a high-level language that "fits your brain" and does what you intuitively expect. But you still need to have a "mental model" of variables, objects and the stack of function calls, each with its own local namespace. The best way to learn about this is actually to step through code and see how the Python program state changes. Before we do this in a debugger, check this out:
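As a minimal sketch of these ideas, the snippet below rebinds a name to objects of different types, looks at the local namespace as a dictionary, and lists the bytecode instructions of a small function. The function names are made up for illustration, and the exact instruction names differ between CPython versions.

```python
import dis


def namespace_demo():
    data = 999                      # the name "data" refers to an int object
    snapshot = dict(locals())       # copy of the local namespace at this point
    data = "spam"                   # the same name now refers to a str object
    return snapshot, data


snapshot, final_value = namespace_demo()
print(snapshot)      # {'data': 999}
print(final_value)   # spam

# Two names can refer to the same object; "is" compares object identity.
first = [1, 2, 3]
second = first
second.append(4)
print(first)             # [1, 2, 3, 4]
print(first is second)   # True


def add(a, b):
    return a + b


# The compiled bytecode of add, as a list of instruction names.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
```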

👉 Step through the point example using http://pythontutor.com/.

If you'd like to learn more, the Whirlwind Tour Of Python is a beginner-level introduction, and Python epiphanies is a notebook explaining everything in great detail.
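The stack of function calls, each with its own local namespace, can also be inspected from Python itself. This is a small sketch using the standard inspect module; the functions inner, outer and describe_stack are hypothetical names chosen for illustration.

```python
import inspect


def describe_stack():
    """Return (function name, local variable names) for each calling frame."""
    frame = inspect.currentframe().f_back  # skip describe_stack itself
    result = []
    while frame is not None:
        result.append((frame.f_code.co_name, sorted(frame.f_locals)))
        frame = frame.f_back
    return result


def inner(x):
    y = x * 2
    return describe_stack()


def outer(a):
    b = a + 1
    return inner(b)


stack = outer(1)
# Innermost frame first; each function has its own local names.
print(stack[0])   # ('inner', ['x', 'y'])
print(stack[1])   # ('outer', ['a', 'b'])
```

A debugger shows the same information when you move up and down the stack and print local variables.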

3. Exceptions and Tracebacks

If you use Python, you will see exceptions and tracebacks all the time.

In Python, the term "error" is often used to mean the same thing as "exception". Of course, "error" and "exception" are very general, widely used terms that do not always refer to a Python exception (i.e. an instance of the built-in Exception class or of a subclass such as TypeError).

Once you've learned to read a traceback, there will be many times where it's enough information for you to figure out what's wrong, and you will only start the debugger in the cases where the problem isn't clear from reading the traceback and relevant code for a minute or two.

Example: exception.py

$ python exception.py
Traceback (most recent call last):
  File "exception.py", line 14, in <module>
    main()
  File "exception.py", line 12, in main
    move_it(p)
  File "exception.py", line 8, in move_it
    point.move(42, '43')
  File "/Users/deil/code/python-tutorials/debug/point.py", line 16, in move
    self.y += dy
TypeError: unsupported operand type(s) for +=: 'int' and 'str'
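To experiment with a traceback like this one without the tutorial files, here is a self-contained sketch. The Point class and the functions below are hypothetical reconstructions guessed from the traceback; the real point.py and exception.py may look different.

```python
import traceback


class Point:
    """Hypothetical stand-in for the Point class in point.py."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def move(self, dx, dy):
        self.x += dx
        self.y += dy  # raises TypeError when dy is a str


def move_it(point):
    point.move(42, "43")


def main():
    move_it(Point(0, 0))


try:
    main()
except TypeError as error:
    # The traceback lists frames from the outermost call to the innermost one.
    frames = traceback.extract_tb(error.__traceback__)
    function_names = [frame.name for frame in frames]
    message = str(error)
    print(function_names[-3:])   # ['main', 'move_it', 'move']
    print(message)
```

Reading the traceback bottom-up gives the failing line first, then the chain of calls that led to it.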

Sometimes you will see a "chained exception" (also called "double inception"), where a second exception is raised inside an exception handler, i.e. an except block.

Example: exception_chain.py

$ python exception_chain.py
Traceback (most recent call last):
  File "exception_chain.py", line 4, in <module>
    a / b
ZeroDivisionError: division by zero

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "exception_chain.py", line 6, in <module>
    print('Bad data:', a, c)
NameError: name 'c' is not defined
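A chained exception keeps a reference to the exception that was being handled when it was raised, in its __context__ attribute. The sketch below reproduces the situation from the traceback above inside a hypothetical function divide_and_report, so that the chain can be inspected instead of ending the program.

```python
def divide_and_report(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        # Bug in the handler: the name "c" is not defined anywhere.
        print("Bad data:", a, c)


try:
    divide_and_report(1, 0)
except NameError as error:
    second_name = type(error).__name__
    # __context__ holds the exception that was being handled at the time.
    first_name = type(error.__context__).__name__
    print(second_name)   # NameError
    print(first_name)    # ZeroDivisionError
```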

👉 Start python and create some of the most common exceptions.

These are some very common errors you'll see a lot:

  • SyntaxError
  • IndentationError
  • NameError
  • AttributeError
  • KeyError
  • IndexError
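One possible way to produce each of these on purpose is sketched below. The helper trigger and the examples dictionary are made up for illustration. The two syntax-related errors are raised via compile(), because a file containing them would not run at all.

```python
def trigger(action):
    """Call action and return the name of the exception it raises, if any."""
    try:
        action()
    except Exception as error:
        return type(error).__name__
    return None


examples = {
    "SyntaxError": lambda: compile("x = = 1", "<demo>", "exec"),
    "IndentationError": lambda: compile("def f():\npass", "<demo>", "exec"),
    "NameError": lambda: undefined_name,
    "AttributeError": lambda: "text".no_such_method(),
    "KeyError": lambda: {"a": 1}["b"],
    "IndexError": lambda: [1, 2, 3][10],
}

for expected, action in examples.items():
    print(expected, "->", trigger(action))
```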

👉 What other exceptions have you seen? Are they clear or do you have any questions?

There's more info on Python exceptions here and an overview of all built-in exceptions here.

4. Debugging with pdb

Let's use the Python scripts from above (exception.py and silent_error.py) to debug with pdb, the Python debugger.

  • with print statements
  • import pdb; pdb.set_trace()
  • python -m pdb myscript.py
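Normally you type pdb commands interactively at the (Pdb) prompt. So that the example can run unattended, the sketch below feeds a fixed list of commands to a pdb.Pdb instance instead. The function buggy_mean is a hypothetical example with a deliberate bug; it is not one of the tutorial scripts.

```python
import io
import pdb


def buggy_mean(values):
    total = 0
    for value in values:
        total += value
    return total / (len(values) - 1)  # deliberate bug: should divide by len(values)


# Commands that would normally be typed at the (Pdb) prompt.
commands = ["p values", "p len(values)", "continue"]
fake_stdin = io.StringIO("\n".join(commands) + "\n")
captured = io.StringIO()

# readrc=False ignores any personal .pdbrc file, so the session is reproducible.
debugger = pdb.Pdb(stdin=fake_stdin, stdout=captured, readrc=False)

# runcall stops at the first line of the function and then reads the commands.
result = debugger.runcall(buggy_mean, [1, 2, 3])

transcript = captured.getvalue()
print(transcript)
print("result:", result)
```

In an interactive session you would get the same stop by putting import pdb; pdb.set_trace() on the first line of the function, or by running the script with python -m pdb.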

5. Debugging with ipython and Jupyter

Just as ipython and Jupyter often give a nicer interactive Python environment than python, they also often make debugging easier.

  • ipython -i
  • import IPython; IPython.embed()
  • ipdb

To learn debugging from Jupyter, let's use the Errors and Debugging notebook from the excellent Python Data Science Handbook by Jake VanderPlas. It's freely available at https://github.com/jakevdp/PythonDataScienceHandbook and is generally a great resource to learn from, so I wanted to introduce it.

👉 Clone the PythonDataScienceHandbook git repository and start Jupyter.

cd <where you have your repositories>
git clone https://github.com/jakevdp/PythonDataScienceHandbook.git
cd PythonDataScienceHandbook
jupyter notebook notebooks/01.06-Errors-and-Debugging.ipynb

  • %debug
  • %run -d
  • %pdb

6. Debugging with PyCharm

PyCharm has a great visual debugger.

Since PyCharm is visual, it's hard to write a tutorial that you can follow offline just by reading the info here. Note that the PyCharm folks have a tutorial on debugging with many screenshots here and a 6 min video on YouTube here.

One way to launch PyCharm is to cd into the directory for this tutorial and use the command line launcher like this:

cd python-tutorials/debug
charm .

👉 Right-click on analysis.py and select "debug analysis".

Things to remember

  • Python is a very dynamic language
    • Very powerful
    • Easy to make mistakes
    • Easy to inspect and debug
  • Use pdb from Python and ipdb from ipython and Jupyter for debugging, or a visual debugger such as the one in PyCharm.
  • See the debugger commands with help or here.
  • From ipython / jupyter, the commands are %debug, %run -d and %pdb
  • Most people don't use a debugger often. Reading code, adding print calls or IPython.embed(), or just using ipython and the Jupyter notebook is often enough to see what's going on.

Going further

These are good resources to learn more: