gdb special command to print tensors (#54339)
Summary:
This is something which I wrote because it was useful during my debugging sessions, but I think it might be generally useful to other people as well so I took the liberty of proposing an official `pytorch-gdb` extension.

`pytorch-gdb` is a gdb script written in Python. Currently, it contains only one command: `torch-tensor-repr`, which prints a human-readable repr of an `at::Tensor` object. Example:
```
Breakpoint 1, at::native::neg (self=...) at [...]/pytorch/aten/src/ATen/native/UnaryOps.cpp:520
520     Tensor neg(const Tensor& self) { return unary_op_impl(self, at::neg_out); }
(gdb) # the default repr of 'self' is not very useful
(gdb) p self
$1 = (const at::Tensor &) 0x7ffff72ed780: {impl_ = {target_ = 0x5555559df6e0}}
(gdb) torch-tensor-repr self
Python-level repr of self:
tensor([1., 2., 3., 4.], dtype=torch.float64)
```

The idea is that, with an official place to put these tools, `pytorch-gdb` will slowly grow other useful features and make the pytorch debugging experience nicer and faster.

Pull Request resolved: #54339

Reviewed By: bdhirsh

Differential Revision: D27253674

Pulled By: ezyang

fbshipit-source-id: dba219e126cc2fe66b2d26740f3a8e3b886e56f5
antocuni authored and facebook-github-bot committed Mar 23, 2021
1 parent 583c4bf commit 21a9a93
Showing 4 changed files with 185 additions and 0 deletions.
14 changes: 14 additions & 0 deletions .gdbinit
@@ -0,0 +1,14 @@
# automatically load the pytorch-gdb extension.
#
# gdb automatically tries to load this file whenever it is executed from the
# root of the pytorch repo, but by default it is not allowed to do so for
# security reasons. If you want to use pytorch-gdb, please add the following
# line to your ~/.gdbinit (i.e., the .gdbinit file which is in your home
# directory, NOT this file):
# add-auto-load-safe-path /path/to/pytorch/.gdbinit
#
# Alternatively, you can manually load the pytorch-gdb commands into your
# existing gdb session by doing the following:
# (gdb) source /path/to/pytorch/tools/gdb/pytorch-gdb.py

source tools/gdb/pytorch-gdb.py
63 changes: 63 additions & 0 deletions CONTRIBUTING.md
@@ -26,6 +26,7 @@
- [Use CCache](#use-ccache)
- [Use a faster linker](#use-a-faster-linker)
- [C++ frontend development tips](#c-frontend-development-tips)
- [GDB integration](#gdb-integration)
- [CUDA development tips](#cuda-development-tips)
- [Windows development tips](#windows-development-tips)
- [Known MSVC (and MSVC with NVCC) bugs](#known-msvc-and-msvc-with-nvcc-bugs)
@@ -735,6 +736,68 @@ framework, which you can read up about to learn how to configure the test runner
submitting a new feature, we care very much that you write appropriate tests.
Please follow the lead of the other tests to see how to write a new test case.
### GDB integration
If you are debugging pytorch inside GDB, you might be interested in
[pytorch-gdb](tools/gdb/pytorch-gdb.py). This script introduces some
pytorch-specific commands which you can use from the GDB prompt. In
particular, `torch-tensor-repr` prints a human-readable repr of an `at::Tensor`
object. Example of usage:
```
$ gdb python
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04) 9.2
[...]
(gdb) # insert a breakpoint when we call .neg()
(gdb) break at::native:neg
No source file named at::native.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (at::native:neg) pending.
(gdb) run
[...]
>>> import torch
>>> t = torch.tensor([1, 2, 3, 4], dtype=torch.float64)
>>> t
tensor([1., 2., 3., 4.], dtype=torch.float64)
>>> t.neg()
Breakpoint 1, at::native::neg (self=...) at [...]/pytorch/aten/src/ATen/native/UnaryOps.cpp:520
520 Tensor neg(const Tensor& self) { return unary_op_impl(self, at::neg_out); }
(gdb) # the default repr of 'self' is not very useful
(gdb) p self
$1 = (const at::Tensor &) @0x7ffff72ed780: {impl_ = {target_ = 0x5555559df6e0}}
(gdb) torch-tensor-repr self
Python-level repr of self:
tensor([1., 2., 3., 4.], dtype=torch.float64)
```
GDB tries to automatically load `pytorch-gdb` thanks to the
[.gdbinit](.gdbinit) at the root of the pytorch repo. However, auto-loading is disabled by default for security reasons:
```
$ gdb
warning: File "/path/to/pytorch/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /path/to/pytorch/.gdbinit
line to your configuration file "/home/YOUR-USERNAME/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/home/YOUR-USERNAME/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
(gdb)
```
As gdb itself suggests, the best way to enable auto-loading of `pytorch-gdb`
is to add the following line to your `~/.gdbinit` (i.e., the `.gdbinit` file
which is in your home directory, **not** `/path/to/pytorch/.gdbinit`):
```
add-auto-load-safe-path /path/to/pytorch/.gdbinit
```
## CUDA development tips
If you are working on the CUDA code, here are some useful CUDA debugging tips:
52 changes: 52 additions & 0 deletions tools/gdb/pytorch-gdb.py
@@ -0,0 +1,52 @@
import gdb
import textwrap

class DisableBreakpoints:
    """
    Context-manager to temporarily disable all gdb breakpoints, useful if
    there is a risk to hit one during the evaluation of one of our custom
    commands
    """

    def __enter__(self):
        self.disabled_breakpoints = []
        for b in gdb.breakpoints():
            if b.enabled:
                b.enabled = False
                self.disabled_breakpoints.append(b)

    def __exit__(self, etype, evalue, tb):
        for b in self.disabled_breakpoints:
            b.enabled = True

class TensorRepr(gdb.Command):
    """
    Print a human readable representation of the given at::Tensor.
    Usage: torch-tensor-repr EXP

    at::Tensor instances do not have a C++ implementation of a repr method: in
    pytorch, this is done by pure-Python code. As such, torch-tensor-repr
    internally creates a Python wrapper for the given tensor and calls repr()
    on it.
    """
    __doc__ = textwrap.dedent(__doc__).strip()

    def __init__(self):
        gdb.Command.__init__(self, 'torch-tensor-repr',
                             gdb.COMMAND_USER, gdb.COMPLETE_EXPRESSION)

    def invoke(self, args, from_tty):
        args = gdb.string_to_argv(args)
        if len(args) != 1:
            print('Usage: torch-tensor-repr EXP')
            return
        name = args[0]
        with DisableBreakpoints():
            res = gdb.parse_and_eval('torch::gdb::tensor_repr(%s)' % name)
            print('Python-level repr of %s:' % name)
            print(res.string())
            # torch::gdb::tensor_repr returns a malloc()ed buffer; free it now
            gdb.parse_and_eval('(void)free(%s)' % int(res))

TensorRepr()
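The `DisableBreakpoints` context manager relies only on generic save-and-restore logic, so the pattern can be demonstrated outside of a gdb session. The sketch below is a standalone illustration: `FakeBreakpoint` and `DisableAll` are hypothetical stand-ins for `gdb.Breakpoint` and the class above, since the real `gdb` module only exists inside a gdb process.

```python
class FakeBreakpoint:
    """Hypothetical stand-in for gdb.Breakpoint: just an 'enabled' flag."""
    def __init__(self, enabled=True):
        self.enabled = enabled

class DisableAll:
    """Temporarily disable every enabled breakpoint, restoring them on exit."""
    def __init__(self, breakpoints):
        self.breakpoints = breakpoints

    def __enter__(self):
        # Record only the breakpoints that were enabled, so originally
        # disabled ones are not spuriously re-enabled on exit.
        self.disabled = [b for b in self.breakpoints if b.enabled]
        for b in self.disabled:
            b.enabled = False

    def __exit__(self, etype, evalue, tb):
        for b in self.disabled:
            b.enabled = True

bps = [FakeBreakpoint(True), FakeBreakpoint(False), FakeBreakpoint(True)]
with DisableAll(bps):
    assert not any(b.enabled for b in bps)           # all off inside the block
assert [b.enabled for b in bps] == [True, False, True]  # originals restored
```

Note that recording only the previously enabled breakpoints is what makes the restore step safe: a breakpoint the user had disabled stays disabled.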

56 changes: 56 additions & 0 deletions torch/csrc/utils.cpp
@@ -252,3 +252,59 @@ void THPPointer<THPStorage>::free() {
}

template class THPPointer<THPStorage>;

namespace torch { namespace gdb {
/* ~~~ misc debugging utilities ~~~
*
* torch::gdb::* functions are NOT meant to be called by general pytorch code,
* but only from within a gdb session. As such, utils.h does not contain any
* declaration for those.
*/

// This is a helper needed by the torch-tensor-repr gdb command.
// Return a human-readable representation of the given Tensor. The resulting
// string is stored into a malloc()ed buffer. The caller is responsible for
// free()ing it. We use malloc() instead of new[] because it's much easier to
// call free than delete[] from within gdb.
// Currently the code for computing the repr of a tensor is written in Python,
// so we need to wrap the Tensor into a Python object first.
char *tensor_repr(at::Tensor tensor) {
  PyGILState_STATE gil = PyGILState_Ensure();
  PyObject *pytensor = NULL;
  PyObject *repr = NULL;
  Py_ssize_t bufsize;
  const char *buf = NULL;
  char *result = NULL;

  pytensor = THPVariable_Wrap(at::Tensor(tensor));
  if (!pytensor)
    goto error;
  repr = PyObject_Repr(pytensor);
  if (!repr)
    goto error;
  buf = PyUnicode_AsUTF8AndSize(repr, &bufsize);
  if (!buf)
    goto error;
  result = static_cast<char*>(malloc(bufsize + 1)); // account for the trailing \0
  if (!result) {
    fprintf(stderr, "cannot allocate memory for the result\n");
    goto error;
  }
  strcpy(result, buf);
  Py_XDECREF(pytensor);
  Py_XDECREF(repr);
  PyGILState_Release(gil);
  return result;

error:
  fprintf(stderr, "torch::gdb::tensor_repr: unexpected error\n");
  if (PyErr_Occurred())
    PyErr_Print();
  Py_XDECREF(pytensor);
  Py_XDECREF(repr);
  free(result);
  PyGILState_Release(gil);
  return NULL;
}

}} // namespace torch::gdb
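The ownership contract here — `tensor_repr` returns a `malloc()`ed buffer and the gdb command later evaluates `(void)free(...)` on it — can be exercised from plain Python with `ctypes`. The sketch below is illustrative, not pytorch code: `make_repr` is a hypothetical stand-in for `torch::gdb::tensor_repr`, and it uses the process's own libc `malloc`/`free` (so it assumes a Unix-like system).

```python
import ctypes

# Load the symbols of the current process; on Unix this exposes libc's
# malloc/free, the same allocator pair the gdb command relies on.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.argtypes = [ctypes.c_void_p]

def make_repr(text):
    # Mimic tensor_repr: copy a string into a malloc()ed, NUL-terminated
    # buffer and hand ownership to the caller.
    data = text.encode('utf-8')
    buf = libc.malloc(len(data) + 1)   # +1 for the trailing \0
    ctypes.memmove(buf, data, len(data))
    ctypes.memset(buf + len(data), 0, 1)
    return buf

ptr = make_repr('tensor([1., 2., 3., 4.], dtype=torch.float64)')
print(ctypes.string_at(ptr).decode('utf-8'))  # read the repr first...
libc.free(ptr)                                # ...then the caller free()s it
```

This mirrors what `TensorRepr.invoke` does: read the string out of the buffer with `res.string()`, then release it with `(void)free(...)`, keeping allocation and deallocation on the same libc heap.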
