Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the debug mode #142

Merged
merged 70 commits into from
Feb 4, 2021
Merged

Add the debug mode #142

merged 70 commits into from
Feb 4, 2021

Conversation

antocuni
Copy link
Collaborator

@antocuni antocuni commented Dec 16, 2020

This PR introduces the debug mode. It contains many different pieces which are all connected together, so let me try to summarize:

  • the debug mode is implemented as a wrapper around a generic HPyContext: handles are represented by struct DebugHandle which at the moment contains just a reference to the underlying h and a couple of extra fields. In the future, we can put more stuff in it, for example a C stack trace to remember who created the handle

  • I have killed hpy/universal/src/handles.c. hpy.universal now no longer uses small integers to represent handles, but it uses directly PyObject*, although with a bit of masking/unmasking (see the comments in the code). The original reason to represent handles as indexes into an array was to support debug mode, but now that the bug mode is a fully wrapper it's no longer needed. Moreover, it makes the non-debug universal mode slightly faster and fixes hpy.universal leak handles and it's unnecessary slow #155

  • the code which implements the debug wrapper is in hpy/debug/src/debug_ctx_*.c: these files are compiled together with hpy.universal for simplicity, but in theory are fully generic. I expect that e.g. PyPy should be able to vendor these files as they are

  • to load a module in debug mode, you use hpy.universal.load(..., debug=True). I am perfectly aware that this is not user friendly nor the API that we want at the end. The idea is that PR introduces a low-level interface, and then later we will build a high-level, user-friendly interface and/or importing mechanism on top of it. For example,I can imagine having an env variable HPY_DEBUG=foo,bar to signal that we want to use the debug mode for those two modules, or maybe a configuration file, etc.

  • hpy/debug/src/_debugmod.c implements the app-level interface to the debug mode; in particular, it exports an applevel DebugHandle type a functions like get_open_handles(). The nice thing is that _debugmod.c is implemented in HPy itself :). For simplicity it is also bundled and compiled together with hpy.universal, and it is exposed as hpy.universal._debug. Again, I expect it to work unmodified on PyPy.

  • hpy/debug/*.py: this is a higher level API implemented on top of _debug. In particular, it implements the context manager LeakDetector.

  • all tests are now run three times: [cpython], [universal] and [debug].

  • I addedd ctx->h_NotImplemented (because I needed it) and ctx->h_Ellipsis (because it was the only remaining one)

There are tons of stuff to make the debug mode better, but I think it's best to do them in subsequent PRs. Incomplete list of next steps:

  1. add a way to check whether we are using a closed handle
  2. implement additional sanity check for many functions
  3. make it possible to enable the debug mode without using hpy.universal.load and integrate this mechanism with the stub .py files
  4. improve the pytest fixtures so that it automatically checks for leaks when we execute a [debug] test, and make sure that all the existing tests pass

- name: mostly used only to make it easier to distinguish the context inside
  gdb, will be useful when we introduce the debug ctx

- _private: to make it possible to store additional data on the ctx,
  implementation-defined
In particular, hpy.debug._ctx is a C module which wraps an existing
HPyContext. So far, it just prints a string when calling HPy_Add and delegates
everything else to the original context.
…ill need to distinguish it from g_debug_ctx later
… the name and the path as two normal arguments
…ommit 22a05e8)

- we need to have different code path to load universal and cpython modules,
  because soon we will introduce loading universal modules in debug mode.

- The idea of having a single function to load both universal and cpython
  modules was nice, but it ended up being too complicated and fragile; in
  particular, we had to monkey-patch sys.path, undo the insertion in
  sys.modules, etc.: the original idea was that tests should interact with the
  importing system as little as possible, to ensure isolation.

- Now we have some very minor code duplication, but the logic in each function
  is way simpler.
It is now possible to load an universal module in debug mode: in that case
hpy.universal imports hpy.debug._ctx and asks it to wrap the global
context. At the moment the only difference is that the debug ctx has a
different ctx->name, but this is enough to make test_debug.test_ctx_name
passing!

Moreover, the hpy_abi pytest fixture now can 'cpython', 'universal' or
'debug', which means that all tests will be automatically tested also in debug
mode.  In theory, we could cache the result of the compilation since
'universal' and 'debug' compiles the very same .hpy.so file, but on my machine
the compilation itself seems very fast, so I'm not sure it is worth the
complexity. We might want to revisit it in the future.
The original approach was to have a separate hpy.debug module which wraps a
generic ctx into a debug ctx; hpy.debug could in theory be distributed
independently than hpy.universal and be used both by CPython, PyPy,
GraalPython, etc.

The theory was nice but in practice it made the code and file structure
unnecessarily complex. The new approach is:

- keep hpy/debuc/src/*.c: this is still a generic implementation of the debug
  mode which wraps a generic context

- bundle the debug code inside hpy.universal

- PyPy/GraalPython/etc. will be able to vendor hpy/debug/src/*.c and compile
  it together with their own implementation of HPy
…en we run make debug; this also catches a type error in an assert which was compiled away
hodgestar and others added 13 commits December 21, 2020 15:48
…Currently the wrappers simply forwards everything to the original ctx
…ce to the debug mode.

The module is written in HPy itself so that it will be easily reusable by
other implementations if they want. However, differently than a "standard" HPy
extension it is not loaded using dlopen. Instead, it is built together with
hpy.universal and manually "imported" by it at startup.
…dify autogen accordingly. See the comments inside the code for more info
@antocuni antocuni changed the title WIP: Add the debug mode Add the debug mode Jan 29, 2021
@@ -15,7 +19,7 @@ cppcheck: cppcheck-build-dir

infer:
python3 setup.py build_ext -if -U NDEBUG | compiledb
@infer --fail-on-issue --compilation-database compile_commands.json
@infer --fail-on-issue --compilation-database compile_commands.json --report-blacklist-path-regex "hpy/debug/src/debug_ctx.c"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we perhaps leave a comment on why debug_ctx.c had to be blacklisted?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea. The full explanation is in the log message of cd8cd6e, I added a comment which redirects to it

@hodgestar
Copy link
Contributor

Question for when we implement HPY_DEBUG=1 or similar: We should probably make sure not to load hpy.universal._debugmod with a debug context, and maybe other third party libraries would like the option to not be loaded with debug either?

HPyDef_METH(new_generation, "new_generation", new_generation_impl, HPyFunc_NOARGS)
static UHPy new_generation_impl(HPyContext uctx, UHPy self)
{
HPyContext dctx = hpy_debug_get_ctx(uctx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Obtaining a new debug ctx from the uctx requires that the uctx be the same in this module as in the module being debugged, yes? Which means there essentially needs to be only one ctx per interpreter? This is already true in the CPython and PyPy (I think) implementations, but it's not currently required by HPy.

I'm not quite sure how to remove this constraint right now -- the ctx is fairly invisible from Python code, and that likely needs to stay that way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to explain more what are the implicit and explicit assumptions in commit 6bad6c3. If it's still unclear, please comment/review/reword directly on PR #164

//DebugHandle *closed_handles; // linked list
} HPyDebugInfo;

static inline HPyDebugInfo *get_info(HPyContext dctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this function should be called DHPy_get_info?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe, but note that it's a static inline in a private header, so it's not exposed anywhere else. I preferred to keep it short because it's used a bit everywhere

Copy link
Contributor

@fangerer fangerer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thank you!

DebugHandle *handle)
{
DebugHandleObject *dhobj;
UHPy u_result = HPy_New(uctx, u_DebugHandleType, &dhobj);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to put the DebugHandle type on the context like we do for other build-in types like List?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is also a good illustration of the slight hopping one has to do at the moment since global type objects are gone -- and the hopping is actually not too bad in simple cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to put the DebugHandle type on the context like we do for other build-in types like List?

no, because it's not a builtin. Why should it be different than, say, ndarray?

I guess this is also a good illustration of the slight hopping one has to do at the moment since global type objects are gone -- and the hopping is actually not too bad in simple cases.

exactly, that's the point. We need a way to have per-module "globals" at the C level, which we still don't have :(

*/

typedef struct {
HPyObject_HEAD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this a pure type once the cast support branch lands?

Copy link
Collaborator Author

@antocuni antocuni Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes!
Not only we "could", but we should. HPyObject_HEAD will be killed anyway, won't it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. HPyObject_HEAD is already gone in the cast support branch.

UHPy uh_args = HPy_NULL;
UHPy uh_result = HPy_NULL;

uh_fmt = HPyUnicode_FromString(uctx, "<DebugHandle 0x%x for %r>");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do with PyUnicode_FromFormat here.

Copy link
Collaborator Author

@antocuni antocuni Feb 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, although we don't have HPyUnicode_FromFormat here. And we don't want to use the Py_* version because this is supposed to be a pure-hpy module, remember it will be used also by pypy and hopefully graalpython.
I added an XXX in e2b548a

info->uctx = uctx;
info->current_generation = 0;
info->open_handles = NULL;
//info->closed_handles = NULL;
Copy link
Contributor

@hodgestar hodgestar Feb 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a plan to use closed_handles later?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. It will be used to detect when you try to use a handle which has already been closed

_HPyFunc_args_VARARGS *a = (_HPyFunc_args_VARARGS*)args;
DHPy dh_self = _py2dh(dctx, a->self);
Py_ssize_t nargs = PyTuple_GET_SIZE(a->args);
DHPy dh_args[nargs * sizeof(DHPy)];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a comment here about removing the use VLAs later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it doesn't matter in this file?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it matters, because VLAs seems to be unsupported by VC++.
Added comments in 3756927

signature = toC(node)
rettype = toC(node.type.type)
#
def get_params():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason get_params is a nested function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just liked it that way. It's basically an alternative way to group a bit of logic together instead of putting a comment compute the params.
And putting it outside seems overkill

// XXX this code is basically the same as found in cpython/hpy.h. We
// should probably share and/or autogenerate both versions
/* Constants */
ctx->h_None = _py2h(Py_None);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand why the ctx initialization is here inside hpymodule. Could you explain a bit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before this PR the fields were statically initialized to things like CONSTANT_H_NONE, CONSTANT_H_TRUE etc, which were small integers defined in an enum. And the old handles.c made sure to allocate the corresponding handle at the "right place". So this code is the equivalent of what was inside the old allocate_more_handles.

It would be nice to statically initialize things like this:

struct _HPyContext_s g_universal_ctx = {
    ...,
    h_None = _py2h(Py_None),

but I don't think it is legal to call _py2h in a static initializer

Copy link
Contributor

@hodgestar hodgestar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Woot.

I left a few comments, although they were largely me coming to grips with the code. Maybe they will result in some small things we want to change now, or some issues we want to write for the future, but I'm also happy for this PR to be merged and for us to work on things in smaller PRs from here.

@antocuni antocuni merged commit b6ada82 into master Feb 4, 2021
@antocuni
Copy link
Collaborator Author

antocuni commented Feb 9, 2021

This PR is already merged, so I am addressing the comments in #164.

Question for when we implement HPY_DEBUG=1 or similar: We should probably make sure not to load hpy.universal._debugmod with a debug context, and maybe other third party libraries would like the option to not be loaded with debug either?

remind that _debugmod is "statically loaded" by hpy.universal itself, so it is its responsibility to ensure that it's initialized with the "correct" context. But add a sanity check looks like a good idea, I added one in commit 8c492e8.

@antocuni antocuni deleted the antocuni/debug-mode branch February 9, 2021 08:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hpy.universal leak handles and it's unnecessary slow
3 participants