Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hpy.universal leak handles and it's unnecessary slow #155

Closed
antocuni opened this issue Jan 15, 2021 · 2 comments · Fixed by #142
Closed

hpy.universal leak handles and it's unnecessary slow #155

antocuni opened this issue Jan 15, 2021 · 2 comments · Fixed by #142

Comments

@antocuni
Copy link
Collaborator

antocuni commented Jan 15, 2021

The title mentions two different problems but I am grouping them in the same issue since I think they can be solved together.
I am already working on a fix in PR #142 but I thought it was useful to report my findings here, for the future.

First problem: consider ctx_CallRealFunctionFromTrampoline:

ctx_CallRealFunctionFromTrampoline(HPyContext ctx, HPyFunc_Signature sig,
void *func, void *args)
{
switch (sig) {
case HPyFunc_NOARGS: {
HPyFunc_noargs f = (HPyFunc_noargs)func;
_HPyFunc_args_NOARGS *a = (_HPyFunc_args_NOARGS*)args;
a->result = _h2py(f(ctx, _py2h(a->self)));
return;
}

in CPython-ABI mode this is not a problem because _py2h and _h2py are just no-op casts, but in in universal mode, _py2h allocates new handles which are never freed:

HPy
_py2h(PyObject *obj)
{
if (obj == NULL) {
// Return the existing copy of HPy_NULL and don't create a new
// handle.
return HPy_NULL;
}
if (h_free_list < 0) {
allocate_more_handles();
}
Py_ssize_t i = h_free_list;
h_free_list = ((Py_ssize_t)all_handles[i]) >> 1;
all_handles[i] = obj;
return (HPy){i};
}

The proper solution is something like this:

    switch (sig) {
    case HPyFunc_NOARGS: {
        HPyFunc_noargs f = (HPyFunc_noargs)func;
        _HPyFunc_args_NOARGS *a = (_HPyFunc_args_NOARGS*)args;
        HPy h0 = _py2h(a->self);
        a->result = _h2py(f(ctx, h0));
        _hclose_nodecref(h0);
        return;
    }

However, there is a better solution: the second problem is that the current implementation of handles is unnecessarily slow. I tried the ujson and piconumpy benchmarks:

  • ujson: the universal ABI is ~15% slower than the CPython ABI
  • piconumpy: the universal ABI is ~35% slower than the CPython ABI

In PR #142 I am experimenting with a different approach for hpy.universal. In particular, _py2h and _h2py are implemented like this:

// The main reason for +1/-1 is to make sure that if people casts HPy to
// PyObject* directly, things explode.

static inline HPy _py2h(PyObject *obj) {
    if (obj == NULL)
        return HPy_NULL;
    return (HPy){(HPy_ssize_t)(obj + 1)};
}

static inline PyObject *_h2py(HPy h) {
    if HPy_IsNull(h)
        return NULL;
    return (PyObject *)(h._i - 1);
}

So, they are basically no-op casts again, and the benchmarks are much faster:

  • ujson: the universal ABI is ~0.9% slower than the CPython ABI
  • piconumpy: the universal ABI is ~6% slower than the CPython ABI

Historical note: why do we represent hpy.universal handles as indexes in a list? The original ideas was to support the debug mode, so that we could easily store extra debugging infos for each handles.

The idea that I am trying in PR #142 is different, i.e. to wrap a generic universal ctx into a debug ctx: so, debug handles are wrapper around generic opaque universal handles and the extra infos can be attached directly on the wrappers. Moreover, by doing that we pay the overhead of "heavy" handles only for the modules for which the debug mode is enabled.

EDIT: fixed the PR number

@fangerer
Copy link
Contributor

Could it be that PR #43 is the wrong reference?

@antocuni
Copy link
Collaborator Author

Could it be that PR #43 is the wrong reference?

yes, sorry! I fixed it with the correct link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants