Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

edgedb-python test fails with uvloop 0.15.0 #2222

Closed
fantix opened this issue Feb 12, 2021 · 9 comments
Closed

edgedb-python test fails with uvloop 0.15.0 #2222

fantix opened this issue Feb 12, 2021 · 9 comments
Assignees

Comments

@fantix
Copy link
Member

fantix commented Feb 12, 2021

  • EdgeDB Version: 1.0-alpha.8+dev.5432.g931252366.d20210212
  • OS Version: macOS 11.2.1 / Darwin 20.3.0

This is reproducible without edgedb/uvloop using a Cython script, which illustrates how edgedb/uvloop 0.15 run:

import contextvars


cdef class P:
    cdef object t

    def __init__(self, t):
        self.t = t

    def run(self):
        self.t.clear()
        t = self.t  # segfault: self.t is NULL


cdef class T:
    cdef:
        object _p, _run

    def __init__(self):
        self._p = P(self)
        self._run = self._p.run

    def clear(self):
        self._p = None
        self._run = None


cdef main(T tt):
    ctx = contextvars.copy_context()
    ctx.run(tt._run)


if __name__ == '__main__':
    t = T()
    main(t)
@fantix fantix self-assigned this Feb 12, 2021
@fantix
Copy link
Member Author

fantix commented Feb 12, 2021

Potential fix:

(patches uvloop)

cdef main(T tt):
    ctx = contextvars.copy_context()
    run = tt._run
    ctx.run(run)

or (patches edgedb)

    def run(self):
        t = self.t
        t.clear()

I think patching uvloop makes more sense - as ctx.run() is a C function that doesn't hold any reference to the method to run, so we should.

@elprans
Copy link
Member

elprans commented Feb 12, 2021

Is this a Cython bug then?

@fantix
Copy link
Member Author

fantix commented Feb 12, 2021

No, I don't think so - if we're calling tt._run() directly, Cython could INCREF the tt._run correctly; Cython doesn't INCREF the parameter in ctx.run(tt._run) which is also correct I believe. It's more like a question if we should patch CPython context.run() to INCREF the method before invoking, which I haven't thought through the pros and cons yet.

UPDATE: looking at some CPython implementations, e.g. list.sort(key=method), I think it is assumed that the caller should own the method, while list.sort() simply invokes the method without INCREF. I'll create a PR in uvloop to fix any context.run() calls then.

fantix added a commit to fantix/uvloop that referenced this issue Feb 13, 2021
Because `context.run()` doesn't hold reference to the callable, when
e.g. the protocol is written in Cython, the callbacks were not
guaranteed to hold the protocol reference. This PR fixes the issue by
explicitly add a reference before `context.run()` calls.

Refs edgedb/edgedb#2222
@fantix
Copy link
Member Author

fantix commented Feb 13, 2021

With more debugging, this looks more like a Cython bug to me now. I'll update with more details

@elprans
Copy link
Member

elprans commented Feb 13, 2021

@fantix, check upstream master branch/issues too. It is likely that they've fixed this already, just haven't released.

fantix added a commit to fantix/uvloop that referenced this issue Feb 15, 2021
Because `context.run()` doesn't hold reference to the callable, when
e.g. the protocol is written in Cython, the callbacks were not
guaranteed to hold the protocol reference. This PR fixes the issue by
explicitly add a reference before `context.run()` calls.

Refs edgedb/edgedb#2222
fantix added a commit to fantix/uvloop that referenced this issue Feb 15, 2021
Because `context.run()` doesn't hold reference to the callable, when
e.g. the protocol is written in Cython, the callbacks were not
guaranteed to hold the protocol reference. This PR fixes the issue by
explicitly add a reference before `context.run()` calls.

Refs edgedb/edgedb#2222
fantix added a commit to MagicStack/uvloop that referenced this issue Feb 15, 2021
Because `context.run()` doesn't hold reference to the callable, when
e.g. the protocol is written in Cython, the callbacks were not
guaranteed to hold the protocol reference. This PR fixes the issue by
explicitly add a reference before `context.run()` calls.

Refs edgedb/edgedb#2222
fantix added a commit to MagicStack/uvloop that referenced this issue Feb 15, 2021
Because `context.run()` doesn't hold reference to the callable, when
e.g. the protocol is written in Cython, the callbacks were not
guaranteed to hold the protocol reference. This PR fixes the issue by
explicitly add a reference before `context.run()` calls.

Refs edgedb/edgedb#2222
@fantix
Copy link
Member Author

fantix commented Feb 15, 2021

TL;DR: uvloop 0.15.1 should've fixed this issue on our end. I'll create an issue in Cython to continue the discussion there.

Below is an incremental list of things I tried so far:

  1. ❌ Latest Cython from the master branch is also having the same issue. Searching the recent Cython issues didn't give me very relevant information.
  2. ✅ After changing the run() method signature to def run(self, *args), the segfault is gone.
  3. ❌ But changing to def run(self, x, y) doesn't fix the segfault.
  4. ✅ At last, I changed ctx.run(tt._run) to functools.reduce(tt._run, [1, 2]), and the segfault is gone again.

Details:

TBD

@1st1
Copy link
Member

1st1 commented Oct 22, 2021

I think this has been fixed and can be closed.

@1st1 1st1 closed this as completed Oct 22, 2021
@elprans
Copy link
Member

elprans commented Aug 13, 2022

Seems like asyncio in Python 3.11 has a similar issue? https://github.com/MagicStack/uvloop/runs/7822201198?check_suite_focus=true#step:7:768

@fantix
Copy link
Member Author

fantix commented Aug 24, 2022

OK so there are 2 issues here.

  1. Cython Fast C-Call Issue

In Python 3.9 or lower, Cython uses a fast C-call method to call contextvars.Context.run(), which does not INCREF its arguments, while in 3.10 and above, the regular Cython __Pyx__PyObject_CallOneArg uses a tuple to store the args during the call.

This difference led to the explained issue at the very beginning in Python 3.9, and the issue was fixed in uvloop by INCREFing the args manually. Still, this is an issue in Cython anyways I think (?).

  1. Python 3.11 new method call

The test in uvloop however is simulating the Cython issue with a weakref, a simplified version looks like this:

import weakref

a = None

class A:
    def __init__(self):
        self.v = 42

    def f(self):
        self = weakref.ref(self)
        global a
        a = None
        print(self().v)

a = A()
a.f()

This passes in Python 3.10 and below but fails on 3.11 because of the new Python call implementation I think, so it is not an issue in 3.11 asyncio because the self reference would still be there in asyncio code during the function call.

For now I will skip this uvloop test in 3.10 and above, and seek a fix in Cython for 3.9 and lower.

Olaf1022 added a commit to Olaf1022/Ultra-fast-asyncio that referenced this issue Aug 13, 2023
Because `context.run()` doesn't hold reference to the callable, when
e.g. the protocol is written in Cython, the callbacks were not
guaranteed to hold the protocol reference. This PR fixes the issue by
explicitly add a reference before `context.run()` calls.

Refs edgedb/edgedb#2222
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants