Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please provide a way to disable the warning printed if the signal module's wakeup fd overflows #74236

Closed
njsmith opened this issue Apr 12, 2017 · 13 comments
Labels
3.7 interpreter-core type-feature

Comments

@njsmith
Copy link
Contributor

@njsmith njsmith commented Apr 12, 2017

BPO 30050
Nosy @pitrou, @vstinner, @njsmith, @serhiy-storchaka, @1st1
PRs
  • #4792
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-12-18.04:13:58.271>
    created_at = <Date 2017-04-12.08:54:33.419>
    labels = ['interpreter-core', 'type-feature', '3.7']
    title = "Please provide a way to disable the warning printed if the signal module's wakeup fd overflows"
    updated_at = <Date 2017-12-18.04:13:58.255>
    user = 'https://github.com/njsmith'

    bugs.python.org fields:

    activity = <Date 2017-12-18.04:13:58.255>
    actor = 'yselivanov'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-12-18.04:13:58.271>
    closer = 'yselivanov'
    components = ['Interpreter Core']
    creation = <Date 2017-04-12.08:54:33.419>
    creator = 'njs'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 30050
    keywords = ['patch']
    message_count = 13.0
    messages = ['291532', '291533', '291536', '295824', '308064', '308065', '308066', '308067', '308069', '308089', '308099', '308511', '308512']
    nosy_count = 5.0
    nosy_names = ['pitrou', 'vstinner', 'njs', 'serhiy.storchaka', 'yselivanov']
    pr_nums = ['4792']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue30050'
    versions = ['Python 3.7']

    @njsmith
    Copy link
    Contributor Author

    @njsmith njsmith commented Apr 12, 2017

    When a wakeup fd is registered via signal.set_wakeup_fd, then the C level signal handler writes a byte to the wakeup fd on each signal received. If this write fails, then it prints an error message to the console.

    Some projects use the wakeup fd as a way to see which signals have occurred (asyncio, curio). Others use it only as a toggle for "a wake up is needed", and transmit the actual data out-of-line (twisted, tornado, trio – I guess it has something to do with the letter "t").

    One way that writing to the wakeup fd can fail is if the pipe or socket's buffer is already full, in which case we get EWOULDBLOCK or WSAEWOULDBLOCK. For asyncio/curio, this is a problem: it indicates a lost signal! Printing to the console isn't a great solution, but it's better than letting the error pass silently.

    For twisted/tornado/trio, this is a normal and expected thing – the semantics we want are that after a signal is received then the fd will be readable, and if its buffer is full then it's certainly readable! So for them, EWOULDBLOCK/WSAEWOULDBLOCK are *success* conditions. Yet currently, the signal module insists on printing a scary message to the console whenever we succeed in this way.

    It would be nice if there were a way to disable this; perhaps something like: signal.set_wakeup_fd(fd, warn_on_full_buffer=False)

    This is particularly annoying for trio, because I try to minimize the size of the wakeup fd's send buffer to avoid wasting non-swappable kernel memory on what's essentially an overgrown bool. This ends up meaning that on Linux the buffer is 6 bytes, and on MacOS it's 1 byte. So currently I don't use the wakeup fd on Linux/MacOS, which is *mostly* OK but it would be better if we could use it. Trio bug with a few more details: python-trio/trio#109

    @njsmith njsmith added the 3.7 label Apr 12, 2017
    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Apr 12, 2017

    One way that writing to the wakeup fd can fail is if the pipe or socket's buffer is already full, in which case we get EWOULDBLOCK or WSAEWOULDBLOCK.

    Is it a theorical question or did you notice the error in the wild? I mean: without sending a signal in a loop.

    Right now, we write the signal number into the pipe each time a signal is received. Do you need to get N bytes if the signal is received N times? Maybe we can use a flag and only write the byte once by signal number?

    Printing to the console isn't a great solution, but it's better than letting the error pass silently.

    If the pipe becomes full, you loose signals: it's an hard error. I decided to log an error into fd 2 because in a signal handler, the number of allowed functions is very limited.

    I suggest to have a flag to remind if the pipe overflowed, and in that case: schedule a Python callback. The callback can be a new parameter to set_wakeup_fd()?

    A pipe seems limited to 64 KB, limit hardcoded in Linux kernel code. Is it the case if we use a socket pair, as we already do on Windows? Or is it possible to increase the socket buffer size?

    This is particularly annoying for trio, because I try to minimize the size of the wakeup fd's send buffer to avoid wasting non-swappable kernel memory on what's essentially an overgrown bool.

    asyncio uses wakeup fd to get signal numbers: it's not only a bool. See the issue bpo-21645 for the rationale of using the FD to get signal numbers in asyncio.

    It was a bool on Python 2 which only writes NUL bytes into the FD.

    Related question: does asyncio support signals on Windows? I added support for set_wakeup_fd() to Windows for asyncio, but then I forgot this TODO item :-)

    @njsmith
    Copy link
    Contributor Author

    @njsmith njsmith commented Apr 12, 2017

    I haven't noticed the error in the wild because I don't use set_wakeup_fd on Linux/MacOS, because of this issue :-). But on MacOS literally all it would take is to receive two signals in quick succession, or to receive one signal at a moment when someone has recently scheduled a callback from a thread (which also writes to the wakeup fd). In principle it can also happen on Windows, but on Windows apparently the smallest buffer you get for a socketpair is like 500 kilobytes, so it doesn't come up except in rather artificial situations.

    And yes, a possible workaround is to use an artificially large buffer on Linux/MacOS as well. I'm not saying this is a huge showstopper. But it's silly for downstream libraries to carry around workarounds for issues in CPython when we could just fix CPython.

    If the pipe becomes full, you loose signals: it's an hard error. I decided to log an error into fd 2 because in a signal handler, the number of allowed functions is very limited.

    This makes sense for asyncio, but not for most other async libraries in Python, because they *don't* lose signals when the pipe becomes full. That's why I'm suggesting that the signal module should have an option to let users decide whether they want the warning or not.

    (It might also make sense to switch asyncio to stop using the fd as a communications channel. Theoretically this would be the ideal solution, but I understand why it's not a priority.)

    Related question: does asyncio support signals on Windows? I added support for set_wakeup_fd() to Windows for asyncio, but then I forgot this TODO item :-)

    No, asyncio doesn't use set_wakeup_fd on Windows, so code like 'loop.run_until_complete(asyncio.sleep(99999))' can't be interrupted. I'd appreciate if you could wait a few days to fix this though, because the blog post that I'm almost finished writing about signal handling currently uses this as an example of how trio is more awesome than asyncio ;-).

    (This is a joke. Well, I mean, mostly. My draft does literally have that sleep example in it, but I won't be grumpy if you fix it first. Or at least, not *very* grumpy :-).)

    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Jun 12, 2017

    It would be nice if there were a way to disable this; perhaps something like: signal.set_wakeup_fd(fd, warn_on_full_buffer=False)

    That's a reasonable idea. Nathaniel, would you like to submit a PR for this?

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Dec 11, 2017

    I'm now well acquainted with with the signal and asyncio modules, but is this the best solution? Wouldn't be better to specify a custom handler? If this is possible.

    @njsmith
    Copy link
    Contributor Author

    @njsmith njsmith commented Dec 11, 2017

    Serhiy: I don't know what "specify a custom handler" means in this context. Can you elaborate? The fd buffer overflow happens in a very delicate context where we definitely cannot call python code or even safely touch PyObject* variables.

    @pitrou
    Copy link
    Member

    @pitrou pitrou commented Dec 11, 2017

    A custom handler would be overkill IMO. set_wakeup_fd() is really useful for a small category of library (most notably event loop frameworks).

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Dec 11, 2017

    I meant a callback that will be called after writing a byte to the wakeup fd is failed. By default it would write a warning to stderr, but the user could specify other callback for silencing a warning or for using other way for logging it. If this is impossible or overkill I have no other questions.

    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Dec 11, 2017

    As I wrote, I never saw this warning, even while debugging signal issues in
    asyncio on FreeBSD. I don't think that this is an use case for a callback.
    Nathaniel use case is just to ignore the warning, I consider that it's a
    reasonable compromise for the issue.

    If someone wants a more advanced signal handler, it's still possible to use
    register a custom signal handler at the C level.

    @njsmith
    Copy link
    Contributor Author

    @njsmith njsmith commented Dec 12, 2017

    Yeah, I agree with Antoine and Victor that a callback would be overkill, and it would be extremely difficult to implement since at the point where the error occurs, we don't have and can't take the GIL.

    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Dec 12, 2017

    it would be extremely difficult to implement since at the point where the error occurs, we don't have and can't take the GIL.

    The signal handler uses Py_AddPendingCall() which calls a C function "later", but it can be anytime, between two Python instruction. If the callback fails, it can be annoying to get an error "anywhere". It can be even worse if the callback only fails if we are calling a specific function.

    Handling signals is very tricky, it's hard to "get it right".

    For example, Py_AddPendingCall() has an hardcoded queue of 32 callbacks. If the queue is full... the callback is lost...

    Py_AddPendingCall() also tries to acquire a lock... from a signal handler... Probably the worst idea... But that's another idea :-)

        /* try a few times for the lock.  Since this mechanism is used
         * for signal handling (on the main thread), there is a (slim)
         * chance that a signal is delivered on the same thread while we
         * hold the lock during the Py_MakePendingCalls() function.
         * This avoids a deadlock in that case.
         * Note that signals can be delivered on any thread.  In particular,
         * on Windows, a SIGINT is delivered on a system-created worker
         * thread.
         * We also check for lock being NULL, in the unlikely case that
         * this function is called before any bytecode evaluation takes place.
         */
        if (lock != NULL) {
            for (i = 0; i<100; i++) {
                if (PyThread_acquire_lock(lock, NOWAIT_LOCK))
                    break;
            }
            if (i == 100)
                return -1;
        }

    We may also get a new signal while handling a signal...

    Currently, the Windows implementation of the signal handler remembers if there is pending call to report_wakeup_send_error(), so it doesn't fill the pending call callback queue. But the Linux implementation doesn't.

    With Nathaniel's PR, the queue is always filled. I'm not sure that it's ideal, but it's also hard to handle synchronization here so I didn't complain :-) Always calling Py_AddPendingCall() does work for the signal module, but it may prevent other functions to schedule a pending call later. In practice, in CPython, only the signal handler calls Py_AddPendingCall() :-)

    @1st1
    Copy link
    Member

    @1st1 1st1 commented Dec 18, 2017

    New changeset 902ab80 by Yury Selivanov (Nathaniel J. Smith) in branch 'master':
    bpo-30050: Allow disabling full buffer warnings in signal.set_wakeup_fd (bpo-4792)
    902ab80

    @1st1
    Copy link
    Member

    @1st1 1st1 commented Dec 18, 2017

    I think that what Nathaniel proposes is very reasonable. The PR is LGTM and I've just merged it. Closing the issue.

    @1st1 1st1 added the interpreter-core label Dec 18, 2017
    @1st1 1st1 closed this as completed Dec 18, 2017
    @1st1 1st1 added the type-feature label Dec 18, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 interpreter-core type-feature
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants