New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

glfwPostEmptyEvent sometimes fails to unblock glfwWaitEvents on Ubuntu 18.04 #1281

Open
OlivierSohn opened this Issue Jun 1, 2018 · 63 comments

Comments

Projects
None yet
3 participants
@OlivierSohn

OlivierSohn commented Jun 1, 2018

From times to times, the empty event posted by a call to glfwPostEmptyEvent won't unblock a call to glfwWaitEvent in the main thread.

A minimal program reproducing the issue is available here
It can be built and run on linux like so:
g++ main.cpp -std=c++14 -lglfw -lX11 -ldl -lpthread && ./a.out

If the calls of glfwPostEmptyEvent and glfwWaitEvent don't overlap in time, then the bug will never happen, but if the 2 calls are concurrent, the bug happens, "sometimes".

Note that on OSX, this bug never happens.

GLFW Version (from glfwGetVersionString()):
3.2.1 X11 GLX EGL clock_gettime /dev/js Xf86vm shared

Ubuntu version
18.04 LTS

  • Note that I have tried posting "several" empty events at the same time: it doesn't help either, it just makes the bug a little less reproducible but doesnt fix it entirely.

Edit : added the build command
Edit2 : fixed the build command

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 1, 2018

The bug also exists with the latest version of the master branch

OlivierSohn commented Jun 1, 2018

The bug also exists with the latest version of the master branch

@OlivierSohn OlivierSohn changed the title from [Ubuntu 18.04] Race condition between glfwPostEmptyEvent and glfwWaitEvent to [Ubuntu 18.04 / X11] Race condition between glfwPostEmptyEvent and glfwWaitEvent Jun 1, 2018

@OlivierSohn OlivierSohn changed the title from [Ubuntu 18.04 / X11] Race condition between glfwPostEmptyEvent and glfwWaitEvent to [Ubuntu 18.04 / X11] Sometimes, glfwPostEmptyEvent fails to unblock glfwWaitEvent Jun 1, 2018

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 1, 2018

I also tried to fix the bug by changing the implementations of glfwPostEmptyEvent and glfwWaitEvents for X11, but could only make the bug "less frequent", so it's not a solution.

My workaround sofar is to systematically post a second empty event, one millisecond after the first one, to have the guarantee that the freeze will not exceed a millisecond when it happens.

OlivierSohn commented Jun 1, 2018

I also tried to fix the bug by changing the implementations of glfwPostEmptyEvent and glfwWaitEvents for X11, but could only make the bug "less frequent", so it's not a solution.

My workaround sofar is to systematically post a second empty event, one millisecond after the first one, to have the guarantee that the freeze will not exceed a millisecond when it happens.

@OlivierSohn OlivierSohn changed the title from [Ubuntu 18.04 / X11] Sometimes, glfwPostEmptyEvent fails to unblock glfwWaitEvent to [Ubuntu 18.04 / X11] Sometimes, glfwPostEmptyEvent fails to unblock glfwWaitEvents Jun 6, 2018

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

Wouldn't the fix be simply to use a mutex either wrapping your calls to glfwPostEmptyEvent() and glfwWaitEvents() or add a mutex to glfw itself (which may not be appropriate since glfw does not depend on ptthreads, IIRC)

Contributor

kovidgoyal commented Jun 7, 2018

Wouldn't the fix be simply to use a mutex either wrapping your calls to glfwPostEmptyEvent() and glfwWaitEvents() or add a mutex to glfw itself (which may not be appropriate since glfw does not depend on ptthreads, IIRC)

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

@kovidgoyal if I wrap my calls to glfwPostEmptyEvent() and glfwWaitEvents() in a mutex, I won't be able, ever, to wake up the thread that made the glfwWaitEvents() call by calling glfwPostEmptyEvent() from another thread, because ... of the mutex :) so it's not a solution.

Using a mutex at a finer level inside glfw is an approach that could work, my first attempt was not conclusive, but there might be a possible solution there. Do you want to give it a try?

OlivierSohn commented Jun 7, 2018

@kovidgoyal if I wrap my calls to glfwPostEmptyEvent() and glfwWaitEvents() in a mutex, I won't be able, ever, to wake up the thread that made the glfwWaitEvents() call by calling glfwPostEmptyEvent() from another thread, because ... of the mutex :) so it's not a solution.

Using a mutex at a finer level inside glfw is an approach that could work, my first attempt was not conclusive, but there might be a possible solution there. Do you want to give it a try?

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

Oh yeah, sorry, I wasn't paying much attention to the context :) Thinking about it some more, I dont think this will work, or at least, one needs to understand the root cause of the failure first. What is happening to the events sent by XSendEvent? Why aren't they being picked up by XNextEvent?

Contributor

kovidgoyal commented Jun 7, 2018

Oh yeah, sorry, I wasn't paying much attention to the context :) Thinking about it some more, I dont think this will work, or at least, one needs to understand the root cause of the failure first. What is happening to the events sent by XSendEvent? Why aren't they being picked up by XNextEvent?

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

I tried your minimal test case and on my system (with the i3 window manager, it freezes before starting the test). So I removed the window creation bits form the test (as far as I can tell having a window is immaterial to this issue) and with that it runs, but does not reproduce your issue.
Here is the source I used:

#include "assert.h"

#include <stdio.h>
#include <unistd.h>

#include <atomic>
#include <ctime>
#include <thread>

//#include "../glfw/include/GLFW/glfw3.h"
#include <GLFW/glfw3.h>

/*
In this test, a producer thread posts empty events, and a consumer thread consumes them.

- The producer thread waits for the consumer to consume the event before posting a new event.

- The consumer thread waits for the producer to declare that an event /will/ be produced before
   trying to consume it.
*/

std::atomic_int event_status;
constexpr int CONSUMED = 0;
constexpr int POSTED = 1;

void consume() {
    while(true) {
        while(event_status != POSTED) {
            std::this_thread::yield();
        }

        glfwWaitEvents(); printf("c");

        event_status = CONSUMED;
    }
}

void produce() {
    while(true) {
        using namespace std;
        clock_t begin = clock();
        while(event_status != CONSUMED) {
            this_thread::yield();
            clock_t end = clock();
            double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
            if(elapsed_secs > 3) {
                printf(
                    "\nThe bug has been reproduced:"
                    "\nThe main thread is blocked since 3 seconds in glfwWaitEvents(), "
                    "eventhough an empty event has been posted to unblock it.\n");
                exit(1);
            }
        }

        event_status = POSTED;

        glfwPostEmptyEvent(); printf(".");

        // Note that if we move 'event_status = POSTED;' here
        // (thereby ensuring that glfwPostEmptyEvent() and glfwWaitEvents()
        // do not execute at the same time), the bug is not reproduced.
        //
        // In other words, the bug seems to exist only when glfwPostEmptyEvent()
        // and glfwWaitEvents() are executing at the same time.
    }
}

void error_callback(int error, const char* description)
{
    fprintf(stderr, "Error (%d) : %s\n", error, description);
}

int main() {

    // disable stdout buffering, to see logs immediately.
    setbuf(stdout, NULL);

    if (!glfwInit())
    {
        return 1;
    }

    printf("glfwGetVersionString = %s\n\n", glfwGetVersionString());
    glfwSetErrorCallback(error_callback);

    // Window creation has generated some events in the queue so we clear the queue:
    for(int i=0; i<10; ++i) {
        glfwWaitEventsTimeout(0.1);
    }

    printf(
        "Please stay here in the terminal while the test is running, \n"
        "to ensure that no event is sent to the glfw window.\n"
        "\nPress a key to start the test...");
    getchar();


    {
        event_status = CONSUMED;
        std::thread producer(produce);
        consume();
    }

    return 0;
}

// build on OSX :
// g++ main.cpp -std=c++14 -I"/usr/local/Cellar/glfw/3.2.1/include/GLFW/" -lglfw3
// build on linux:
// g++ main.cpp -std=c++14 -lglfw -lpthread
Contributor

kovidgoyal commented Jun 7, 2018

I tried your minimal test case and on my system (with the i3 window manager, it freezes before starting the test). So I removed the window creation bits form the test (as far as I can tell having a window is immaterial to this issue) and with that it runs, but does not reproduce your issue.
Here is the source I used:

#include "assert.h"

#include <stdio.h>
#include <unistd.h>

#include <atomic>
#include <ctime>
#include <thread>

//#include "../glfw/include/GLFW/glfw3.h"
#include <GLFW/glfw3.h>

/*
In this test, a producer thread posts empty events, and a consumer thread consumes them.

- The producer thread waits for the consumer to consume the event before posting a new event.

- The consumer thread waits for the producer to declare that an event /will/ be produced before
   trying to consume it.
*/

std::atomic_int event_status;
constexpr int CONSUMED = 0;
constexpr int POSTED = 1;

void consume() {
    while(true) {
        while(event_status != POSTED) {
            std::this_thread::yield();
        }

        glfwWaitEvents(); printf("c");

        event_status = CONSUMED;
    }
}

void produce() {
    while(true) {
        using namespace std;
        clock_t begin = clock();
        while(event_status != CONSUMED) {
            this_thread::yield();
            clock_t end = clock();
            double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
            if(elapsed_secs > 3) {
                printf(
                    "\nThe bug has been reproduced:"
                    "\nThe main thread is blocked since 3 seconds in glfwWaitEvents(), "
                    "eventhough an empty event has been posted to unblock it.\n");
                exit(1);
            }
        }

        event_status = POSTED;

        glfwPostEmptyEvent(); printf(".");

        // Note that if we move 'event_status = POSTED;' here
        // (thereby ensuring that glfwPostEmptyEvent() and glfwWaitEvents()
        // do not execute at the same time), the bug is not reproduced.
        //
        // In other words, the bug seems to exist only when glfwPostEmptyEvent()
        // and glfwWaitEvents() are executing at the same time.
    }
}

void error_callback(int error, const char* description)
{
    fprintf(stderr, "Error (%d) : %s\n", error, description);
}

int main() {

    // disable stdout buffering, to see logs immediately.
    setbuf(stdout, NULL);

    if (!glfwInit())
    {
        return 1;
    }

    printf("glfwGetVersionString = %s\n\n", glfwGetVersionString());
    glfwSetErrorCallback(error_callback);

    // Window creation has generated some events in the queue so we clear the queue:
    for(int i=0; i<10; ++i) {
        glfwWaitEventsTimeout(0.1);
    }

    printf(
        "Please stay here in the terminal while the test is running, \n"
        "to ensure that no event is sent to the glfw window.\n"
        "\nPress a key to start the test...");
    getchar();


    {
        event_status = CONSUMED;
        std::thread producer(produce);
        consume();
    }

    return 0;
}

// build on OSX :
// g++ main.cpp -std=c++14 -I"/usr/local/Cellar/glfw/3.2.1/include/GLFW/" -lglfw3
// build on linux:
// g++ main.cpp -std=c++14 -lglfw -lpthread
@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

with the i3 window manager, it freezes before starting the test

Can you add logs to the test program to see where it freezes?

If I run your program on my system, it works fine, so that shows that we need a window to reproduce the problem.

OlivierSohn commented Jun 7, 2018

with the i3 window manager, it freezes before starting the test

Can you add logs to the test program to see where it freezes?

If I run your program on my system, it works fine, so that shows that we need a window to reproduce the problem.

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

It freezes on the window creation step, basically glfwCreateWindow() never returns

If you need a window to reproduce it, it means something on your system is interfering with event handling, probably accessing the X event queue itself. Some kind of plugin that hooks into all processes? Ubuntu is notorious for those.

Contributor

kovidgoyal commented Jun 7, 2018

It freezes on the window creation step, basically glfwCreateWindow() never returns

If you need a window to reproduce it, it means something on your system is interfering with event handling, probably accessing the X event queue itself. Some kind of plugin that hooks into all processes? Ubuntu is notorious for those.

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

I can reproduce with your example, by adding glfwWindowHint(GLFW_VISIBLE, false);
before the call to the window creation step.

Contributor

kovidgoyal commented Jun 7, 2018

I can reproduce with your example, by adding glfwWindowHint(GLFW_VISIBLE, false);
before the call to the window creation step.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

It freezes on the window creation step, basically glfwCreateWindow() never returns

I think I'm using "the standard way" of creating a glfw window, so I'm surprised that it doesn't work for you.

I can reproduce with your example, by adding glfwWindowHint(GLFW_VISIBLE, false);
before the call to the window creation step.

I don't see why this call is necessary, it seems to me it's an unrelated issue (an issue nonetheless).

If you need a window to reproduce it, it means something on your system is interfering with event handling, probably accessing the X event queue itself. Some kind of plugin that hooks into all processes? Ubuntu is notorious for those.

The window itself probably needs the X event queue, for mouse events, resize events, etc... so I'm not surprised that having or not having a window can make a difference.

OlivierSohn commented Jun 7, 2018

It freezes on the window creation step, basically glfwCreateWindow() never returns

I think I'm using "the standard way" of creating a glfw window, so I'm surprised that it doesn't work for you.

I can reproduce with your example, by adding glfwWindowHint(GLFW_VISIBLE, false);
before the call to the window creation step.

I don't see why this call is necessary, it seems to me it's an unrelated issue (an issue nonetheless).

If you need a window to reproduce it, it means something on your system is interfering with event handling, probably accessing the X event queue itself. Some kind of plugin that hooks into all processes? Ubuntu is notorious for those.

The window itself probably needs the X event queue, for mouse events, resize events, etc... so I'm not surprised that having or not having a window can make a difference.

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

The reason the window is needed is because of line 1115 in window.c Basically glfwpostemptyevent is a no-op when no windows are present. I dont know why that is the case, probably needed by some other backend.

Contributor

kovidgoyal commented Jun 7, 2018

The reason the window is needed is because of line 1115 in window.c Basically glfwpostemptyevent is a no-op when no windows are present. I dont know why that is the case, probably needed by some other backend.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

one needs to understand the root cause of the failure first. What is happening to the events sent by XSendEvent? Why aren't they being picked up by XNextEvent?

Let me explain what I understood about "what can be the root cause" :

I think the problem is that to wait for events, we use select (see the comment on top of waitForEvent in x11_window.c) instead of using X-related functions.

X-related functions use the display-lock when necessary to avoid race conditions, but select doesn't.

OlivierSohn commented Jun 7, 2018

one needs to understand the root cause of the failure first. What is happening to the events sent by XSendEvent? Why aren't they being picked up by XNextEvent?

Let me explain what I understood about "what can be the root cause" :

I think the problem is that to wait for events, we use select (see the comment on top of waitForEvent in x11_window.c) instead of using X-related functions.

X-related functions use the display-lock when necessary to avoid race conditions, but select doesn't.

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

That should be easy to check, just replace the implementation of waitForEvent with XEventsQueued see if that fixes it.

Contributor

kovidgoyal commented Jun 7, 2018

That should be easy to check, just replace the implementation of waitForEvent with XEventsQueued see if that fixes it.

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

Using XPeekEvent instead of the select does indeed fix the issue on my system

Contributor

kovidgoyal commented Jun 7, 2018

Using XPeekEvent instead of the select does indeed fix the issue on my system

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

And this commit is the one that introduced the use of select, as best as I can tell: eb7688d

I dont see any good way to fix this. As far as I can tell, you cannot both be lock free and avoid race conditions. I dont know why it is important to be able to use GLX calls from another thread, OpenGL calls, sure, but GLX?

Contributor

kovidgoyal commented Jun 7, 2018

And this commit is the one that introduced the use of select, as best as I can tell: eb7688d

I dont see any good way to fix this. As far as I can tell, you cannot both be lock free and avoid race conditions. I dont know why it is important to be able to use GLX calls from another thread, OpenGL calls, sure, but GLX?

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

XEventsQueued will block, but if we repeatedly call XPending() and then sched_yield() maybe that would be ok? I'm trying this right now, to see if CPU consumption is not too high

OlivierSohn commented Jun 7, 2018

XEventsQueued will block, but if we repeatedly call XPending() and then sched_yield() maybe that would be ok? I'm trying this right now, to see if CPU consumption is not too high

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

CPU consumption would be awful with that. The other possibility is to use a dedicated extra fd pair. Have the select call wait on that as well and have the postemptyevent function write to it.

Contributor

kovidgoyal commented Jun 7, 2018

CPU consumption would be awful with that. The other possibility is to use a dedicated extra fd pair. Have the select call wait on that as well and have the postemptyevent function write to it.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

so XPending() + sched_yield() works (the race condition is solved), we are lock-free (i.e one can take the display lock when we release it), but the problem is we use 100%cpu! we could use usleep(...) instead of sched_yield() but that would increase latency...

I dont know why it is important to be able to use GLX calls from another thread, OpenGL calls, sure, but GLX?

I have no idea

OlivierSohn commented Jun 7, 2018

so XPending() + sched_yield() works (the race condition is solved), we are lock-free (i.e one can take the display lock when we release it), but the problem is we use 100%cpu! we could use usleep(...) instead of sched_yield() but that would increase latency...

I dont know why it is important to be able to use GLX calls from another thread, OpenGL calls, sure, but GLX?

I have no idea

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

See my previous post.

Contributor

kovidgoyal commented Jun 7, 2018

See my previous post.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

The other possibility is to use a dedicated extra fd pair. Have the select call wait on that as well and have the postemptyevent function write to it.

Sounds good

OlivierSohn commented Jun 7, 2018

The other possibility is to use a dedicated extra fd pair. Have the select call wait on that as well and have the postemptyevent function write to it.

Sounds good

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

I'm happy to implement that in my glfw fork (and switch to using poll as well instead of select) since this issue could affect me, though I have not seen it in the wild so far. However you'll need to get buy in from @elmindreda if you want to get the fix merged here.

Contributor

kovidgoyal commented Jun 7, 2018

I'm happy to implement that in my glfw fork (and switch to using poll as well instead of select) since this issue could affect me, though I have not seen it in the wild so far. However you'll need to get buy in from @elmindreda if you want to get the fix merged here.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

Thanks, it would be great! If you want I can cherrypick your commit to propose a merge here afterwards (or did you intend to merge your fork?).

Just to give some context, in my games I use event-driven loops : no rendering happens until an event (coming from the server, or from the platform) occurs. I'm not sure I would have noticed this bug had I used a time-based loop where every 10 milli seconds a render occurs...

OlivierSohn commented Jun 7, 2018

Thanks, it would be great! If you want I can cherrypick your commit to propose a merge here afterwards (or did you intend to merge your fork?).

Just to give some context, in my games I use event-driven loops : no rendering happens until an event (coming from the server, or from the platform) occurs. I'm not sure I would have noticed this bug had I used a time-based loop where every 10 milli seconds a render occurs...

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

I use an event based loop as well, isn't that what everybody uses? Why would you render repeatedly if nothing has happened? As for merging my fork, it seems unlikely at this point. It has pretty much diverged since #1140 and tons f other bug fixes/enhancements.

Feel free to cherry pick and propose a merge, I have no objections.

Contributor

kovidgoyal commented Jun 7, 2018

I use an event based loop as well, isn't that what everybody uses? Why would you render repeatedly if nothing has happened? As for merging my fork, it seems unlikely at this point. It has pretty much diverged since #1140 and tons f other bug fixes/enhancements.

Feel free to cherry pick and propose a merge, I have no objections.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

I should have added this to the context description : in my case, I also have different sources of events : glfw events (which I listen to on the main thread), and server events (which are listened to on another thread). If the server event arrives before the glfw event, then to continue on the main thread I need to wakeup the glfw wait function (glfwWaitEvents) using glfwPostEmptyEvent.

Hence, this bug becomes very visible :)

OlivierSohn commented Jun 7, 2018

I should have added this to the context description : in my case, I also have different sources of events : glfw events (which I listen to on the main thread), and server events (which are listened to on another thread). If the server event arrives before the glfw event, then to continue on the main thread I need to wakeup the glfw wait function (glfwWaitEvents) using glfwPostEmptyEvent.

Hence, this bug becomes very visible :)

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 7, 2018

Contributor

Yeah same architecture for me (in my case there are two secondary threads listening for server events and events from child processes), but I've never run into this in the wild and kitty (my project that uses glfw) has been out in the wild for almost two years now.

But anyway, I agree with your point, this is a potentially bad bug and should be fixed.

Contributor

kovidgoyal commented Jun 7, 2018

Yeah same architecture for me (in my case there are two secondary threads listening for server events and events from child processes), but I've never run into this in the wild and kitty (my project that uses glfw) has been out in the wild for almost two years now.

But anyway, I agree with your point, this is a potentially bad bug and should be fixed.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

IMO, kovidgoyal/kitty#426 may be related to this. Quoting the last comment:

They stay that way, until I do something to snap them out of it. Switching desktops or sometimes even just pressing shift (!) seems to do it, or just moving the mouse

This is the same behaviour I saw in my games : if I moved the mouse or pressed a key, then that would wake up the glfwWaitEvents call.

But I might be wrong.

OlivierSohn commented Jun 7, 2018

IMO, kovidgoyal/kitty#426 may be related to this. Quoting the last comment:

They stay that way, until I do something to snap them out of it. Switching desktops or sometimes even just pressing shift (!) seems to do it, or just moving the mouse

This is the same behaviour I saw in my games : if I moved the mouse or pressed a key, then that would wake up the glfwWaitEvents call.

But I might be wrong.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 7, 2018

Btw, regarding the availability of ppoll it would be nice if we can document from which version of the different OSes it is available? I'll try to do my research on that...

Also, did you intentionally leave the non-timeout call using poll? I was thinking that we can use ppoll here too, to be consistent with the timeout case (passing NULL instead of -1 for the time indicates "waiting forever").

OlivierSohn commented Jun 7, 2018

Btw, regarding the availability of ppoll it would be nice if we can document from which version of the different OSes it is available? I'll try to do my research on that...

Also, did you intentionally leave the non-timeout call using poll? I was thinking that we can use ppoll here too, to be consistent with the timeout case (passing NULL instead of -1 for the time indicates "waiting forever").

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 8, 2018

Contributor

Feel free to add documentation for version availability. There is no need to use ppoll when timeout is not specified, so I didn't change it.

Contributor

kovidgoyal commented Jun 8, 2018

Feel free to add documentation for version availability. There is no need to use ppoll when timeout is not specified, so I didn't change it.

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Jun 8, 2018

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Jun 8, 2018

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Jun 8, 2018

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Jun 8, 2018

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jun 8, 2018

Contributor

Your test case was failing on wayland as well, so I ported the fix to the wayland backend and cleaned it up a little removing the externs and using pipe2, since pipe2 is available everywhere that ppoll is available.

Please review.

Contributor

kovidgoyal commented Jun 8, 2018

Your test case was failing on wayland as well, so I ported the fix to the wayland backend and cleaned it up a little removing the externs and using pipe2, since pipe2 is available everywhere that ppoll is available.

Please review.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 8, 2018

I made 2 inline comments on the last commit. Please do not squash your next commits, so that I can see what changes more easily instead of retraversing all the changes every time :)

OlivierSohn commented Jun 8, 2018

I made 2 inline comments on the last commit. Please do not squash your next commits, so that I can see what changes more easily instead of retraversing all the changes every time :)

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Jun 8, 2018

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 8, 2018

See my previous comment :)

OlivierSohn commented Jun 8, 2018

See my previous comment :)

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 8, 2018

I also made a new comment on the last commit.

OlivierSohn commented Jun 8, 2018

I also made a new comment on the last commit.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 8, 2018

@kovidgoyal It's hard to follow comments on commits, so I'll write it here (ideally we should discuss this on a PR):

  • I think there is some code that could be factored between the x11 and wayland backends, especially in the new code. One way to share the code would be to write functions in a source file, that we #include in both backends. This way, when we make a change / review other people's code, we don't have to double check everything, which is error-prone. For example the deinitialization code now has 2 different behaviours and there is no way to know if it's because something was forgotten to be changed or because it was intended to be this way. (see next point)
  • in one of the 2 backends, the pipes fd are closed only if the display is not null, is it intentional? If so, please add a comment saying why.

OlivierSohn commented Jun 8, 2018

@kovidgoyal It's hard to follow comments on commits, so I'll write it here (ideally we should discuss this on a PR):

  • I think there is some code that could be factored between the x11 and wayland backends, especially in the new code. One way to share the code would be to write functions in a source file, that we #include in both backends. This way, when we make a change / review other people's code, we don't have to double check everything, which is error-prone. For example the deinitialization code now has 2 different behaviours and there is no way to know if it's because something was forgotten to be changed or because it was intended to be this way. (see next point)
  • in one of the 2 backends, the pipes fd are closed only if the display is not null, is it intentional? If so, please add a comment saying why.
@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 8, 2018

I'll take a shot at refactoring the code to use more functions to encapsulate the code, and to share code between backends.

OlivierSohn commented Jun 8, 2018

I'll take a shot at refactoring the code to use more functions to encapsulate the code, and to share code between backends.

OlivierSohn added a commit to OlivierSohn/glfw that referenced this issue Jun 8, 2018

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Jun 8, 2018

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Jun 8, 2018

@elmindreda

This comment has been minimized.

Show comment
Hide comment
@elmindreda

elmindreda Jun 11, 2018

Member

That's a lot of detective work, thank you! 💜

I'm currently finishing Windows 10 per-monitor v2 DPI support. Will have a proper look at this as soon as I'm able. PR noted.

The GLX call that's expected on non-main threads is glXSwapBuffers.

Member

elmindreda commented Jun 11, 2018

That's a lot of detective work, thank you! 💜

I'm currently finishing Windows 10 per-monitor v2 DPI support. Will have a proper look at this as soon as I'm able. PR noted.

The GLX call that's expected on non-main threads is glXSwapBuffers.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jun 11, 2018

@elmindreda you're welcome, I wanted to get to the bottom of this bug to see what was going on exactly, eventhough I had a working workaround...

OlivierSohn commented Jun 11, 2018

@elmindreda you're welcome, I wanted to get to the bottom of this bug to see what was going on exactly, eventhough I had a working workaround...

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jul 14, 2018

Contributor

Just FYI, I greatly cleaned up the implementation of this in my fork. The X11/Wayland event loops now support watching an arbitrary number of fds as well as an arbitrary number of timers. This was needed to integrate DBUS into the event loops which in turn was needed for IME support #41

Contributor

kovidgoyal commented Jul 14, 2018

Just FYI, I greatly cleaned up the implementation of this in my fork. The X11/Wayland event loops now support watching an arbitrary number of fds as well as an arbitrary number of timers. This was needed to integrate DBUS into the event loops which in turn was needed for IME support #41

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jul 14, 2018

@kovidgoyal could you please point out the relevant commits in your fork? I'd like to have an idea of what the incremental changes are, if we want to report the changes here. Thanks!

OlivierSohn commented Jul 14, 2018

@kovidgoyal could you please point out the relevant commits in your fork? I'd like to have an idea of what the incremental changes are, if we want to report the changes here. Thanks!

@kovidgoyal

This comment has been minimized.

Show comment
Hide comment
@kovidgoyal

kovidgoyal Jul 14, 2018

Contributor

There is only one commit in my fork, the latest kovidgoyal@f4b72f2

If you want to see the code under development look at the commits under the glfw sub-directory of kitty.

Contributor

kovidgoyal commented Jul 14, 2018

There is only one commit in my fork, the latest kovidgoyal@f4b72f2

If you want to see the code under development look at the commits under the glfw sub-directory of kitty.

@OlivierSohn

This comment has been minimized.

Show comment
Hide comment
@OlivierSohn

OlivierSohn Jul 14, 2018

Thanks. imo it's a change that should not be part of this particular issue. But it might be relevant for other issues or features.

OlivierSohn commented Jul 14, 2018

Thanks. imo it's a change that should not be part of this particular issue. But it might be relevant for other issues or features.

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Aug 12, 2018

kovidgoyal added a commit to kovidgoyal/glfw that referenced this issue Sep 4, 2018

@elmindreda elmindreda changed the title from [Ubuntu 18.04 / X11] Sometimes, glfwPostEmptyEvent fails to unblock glfwWaitEvents to glfwPostEmptyEvent sometimes fails to unblock glfwWaitEvents on Ubuntu 18.04 Oct 14, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment