Kicad very sluggish when some of its windows are hidden #23512

Closed
kuon opened this issue May 1, 2023 · 60 comments
Labels
Wayland wxGTK Wayland-specific issues

Comments

@kuon

kuon commented May 1, 2023

I am trying to find the reason for the following issue.

When using kicad under Xwayland under sway, if one window is hidden, the whole application will be very sluggish.

After some investigation, it seems to be due to eglSwapBuffers "slowing down" when window is hidden, ref.

I am still unsure if this should be fixed here. I'll happily help as much as I can if you need more info.

kicad 7.0.1 with wxWidgets 3.2.2

@vadz
Contributor

vadz commented May 1, 2023

Thanks for reporting this, I didn't know that calling eglSwapBuffers() on hidden windows is so bad. Of course, it doesn't make much sense to call wxGLCanvas::SwapBuffers() on a hidden window anyhow, but I guess this might somehow be happening in KiCad...

Anyhow, it would be simple enough to add a check for whether a window is hidden to

bool wxGLCanvasEGL::SwapBuffers()
{
    // Under Wayland, if eglSwapBuffers() is called before the wl_surface has
    // been realized, it will deadlock. Thus, we need to avoid swapping before
    // this has happened.
    if ( !m_readyToDraw )
        return false;
    return eglSwapBuffers(m_display, m_surface);
}

E.g. we could just replace the test for !m_readyToDraw with IsShownOnScreen(). Would you be able to test if this fixes the problem?
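
A minimal sketch of such a guard, using a hypothetical stand-in type rather than the real wxGLCanvasEGL. The name CanvasSketch, its members and the swap counter are assumptions made for illustration only. The sketch keeps the readiness test next to the visibility test, on the assumption that the former is still needed to avoid the Wayland deadlock.

```cpp
// Hypothetical stand-in for a GL canvas; NOT the real wxGLCanvasEGL.
struct CanvasSketch
{
    bool readyToDraw = false;    // plays the role of m_readyToDraw
    bool shownOnScreen = false;  // plays the role of IsShownOnScreen()
    int  swapsIssued = 0;        // counts calls that would reach eglSwapBuffers()

    bool SwapBuffers()
    {
        // Keep the original guard: swapping before the surface is realized
        // can deadlock under Wayland.
        if ( !readyToDraw )
            return false;

        // Proposed extra guard: never swap for a window that is not visible.
        if ( !shownOnScreen )
            return false;

        ++swapsIssued;           // the real code would call eglSwapBuffers() here
        return true;
    }
};
```

Both checks are plain flag reads in this sketch; the cost of the real IsShownOnScreen() is not modelled here.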

@kuon
Author

kuon commented May 1, 2023

Yes, I can definitely test if it fixes the problem. Maybe not today as I'll have to compile everything, but this week I should be able to.

@dgud
Contributor

dgud commented May 2, 2023

How expensive is IsShownOnScreen?
We don't want to do something expensive in every call to SwapBuffers(), which might be called very often.

@martantoine

Hello there,
Just to confirm I got the same problem on my laptop running sway, those freezes are really slowing my workflow
Glad to see I'm not the only one with this issue
kicad 7.0.1

@imciner2
Contributor

Some searching shows that there is a slightly different recommended way of handling not drawing on a hidden window in the Wayland rendering loop: https://emersion.fr/blog/2018/wayland-rendering-loop/. I am not sure how easily that can be adapted to work on the GL canvas though, so I wonder if it is something we might have to put into downstream KiCad instead.

vadz added a commit to vadz/wxWidgets that referenced this issue May 17, 2023
Ensure that we only draw when we really should and do not block in
SwapBuffers() when we shouldn't.

Use eglSwapInterval() to disable blocking in eglSwapBuffers() and ensure
that we only call it when we should by using m_readyToDraw not only for
the first repaint, but also for all the subsequent ones.

Closes wxWidgets#23512.
@vadz
Contributor

vadz commented May 17, 2023

Thanks Ian, this was a very useful link and I have a much better understanding of how this works after reading it (or, rather, now I have some understanding of how it works while I had no idea about it before).

But unless my understanding is very wrong, it looks like there is a very simple fix to our problems: we just need to call eglSwapInterval(m_display, 0) before calling eglSwapBuffers() to make it non-blocking. IIUC this should work fine because we already use the frame callback to determine whether we should be drawing or not. Except that we're not doing it correctly, AFAICS, because we never reset m_readyToDraw to false after setting it to true once. Of course, if we do reset it, we also need to arrange the callback to set it again later, otherwise we'd never draw anything after the first time.
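
The handshake described in this paragraph could be modelled roughly as follows. This is only a sketch under stated assumptions: FrameGatedCanvas and all of its members are hypothetical names, the compositor is simulated by calling OnFrameCallback() by hand, and the actual eglSwapInterval()/eglSwapBuffers() calls are replaced by a counter.

```cpp
// Hypothetical model of the frame-callback handshake; none of these names
// exist in wxWidgets.
class FrameGatedCanvas
{
public:
    // Simulates the compositor signalling that it wants a new frame.
    void OnFrameCallback()
    {
        m_callbackRequested = false;
        m_readyToDraw = true;
    }

    // Returns true only if a swap was actually issued.
    bool SwapBuffers()
    {
        if ( !m_readyToDraw )
            return false;        // compositor has not asked for a frame yet

        // Reset the flag and re-arm the callback, otherwise nothing would
        // ever be drawn after the first frame.
        m_readyToDraw = false;
        m_callbackRequested = true;

        // The real code would call eglSwapBuffers() here, with the swap
        // interval set to 0 so that the call itself does not block.
        ++m_swapsIssued;
        return true;
    }

    bool CallbackRequested() const { return m_callbackRequested; }
    int SwapsIssued() const { return m_swapsIssued; }

private:
    bool m_readyToDraw = false;
    bool m_callbackRequested = true;  // assumption: first callback requested at creation
    int  m_swapsIssued = 0;
};
```

In this model a hidden window simply never receives OnFrameCallback(), so SwapBuffers() keeps returning false instead of blocking.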

I've created #23554 doing just this and it seems to work fine for me with the cube sample, but it would be great if somebody could test it with KiCad itself, of course.

Also, @swt2c please let me know if I did something wrong with your code/logic. TIA!

@swt2c
Contributor

swt2c commented May 17, 2023

Except that we're not doing it correctly, AFAICS, because we never reset m_readyToDraw to false after setting it to true once.

Well, that's because it was originally only intended to avoid the deadlock on the first draw before the surface was realized.

However, your changes make sense to me. Hopefully they improve things for KiCad.

@swt2c
Contributor

swt2c commented May 17, 2023

After re-reading @kuon's initial report, though, I wonder if these changes are going to have any effect for the OP. AFAICS, it appears that @kuon is using XWayland, which would mean the X11 implementation of wxGLCanvasEGL is being used.

@imciner2
Contributor

X11 implementation of wxGLCanvasEGL is being used.

Can you clarify what this means? I thought EGL bypassed the XWayland system and went to the Wayland compositor directly, so using wxGLCanvasEGL would talk with Wayland directly.

@swt2c
Contributor

swt2c commented May 17, 2023

X11 implementation of wxGLCanvasEGL is being used.

Can you clarify what this means? I thought EGL bypassed the XWayland system and went to the Wayland compositor directly, so using wxGLCanvasEGL would talk with Wayland directly.

Well, EGL itself might, but wxGLCanvasEGL checks the GdkWindow it's been given and if it determines it's an X11 one, then it does all the X11 stuff. Presumably if using XWayland, the GdkWindow will be an X11 one?

@vadz
Contributor

vadz commented May 17, 2023

I wonder if the OP could actually be using Wayland in spite of running Xwayland. Unless you explicitly set GDK_BACKEND=x11, wx applications will use Wayland and not X11 if it's available.

@kuon
Author

kuon commented May 18, 2023

I do use Xwayland because kicad doesn't work natively under wayland and has GDK_BACKEND hardcoded to x11. I checked with Xeyes and I can confirm that.

I can test PR #23554 and I'll let you know.

@vadz
Contributor

vadz commented May 18, 2023

Why doesn't KiCad work natively under Wayland? If it's due to some wx problems, could you please open (another) issue for this? We definitely want to support using Wayland natively, not just via XWayland.

Unfortunately #23554 shouldn't change anything for you if you're using X11, it only changes Wayland-specific code paths (but testing it wouldn't hurt, of course). And I have no idea what we could do for the X11 branch. The only workaround I see is to switch to using GLX instead of EGL, which currently requires building wxWidgets with --disable-glcanvasegl.

@kuon
Author

kuon commented May 18, 2023

You can read the discussion here https://gitlab.com/kicad/code/kicad/-/issues/7207

@vadz
Contributor

vadz commented May 18, 2023

Well, this ends with "things seem to work", so I'm still not sure what the problem with running KiCad under Wayland natively is any more (yes, there was an error shown at the beginning of this issue, but this was a couple of years ago).

@kuon
Author

kuon commented May 18, 2023

Well, I have not tested it myself. The thing is, there seems to be no interest from KiCad developers https://gitlab.com/kicad/code/kicad/-/issues/7207#note_828671381 in supporting Wayland.

I use kicad under Xwayland, and the only real issue I have is this one (kicad freeze when some windows are hidden). So I thought it would be easier to fix.

But in the end, native wayland support would be better. But I cannot steer kicad development.

@vadz
Contributor

vadz commented May 18, 2023

I can't speak for KiCad developers and I admit that I can share their frustration with some of Wayland's design issues (not allowing applications to save/restore window geometry is just incomprehensible IMO, and I'm speaking as a user here -- I want the applications I use to do it), but at least some parts related to wxWidgets might be out of date. But if KiCad relies on events that we don't currently generate (wxEVT_DPI_CHANGED?) and if it's at all possible to provide them, we should try to do it, which is why I'd like to have separate issues for things like this.

Anyhow, to return to the issue at hand, I wonder if we might need to call eglSwapInterval(0) for EGL when using X11 too. Unfortunately I still don't have any simple test case that would allow me to reproduce the problem, so I can't really test this myself.

OTOH it looks like there shouldn't be any problem with skipping drawing on hidden windows in EGL/X11 case, so I've added a commit doing this to the PR (I've force pushed there to also remove "Closes #23512" from the existing commit, as it doesn't really fix this issue). Could you please check it? And if it still doesn't fix the issue, could you please experiment with moving the eglSwapInterval() call below, i.e. make it unconditional for both X11 and Wayland cases, instead of only doing it for the latter? TIA!

vadz added a commit to vadz/wxWidgets that referenced this issue May 18, 2023
This is useless and might be actually harmful if it results in blocking
in eglSwapBuffers() and slowing down the application.

See wxWidgets#23512.
@SL-RU

SL-RU commented May 25, 2023

I have the same issue with KiCad, it is very annoying and reaction of KiCad developers is disappointing

@vadz
Contributor

vadz commented May 25, 2023

@SL-RU Please consider helping by testing the proposed changes.

@kuon
Author

kuon commented May 25, 2023 via email

@imciner2
Contributor

Why doesn't KiCad work natively under Wayland? If it's due to some wx problems, could you please open (another) issue for this? We definitely want to support using Wayland natively, not just via XWayland.

I thought the wxGrid rendering/behavior is still problematic on Wayland (I thought there was an issue somewhere for that as well, but I can't recall what number it is, since I thought that was one of the more well known issues). There are several uses of wxClientDC in the grid code right now, and those are broken on wayland according to #17820 and #16890 (comment).

@vadz
Contributor

vadz commented May 25, 2023

We should create issues for things that don't work, as right now I don't see anything wxGrid-related among Wayland bugs.

Looking at the code, it seems like the problems might be limited to drag-moving, which is definitely still a problem, but I'm not even sure if it affects KiCad. If there is something (even) more serious, please don't hesitate to open an issue for it and I'll try to look at it. TIA!

@mordae

mordae commented Sep 1, 2023

I have tried running KiCad with vadz@f2214ff applied and it did not help. It still freezes when switching between fullscreen eeschema and pcbnew.

@tdaniel22
Contributor

Fixed this by calling eglSwapInterval(m_display, 0) before every buffer swap, in the X11 codepath. Setting the swap interval to 0 once in CreateSurface() is not enough, something must be setting it back to 1 somewhere else.

That's probably not a viable fix to implement though, because I'm guessing this will cause the render loop to run at full speed on native X11?

tdaniel22 added a commit to tdaniel22/wxWidgets that referenced this issue Sep 23, 2023
@vadz
Contributor

vadz commented Sep 23, 2023

Fixed this by calling eglSwapInterval(m_display, 0) before every buffer swap, in the X11 codepath. Setting the swap interval to 0 once in CreateSurface() is not enough, something must be setting it back to 1 somewhere else.

But currently we don't even set it to 0 there when using X11, my changes only affect Wayland. So, just to confirm, did you actually add this call in wxGTKImpl::IsX11(window) case? It would be surprising if this was changed from somewhere else, but if it really is, could you please run the program under gdb, put a breakpoint on eglSwapInterval and see where is it called from?

That's probably not a viable fix to implement though, because I'm guessing this will cause the render loop to run at full speed on native X11?

I'm not so sure about it, you're only supposed to call SwapBuffers() when the window needs to be repainted or when enough time passes, i.e. the application shouldn't be calling it in a tight loop (and if it does, it's on it to change it).
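
As an illustration of "repaint only when needed or when enough time has passed", here is a hedged sketch of an application-side throttle. RepaintThrottle is a hypothetical helper, not a wxWidgets class, and the minimum interval is a parameter the application would have to choose itself. The current time is passed in explicitly so the logic can be exercised without sleeping.

```cpp
#include <chrono>

// Hypothetical application-side repaint throttle; not a wxWidgets class.
class RepaintThrottle
{
public:
    using Clock = std::chrono::steady_clock;

    explicit RepaintThrottle(Clock::duration minInterval)
        : m_minInterval(minInterval)
    {
    }

    // Mark the canvas as needing a repaint (e.g. after an input event).
    void Invalidate() { m_dirty = true; }

    // Returns true if the caller should draw and swap now.
    bool ShouldDraw(Clock::time_point now)
    {
        if ( !m_dirty )
            return false;                    // nothing changed, nothing to draw

        if ( m_hasDrawn && now - m_lastDraw < m_minInterval )
            return false;                    // too soon after the previous frame

        m_dirty = false;
        m_hasDrawn = true;
        m_lastDraw = now;
        return true;
    }

private:
    Clock::duration   m_minInterval;
    Clock::time_point m_lastDraw{};
    bool m_dirty = false;
    bool m_hasDrawn = false;
};
```

A paint handler would call ShouldDraw(RepaintThrottle::Clock::now()) and skip SwapBuffers() when it returns false; a request that arrives too early stays pending until the interval has elapsed.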

So maybe the fix is really as simple as just disabling the swap interval for EGL when using X11 too...

@tdaniel22
Contributor

But currently we don't even set it to 0 there when using X11, my changes only affect Wayland.

Sorry I forgot to mention, the X11 codepath is the one used on XWayland.

So, just to confirm, did you actually add this call in wxGTKImpl::IsX11(window) case? It would be surprising if this was changed from somewhere else, but if it really is, could you please run the program under gdb, put a breakpoint on eglSwapInterval and see where is it called from?

Yes I tried to set the swap interval in the wxGTKImpl::IsX11(window) case. GDB didn't find any other call to eglSwapInterval() than the one I added. Not sure yet what's up with that.

I'm not so sure about it, you're only supposed to call SwapBuffers() when the window needs to be repainted or when enough time passes, i.e. the application shouldn't be calling it in a tight loop (and if it does, it's on it to change it).

Setting the swap interval to 0 effectively disables vertical synchronization on X11, which will cause visible screen tearing, on top of pegging the CPU at 100% if the application doesn't throttle itself by some other means. Usually relying on the blocking behaviour of SwapBuffers() is the right design decision. Applications managing render timings themselves won't work properly on high or variable refresh rate monitors, for instance. As a user of both, it's really frustrating when that one application decides it will run at a fixed 60 Hz, stutter and tear in your otherwise butter-smooth and frame-perfect 165 Hz desktop environment. (I'm looking at you, Kicad.)

For power saving reasons, Wayland only requests frames from the application when the surface is visible. This caused eglSwapBuffers() to block indefinitely and hang applications that do logic in the render thread. So recently this was "fixed" by polling hidden surfaces at 1 Hz, to at least give them a chance to process a bit. Anyway, this is a well-known problem and there are heated discussions on this topic.

I think what happens with Kicad is the following. Kicad windows are not totally independent: selecting a component in one window will highlight it in the other, and vice versa. There must be synchronization going on between window threads, but one of them is running at a leisurely 1 Hz, causing the active window to slow down to a crawl while it waits for locks and so on.
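
A back-of-the-envelope model of this effect, with assumed numbers: a 60 Hz display and a full one-second wait on the hidden window's swap. It only assumes that the blocked swap ends up on the active window's critical path, whether through locks or through a shared event loop. EffectiveFps is a hypothetical helper written for this illustration.

```cpp
// Hypothetical model; the figures used with it are assumptions.
// Returns the repaint rate (frames per second) seen by the active window
// when each of its frames also has to wait for the other window's swap.
double EffectiveFps(double activeFrameSeconds, double otherSwapBlockSeconds)
{
    return 1.0 / (activeFrameSeconds + otherSwapBlockSeconds);
}
```

With activeFrameSeconds = 1.0 / 60.0 and otherSwapBlockSeconds = 1.0 the result is just under 1 frame per second; with the hidden swap skipped (0.0) it is back to 60.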

In theory though this should not happen: according to the X11 codepath, eglSwapBuffers() should be skipped if the surface is not IsShownOnScreen(). In practice however, this method seems to need some time to detect the change, after which it returns the correct value and eglSwapBuffers() is skipped.

So maybe the fix is really as simple as just disabling the swap interval for EGL when using X11 too...

Here's a list of the options that I see, by order of correctness:

1) Applications should decouple application logic from the render thread

That's the right way™ of doing a render loop if you need to keep processing stuff in the background when the user is not interacting with your application. And do not assume the render thread will be polled at any particular interval. We all know that's not happening though.

2) Applications should use the native Wayland backend and not force using XWayland

In the Wayland codepath, eglSwapBuffers() is only called after the frame callback. That's the correct way of doing it. But there are obviously some other blockers with Wayland that need fixing before Kicad switches to it.

3) wxWidgets should correctly detect hidden surfaces in the X11 codepath

As mentioned above, if wxWidgets could detect when the surface is hidden "atomically", then we wouldn't risk hanging in eglSwapBuffers(). Not sure if that's possible however.

4) wxWidgets should use non-blocking eglSwapBuffers() in the X11 codepath

The easiest but also the dirtiest fix. We would lose vertical synchronization, which means screen tearing, maxed out CPU, and all the caveats I mentioned earlier. The impact could be limited by only setting the swap interval to 0 when XWayland is in use.
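
The restricted variant of option 4 can be written down as a small decision table. This is a sketch only: the Backend enum and ChooseSwapInterval() are hypothetical, and how XWayland would actually be told apart from native X11 is deliberately left open.

```cpp
// Hypothetical classification of the display backend; the detection itself
// is not shown.
enum class Backend { NativeX11, XWayland, Wayland };

// Returns the EGL swap interval this sketch would request:
// 0 = eglSwapBuffers() returns without waiting, 1 = wait for vertical sync.
int ChooseSwapInterval(Backend backend)
{
    switch ( backend )
    {
        case Backend::Wayland:
            // Drawing is already paced by the frame callback.
            return 0;

        case Backend::XWayland:
            // Hidden surfaces are polled at about 1 Hz, so never block.
            return 0;

        case Backend::NativeX11:
            // Keep vertical synchronization to avoid tearing.
            return 1;
    }
    return 1; // not reached; keeps compilers quiet
}
```

The returned value would be passed to eglSwapInterval(m_display, ...) before swapping.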

vadz added a commit to vadz/wxWidgets that referenced this issue Oct 3, 2023
Set EGL swap interval to 0 to prevent EGL functions from blocking for 1
second when the window is entirely obscured, resulting in catastrophic
slowdown of the entire program whenever this happened.

See wxWidgets#23909, wxWidgets#23512.

(cherry picked from commit aaabb84)
@dsa-t
Contributor

dsa-t commented Oct 4, 2023

Now, before the patch, I cannot reproduce the slowdown for some reason.

@Manolo-ES
Contributor

This PR seems to me completely unrelated to wxWidgets.
wx provides just the basics needed for an OpenGL environment. The rest, whether some app is sluggish or not, belongs only to that app.

SwapBuffers has v-sync by default: it waits for the next hardware vertical blank. This works in most cases. But people using OpenGL usually want more... and there are a lot of ways of achieving faster drawing (buffer mapping, vertex and fragment shader optimization, etc). Changing the SwapInterval to 0 may work in some cases, but not in all cases.

Nothing prevents you from sending GL commands (and filling buffers) for a hidden window (minimized or behind another window). The thing is that calling SwapBuffers for such a window will discard all pixels, because the OS, not the GPU, is the owner of those pixels. The side effect of calling SwapBuffers in this case is the CPU waiting for the next v-sync, making the app sluggish if it needs more than 60 fps.
The PR's handling of this circumstance via m_readyToDraw is a good improvement, although perhaps it is the user, not wx, who should take care of it.

A good way to get the most out of the GPU is issuing data/commands to the GPU, but only call for drawing a few times.
Rendering to a "Frame Buffer Object" (FBO) can be done offscreen. Then a simple "gl-blit" followed by SwapBuffers will suffice.

@vadz
Contributor

vadz commented Oct 28, 2023

@Manolo-ES The above makes it clear that you didn't really follow this in detail so, to prevent further confusion and misunderstanding, let me repeat: this is not about optimizing anything, because the part that you missed is that Wayland sets swap interval to 1 (one) FPS, not 60 FPS, for the obscured windows, so it's not about making the app sluggish at more than 60 FPS but about making it completely unusable as soon as any window containing wxGLCanvas becomes obscured. Please refer to the previous discussions for more details.

@sre

sre commented Nov 7, 2023

IIUC the bug is supposed to be fixed in wxWidgets 3.2.3, but I can still reproduce this in Debian testing/trixie, which has KiCAD 7.0.8 and wxWidgets 3.2.3. (My setup is the same as the OP's, so Sway and KiCAD under XWayland [default].)

@vadz
Contributor

vadz commented Nov 7, 2023

This is worrisome, but the upcoming 3.2.4 will have a few more fixes to this area, please retry with this when it's released in a couple of days.

@sre

sre commented Nov 20, 2023

I'm on KiCAD 7.0.8 with wxWidgets 3.2.4 now, still running Sway. I still see lag problems, but on closer investigation the lag behaviour changed. My usual setup is having two maximized windows (PCB + schematics) on different workspaces. When switching between the windows/workspaces it hangs for some seconds. Afterwards it runs smoothly. So it might actually be a different bug. When putting both windows on the same workspace next to each other everything is working fine, though. So this is still about window occlusion.

@evils

evils commented Nov 21, 2023

i can't reproduce occlusion lag [as you describe it] with KiCad 7.0.8 or 7.0.9 on NixOS with wxWidgets 3.2.4 and sway 1.8.1
could it be you have 3.2.4 installed, but your KiCad was built with an older wxWidgets?
(not sure how this works (on non-Nix packages), but 7.0.8 is about a month older than 3.2.4)

@sre

sre commented Nov 21, 2023

It was not built against 3.2.4, but it is using 3.2.4 at runtime (this can be easily checked in KiCAD's Help -> About KiCAD dialog). Anyways - I now installed KiCAD 7.0.9 built against wxWidgets 3.2.4 and nothing changed. I did some further tests and can narrow down the issue a bit more:

First of all: When both windows are visible there are no lags. With two displays sway has one workspace per screen. Even then switching between the windows works fine. So it's not related to sway's workspaces.

When I have one window per workspace (and only one of the workspaces visible at the same time), I get a period with some seconds of unresponsiveness when switching between the windows. But the lag only happens, if the other window gets focus. E.g. this sequence works without any lags:

step 1: workspace: KiCAD Schematics (focus) + Terminal
step 2: workspace: KiCAD PCB + Terminal (focus)
step 3: workspace: KiCAD Schematics (focus)

But this one does lag:

step 1: workspace 1: KiCAD Schematics (focus) + Terminal
step 2: workspace 2: KiCAD PCB (focus, lag) + Terminal
step 3: workspace 1: KiCAD Schematics (focus, lag)

I tried resetting KiCAD config and it does happen with the default config. During the lag CPU is more or less completely idle. Also the switch lag seems to be exactly 10 seconds.

Considering EGL is involved, I guess mesa version and graphics hardware are also relevant. In my case I can reproduce the issue with two different AMD GPU based systems and I'm running mesa 23.2.1 (on both systems).

@sre

sre commented Nov 21, 2023

I tried reproducing this on my old Intel based laptop and that works fine without lags.

@sre

sre commented Dec 1, 2023

I tried running kicad in gdb and stopping it when the lag happens to generate a backtrace:

Thread 1 "kicad" received signal SIGINT, Interrupt.
0x00007ffff5119a1f in __GI___poll (fds=fds@entry=0x7fffffff9c58, nfds=nfds@entry=1, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
29	in ../sysdeps/unix/sysv/linux/poll.c
(gdb) bt
#0  0x00007ffff5119a1f in __GI___poll (fds=fds@entry=0x7fffffff9c58, nfds=nfds@entry=1, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007ffff406fd12 in poll (__timeout=-1, __nfds=1, __fds=0x7fffffff9c58) at /usr/include/x86_64-linux-gnu/bits/poll2.h:47
#2  _xcb_conn_wait (c=c@entry=0x5555560eb220, cond=cond@entry=0x55555f359298, vector=vector@entry=0x0, count=count@entry=0x0) at ../../src/xcb_conn.c:508
#3  0x00007ffff407216a in xcb_wait_for_special_event (c=0x5555560eb220, se=0x55555f359270) at ../../src/xcb_in.c:806
#4  0x00007fffe8bc7658 in dri3_wait_for_event_locked (full_sequence=0x0, draw=0x55555f4aa528) at ../src/loader/loader_dri3_helper.c:598
#5  dri3_wait_for_event_locked (draw=0x55555f4aa528, full_sequence=0x0) at ../src/loader/loader_dri3_helper.c:579
#6  0x00007fffe8bc7828 in dri3_find_back (draw=draw@entry=0x55555f4aa528, prefer_a_different=false) at ../src/loader/loader_dri3_helper.c:755
#7  0x00007fffe8bca051 in dri3_get_buffer (format=format@entry=4098, buffer_type=buffer_type@entry=loader_dri3_buffer_back, draw=draw@entry=0x55555f4aa528, driDrawable=<optimized out>)
    at ../src/loader/loader_dri3_helper.c:2060
#8  0x00007fffe8bca35d in loader_dri3_get_buffers (driDrawable=<optimized out>, format=4098, stamp=0x55555eb1b2f0, loaderPrivate=0x55555f4aa528, buffer_mask=<optimized out>, buffers=0x7fffffff9f90)
    at ../src/loader/loader_dri3_helper.c:2283
#9  0x00007fffc5ebbe4f in dri_image_drawable_get_buffers (drawable=drawable@entry=0x55555eb1b2f0, images=images@entry=0x7fffffff9f90, statts=statts@entry=0x55555f4a98b0, statts_count=statts_count@entry=2)
    at ../src/gallium/frontends/dri/dri2.c:316
#10 0x00007fffc5ebbfde in dri2_allocate_textures (ctx=0x55555f01cba0, drawable=0x55555eb1b2f0, statts=0x55555f4a98b0, statts_count=2) at ../src/gallium/frontends/dri/dri2.c:488
#11 0x00007fffc5ebefe4 in dri_st_framebuffer_validate (st=<optimized out>, pdrawable=0x55555eb1b2f0, statts=0x55555f4a98b0, count=2, out=0x7fffffffa180, resolve=<optimized out>)
    at ../src/gallium/frontends/dri/dri_drawable.c:79
#12 0x00007fffc5f81d4f in st_framebuffer_validate (stfb=stfb@entry=0x55555f4a9450, st=st@entry=0x55555ebb8760) at ../src/mesa/state_tracker/st_manager.c:239
#13 0x00007fffc5f82c25 in st_manager_validate_framebuffers (st=st@entry=0x55555ebb8760) at ../src/mesa/state_tracker/st_manager.c:1239
#14 0x00007fffc61d4b17 in st_update_framebuffer_state (st=0x55555ebb8760) at ../src/mesa/state_tracker/st_atom_framebuffer.c:120
#15 0x00007fffc61d951d in st_validate_state (pipeline_state_mask=1107296512, st=0x55555ebb8760) at ../src/util/bitscan.h:117
#16 st_Clear (ctx=0x55555eb74390, mask=18) at ../src/mesa/state_tracker/st_cb_clear.c:400
#17 0x00007fffc6036748 in _mesa_unmarshal_Clear (ctx=<optimized out>, cmd=<optimized out>) at src/mapi/glapi/gen/marshal_generated1.c:216
#18 0x00007fffc5f2e382 in glthread_unmarshal_batch (job=job@entry=0x55555eb76510, gdata=gdata@entry=0x0, thread_index=thread_index@entry=0) at ../src/mesa/main/glthread.c:139
#19 0x00007fffc5f2e97a in _mesa_glthread_finish (ctx=0x55555eb74390) at ../src/mesa/main/glthread.c:399
#20 _mesa_glthread_finish (ctx=0x55555eb74390) at ../src/mesa/main/glthread.c:364
#21 0x00007fffc602f75f in _mesa_marshal_GetError () at src/mapi/glapi/gen/marshal_generated1.c:1809
#22 0x00007fff9a60eecd in checkGlError(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*, int, bool)
    (aInfo="switching framebuffer", aFile=aFile@entry=0x7fff9ac5dd68 "./common/gal/opengl/opengl_compositor.cpp", aLine=aLine@entry=385, aThrow=aThrow@entry=true) at ./common/gal/opengl/utils.cpp:47
#23 0x00007fff9a60c5ff in KIGFX::OPENGL_COMPOSITOR::bindFb(unsigned int) (this=this@entry=0x55555eaf4a60, aFb=7) at /usr/include/c++/13/bits/basic_string.tcc:242
#24 0x00007fff9a60c74b in KIGFX::OPENGL_COMPOSITOR::SetBuffer(unsigned int) (this=0x55555eaf4a60, aBufferHandle=2) at ./common/gal/opengl/opengl_compositor.cpp:278
#25 0x00007fff9a60066e in KIGFX::OPENGL_GAL::EndDrawing() (this=0x55555eb09510) at ./common/gal/opengl/opengl_gal.cpp:712
#26 0x00007fff9a5e1e2e in KIGFX::GAL_DRAWING_CONTEXT::~GAL_DRAWING_CONTEXT() (this=<synthetic pointer>, __in_chrg=<optimized out>) at ./include/gal/graphics_abstraction_layer.h:1144
#27 EDA_DRAW_PANEL_GAL::DoRePaint() (this=0x555558041030) at ./common/draw_panel_gal.cpp:306
#28 0x00007ffff7224202 in wxEvtHandler::ProcessEventIfMatchesId(wxEventTableEntryBase const&, wxEvtHandler*, wxEvent&) (entry=..., handler=<optimized out>, event=...) at ./src/common/event.cpp:1431
#29 0x00007ffff72246ae in wxEvtHandler::SearchDynamicEventTable(wxEvent&) (this=this@entry=0x555558041030, event=...) at ./src/common/event.cpp:1901
#30 0x00007ffff7224a14 in wxEvtHandler::TryHereOnly(wxEvent&) (this=this@entry=0x555558041030, event=...) at ./src/common/event.cpp:1624
#31 0x00007ffff7224abe in wxEvtHandler::TryBeforeAndHere(wxEvent&) (event=..., this=0x555558041030) at ./include/wx/event.h:4007
#32 wxEvtHandler::ProcessEventLocally(wxEvent&) (this=0x555558041030, event=...) at ./src/common/event.cpp:1561
#33 0x00007ffff7224bc1 in wxEvtHandler::ProcessEvent(wxEvent&) (this=0x555558041030, event=...) at ./src/common/event.cpp:1534
#34 0x00007ffff7225ce5 in wxEvtHandler::ProcessPendingEvents() (this=this@entry=0x555558041030) at ./src/common/event.cpp:1398
#35 0x00007ffff70a93fa in wxAppConsoleBase::ProcessPendingEvents() (this=0x5555560e6cd0) at ./src/common/appbase.cpp:570
#36 wxAppConsoleBase::ProcessPendingEvents() (this=0x5555560e6cd0) at ./src/common/appbase.cpp:546
#37 0x00007ffff7765675 in wxApp::DoIdle() (this=0x5555560e6cd0) at ./src/gtk/app.cpp:149
#38 0x00007ffff7765757 in wxapp_idle_callback(gpointer) () at ./src/gtk/app.cpp:101
#39 0x00007ffff5ffd0d9 in g_main_dispatch (context=context@entry=0x555556114b10) at ../../../glib/gmain.c:3476
#40 0x00007ffff6000317 in g_main_context_dispatch_unlocked (context=0x555556114b10) at ../../../glib/gmain.c:4284
#41 g_main_context_iterate_unlocked (context=0x555556114b10, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../../../glib/gmain.c:4349
#42 0x00007ffff6000c1f in g_main_loop_run (loop=loop@entry=0x555556057c00) at ../../../glib/gmain.c:4551
#43 0x00007ffff65fd63d in gtk_main () at ../../../gtk/gtkmain.c:1329
#44 0x00007ffff7782995 in wxGUIEventLoop::DoRun() (this=0x555556c6faf0) at ./src/gtk/evtloop.cpp:61
#45 0x00007ffff70e4481 in wxEventLoopBase::Run() (this=0x555556c6faf0) at ./src/common/evtloopcmn.cpp:87
#46 0x00007ffff70aa86f in wxAppConsoleBase::MainLoop() (this=0x5555560e6cd0) at ./src/common/appbase.cpp:381
#47 0x000055555581f36f in APP_KICAD::OnRun() (this=<optimized out>) at ./kicad/kicad.cpp:453
#48 0x00007ffff7130a7b in wxEntry(int&, wchar_t**) (argc=@0x7ffff72cc1a4: 1, argv=<optimized out>) at ./src/common/init.cpp:497
#49 0x00007ffff7131506 in wxEntry(int&, char**) (argc=<optimized out>, argv=<optimized out>) at ./src/common/init.cpp:509
#50 0x000055555580198c in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at ./kicad/kicad.cpp:549

Looks like KiCAD is hanging in this glGetError() call.

@sre

sre commented Dec 1, 2023

For anyone looking into this with an AMD system: The bug does not occur when I force software rendering in mesa like this: LIBGL_ALWAYS_SOFTWARE=1 kicad. For me this is enough of a reason to create a mesa bug and continue there:

https://gitlab.freedesktop.org/mesa/mesa/-/issues/10235

@sre

sre commented Dec 4, 2023

@vadz do you think the workaround added to wxwidgets EGL code could also be added for GLX? Apparently KiCAD/wxWidgets/glew in Debian is currently built without EGL support (I'm writing a bug report in parallel to change that). Running KiCAD with the vblank_mode=0 environment variable set fixes the issue for me and in the mesa bug report it was suggested to copy the workaround implemented for EGL to GLX.

@dsa-t
Contributor

dsa-t commented Dec 4, 2023

Speaking of the following comment, does it really block for up to a second on a real X session, or was it only tested on XWayland?

// Before doing anything else, ensure that eglSwapBuffers() doesn't block:
// under Wayland we don't want it to because we use the surface callback to
// know when we should draw anyhow and with X11 it blocks for up to a
// second when the window is entirely occluded and because we can't detect
// this currently (our IsShownOnScreen() doesn't account for all cases in
// which this happens) we must prevent it from blocking to avoid making the
// entire application completely unusable just because one of its windows
// using wxGLCanvas got occluded or unmapped (e.g. due to a move to another
// workspace).

@vadz
Contributor

vadz commented Dec 25, 2023

do you think the workaround added to wxwidgets EGL code could also be added for GLX?

@sre Sorry for the delay with the reply, I hoped to actually do it, but as I clearly failed and am not sure I'll find time to do it in the near future either, let me at least reply to this: yes, we could do the same thing in wxGLCanvasX11::SwapBuffers(), of course, there is a glXSwapInterval() extension which is apparently universally available, so it "just" needs to be done. If you can make a PR doing it (and check that it works), it would be great, of course!
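
One practical detail on the GLX side is that swap control comes in several extension flavours. The sketch below shows one way a caller might pick one from the space-separated extension string (as returned by glXQueryExtensionsString()). SwapControl, HasExtension() and ChooseSwapControl() are hypothetical helpers, not wxWidgets code, and the sketch stops at the selection step without calling into GLX.

```cpp
#include <sstream>
#include <string>

// Which swap-control entry point a caller could use to request interval 0.
enum class SwapControl { None, EXT, MESA };

// Returns true if `name` appears as a whole space-separated token in
// `extensions`; a plain substring search could match a longer name.
bool HasExtension(const std::string& extensions, const std::string& name)
{
    std::istringstream tokens(extensions);
    std::string token;
    while ( tokens >> token )
    {
        if ( token == name )
            return true;
    }
    return false;
}

// Picks an extension able to set interval 0. GLX_SGI_swap_control is
// deliberately skipped: its specification rejects intervals <= 0.
SwapControl ChooseSwapControl(const std::string& extensions)
{
    if ( HasExtension(extensions, "GLX_EXT_swap_control") )
        return SwapControl::EXT;   // would use glXSwapIntervalEXT(dpy, drawable, 0)
    if ( HasExtension(extensions, "GLX_MESA_swap_control") )
        return SwapControl::MESA;  // would use glXSwapIntervalMESA(0)
    return SwapControl::None;
}
```

Matching whole tokens matters because names such as GLX_EXT_swap_control_tear contain GLX_EXT_swap_control as a prefix.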

Speaking of the following comment, does it really block for up to a second on a real X session, or was it only tested on XWayland?

@dsa-t Personally I only tested this under XWayland.

@sre

sre commented Dec 25, 2023

If you can make a PR doing it (and check that it works), it would be great, of course!

I will try to find some time, but no guarantees from my side either.

Speaking of the following comment, does it really block for up to a second on a real X session, or was it only tested on XWayland?

@dsa-t Personally I only tested this under XWayland.

My understanding is that native X does not run into this. It keeps the rendering loop active when a window is not visible, wasting power. But on native X the current code is also expected to have issues when the KiCad windows are on two different monitors running at different refresh rates (e.g. a gaming monitor at 144 Hz and a normal one at 60 Hz). This is possibly less noticeable, since the refresh rate should be at least 60 Hz in that specific case. I don't have a gaming monitor to test that claim from the Mesa people.

vadz added a commit to vadz/wxWidgets that referenced this issue Dec 25, 2023
We need to do it when using XWayland for the same reasons as we had to
do it in the EGL version when using either XWayland or Wayland directly:
without this, we can block for up to 1 second in glXSwapBuffers() if the
window is hidden, see wxWidgets#23512.

Closes wxWidgets#24163.
@vadz
Contributor

vadz commented Dec 25, 2023

I will try to find some time, but no guarantees from my side either.

I couldn't resist trying to do it, finally, so I did, in #24165. This works under XWayland, but I still haven't tested it under Xorg, so it would be great if you could do that. TIA!

@dsa-t
Contributor

dsa-t commented Dec 26, 2023

But we don't want to render faster than the display refresh rate. KiCad sets swap interval to adaptive or 1 to limit it.

KiCad 7.99 generally renders from:

  • first mouse event;
  • idle event when the mouse event rate is low;
  • in between mouse events when the mouse event rate is high.

There's a limit of at least 3 ms between frame end and next frame start, but for most displays (60 Hz), disabling swap interval will cause increased power draw.
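
To put rough numbers on this, here is a small sketch of the arithmetic. The 3 ms gap and the 60 Hz refresh rate come from the comment above; the 1 ms render time used in the usage note is purely a hypothetical figure, and the function names are made up for illustration.

```cpp
#include <cassert>

// Time between two vblanks, in milliseconds, for a display refreshing at hz.
inline double RefreshPeriodMs(double hz)
{
    assert(hz > 0.0);
    return 1000.0 / hz;
}

// Upper bound on frames per second when nothing waits for vblank: each
// frame takes renderMs to draw and is followed by a pause of gapMs before
// the next frame may start.
inline double UncappedFps(double renderMs, double gapMs)
{
    assert(renderMs >= 0.0 && gapMs >= 0.0 && renderMs + gapMs > 0.0);
    return 1000.0 / (renderMs + gapMs);
}
```

For instance, `UncappedFps(1.0, 3.0)` gives 250 frames per second, roughly four times what a 60 Hz display (`RefreshPeriodMs(60.0)` is about 16.7 ms) can show, so most of those frames would be rendered for nothing.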

@vadz
Contributor

vadz commented Dec 26, 2023

But we don't want to render faster than the display refresh rate.

Sorry, I don't understand: why wasn't it a problem for EGL but is for GLX?

KiCad sets swap interval to adaptive or 1 to limit it.

Wouldn't it be better to avoid calling SwapBuffers() too often instead?

@dsa-t
Contributor

dsa-t commented Dec 26, 2023

Sorry, I don't understand: why wasn't it a problem for EGL but is for GLX?

I'm sure this is a problem in EGL+(X)Wayland as well, but not freezing for 1 second probably was more important than high power usage.

Wouldn't it be better to avoid calling SwapBuffers() too often instead?

How do we know when it's the best time to swap (aka at vblank)?

On native Wayland, this can probably be hacked together via https://wayland.app/protocols/presentation-time#wp_presentation_feedback:event:presented

On XWayland... Not sure.
We could try to query the display refresh rate and limit the frame rate to it, which will cause jitter.
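
A minimal sketch of such a limiter, assuming the refresh rate has already been obtained somehow (how to query it is not covered here). `FrameLimiter` is a hypothetical name and not part of KiCad or wxWidgets; the current time is passed in so that the behaviour can be checked with synthetic timestamps.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical frame limiter: allows at most one frame per refresh period.
class FrameLimiter
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameLimiter(double refreshHz)
        : m_period(PeriodFor(refreshHz))
    {
    }

    // Returns true if a frame may start at time now and, if so, records
    // now as the start of the latest frame.
    bool TryBeginFrame(Clock::time_point now)
    {
        if ( m_hasLast && now - m_last < m_period )
            return false;

        m_last = now;
        m_hasLast = true;
        return true;
    }

private:
    static Clock::duration PeriodFor(double hz)
    {
        assert(hz > 0.0);
        return std::chrono::duration_cast<Clock::duration>(
            std::chrono::duration<double>(1.0 / hz));
    }

    Clock::duration   m_period;
    Clock::time_point m_last{};
    bool              m_hasLast = false;
};
```

The limiter only measures time since the previous frame and knows nothing about when the display actually refreshes. If redraw opportunities arrive every 5 ms, for example, a 60 Hz limiter lets frames through at 0 ms and 20 ms rather than every 16.7 ms, which is the kind of uneven pacing the comment above calls jitter.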

@vadz
Contributor

vadz commented Dec 26, 2023

Sorry, I don't understand: why wasn't it a problem for EGL but is for GLX?

I'm sure this is a problem in EGL+(X)Wayland as well, but not freezing for 1 second probably was more important than high power usage.

So I guess the question is: does this freezing happen with GLX and XWayland only? If yes, then I need to indeed add a check for XWayland being in use...

BTW, I can't reproduce the problem with GLX and XWayland here, not sure why.

Wouldn't it be better to avoid calling SwapBuffers() too often instead?

How do we know when it's the best time to swap (aka at vblank)?

Unfortunately I don't have any good answer to this either, sorry.

vadz added a commit that referenced this issue Jan 5, 2024
We need to do it when using XWayland for the same reasons as we had to
do it in the EGL version when using either XWayland or Wayland directly:
without this, we can block for up to 1 second in glXSwapBuffers() if the
window is hidden, see #23512.

Closes #24163.

Closes #24165.
@navaati

navaati commented Feb 9, 2024

Hi.

I’m waiting for this fix to land in the Flatpak (it doesn’t seem to have landed yet), because I have a similar bug: a big pause when switching between fullscreen windows under Gnome. It was very frustrating to have the program freeze whenever alt-tabbing between the schematic and the PCB windows.

For those like me, I devised a small workaround: don’t put your windows entirely in full-screen, instead resize them to nearly the whole screen except a small band left in which you see the other window open, and same for your second window on the other side.

That way, when you have the PCB in front you see a bit of the schematic window behind, and when you have the schematic window in front you see a bit of the PCB window behind. That way, Gnome never decides that a window is completely hidden (as it would with complete fullscreen) and the window doesn’t get in and out of the hidden state that causes the bug.

And it works, no more freezes when alt-tabbing! Hope it helps someone, and devs, please ping us here when the patch should be in the Flatpak, so we can test :).

@sre

sre commented Feb 10, 2024

@navaati You can set the vblank_mode=0 environment variable as a workaround. It should give you the same experience as the fix. I'm not using Flatpak for KiCad myself, but this should work:

flatpak run --env=vblank_mode=0 org.kicad.KiCad

Assuming the above does work, the following should make it permanent:

flatpak override --env=vblank_mode=0 org.kicad.KiCad

@ftdt10

ftdt10 commented Feb 11, 2024

A solution for KiCad on Ubuntu 24.04 in Wayland mode:
https://gitlab.com/kicad/code/kicad/-/issues/15578#note_1767266872
(It uses native Wayland mode rather than the XWayland compatibility layer.)

Build options for native Wayland:
KICAD_USE_EGL=on
KICAD_USE_BUNDLED_GLEW=on
KICAD_WAYLAND=on


For the wxWidgets library, use v3.2.4 or the 3.2.5-proposed-backports branch:
git clone -b BRANCH_NAME https://github.com/wxWidgets/wxWidgets.git --recurse-submodules
Then run "configure" with --with-gtk --prefix=PATH, and point LD_LIBRARY_PATH at the libraries installed under PATH.
