Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy paste with kitty terminal is semi-broken #2770

Closed
Ranguvar opened this issue Oct 5, 2018 · 34 comments
Closed

Copy paste with kitty terminal is semi-broken #2770

Ranguvar opened this issue Oct 5, 2018 · 34 comments

Comments

@Ranguvar
Copy link

Ranguvar commented Oct 5, 2018

Taking a risk because I know "[we] are not fond of bug reports", and this may be a bug in kitty, but maybe others have noticed and will help me debug this.

Also, the bug hinges on sway workspaces, which makes me think it might be a sway bug.

I'm using KITTY_ENABLE_WAYLAND=1 kitty 0.12.3 with latest git sway as of writing - "sway version 1.0-alpha.6-147-gc1f09939 (Oct 5 2018, branch 'makepkg')"

Copy paste works, but seems quite janky.

For instance, if I ctrl+shift+c from kitty, I can ctrl+v into kitty or Firefox on the same workspace - no problems.

However, if I switch workspace and ctrl+shift+v to another kitty or Firefox there, the terminal blinker pauses for a few moments, then carries on as usual - no paste.

If I ctrl+shift+v to that program on another workspace, and then during the time the terminal blinker is paused, I switch rapidly between workspaces, the paste then happens.

If I ctrl+shift+v to a program that I start after copying, or to an existing program on the same workspace, it works fine.

I'm not certain if this is a kitty bug or a sway bug, but it doesn't happen with termite.
It also doesn't happen with kitty+Weston.
The bug was rejected by the kitty developers because they don't have time to devote to sway-specific bug.

@emersion
Copy link
Member

emersion commented Oct 5, 2018

Can you upload a log of kitty with WAYLAND_DEBUG=1 set?

@Ranguvar
Copy link
Author

Ranguvar commented Oct 5, 2018

Absolutely.

First run - allowing kitty to fail to copy, by not switching workspaces rapidly after attempting to paste.

First kitty (copying from): http://ix.io/1lZf
Second kitty (copying to): http://ix.io/1lZz

Second run - switch workspaces back and forth when the stall hits after pasting, which results in a successful paste.

First kitty (copying from): http://ix.io/1lZA
Second kitty (copying to): http://ix.io/1m3H

I also noticed that during the paste attempt, while terminal blinker pauses, ALL debug output ceases.

@tarmack
Copy link
Contributor

tarmack commented Oct 8, 2018

I am also seeing this issue, what I noticed as well is that Kitty seems to not actually render any output from the running command either when not on the active workspace. When getting focus back to the workspace the terminal is running at it seems to catch up really quickly.

I am inclined to deduce from this that Kitty stops processing (only the display threat?) when not in focus for some reason. Another reason for coming to this conclusion: Start Kitty from a terminal, move the Kitty window to a different workspace, go back to the previous terminal and ^C it. Kitty will not die, the workspace with Kitty will stay. Only when you focus the workspace with Kitty again it will quickly quit.

And yet another point, Sway and running Kitty in single-process mode are not compatible, the Kitty windows will only work on the workspace that has the master process.

I know this is just a bunch of anecdotes, but they may help to get to the issue. I will be digging in the code (both Sway and Kitty) some where later this week.

@emersion
Copy link
Member

emersion commented Oct 8, 2018

This seems like a Kitty (or GLFW) issue. When a client isn't visible (e.g. hidden because on another workspace) we don't send wl_surface.frame events (to prevent it from rendering frames that won't be displayed). However this seems to stall Kitty completely. Clients not receiving wl_syrface.frame events should stop rendering but should continue to process other Wayland events (like requests to send the clipboard data).

On the sending side:

[1196800.577] wl_data_source@29.send("text/plain;charset=utf-8", fd 13)
[278 19:03:24.774997] [glfw error 65544]: Wayland: Could not copy writing to destination fd failed with error: Broken pipe

On the receiving side:

[1191389.160]  -> wl_data_offer@4278190080.receive("text/plain;charset=utf-8", fd 15)
[278 19:03:21.365602] [glfw error 65544]: Wayland: Failed to read clipboard data from pipe (timed out)

@martinetd
Copy link
Member

That would probably also be why copy/paste doesn't work; kitty is heavily mono-threaded so odds are it just asked to draw something and waits there..

Actually it's just calling eglSwapBuffer without making sure the view is ready to draw, so that just hangs forever because it cannot do the swap while the term isn't displayed.
It's creating frame callbacks but it doesn't try to do the swap depending on the callback coming; it shouldn't be too complicated to fix... that part is in the kitty code so at least there's no worry about having to fix glfw both in kitty and upstream.
The dev will probably just tell you to use Xwayland if you ask but he'll take a patch if you send one, open an issue there if there isn't one yet :)

@martinetd
Copy link
Member

Hmm actually spoke too fast, I don't see where kitty would explicitly create a wl_surface_frame, but it does according to the WAYLAND_DEBUG output so the first thing to do would be to figure where that happens... Anyway, it's definitely a kitty issue, closing here - feel free to ask more question about how wl_surface.frame events work here anyway.

@emersion
Copy link
Member

emersion commented Oct 8, 2018

IIRC eglSwapBuffer manages frame events "for you" when you don't have eglSwapInterval(0).

@martinetd
Copy link
Member

martinetd commented Oct 8, 2018

meh, if so that's going to be quite a bit of work to make kitty properly wait for windows to be ready to display, which will imply some glfw changes as well... It realy hasn't been designed for having a single program display multiple windows over different workspaces and things like that

otoh there's already a kitty option to set eglSwapInterval to 0, setting sync_to_monitor no - that seems to help with that

(EDIT: finished sentence)

@kovidgoyal
Copy link

If I understand the issue correctly, all you need to do is patch glfw to track whether a window is ready to swap buffers or not and make glfwSwapBuffers() simply return early if it is not.

@kovidgoyal
Copy link

Oh and there is no need to fix glfw upstream (unless you want to) kitty's copy of glfw has permanently diverged, principally because of glfw/glfw#1140

@kovidgoyal
Copy link

Similar fix for mpv: mpv-player/mpv#249

And this seems like a problem with wayland/eglSwapBuffers design. There seems to be an unavoidable race here. Any client trying to use eglSwapBuffers has to somehow know if the window is drawable or not before calling eglSwapBuffers. Since the notification is asynchronous, there is no reliable way to do that. It can happen that the window is hidden while the client is calling eglSwapBuffers.

@martinetd
Copy link
Member

The problem is the current glfw code doesn't expose the frame callback because eglSwapBuffers does it for us ; mpv is slightly different because they already did have a frame callback to check and know if it's safe to draw (the wl->frame.pending is reset in that callback)

Basically, a wayland app that wants to "sync to monitor" needs to explicitly create a frame, and swap buffers once the callback comes; so the proper fix for kitty is more involved than what I thought at first because glfw does not register that callback.

So a proper fix would be for glfwSwapBuffers() to create a frame if there isn't one yet, and then do the actual swap when the callback comes. I think one glfwSwapBuffers when callback comes is guaranteed not to block (or rather, we should set eglSwapInterval(0) anyway so it'll know it doesn't need to create a new frame, and won't block whatever happens anyway?) and the kitty sync_to_monitor option would either schedule that callback to be in sync, or just call eglSwapBuffer directly if it's off.
eglSwapBuffer with eglSwapInterval(1) does the same except that it just waits until the callback comes inline, which is effectively why you call it racy, but we can do the work for it and remove the race - if I understood things correctly, if your window is hidden between the callback and the eglSwapBuffer call, it'll just draw that one whenever the window is displayed again, and won't send any new callback until then ; basically you're always one step late so it doesn't matter when the window is hidden.

Actually setting sync_to_monitor off does "fix" the hang behaviour already but I'd like to hear from people who reported the bug if it fixes their copy/paste, that was the original issue afterall...

(fwiw, I understand kitty's glfw and upstream have permanently diverged, but I was talking about fixing both because we have some glfw people lurking around, Cc @linkmauve - this might be something others run into)

@kovidgoyal
Copy link

doesn't seem that hard. Just add a #ifdef _GLFW_WAYLAND block to glfwSwapBuffers() in context.c that can register the callback on the window as needed.

@kovidgoyal
Copy link

Just for giggles, I tried to reproduce this issue using weston owrkspaces, and I cannot. Steps I tried:

  1. Start kitty
  2. Copy some text
  3. switch to another workspace
  4. start weston-terminal and right click and select paste
  5. text from (2) is copied.

I tried this several times just in case it is timing related. Looking at the kitty code, kitty does not render windows that are marked as invisible. So presumably weston simply allows clients to swap buffers at their discretion, leaving it up to the client to decide when to stop rendering, instead of trying to force the client to not swap buffers.

Seems to me the correct fix is for sway to do the same. That way all clients will work with it and the worst that will happen is clients that do not checkvisible status of their windows before swapping, will do some extra rendering. No potential for races here, or relying on things like swap buffers not hanging after a callback.

@emersion
Copy link
Member

emersion commented Oct 9, 2018

I think there's some confusion here.

Looking at the kitty code, kitty does not render windows that are marked as invisible.

Wayland doesn't have a way to "mark windows as invisible". Windows not visible have nothing different from visible ones, except they may not receive frame events.

So presumably weston simply allows clients to swap buffers at their discretion, leaving it up to the client to decide when to stop rendering, instead of trying to force the client to not swap buffers.

The compositor cannot prevent clients from swapping buffers. However as @martinetd said, if you don't eglSwapInterval(0) then eglSwapBuffers will block until the next frame event. This is a compatibility shim to allow clients not aware of frame events to work on Wayland too (preventing e.g. busy loops). This is the client decision to keep this behavior or don't block with eglSwapBuffers.

Seems to me the correct fix is for sway to do the same.

Weston probably just sends frame events to all windows even if they're invisible. That means that e.g. if you have a video player in a hidden workspace then it will continue to render frames even if none of them will be displayed. Thus for frame events Weston cannot be used as a reference. The fact that Weston always sends frame events is a Weston-specific behavior, and not doing it is (1) not violating the spec and (2) actually a lot better.

We will not send frame events to hidden clients. Instead, clients should properly:

  • Use frame events to know when they should render
  • Continue to process other events even when frame events don't come in

In order to do so, I suggest:

  • You always use eglSwapInterval(0) on Wayland
  • When you want to render, you request a frame event
  • When you receive a frame event, you render + eglSwapBuffers (which doesn't block)

By doing so, other events (such as clipboard) should properly be handled.

@kovidgoyal
Copy link

kovidgoyal commented Oct 9, 2018 via email

@ddevault
Copy link
Contributor

ddevault commented Oct 9, 2018

So clients on Wayland cannot know if they need to render their windows or not? It means that kitty on wayland will unnecessarily have to render each frame for every invisible window.

No, Wayland clients should render their window when they get the frame callback. They don't need to know if they're "invisible" or not. They just need to render every time they get a frame callback, no more or less frequently then that. If they are "invisible" they will not receive frame events.

@ddevault
Copy link
Contributor

ddevault commented Oct 9, 2018

Of course, if they get a frame callback and nothing has changed since the last one, they can choose not to rerender, too.

@kovidgoyal
Copy link

kovidgoyal commented Oct 9, 2018 via email

@emersion
Copy link
Member

emersion commented Oct 9, 2018

Practically speaking, that is never going to happen, since most toolkits aren't designed that way, and even the reference Wayland compositor is not designed that way, since it sends frames even for hidden windows.

Note that GTK, Qt, Weston clients, mpv, sway clients and many other clients handle this properly. The reference compositor isn't optimized, but this could be fixed.

It's indeed more difficult for cross-platform toolkits that weren't designed this way.

@emersion
Copy link
Member

emersion commented Oct 9, 2018

I could probably use some heuristic hack

Is there a reason why you can't implement frame callbacks properly?

@kovidgoyal
Copy link

I dont see an elegant way to integrate frame callbacks into kitty's event loop (all the other backends kitty works with dont have any such concept, and I'd really rather avoid having two different event loops per backend).

Basically the rendering logic in kitty is:

wait for window system events
check all windows that are visible and that have changed their state since the last render
render these windows and call swap buffers

The main problem is with this structure, I dont see a good way to decide when not to render a window. Only thing I can think of is a hearistic based on time since last frame callback. But if someone else has a bright idea, I am all ears.

@martinetd
Copy link
Member

You must already have the wayland socket fd somewhere in your events, the callback is a function that will be called when you process wayland events.

Basicaly just create the callback when you would be swapping then forget about it (just check if there isn't another pending callback first) ; and in the callback function do the swap itself.
I think it'll work fine for kitty, it's just not going to be pretty because wayland won't behave the same as other backends even if it's all hidden in glfw/context.c :/

@kovidgoyal
Copy link

kovidgoyal commented Oct 9, 2018

The problem is that on wayland all the processing that is skipped for hidden windows on other backends will happen without the heuristic I mentioned. Fixing the hang issue with sway is pretty trivial, as I said earlier. The problem I am concerned about is the lack of window visible state in wayland.

@ddevault
Copy link
Contributor

ddevault commented Oct 9, 2018

Indeed, in pretty much any application I have ever seen, high level rendering is never driven solely by window expose events, but by lots of other input sources, background data processing, etc.

Done poorly, then. These applications should set a dirty bit somewhere to indicate that a re-render is required on the next frame.

It sounds like you should incorporate the Wayland socket into your internal event loop and use "was there a frame event" as part of the critiera for whether or not to render a frame.

@ddevault
Copy link
Contributor

ddevault commented Oct 9, 2018

You should just assume windows are never invisible on wayland, because this is not a concept that makes sense on wayland.

@kovidgoyal
Copy link

kovidgoyal commented Oct 9, 2018 via email

@kovidgoyal
Copy link

kovidgoyal commented Oct 9, 2018 via email

@kovidgoyal
Copy link

And coming to the question of implementing a workaround for sway in glfw, I just realised that even that is not that easy to get right, correctly. The problem is that if you dont swap buffers in the swapbuffers call, and instead do it during event processing, then the buffer can be in an indeterminate state, since the application will assume the buffer has been swapped and coud write partial updates to what it incorrectly thinks is the back buffer during event processing. So if the frame callback event hapens at the wrong time, the rendered buffer could be corrupted/partially rendered.

So I guess the only fix left is to just use a swap interval of zero and forget about the inefficiency.

@ddevault
Copy link
Contributor

ddevault commented Oct 9, 2018

Windows might not be visible on Wayland compositors, that's not what I mean. I mean that clients don't know about it. We might want to throttle your frame events for some other reason, like we're only showing your window as a thumbnail and want to improve performance. The point is that visible or not is the wrong way for clients to think about frame events on Wayland.

@ddevault
Copy link
Contributor

ddevault commented Oct 9, 2018

What do you think is more likely: the entire Wayland ecosystem was designed and built by idiots, or you don't fully understand it yet?

@kovidgoyal
Copy link

I dont really care about frame events. I want some way to know when it is safe to skip updating the GPU data for a window. In other windowing systems that is when the window in in hidden or iconified state. As far as I can tell, there is no way to make that determination on wayland. Please correct me if I am wrong.

@kovidgoyal
Copy link

What do you think is more likely: the entire Wayland ecosystem was designed and built by idiots, or you don't fully understand it yet?

Given all the things I've had to work around for wayland, definitely the former :) But in all seriousness, I have outlined my issue, I wait eagerly for someone to expain to me how to fix that issue on wayland, if it possible to fix at all.

@ddevault
Copy link
Contributor

ddevault commented Oct 9, 2018

I dont really care about frame events.

Then your client is badly designed and will not be performant on Wayland

I want some way to know when it is safe to skip updating the GPU data for a window.

This is what the frame callback is for

I have outlined my issue, I wait eagerly for someone to expain to me how to fix that issue on wayland

It has been explained to you: use the frame callback.

@swaywm swaywm locked as off-topic and limited conversation to collaborators Oct 9, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

6 participants