Improve input latency #673
Comments
Would something like Nvidia's Fast Sync help in this case?

I wrote this on the /r/rust thread and someone asked me to file a ticket, but I thought a comment in the existing ticket might be better, so here it is. On my setup there's noticeable input lag in Alacritty compared to other terminals like st and konsole. I don't know how to measure it, and I'm happy to help with getting some numbers/debugging/profiling etc. Here's how I feel the lag: when I keep a key pressed so that it repeats (e.g. arrow keys or hjkl in vim), after I stop pressing the key it repeats one or more times, whereas in other terminals key repeat stops immediately. This makes Alacritty feel like it's lagging behind my actual key presses (as if key press events are queued up and Alacritty is not fast enough, so even after I stop pressing it handles old key press events). My key repeat settings: 260ms repeat delay and a repeat speed of 55 key strokes per second (

Secondly, when typing I notice that it takes slightly more time in Alacritty for letters to appear. But this isn't as serious an issue, and I'd probably get used to it if the other issue were fixed. The problem with performance and latency issues is that everything is fast enough until you see something faster (e.g. 30 FPS in games was acceptable years ago; now anything below 60 FPS seems laggy), so it's hard to talk about these issues. Let me know if I can provide anything to diagnose the problem.
@osa1 thanks for posting this feedback here, this is really helpful. Can you share which window manager and graphics driver you're using? One final question: may we ping you when there are patches ready for evaluation, to see if they address the problem?

I have two systems and I can observe this on both. Both systems are Xubuntu
One thing I can add is that the latency is much more significant on my laptop
Of course! Let me know if there's anything I can help with.
Just so you know, X11 locking is kind of a pain, so you might not want to go down the path of spawning a separate thread. This probably isn't a good idea for typical applications, but one trick I've found is that it is perfectly reasonable to open a separate connection specifically for rendering on a different thread. The X server is still single-threaded, of course, but it pretty much handles locking automatically. It also means you can have a conventional blocking input loop and a separate rendering loop. This is mostly only useful for games, though. Another improvement is that the different threads can use different X libraries. As I recall, OpenGL requires the use of Xlib, but the separate thread (or even a separate process) can avoid Xlib entirely.
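The two-connection, two-thread design described above can be sketched with a plain channel between a blocking input loop and a render loop. This is a minimal std-only sketch: the `DrawCommand` type and `run_pipeline` helper are hypothetical names, and real code would open the second X connection inside the render thread.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical message type sent from the input loop to the render loop.
enum DrawCommand {
    Redraw,
    Shutdown,
}

// Spawn a render thread (which, in a real X11 program, would own its own
// display connection), feed it `events` redraw requests from the blocking
// input loop, and return how many frames it drew.
fn run_pipeline(events: u32) -> u32 {
    let (tx, rx) = mpsc::channel();

    let renderer = thread::spawn(move || {
        let mut frames = 0u32;
        for cmd in rx {
            match cmd {
                DrawCommand::Redraw => frames += 1, // would swap buffers here
                DrawCommand::Shutdown => break,
            }
        }
        frames
    });

    // Input loop: in a real program this would block on X events instead.
    for _ in 0..events {
        tx.send(DrawCommand::Redraw).unwrap();
    }
    tx.send(DrawCommand::Shutdown).unwrap();

    renderer.join().unwrap()
}

fn main() {
    println!("rendered {} frames", run_pipeline(3));
}
```

Because the input loop never waits on the renderer, a slow frame cannot delay the processing of the next key event.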
@MarcoPolo While this is related to this issue, what you're showing off looks like an actual bug somewhere. The typing performance in Alacritty shouldn't be noticeably bad; this issue is just about improving what is already good. I'm not using macOS, but I'd be interested in knowing which branch you run and whether you are actually running with a dedicated GPU. This might be related to #1348, so I'd be interested to see what your benchmark times look like. If they are extremely bad, like in that issue, it's probably best to follow up there.
Built from master (commit: d2ed015) Just ran the same benchmark: On Terminal.app (the snappier one): On Alacritty: So even though Alacritty feels slower, the benchmark is faster. So my guess is it's not related to that issue. Is there another branch I should try? Thanks!
Yeah, that looks like actual input latency rather than rendering issues or similar. You could try the #1403 PR; that's the only other branch which has a chance of improving this situation.
Yeah, I would recommend taking this to a separate issue.
Since 2016, MacBook Pros have defaulted to a hidpi scaled resolution, and the 13" model doesn't have a dedicated GPU (and never has). Meeting users where they are is always a good idea.

The latency is very noticeable, especially vs Terminal.app, which may be the most responsive Mac app ever for typing. Even Sublime feels slow in comparison.
On Thu, Oct 11, 2018 at 6:50 PM Christian Duerr ***@***.***> wrote:
Yeah, I would recommend taking this to a separate issue.
I guess one could provide an "input latency hack" where input is sent to the GPU immediately. This gives 1 VBI of latency, but may cause input to pop in and out with some programs. sudo and things like it would still work fine, because they emit control codes to hide input, but this is probably best as an opt-in rather than a default.

The problem is that libreadline applications, like bash, turn off the terminal's built-in echoing support just like sudo does. So you either get jank from sudo, or the single most popular terminal application hits the slow path.
hm. what the hell does mosh do then.

Ugly, ugly heuristics. The sort of thing that should not be necessary on a local machine.
What happens if you do it on a local machine?

In other words, mosh doesn't do the predictive echoing all the time, because it wants to avoid stray characters where it gets the prediction wrong. Assuming you went ahead and did predictive echoing even though local terminals should never hit mosh's high-latency trigger limit:
Accepting the possibility of glitch letters appearing just to decrease latency by 1/60th of a second under very specific circumstances doesn't seem worth it.
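The tradeoff being discussed can be made concrete with a small sketch. The `should_predict` policy below is hypothetical, loosely modeled on mosh's behavior as described here; the 30 ms trigger threshold is an assumption, not mosh's actual constant.

```rust
/// Hypothetical predictive-echo policy in the spirit of mosh: only echo a
/// keystroke locally when the measured round trip is slow enough that the
/// latency win outweighs the risk of a stray glitch character.
fn should_predict(rtt_ms: u32, predictions_confirmed: bool) -> bool {
    const HIGH_LATENCY_TRIGGER_MS: u32 = 30; // assumed threshold, not mosh's real value
    rtt_ms >= HIGH_LATENCY_TRIGGER_MS && predictions_confirmed
}

fn main() {
    // A local terminal never comes close to the trigger, so no prediction...
    assert!(!should_predict(2, true));
    // ...while a laggy remote session with a good track record predicts.
    assert!(should_predict(150, true));
    // A session whose past predictions were wrong stays conservative.
    assert!(!should_predict(150, false));
    println!("policy checks passed");
}
```

This shows why the heuristic buys nothing locally: a local terminal's round trip is far below any sensible trigger, so prediction never activates.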
I bring measurements! Of output latency! (Using https://github.com/mstoeckl/latencytool, a 187Hz camera, a 60Hz QHD display, and uncomposited X11. This may not be a realistic benchmark, but it is very easy to perform. Expect ±10 ms uncertainty for the 99th percentile, and ±3 ms on averages.) With a small (400x400px) window, some timings between
With a full screen (2560x1401) window, same environment/lighting as above:
Slightly modified full screen testing environment, (2560x1401):
On sway, full screen, there is no significant difference relative to X11:
With an [unoptimized + debuginfo] build:
Why this performance reduction?

Edit: Updated with more fullscreen measurements. Font size choices can make a big difference, and some terminal emulators may have optimized clear screen (

Edit 2: The unoptimized build is surprisingly slow. On clear screen, it calls a function per grid cell; for comparison, xterm's clear screen operation is almost a memset, and can perform ~1200 full screen clears per second. (2300 fps is the theoretical maximum, at a 25.6 GB/s data transfer rate.)

Edit 3: For alacritty, the dark->light transitions of the test take slightly longer than the light->dark transitions. After looking with apitrace, it turns out since my background color is black, almost all

Edit 5: In the test above, Kitty was run with the default double buffering, 10ms output delay, and 3ms input delay. (A power-saving policy.) Switching to single buffering and 0 ms input/output delay, I observe 27-30ms full screen average render times, with a 99th percentile of ~45ms. It remains unexplained why alacritty takes so long to render full screens, in comparison.
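The theoretical ceiling quoted in Edit 2 can be reproduced with simple bandwidth arithmetic. This sketch assumes each clear writes every pixel exactly once at 3 bytes per pixel (my assumption; the exact figure depends on the pixel format):

```rust
/// Theoretical upper bound on full-screen clears per second, given memory
/// bandwidth and assuming each clear writes every pixel exactly once.
fn max_clears_per_sec(width: u64, height: u64, bytes_per_pixel: u64, bandwidth: u64) -> u64 {
    bandwidth / (width * height * bytes_per_pixel)
}

fn main() {
    // 25.6 GB/s over a 2560x1401 framebuffer at 3 bytes/pixel lands in the
    // same ballpark as the ~2300 fps figure quoted above.
    let fps = max_clears_per_sec(2560, 1401, 3, 25_600_000_000);
    println!("theoretical max: {} clears/s", fps);
}
```

By the same arithmetic, xterm's ~1200 clears/s means it is already within a factor of two of the memory bandwidth limit.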
@mstoeckl Thanks a ton for looking into this. It's always great to have some more benchmarks, especially in areas that are hard to automate. Just out of curiosity, are you running an AMD or Nvidia GPU? There are some performance-relevant workarounds in the renderer which might affect this.
@mstoeckl seconding the massive thank you! Really interesting data you've gathered here. Out of curiosity, do you have a similar table comparing terminals at full screen window size? Separately, did you mean to link the Kitty shaders in your second link? |
Intel HD5500, i915, UXA, X11, i3. Due to relatively long intervals between color switches (~100ms), the GPU was, AFAIK, sitting at 450 MHz for all the tests.
I can make one, but that will need to wait until the weekend. As most terminals do partial updates for tasks in which low latency is desired, full screen tests are not as useful. (I do have similar data for application toolkit latency -- with the most efficient methods (xcb, framebuffer), a full screen color switch has 25 ms average latency between command and camera measurement.)
As an example of complicated shaders :-) Running
I have a patch that reduces the total amount of work done by the GPU on my computer, for realistic inputs, by about 50%. This has minor latency impact (maybe 0.5ms?). (My graphics system performs the fragment discard in 0001-Distinct-render-batches-for-background-and-text.patch.txt I'm posting this here because I don't expect to have the experience/motivation/time to make a proper changeset in the near future. If anyone wants to pick this up, the following things are advised:
Using single buffering reduces average time-to-camera on my screen by ~3 ms, for a full screen window. The time between sending
Anecdotally, I see that 0.3.0 hasn't significantly improved since I last looked at this: it's still within the average of most terminals, which is nothing to be ashamed of, but it would be really nice to get below that 10ms threshold. In my experiments I found that was really hard, if not impossible, the second a compositor steps in with double buffering; even xterm fails to get below 10ms then.
Has anyone reproduced these benchmarks with the latest Alacritty?
Just want to add that comparing the latest kitty to the latest alacritty, kitty is noticeably more responsive for me when typing, and it seems they default to 100 fps. However, it's not always desirable to do this (iTerm can turn off GPU rendering on battery). One way to make the lag more obvious is to use your trackpad to scroll up and down really fast. Doing this on a MacBook, you can feel that kitty is definitely more responsive. Also not sure if related, but starting nvim you can see the background fill from top to bottom, whereas on kitty it appears instantly. (On a 2.4 GHz Core i9 MacBook, whether on Intel integrated or AMD 5500M graphics.)
Oh, macOS. I really wonder what is going on on macOS with input, since it's just slow compared to Wayland/X11. One thing that could help is to schedule frames to render closer to vblank, but there could be issues in our windowing stack which make it slow. Maybe we're using something that macOS doesn't like, making it even slower.
What is Terminal.app doing to achieve such low latency? Are they using custom hooks, like the Windows hardware cursor that bypasses the display manager? Its throughput/framerate can't touch Alacritty, but the typing experience feels much better.

For what it's worth, I've been told that the way browsers minimize latency for text input is to skip double buffering (for text input) and draw immediately without waiting for vsync. (For animations they do wait for vsync, to avoid tearing.) One reason other terminal emulators may have lower input latency is that they aren't going out of their way to double buffer, and since they aren't using OpenGL, they aren't getting double buffering by default either. If tearing doesn't seem to be a problem for them in practice, it probably wouldn't be for Alacritty either! I don't know how OpenGL controls those settings, but I have a side project using wgpu, and if I change PresentMode between Immediate and any of the modes that wait for vsync, the input latency when waiting for vsync is instantly, noticeably worse, yet neither mode exhibits tearing (for text input, at least).
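The latency difference described here is easy to model: with vsync, a frame finished just after a vblank can sit for nearly a whole refresh interval before scanout, while an immediate present skips that wait. A rough sketch of that model (the function name is mine, not wgpu's API):

```rust
/// Worst-case extra presentation delay: with vsync, a frame completed just
/// after a vblank waits almost a full refresh interval for the next one.
fn worst_case_present_delay_ms(refresh_hz: f64, wait_for_vsync: bool) -> f64 {
    if wait_for_vsync {
        1000.0 / refresh_hz // one refresh interval, in milliseconds
    } else {
        0.0 // immediate present: no wait for vblank
    }
}

fn main() {
    // At 60 Hz, waiting for vblank can add up to ~16.7 ms before scanout.
    println!("vsync:     {:.1} ms", worst_case_present_delay_ms(60.0, true));
    println!("immediate: {:.1} ms", worst_case_present_delay_ms(60.0, false));
}
```

This matches the observation above: at 60 Hz the vsynced path can cost most of a frame of extra latency, which is easily perceptible when typing.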
I opened a PR to explicitly disable vsync: #3955 I didn't do any benchmarking though; please feel free to test it!
Closing in favor of #3972, which provides a concrete solution to the problem. While there always might be some latency issues, I think after resolving #3972 all the low-ish hanging fruit should be done with, and we should look at specific problems rather than the general "improve input latency" (which would never be "done").

will #3972 provide some sort of a hard upper limit to latency?

@chrisduerr Would an input latency tracking issue be alright, to gather relevant issues & PRs over input latency and be edited with new issue/pull numbers over time? (Or could this issue be converted into a tracking issue, if it's not too cluttered?)
As I've just stated, no, that would not be alright. There aren't a lot of things left to do when it comes to input latency, so there's no point in keeping an issue open indefinitely just to track something that doesn't have any actual fix because it is way too vague. If you want to follow an issue, look at #3972; if you have concrete issues after that is fixed, you should open a specific issue outlining the problems.

Mind you, there's a patch for st to improve that:
Have you re-run the test lately?
How do you verify hz/fps in alacritty? This is a standard browser test:




First, for background, please see https://danluu.com/term-latency/

In the case of Alacritty, we have a worst-case latency (assuming reasonable draw times) of 3 VBLANK intervals. Here's the scenario:

In total, that's `3 * VBI - draw_time`. In a perfect world, draw_time is zero, and our worst case input latency is 3 VBI.

This can be resolved by moving the rendering to a separate thread. Certain windowing APIs require input processing to occur on the main thread, so input processing must stay in place. In the same scenario as described above, we can reduce the worst case to 2 VBLANK intervals. With input processing on its own thread, it no longer needs to wait for `swap_buffers` to return. Key press events can be sent to the terminal immediately, which means any drawing the child program does will be available to draw on the very next frame.
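The `3 * VBI - draw_time` bound, and the improvement to 2 VBI with a threaded renderer, can be written out directly. This is a sketch of the arithmetic above, not Alacritty code:

```rust
/// Worst-case input latency per the scenario above: three vblank intervals
/// minus the draw time with rendering on the main thread, or two intervals
/// once rendering runs on its own thread and the input loop no longer
/// waits on swap_buffers.
fn worst_case_latency_ms(vbi_ms: f64, draw_time_ms: f64, threaded_renderer: bool) -> f64 {
    let intervals = if threaded_renderer { 2.0 } else { 3.0 };
    intervals * vbi_ms - draw_time_ms
}

fn main() {
    let vbi = 1000.0 / 60.0; // ~16.67 ms per vblank interval at 60 Hz
    // In the perfect-world case (draw_time = 0), the bounds are 3 VBI and 2 VBI.
    println!("main-thread rendering: {:.1} ms", worst_case_latency_ms(vbi, 0.0, false));
    println!("threaded rendering:    {:.1} ms", worst_case_latency_ms(vbi, 0.0, true));
}
```

At 60 Hz this works out to roughly 50 ms versus 33 ms worst case, which is why moving rendering off the input thread is worthwhile even before any per-frame optimization.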