Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The Multiplexing is very slow #1872

Closed
joexue opened this issue Apr 13, 2022 · 46 comments
Closed

The Multiplexing is very slow #1872

joexue opened this issue Apr 13, 2022 · 46 comments
Labels
bug Something isn't working fixed-in-nightly This is (or is assumed to be) fixed in the nightly builds. multiplexer

Comments

@joexue
Copy link

joexue commented Apr 13, 2022

What Operating System(s) are you seeing this problem on?

Windows

WezTerm version

20220101-133340

Did you try the latest nightly build to see if the issue is better (or worse!) than your current version?

No, and I'll explain why below

Describe the bug

The functionality of TLS multiplexing is very good but a bit slow, if I directly use ssh connect to the host, every thing is fine but if I use the TLS multiplexing, it is very slow, I even can see a double line under the character I typed in. If I use delete the cursor even can go back beyond my prompt to the first column of the terminal. Since it it is connecting to company's server, I cannot capture what it is, sorry about that.

To Reproduce

configure a tls multiplexing server and connect to it.

Configuration

no specific

Expected Behavior

run at least as fast as ssh connection

Logs

No response

Anything else?

No response

@joexue joexue added the bug Something isn't working label Apr 13, 2022
@wez
Copy link
Owner

wez commented Apr 15, 2022

Can you describe in more detail what you're doing when you notice that it is slow?
What is the ping/round-trip latency between the local and remote hosts?
How large is the terminal (rows and columns)?

Have you tried adjusting the local_echo_threshold_ms setting? See bottom of https://wezfurlong.org/wezterm/config/lua/TlsDomainClient.html?highlight=local_echo_threshold_ms#tlsdomainclient for details

@wez wez added the waiting-on-op Waiting for more information from the original poster label Apr 15, 2022
@joexue
Copy link
Author

joexue commented Apr 18, 2022

I notice it is slow because when I type, it response not instantly, as I describe in the original post I even can see the double line under the characters I type in in short time. for example below, if I type in "ls example" I will see the double under line from "l" to "e" in order for short time. and if I try to delete the "ls example" the cursor even can reach 'h".

host$ ls example

The network is a bit slow I know, ping round-trip time is 226ms.
of course if I connect to a host with fast network, the symptoms not so significant.
but if I use ssh directly connect to problematic server without multiplexing, the response is much faster. so I think multiplexing contributes more than network speed.

I use local_echo_threshold_ms = 10 and it doesn't help.

@joexue
Copy link
Author

joexue commented Apr 18, 2022

the terminal is not large, I use small terminal. 1920x1280 and font size is 14

@joexue
Copy link
Author

joexue commented Apr 20, 2022

try to use ssh domain instead of tls domain, the result is same.

@joexue
Copy link
Author

joexue commented Apr 20, 2022

to be more clear, the client is windows and the server is Linux, I cannot try to use local wezterm to connect mux server, since my server doesn't have GUI support.

@joexue
Copy link
Author

joexue commented Apr 20, 2022

I managed to record a video by phone and now you can see the problem

IMG_2216_MOV_AdobeCreativeCloudExpress

@wez
Copy link
Owner

wez commented Apr 20, 2022

Try setting local_echo_threshold_ms = 300 to discourage the local predictive echo?

@joexue
Copy link
Author

joexue commented Apr 20, 2022

Try setting local_echo_threshold_ms = 300 to discourage the local predictive echo?

looks not helping

@joexue joexue changed the title The TLS Multiplexing is very slow The Multiplexing is very slow Apr 21, 2022
@joexue
Copy link
Author

joexue commented Apr 21, 2022

updated the tile to remove TLS since I found it is not TLS specific, the problem happens whatever the multiplexing is by TLS or UNIX domain

@digitallyserviced
Copy link
Contributor

I actually have had a similar issue. When using the mux unix domain, and navigating while holding down arrows or hjkl in neovim, the repeat actually gets stuck and I cannot even stop it without spamming opposite keys which then have their own delayed repeat until the cursor finally stops.

I have since stopped using the unix domain mux just because there is a huge delay sometitmes, while a direct wezterm-gui feels immediate and very responsive in comparison.

I can say that this seems to happen after a little while into the server running after it has used up some memory. Killing and restarting the process earns me some more responsiveness for a little while but in the end becomes enough of an issue to have left the feature altogether.

I will work on getting some data,logs,rec of the issue but I had been keeping an eye on this issue because of it mentioning slowness, and now that OP mentions it happening over even unix domain for them too I figured I should chime in as me too.

@github-actions
Copy link

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.

@wez wez reopened this Apr 29, 2022
@wez wez removed the waiting-on-op Waiting for more information from the original poster label Apr 29, 2022
@digitallyserviced
Copy link
Contributor

Is this something related to the #1814 #1841 that may have been resolved with the recent
b908e2d commit?

@joexue
Copy link
Author

joexue commented Dec 6, 2022

Used sometime to dig it, the slow looks from proxy more than network transmission somehow.

@joexue
Copy link
Author

joexue commented Dec 7, 2022

22:55:13.618 TRACE wezterm_term::terminalstate::keyboard > key_down: sending "x", Char('x') NONE
22:55:13.618 TRACE async_io::driver > main_loop: waiting on I/O
22:55:13.618 TRACE async_io::reactor > process_timers: 0 ready wakers
22:55:13.618 TRACE polling > Poller::wait(, None)
22:55:13.618 TRACE polling::epoll > wait: epoll_fd=5, timeout=None
22:55:13.618 TRACE wezterm_mux_server_impl::sessionhandler > 384 processing time 155.85µs
22:55:13.618 TRACE polling::epoll > modify: epoll_fd=5, fd=7, ev=Event { key: 18446744073709551615, readable: true, writable: false }
22:55:13.618 TRACE async_std::task::builder > block_on
22:55:13.618 TRACE async_io::driver > block_on()
22:55:13.618 TRACE async_io::driver > block_on: sleep until notification
22:55:13.618 DEBUG codec > serialized+compress len 332 vs 2175
22:55:13.618 TRACE polling::epoll > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false }
22:55:13.618 DEBUG wezterm_term::terminalstate::performer > perform Print('x')
22:55:13.618 TRACE wezterm_term::terminalstate::performer > print x=24 y=6 print_width=1 width=80 cell=x CellAttributes { attributes: 0, intensity: Normal, underline: None, blink: None, italic: false, reverse: false, strikethrough: false, invisible: false, wrapped: false, overline: false, semantic_type: Output, foreground: Default, background: Default, fat: None }
22:55:13.618 TRACE polling::epoll > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false }
22:55:13.618 TRACE async_io::driver > block_on: completed
22:55:13.619 DEBUG codec > serialized+compress len 315 vs 2181
22:55:13.619 TRACE polling::epoll > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false }
22:55:13.869 TRACE polling::epoll > new events: epoll_fd=5, res=1
22:55:13.869 TRACE polling::epoll > modify: epoll_fd=5, fd=6, ev=Event { key: 18446744073709551615, readable: true, writable: false }
22:55:13.869 TRACE async_io::reactor > react: 1 ready wakers
22:55:13.869 TRACE async_io::driver > main_loop: sleeping for 50 us
22:55:13.869 TRACE async_io::driver > main_loop: notified
22:55:13.869 TRACE polling::epoll > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false }
22:55:13.869 TRACE async_io::driver > main_loop: waiting on I/O
22:55:13.869 TRACE async_io::reactor > process_timers: 0 ready wakers
22:55:13.869 TRACE polling > Poller::wait(
, None)
22:55:13.869 TRACE polling::epoll > wait: epoll_fd=5, timeout=None
22:55:13.869 TRACE polling::epoll > modify: epoll_fd=5, fd=7, ev=Event { key: 18446744073709551615, readable: true, writable: false }
22:55:13.869 TRACE polling::epoll > new events: epoll_fd=5, res=1
22:55:13.869 TRACE polling::epoll > modify: epoll_fd=5, fd=6, ev=Event { key: 18446744073709551615, readable: true, writable: false }
22:55:13.869 TRACE async_io::reactor > react: 1 ready wakers
22:55:13.869 TRACE async_io::driver > main_loop: sleeping for 50 us
22:55:13.869 TRACE async_io::reactor > readable: fd=18
22:55:13.869 TRACE polling::epoll > modify: epoll_fd=5, fd=18, ev=Event { key: 1, readable: true, writable: false }
22:55:13.870 TRACE async_io::driver > main_loop: waiting on I/O
22:55:13.870 TRACE wezterm_term::terminalstate::keyboard > key_down: sending "i", Char('i') NONE

From the above log, you can see between the typing "x" and "i", there is a big pause happen from 22:55:13.619 to 22:55:13.869 This happened when I just type "exit" in terminal

the fd list

lrwx------ 1 joe joe 64 Dec 7 23:09 0 -> /dev/pts/2
l-wx------ 1 joe joe 64 Dec 7 23:09 1 -> /tmp/log2.txt
lrwx------ 1 joe joe 64 Dec 7 23:09 10 -> 'socket:[419121606]'
lrwx------ 1 joe joe 64 Dec 7 23:09 11 -> 'anon_inode:[eventfd]'
lrwx------ 1 joe joe 64 Dec 7 23:09 13 -> /dev/ptmx
lrwx------ 1 joe joe 64 Dec 7 23:09 14 -> 'socket:[419116866]'
lrwx------ 1 joe joe 64 Dec 7 23:09 15 -> /dev/ptmx
lrwx------ 1 joe joe 64 Dec 7 23:09 16 -> /dev/ptmx
lrwx------ 1 joe joe 64 Dec 7 23:09 17 -> 'socket:[419116867]'
l-wx------ 1 joe joe 64 Dec 7 23:09 2 -> /tmp/log2.txt
lr-x------ 1 joe joe 64 Dec 7 23:09 3 -> /etc/passwd
l-wx------ 1 joe joe 64 Dec 7 23:09 4 -> /run/user/1000/wezterm/wezterm-mux-server-log-21307.txt
lrwx------ 1 joe joe 64 Dec 7 23:09 5 -> 'anon_inode:[eventpoll]'
lrwx------ 1 joe joe 64 Dec 7 23:09 6 -> 'anon_inode:[eventfd]'
lrwx------ 1 joe joe 64 Dec 7 23:09 7 -> 'anon_inode:[timerfd]'
lr-x------ 1 joe joe 64 Dec 7 23:09 8 -> anon_inode:inotify
lrwx------ 1 joe joe 64 Dec 7 23:09 9 -> 'anon_inode:[eventpoll]'

@joexue
Copy link
Author

joexue commented Dec 7, 2022

Hi @wez, I can provide more specific log if you need. The functionality is really good and useful, I hate I cannot use RUST to program. But I can make some change and build to test, if you have any idea.

@cunha
Copy link
Contributor

cunha commented Dec 11, 2022

I have just met what looks like the same problem when using a unix socket to connect to a remote host. The slowdown I am experiencing seems related to redrawing operations. Some observations:

  • If I keep the terminal at 80x25, then it performs reasonably well
  • If I maximize the screen (318x75) then
    • vim is extremely slow (0.5s delay when pressing j)
    • nano still performs reasonably well

One difference between the two editors is that vim updates the cursor position at the bottom right of the terminal, while nano does nothing other than modify the cursor position.

This is with local_echo_threshold_ms = 50000, and the latency between me and the server is around 40ms.

Edit: Just tested with an SSH domain and observed the same behavior. Without multiplexing (i.e., just SSH'ing into the server) does not lead to any slowdown even for maximized terminals.

@sirfz
Copy link

sirfz commented Jan 4, 2023

I experienced the same with both direct ssh and multiplexing. Using ssh from wsl works normally as already reported.

@lrvdijk
Copy link

lrvdijk commented Feb 23, 2023

I can also confirm this, and agreeing with cunha that vim is especially slow when multiplexing (using wezterm connect to a GCP VM). The lag is still noticeable with wezterm ssh but not as bad.

Smaller windows help (e.g. using several horizontal splits).

If there's anything I can do to help profile the issue, let me know.

wez added a commit that referenced this issue Apr 2, 2023
At 80x24 (< 2k cells), sending all 24 lines of the viewport as an
optimistic prefetch is "OK", but for full screen 4k (80*250 = 20k cells)
the volume is oversized and pushes latency up.

This commit dials that back down to the minimal useful data; the
cursor row + any changed rows in the viewport.  When navigating
in vim that reduces things down to 2 rows of prefetch per movement,
assuming that the status line is being updated to show the cursor
position.

This feels a little more snappy for me.

refs: #2503
refs: #1872
@cd-a
Copy link

cd-a commented Apr 3, 2023

Since d36ad7c was merged I wanted to check again on latest master.

I can't connect to the mux server anymore though. @wez did you change something there since last release that I'm not aware of?

The standard config I used before does not connect to a local unix socket anymore, but works on latest release:

config.unix_domains = {
  {
    name = 'unix',
  },
}

config.default_gui_startup_args = { 'connect', 'unix' }

@wez
Copy link
Owner

wez commented Apr 3, 2023

@cd-a share the error message(s) that you see?

@cd-a
Copy link

cd-a commented Apr 3, 2023

@wez I'd love to, but there are none :)

When I run this config on the release, and there is no socket open yet, on first terminal it briefly shows no socket yet, and creates it, then every terminal afterwards connects to it. So when I close, reopen, it loads the layout again.

On latest master, nothing. No message about missing socket, no attaching to socket. Just behaves as if there is no mux set up.

I tried killing the existing socket from the release version, that maybe master would do it differently, but no.

Any way I can log this verbosely?

@wez
Copy link
Owner

wez commented Apr 3, 2023

@cd-a please file an issue with complete repro steps and config

@cd-a
Copy link

cd-a commented Apr 3, 2023

@wez sure! Here it is: #3438

@cd-a
Copy link

cd-a commented Apr 3, 2023

So after wez helped me figure out how to properly start the mux server on the master build, here are my findings using:

  • 4k, full screen wezterm
  • neovim latest

It's blazingly fast now, I can hardly notice a difference to non-mux performance now. Maybe a tiny fraction, but only noticeable if I compare them directly against each other... or maybe I'm even hallucinating that one. It just works great!

So I will now switch to the mux version, loving it. Great work wez!

@joshuarubin
Copy link

Can you specify any config and minimum wezterm build dates that are required to get fast mux on a 4k full screen? I'm running 20230326-111934-3666303c on my server and 20230330-214948-d1c2257b on my client and those may not be new enough.

@wez
Copy link
Owner

wez commented Apr 3, 2023

@joshuarubin d36ad7c or later

@joshuarubin
Copy link

@joshuarubin d36ad7c or later

thanks for the info. is that for both the client and server, or is it ok for one side to be older?

@wez
Copy link
Owner

wez commented Apr 3, 2023

While that change in theory only changed some logic on the server, there were some other multiplexer changes that bumped the CODEC_VERSION over the weekend, and the client will give up talking to a server with a mismatching CODEC VERSION. So you'll want to upgrade both the client and the server.

@wez
Copy link
Owner

wez commented Apr 4, 2023

This should be resolved now in main.

It typically takes about an hour before commits are available as nightly builds for all platforms. Linux builds are the fastest to build and are often available within about 20 minutes. Windows and macOS builds take a bit longer.

Please take a few moments to try out the fix and let me know how that works out. You can find the nightly downloads for your system in the wezterm installation docs.

If you prefer to use packages provided by your distribution or package manager of choice and don't want to replace that with a nightly download, keep in mind that you can download portable packages (eg: a .dmg file on macOS, a .zip file on Windows and an .AppImage file on Linux) that can be run without permanently installing or replacing an existing package, and can then simply be deleted once you no longer need them.

If you are eager and can build from source then you may be able to try this out more quickly.

@wez wez added the fixed-in-nightly This is (or is assumed to be) fixed in the nightly builds. label Apr 4, 2023
wez added a commit that referenced this issue Apr 4, 2023
@cd-a
Copy link

cd-a commented Apr 4, 2023

@joshuarubin for the config, please check #3438 where it's linked with the description.

@joshuarubin
Copy link

Thanks for your help. I think I've got it working.

New issue, I can make separately, but in case you are already aware, on main when I'm connected to an ssh domain and do win:perform_action(wezterm.action.SplitHorizontal({ domain = "DefaultDomain" }), pane) I get this:

22:13:19.082 WARN mux::domain > Directory "/home/<USERNAME>" is not readable and will not be used for the command we are spawning: No such file or directory (os error 2)

This is on MacOS and my home dir is in /Users, so I'm not sure why that's happening.

@wez
Copy link
Owner

wez commented Apr 4, 2023

@joshuarubin please file a separate issue so that this one can remain focused on the performance aspect

@calops
Copy link
Contributor

calops commented Apr 4, 2023

I've tried the latest commit and the sluggish behavior hasn't disappeared for me. I don't know if it's faster than before, but it's still significantly slower than a normal SSH session. Scrolling in neovim, for example, is very slow.

@cunha
Copy link
Contributor

cunha commented Apr 5, 2023

My tests indicate UNIX domains are no longer impacted by the severe slowdown I observed before. Vim on gigantic terminal window now updates as fast as expected for a remote (SSH) session. 💯

I used

[zeus:~/git/wezterm main]% wezterm --version
wezterm 20230405-073038-96f1585b

and the following UNIX domain config:

  {
    name = "zeus",
    socket_path = "Users/cunha/.local/share/wezterm/zeus.sock",
    proxy_command = { "ssh", "-T", "-A", "zeus", "/usr/bin/wezterm", "cli", "proxy" },
    local_echo_threshold_ms = 50000,
  }

@calops
Copy link
Contributor

calops commented Apr 5, 2023

I've tried with an identical unix_domains configuration and wezterm version 20230405-084502-37e8a504 and it behaves the exact same way as the normal ssh_domains solution: very sluggishly. Big vim windows are unusable.

I don't understand what I'm doing differently.

@sirfz
Copy link

sirfz commented Apr 5, 2023

I can confirm the experience is quite smooth with the nightly build. My setup:

OS: Windows 10
domain:

unix_domains = {
     {
       name = 'wsl-mux',
       serve_command = { 'wsl', 'wezterm-mux-server', '--daemonize' },
     },
},

WSL: Ubuntu 20.04 (with nightly wezterm-mux-server)

I logon to WSL and just ssh and work on remote servers all day, working great!

@lrvdijk
Copy link

lrvdijk commented Apr 7, 2023

For me the performance has increased a lot too, so much that I can now use it as a daily driver! 🎉

Still feel it's a little bit slower than iTerm2 + ssh + Tmux, but it's good enough for me.

@wez
Copy link
Owner

wez commented Apr 7, 2023

@calops if you have vim plugins that cause every line to change when moving the cursor up or down (eg: displaying relative line numbers at the start of each line), then wezterm will still need to send a lot of data on each cursor movement. Perhaps there is something like that in your config that is triggering the worst case performance?

@calops
Copy link
Contributor

calops commented Apr 7, 2023

@wez Yes, I do have something like this, but that's irrelevant for a simple use-case like "scrolling through a file" where lines are gonna get redrawn no matter what. And even simple stuff such as selecting text (which shouldn't redraw much) has a noticeable latency, which is made very obvious when doing it with the mouse.

The real problem here is that a plain ssh session still handles all this and performs perfectly fine compared to a wezterm mux session. It's faster by at least one or two orders of magnitude. By that I mean that the SSH session feels (almost) as fast as a local session.

So it seems obvious to me that wezterm adds a significant overhead to this. Is this a technical limitation for this solution, or is there hope that it could be improved somehow?

@wez
Copy link
Owner

wez commented Apr 7, 2023

Yes, there is more overhead because there obviously is more going on.
I think we all hope for improvement here.

One of my hopes is that enough users are willing to lend support that I can afford to
spend more time on issues like this one.
https://wezfurlong.org/wezterm/sponsor.html has some more information about this.

@supernomad
Copy link

@wez I am still seeing this performance bottle neck as well with MacOS <> Linux and the latest nightly. I am more than happy to help dive into the code, but would need some assistance as to where to start. If you can give me some guidance as to where to start, I can take a whack at what is going on as I have a simple reproduction setup and can dive in over the next few weeks as I should have some spare time. Let me know if that makes sense to you!

@wez
Copy link
Owner

wez commented Apr 7, 2023

@supernomad thanks for the offer, but I think it will likely be more time consuming to bootstrap someone than for me to sit down and do the work. At a high level the issue is this:

  • The mux sync protocol is line-based. So if a line is mutated, it gets flagged as dirty and that line will then either be unilaterally pushed to the client, or listed in a dirty lines set for the client to pull when it needs to render it
  • Some terminal update operations are encoded in a few bytes of escape sequence, but can touch every line in the screen, amplifying the data that needs to be sent over the wire and driving up the latency

So the change granularity is O(changed-lines) and results in transmitting data that is O(changed-lines x terminal_width).

The end state I have in mind is:

  • At the mux layer, tap into the terminal output byte stream and journal it
  • Build up a client + cursor concept that enables that journal when a client subscribes to it. It needs to be zero cost if there are no subscribers, and the journal needs to be pruned as soon as all clients (there can be multiple!) have read past it
  • When subscribed, if the nature of the update being pushed by the server can be described by an incremental read from that journal, just send those bytes, otherwise send the line data.
  • The client would need to be able to apply both the delta line updates and the byte stream updates, which means building a local wezterm_term::Terminal and applying both sets of changes to a local copy of the terminal emulation
  • Rendering would need to be re-structured to run off of that model

That would bring the data to be processed down closer to O(output-bytes) in the best case.

These are pretty significant changes across several layers; it's not a small task.

@cunha
Copy link
Contributor

cunha commented Apr 7, 2023

I did some additional tests on a 136x566 window and indeed could see some slowdown with relativenumber set in Vim (although I'd say the terminal was still useable). On a more comfortable 77x318 window I could not notice any slowdown.

This is over the Internet when connected to a server that is 30ms away; one's mileage will likely vary depending on connection quality.

@calops
Copy link
Contributor

calops commented Apr 12, 2023

Yes, there is more overhead because there obviously is more going on.

That part wasn't obvious to me, actually. What specifically is done in addition to what an SSH/tmux stack is doing?

@wez
Copy link
Owner

wez commented Apr 12, 2023

@calops you can read the code if you're really interested. It's too much to explain here for a casual comment; it will take time I don't have and it will have only cost and no benefit to me to do so to someone that isn't committed to spending some of their time also working on this, and I don't believe that you are that person.

@wez wez closed this as completed Jul 12, 2023
@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working fixed-in-nightly This is (or is assumed to be) fixed in the nightly builds. multiplexer
Projects
None yet
Development

No branches or pull requests

10 participants