Crash upon RDP disconnect reconnection perhaps due to screen resize #265
Comments
Attempt to recover in the case where we cannot re-allocate our vertex buffers; try to restore the prior dimensions and resize the OS window to match. refs: #265
Thanks for the detailed information! RDP does something "funny" with GPU availability; back in #40 I added some logic to fall back to the software renderer when the app is opened in an RDP session. Based on the output you shared, I think the more recent changes to WGL/EGL initialization may now have made it possible to get a working OpenGL context in that environment.
The panic message sounds like the RDP OpenGL context may not have enough VRAM to support the screen size at disconnect; I'm assuming that the disconnect results in an excessively large screen size (perhaps due to it being tiled/maximized?) that isn't supportable in that environment. However, if the app is started with access to the true display, it has access to the full VRAM available, and that might explain why the app survives in that case.
I've pushed 3484478 that attempts to recover from a failed attempt to allocate vertex buffers. It should log the old and new dimensions to help us understand a bit better what is going on. I'm not sure if the recovery attempt will prevent a later panic, but I'm hoping that we can at least learn a bit more, and if we're lucky, it will keep things running until you next attach to the session and trigger a resize where it will succeed.
I'd appreciate it if you could try out the windows build associated with that commit once the CI has finished; it should be downloadable from here https://github.com/wez/wezterm/actions/runs/252069384 within ~30 minutes of this comment being posted.
If it is still crashy and getting in your way, a workaround would be to set this in your configuration:
return {
  front_end = "Software",
}
That will set the renderer to the Software renderer unconditionally; it will have degraded performance and different visual characteristics that you'll remember from #235, but should at least keep running. |
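The recovery idea described above (try the new size, and if the buffers cannot be allocated, log both sizes and go back to the old one) could look roughly like the following. This is only a sketch of the control flow, not the code in 3484478: `Dimensions`, `ResizeOutcome`, `resize_with_recovery` and the `allocate` callback are hypothetical names standing in for the real vertex buffer allocation.

```rust
/// Window dimensions in pixels (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Dimensions {
    pub width: usize,
    pub height: usize,
}

/// What happened when a resize was requested.
#[derive(Debug, PartialEq)]
pub enum ResizeOutcome {
    /// Buffers were re-allocated at the requested size.
    Resized(Dimensions),
    /// Allocation failed; buffers were restored at the prior size and the
    /// caller should resize the OS window to match.
    Restored(Dimensions),
}

/// Try to allocate buffers for `requested`. On failure, log the old and new
/// dimensions and fall back to `prior`. `allocate` is an assumed stand-in
/// for the real vertex buffer allocation.
pub fn resize_with_recovery<F>(
    prior: Dimensions,
    requested: Dimensions,
    mut allocate: F,
) -> Result<ResizeOutcome, String>
where
    F: FnMut(Dimensions) -> Result<(), String>,
{
    match allocate(requested) {
        Ok(()) => Ok(ResizeOutcome::Resized(requested)),
        Err(err) => {
            eprintln!(
                "failed to allocate buffers for {:?} (prior size {:?}): {}",
                requested, prior, err
            );
            // If even the prior size cannot be restored, give up with an error.
            allocate(prior)?;
            Ok(ResizeOutcome::Restored(prior))
        }
    }
}
```

A caller that receives `Restored` would then ask the OS to resize the window back to the returned dimensions, so that the surface and the window stay in agreement.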
I usually place my working wezterm window right tiled. I can run 2 independent wezterm processes as two separate windows. I may need to update the original issue description to include the regular-resizable-window mode as also susceptible to this bug; I did not notice it earlier as it was frozen for a while, and it maybe needs some keyboard typing attempts in the shell before it crashes. I first noticed this when I attempted the below: I resized the normal window to as large as a right tile (but not really right tiled) and the crash happened.
The log below is with the small regular-sized window as soon as it pops up, without even moving or re-placing the window. It remained on screen for a while, and I could even alt-tab to it and to the Firefox browser in which I type this. I could even move the frozen window around. When it eventually did disappear, it didn't log much; it crashed silently, with no backtrace.
Did many reattempts with various resizing/placements. Attempted to resize the frozen window; error handling caught some intermediate window sizes. The expanded window has its undrawn area extensions painted black.
Conclusion
I fail to see any pattern/correlation between sizes. Even in wezterm 20200909-002054-4c9af461 the regular-resizable window survives a while in a frozen state. Freezing and persistence of the frozen window could be because of error trapping but insufficient error handling and recovery. |
Regarding the failure even with the new build; I suspect that there might not be much that can be done to detect and handle this, as the This thread: https://social.technet.microsoft.com/Forums/windowsserver/en-US/c8295ef8-3711-4576-9293-2c4965280165/opengl-and-remote-desktop Another article I found was this one that discusses a Group Policy setting that affects GPU availability: |
I agree, it may be beyond an opengl client to push through an opengl context in a state of tear-down. As the RDP use case is very common,
The tscon idea works: essentially, it disconnects the RDP session in a script, which soon after runs the command outside of the RDP session. So it definitely is a handy way to try to run the app outside of the RDP session when one is unable to start the app locally before attempting the RDP session. qwinsta is a tool to query windows session terminal applications.
In one trial, I ran a script that sleeps 5 seconds and then runs wezterm. I disconnected RDP soon after starting the script, waited unconnected long enough for the script to have started wezterm, and then reconnected. It turns out that if done that way, wezterm starts but uses the software renderer, as though it doesn't have any GL hardware.
I tried the Group Policy setting "Use the hardware default graphics adapter for all Remote Desktop Services sessions" to see if that allows wezterm to be stable on RDP disconnect. I applied that setting and rebooted. After rebooting, the connection attempt for a fresh RDP session failed with the following error.
... reading through the links .. will update here as I need to note something
The fact that a process started within an RDP session has different resources and privileges is very unsettling. It now makes me wonder whether an app started that way is stunted in some way. Take, for example, Firefox or any browser that uses OpenGL: if it is started in RDP, will it crash, fail or run underpowered when it is later transitioned to a direct display? It seems like some limitation in the design of RDP.
The Windows-10 machine is not a simultaneous multi-user graphics-capable system. If a local user switches to a different login, the entire display is switched to that user. If an RDP user connects, the local access is locked with a login screen. If any user attempts a local login, the RDP session will be disconnected. The RDP user is not given full hardware graphics capabilities in the RDP session. I wonder how this is different from two users who are locally logged in using the switch-user feature and both run their own OpenGL apps.
Can't wezterm discover the vGPU-state-changes/RDP-disconnect, sleep/wait till the new vGPU state settles, and rejuvenate the front-end?
Assuming the long term goal of wezterm was to be a new kind of tmux, with opengl accelerated front-end clients,
Is it possible to do this yet ? |
Now that I have been using wezterm as default terminal windows due to split support, this issue seems to be a major concern primarily due to work from home and I have to always use RDP to access work machine. I tried alacritty and seems to have the same crash. Windows Terminal doesn't have this issue. Does this mean using DirectWrite is better on Windows compared to opengl? GVim on windows also uses DirectWrite and doesn't have crashes. |
Yes, this is described in the docs here: |
I picked OpenGL because it promises to enable GPU acceleration for multiple platforms with basically the same API for all of them, making it easier to support. I've been thinking a bit about possible solutions to this, including targeting DirectWrite/2D/3D. My line of thinking is this:
Which leads me to conclude that a decent sounding state would be:
There doesn't appear to be a ready-to-download binary distribution of Mesa on Windows, so some amount of effort will need to go into that; I'm not sure how long it takes to build and whether it is feasible to check it out and build it as part of the CI, or whether I'll need to build and upload those binaries somewhere. As I mentioned: I'm not often in dev mode at the same time as being on Windows, so investigating this aspect has a really high activation cost. |
If we've failed to initialize EGL, try setting `LIBGL_ALWAYS_SOFTWARE=true` in the environment and make another pass at initialization in the hope that it brings up something usable. This commit only impacts linux systems at the time of writing. I've made the line that logs the GL implementation information have `error` level again, because it is more convenient for me even if it isn't technically an error. refs: #272 (but isn't the true fix; this is just trying to make the consequences of that problem less. I would like to get that fixed correctly) refs: #265 (comment) (which discusses what I think the end state should be)
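The retry described in this commit message can be sketched as follows. This is a minimal illustration under assumptions, not the actual commit: `init_with_software_fallback` and the `init` callback are hypothetical names, and the real EGL initialization is replaced by whatever closure the caller passes in. Only the `LIBGL_ALWAYS_SOFTWARE=true` setting comes from the text above.

```rust
use std::env;

/// Run `init` once; if it fails, set `LIBGL_ALWAYS_SOFTWARE=true` in the
/// process environment and make one more pass. `init` is an assumed
/// stand-in for the real EGL initialization routine.
// `env::set_var` is an `unsafe fn` in the Rust 2024 edition and a safe one
// in earlier editions; the allow keeps this sketch warning-free in both.
#[allow(unused_unsafe)]
pub fn init_with_software_fallback<T, E, F>(mut init: F) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    match init() {
        Ok(ctx) => Ok(ctx),
        Err(_first_error) => {
            // Ask Mesa for its software rasterizer, then try again. This
            // should happen before other threads read the environment.
            unsafe { env::set_var("LIBGL_ALWAYS_SOFTWARE", "true") };
            init()
        }
    }
}
```

The error from the first attempt is dropped here for brevity; a fuller version would probably log it, since it is the more interesting of the two failures.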
http://mesa.fdossena.com/ has a pre-built opengl32.dll: There's some relevant portions of code here: |
... saw your above comment, submitted 7 min before I submitted this. I am loath to go down the WSL route: WSL2 requires Hyper-V, takes a lot of space, and makes it tricky to get a choice of distribution, but I will take a look at it. I installed a fuller Fedora with GUI in the vbox VM and noticed that running wezterm in Linux defaults to the scrawny-looking inbuilt software renderer, as the VM does not have a true 3D hardware display card. In Linux, it is possible to choose the renderer via environment variables.
Similarly, in Windows, dropping in an opengl32.dll is one way a binary is made to use a custom OpenGL renderer, e.g. fdossena. Bundling/depending on Mesa does make the wezterm installer bigger. |
This is a bit of a switch-up, see this comment for more background: refs: #265 (comment)
This commit:
* Adds a pre-compiled mesa3d opengl32.dll replacement
* The mesa dll is deployed to `<appdir>/mesa/opengl32.dll` which by default is ignored.
* When the frontend is set to `Software` then the `mesa` directory is added to the dll search path, causing the llvmpipe renderer to be enabled.
* The old software renderer implementation is available using the `OldSoftware` frontend name
I'm not a huge fan of the subdirectory for the opengl32.dll, but I couldn't get it to work under a different dll name; the code thought that everything was initialized, but the window just rendered a white rectangle.
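To make the directory layout concrete, here is a small sketch of how the extra DLL search directory could be derived from the executable path and the configured frontend. The `FrontEnd` enum and `mesa_search_dir` function are hypothetical names for illustration; the `Software` and `OldSoftware` names and the `<appdir>/mesa` layout are taken from the commit message, and the `OpenGL` variant is just an assumed stand-in for the hardware path. Actually registering the directory with the Windows loader is left out.

```rust
use std::path::{Path, PathBuf};

/// Frontend selection (hypothetical enum for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum FrontEnd {
    OpenGL,
    Software,
    OldSoftware,
}

/// Return the directory that should be added to the DLL search path, if any.
/// Only the `Software` frontend opts in to the bundled mesa `opengl32.dll`;
/// for every other frontend the `mesa` directory stays ignored.
pub fn mesa_search_dir(exe_path: &Path, front_end: FrontEnd) -> Option<PathBuf> {
    match front_end {
        FrontEnd::Software => exe_path.parent().map(|app_dir| app_dir.join("mesa")),
        FrontEnd::OpenGL | FrontEnd::OldSoftware => None,
    }
}
```

Keeping the DLL in a subdirectory that is only searched on request means that the system `opengl32.dll` is still what gets loaded in the default configuration.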
Having multiple backend support is definitely not good and incurs huge cost. Not sure what exactly Neovide does, but this release seems to work for RDP: https://github.com/Kethku/neovide/releases/tag/0.3.0 . The latest one crashes. It uses Vulkan, but that doesn't work for everyone I guess. Another option would be to have Vulkan and a pure software renderer only. I have absolutely no idea about graphics systems, but would libraries such as SDL2 or gfx-rs help?
If there is a hello world app more than happy to build my self and try it out if it works in RDP or not. |
SDL2 was one of the first I tried; it has a pretty rough user experience on Linux, making the whole screen flicker each time a window is opened. All of the other GUI related libraries in the Rust ecosystem are based on winit which traditionally had huge problems even with opening a second window and which also have relatively poor performance on Linux. I think OpenGL is probably the best abstraction we have available and I think the llvmpipe software renderer is fine as a fallback, provided that we can nail automatically activating it.
https://docs.microsoft.com/en-us/windows/win32/termserv/detecting-the-terminal-services-environment has details on how to programmatically detect an RDP session; there's quite a lot of code involved and I didn't get around to porting it to Rust. Rather than looking at another gfx backend, if you feel like writing a bit of Rust code, I'd appreciate it if someone were to write a Rust implementation of the https://github.com/wez/wezterm/blob/master/src/frontend/gui/mod.rs#L57
#[cfg(windows)]
{
if is_running_in_rdp_session() {
// Using OpenGL in RDP is not safe/reliable on disconnect,
// so fallback to software rendering
::window::prefer_swrast();
}
} |
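As a starting point for anyone picking this up, the simplest form of the check can be sketched with a direct declaration of the Win32 `GetSystemMetrics` call and the `SM_REMOTESESSION` index. This is an assumption-laden sketch, not the implementation that ended up in wezterm: it covers only the basic metric, omits the additional checks that the linked Microsoft page describes, and the non-Windows stub that always returns `false` is purely for illustration.

```rust
// Declare the one user32 function needed, to avoid pulling in a bindings crate.
#[cfg(windows)]
#[link(name = "user32")]
extern "system" {
    fn GetSystemMetrics(n_index: i32) -> i32;
}

/// Metric index that is nonzero when the calling process is associated
/// with a remote (Terminal Services / RDP) session.
#[cfg(windows)]
const SM_REMOTESESSION: i32 = 0x1000;

/// Basic RDP detection: ask Windows whether this is a remote session.
#[cfg(windows)]
pub fn is_running_in_rdp_session() -> bool {
    // GetSystemMetrics has no preconditions, so the call itself is sound.
    unsafe { GetSystemMetrics(SM_REMOTESESSION) != 0 }
}

/// Stub for other platforms (an assumption of this sketch): never in RDP.
#[cfg(not(windows))]
pub fn is_running_in_rdp_session() -> bool {
    false
}
```

Note that this reflects the session the process currently belongs to; whether the answer changes across a disconnect/reconnect, as discussed earlier in this thread, would need to be tested.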
I just pushed 9e46ac8 to force software mode in an RDP session. |
tested in rdp. it works. i can now remove software render from wezterm.lua |
Great, let's call this done! |
This is similar in spirit to the work in 4d71a79 but for Windows. This commit adds ANGLE binaries built from https://chromium.googlesource.com/angle/angle/+/07ea804e620132517b6af0ef92fe85ea737d0c27 to the repo. The build and packaging will copy those into the same directory as wezterm.exe so that they can be resolved at runtime.
By default, `prefer_egl = true`, which will cause the window crate to first try to load an EGL implementation. If that fails, or if `prefer_egl = false`, then the window crate will perform the usual WGL initialization.
The practical effect of this change is that Direct3D11 is used for the underlying render, which avoids problematic OpenGL drivers and means that the process can survive graphics drivers being updated. It may also increase the chances that the GPU will really be used in an RDP session rather than the pessimised use of the software renderer.
The one downside that I've noticed is that the resize behavior feels a little janky in comparison to WGL (frames can render with mismatched surface/window sizes which makes the window contents feel like they're zooming/rippling slightly as the window is live resized). I think this is specific to the ANGLE D3D implementation as EGL on other platforms feels more solid.
I'm a little on the fence about making this the default; I think it makes sense to prefer something that won't quit unexpectedly while a software update is in progress, so that's a strong plus in favor of EGL as the default, but I'm not sure how much the resize wobble is going to set people off.
If you prefer WGL and are fine with the risk of a driver update killing wezterm, then you can set this in your config:
```lua
return {
  prefer_egl = false,
}
```
refs: #265 closes: #156
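The initialization order described in this commit message (EGL first when `prefer_egl` is true, otherwise or on failure the usual WGL path) can be sketched like this. The names `GlBackend`, `init_gl`, `try_egl` and `try_wgl` are hypothetical and the two callbacks stand in for the real initialization routines in the window crate; only the `prefer_egl` semantics are taken from the text above.

```rust
/// Which initialization path produced the GL context (hypothetical enum).
#[derive(Debug, PartialEq)]
pub enum GlBackend {
    Egl,
    Wgl,
}

/// Try EGL first when `prefer_egl` is set; if that fails, or if EGL is not
/// preferred, fall back to WGL. Both callbacks are assumed stand-ins for
/// the real initialization code.
pub fn init_gl<E, W>(prefer_egl: bool, try_egl: E, try_wgl: W) -> Result<GlBackend, String>
where
    E: FnOnce() -> Result<(), String>,
    W: FnOnce() -> Result<(), String>,
{
    if prefer_egl {
        match try_egl() {
            Ok(()) => return Ok(GlBackend::Egl),
            Err(err) => eprintln!("EGL initialization failed, trying WGL: {}", err),
        }
    }
    try_wgl().map(|()| GlBackend::Wgl)
}
```

With this shape, a missing or broken ANGLE install degrades to the previous WGL behavior rather than preventing startup.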
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
priority: important, affects usability
Describe the bug
Wezterm is started in an RDP session and put into a left-tiled/right-tiled or maximized state, then the RDP session is disconnected. When the RDP session is subsequently reconnected, wezterm will have crashed.
edit: The regular resizable window is also affected, but it will be in a frozen state, movable, alt-tab-able, until something triggers it to crash.
This is perhaps caused by the 'screen resize adjustments'/'allocation of fresh screen' as the RDP session ends between the remote and local machines.
It happens when wezterm is started inside the RDP session, and not when wezterm is already started before the RDP session. Ergo, a workaround exists and is described later.
Use case: Using wezterm in an RDP session may be a common use case
Environment (please complete the following information):
OS
Machine1: RDP client machine: Windows-10 version 2004 (OS Build 19041.508)
Machine2: RDP remote (server) machine: Windows-10 version 2004 (OS Build 19041.508)
RDP client version, C:\windows\system32\mstsc.exe
Windows RDP client: 10.0.19041.423 (builtin) 8/12/2020
Wezterm Version: output of
wezterm -V
wezterm 20200909-002054-4c9af461
The RDP-remotemachine Machine2 has a resolution of 2160x1440
The RDP local machine Machine1 has a resolution of 1920x1080
No desktop dpi-scaling (desktop font scaling) in either machine (this is set in Settings/Display)
Note that in my case, the wezterm started inside RDP discovers a larger screen-size upon disconnect.
To Reproduce
Steps to reproduce the behavior (1)
Steps to reproduce the behavior (2)
2020-09-12T19:14:30.770Z ERROR window::os::windows::wgl > failed to created extended OpenGL context (CreateContextAttribsARB failed), fall back to basic
Configuration
Nothing unusual
Expected behavior
Wezterm should handle RDP screen-size changes and more importantly never crash
Screenshots
NA
Session Recording
NA, can't do wt-record in windows
If the issue is with the way that escape sequences are processed it can be helpful to capture the terminal output using the `wt-record` script to run `wezterm` and record a transcript. This requires the `script` utility to be installed on your system (this is part of macOS and available in the `util-linux` package on linux systems).
In the example below a file named `20180225161026.tgz` is produced. Please attach that file to this issue, or if it contains private or sensitive information that you don't want the public to see on GitHub, please find some other way to get that file to a project contributor (perhaps Dropbox or email?).
You can use `wt-replay 20180225161026.tgz` to replay that file. `wt-record` can only record the terminal output; it cannot record the input events going in to the terminal, so if you are having an issue with input, please be sure to describe it below!
RUST_LOG
with set RUST_LOG=info
with set RUST_BACKTRACE=1
with set RUST_BACKTRACE=full (seems identical to previous)
When pre-starting wezterm, as mentioned in the workaround, that is running wezterm locally in direct login on Machine 2, tiling wezterm, and then from Machine1 trying RDP connect, disconnect, reconnect, no error log messages appear beyond the initial opengl init, and wezterm silently survives the reconnect.
I ran the wezterm exe, timing it with date commands. After disconnecting, I paused for 2 minutes in order to determine whether the crash happens at disconnect or at reconnect/local-login. After reconnecting, I immediately ran the 3rd date command by hand. The log below confirms that the crash happens at disconnect.
WGL_INFO
running wglinfo64.exe inside RDP session on Machine2
wglinfo_remote_in_RDP.txt
running wglinfo64.exe on direct login on Machine2. Seems like no difference
wglinfo_machine2_directlogin.txt
wglinfo on machine 1 same as in bug #235
Possibly, depending on how RDP works, opengl on machine-1 may not matter to opengl process on machine-2.
Additional context
Add any other context about the problem here.
Misc
One wonders how wezterm might behave under other non-microsoft RDP clients.
On Linux, Linux RDP clients, Remmina and Vinagre, allow more control over the RDP session inside the client-window. For example, the remote machine's RDP-session display-size is preconfigurable.
Workaround
Rather than starting wezterm inside the RDP session, log in locally on Machine2 and pre-start wezterm before remote-RDP-ing into it. Started that way, wezterm is resilient to the RDP disconnection.