Alacritty window randomly disappears on sway/wayland #2978

Closed
divoxx opened this issue Nov 14, 2019 · 29 comments

@divoxx

divoxx commented Nov 14, 2019

When using Alacritty under Sway/Wayland with the nouveau driver, the window randomly disappears from sway, but the process does not get killed and continues to run in the background.

I initially reported this issue on sway (swaywm/sway#4435) but I want to cross report it here since the issue is likely being triggered by something Alacritty is doing.

Whenever the issue happens, this is what I see in the logs:

2019-11-13 21:21:27 - [EGL] command: eglCreateImageKHR, error: 0x3003, message: "dri2_create_image_khr_texture"
2019-11-13 21:21:27 - [types/wlr_buffer.c:98] Failed to upload texture
2019-11-13 21:21:27 - [types/wlr_surface.c:304] Failed to upload buffer

The issue is hard to reproduce but it happens multiple times a day when using the terminal intensively.

Let me know if there is any other information I can provide to help debug this issue.

@kchibisov
Member

kchibisov commented Nov 14, 2019

You were already asked on the sway bug tracker to upload a WAYLAND_DEBUG=client log, so I'm asking for the same thing here. To collect it, just start alacritty with WAYLAND_DEBUG=client alacritty 2> log.txt (not sway, just alacritty).

P.s.
I've never seen this issue on sway.
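
For anyone following along, the capture step above can be wrapped in a small helper. This is only a sketch: the function name capture_wayland_log and the log file name are made up for illustration. The only parts taken from the comment are the WAYLAND_DEBUG=client variable and the stderr redirect.

```shell
#!/bin/sh
# Sketch only: capture_wayland_log is a hypothetical helper, not part of alacritty or sway.
# Usage: capture_wayland_log <log-file> <command> [args...]
capture_wayland_log() {
    log_file=$1
    shift
    # The protocol trace is written to stderr, so only fd 2 is redirected;
    # the traced program's stdout is left untouched.
    WAYLAND_DEBUG=client "$@" 2> "$log_file"
}

# Example (not run here):
#   capture_wayland_log log.txt alacritty
```

The helper returns the exit status of the traced command, so a crash is still visible to whatever launched it.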

@divoxx
Author

divoxx commented Nov 14, 2019

@kchibisov I'll upload a copy of the log later (I'm on a different computer now).

But just to comment on this note:

I've never seen this issue on sway.

I have 3 different computers running sway + alacritty and it only happens on one, the main relevant difference being i915 driver vs nouveau, but I'm happy to provide additional information on other differences that you think might be affecting this.

Some additional info: I've recently formatted my disk and completely reinstalled everything on that computer, but the problem remains. I've also tried both 0.3.3 from the Gentoo Portage tree and installing it manually with cargo; it happens in both cases.

@kchibisov
Member

So, it's likely a nouveau thing. I'm using only i915 and amdgpu drivers, so can't test it.

@divoxx
Author

divoxx commented Nov 14, 2019

Yeah, I think it's related to nouveau. The part that is tricky is that only Alacritty, at least on my system, causes that issue.

My guess is that it's a combination of factors: something Alacritty is doing creates a specific interaction with the driver that causes the texture upload to fail, which then causes wlroots to close the window. Alacritty itself continues to run in the background (it doesn't crash), and nothing shows up in the Alacritty log, even at debug level.

Anyway, I'll try to reproduce the issue with WAYLAND_DEBUG=client and see what I can find.

@chrisduerr
Member

The part that is tricky is that only Alacritty, at least on my system, causes that issue.

Do you use any other application which uses OpenGL? If your GPU is never used, chances are you won't run into a lot of GPU related bugs.

@divoxx
Author

divoxx commented Nov 15, 2019

I run some software that uses hardware acceleration (stream, discord, firefox, etc.) plus some games without any issue. Also, I believe sway itself uses the GPU for compositing and such.

Yesterday I enabled WAYLAND_DEBUG=client, but I wasn't able to reproduce the issue. I'll keep trying and upload the log as soon as I manage to.

@divoxx
Author

divoxx commented Mar 12, 2020

This has been tricky to reproduce, and it has been happening far less frequently. I thought it was resolved and stopped running Alacritty with WAYLAND_DEBUG=client enabled, but today it happened again.

When I checked the log, I noticed different output that might help point to the root cause:

2020-03-11 17:25:11 - [EGL] command: eglCreateImageKHR, error: 0x3003, message: "dri2_create_image_khr_texture"
[destroyed object]: error 7: importing the supplied dmabufs failed
thread 'main' panicked at 'Wayland connection lost.: Os { code: 71, kind: Other, message: "Protocol error" }', src/libcore/result.rs:999:5
stack backtrace:
[2020-03-11 17:25] [DEBUG] New num_cols is 279 and num_lines is 75
[2020-03-11 17:25] [INFO] Width: 2538, Height: 1382
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>
   5: <unknown>
   6: <unknown>
   7: <unknown>
   8: <unknown>
   9: <unknown>
  10: <unknown>
  11: __libc_start_main
  12: <unknown>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[2020-03-11 17:25] [INFO] Goodbye

The symbols seem to be stripped. Is there a way to compile a release build with symbols untouched?

I'll enable WAYLAND_DEBUG=client once again and see if I can get more information, but hopefully this information is already helpful.

@chrisduerr
Member

The symbols seem to be stripped. Is there a way to compile a release build with symbols untouched?

Yes, see the main Cargo.toml in the root of the repository. Setting debug to true there instead of 1 should do it. Also the binary shouldn't be stripped after that.
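
As an alternative to editing Cargo.toml (a different route from the one described above, not a replacement for it), Cargo can take the same profile setting from an environment variable. The sketch below assumes a Cargo version that supports CARGO_PROFILE_<name>_DEBUG overrides and that file(1) is installed. The helper names and the target/release/alacritty path are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: build_with_debug_info and is_not_stripped are hypothetical helper names.

# Build a release binary with full debug info without touching Cargo.toml.
# Assumes Cargo honours the CARGO_PROFILE_RELEASE_DEBUG override.
build_with_debug_info() {
    CARGO_PROFILE_RELEASE_DEBUG=true cargo build --release
}

# Succeeds if file(1) describes the given ELF binary as "not stripped".
is_not_stripped() {
    file -b "$1" | grep -q 'not stripped'
}

# Example (not run here; the output path is an assumption):
#   build_with_debug_info && is_not_stripped target/release/alacritty
```

If the check fails on a freshly built binary, something after the build (a packaging step, for instance) is probably stripping it.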

@b10rn

b10rn commented Mar 13, 2020

I have the exact same problem, but I am using amdgpu.

@divoxx
Author

divoxx commented Apr 6, 2020

This issue has been happening more rarely but today it happened again. Unfortunately, the change suggested by @chrisduerr did not work.

I've cloned the repository, set debug to true in the top-level Cargo.toml, and compiled alacritty for release with cargo build --release.

$ alacritty --version
alacritty 0.5.0-dev (13eb50d)

But the backtrace still does not include the symbols. Is there anything else I need to do?

@chrisduerr
Member

An executable compiled from source without stripping it definitely should include debug symbols. How big is the binary for you?

@kchibisov
Member

But the backtrace still does not include the symbols. Is there anything else I need to do?

The output of WAYLAND_DEBUG=1 alacritty is the only thing that would be useful here.

@divoxx
Author

divoxx commented Apr 7, 2020

@kchibisov I am trying to reproduce the issue with WAYLAND_DEBUG enabled, but it is really hard to reproduce, and leaving WAYLAND_DEBUG enabled all the time makes the log files grow very large.

Btw, I figured out the symbols issue. My PATH config was not being loaded correctly into the sway session, so sway was launching the system alacritty, while running it from the terminal used the dev version.

I'll try to capture the issue again when it happens.
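
A PATH mismatch like this one can be checked by printing which executable a name resolves to in each environment. The helper below is a minimal sketch; show_binary is a made-up name, and it only relies on the standard command -v builtin.

```shell
#!/bin/sh
# Sketch only: show_binary is a hypothetical helper name.
# Prints which executable a command name resolves to in the current environment.
show_binary() {
    name=$1
    path=$(command -v "$name") || {
        echo "$name: not found in PATH" >&2
        return 1
    }
    printf '%s -> %s\n' "$name" "$path"
}

# Example (not run here): run once from a terminal and once from a sway
# keybinding, then compare the two results.
#   show_binary alacritty
```

If the two environments print different paths, the session is not picking up the same PATH as the interactive shell.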

@divoxx
Author

divoxx commented May 4, 2020

Alright, I have finally been able to reproduce the issue while having a backtrace and wayland debug enabled.

The log is really long since the session was running for a while before it crashed but here are the 1000 most recent lines: https://gist.github.com/divoxx/164cb2f06e654657c158c5fe13a8a3d4
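
Trimming a long capture down to its tail, as done for the gist above, can be sketched with tail. The helper name last_lines and the file names are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: last_lines is a hypothetical helper name.
# Usage: last_lines <count> <source-log> <destination>
last_lines() {
    count=$1
    src=$2
    dest=$3
    # Keep only the final <count> lines, which usually include the crash itself.
    tail -n "$count" "$src" > "$dest"
}

# Example (not run here):
#   last_lines 1000 log.txt log-excerpt.txt
```

The full log stays on disk, so more context can still be extracted later if the excerpt turns out to be too short.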

@kchibisov
Member

Can you post this log to your sway issue referenced here?

@divoxx
Author

divoxx commented May 5, 2020

I'll add it there if you think it's relevant. That said, I do want to point out one thing that has changed since this issue and the other one were reported.

The issue seems to be the same (triggering an eglCreateImageKHR error and losing the connection to wayland), but the behavior has changed a bit. Originally, it would cause sway to close the window while the alacritty process continued to run headless. Now, it looks like the same issue is causing alacritty to fully crash and exit.

It's obviously possible that I am wrong about this, since I am not familiar with the internals of either sway or alacritty, but my guess is that the issue is likely in Alacritty and not in sway. Nonetheless, I'll upload the logs there as well.

@mfsch

mfsch commented Jun 8, 2020

Just wanted to add that I seem to have the same issue with Alacritty on Sway/Wayland, but in my case it’s with Intel graphics (running NixOS). If that’s useful I could try to obtain logs but as others wrote it’s a bit hard to reproduce since it happens randomly after a couple of days (sometimes weeks) and suspend/resume cycles.

@stephan-t

I've also been having the same problem with alacritty and sway. However, it usually happens when I reload the sway config file. It doesn't happen every time but it does happen quite frequently. I've also seen it happen after resuming from suspend.

Other programs such as Firefox (Wayland), GNOME Terminal (Wayland), urxvt (XWayland) do not crash.

I am running an AMD GPU.

@justinlovinger

justinlovinger commented Oct 12, 2020

I have had the same issue on Sway in NixOS. It appears to happen during DPMS. I've seen it on Intel and AMD graphics. I believe it happens when no Sway outputs are available, but that is hard to test because the only reason no Sway outputs would be available in my setup is a Sway bug. See swaywm/sway#5728.

I have also seen the text in Alacritty windows become fuzzy, as if reverting to XWayland, under the same conditions in which the windows might disappear.

@kchibisov
Member

I wonder if something changes with our master?

@justinlovinger

I wonder if something changes with our master?

I think it is more likely that something changed in Sway/Wayland to make the circumstances that cause this more frequent. I have experienced this bug on a commit a few before 0.5 and on 0.5. One of my machines experienced it on Sway 1.4 (NixOS 20.03) and the other machine only started experiencing it on Sway 1.5 (NixOS 20.09).

@kchibisov
Member

I've just looked at some of the things you've mentioned, and the fuzzy thing may be solved by master.

@justinlovinger

I've just looked at some of the things you've mentioned, and the fuzzy thing may be solved by master.

I just tested master, and I couldn't reproduce the fuzziness issue.

Also, I couldn't reproduce Alacritty windows disappearing on 0.5 or master. However, that was always harder to reproduce. I honestly can't say if master fixed windows disappearing, something I recently did with Sway fixed it, or if I just can't easily reproduce it.

P.S. I was previously able to reproduce the fuzziness issue with for ((i=0; i<10; ++i)); do swaymsg 'output * disable'; sleep 1; swaymsg 'output * enable'; done.
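
The same loop can be written as a reusable POSIX sh function. This is a sketch under assumptions: toggle_outputs is a made-up name, and the repeat count and delay are parameters added for illustration, defaulting to the 10 iterations and 1 second used above. Only the two swaymsg commands come from the original one-liner.

```shell
#!/bin/sh
# Sketch only: toggle_outputs is a hypothetical helper name.
# Usage: toggle_outputs [count] [delay-seconds]
toggle_outputs() {
    count=${1:-10}
    delay=${2:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        # Turn every output off, wait, then turn them back on.
        swaymsg 'output * disable'
        sleep "$delay"
        swaymsg 'output * enable'
        i=$((i + 1))
    done
}

# Example (not run here; this blanks all screens while it runs):
#   toggle_outputs 10 1
```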

@kchibisov
Member

kchibisov commented Oct 12, 2020

We can't really fix the windows disappearing (it looks like a Mesa thing), but we can do something about the fuzziness.

@marcin-sucharski

The same issue happens to me (NixOS 20.09; Alacritty 0.5.0; Sway 1.5; AMD GPU RX550; dual monitor setup). I do not observe any correlation with DPMS. I suspect that it is correlated with the runtime of a single terminal window (the longer it has been running, the more likely it is to disappear).

Is there any issue on the Mesa side which can be followed? Is there any way I can help solve this issue (I can do some basic debugging and information gathering, although I know neither Alacritty's codebase nor Mesa's)? I can reproduce this issue pretty frequently on my setup.

@kchibisov
Member

Is there any issue on Mesa side which can be followed? Is there any way I can help solving this issue (I can do some basic debugging / collecting information however I do not know Alacritty's codebase nor Mesa's)? I can reproduce this issue pretty frequently on my setup

Also, if you're able to repro on weston or another compositor, that would be interesting: if you get wlroots errors like the original author did, it could identify who is at fault.

@marcin-sucharski

I had trouble reproducing it just now. Before I tried to reproduce it, it would happen every hour or so, and now I had to wait several days. I failed to reproduce it exactly on Weston (it is not my main environment and I did not have time to experiment with it); however, it seems that I misinterpreted my issue. I actually encountered two different problems, and it seems that both of them are already fixed on master :)

My problems (just for reference and future readers):

  1. An issue which occurs when the window is resized. I was able to reproduce it only when weechat via tmux was open in the window. It printed to stderr:
thread 'main' panicked at 'index out of bounds: the len is 24 but the index is 18446744073709551591', /build/source/alacritty_terminal/src/term/search.rs:423:15
stack backtrace:
   0:     0x56432437d85f - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h8e44e3bad104136e
   1:     0x564324266e9d - core::fmt::write::he8cb6d64ed166147
   2:     0x56432438cce6 - std::io::Write::write_fmt::hf25ce96005919ce6
   3:     0x5643243856b0 - std::panicking::default_hook::{{closure}}::hf8bcda2c877e2dcc
   4:     0x564324385182 - std::panicking::rust_panic_with_hook::h7b83b0fe7900eb7a
   5:     0x564324384e18 - rust_begin_unwind
   6:     0x564324265b20 - core::panicking::panic_fmt::h61e03e91a1a8868a
   7:     0x5643242659b1 - core::panicking::panic_bounds_check::ha31c8fe198f5f816
   8:     0x5643241dde08 - alacritty_terminal::term::search::<impl alacritty_terminal::term::Term<T>>::line_search_right::hfae2da2814b53da2
   9:     0x56432416b7af - alacritty::display::Display::draw::h8b0a9fed9b9d7e7a
  10:     0x5643241fbf72 - alacritty::event::Processor<N>::run::{{closure}}::h6e78d74fc0a6f65a.6172
  11:     0x56432408dc41 - winit::platform_impl::platform::wayland::event_loop::EventLoop<T>::run_return::h4019b16481c7363c
  12:     0x5643241bf566 - alacritty::event::Processor<N>::run::h93f141a726b1f984
  13:     0x5643241265c2 - alacritty::main::h5d054fddf28d7be0
  14:     0x56432437d6e3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h9597725d379735e1
  15:     0x56432412026f - main
  16:     0x7f7bc82dcc7d - __libc_start_main
  17:     0x5643240641da - _start
  18:                0x0 - <unknown>

It seems to be the same problem as #4384, so I assume that it is fixed.

  2. A spontaneous crash when Alacritty has been running for too long. Stderr:
file descriptor expected, object (49), message keymap(uhu)
thread 'main' panicked at 'failed to read wayland events: Invalid argument (os error 22)', /build/rustc-1.45.2-src/src/libstd/macros.rs:16:9

It looks like it is fixed on master too (#4358).

Thank you!

@kchibisov
Member

Yeah, in general, if you don't see an output from either sway/wlroots or mesa it's likely a different issue, which could also be fixed on the latest master.

@kchibisov
Member

I'll close this since it's likely fixed on either the sway or the alacritty side, though I've never seen this issue myself in the wild on sway for the reason given in the initial report.

8 participants