SDDM races with DRM GPU drivers #1917

Open
theofficialgman opened this issue Apr 24, 2024 · 11 comments

@theofficialgman

theofficialgman commented Apr 24, 2024

A similar issue was encountered in GDM; the fix is linked from this issue: https://gitlab.gnome.org/GNOME/mutter/-/issues/2909

The amdgpu driver developer does not want to classify this as an amdgpu driver bug (https://gitlab.freedesktop.org/drm/amd/-/issues/3341), as it is independent of the driver in use.

The issue is that on boot the displays (the internal laptop panel and any connected external displays) are black, but the backlight is lit. I am able to boot into recovery mode without issue, since the graphics drivers are not loaded in that case (only the amd framebuffer driver and userspace mesa llvmpipe). A cold boot from OFF seems to be the most common case for this issue; it happens about 50-75% of the time from there.

As seen in the syslog, sddm is starting before amdgpu has fully initialized and this results in sddm starting with a black screen.
https://launchpadlibrarian.net/725987752/preboot.txt

When sddm starts after amdgpu has fully initialized, it works as intended.
https://launchpadlibrarian.net/725988838/cuttent_boot.txt

This issue has been confirmed to occur on i915 as well with similar frequency.

Alternatives Considered: have systemd CanGraphical state change only after initialization has finished systemd/systemd#32509

theofficialgman changed the title from "black screen on boot due to amdgpu race condition" to "black screen on boot due to drm race condition" Apr 26, 2024
theofficialgman changed the title from "black screen on boot due to drm race condition" to "SDDM races with DRM GPU drivers" Apr 26, 2024
@Web-Dev-Codi

I am currently running a 7800XT on a B550 MB. I installed Ubuntu Cinnamon 24.04 and got a black screen after the GRUB menu. The first reboot went fine, but it stopped working after that. Are there any fixes for this error?

@theofficialgman
Author

theofficialgman commented Apr 28, 2024

Cinnamon traditionally uses LightDM as the display manager, not SDDM which is what this bug report is for. I have not checked LightDM for the same issue but it might happen there too.

The only workarounds currently are restarting the display manager or artificially delaying its start until after the driver has initialized (e.g. adding a large sleep command to the service that starts the DM).

@superm1

superm1 commented Apr 28, 2024

As a "clean" no code workaround, can you add in the systemd unit ExecStartPre command something like "udevadm settle"?
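
For anyone who wants to experiment with that idea, here is a minimal sketch of a pre-start helper in Python. It is an illustration only and not part of SDDM: the function names `settle_command` and `udev_settle` and the 10-second default are made up for this example. The only external call is `udevadm settle --timeout=SECONDS`, which exits with status 0 once the udev event queue is empty and with a non-zero status if the timeout expires first.

```python
import subprocess


def settle_command(timeout_s=10):
    """Build the argv for `udevadm settle` with an upper bound on the wait."""
    return ["udevadm", "settle", f"--timeout={int(timeout_s)}"]


def udev_settle(timeout_s=10, run=subprocess.run):
    """Block until the udev event queue is empty or the timeout expires.

    `run` is injectable (hypothetical design choice for this sketch) so the
    helper can be exercised on machines where udevadm is not installed.
    Returns True if the queue drained in time, False otherwise.
    """
    result = run(settle_command(timeout_s), check=False)
    return result.returncode == 0
```

A helper like this would be called before the display manager starts, and a caller would probably want to continue even when it returns False so that a slow settle delays the start rather than blocking it. Note that waiting for the udev queue is not the same as waiting for the GPU driver to finish initializing, so this can only narrow the race.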

@Web-Dev-Codi

Thanks. This opens up more I can look into for a possible fix. I installed a pre-release 2 weeks ago and it worked flawlessly. This error only occurred with the official release. I hope I can solve this soon.

@Vogtinator
Contributor

> Alternatives Considered: have systemd CanGraphical state change only after initialization has finished systemd/systemd#32509

Yeah, that's the way to go. This is all just a giant hack, so it should be worked around in a single place which is systemd.

> As a "clean" no code workaround, can you add in the systemd unit ExecStartPre command something like "udevadm settle"?

FWICT there's also a systemd-udev-settle.service which is meant for legacy services but fits perfectly here. Please try #1924.

@Web-Dev-Codi

Can someone break this down in layman's terms? I have no idea what this means or how to find a workaround.

@Web-Dev-Codi

Excuse my noobishness. I enjoy the Cinnamon flavor for my web development and would love to use 24.04 soon.

@Vogtinator
Contributor

> Can someone break this down in layman's terms? I have no idea what this means and how to find a workaround.

If you use Ubuntu Cinnamon, this is the wrong place anyway, see #1917 (comment).

@theofficialgman
Author

theofficialgman commented May 1, 2024

> > Alternatives Considered: have systemd CanGraphical state change only after initialization has finished systemd/systemd#32509
>
> Yeah, that's the way to go. This is all just a giant hack, so it should be worked around in a single place which is systemd.
>
> > As a "clean" no code workaround, can you add in the systemd unit ExecStartPre command something like "udevadm settle"?
>
> FWICT there's also a systemd-udev-settle.service which is meant for legacy services but fits perfectly here. Please try #1924.

I don't think I'll be able to reliably report back the results of adding that. My race window is too small.
It varies from AMDGPU finishing loading ~1 second before sddm starts to finishing loading ~1 second after sddm starts.
Some days it starts correctly every time, other times nearly never. I imagine that calling the settle service, with the sleeps it does just waiting for the daemon to start, will be enough on its own to push the sddm start late enough, regardless of the checks it does.

Also, systemd-udev-settle.service and udevadm settle (they are the same thing) are deprecated and you aren't supposed to use them: https://github.com/systemd/systemd/blob/main/src/udev/udevadm-settle.c

@superm1

superm1 commented May 1, 2024

I see the pull request as a no-code workaround until there is a better solution, such as a code change to systemd or to sddm that does the settling actions on PCI VGA class devices.
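
To make the idea of settling on PCI VGA class devices a bit more concrete, here is a rough Python sketch. It is not SDDM or systemd code. It assumes the usual Linux sysfs layout: each entry under `/sys/bus/pci/devices` has a `class` file, display controllers report a class code starting with `0x03`, and a DRM driver that has registered its device exposes a `drm/cardN` directory under that PCI device. Treating the presence of that card node as "initialized" is an assumption of this sketch, and the function names, the 10-second timeout and the polling interval are all hypothetical.

```python
import time
from pathlib import Path


def display_pci_devices(sysfs_root="/sys"):
    """Return PCI device directories whose class code marks a display controller (0x03xxxx)."""
    pci_dir = Path(sysfs_root) / "bus" / "pci" / "devices"
    if not pci_dir.is_dir():
        return []
    devices = []
    for dev in sorted(pci_dir.iterdir()):
        try:
            pci_class = (dev / "class").read_text().strip().lower()
        except OSError:
            continue  # device vanished or has no readable class file
        if pci_class.startswith("0x03"):
            devices.append(dev)
    return devices


def has_drm_card(dev):
    """True once at least one DRM card node exists under this PCI device."""
    drm_dir = dev / "drm"
    if not drm_dir.is_dir():
        return False
    return any(p.name.startswith("card") for p in drm_dir.iterdir())


def wait_for_display_drivers(sysfs_root="/sys", timeout=10.0, interval=0.1):
    """Poll until every display-class PCI device has a DRM card node.

    Returns True when all of them do, False if the timeout expires first
    (including the case where no display-class device is found at all).
    """
    deadline = time.monotonic() + timeout
    while True:
        devices = display_pci_devices(sysfs_root)
        if devices and all(has_drm_card(d) for d in devices):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Polling keeps the sketch short; a real implementation would more likely react to udev events. The timeout matters because a display-class device whose driver is blacklisted or fails to load would never get a card node, and the display manager should still start in that case.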

@shoffmeister

shoffmeister commented May 11, 2024

In the past I had regular problems with boot-stuck SDDM on an Intel+NVIDIA notebook for a while, see https://bugs.kde.org/show_bug.cgi?id=478616 - could those be related?

This problem manifested in log output such as

kwin_core: Failed to open /dev/dri/card0 device (Did not receive a reply.

and black screens.

FWIW, I am now on Fedora 40 / KDE 6 proper, and I do not recollect any such misbehaviour in recent times. I am unable to trace this "magical disappearance" back to any specific change:

  • all the drivers changed (NVIDIA, Intel), together with the kernel
  • removed any/all NVIDIA and Intel driver overrides
  • cleaned up some pre-F40 SDDM overrides (towards Wayland)
  • removed some environment variables on KDE which set DRM device (card) order
