Aurora blank screen on discrete gpu followed with os crash #1261

Closed
tomrutsaert opened this issue May 7, 2024 · 12 comments

Comments

@tomrutsaert

After a while, Aurora shows a blank screen on the displays connected to the discrete GPU, followed by a system crash and an automatic reboot.

It does not happen immediately: sometimes after half a day of work, sometimes after an hour of gaming.
I suspect it has to do with the AMD drivers for my discrete GPU, because only the displays connected to the discrete GPU go blank. The display connected to the integrated GPU keeps working until the PC crashes a couple of seconds later.
Music also keeps playing.

device info:
https://paste.centos.org/view/237b191d

Can I provide any other info? I have no clue where to start searching for the cause.

@m2Giles
Member

m2Giles commented May 7, 2024

I'm not seeing anything in your info. Nice computer btw.

Has this happened pretty consistently?

If it happens again, see if you can switch to a VT and dump journalctl. I'll look at the KWin Bugzilla, but I'm guessing KWin doesn't like the multi-GPU setup.
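
Since there may be only a few seconds between the blank screen and the reboot, the journal dump could be prepared as a one-line script ahead of time. The following is a minimal sketch, not an Aurora tool: the helper names `journal_command` and `dump_journal` are made up for illustration, and only standard `journalctl` options (`--boot`, `--dmesg`, `--priority`, `--no-pager`) are used.

```python
import subprocess


def journal_command(boot=0, kernel_only=False, priority=None):
    """Build a journalctl argument list.

    boot=0 selects the current boot, boot=-1 the previous one.
    (Hypothetical helper, named for this example only.)
    """
    cmd = ["journalctl", "--no-pager", f"--boot={boot}"]
    if kernel_only:
        # Restrict output to kernel messages, like `journalctl -k`.
        cmd.append("--dmesg")
    if priority is not None:
        # For example "warning" keeps warnings and anything more severe.
        cmd.append(f"--priority={priority}")
    return cmd


def dump_journal(path, **options):
    """Write the selected journal entries to `path` as raw bytes."""
    with open(path, "wb") as out:
        subprocess.run(journal_command(**options), stdout=out, check=True)
```

From a VT during the hang, `dump_journal("crash.log")` would capture the current boot. After the reboot, `dump_journal("crash.log", boot=-1)` would capture the previous boot instead, assuming the journal is stored persistently.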

@tomrutsaert
Author

Thanks (it's a work and gaming PC in one).
This happens at least once a day.
I will try the VT switch next time, but I might not be fast enough to type the journalctl dump command.

Does it help to pastebin the output of ujust logs-last-boot?

I could also connect my other display to the discrete GPU, if really needed, and see if the problem goes away.

@m2Giles
Member

m2Giles commented May 7, 2024

Yeah last boot logs should work.

@tomrutsaert
Author

https://paste.centos.org/view/89598922
This is the last-boot log from the afternoon when I had the crash.

@m2Giles
Member

m2Giles commented May 7, 2024

Do you have any additional plasmoids installed?

This message is being spammed after login:
plasmashell[6779]: kf.svg: findInCache with a lastModified timestamp of 0 is deprecated

You then also get H264 decode errors.

Pipewire and Spectacle also seem to have trouble getting codecs.

Then at the end you hit some sort of kernel bug:
kwin_wayland[6510]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug
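
To see how often these messages occur in a saved log, a small counter can help. This is only a sketch: the two search strings are copied from the messages quoted above, while the labels and the `summarize` helper are made up for illustration.

```python
from collections import Counter

# Substrings taken from the log messages quoted in this thread.
# The labels on the left are arbitrary names chosen for this example.
PATTERNS = {
    "kf.svg deprecation": "findInCache with a lastModified timestamp of 0 is deprecated",
    "pageflip timeout": "Pageflip timed out",
}


def summarize(lines):
    """Count how many lines contain each known message."""
    counts = Counter()
    for line in lines:
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


sample = [
    "plasmashell[6779]: kf.svg: findInCache with a lastModified timestamp of 0 is deprecated",
    "plasmashell[6779]: kf.svg: findInCache with a lastModified timestamp of 0 is deprecated",
    "kwin_wayland[6510]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug",
]
print(dict(summarize(sample)))
```

Running `summarize` over the lines of a real log file (for example `open("crash.log", errors="replace")`) would show whether the deprecation spam is in the hundreds or thousands and how many pageflip timeouts preceded the crash.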

@tomrutsaert
Author

tomrutsaert commented May 7, 2024

Not that I am aware of. I am running a rather stock Aurora: I only changed keyboard shortcuts and installed brew, JetBrains, flatpaks and some distroboxes.
I did put some CPU/RAM/disk widgets on the desktop at one point, but removed them again very quickly.

I did notice that the virtual desktop pager glitched out at one point: all panels went away and came back (with a popup error message).

Which codecs would I be missing? Aren't all necessary codecs included in the image?
At the time of the decode errors I was in a Teams meeting (Teams app via the Edge flatpak).

Is that kernel bug the crash? Is there any way to figure out what caused it?

@m2Giles
Member

m2Giles commented May 7, 2024

Can you try switching from the Aurora theme to the stock Fedora Theme? I want to rule out the Aurora theme.

The codec message in question reports a bitstream error on H264. If you were inside a flatpak, the driver should have been provided there. I haven't seen that message in my own logs, but it was an interesting note. Pipewire and Spectacle both call vainfo and they do find radeonsi.so, but they call it multiple times.

The most jarring thing is that the deprecation notice is being spammed over and over. The warning at the end said the DRM page flip timed out, which is likely why the desktop went black.

@tomrutsaert
Author

tomrutsaert commented May 7, 2024

I have now switched the theme to Fedora (Breeze Dark).
I will post my last-boot log again when I have my next crash.
Should I already connect my third monitor to the discrete GPU, or should I wait with that?

@tomrutsaert
Author

It happened again.

The last-boot log (only the last 2 hours, the full log was too big for the pastebin): https://paste.centos.org/view/ee772d38

I was trying to take a rectangle screenshot of IntelliJ with Spectacle.
My theme is now Fedora.
Could it be related to IntelliJ and Wayland?
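
On trimming a log that is too big for the pastebin: one way to keep only the final two hours before the crash is to filter by timestamp. The sketch below assumes the log was exported with `journalctl -o short-iso`, so every entry starts with an ISO 8601 timestamp. The helper names and the timestamps in the sample are hypothetical.

```python
from datetime import datetime, timedelta

# Timestamp layout assumed for `journalctl -o short-iso` output.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    token = line.split(" ", 1)[0]
    try:
        return datetime.strptime(token, TS_FORMAT)
    except ValueError:
        return None


def tail_window(lines, hours=2.0):
    """Keep the lines from the last `hours` before the final timestamped entry."""
    stamps = [parse_ts(line) for line in lines]
    known = [ts for ts in stamps if ts is not None]
    if not known:
        return list(lines)  # nothing to filter on, keep everything
    cutoff = known[-1] - timedelta(hours=hours)
    for index, ts in enumerate(stamps):
        if ts is not None and ts >= cutoff:
            return list(lines[index:])
    return []


# Made-up timestamps and host name, for illustration only.
sample = [
    "2024-05-08T10:00:00+0200 host plasmashell[6779]: kf.svg: findInCache with a lastModified timestamp of 0 is deprecated",
    "2024-05-08T13:30:00+0200 host plasmashell[6779]: kf.svg: findInCache with a lastModified timestamp of 0 is deprecated",
    "2024-05-08T15:00:00+0200 host kwin_wayland[6510]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug",
]
print(len(tail_window(sample, hours=2)))
```

Measuring the window back from the last entry, instead of from the current time, keeps the lines leading up to the crash even when the log is trimmed hours later.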

@m2Giles
Member

m2Giles commented May 8, 2024

I'm leaning toward something with Spectacle.

That kf.svg spam is still occurring, though.

Same bug at the end: page flip timed out.

I did some research and this appears to be a GPU firmware bug. We have a similar issue with GNOME when using real-time scheduling, and have just started setting it to use a user thread instead.

I'm wondering if there is a similar variable for KWin to not use real-time scheduling for the compositor.
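
Before looking for such a variable, it may be worth confirming whether the compositor actually runs with a real-time policy on the affected machine. The sketch below reads the scheduling policy of every thread of a process through Python's `os.sched_getscheduler`. It is Linux-only, the numeric constants are the Linux policy values, and the helper names are made up for illustration. It does not change any KWin setting.

```python
import os

# Linux scheduling policy numbers (see sched(7)).
POLICIES = {
    0: "SCHED_OTHER",
    1: "SCHED_FIFO",
    2: "SCHED_RR",
    3: "SCHED_BATCH",
    5: "SCHED_IDLE",
    6: "SCHED_DEADLINE",
}
# Flag that Linux may OR into the value returned by sched_getscheduler().
RESET_ON_FORK = 0x40000000


def policy_name(raw):
    """Translate a raw policy value into its name, ignoring RESET_ON_FORK."""
    return POLICIES.get(raw & ~RESET_ON_FORK, f"unknown({raw})")


def is_realtime(raw):
    """True for the real-time policies SCHED_FIFO and SCHED_RR."""
    return (raw & ~RESET_ON_FORK) in (1, 2)


def thread_policies(pid):
    """Map each thread ID of `pid` to its scheduling policy name (Linux only)."""
    result = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        try:
            result[int(tid)] = policy_name(os.sched_getscheduler(int(tid)))
        except ProcessLookupError:
            continue  # the thread exited while we were iterating
    return result
```

Calling `thread_policies(pid)` with the PID of `kwin_wayland` (for example from `pidof kwin_wayland`) would list any threads reported as `SCHED_FIFO` or `SCHED_RR`.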

@tomrutsaert
Author

I did not have a crash for a couple of days (during which I did not use Spectacle), and just now had a crash again after taking several rectangle screenshots with Spectacle. So if it really is Spectacle, this should be easily reproducible on another machine by taking a couple of rectangle screenshots.

@m2Giles
Member

m2Giles commented May 17, 2024

I haven't been able to reproduce it on my Intel-based machine.

@castrojo castrojo closed this as completed Jun 2, 2024