kernel 6.6.30-v8+ and 6.6.31-v8+ strange system freeze due to firmware error #6170
Comments
Can you identify the exact update which caused this? See: If you click on each commit, the end of the URL contains a git hash. Run You know it's between 6.6.28 and 6.6.30, so there aren't too many commits to check. |
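The search suggested here amounts to a bisection over the commits between the last known-good and the first known-bad build. A minimal sketch of that idea, assuming a hypothetical list of commit hashes ordered oldest to newest and a hypothetical `is_bad` callback (for example: install that build, use the system for a while, and report whether the freeze occurred). Neither name comes from the thread or from rpi-update itself.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes (hypothetically) that commits are ordered oldest to newest,
    that every commit after the first bad one is also bad, and that the
    last commit in the list is known to be bad.
    """
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid        # the regression is at mid or earlier
        else:
            lo = mid + 1    # the regression is strictly after mid
    return commits[lo]
```

Because the freeze is intermittent, a "good" verdict is only as reliable as the time spent testing that build, so each step may need several hours of use before `is_bad` can answer.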
Just caught it again. I will try different commits, but it's not easy, because this issue is intermittent and I cannot reproduce it with stable steps. This time it happened when I clicked the close tab button in chromium-browser:
I see there are a lot of errors, but the system freeze happened at about 08:57:45. |
I tried to install I had some short freezes (about 1-2 sec each) and checked the log. The last messages are:
but before that there are a lot of errors; here are some of them:
I also tried
when I tried to execute it again:
it is full of the same messages |
I found that there is a huge memory leak in firefox, so running out of memory may be the reason for this issue. |
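One way to check the out-of-memory hypothesis is to record how much memory is still available while the browser runs. A minimal sketch, assuming the standard Linux `/proc/meminfo` layout (`Key:   value kB`); the function names are illustrative, not part of any tool mentioned in this thread.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field name: numeric value}."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_fraction(values: Dict[str, int]) -> float:
    """Fraction of total RAM the kernel reports as still available."""
    return values["MemAvailable"] / values["MemTotal"]


def read_available_fraction(path: str = "/proc/meminfo") -> float:
    """Convenience wrapper that reads the live file on a Linux system."""
    with open(path) as f:
        return available_fraction(parse_meminfo(f.read()))
```

Calling `read_available_fraction()` periodically (for example from an ssh session) and logging the result would show whether available memory shrinks steadily before a freeze.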
Caught the issue several times with The system freezes and doesn't respond even to ssh connections. Alt+SysRq+E doesn't work. It happens when I click close tab in chromium-browser.
|
Just caught it again, but this time I was able to connect via ssh and see the logs before the system stopped responding completely. This time I reproduced it by rebooting and opening chromium-browser with one tab (tg.me page). I read the page for some time, then opened a new tab and clicked the close button on the old tab. At that moment the Wayland desktop froze, but I connected via ssh and took logs. vclog contains thousands of the same alloc_compact_internal errors in a loop; GitHub doesn't accept that much text, so I skipped it. sudo vclog -m:
dmesg:
|
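When a log repeats the same message thousands of times, it can be collapsed into a per-message count so that it fits in a comment. A minimal sketch, assuming dmesg-style lines with a leading `[seconds.micros]` timestamp; other tools' output may use a different prefix and would need a different pattern. The function name is illustrative.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Leading dmesg-style timestamp, e.g. "[  123.456789] " (assumed format).
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def summarize_log(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Strip timestamps and count identical messages, most frequent first."""
    counts: Counter = Counter()
    for line in lines:
        message = _TIMESTAMP.sub("", line.rstrip("\n"))
        if message:
            counts[message] += 1
    return counts.most_common()
```

Feeding it the lines of a saved log file and printing each `(message, count)` pair gives a short summary that keeps the distinct errors visible.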
After the latest update to 6.6.31 it happened again:
|
Are you viewing a web page with multiple videos on? What is the url? |
Usually it is the Telegram site, sometimes YouTube. Almost any channel has many videos on the page. The issue happens again and again on a regular basis. But note that it doesn't happen immediately: it usually takes a long time (at least several hours) to catch it, although in some rare cases I catch it immediately after a reboot. Very often an unrecoverable Wayfire freeze happens when I open a new blank tab in the browser and click close on the old tab. The click on close tab leads to a complete freeze. Usually there are just two tabs at that moment: the new empty one and the old one with a page that I was reading before (it can happen for a page with video, but sometimes it happens for an ordinary page like stackoverflow.com, some technical forum, or just google.com). The log consists of a bunch of errors in drivers/media/common/videobuf2/videobuf2-core.c
Usually these error messages appear after visiting websites with video, but the system still works normally. But after some time something happens and some kernel code stops responding. First it leads to a frozen chromium-browser or firefox. I notice that at this stage websites which use graphics (web games like slither.io and agar.io) can't be loaded, and an attempt to open one freezes the browser with no way to close it in the usual way. But my OpenGL app, which makes a lot of OpenGL calls with textures, still works OK. A force kill from system-monitor kills the browser, but starting it again and opening such a page again leads to the same freeze. Only a system reboot clears that state.

Then after some time Wayfire apparently freezes (the current picture stops and doesn't update anymore), but the system may still respond to ssh connections. An attempt to kill and restart Wayfire has no effect; it still shows the frozen picture with no changes. At this point it is noticeable that the ssh connection has strange random long freezes: even when I type a command it may stop echoing for several seconds. Shortly after the Wayfire freeze, the entire system freezes and stops responding even to ssh connections. At the early stage of this state it just refuses ssh connections, but later it stops responding at all.

Sometimes it leads to an unrecoverable complete kernel freeze (with no way to connect through ssh) immediately after clicking close tab in the browser, but sometimes the issue gets worse and worse step by step. When it happens step by step, force killing the browser app which caused it doesn't help. The browser process is killed, but once this happens something in the firmware code is broken, and if you restart the browser it will happen again. Only manual I am even thinking about rolling back to kernel 6.6.28, but unfortunately the old kernel has some issue with video codec support and cannot play some video formats. |
Just caught it again (complete freeze, but ssh still alive). Here is the log:
vclog:
|
The same report from another user: https://forums.raspberrypi.com/viewtopic.php?t=372078 |
Kernel 6.6.32-v8+ is now available and the issue has not reproduced for a long time, so it is no longer relevant and I will close it. |
Describe the bug
I caught it several times after switching from 6.6.28 to 6.6.30 (I switched with sudo rpi-update rpi-6.6.y). The mouse cursor freezes, and after some time the background sound player stops. When it happens I cannot connect with ssh: it asks for the password, but then freezes. Sometimes Alt+SysRq+I allows me to restart Wayfire, but this time it didn't help. Complete freeze...
I tried switching back to 6.6.28 and it didn't happen. Then I switched to 6.6.30 and after about half an hour it happened again.
Steps to reproduce the behaviour
I don't know how to reproduce it. It just happens after some runtime.
Device (s)
Raspberry Pi 4 Mod. B
System
Logs
Additional context
3 user processes were running during the issue: sdrpp, chromium-browser and vlc. Previously I caught it without sdrpp.
Power supply is ok, temperature didn't exceed 50 °C (usually it is 39-46°C under load).