Description
Device Information
Framework 13th AMD, Ryzen 7 7840U
System Model or SKU
Framework 13th AMD, Ryzen 7 7840U
Please select one of the following
- Framework Laptop 13 (11th Gen Intel® Core™)
- Framework Laptop 13 (12th Gen Intel® Core™)
- Framework Laptop 13 (13th Gen Intel® Core™)
- Framework Laptop 13 (AMD Ryzen™ 7040 Series)
- Framework Laptop 13 (Intel® Core™ Ultra Series 1)
- Framework Laptop 16 (AMD Ryzen™ 7040 Series)
BIOS VERSION
Please provide the bios version.
Linux: 03.05
DIY Edition information
If you are experiencing an issue on a DIY system, please also fill out the memory and storage devices you are using.
Memory: Framework official RAM DDR5 2x32GB
Storage: Samsung SSD 970 EVO Plus 2TB
Port/Peripheral information
Seems to be related to the BIOS/AMDGPU driver.
Standalone Operation
Are you running your mainboard as a standalone device? Is standalone mode enabled in the BIOS?
- Yes
- No
Describe the bug
Under 6.11.x kernels, the 'tsc' clocksource is sometimes marked as unstable. The messages in the kernel log buffer (dmesg) say this is most likely due to a broken BIOS.
AMDGPU also sometimes reports very odd errors (@superm1); these could be caused by the BIOS/firmware as well.
on CPU2: Marking clocksource 'tsc' as unstable because the skew is too large:
'hpet' wd_nsec: 503458959 wd_now: 2ed90c0a wd_last: 2e6b0d62 mask: ffffffff
'tsc' cs_nsec: 503985962 cs_now: fccb13b417a6 cs_last: fccab0c1f179 mask: ffffffffffffffff
Clocksource 'tsc' skewed 527003 ns (0 ms) over watchdog 'hpet' interval of 503458959 ns (503 ms)
sept. 06 20:57:03 hostname kernel: clocksource: 'tsc' is current clocksource.
sept. 06 20:57:03 hostname kernel: tsc: Marking TSC unstable due to clocksource watchdog
sept. 06 20:57:03 hostname kernel: TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
sept. 06 20:57:03 hostname kernel: sched_clock: Marking unstable (22961617650112, 61413986153193)<-(84375623658522, -19871412)
sept. 06 20:57:03 hostname kernel: clocksource: Checking clocksource tsc synchronization from CPU 4 to CPUs 0-2,5,13,15.
sept. 06 20:57:03 hostname kernel: clocksource: Switched to clocksource hpet
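The skew figure in the watchdog message can be checked by hand: it is the difference between the interval measured by 'tsc' (cs_nsec) and the interval measured by 'hpet' (wd_nsec). Below is a minimal Python sketch of that arithmetic, using the two lines copied from the log above. The helper name `parse_intervals` and the parts-per-million figure are my own additions for illustration and are not something the kernel prints.

```python
import re

# The two watchdog lines, copied verbatim from the kernel log above.
LOG = """\
'hpet' wd_nsec: 503458959 wd_now: 2ed90c0a wd_last: 2e6b0d62 mask: ffffffff
'tsc' cs_nsec: 503985962 cs_now: fccb13b417a6 cs_last: fccab0c1f179 mask: ffffffffffffffff
"""


def parse_intervals(log: str) -> dict[str, int]:
    """Return {clocksource name: elapsed nanoseconds} for each watchdog line.

    Hypothetical helper: it only understands the two line shapes shown above.
    """
    pattern = re.compile(r"'(\w+)' (?:wd|cs)_nsec: (\d+)")
    return {name: int(nsec) for name, nsec in pattern.findall(log)}


intervals = parse_intervals(LOG)

# Skew = interval seen by the clocksource minus interval seen by the watchdog.
skew_ns = intervals["tsc"] - intervals["hpet"]

# Relative drift in parts per million (illustrative, not a kernel value).
skew_ppm = skew_ns / intervals["hpet"] * 1_000_000

print(f"skew: {skew_ns} ns over {intervals['hpet']} ns (~{skew_ppm:.0f} ppm)")
```

The computed difference matches the 527003 ns reported in the log, so the message is internally consistent: over roughly half a second, 'tsc' ran about half a millisecond ahead of 'hpet'.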
nov. 26 22:25:05 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:05 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:05 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:06 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:06 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:06 hostname wpa_supplicant[1896]: wlp1s0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-38 noise=9999 txrate=286700
nov. 26 22:25:06 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:07 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:07 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:07 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:07 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:08 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:08 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:08 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:08 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:09 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:09 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
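To give a rough idea of how fast these errors repeat, here is a small Python sketch that counts "DMCUB error" lines per one-second journal timestamp. The function name `dmcub_errors_per_second` is hypothetical, and the sketch assumes the journal's default short format, where the first three whitespace-separated fields are month, day, and time, as in the excerpt above. The sample is only a subset of that excerpt.

```python
from collections import Counter

# A few lines copied from the excerpt above (not the full log).
SAMPLE = """\
nov. 26 22:25:05 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:05 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
nov. 26 22:25:06 hostname wpa_supplicant[1896]: wlp1s0: CTRL-EVENT-SIGNAL-CHANGE above=1 signal=-38 noise=9999 txrate=286700
nov. 26 22:25:06 hostname kernel: amdgpu 0000:c1:00.0: [drm] *ERROR* dc_dmub_srv_log_diagnostic_data: DMCUB error - collecting diagnostic data
"""


def dmcub_errors_per_second(lines):
    """Count 'DMCUB error' messages per timestamp (one-second resolution)."""
    counts = Counter()
    for line in lines:
        if "DMCUB error" not in line:
            continue  # skip unrelated entries such as wpa_supplicant
        # Assumption: the first three fields are "<month> <day> <HH:MM:SS>".
        timestamp = " ".join(line.split()[:3])
        counts[timestamp] += 1
    return counts


counts = dmcub_errors_per_second(SAMPLE.splitlines())
for timestamp, n in sorted(counts.items()):
    print(timestamp, n)
```

In the full excerpt above, the error is logged several times per second for as long as the lag lasts.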
Steps To Reproduce
Steps to reproduce the behavior:
- Use the computer for at least 7 days of uptime, with at least 30 GB of RAM in use.
- The issue then shows up at random; I have no reliable way of reproducing it. (I am on Linux kernel 6.11.8 and have seen the issue on kernels 6.11.4 through 6.11.8.)
- Multiple suspend cycles were done during the 7 days of usage. Whether a USB-C dock is used or not does not seem to matter.
Workaround for the AMDGPU issue
Manually triggering a GPU recovery through the driver sometimes resolves the very heavy lag, but it can also freeze the laptop:
sudo cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover
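The DRI index (`1` in the command above) may differ on other machines or boots. As a hedged sketch, the following Python snippet lists candidate `amdgpu_gpu_recover` files without opening them, since reading the file is what triggers the recovery. The function name `find_gpu_recover_nodes` is hypothetical. It assumes debugfs is mounted at /sys/kernel/debug, and it usually has to run as root to see anything.

```python
from pathlib import Path


def find_gpu_recover_nodes(debugfs_root="/sys/kernel/debug"):
    """Return amdgpu_gpu_recover paths found under <debugfs_root>/dri/*.

    The files are only listed, never opened: reading one triggers a GPU
    recovery, which according to the report above can freeze the machine.
    """
    return sorted(Path(debugfs_root).glob("dri/*/amdgpu_gpu_recover"))


# Prints nothing when debugfs is not mounted or not readable by this user.
for node in find_gpu_recover_nodes():
    print(node)
```

Taking the debugfs root as a parameter keeps the helper testable against a scratch directory instead of the live system.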
Expected behavior
Neither the AMDGPU errors nor the clocksource issue should occur.
Operating System (please complete the following information):
- OS/Distribution: NixOS
- Version: 24.05 (I will soon move to the next release, 24.11). Note that I am using the latest kernel available in the Nixpkgs stable repositories.
- Linux Kernel Version:
uname -a: Linux hostname 6.11.8#1-NixOS SMP PREEMPT_DYNAMIC Thu Nov 14 12:21:16 UTC 2024 x86_64 GNU/Linux
Additional context
I have opened a topic on the Framework Community forum, but no one has answered yet. It contains more log output and lists more kernel versions on which I hit these issues: https://community.frame.work/t/nixos-amd-framework-13th-amd-ryzen-7-7840u-64gb-framework-ddr5-ram-uma-settings-gamer-on-kernels-6-11-x-have-random-heavy-lags-related-to-amdgpu-or-possibly-firmware/60561
I hope to try 6.12 kernels next week, and BIOS 3.06 once that release is marked as stable.