Kernel panic with kernel 6.5.5 because of sound driver #8574

Closed
halobarrlets opened this issue Oct 7, 2023 · 5 comments
Labels
affects-4.2 (This issue affects Qubes OS 4.2.)
C: kernel
C: Xen
P: default (Priority: default. Default priority for new issues, to be replaced given sufficient information.)
R: self-closed (Voluntarily closed by the person who opened it before another resolution occurred.)
T: bug (Type: bug report. A problem or defect resulting in unintended behavior in something that exists.)

Comments

@halobarrlets

Qubes OS release

R4.2

Brief summary

I'm not sure whether I should report this bug to Qubes OS (because of some specific kernel patches), to Xen, or to SOF, but I wanted to ask here first.
I have a laptop with SoundWire audio on which I've physically disconnected the webcam and microphone cable from the motherboard.
Qubes OS is installed on it and worked fine with dom0 kernel 6.4.13, but the new kernel 6.5.5 crashes when the cable is disconnected and works when the cable is connected.
The crash log:

[   30.517082] SDW: Invalid device for paging :0
[   30.517087] rt715-sdca sdw:3:025d:0714:01: Failed to set private value: ffffffea <= 6100000 24832
[   30.517145] SDW: Invalid device for paging :0
[   30.517147] rt715-sdca sdw:3:025d:0714:01: Failed to set private value: ffffffea <= 6100000 26368
[   30.517170] SDW: Invalid device for paging :0
[   30.517180] BUG: unable to handle page fault for address: ffff888308a23f58
[   30.517184] #PF: supervisor read access in kernel mode
[   30.517185] #PF: error_code(0x0000) - not-present page
[   30.517187] PGD 2c24067 P4D 2c24067 PUD 0 
[   30.517190] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   30.517192] CPU: 4 PID: 1994 Comm: alsactl Not tainted 6.5.5-1.qubes.fc37.x86_64 #1
...
[   30.517240]  ? asm_exc_page_fault+0x26/0x30
[   30.517245]  ? memcpy+0xc/0x20
[   30.517247]  kmemdup+0x36/0x50
[   30.517252]  regcache_maple_drop+0x135/0x2b0
[   30.517258]  _regmap_raw_write_impl+0x667/0x930
[   30.517260]  _regmap_write+0x50/0x100
[   30.517262]  _regmap_update_bits+0xf4/0x110
[   30.517265]  regmap_update_bits_base+0x5f/0x90
[   30.517269]  snd_soc_component_update_bits+0x44/0xe0 [snd_soc_core]
[   30.517313]  rt715_sdca_put_volsw+0x121/0x1b0 [snd_soc_rt715_sdca]
[   30.517322]  snd_ctl_elem_write+0xfe/0x1d0 [snd]
[   30.517341]  snd_ctl_ioctl+0x125/0x630 [snd]
...
[   30.517400] Modules linked in: snd_ctl_led snd_soc_sof_sdw snd_soc_intel_hda_dsp_common snd_soc_intel_sof_maxim_common snd_sof_probes snd_hda_codec_hdmi snd_soc_rt711_sdca snd_soc_rt715_m
...

The problem is in the SOF SoundWire driver for the rt715-sdca microphone. Because the microphone is not physically present, the driver runs into errors when it tries to access the microphone's registers.
However, this rt715-sdca error doesn't cause a kernel panic on Fedora 38 with kernel 6.5.5 without Xen:

[   26.704684] SDW: Invalid device for paging :0
[   26.704690] rt715-sdca sdw:3:025d:0714:01: Failed to set private value: ffffffea <= 6100000 24832
[   26.704720] SDW: Invalid device for paging :0
[   26.704722] rt715-sdca sdw:3:025d:0714:01: Failed to set private value: ffffffea <= 6100000 26368
[   26.704735] SDW: Invalid device for paging :0
[   26.704739] rt715-sdca sdw:3:025d:0714:01: ASoC: error at snd_soc_component_update_bits on sdw:3:025d:0714:01 for register: [0x40800109] -22
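
A small sketch for reading the error value in these logs, assuming the first hex field in "Failed to set private value: ffffffea" is the driver's return code printed as an unsigned 32-bit number. Under that assumption it is the same -22 that the Fedora log reports for snd_soc_component_update_bits. The helper name `decode_kernel_errno` is made up for illustration.

```python
import errno


def decode_kernel_errno(hex_value: str) -> int:
    """Interpret a 32-bit hex value from a kernel log as a signed integer."""
    raw = int(hex_value, 16)
    # A value with the top bit set is negative in 32-bit two's complement.
    return raw - (1 << 32) if raw & 0x80000000 else raw


code = decode_kernel_errno("ffffffea")
# Kernel functions return negative errno values, so look up the positive number.
name = errno.errorcode.get(-code, "unknown")
print(code, name)  # -22 EINVAL
```

If the assumption holds, both the Xen and the non-Xen boots hit the same EINVAL from the missing device, and only the Xen boot goes on to the page fault.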

Could this be related to some Qubes OS kernel patches? Or is it in Xen?

halobarrlets added the P: default and T: bug labels Oct 7, 2023
@marmarek
Member

marmarek commented Oct 7, 2023

Can you try the exact same kernel but without Xen? You can do that by editing the GRUB entry:

  1. Comment out the "multiboot2" line
  2. Change the vmlinuz "module2" to "linux"
  3. Change the initramfs "module2" to "initrd" (and remove the --nounzip option)

Of course no qube will start, but it should tell us whether the problem is in our kernel or in Xen.
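
The three edits above can be sketched as a text transformation. This is only an illustration: the sample entry is a shortened, hypothetical one (the file names and the root= value are assumptions, not taken from a real Qubes OS grub.cfg), and `boot_without_xen` is a made-up helper name. It also assumes the first "module2" line loads vmlinuz and the second loads the initramfs.

```python
def boot_without_xen(entry: str) -> str:
    """Apply the three edits to the text of a GRUB menu entry."""
    out = []
    kernel_done = False
    for line in entry.splitlines():
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        if stripped.startswith("multiboot2"):
            # Step 1: comment out the line that loads Xen.
            out.append(indent + "#" + stripped)
        elif stripped.startswith("module2") and not kernel_done:
            # Step 2: the first module2 line (vmlinuz) becomes "linux".
            out.append(indent + "linux" + stripped[len("module2"):])
            kernel_done = True
        elif stripped.startswith("module2"):
            # Step 3: the second module2 line (initramfs) becomes "initrd",
            # with the --nounzip option dropped.
            rest = stripped[len("module2"):].replace("--nounzip", "")
            out.append(indent + "initrd " + " ".join(rest.split()))
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical, shortened entry body used only to exercise the function.
SAMPLE_ENTRY = """\
    multiboot2 /xen.gz placeholder console=none
    module2 /vmlinuz-6.5.5-1.qubes.fc37.x86_64 placeholder root=/dev/mapper/example ro
    module2 --nounzip /initramfs-6.5.5-1.qubes.fc37.x86_64.img"""

print(boot_without_xen(SAMPLE_ENTRY))
```

In practice the same changes can be made by hand from the GRUB menu editor for a single boot, which leaves the installed configuration untouched.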

@halobarrlets
Author

Thank you for your suggestion. I've tried it, and the Qubes OS kernel does not crash without Xen, so the problem does not seem to be related to Qubes OS.
I'll close this issue as not related to Qubes OS and open a new one for SOF instead.

halobarrlets closed this as not planned Oct 8, 2023
andrewdavidwong added the R: self-closed label Oct 8, 2023
@DemiMarie

I’m reopening this because Xen is one of Qubes OS’s dependencies and this is still a real bug from an end-user perspective.

DemiMarie reopened this Oct 14, 2023
DemiMarie removed the R: self-closed label Oct 14, 2023
andrewdavidwong added the C: Xen, needs diagnosis, and affects-4.2 labels Oct 15, 2023
@halobarrlets
Author

I have a similar kernel panic, but this time it's caused by nouveau with the NVIDIA dGPU disabled.
I followed this guide to fully power down the discrete GPU:
https://wiki.archlinux.org/title/Hybrid_graphics#Fully_power_down_discrete_GPU
Now, if I boot kernel 6.1.57-1.qubes.fc37.x86_64, the dom0 kernel panics because the nouveau driver tries to access a not-present page.
If I add the module_blacklist=nouveau kernel boot option in GRUB, I can boot kernel 6.1.57-1.qubes.fc37.x86_64 without a problem.
If I boot kernel 6.5.6-2.qubes.fc37.x86_64 or 6.4.13-1.qubes.fc37.x86_64 without module_blacklist=nouveau, it boots successfully without a kernel panic.
So I guess this access to a not-present page in nouveau with a disabled dGPU was fixed in later kernel versions.

@halobarrlets
Author

The kernel no longer crashes with version 6.6.2, so I think the issue can be closed.

andrewdavidwong added the C: kernel and R: self-closed labels and removed the needs diagnosis label Dec 11, 2023