[BUG] sof-audio-pci-intel-tgl intermittent crash on audio playback (Dell XPS 13 9310) #3584

Closed
dcompoze opened this issue Apr 12, 2022 · 4 comments
Labels
bug Something isn't working Community end-user or distro-reported issues DMA memory alloc NOT on topic/sof-dev issue that does not happen on branch topic/sof-dev

Comments

@dcompoze

dcompoze commented Apr 12, 2022

Device: Dell XPS 13 9310

BIOS/UEFI version: DELL BIOS 3.5.1 02/25/2022

Operating System: Arch Linux 5.17.1-arch1-1 GNU/Linux

Package installed: sof-firmware 2.0-1 (pacman)

Modules loaded: lsmod | grep snd_sof

snd_sof_pci_intel_tgl    16384  0
snd_sof_intel_hda_common   131072  1 snd_sof_pci_intel_tgl
soundwire_intel        53248  1 snd_sof_intel_hda_common
snd_sof_intel_hda      20480  1 snd_sof_intel_hda_common
snd_sof_pci            20480  2 snd_sof_intel_hda_common,snd_sof_pci_intel_tgl
snd_sof_xtensa_dsp     20480  1 snd_sof_intel_hda_common
snd_sof               217088  2 snd_sof_pci,snd_sof_intel_hda_common
snd_soc_hdac_hda       28672  1 snd_sof_intel_hda_common
snd_hda_ext_core       36864  4 snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_soc_hdac_hda,snd_sof_intel_hda
snd_soc_acpi_intel_match    61440  2 snd_sof_intel_hda_common,snd_sof_pci_intel_tgl
snd_soc_acpi           16384  2 snd_soc_acpi_intel_match,snd_sof_intel_hda_common
snd_soc_core          385024  7 soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_soc_hdac_hda,snd_soc_dmic,snd_soc_skl_hda_dsp
snd_intel_dspcfg       36864  2 snd_hda_intel,snd_sof_intel_hda_common
snd_intel_sdw_acpi     20480  2 snd_sof_intel_hda_common,snd_intel_dspcfg
ledtrig_audio          16384  5 snd_ctl_led,snd_hda_codec_generic,dell_wmi,snd_sof,dell_laptop
snd_hda_core          114688  11 snd_hda_codec_generic,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_ext_core,snd_hda_codec,snd_hda_codec_realtek,snd_soc_intel_hda_dsp_common,snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_soc_hdac_hda,snd_sof_intel_hda
snd_pcm               163840  12 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,soundwire_intel,snd_sof,snd_sof_intel_hda_common,snd_soc_hdac_hdmi,snd_compress,snd_soc_core,snd_hda_core,snd_pcm_dmaengine
snd                   126976  29 snd_ctl_led,snd_hda_codec_generic,snd_seq,snd_seq_device,snd_hda_codec_hdmi,snd_hwdep,snd_hda_intel,snd_hda_codec,snd_hda_codec_realtek,snd_sof,snd_timer,snd_soc_hdac_hdmi,snd_compress,snd_soc_core,snd_pcm,snd_soc_skl_hda_dsp

Audio packages installed (pacman):

gst-plugin-pipewire 1:0.3.49-1
pipewire 1:0.3.49-1
pipewire-alsa 1:0.3.49-1
pipewire-jack 1:0.3.49-1
pipewire-media-session 1:0.4.1-1
pipewire-pulse 1:0.3.49-1

Issue:

The kernel module snd_sof_pci_intel_tgl started crashing about 3-4 weeks ago, after I upgraded all packages on the system. I tried reverting to previous versions of sof-firmware (1.9.3, 1.9.2, 1.9, 1.8, 1.7), but that didn't help. I also tried downgrading some of the pipewire packages, and that didn't help either.

The crash usually happens when playing video/audio in a browser (chromium) and produces the following log (journalctl -k):

Apr 12 21:23:20 laptop kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: error: memory alloc failed: -12
Apr 12 21:23:20 laptop kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Apr 12 21:23:20 laptop kernel: #PF: supervisor read access in kernel mode
Apr 12 21:23:20 laptop kernel: #PF: error_code(0x0000) - not-present page
Apr 12 21:23:20 laptop kernel: PGD 0 P4D 0
Apr 12 21:23:20 laptop kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Apr 12 21:23:20 laptop kernel: CPU: 7 PID: 1378 Comm: pipewire Tainted: G     U            5.17.1-arch1-1 #1 0ea933cb6bfe82a8dc16ab834a4bccdd297f98b7
Apr 12 21:23:20 laptop kernel: Hardware name: Dell Inc. XPS 13 9310/08607K, BIOS 3.5.1 02/25/2022
Apr 12 21:23:20 laptop kernel: RIP: 0010:dma_free_noncontiguous+0x81/0xb0
Apr 12 21:23:20 laptop kernel: Code: 10 48 8b 12 48 83 e2 fc 48 85 c0 74 0b 41 89 c8 4c 89 c9 ff d0 0f 1f 00 4c 89 e7 e8 d9 da 3c 00 4c 89 e7 41 5c e9 0f 74 1b 00 <48> 8>
Apr 12 21:23:20 laptop kernel: RSP: 0000:ffffb897819ff740 EFLAGS: 00010246
Apr 12 21:23:20 laptop kernel: RAX: 0000000000000000 RBX: 00000000fffffff4 RCX: 0000000000000000
Apr 12 21:23:20 laptop kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff96a281f7f0d0
Apr 12 21:23:20 laptop kernel: RBP: ffff96a2833b62e8 R08: 0000000000000001 R09: ffffb897819ff478
Apr 12 21:23:20 laptop kernel: R10: ffffb897819ff470 R11: ffffffffaeacab48 R12: 0000000000000000
Apr 12 21:23:20 laptop kernel: R13: 000000000007b000 R14: 0000000000000000 R15: ffff96a2833b6028
Apr 12 21:23:20 laptop kernel: FS:  00007f67136075c0(0000) GS:ffff96a5ff7c0000(0000) knlGS:0000000000000000
Apr 12 21:23:20 laptop kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 12 21:23:20 laptop kernel: CR2: 0000000000000000 CR3: 000000034b0e4001 CR4: 0000000000770ee0
Apr 12 21:23:20 laptop kernel: PKRU: 55555554
Apr 12 21:23:20 laptop kernel: Call Trace:
Apr 12 21:23:20 laptop kernel:  <TASK>
Apr 12 21:23:20 laptop kernel:  cl_stream_prepare.constprop.0.cold+0x30/0x76 [snd_sof_intel_hda_common 34d8ac6d8ab5971622d696b973730a4228a659b8]
Apr 12 21:23:20 laptop kernel:  hda_dsp_cl_boot_firmware+0x87/0x780 [snd_sof_intel_hda_common 34d8ac6d8ab5971622d696b973730a4228a659b8]
Apr 12 21:23:20 laptop kernel:  hda_dsp_cl_boot_firmware_iccmax+0x6f/0xc0 [snd_sof_intel_hda_common 34d8ac6d8ab5971622d696b973730a4228a659b8]
Apr 12 21:23:20 laptop kernel:  snd_sof_run_firmware+0xef/0x2c0 [snd_sof 417f8679ec40533c81480ea6ddb94881207c114a]
Apr 12 21:23:20 laptop kernel:  sof_resume.isra.0+0xf1/0x240 [snd_sof 417f8679ec40533c81480ea6ddb94881207c114a]
Apr 12 21:23:20 laptop kernel:  ? pci_pme_active+0xa2/0x190
Apr 12 21:23:20 laptop kernel:  pci_pm_runtime_resume+0xa7/0xc0
Apr 12 21:23:20 laptop kernel:  ? pci_pm_freeze_noirq+0x100/0x100
Apr 12 21:23:20 laptop kernel:  __rpm_callback+0x41/0x150
Apr 12 21:23:20 laptop kernel:  ? pci_pm_freeze_noirq+0x100/0x100
Apr 12 21:23:20 laptop kernel:  rpm_callback+0x59/0x70
Apr 12 21:23:20 laptop kernel:  rpm_resume+0x561/0x800
Apr 12 21:23:20 laptop kernel:  __pm_runtime_resume+0x4a/0x80
Apr 12 21:23:20 laptop kernel:  snd_soc_pcm_component_pm_runtime_get+0x2f/0xb0 [snd_soc_core 221a1f5fe4a4e159d0a73d46fabd0c8050f816a2]
Apr 12 21:23:20 laptop kernel:  __soc_pcm_open+0x54/0x540 [snd_soc_core 221a1f5fe4a4e159d0a73d46fabd0c8050f816a2]
Apr 12 21:23:20 laptop kernel:  dpcm_be_dai_startup+0x13d/0x240 [snd_soc_core 221a1f5fe4a4e159d0a73d46fabd0c8050f816a2]
Apr 12 21:23:20 laptop kernel:  dpcm_fe_dai_open+0x108/0x8a0 [snd_soc_core 221a1f5fe4a4e159d0a73d46fabd0c8050f816a2]
Apr 12 21:23:20 laptop kernel:  ? make_alloc_exact+0x99/0x110
Apr 12 21:23:20 laptop kernel:  snd_pcm_open_substream+0x54f/0x8b0 [snd_pcm dd509d9968361c996f65d1f4a91c17c72ff42958]
Apr 12 21:23:20 laptop kernel:  snd_pcm_open+0x129/0x250 [snd_pcm dd509d9968361c996f65d1f4a91c17c72ff42958]
Apr 12 21:23:20 laptop kernel:  ? dput+0xcc/0x2e0
Apr 12 21:23:20 laptop kernel:  ? wake_up_q+0x90/0x90
Apr 12 21:23:20 laptop kernel:  snd_pcm_playback_open+0x3d/0x60 [snd_pcm dd509d9968361c996f65d1f4a91c17c72ff42958]
Apr 12 21:23:20 laptop kernel:  chrdev_open+0xc8/0x260
Apr 12 21:23:20 laptop kernel:  ? cdev_device_add+0x90/0x90
Apr 12 21:23:20 laptop kernel:  do_dentry_open+0x14c/0x3b0
Apr 12 21:23:20 laptop kernel:  path_openat+0xbdd/0x11c0
Apr 12 21:23:20 laptop kernel:  ? fsnotify+0x3ce/0x660
Apr 12 21:23:20 laptop kernel:  do_filp_open+0xa5/0x150
Apr 12 21:23:20 laptop kernel:  do_sys_openat2+0xb0/0x170
Apr 12 21:23:20 laptop kernel:  __x64_sys_openat+0x53/0x90
Apr 12 21:23:20 laptop kernel:  do_syscall_64+0x59/0x80
Apr 12 21:23:20 laptop kernel:  ? do_syscall_64+0x69/0x80
Apr 12 21:23:20 laptop kernel:  ? syscall_exit_to_user_mode+0x23/0x40
Apr 12 21:23:20 laptop kernel:  ? do_syscall_64+0x69/0x80
Apr 12 21:23:20 laptop kernel:  ? do_syscall_64+0x69/0x80
Apr 12 21:23:20 laptop kernel:  ? do_syscall_64+0x69/0x80
Apr 12 21:23:20 laptop kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
Apr 12 21:23:20 laptop kernel: RIP: 0033:0x7f6713729f84
Apr 12 21:23:20 laptop kernel: Code: 24 20 eb 8f 66 90 44 89 54 24 0c e8 76 7b f8 ff 44 8b 54 24 0c 44 89 e2 48 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3>
Apr 12 21:23:20 laptop kernel: RSP: 002b:00007ffc16bd4700 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
Apr 12 21:23:20 laptop kernel: RAX: ffffffffffffffda RBX: 0000000000080802 RCX: 00007f6713729f84
Apr 12 21:23:20 laptop kernel: RDX: 0000000000080802 RSI: 00007ffc16bd48c0 RDI: 00000000ffffff9c
Apr 12 21:23:20 laptop kernel: RBP: 00007ffc16bd48c0 R08: 0000000000000000 R09: 0000000000000000
Apr 12 21:23:20 laptop kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000080802
Apr 12 21:23:20 laptop kernel: R13: 0000000000070001 R14: 00007ffc16bd48c0 R15: 0000000081204101
Apr 12 21:23:20 laptop kernel:  </TASK>
Apr 12 21:23:20 laptop kernel: Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_ad>
Apr 12 21:23:20 laptop kernel:  dell_laptop wacom ac97_bus dell_wmi usbhid snd_pcm_dmaengine ledtrig_audio dell_smbios mei_pxp dell_wmi_sysman mei_hdcp intel_ishtp_hid f>
Apr 12 21:23:20 laptop kernel:  processor_thermal_mbox int3403_thermal tps68470_regulator processor_thermal_rapl clk_tps68470 intel_rapl_common mac_hid intel_ish_ipc int>
Apr 12 21:23:20 laptop kernel: CR2: 0000000000000000
Apr 12 21:23:20 laptop kernel: ---[ end trace 0000000000000000 ]---
Apr 12 21:23:20 laptop kernel: RIP: 0010:dma_free_noncontiguous+0x81/0xb0
Apr 12 21:23:20 laptop kernel: Code: 10 48 8b 12 48 83 e2 fc 48 85 c0 74 0b 41 89 c8 4c 89 c9 ff d0 0f 1f 00 4c 89 e7 e8 d9 da 3c 00 4c 89 e7 41 5c e9 0f 74 1b 00 <48> 8>
Apr 12 21:23:20 laptop kernel: RSP: 0000:ffffb897819ff740 EFLAGS: 00010246
Apr 12 21:23:20 laptop kernel: RAX: 0000000000000000 RBX: 00000000fffffff4 RCX: 0000000000000000
Apr 12 21:23:20 laptop kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff96a281f7f0d0
Apr 12 21:23:20 laptop kernel: RBP: ffff96a2833b62e8 R08: 0000000000000001 R09: ffffb897819ff478
Apr 12 21:23:20 laptop kernel: R10: ffffb897819ff470 R11: ffffffffaeacab48 R12: 0000000000000000
Apr 12 21:23:20 laptop kernel: R13: 000000000007b000 R14: 0000000000000000 R15: ffff96a2833b6028
Apr 12 21:23:20 laptop kernel: FS:  00007f67136075c0(0000) GS:ffff96a5ff7c0000(0000) knlGS:0000000000000000
Apr 12 21:23:20 laptop kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 12 21:23:20 laptop kernel: CR2: 0000000000000000 CR3: 000000034b0e4001 CR4: 0000000000770ee0
Apr 12 21:23:20 laptop kernel: PKRU: 55555554

After the crash, all audio playback stops working until the system is restarted, and the crash then recurs within a couple of hours.
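The first line of the log reports `memory alloc failed: -12`. Kernel drivers return negated errno values, so the number can be looked up in the standard errno table. The snippet below is a minimal sketch of that lookup; the helper name `decode_alloc_failure` and the regular expression are illustrative assumptions, not part of SOF or any kernel tooling.

```python
import errno
import re

# Matches the numeric code in an SOF "memory alloc failed" kernel message.
ALLOC_FAIL = re.compile(r"memory alloc failed: (-?\d+)")


def decode_alloc_failure(line):
    """Return the symbolic errno name for an alloc-failure log line, or None.

    Hypothetical helper for illustration only.
    """
    match = ALLOC_FAIL.search(line)
    if match is None:
        return None
    # The kernel reports -errno, so strip the sign before the lookup.
    code = abs(int(match.group(1)))
    return errno.errorcode.get(code, f"unknown errno {code}")


line = "sof-audio-pci-intel-tgl 0000:00:1f.3: error: memory alloc failed: -12"
print(decode_alloc_failure(line))  # ENOMEM
```

`ENOMEM` means the allocation request could not be satisfied. The NULL pointer dereference that follows in the trace occurs in `dma_free_noncontiguous`, reached from `cl_stream_prepare`, immediately after that failure.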

@dcompoze dcompoze added the bug Something isn't working label Apr 12, 2022
@dcompoze dcompoze changed the title sof-audio-pci-intel-tgl intermittent crash on audio playback (Dell XPS 13 9310) [BUG] sof-audio-pci-intel-tgl intermittent crash on audio playback (Dell XPS 13 9310) Apr 12, 2022
@plbossart
Member

@dcompoze this is a known issue that was fixed in upstream kernel commit b7fb0ae, which was applied to all recent stable kernels as well. I double-checked this is in 5.17.2, please update your kernel.
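To check whether a given kernel release string is at or past the 5.17.2 release mentioned above, a comparison along these lines may help. It is a rough sketch: the function names are hypothetical, and the check is only meaningful for the 5.17 series and newer, because older stable series would have received the backport at different point releases that are not listed in this thread.

```python
import re


def kernel_version(release):
    """Parse the leading 'major.minor[.patch]' of a release string into a tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch level (e.g. "5.17-rc1") is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def has_fix(release, first_fixed=(5, 17, 2)):
    """Return True if the release is at or after first_fixed (assumed threshold)."""
    return kernel_version(release) >= first_fixed


print(has_fix("5.17.1-arch1-1"))  # False: the kernel from the original report
print(has_fix("5.17.4-arch1-1"))  # True
```

On a running system the release string can be obtained with `platform.release()` or `uname -r`.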

What is more concerning is why you run out of memory. We have a related bug on this, see #3530. If you see this pattern again, we'd appreciate more information to help root-cause the problem. Audio should use a minimal amount of memory; I don't see what could cause this unless we have a memory leak that was missed.
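One kind of information that could be captured when the failure recurs is the state of the page allocator's free lists in `/proc/buddyinfo`. Whether fragmentation plays any role here is not established in this thread, so this is only a sketch of how such a snapshot might be summarised; the function names are hypothetical.

```python
def parse_buddyinfo(text):
    """Parse /proc/buddyinfo text into {(node, zone): [free blocks per order]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected layout: "Node", "0,", "zone", "<name>", count0, count1, ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones


def free_pages_at_or_above(counts, order):
    """Count free pages held in blocks of at least 2**order contiguous pages."""
    return sum(n << i for i, n in enumerate(counts) if i >= order)


# Made-up sample in the /proc/buddyinfo layout, not data from this report.
sample = "Node 0, zone   Normal     10      5      2      0      1\n"
zones = parse_buddyinfo(sample)
print(free_pages_at_or_above(zones[(0, "Normal")], 2))  # 2*4 + 0*8 + 1*16 = 24
```

On an affected machine the input would come from reading `/proc/buddyinfo` shortly after the `memory alloc failed` message appears, ideally alongside `/proc/meminfo`.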

@plbossart plbossart transferred this issue from thesofproject/sof Apr 12, 2022
@plbossart plbossart added Community end-user or distro-reported issues NOT on topic/sof-dev issue that does not happen on branch topic/sof-dev labels Apr 12, 2022
@ujfalusi
Collaborator

@dcompoze, can you check whether 5.17.4 fixes the issue? A fallback implementation of a memory allocator was backported, which will hopefully solve it.

@dcompoze
Author

@ujfalusi I can confirm that kernel 5.17.4 has fixed the issue for me.

@ujfalusi
Collaborator

@dcompoze, thank you for the confirmation, can we close the issue?
