kernel oops during sof_bootloop.sh test #1466

Closed
plbossart opened this issue Nov 12, 2019 · 7 comments

@plbossart
Member

Just saw this on a SoundWire device. Could this be related to the recent ALSA/soc-core changes?

[  791.884742] ------------[ cut here ]------------
[  791.884747] WARNING: CPU: 6 PID: 740 at kernel/workqueue.c:3031 __flush_work+0x1a3/0x1c0
[  791.884747] Modules linked in: snd_soc_sdw_rt711_rt1308_rt715 snd_soc_hdac_hdmi snd_hda_codec_hdmi snd_soc_dmic snd_sof_pci(-) snd_sof_intel_hda_common snd_soc_hdac_hda soundwire_intel_init soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_hda_ext_core snd_hda_codec snd_hwdep snd_hda_core snd_sof_acpi snd_sof_intel_byt snd_soc_acpi_intel_match snd_sof_intel_bdw snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp snd_soc_acpi snd_soc_rt1011 snd_soc_rt715 snd_soc_rt1308_sdw snd_soc_rt711 snd_soc_rt700 regmap_sdw soundwire_bus snd_soc_max98090 snd_soc_max98357a snd_soc_wm8804_i2c snd_soc_wm8804 snd_soc_pcm512x_i2c snd_soc_pcm512x snd_soc_rt5682 snd_soc_rt5677_spi snd_soc_rt5670 snd_soc_rt5651 snd_soc_rt5645 snd_soc_rt5640 snd_soc_rl6231 snd_soc_rt286 snd_soc_rl6347a snd_soc_da7219 snd_soc_da7213 snd_soc_core snd_pcm snd_intel_dspcfg snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore ax88179_178a usbnet dell_laptop dell_wmi
[  791.884764]  ledtrig_audio dell_smbios wmi_bmof dell_wmi_descriptor dcdbas x86_pkg_temp_thermal intel_powerclamp i915 iwlmvm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect hid_multitouch sysimgblt mei_me fb_sys_fops iwlwifi mei drm processor_thermal_device intel_soc_dts_iosf wmi int3403_thermal int340x_thermal_zone int3400_thermal acpi_thermal_rel efivarfs xhci_pci intel_lpss_pci intel_lpss xhci_hcd mfd_core i2c_hid [last unloaded: snd_pcm]
[  791.884774] CPU: 6 PID: 740 Comm: pulseaudio Not tainted 5.4.0-rc7-test+ #63
[  791.884774] Hardware name: Dell Inc. Latitude 9510/, BIOS 0.1.3 10/01/2019
[  791.884775] RIP: 0010:__flush_work+0x1a3/0x1c0
[  791.884777] Code: ff ff 41 c6 04 24 00 fb 45 31 f6 eb 8e 8b 0b 48 8b 53 08 83 e1 08 48 0f ba 2b 03 80 c9 f0 e9 5d ff ff ff 0f 0b e9 71 ff ff ff <0f> 0b 45 31 f6 e9 67 ff ff ff e8 fe 2f fe ff 66 66 2e 0f 1f 84 00
[  791.884777] RSP: 0018:ffff8f34c0ac3d10 EFLAGS: 00010246
[  791.884778] RAX: 0000000000000000 RBX: ffff8a73ed2f4e38 RCX: 000000008040003e
[  791.884778] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8a73ed2f4e38
[  791.884778] RBP: ffff8a73e62f0000 R08: 0000000000000000 R09: 0000000000000001
[  791.884779] R10: ffff8a7419f11140 R11: 0000000000000020 R12: ffff8a74189db000
[  791.884779] R13: ffff8a74189db000 R14: 0000000000000001 R15: dead000000000100
[  791.884780] FS:  00007f7b6a537ec0(0000) GS:ffff8a741c180000(0000) knlGS:0000000000000000
[  791.884780] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  791.884781] CR2: 00007f25e326ce90 CR3: 000000047f9c8003 CR4: 00000000003606e0
[  791.884781] Call Trace:
[  791.884785]  ? try_to_del_timer_sync+0x4a/0x80
[  791.884790]  soc_pcm_private_free+0x17/0x20 [snd_soc_core]
[  791.884793]  snd_pcm_free+0x1a/0x50 [snd_pcm]
[  791.884796]  __snd_device_free+0x46/0x60 [snd]
[  791.884798]  snd_device_free_all+0x3a/0x70 [snd]
[  791.884799]  release_card_device+0x14/0x50 [snd]
[  791.884802]  device_release+0x23/0x80
[  791.884804]  kobject_put+0x84/0x1a0
[  791.884806]  snd_card_file_remove+0x108/0x120 [snd]
[  791.884807]  snd_ctl_release+0x103/0x110 [snd]
[  791.884810]  __fput+0xb4/0x240
[  791.884811]  task_work_run+0x7c/0xa0
[  791.884813]  exit_to_usermode_loop+0x98/0xa0
[  791.884814]  do_syscall_64+0xea/0x110
[  791.884816]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  791.884817] RIP: 0033:0x7f7b6b3043ab
[  791.884818] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 fb ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 89 44 24 0c e8 21 fc ff ff 8b 44
[  791.884818] RSP: 002b:00007ffde14583f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  791.884819] RAX: 0000000000000000 RBX: 00005644cbb166f0 RCX: 00007f7b6b3043ab
[  791.884819] RDX: 00007f7b6b189be0 RSI: 0000000000000000 RDI: 000000000000001d
[  791.884820] RBP: 00005644cbb14220 R08: 0000000000000000 R09: 00007ffde1458398
[  791.884820] R10: 00007ffde1458390 R11: 0000000000000293 R12: 0000000000000000
[  791.884821] R13: 00005644cba6cb60 R14: 00005644cbb02ec8 R15: 00005644cb812ea0
[  791.884821] ---[ end trace ce2b13b0055079cf ]---
[  791.884825] BUG: unable to handle page fault for address: fffffffffffffff8
[  791.884827] #PF: supervisor read access in kernel mode
[  791.884828] #PF: error_code(0x0000) - not-present page
[  791.884828] PGD 1e520c067 P4D 1e520c067 PUD 1e520e067 PMD 0 
[  791.884830] Oops: 0000 [#1] SMP NOPTI
[  791.884831] CPU: 6 PID: 740 Comm: pulseaudio Tainted: G        W         5.4.0-rc7-test+ #63
[  791.884832] Hardware name: Dell Inc. Latitude 9510/, BIOS 0.1.3 10/01/2019
[  791.884834] RIP: 0010:snd_soc_pcm_component_free+0x49/0x60 [snd_soc_core]
[  791.884835] Code: f8 48 39 c5 75 24 eb 2a 48 8b 47 60 48 8b 40 70 48 85 c0 74 08 4c 89 e6 e8 84 5b 42 d7 48 8b 43 08 48 8d 58 f8 48 39 c5 74 08 <48> 8b 3b 48 85 ff 75 d6 5b 5d 41 5c c3 66 2e 0f 1f 84 00 00 00 00
[  791.884836] RSP: 0018:ffff8f34c0ac3d80 EFLAGS: 00010282
[  791.884837] RAX: 0000000000000000 RBX: fffffffffffffff8 RCX: 000000008040003e
[  791.884837] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8a73e62f0000
[  791.884838] RBP: ffff8a73ed2f4eb0 R08: 0000000000000000 R09: 0000000000000001
[  791.884838] R10: ffff8a7419f11140 R11: 0000000000000020 R12: ffff8a73e62f0000
[  791.884839] R13: ffff8a74189db000 R14: dead000000000122 R15: dead000000000100
[  791.884840] FS:  00007f7b6a537ec0(0000) GS:ffff8a741c180000(0000) knlGS:0000000000000000
[  791.884840] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  791.884841] CR2: fffffffffffffff8 CR3: 000000047f9c8003 CR4: 00000000003606e0
[  791.884842] Call Trace:
[  791.884843]  snd_pcm_free+0x1a/0x50 [snd_pcm]
[  791.884845]  __snd_device_free+0x46/0x60 [snd]
[  791.884847]  snd_device_free_all+0x3a/0x70 [snd]
[  791.884848]  release_card_device+0x14/0x50 [snd]
[  791.884849]  device_release+0x23/0x80
[  791.884851]  kobject_put+0x84/0x1a0
[  791.884852]  snd_card_file_remove+0x108/0x120 [snd]
[  791.884854]  snd_ctl_release+0x103/0x110 [snd]
[  791.884855]  __fput+0xb4/0x240
[  791.884856]  task_work_run+0x7c/0xa0
[  791.884857]  exit_to_usermode_loop+0x98/0xa0
[  791.884858]  do_syscall_64+0xea/0x110
[  791.884859]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  791.884860] RIP: 0033:0x7f7b6b3043ab
[  791.884861] Code: 03 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 fb ff ff 8b 7c 24 0c 41 89 c0 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 89 44 24 0c e8 21 fc ff ff 8b 44
[  791.884861] RSP: 002b:00007ffde14583f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[  791.884862] RAX: 0000000000000000 RBX: 00005644cbb166f0 RCX: 00007f7b6b3043ab
[  791.884863] RDX: 00007f7b6b189be0 RSI: 0000000000000000 RDI: 000000000000001d
[  791.884863] RBP: 00005644cbb14220 R08: 0000000000000000 R09: 00007ffde1458398
[  791.884864] R10: 00007ffde1458390 R11: 0000000000000293 R12: 0000000000000000
[  791.884864] R13: 00005644cba6cb60 R14: 00005644cbb02ec8 R15: 00005644cb812ea0
[  791.884865] Modules linked in: snd_soc_sdw_rt711_rt1308_rt715 snd_soc_hdac_hdmi snd_hda_codec_hdmi snd_soc_dmic snd_sof_pci(-) snd_sof_intel_hda_common snd_soc_hdac_hda soundwire_intel_init soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_hda_ext_core snd_hda_codec snd_hwdep snd_hda_core snd_sof_acpi snd_sof_intel_byt snd_soc_acpi_intel_match snd_sof_intel_bdw snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp snd_soc_acpi snd_soc_rt1011 snd_soc_rt715 snd_soc_rt1308_sdw snd_soc_rt711 snd_soc_rt700 regmap_sdw soundwire_bus snd_soc_max98090 snd_soc_max98357a snd_soc_wm8804_i2c snd_soc_wm8804 snd_soc_pcm512x_i2c snd_soc_pcm512x snd_soc_rt5682 snd_soc_rt5677_spi snd_soc_rt5670 snd_soc_rt5651 snd_soc_rt5645 snd_soc_rt5640 snd_soc_rl6231 snd_soc_rt286 snd_soc_rl6347a snd_soc_da7219 snd_soc_da7213 snd_soc_core snd_pcm snd_intel_dspcfg snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device snd_timer snd soundcore ax88179_178a usbnet dell_laptop dell_wmi
[  791.884872]  ledtrig_audio dell_smbios wmi_bmof dell_wmi_descriptor dcdbas x86_pkg_temp_thermal intel_powerclamp i915 iwlmvm i2c_algo_bit drm_kms_helper syscopyarea sysfillrect hid_multitouch sysimgblt mei_me fb_sys_fops iwlwifi mei drm processor_thermal_device intel_soc_dts_iosf wmi int3403_thermal int340x_thermal_zone int3400_thermal acpi_thermal_rel efivarfs xhci_pci intel_lpss_pci intel_lpss xhci_hcd mfd_core i2c_hid [last unloaded: snd_pcm]
[  791.884878] CR2: fffffffffffffff8
[  791.884879] ---[ end trace ce2b13b0055079d0 ]---
[  791.884881] RIP: 0010:snd_soc_pcm_component_free+0x49/0x60 [snd_soc_core]
[  791.884882] Code: f8 48 39 c5 75 24 eb 2a 48 8b 47 60 48 8b 40 70 48 85 c0 74 08 4c 89 e6 e8 84 5b 42 d7 48 8b 43 08 48 8d 58 f8 48 39 c5 74 08 <48> 8b 3b 48 85 ff 75 d6 5b 5d 41 5c c3 66 2e 0f 1f 84 00 00 00 00
[  791.884883] RSP: 0018:ffff8f34c0ac3d80 EFLAGS: 00010282
[  791.884884] RAX: 0000000000000000 RBX: fffffffffffffff8 RCX: 000000008040003e
[  791.884884] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8a73e62f0000
[  791.884885] RBP: ffff8a73ed2f4eb0 R08: 0000000000000000 R09: 0000000000000001
[  791.884885] R10: ffff8a7419f11140 R11: 0000000000000020 R12: ffff8a73e62f0000
[  791.884886] R13: ffff8a74189db000 R14: dead000000000122 R15: dead000000000100
[  791.884887] FS:  00007f7b6a537ec0(0000) GS:ffff8a741c180000(0000) knlGS:0000000000000000
[  791.884887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  791.884888] CR2: fffffffffffffff8 CR3: 000000047f9c8003 CR4: 00000000003606e0
@plbossart
Member Author

This seems to happen during a pm_runtime resume operation:

[ 1014.440467] sof-audio-pci 0000:00:1f.3: snd_sof_runtime_resume
[ 1014.440472] sof-audio-pci 0000:00:1f.3: sof_resume

[ 1014.443499] sof-audio-pci 0000:00:1f.3: loading firmware

[ 1014.443511] sof-audio-pci 0000:00:1f.3: booting DSP firmware

[ 1014.869378] sof-audio-pci 0000:00:1f.3: Firmware download successful, booting...

[ 1014.876224] sof-audio-pci 0000:00:1f.3: firmware boot complete
[ 1014.877244] sof-audio-pci 0000:00:1f.3: sof_restore_pipelines

[ 1014.886163] sof-audio-pci 0000:00:1f.3: sof_restore_kcontrols

[ 1014.887437] sof-audio-pci 0000:00:1f.3: sof_send_pm_ctx_ipc
[ 1014.887441] sof-audio-pci 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1014.887498] sof-audio-pci 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1014.887506] sof-audio-pci 0000:00:1f.3: snd_sof_runtime_idle
[ 1014.888037] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x20140000 successful
[ 1014.888104] rt711 sdw:0:25d:711:0: in rt711_jack_init disable
[ 1014.888198] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x1010f0f successful
[ 1014.888202] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x1000f0f successful
[ 1014.888204] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1014.888213] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888217] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888219] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1014.888225] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888230] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888233] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1014.888241] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888247] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888249] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1014.888255] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888259] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888261] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1014.888267] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888272] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888275] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1014.888281] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888285] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0xf0f successful
[ 1014.888288] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask 1
[ 1015.047293] general protection fault: 0000 [#1] SMP NOPTI
[ 1015.047296] CPU: 10 PID: 761 Comm: pulseaudio Not tainted 5.4.0-rc7-test+ #63
[ 1015.047297] Hardware name: Dell Inc. Latitude 9510/, BIOS 0.1.3 10/01/2019
[ 1015.047303] RIP: 0010:snd_soc_pcm_component_free+0x27/0x60 [snd_soc_core]

@plbossart
Member Author

Same problem on an HDAudio+DMIC machine... Gah.

@ranj063
Collaborator

ranj063 commented Nov 12, 2019

@plbossart this is a long shot, but could you please check whether this helps?

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 55014e7ae0d8..317fc0e0d54c 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -311,7 +311,6 @@ static int snd_soc_rtdcom_add(struct snd_soc_pcm_runtime *rtd,
                return -ENOMEM;
 
        rtdcom->component = component;
-       INIT_LIST_HEAD(&rtdcom->list);
 
        /*
         * When rtd was freed, created rtdcom here will be
@@ -430,6 +429,8 @@ static void soc_free_pcm_runtime(struct snd_soc_pcm_runtime *rtd)
         *      soc_new_pcm_runtime()
         */
        device_unregister(rtd->dev);
+
+       INIT_LIST_HEAD(&rtd->component_list);
 }
 
 static struct snd_soc_pcm_runtime *soc_new_pcm_runtime(

@plbossart
Member Author

The error is "root-caused" to "ASoC: soc-core: add soc_unbind_dai_link()".
I spent quite a bit of time bisecting, but I don't understand that change.

The code moved the pcm runtime cleanup from soc_cleanup_card_resources() to snd_soc_remove_dai_link(), but the latter is called both from soc_cleanup_card_resources() and from topology.

I have no idea why things are done this way...

@ranj063
Collaborator

ranj063 commented Nov 12, 2019

> The error is "root-caused" to "ASoC: soc-core: add soc_unbind_dai_link()".
> I spent quite a bit of time bisecting, but I don't understand that change.
>
> The code moved the pcm runtime cleanup from soc_cleanup_card_resources() to snd_soc_remove_dai_link(), but the latter is called both from soc_cleanup_card_resources() and from topology.
>
> I have no idea why things are done this way...

@plbossart I'm guessing my change didn't help then. Anyway, my theory is this: when the topology is removed, snd_soc_remove_dai_link() is called and unregisters rtd->dev. Later, when pcm->private_free() gets called, it runs into errors because the rtd device has already been unregistered and the component has been removed too.

@Liviali155

Error not seen on:
CML Chrome with onboard codec RT5682 in I2S mode
ICL RVP with onboard codec ALC700 in HDA mode
GLK Chrome with onboard codec DA7219 in I2S mode
CFL-S RVP with onboard codec ALC700 in HDA mode
CML Mantis with onboard codec ALC3204 in HDA mode
CML Helios with onboard codec RT5682 in I2S mode
APL UP2 nocodec
APL UP2 with codec PCM512x in I2S mode
BYT MB nocodec
BYT MB with codec RT5682 in I2S mode

Environment
Firmware: 9b5dc8c https://github.com/thesofproject/sof/commits/master
Kernel: bc30957 https://github.com/thesofproject/linux/commits/topic/sof-dev

@Liviali155

closed

@mengdonglin mengdonglin added the bug Something isn't working label Nov 25, 2019