[BUG]Hot-Plug in HDMI cause kernel NULL pointer dereference address: 0000000000000010 #1536

Closed
dengyangchao opened this issue Nov 22, 2019 · 14 comments
Assignees
Labels
APL Applies to ApolloLake platform bug Something isn't working CFL Applies to Coffee Lake platform CML Applies to Comet Lake platform P1 Blocker bugs or important features

Comments

@dengyangchao

dengyangchao commented Nov 22, 2019

Describe the bug
1. Hot-plugging HDMI causes a kernel NULL pointer dereference; the screen turns black and the system appears to freeze.
2. Hot-plugging and unplugging DP works fine, and the plug/unplug status is detected.
3. Power off -> plug in HDMI -> power on: the system boots normally and shows the HDMI display, but amixer contents does not detect the HDMI connection (see below). Unplugging HDMI afterwards also reproduces the issue.

numid=15,iface=CARD,name='HDMI/DP,pcm=3 Jack'
  ; type=BOOLEAN,access=r-------,values=1
  : values=off
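
For anyone scripting this check, here is a minimal sketch that parses `amixer contents` output in the format shown above and reports each jack control as on or off. The function name `parse_jack_states` and the `SAMPLE` text are illustrative only; the sketch assumes jack controls always use `iface=CARD` and a name ending in `Jack`, as in the output pasted in this report.

```python
import re

# Header line of a jack control, e.g.
# numid=15,iface=CARD,name='HDMI/DP,pcm=3 Jack'
JACK_HEADER = re.compile(r"numid=\d+,iface=CARD,name='([^']*Jack)'")
# Value line of a boolean control, e.g. "  : values=off"
VALUE_LINE = re.compile(r":\s*values=(on|off)")


def parse_jack_states(amixer_contents: str) -> dict:
    """Map each jack control name to True (on) or False (off)."""
    states = {}
    current = None
    for raw in amixer_contents.splitlines():
        line = raw.strip()
        if line.startswith("numid="):
            # A new control starts; remember it only if it is a jack control.
            header = JACK_HEADER.match(line)
            current = header.group(1) if header else None
            continue
        value = VALUE_LINE.match(line)
        if value and current is not None:
            states[current] = value.group(1) == "on"
            current = None
    return states


SAMPLE = """\
numid=15,iface=CARD,name='HDMI/DP,pcm=3 Jack'
  ; type=BOOLEAN,access=r-------,values=1
  : values=off
"""

print(parse_jack_states(SAMPLE))
```

In a test script the input would come from running amixer on the target; the parser itself does not depend on any audio hardware.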

To Reproduce
1. sudo reboot
2. Wait for the runtime PM status to change to suspended
3. Plug in HDMI
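
Step 2 can be automated. The sketch below polls the standard sysfs `power/runtime_status` attribute of a PCI device until it reads `suspended`. The helper names and the `sysfs_root` parameter are hypothetical (the parameter exists only so the code can be exercised against a fake directory tree); the address `0000:00:1f.3` in the usage note is the audio device that appears in the dmesg of this report.

```python
import time
from pathlib import Path


def runtime_pm_status(pci_address: str, sysfs_root: str = "/sys") -> str:
    """Return the runtime PM status string of a PCI device, e.g. 'suspended'."""
    path = (Path(sysfs_root) / "bus" / "pci" / "devices"
            / pci_address / "power" / "runtime_status")
    return path.read_text().strip()


def wait_for_suspend(pci_address: str, timeout_s: float = 30.0,
                     poll_s: float = 0.5, sysfs_root: str = "/sys") -> bool:
    """Poll until the device reports 'suspended'; False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if runtime_pm_status(pci_address, sysfs_root) == "suspended":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

On the affected machine this would be called as `wait_for_suspend("0000:00:1f.3")` before plugging in the HDMI cable.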

Reproduction Rate
5/5

Expected behavior
HDMI is detected normally and works fine.

Impact
Causes a kernel NULL pointer dereference and a system freeze.

Environment
Platform: CML Chrome with onboard codec RT5682 in I2S mode
Firmware: 9b5dc8c https://github.com/thesofproject/sof/commits/master
Kernel: bc30957 https://github.com/thesofproject/linux/commits/topic/sof-dev
Topology: file: tools/topology/sof-cml-rt5682-max98357a.tplg same as firmware

[   46.036089] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x1000f0f successful
[   46.036091] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask f
[   46.036118] sof-audio-pci 0000:00:1f.3: Debug PCIR: 00000010 at  00000044
[  130.894290] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  130.894298] #PF: supervisor read access in kernel mode
[  130.894302] #PF: error_code(0x0000) - not-present page
[  130.894305] PGD 0 P4D 0
[  130.894312] Oops: 0000 [#1] SMP NOPTI
[  130.894319] CPU: 2 PID: 832 Comm: Xorg Tainted: G        W         5.4.0-rc8-daily-default-20191122-1 #bc30957c
[  130.894322] Hardware name: Google Hatch/Hatch, BIOS  07/05/2019
[  130.894335] RIP: 0010:snd_ctl_notify+0x97/0x1d0 [snd]
[  130.894341] Code: 0f 84 0b 01 00 00 41 8b 47 50 85 c0 74 ec 4d 8d 6f 40 4c 89 ef e8 99 d4 f9 fa 49 89 c6 49 8b 47 58 49 8d 4f 58 48 39 c1 74 1f <41> 8b 14 24 39 50 10 75 0e e9 f5 00 00 00 39 50 10 0f 84 ec 00 00
[  130.894345] RSP: 0018:ffffa3bc4088f808 EFLAGS: 00010002
[  130.894350] RAX: ffff9fcefb529f60 RBX: ffff9fcf1294b000 RCX: ffff9fcf08a60858
[  130.894354] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff9fcf08a60840
[  130.894357] RBP: ffff9fcf1294b4d0 R08: ffff9fcf12b54800 R09: ffff9fcf1335a028
[  130.894360] R10: 0000000000000000 R11: ffffa3bc4088f759 R12: 0000000000000010
[  130.894363] R13: ffff9fcf08a60840 R14: 0000000000000202 R15: ffff9fcf08a60800
[  130.894368] FS:  00007fb25f6f0a80(0000) GS:ffff9fcf16280000(0000) knlGS:0000000000000000
[  130.894372] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.894375] CR2: 0000000000000010 CR3: 00000002513d4004 CR4: 00000000003606e0
[  130.894378] Call Trace:
[  130.894390]  ? _cond_resched+0x10/0x40
[  130.894399]  update_eld+0x223/0x590 [snd_hda_codec_hdmi]
[  130.894460]  ? i915_audio_component_get_eld+0x66/0x140 [i915]
[  130.894467]  hdmi_present_sense+0x21a/0x3b0 [snd_hda_codec_hdmi]
[  130.894476]  check_presence_and_report+0x7b/0xc0 [snd_hda_codec_hdmi]
[  130.894526]  intel_audio_codec_enable+0x11c/0x180 [i915]
[  130.894579]  intel_encoders_enable.isra.126+0x61/0x90 [i915]
[  130.894630]  haswell_crtc_enable+0x236/0x750 [i915]
[  130.894640]  ? finish_wait+0x2a/0x60
[  130.894691]  intel_update_crtc+0x52/0x3c0 [i915]
[  130.894740]  skl_update_crtcs+0x267/0x2b0 [i915]
[  130.894787]  intel_atomic_commit_tail+0x225/0x14c0 [i915]
[  130.894795]  ? flush_workqueue+0x193/0x3c0
[  130.894841]  intel_atomic_commit+0x238/0x2c0 [i915]
[  130.894856]  drm_atomic_helper_set_config+0x72/0x80 [drm_kms_helper]
[  130.894878]  drm_mode_setcrtc+0x176/0x6e0 [drm]
[  130.894900]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[  130.894915]  drm_ioctl_kernel+0xa7/0xf0 [drm]
[  130.894932]  drm_ioctl+0x2e1/0x390 [drm]
[  130.894952]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[  130.894962]  do_vfs_ioctl+0x9f/0x620
[  130.894970]  ksys_ioctl+0x6b/0x80
[  130.894976]  __x64_sys_ioctl+0x11/0x20
[  130.894982]  do_syscall_64+0x43/0x120
[  130.894989]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  130.894994] RIP: 0033:0x7fb25c8d15d7
[  130.895000] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
[  130.895004] RSP: 002b:00007ffd560b26b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  130.895010] RAX: ffffffffffffffda RBX: 00007ffd560b26f0 RCX: 00007fb25c8d15d7
[  130.895014] RDX: 00007ffd560b26f0 RSI: 00000000c06864a2 RDI: 000000000000000c
[  130.895017] RBP: 00007ffd560b26f0 R08: 0000000000000000 R09: 000055fa66c87180
[  130.895020] R10: 00007ffd560b27b0 R11: 0000000000000246 R12: 00000000c06864a2
[  130.895023] R13: 000000000000000c R14: 000055fa664bb010 R15: 00007ffd560b27b0
[  130.895028] Modules linked in: asix usbnet hid_multitouch snd_soc_sof_rt5682 snd_soc_hdac_hdmi snd_hda_codec_hdmi snd_soc_dmic snd_sof_pci snd_sof_intel_hda_common snd_soc_hdac_hda snd_intel_dspcfg snd_sof_intel_hda snd_sof_intel_byt snd_soc_acpi_intel_match snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp snd_soc_acpi snd_hda_ext_core snd_hda_codec snd_soc_max98357a snd_hwdep snd_hda_core snd_seq_midi snd_seq_midi_event snd_soc_rt5682 x86_pkg_temp_thermal snd_soc_rl6231 intel_powerclamp snd_rawmidi snd_soc_core snd_pcm i915 snd_seq elan_i2c i2c_algo_bit int3403_thermal snd_seq_device snd_timer drm_kms_helper processor_thermal_device mei_me syscopyarea int340x_thermal_zone int3400_thermal sysfillrect mei acpi_thermal_rel sysimgblt intel_soc_dts_iosf snd fb_sys_fops drm soundcore intel_lpss_pci intel_lpss mfd_core efivarfs sdhci_pci xhci_pci cqhci xhci_hcd sdhci i2c_hid
[  130.895090] CR2: 0000000000000010
[  130.895094] ---[ end trace b486920983c99934 ]---
[  130.895102] RIP: 0010:snd_ctl_notify+0x97/0x1d0 [snd]
[  130.895107] Code: 0f 84 0b 01 00 00 41 8b 47 50 85 c0 74 ec 4d 8d 6f 40 4c 89 ef e8 99 d4 f9 fa 49 89 c6 49 8b 47 58 49 8d 4f 58 48 39 c1 74 1f <41> 8b 14 24 39 50 10 75 0e e9 f5 00 00 00 39 50 10 0f 84 ec 00 00
[  130.895110] RSP: 0018:ffffa3bc4088f808 EFLAGS: 00010002
[  130.895114] RAX: ffff9fcefb529f60 RBX: ffff9fcf1294b000 RCX: ffff9fcf08a60858
[  130.895116] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff9fcf08a60840
[  130.895119] RBP: ffff9fcf1294b4d0 R08: ffff9fcf12b54800 R09: ffff9fcf1335a028
[  130.895122] R10: 0000000000000000 R11: ffffa3bc4088f759 R12: 0000000000000010
[  130.895125] R13: ffff9fcf08a60840 R14: 0000000000000202 R15: ffff9fcf08a60800
[  130.895128] FS:  00007fb25f6f0a80(0000) GS:ffff9fcf16280000(0000) knlGS:0000000000000000
[  130.895131] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.895134] CR2: 0000000000000010 CR3: 00000002513d4004 CR4: 00000000003606e0

dmesg.log

amixer.txt
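
When comparing several oops logs like the one above, it can help to reduce each to the faulting address, the kernel RIP symbol and the confirmed call-trace entries. The following is a rough sketch of such a summarizer; `summarize_oops` is a hypothetical helper, and the regular expressions only cover the x86-64 line formats visible in this report. Entries prefixed with `?` are skipped because the kernel marks them as unreliable stack entries.

```python
import re

TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")
NULL_DEREF = re.compile(r"BUG: kernel NULL pointer dereference, address: ([0-9a-f]+)")
KERNEL_RIP = re.compile(r"RIP: 0010:(\S+)")
FRAME = re.compile(r"(\? )?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(dmesg: str) -> dict:
    """Extract fault address, kernel RIP symbol and reliable call-trace frames."""
    summary = {"address": None, "rip": None, "call_trace": []}
    in_trace = False
    for raw in dmesg.splitlines():
        line = TIMESTAMP.sub("", raw).strip()
        m = NULL_DEREF.match(line)
        if m:
            summary["address"] = int(m.group(1), 16)
            continue
        m = KERNEL_RIP.match(line)
        if m and summary["rip"] is None:
            summary["rip"] = m.group(1)
            continue
        if line == "Call Trace:":
            in_trace = True
            continue
        if in_trace:
            m = FRAME.match(line)
            if m is None:
                in_trace = False          # first non-frame line ends the trace
            elif not m.group(1):
                summary["call_trace"].append(m.group(2))
    return summary


SAMPLE_OOPS = """\
[  130.894290] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  130.894335] RIP: 0010:snd_ctl_notify+0x97/0x1d0 [snd]
[  130.894378] Call Trace:
[  130.894390]  ? _cond_resched+0x10/0x40
[  130.894399]  update_eld+0x223/0x590 [snd_hda_codec_hdmi]
[  130.894460]  ? i915_audio_component_get_eld+0x66/0x140 [i915]
[  130.894467]  hdmi_present_sense+0x21a/0x3b0 [snd_hda_codec_hdmi]
[  130.894476]  check_presence_and_report+0x7b/0xc0 [snd_hda_codec_hdmi]
[  130.894994] RIP: 0033:0x7fb25c8d15d7
"""

print(summarize_oops(SAMPLE_OOPS))
```

A small non-zero fault address such as 0x10 is what one would typically expect when code takes the address of a member inside a structure whose pointer is NULL, although the log alone does not prove which pointer that was.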

@dengyangchao dengyangchao added bug Something isn't working CML Applies to Comet Lake platform labels Nov 22, 2019
@dengyangchao
Author

dengyangchao commented Nov 22, 2019

The issue also reproduces with DP-MST (two DP monitors connected through a DP hub).
Platform: CML Chrome with onboard codec RT5682 in I2S mode

Dmesg
[ 1246.106813] asix 1-3.1:1.0 eth0: register 'asix' at usb-0000:00:14.0-3.1, ASIX AX88772B USB 2.0 Ethernet, 00:0e:c6:bc:48:cf
[ 1246.171786] usb 1-3.2: new low-speed USB device number 12 using xhci_hcd
[ 1246.256948] usb 1-3.2: New USB device found, idVendor=413c, idProduct=2107, bcdDevice= 1.15
[ 1246.256950] usb 1-3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1246.256951] usb 1-3.2: Product: Dell USB Entry Keyboard
[ 1246.256952] usb 1-3.2: Manufacturer: Dell
[ 1246.263406] input: Dell Dell USB Entry Keyboard as /devices/pci0000:00/0000:00:14.0/usb1/1-3/1-3.2/1-3.2:1.0/0003:413C:2107.0005/input/input17
[ 1246.314867] hid-generic 0003:413C:2107.0005: input,hidraw2: USB HID v1.10 Keyboard [Dell Dell USB Entry Keyboard] on usb-0000:00:14.0-3.2/input0
[ 1246.325882] asix 1-3.1:1.0 enx000ec6bc48cf: renamed from eth0
[ 1247.609190] IPv6: ADDRCONF(NETDEV_CHANGE): enx000ec6bc48cf: link becomes ready
[ 1247.612067] asix 1-3.1:1.0 enx000ec6bc48cf: link up, 100Mbps, full-duplex, lpa 0x43E1
[ 1280.506007] usb 1-2: new full-speed USB device number 13 using xhci_hcd
[ 1280.634103] usb 1-2: New USB device found, idVendor=06c4, idProduct=c401, bcdDevice= 0.04
[ 1280.634110] usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1280.634114] usb 1-2: Product: DisplayPort ALT mode device
[ 1280.634118] usb 1-2: Manufacturer: BizLink Technology Inc.
[ 1310.637384] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 1310.637391] #PF: supervisor read access in kernel mode
[ 1310.637395] #PF: error_code(0x0000) - not-present page
[ 1310.637398] PGD 0 P4D 0
[ 1310.637406] Oops: 0000 [#1] SMP NOPTI
[ 1310.637413] CPU: 4 PID: 834 Comm: Xorg Tainted: G        W         5.4.0-rc8-daily-default-20191122-1 #bc30957c
[ 1310.637416] Hardware name: Google Hatch/Hatch, BIOS  07/05/2019
[ 1310.637429] RIP: 0010:snd_ctl_notify+0x97/0x1d0 [snd]
[ 1310.637435] Code: 0f 84 0b 01 00 00 41 8b 47 50 85 c0 74 ec 4d 8d 6f 40 4c 89 ef e8 99 44 91 c8 49 89 c6 49 8b 47 58 49 8d 4f 58 48 39 c1 74 1f <414 39 50 10 75 0e e9 f5 00 00 00 39 50 10 0f 84 ec 00 00
[ 1310.637439] RSP: 0018:ffff8bcf0088b808 EFLAGS: 00010002
[ 1310.637444] RAX: ffff8a8447947ae0 RBX: ffff8a8452f6b000 RCX: ffff8a84528388d8
[ 1310.637447] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8a84528388c0
[ 1310.637451] RBP: ffff8a8452f6b4d0 R08: ffff8a8451137000 R09: ffff8a8451cc9828
[ 1310.637454] R10: 0000000000000000 R11: ffff8bcf0088b759 R12: 0000000000000010
[ 1310.637457] R13: ffff8a84528388c0 R14: 0000000000000202 R15: ffff8a8452838880
[ 1310.637462] FS:  00007fcf16316a80(0000) GS:ffff8a8456300000(0000) knlGS:0000000000000000
[ 1310.637466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1310.637469] CR2: 0000000000000010 CR3: 0000000247a0e001 CR4: 00000000003606e0
[ 1310.637472] Call Trace:
[ 1310.637483]  ? _cond_resched+0x10/0x40
[ 1310.637494]  update_eld+0x223/0x590 [snd_hda_codec_hdmi]
[ 1310.637555]  ? i915_audio_component_get_eld+0x66/0x140 [i915]
[ 1310.637563]  hdmi_present_sense+0x21a/0x3b0 [snd_hda_codec_hdmi]
[ 1310.637571]  check_presence_and_report+0x7b/0xc0 [snd_hda_codec_hdmi]
[ 1310.637623]  intel_audio_codec_enable+0x11c/0x180 [i915]
[ 1310.637675]  intel_encoders_enable.isra.126+0x61/0x90 [i915]
[ 1310.637726]  haswell_crtc_enable+0x236/0x750 [i915]
[ 1310.637736]  ? finish_wait+0x2a/0x60
[ 1310.637789]  intel_update_crtc+0x52/0x3c0 [i915]
[ 1310.637837]  skl_update_crtcs+0x267/0x2b0 [i915]
[ 1310.637885]  intel_atomic_commit_tail+0x225/0x14c0 [i915]
[ 1310.637893]  ? flush_workqueue+0x193/0x3c0
[ 1310.637939]  intel_atomic_commit+0x238/0x2c0 [i915]
[ 1310.637955]  drm_atomic_helper_set_config+0x72/0x80 [drm_kms_helper]
[ 1310.637977]  drm_mode_setcrtc+0x176/0x6e0 [drm]
[ 1310.637998]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 1310.638013]  drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 1310.638029]  drm_ioctl+0x2e1/0x390 [drm]
[ 1310.638046]  ? drm_mode_getcrtc+0x180/0x180 [drm]
[ 1310.638055]  do_vfs_ioctl+0x9f/0x620
[ 1310.638063]  ksys_ioctl+0x6b/0x80
[ 1310.638070]  __x64_sys_ioctl+0x11/0x20
[ 1310.638075]  do_syscall_64+0x43/0x120
[ 1310.638082]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1310.638088] RIP: 0033:0x7fcf134f75d7
[ 1310.638094] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <480 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
[ 1310.638098] RSP: 002b:00007ffc9d104028 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1310.638104] RAX: ffffffffffffffda RBX: 00007ffc9d104060 RCX: 00007fcf134f75d7
[ 1310.638107] RDX: 00007ffc9d104060 RSI: 00000000c06864a2 RDI: 000000000000000c
[ 1310.638110] RBP: 00007ffc9d104060 R08: 0000000000000000 R09: 0000558ecdb25d00
[ 1310.638114] R10: 00007ffc9d104120 R11: 0000000000000246 R12: 00000000c06864a2
[ 1310.638117] R13: 000000000000000c R14: 0000558ecd97f230 R15: 00007ffc9d104120
[ 1310.638122] Modules linked in: asix usbnet hid_multitouch snd_soc_sof_rt5682 snd_hda_codec_hdmi snd_soc_hdac_hdmi snd_soc_dmic snd_sof_pci snd_sof__common snd_soc_hdac_hda snd_intel_dspcfg snd_sof_intel_hda snd_sof_intel_byt snd_soc_acpi_intel_match snd_sof_intel_ipc snd_sof snd_sof_xtensa_dsp sn682 snd_soc_acpi snd_soc_max98357a snd_hda_ext_core snd_soc_rl6231 snd_soc_core snd_seq_midi snd_hda_codec i915 snd_seq_midi_event x86_pkg_temp_thermaep intel_powerclamp snd_rawmidi snd_hda_core i2c_algo_bit snd_pcm snd_seq drm_kms_helper syscopyarea sysfillrect sysimgblt mei_me fb_sys_fops mei snd_e intel_lpss_pci drm snd_timer intel_lpss processor_thermal_device mfd_core intel_soc_dts_iosf snd elan_i2c int3403_thermal int340x_thermal_zone sound400_thermal acpi_thermal_rel efivarfs sdhci_pci xhci_pci cqhci sdhci xhci_hcd i2c_hid
[ 1310.638182] CR2: 0000000000000010
[ 1310.638187] ---[ end trace 1cb30c6cf2bd7eb4 ]---
[ 1310.638196] RIP: 0010:snd_ctl_notify+0x97/0x1d0 [snd]
[ 1310.638201] Code: 0f 84 0b 01 00 00 41 8b 47 50 85 c0 74 ec 4d 8d 6f 40 4c 89 ef e8 99 44 91 c8 49 89 c6 49 8b 47 58 49 8d 4f 58 48 39 c1 74 1f <414 39 50 10 75 0e e9 f5 00 00 00 39 50 10 0f 84 ec 00 00
[ 1310.638204] RSP: 0018:ffff8bcf0088b808 EFLAGS: 00010002
[ 1310.638207] RAX: ffff8a8447947ae0 RBX: ffff8a8452f6b000 RCX: ffff8a84528388d8
[ 1310.638210] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffff8a84528388c0
[ 1310.638213] RBP: ffff8a8452f6b4d0 R08: ffff8a8451137000 R09: ffff8a8451cc9828
[ 1310.638215] R10: 0000000000000000 R11: ffff8bcf0088b759 R12: 0000000000000010
[ 1310.638218] R13: ffff8a84528388c0 R14: 0000000000000202 R15: ffff8a8452838880
[ 1310.638222] FS:  00007fcf16316a80(0000) GS:ffff8a8456300000(0000) knlGS:0000000000000000
[ 1310.638225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1310.638228] CR2: 0000000000000010 CR3: 0000000247a0e001 CR4: 00000000003606e0

@Liviali155

The issue can also be reproduced with DP-MST (two DP monitors connected through a DP hub).
Platform: APL UP2 with codec PCM512x in I2S mode

@Liviali155 Liviali155 added the APL Applies to ApolloLake platform label Nov 22, 2019
@dengyangchao dengyangchao changed the title [BUG][CML]Hot-Plug in HDMI cause kernel NULL pointer dereference address: 0000000000000010 [BUG]Hot-Plug in HDMI cause kernel NULL pointer dereference address: 0000000000000010 Nov 22, 2019
@dengyangchao
Author

The issue also reproduces with DP-MST (two DP monitors connected through a DP hub).
Platform: CFL-S RVP with onboard codec ALC700 in HDA mode

@dengyangchao dengyangchao added the CFL Applies to Coffee Lake platform label Nov 22, 2019
@kv2019i kv2019i self-assigned this Nov 22, 2019
@kv2019i
Collaborator

kv2019i commented Nov 22, 2019

I'll take a look at this.

@mengdonglin mengdonglin added the P1 Blocker bugs or important features label Nov 25, 2019
@kv2019i
Collaborator

kv2019i commented Nov 26, 2019

I've been trying to reproduce this issue, but I'm having a hard time. I tried on both CML and APL systems and saw no issues with hotplug. I used the latest sof-dev and then downgraded to bc30957, the commit mentioned in the original bug report, but saw no issues at HDMI hotplug.

I'll continue to try to reproduce.

UPDATE: I tried with CFL and also various DP-MST scenarios, but still no luck. @dengyangchao does this still happen to you?

kv2019i added a commit to kv2019i/linux that referenced this issue Nov 26, 2019
Add debug messages to track ELD kcontrol creation and how
they are mapped to notify calls from i915 driver upon hotplug.

BugLink: thesofproject#1536
@kv2019i
Collaborator

kv2019i commented Nov 26, 2019

@dengyangchao If you can still reproduce this (on the latest sof-dev), could you take #1550, run the same scenario and share the dmesg? With PR #1550 you should no longer get a kernel oops, and you should see the following trace in dmesg: "HDMI: NULL eld_ctl for pcm_idx".

I cannot trigger this on any setup I have; the eld_ctl is always valid.

@dengyangchao
Author

@kv2019i I tried hot-plugging HDMI and can still reproduce the issue. After applying #1550, dmesg shows the following:
Platform: CML Chrome with onboard codec RT5682 in I2S mode

[   46.298651] sof-audio-pci 0000:00:1f.3: ipc tx succeeded: 0x40010000: GLB_PM_MSG: CTX_SAVE
[   46.298723] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x1010f0f successful
[   46.298726] sof-audio-pci 0000:00:1f.3: FW Poll Status: reg=0x1000f0f successful
[   46.298728] sof-audio-pci 0000:00:1f.3: DSP core(s) enabled? 0 : core_mask f
[   46.298755] sof-audio-pci 0000:00:1f.3: Debug PCIR: 00000010 at  00000044
[   91.988736] snd_hda_codec_hdmi ehdaudio0D2: HDMI: NULL eld_ctl for pcm_idx 3

dmesg_pr1550.log

@kv2019i
Collaborator

kv2019i commented Nov 27, 2019

Thanks @dengyangchao , this helps a lot. This line in the log reveals quite a bit:

[ 91.988736] snd_hda_codec_hdmi ehdaudio0D2: HDMI: NULL eld_ctl for pcm_idx 3

... we are getting an ELD update for a non-existent PCM, "pcm_idx 3". It's still unclear why this happens, but this helps with further debugging.
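
To make the failure mode concrete, here is a small Python model of the situation described above. It is only an illustration, not the driver code: `PcmSlot`, `notify_eld_change` and the 16-entry table are hypothetical stand-ins, with `eld_ctl` left as `None` for slots that were never set up. The guarded path returns the same message that the debug build printed instead of dereferencing the missing control.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PcmSlot:
    # Stand-in for the per-PCM ELD control; None models a slot whose
    # control was never created.
    eld_ctl: Optional[str] = None


def notify_eld_change(pcms: List[PcmSlot], pcm_idx: int) -> str:
    """Report an ELD change, refusing to touch a slot without a control."""
    if pcm_idx < 0 or pcm_idx >= len(pcms) or pcms[pcm_idx].eld_ctl is None:
        # Without this guard the model would fail on the missing control,
        # which is the analogue of the NULL dereference in the oops.
        return f"HDMI: NULL eld_ctl for pcm_idx {pcm_idx}"
    return f"notify {pcms[pcm_idx].eld_ctl}"


# Hypothetical table: three PCMs were set up, the remaining slots are empty.
pcms = [PcmSlot(eld_ctl=f"ELD pcm={i}") for i in range(3)]
pcms += [PcmSlot() for _ in range(13)]

print(notify_eld_change(pcms, 2))
print(notify_eld_change(pcms, 3))
```

A guard like this only hides the symptom; it does not explain why an index past the allocated PCMs was selected in the first place.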

kv2019i added a commit to kv2019i/linux that referenced this issue Nov 28, 2019
Add additional check in hdmi_find_pcm_slot() to not return
a pcm index that points to unallocated pcm. This could happen
if codec driver is set up in codec->mst_no_extra_pcms mode.

BugLink: thesofproject#1536
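
The idea in the commit message above can be sketched as a simplified Python model. This is not the actual patch: `find_pcm_slot` and its parameters are hypothetical, and the preference rule (a device with `dev_id` 0 prefers the slot of its pin, other devices prefer `num_nid + dev_id - 1`) is an assumption about how the driver picks slots. With `num_nid` 3 and `dev_id` 1, that rule yields index 3, which is the unallocated index seen in the dmesg when only three PCMs exist.

```python
from typing import Optional, Set


def find_pcm_slot(pin_nid_idx: int, dev_id: int, num_nid: int,
                  pcm_used: int, in_use: Set[int],
                  check_allocated: bool = True) -> Optional[int]:
    """Pick a free PCM index for a pin/device pair, or None if none is free.

    pcm_used is the number of PCMs that were actually allocated; indexes at
    or above it point to unallocated entries.
    """
    # Assumed preference rule (see the lead-in text).
    preferred = pin_nid_idx if dev_id == 0 else num_nid + (dev_id - 1)

    allocated = preferred < pcm_used
    if preferred not in in_use and (allocated or not check_allocated):
        return preferred

    # Fall back to the first free slot among the allocated PCMs.
    for idx in range(pcm_used):
        if idx not in in_use:
            return idx
    return None


# Hypothetical values: 3 pins, 3 allocated PCMs, MST device 1 on pin index 1.
before = find_pcm_slot(1, 1, num_nid=3, pcm_used=3, in_use=set(),
                       check_allocated=False)
after = find_pcm_slot(1, 1, num_nid=3, pcm_used=3, in_use=set())
print(before, after)
```

With the check disabled the model returns an index outside the allocated range; with the check enabled it falls back to a free allocated slot.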
kv2019i added a commit to kv2019i/linux that referenced this issue Nov 28, 2019
Add debug messages to track ELD kcontrol creation and how
they are mapped to notify calls from i915 driver upon hotplug.

BugLink: thesofproject#1536
@kv2019i
Collaborator

kv2019i commented Nov 28, 2019

@dengyangchao I have now pushed a potential fix plus a few more debug patches to #1550.
Could you give it another try? And could you also share the dmesg (whether the test passes or not)? I'm curious to understand how a single HDMI monitor can trigger this.

@dengyangchao
Author

@kv2019i The latest #1550 works: the HDMI test passes, and DP-MST also passes. amixer contents now detects the plug/unplug status.
Platform: CML Chrome with onboard codec RT5682 in I2S mode

One interesting thing: when HDMI is plugged in first, it is detected and aplay works through HW:0,2

numid=9,iface=CARD,name='HDMI/DP,pcm=2 Jack'
 ; type=BOOLEAN,access=r-------,values=1
 : values=on

If DP is then plugged in, it is also detected, but HW:0,2 changes to DP and HDMI moves to HW:0,3 (verified through aplay)

numid=9,iface=CARD,name='HDMI/DP,pcm=2 Jack'
  ; type=BOOLEAN,access=r-------,values=1
  : values=on
numid=15,iface=CARD,name='HDMI/DP,pcm=3 Jack'
  ; type=BOOLEAN,access=r-------,values=1
  : values=on
[ 1504.253569] snd_hda_codec_hdmi ehdaudio0D2: HDMI: DEBUG: pcm assign pin 6 to pcm_idx 0, num_nid 3, dev_id 1
[ 1512.166735] usb 1-1: new full-speed USB device number 13 using xhci_hcd
[ 1512.294477] usb 1-1: New USB device found, idVendor=1fc9, idProduct=5002, bcdDevice= 1.00
[ 1512.294482] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1512.294485] usb 1-1: Product: PTN5002
[ 1512.294488] usb 1-1: Manufacturer: NXP
[ 1512.294491] usb 1-1: SerialNumber: 0000074eb595
[ 1512.611972] usb 1-1: USB disconnect, device number 13
[ 1545.075671] usb 1-1: new full-speed USB device number 14 using xhci_hcd
[ 1545.203061] usb 1-1: New USB device found, idVendor=1fc9, idProduct=5002, bcdDevice= 1.00
[ 1545.203064] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1545.203065] usb 1-1: Product: PTN5002
[ 1545.203067] usb 1-1: Manufacturer: NXP
[ 1545.203068] usb 1-1: SerialNumber: 0000074eb595
[ 1545.583230] usb 1-1: USB disconnect, device number 14
[ 1545.913737] usb 1-1: new full-speed USB device number 15 using xhci_hcd
[ 1546.041454] usb 1-1: New USB device found, idVendor=1fc9, idProduct=5002, bcdDevice= 1.00
[ 1546.041459] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 1546.041462] usb 1-1: Product: PTN5002
[ 1546.041465] usb 1-1: Manufacturer: NXP
[ 1546.041467] usb 1-1: SerialNumber: 0000074eb595
[ 1547.814976] snd_hda_codec_hdmi ehdaudio0D2: HDMI: DEBUG: pcm assign pin 5 to pcm_idx 0, num_nid 3, dev_id 0
[ 1548.984201] snd_hda_codec_hdmi ehdaudio0D2: HDMI: DEBUG: pcm assign pin 5 to pcm_idx 0, num_nid 3, dev_id 0
[ 1549.082238] snd_hda_codec_hdmi ehdaudio0D2: HDMI: DEBUG: pcm assign pin 6 to pcm_idx 1, num_nid 3, dev_id 2

dmesg.log
dmesg-hdmi-only.log
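
The "pcm assign" debug lines above are easy to post-process when checking longer logs. Below is a short sketch that extracts the pin, `pcm_idx`, `num_nid` and `dev_id` fields and flags any assignment whose index is not below a given number of allocated PCMs. The helper names are hypothetical, the line format is the one emitted by the debug patches in #1550, and the number of allocated PCMs has to be supplied by the caller because it is not part of these log lines.

```python
import re
from typing import List, Tuple

ASSIGN = re.compile(
    r"pcm assign pin (\d+) to pcm_idx (\d+), num_nid (\d+), dev_id (\d+)")


def parse_pcm_assignments(dmesg: str) -> List[Tuple[int, int, int, int]]:
    """Return (pin, pcm_idx, num_nid, dev_id) for every 'pcm assign' line."""
    return [tuple(int(x) for x in m.groups()) for m in ASSIGN.finditer(dmesg)]


def out_of_range(assignments: List[Tuple[int, int, int, int]],
                 allocated_pcms: int) -> List[Tuple[int, int, int, int]]:
    """Assignments whose pcm_idx is not below the allocated PCM count."""
    return [a for a in assignments if a[1] >= allocated_pcms]


SAMPLE_LOG = """\
[ 1547.814976] snd_hda_codec_hdmi ehdaudio0D2: HDMI: DEBUG: pcm assign pin 5 to pcm_idx 0, num_nid 3, dev_id 0
[ 1548.984201] snd_hda_codec_hdmi ehdaudio0D2: HDMI: DEBUG: pcm assign pin 5 to pcm_idx 0, num_nid 3, dev_id 0
[ 1549.082238] snd_hda_codec_hdmi ehdaudio0D2: HDMI: DEBUG: pcm assign pin 6 to pcm_idx 1, num_nid 3, dev_id 2
"""

assignments = parse_pcm_assignments(SAMPLE_LOG)
print(assignments)
print(out_of_range(assignments, allocated_pcms=3))
```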

tiwai pushed a commit to tiwai/sound that referenced this issue Nov 29, 2019
Add additional check in hdmi_find_pcm_slot() to not return
a pcm index that points to unallocated pcm. This could happen
if codec driver is set up in codec->mst_no_extra_pcms mode.
On some platforms, this leads to a kernel oops in snd_ctl_notify(),
called via update_eld().

BugLink: thesofproject#1536
Fixes: 5398e94 ALSA: hda - Add DP-MST support for NVIDIA codecs
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20191129143756.23941-1-kai.vehmanen@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
fengguang pushed a commit to 0day-ci/linux that referenced this issue Nov 29, 2019
Add additional check in hdmi_find_pcm_slot() to not return
a pcm index that points to unallocated pcm. This could happen
if codec driver is set up in codec->mst_no_extra_pcms mode.
On some platforms, this leads to a kernel oops in snd_ctl_notify(),
called via update_eld().

BugLink: thesofproject#1536
Fixes: 5398e94 ALSA: hda - Add DP-MST support for NVIDIA codecs
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
@kv2019i
Collaborator

kv2019i commented Nov 29, 2019

@dengyangchao wrote:

@kv2019i The latest #1550 works: the HDMI test passes, and DP-MST also passes. amixer contents now detects the plug/unplug status.
Platform: CML Chrome with onboard codec RT5682 in I2S mode

Thanks, this is good news! I submitted a cleaned up patch to alsa-devel and it was already merged:
https://mailman.alsa-project.org/pipermail/alsa-devel/2019-November/159322.html

One interesting thing: when HDMI is plugged in first, it is detected and aplay works through HW:0,2
[...]
If DP is then plugged in, it is also detected, but HW:0,2 changes to DP and HDMI moves to HW:0,3 (verified through aplay)

Yes, this is now possible. HDMI gets disconnected in the process, and the reconnection may change the PCM number. I'll study this a bit further, but this is probably something we will not try to address.

kv2019i added a commit to kv2019i/linux that referenced this issue Nov 29, 2019
Add additional check in hdmi_find_pcm_slot() to not return
a pcm index that points to unallocated pcm. This could happen
if codec driver is set up in codec->mst_no_extra_pcms mode.
On some platforms, this leads to a kernel oops in snd_ctl_notify(),
called via update_eld().

BugLink: thesofproject#1536
Fixes: 5398e94 ALSA: hda - Add DP-MST support for NVIDIA codecs
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
kv2019i added a commit to kv2019i/linux that referenced this issue Dec 2, 2019
Add debug messages to track ELD kcontrol creation and how
they are mapped to notify calls from i915 driver upon hotplug.

BugLink: thesofproject#1536
@kv2019i
Collaborator

kv2019i commented Dec 2, 2019

We need to investigate the behaviour of the i915 driver on this Chrome CML device a bit more. @dengyangchao , I sent you further details in email.

@dengyangchao
Author

dengyangchao commented Dec 3, 2019

@kv2019i After applying the latest #1550 and enabling DRM, I captured logs for the following scenarios:
1. Plug and unplug HDMI
2. Plug and unplug DP
3. Plug HDMI -> plug DP -> unplug DP -> unplug HDMI

The logs were sent by email and are also attached here.

logs.zip

@dengyangchao
Author

After upstream #1576 was merged, the issue can no longer be reproduced. Closing.

sys-oak pushed a commit to intel/linux-intel-lts that referenced this issue May 26, 2020
Add additional check in hdmi_find_pcm_slot() to not return
a pcm index that points to unallocated pcm. This could happen
if codec driver is set up in codec->mst_no_extra_pcms mode.
On some platforms, this leads to a kernel oops in snd_ctl_notify(),
called via update_eld().

BugLink: thesofproject/linux#1536
Fixes: 5398e94 ALSA: hda - Add DP-MST support for NVIDIA codecs
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20191129143756.23941-1-kai.vehmanen@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Lingpl <pei.lee.ling@intel.com>
AaeonCM pushed a commit to AaeonCM/ubuntu-bionic-up that referenced this issue Oct 22, 2020
BugLink: https://bugs.launchpad.net/bugs/1867704

Add additional check in hdmi_find_pcm_slot() to not return
a pcm index that points to unallocated pcm. This could happen
if codec driver is set up in codec->mst_no_extra_pcms mode.
On some platforms, this leads to a kernel oops in snd_ctl_notify(),
called via update_eld().

BugLink: thesofproject/linux#1536
Fixes: 5398e94fb753 ALSA: hda - Add DP-MST support for NVIDIA codecs
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
Link: https://lore.kernel.org/r/20191129143756.23941-1-kai.vehmanen@linux.intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
(cherry picked from commit 0c0fe9e6b95ce2e9e2c83bef5563cf223e849eda)
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-By: AceLan Kao <acelan.kao@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>