Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes abnormal power consumption on UMI #21

Merged
merged 1 commit into from
Aug 1, 2022
Merged

Fixes abnormal power consumption on UMI #21

merged 1 commit into from
Aug 1, 2022

Conversation

chaptsand
Copy link
Contributor

Abnormal power consumption has long existed in all rebase kernels because we ignored the following changes from cmi-r-oss

- This is what Xiaomi did on cmi-r-oss [1][2].
- Disable LPM modes results in abnormal power consumption on UMI.
- Guard for THYME too as it has the same node in device tree.

Ref:
[1]: https://github.com/MiCode/Xiaomi_Kernel_OpenSource/blob/cmi-r-oss/drivers/scsi/ufs/ufs-qcom.c#L2033-L2038
[2]: https://github.com/MiCode/kernel_devicetree/blob/cmi-r-oss/qcom/umi-sm8250.dtsi#L11

* Thanks @TheVoyager0777 for found out this particular change

Change-Id: I78d1fd5936cdd871a2ca2626d4a6e0d335a96c83
@UtsavBalar1231 UtsavBalar1231 merged commit 3c09e4b into UtsavBalar1231:android12-stable Aug 1, 2022
mesziman pushed a commit to mesziman/kernel_xiaomi_sm8250 that referenced this pull request Jan 19, 2023
[ Upstream commit 9de255c461d1b3f0242b3ad1450c3323a3e00b34 ]

Commit b74351287d4b ("uio: fix a sleep-in-atomic-context bug in
uio_dmem_genirq_irqcontrol()") started calling disable_irq() without
holding the spinlock because it can sleep. However, that fix introduced
another bug: if interrupt is already disabled and a new disable request
comes in, then the spinlock is not unlocked:

root@localhost:~# printf '\x00\x00\x00\x00' > /dev/uio0
root@localhost:~# printf '\x00\x00\x00\x00' > /dev/uio0
root@localhost:~# [   14.851538] BUG: scheduling while atomic: bash/223/0x00000002
[   14.851991] Modules linked in: uio_dmem_genirq uio myfpga(OE) bochs drm_vram_helper drm_ttm_helper ttm drm_kms_helper drm snd_pcm ppdev joydev psmouse snd_timer snd e1000fb_sys_fops syscopyarea parport sysfillrect soundcore sysimgblt input_leds pcspkr i2c_piix4 serio_raw floppy evbug qemu_fw_cfg mac_hid pata_acpi ip_tables x_tables autofs4 [last unloaded: parport_pc]
[   14.854206] CPU: 0 PID: 223 Comm: bash Tainted: G           OE      6.0.0-rc7 UtsavBalar1231#21
[   14.854786] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   14.855664] Call Trace:
[   14.855861]  <TASK>
[   14.856025]  dump_stack_lvl+0x4d/0x67
[   14.856325]  dump_stack+0x14/0x1a
[   14.856583]  __schedule_bug.cold+0x4b/0x5c
[   14.856915]  __schedule+0xe81/0x13d0
[   14.857199]  ? idr_find+0x13/0x20
[   14.857456]  ? get_work_pool+0x2d/0x50
[   14.857756]  ? __flush_work+0x233/0x280
[   14.858068]  ? __schedule+0xa95/0x13d0
[   14.858307]  ? idr_find+0x13/0x20
[   14.858519]  ? get_work_pool+0x2d/0x50
[   14.858798]  schedule+0x6c/0x100
[   14.859009]  schedule_hrtimeout_range_clock+0xff/0x110
[   14.859335]  ? tty_write_room+0x1f/0x30
[   14.859598]  ? n_tty_poll+0x1ec/0x220
[   14.859830]  ? tty_ldisc_deref+0x1a/0x20
[   14.860090]  schedule_hrtimeout_range+0x17/0x20
[   14.860373]  do_select+0x596/0x840
[   14.860627]  ? __kernel_text_address+0x16/0x50
[   14.860954]  ? poll_freewait+0xb0/0xb0
[   14.861235]  ? poll_freewait+0xb0/0xb0
[   14.861517]  ? rpm_resume+0x49d/0x780
[   14.861798]  ? common_interrupt+0x59/0xa0
[   14.862127]  ? asm_common_interrupt+0x2b/0x40
[   14.862511]  ? __uart_start.isra.0+0x61/0x70
[   14.862902]  ? __check_object_size+0x61/0x280
[   14.863255]  core_sys_select+0x1c6/0x400
[   14.863575]  ? vfs_write+0x1c9/0x3d0
[   14.863853]  ? vfs_write+0x1c9/0x3d0
[   14.864121]  ? _copy_from_user+0x45/0x70
[   14.864526]  do_pselect.constprop.0+0xb3/0xf0
[   14.864893]  ? do_syscall_64+0x6d/0x90
[   14.865228]  ? do_syscall_64+0x6d/0x90
[   14.865556]  __x64_sys_pselect6+0x76/0xa0
[   14.865906]  do_syscall_64+0x60/0x90
[   14.866214]  ? syscall_exit_to_user_mode+0x2a/0x50
[   14.866640]  ? do_syscall_64+0x6d/0x90
[   14.866972]  ? do_syscall_64+0x6d/0x90
[   14.867286]  ? do_syscall_64+0x6d/0x90
[   14.867626]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[...] stripped
[   14.872959]  </TASK>

('myfpga' is a simple 'uio_dmem_genirq' driver I wrote to test this)

The implementation of "uio_dmem_genirq" was based on "uio_pdrv_genirq" and
it is used in a similar manner to the "uio_pdrv_genirq" driver with respect
to interrupt configuration and handling. At the time "uio_dmem_genirq" was
introduced, both had the same implementation of the 'uio_info' handlers
irqcontrol() and handler(). Then commit 34cb275 ("UIO: Fix concurrency
issue"), which was only applied to "uio_pdrv_genirq", ended up making them
a little different. That commit, among other things, changed disable_irq()
to disable_irq_nosync() in the implementation of irqcontrol(). The
motivation there was to avoid a deadlock between irqcontrol() and
handler(), since it added a spinlock in the irq handler, and disable_irq()
waits for the completion of the irq handler.

By changing disable_irq() to disable_irq_nosync() in irqcontrol(), we also
avoid the sleeping-while-atomic bug that commit b74351287d4b ("uio: fix a
sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") was trying to
fix. Thus, this fixes the missing unlock in irqcontrol() by importing the
implementation of irqcontrol() handler from the "uio_pdrv_genirq" driver.
In the end, it reverts commit b74351287d4b ("uio: fix a
sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()") and change
disable_irq() to disable_irq_nosync().

It is worth noting that this still does not address the concurrency issue
fixed by commit 34cb275 ("UIO: Fix concurrency issue"). It will be
addressed separately in the next commits.

Split out from commit 34cb275 ("UIO: Fix concurrency issue").

Fixes: b74351287d4b ("uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol()")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
Link: https://lore.kernel.org/r/20220930224100.816175-2-rafaelmendsr@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
macka69 pushed a commit to macka69/kernel_xiaomi_sm8250-1 that referenced this pull request Nov 4, 2023
[ Upstream commit 0b0747d507bffb827e40fc0f9fb5883fffc23477 ]

The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   UtsavBalar1231#10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   UtsavBalar1231#11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   UtsavBalar1231#12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   UtsavBalar1231#13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   UtsavBalar1231#14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   UtsavBalar1231#15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   UtsavBalar1231#16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   UtsavBalar1231#17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   UtsavBalar1231#18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   UtsavBalar1231#19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   UtsavBalar1231#20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   UtsavBalar1231#21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   UtsavBalar1231#22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   UtsavBalar1231#23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   UtsavBalar1231#24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   UtsavBalar1231#25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   UtsavBalar1231#26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   UtsavBalar1231#27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   UtsavBalar1231#10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   UtsavBalar1231#11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   UtsavBalar1231#12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   UtsavBalar1231#13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   UtsavBalar1231#14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   UtsavBalar1231#15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   UtsavBalar1231#16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   UtsavBalar1231#17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations, it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants