Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

failed to use WiFi on newest 8t device #13

Open
bluboy-official opened this issue Dec 20, 2020 · 3 comments
Open

failed to use WiFi on newest 8t device #13

bluboy-official opened this issue Dec 20, 2020 · 3 comments

Comments

@bluboy-official
Copy link

Hi there,I had used the code and rebuilt the kernel and found that the wifi crashed
Seems that it's because the newer 8t devices is different from older ones?
So could you update the code to fix that? I'll very apperciate of that.
thanks.

@xiangfeidexiaohuo
Copy link

Hi there,I had used the code and rebuilt the kernel and found that the wifi crashed
Seems that it's because the newer 8t devices is different from older ones?
So could you update the code to fix that? I'll very apperciate of that.
thanks.

官方人员出来更新源码了,8T新批次的机器,WiFi在所有第三方上全挂的!

@luk1337
Copy link

luk1337 commented Jan 27, 2021

@bluboy-official you can cherry-pick this change to get your WiFi up : LineageOS/android_kernel_oneplus_sm8250@9e7e415

@bluboy-official
Copy link
Author

@bluboy-official you can cherry-pick this change to get your WiFi up : LineageOS/android_kernel_oneplus_sm8250@9e7e415

Appreciate you for fixing that! Thanks a lot!

schqiushui pushed a commit to PixelExperiencePlus/android_kernel_oneplus_sm8250 that referenced this issue Apr 24, 2021
commit 7cf64b1 upstream.

vt_in_use() dereferences console_driver->ttys[i] without proper locking.
This is broken because the tty can be closed and freed concurrently.

We could fix this by using 'READ_ONCE(console_driver->ttys[i]) != NULL'
and skipping the check of tty_struct::count.  But, looking at
console_driver->ttys[i] isn't really appropriate anyway because even if
it is NULL the tty can still be in the process of being closed.

Instead, fix it by making vt_in_use() require console_lock() and check
whether the vt is allocated and has port refcount > 1.  This works since
following the patch "vt: vt_ioctl: fix VT_DISALLOCATE freeing in-use
virtual console" the port refcount is incremented while the vt is open.

Reproducer (very unreliable, but it worked for me after a few minutes):

	#include <fcntl.h>
	#include <linux/vt.h>

	int main()
	{
		int fd, nproc;
		struct vt_stat state;
		char ttyname[16];

		fd = open("/dev/tty10", O_RDONLY);
		for (nproc = 1; nproc < 8; nproc *= 2)
			fork();
		for (;;) {
			sprintf(ttyname, "/dev/tty%d", rand() % 8);
			close(open(ttyname, O_RDONLY));
			ioctl(fd, VT_GETSTATE, &state);
		}
	}

KASAN report:

	BUG: KASAN: use-after-free in vt_in_use drivers/tty/vt/vt_ioctl.c:48 [inline]
	BUG: KASAN: use-after-free in vt_ioctl+0x1ad3/0x1d70 drivers/tty/vt/vt_ioctl.c:657
	Read of size 4 at addr ffff888065722468 by task syz-vt2/132

	CPU: 0 PID: 132 Comm: syz-vt2 Not tainted 5.6.0-rc5-00130-g089b6d3654916 OnePlusOSS#13
	Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191223_100556-anatol 04/01/2014
	Call Trace:
	 [...]
	 vt_in_use drivers/tty/vt/vt_ioctl.c:48 [inline]
	 vt_ioctl+0x1ad3/0x1d70 drivers/tty/vt/vt_ioctl.c:657
	 tty_ioctl+0x9db/0x11b0 drivers/tty/tty_io.c:2660
	 [...]

	Allocated by task 136:
	 [...]
	 kzalloc include/linux/slab.h:669 [inline]
	 alloc_tty_struct+0x96/0x8a0 drivers/tty/tty_io.c:2982
	 tty_init_dev+0x23/0x350 drivers/tty/tty_io.c:1334
	 tty_open_by_driver drivers/tty/tty_io.c:1987 [inline]
	 tty_open+0x3ca/0xb30 drivers/tty/tty_io.c:2035
	 [...]

	Freed by task 41:
	 [...]
	 kfree+0xbf/0x200 mm/slab.c:3757
	 free_tty_struct+0x8d/0xb0 drivers/tty/tty_io.c:177
	 release_one_tty+0x22d/0x2f0 drivers/tty/tty_io.c:1468
	 process_one_work+0x7f1/0x14b0 kernel/workqueue.c:2264
	 worker_thread+0x8b/0xc80 kernel/workqueue.c:2410
	 [...]

Fixes: 4001d7b ("vt: push down the tty lock so we can see what is left to tackle")
Cc: <stable@vger.kernel.org> # v3.4+
Acked-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20200322034305.210082-3-ebiggers@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
schqiushui pushed a commit to PixelExperiencePlus/android_kernel_oneplus_sm8250 that referenced this issue Mar 18, 2022
[ Upstream commit 4224cfd7fb6523f7a9d1c8bb91bb5df1e38eb624 ]

When bringing down the netdevice or system shutdown, a panic can be
triggered while accessing the sysfs path because the device is already
removed.

    [  755.549084] mlx5_core 0000:12:00.1: Shutdown was called
    [  756.404455] mlx5_core 0000:12:00.0: Shutdown was called
    ...
    [  757.937260] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [  758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280

    crash> bt
    ...
    PID: 12649  TASK: ffff8924108f2100  CPU: 1   COMMAND: "amsd"
    ...
     OnePlusOSS#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778
        [exception RIP: dma_pool_alloc+0x1ab]
        RIP: ffffffff8ee11acb  RSP: ffff89240e1a3968  RFLAGS: 00010046
        RAX: 0000000000000246  RBX: ffff89243d874100  RCX: 0000000000001000
        RDX: 0000000000000000  RSI: 0000000000000246  RDI: ffff89243d874090
        RBP: ffff89240e1a39c0   R8: 000000000001f080   R9: ffff8905ffc03c00
        R10: ffffffffc04680d4  R11: ffffffff8edde9fd  R12: 00000000000080d0
        R13: ffff89243d874090  R14: ffff89243d874080  R15: 0000000000000000
        ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
    OnePlusOSS#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core]
    OnePlusOSS#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core]
    OnePlusOSS#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core]
    OnePlusOSS#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core]
    OnePlusOSS#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core]
    OnePlusOSS#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core]
    OnePlusOSS#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core]
    OnePlusOSS#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46
    OnePlusOSS#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208
    OnePlusOSS#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3
    OnePlusOSS#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf
    OnePlusOSS#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596
    OnePlusOSS#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10
    OnePlusOSS#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5
    OnePlusOSS#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff
    OnePlusOSS#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f
    OnePlusOSS#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92

    crash> net_device.state ffff89443b0c0000
      state = 0x5  (__LINK_STATE_START| __LINK_STATE_NOCARRIER)

To prevent this scenario, we also make sure that the netdevice is present.

Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
andreock referenced this issue in andreock/android_kernel_realme_sm8250 Sep 1, 2022
This patch is to fix a crash:

 #3 [ffffb6580689f898] oops_end at ffffffffa2835bc2
 #4 [ffffb6580689f8b8] no_context at ffffffffa28766e7
 #5 [ffffb6580689f920] async_page_fault at ffffffffa320135e
    [exception RIP: f2fs_is_compressed_page+34]
    RIP: ffffffffa2ba83a2  RSP: ffffb6580689f9d8  RFLAGS: 00010213
    RAX: 0000000000000001  RBX: fffffc0f50b34bc0  RCX: 0000000000002122
    RDX: 0000000000002123  RSI: 0000000000000c00  RDI: fffffc0f50b34bc0
    RBP: ffff97e815a40178   R8: 0000000000000000   R9: ffff97e83ffc9000
    R10: 0000000000032300  R11: 0000000000032380  R12: ffffb6580689fa38
    R13: fffffc0f50b34bc0  R14: ffff97e825cbd000  R15: 0000000000000c00
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffffb6580689f9d8] __is_cp_guaranteed at ffffffffa2b7ea98
 #7 [ffffb6580689f9f0] f2fs_submit_page_write at ffffffffa2b81a69
 #8 [ffffb6580689fa30] f2fs_do_write_meta_page at ffffffffa2b99777
 #9 [ffffb6580689fae0] __f2fs_write_meta_page at ffffffffa2b75f1a
 #10 [ffffb6580689fb18] f2fs_sync_meta_pages at ffffffffa2b77466
 #11 [ffffb6580689fc98] do_checkpoint at ffffffffa2b78e46
 #12 [ffffb6580689fd88] f2fs_write_checkpoint at ffffffffa2b79c29
 #13 [ffffb6580689fdd0] f2fs_sync_fs at ffffffffa2b69d95
 #14 [ffffb6580689fe20] sync_filesystem at ffffffffa2ad2574
 #15 [ffffb6580689fe30] generic_shutdown_super at ffffffffa2a9b582
 #16 [ffffb6580689fe48] kill_block_super at ffffffffa2a9b6d1
 OnePlusOSS#17 [ffffb6580689fe60] kill_f2fs_super at ffffffffa2b6abe1
 OnePlusOSS#18 [ffffb6580689fea0] deactivate_locked_super at ffffffffa2a9afb6
 OnePlusOSS#19 [ffffb6580689feb8] cleanup_mnt at ffffffffa2abcad4
 OnePlusOSS#20 [ffffb6580689fee0] task_work_run at ffffffffa28bca28
 OnePlusOSS#21 [ffffb6580689ff00] exit_to_usermode_loop at ffffffffa28050b7
 OnePlusOSS#22 [ffffb6580689ff38] do_syscall_64 at ffffffffa280560e
 OnePlusOSS#23 [ffffb6580689ff50] entry_SYSCALL_64_after_hwframe at ffffffffa320008c

This occurred when umount f2fs if enable F2FS_FS_COMPRESSION
with F2FS_IO_TRACE. Fixes it by adding IS_IO_TRACED_PAGE to check
validity of pid for page_private.

Signed-off-by: Yu Changchun <yuchangchun1@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
andreock referenced this issue in andreock/android_kernel_realme_sm8250 Sep 1, 2022
https://bugzilla.kernel.org/show_bug.cgi?id=208565

PID: 257    TASK: ecdd0000  CPU: 0   COMMAND: "init"
  #0 [<c0b420ec>] (__schedule) from [<c0b423c8>]
  #1 [<c0b423c8>] (schedule) from [<c0b459d4>]
  #2 [<c0b459d4>] (rwsem_down_read_failed) from [<c0b44fa0>]
  #3 [<c0b44fa0>] (down_read) from [<c044233c>]
  #4 [<c044233c>] (f2fs_truncate_blocks) from [<c0442890>]
  #5 [<c0442890>] (f2fs_truncate) from [<c044d408>]
  #6 [<c044d408>] (f2fs_evict_inode) from [<c030be18>]
  #7 [<c030be18>] (evict) from [<c030a558>]
  #8 [<c030a558>] (iput) from [<c047c600>]
  #9 [<c047c600>] (f2fs_sync_node_pages) from [<c0465414>]
 #10 [<c0465414>] (f2fs_write_checkpoint) from [<c04575f4>]
 #11 [<c04575f4>] (f2fs_sync_fs) from [<c0441918>]
 #12 [<c0441918>] (f2fs_do_sync_file) from [<c0441098>]
 #13 [<c0441098>] (f2fs_sync_file) from [<c0323fa0>]
 #14 [<c0323fa0>] (vfs_fsync_range) from [<c0324294>]
 #15 [<c0324294>] (do_fsync) from [<c0324014>]
 #16 [<c0324014>] (sys_fsync) from [<c0108bc0>]

This can be caused by flush_dirty_inode() in f2fs_sync_node_pages() where
iput() requires f2fs_lock_op() again resulting in livelock.

Reported-by: Zhiguo Niu <Zhiguo.Niu@unisoc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
schqiushui pushed a commit to PixelExperiencePlus/android_kernel_oneplus_sm8250 that referenced this issue Jan 19, 2023
…g the sock

[ Upstream commit 3cf7203ca620682165706f70a1b12b5194607dce ]

There is a race condition in vxlan that when deleting a vxlan device
during receiving packets, there is a possibility that the sock is
released after getting vxlan_sock vs from sk_user_data. Then in
later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got
NULL pointer dereference. e.g.

   #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
   OnePlusOSS#1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
   OnePlusOSS#2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
   OnePlusOSS#3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
   OnePlusOSS#4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
   OnePlusOSS#5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
   OnePlusOSS#6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
      [exception RIP: vxlan_ecn_decapsulate+0x3b]
      RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
      RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
      RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
      RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
      R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
      R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   OnePlusOSS#7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
   OnePlusOSS#8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
   OnePlusOSS#9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
  OnePlusOSS#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
  OnePlusOSS#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
  OnePlusOSS#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
  OnePlusOSS#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
  OnePlusOSS#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
  OnePlusOSS#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
  OnePlusOSS#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
  OnePlusOSS#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
  OnePlusOSS#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3

Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh

Fix this by waiting for all sk_user_data reader to finish before
releasing the sock.

Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
sreelekshman pushed a commit to sreelekshman/android_kernel_realme_sm8250 that referenced this issue Jun 29, 2023
[ Upstream commit 05bb0167c80b8f93c6a4e0451b7da9b96db990c2 ]

ACPICA commit 770653e3ba67c30a629ca7d12e352d83c2541b1e

Before this change we see the following UBSAN stack trace in Fuchsia:

  #0    0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302
  realme-sm8250-devs#1.2  0x000020d0f660777f in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x3d77f
  realme-sm8250-devs#1.1  0x000020d0f660777f in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x3d77f
  realme-sm8250-devs#1    0x000020d0f660777f in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:387 <libclang_rt.asan.so>+0x3d77f
  realme-sm8250-devs#2    0x000020d0f660b96d in handlepointer_overflow_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:809 <libclang_rt.asan.so>+0x4196d
  realme-sm8250-devs#3    0x000020d0f660b50d in compiler-rt/lib/ubsan/ubsan_handlers.cpp:815 <libclang_rt.asan.so>+0x4150d
  OnePlusOSS#4    0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302
  OnePlusOSS#5    0x000021e4213e2369 in acpi_ds_call_control_method(struct acpi_thread_state*, struct acpi_walk_state*, union acpi_parse_object*) ../../third_party/acpica/source/components/dispatcher/dsmethod.c:605 <platform-bus-x86.so>+0x262369
  OnePlusOSS#6    0x000021e421437fac in acpi_ps_parse_aml(struct acpi_walk_state*) ../../third_party/acpica/source/components/parser/psparse.c:550 <platform-bus-x86.so>+0x2b7fac
  OnePlusOSS#7    0x000021e4214464d2 in acpi_ps_execute_method(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/parser/psxface.c:244 <platform-bus-x86.so>+0x2c64d2
  OnePlusOSS#8    0x000021e4213aa052 in acpi_ns_evaluate(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/namespace/nseval.c:250 <platform-bus-x86.so>+0x22a052
  OnePlusOSS#9    0x000021e421413dd8 in acpi_ns_init_one_device(acpi_handle, u32, void*, void**) ../../third_party/acpica/source/components/namespace/nsinit.c:735 <platform-bus-x86.so>+0x293dd8
  OnePlusOSS#10   0x000021e421429e98 in acpi_ns_walk_namespace(acpi_object_type, acpi_handle, u32, u32, acpi_walk_callback, acpi_walk_callback, void*, void**) ../../third_party/acpica/source/components/namespace/nswalk.c:298 <platform-bus-x86.so>+0x2a9e98
  OnePlusOSS#11   0x000021e4214131ac in acpi_ns_initialize_devices(u32) ../../third_party/acpica/source/components/namespace/nsinit.c:268 <platform-bus-x86.so>+0x2931ac
  OnePlusOSS#12   0x000021e42147c40d in acpi_initialize_objects(u32) ../../third_party/acpica/source/components/utilities/utxfinit.c:304 <platform-bus-x86.so>+0x2fc40d
  OnePlusOSS#13   0x000021e42126d603 in acpi::acpi_impl::initialize_acpi(acpi::acpi_impl*) ../../src/devices/board/lib/acpi/acpi-impl.cc:224 <platform-bus-x86.so>+0xed603

Add a simple check that avoids incrementing a pointer by zero, but
otherwise behaves as before. Note that our findings are against ACPICA
20221020, but the same code exists on master.

Link: acpica/acpica@770653e3
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
schqiushui pushed a commit to PixelExperiencePlus/android_kernel_oneplus_sm8250 that referenced this issue Nov 7, 2023
[ Upstream commit 0b0747d507bffb827e40fc0f9fb5883fffc23477 ]

The following processes run into a deadlock. CPU 41 was waiting for CPU 29
to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29
was hung by that spinlock with IRQs disabled.

  PID: 17360    TASK: ffff95c1090c5c40  CPU: 41  COMMAND: "mrdiagd"
  !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
  !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
  !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
   # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
   # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
   # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
   # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
   # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
   # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
   # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
   OnePlusOSS#10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
   OnePlusOSS#11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
   OnePlusOSS#12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
   OnePlusOSS#13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
   OnePlusOSS#14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
   OnePlusOSS#15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
   OnePlusOSS#16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
   OnePlusOSS#17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
   OnePlusOSS#18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
   OnePlusOSS#19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   OnePlusOSS#20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   OnePlusOSS#21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   OnePlusOSS#22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   OnePlusOSS#23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   OnePlusOSS#24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   OnePlusOSS#25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   OnePlusOSS#26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   OnePlusOSS#27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

  PID: 17355    TASK: ffff95c1090c3d80  CPU: 29  COMMAND: "mrdiagd"
  !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
  !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
   # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
   # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
   # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
   # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
   # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
   # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
   # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
   # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
   OnePlusOSS#10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
   OnePlusOSS#11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
   OnePlusOSS#12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
   OnePlusOSS#13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
   OnePlusOSS#14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
   OnePlusOSS#15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
   OnePlusOSS#16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
   OnePlusOSS#17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations, it doesn't
protect any resource that will be touched by an interrupt. Consequently
it's not required to disable IRQs. Replace the spinlock with a mutex to fix
the deadlock.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
schqiushui pushed a commit to PixelExperiencePlus/android_kernel_oneplus_sm8250 that referenced this issue Nov 7, 2023
[ Upstream commit a154f5f643c6ecddd44847217a7a3845b4350003 ]

The following call trace shows a deadlock issue due to recursive locking of
mutex "device_mutex". First lock acquire is in target_for_each_device() and
second in target_free_device().

 PID: 148266   TASK: ffff8be21ffb5d00  CPU: 10   COMMAND: "iscsi_ttx"
  #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f
  OnePlusOSS#1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224
  OnePlusOSS#2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee
  OnePlusOSS#3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7
  OnePlusOSS#4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3
  OnePlusOSS#5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c
  OnePlusOSS#6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod]
  OnePlusOSS#7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod]
  OnePlusOSS#8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f
  OnePlusOSS#9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583
 OnePlusOSS#10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod]
 OnePlusOSS#11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc
 OnePlusOSS#12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod]
 OnePlusOSS#13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod]
 OnePlusOSS#14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod]
 OnePlusOSS#15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod]
 OnePlusOSS#16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07
 OnePlusOSS#17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod]
 OnePlusOSS#18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod]
 OnePlusOSS#19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080
 OnePlusOSS#20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364

Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20230918225848.66463-1-junxiao.bi@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants