forked from torvalds/linux
Bluetooth userspace api broken in 3.0.8 #10
Comments
Updated - accidentally submitted the issue a bit early.
chrisdearman pushed a commit that referenced this issue on Mar 22, 2017
Only adb, mtp and ptp configurations have been functional when selected. This patch introduces proper behaviour when mtp,adb or ptp,adb configurations are selected.

During the boot-up the following warning is no longer shown:

[ 2.879328] ------------[ cut here ]------------
[ 2.883983] WARNING: CPU: 0 PID: 1 at drivers/usb/dwc2/gadget.c:212 s3c_hsotg_init_fifo+0x168/0x1d0()
[ 2.893204] insufficient fifo memory
[ 2.896602] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.18.3+ #10
[ 2.904004] Stack : 00000000 800919a0 00000000 00000004 00000006 800913f4 00000000 00000000 00000000 00000000 80f75a12 00000042 80f75a12 00000042 00000006 00000000 80e42767 80d7c2e 00000001 00000000 80f73574 8bc90418 80ea0000 01000d00 80f06704 80b24c00 00000000 80035388 00000006 00000000 80d834a4 8bc99b04 8bc99b04 80e40000 00000000 00000000 00000000 00000000 00000000 00000000 ...
[ 2.939709] Call Trace:
[ 2.942174] [<8001bab0>] show_stack+0xd4/0xf0
[ 2.946528] [<80b26c40>] dump_stack+0x70/0xbc
[ 2.950880] [<800356bc>] warn_slowpath_common+0x90/0xe8
[ 2.956116] [<80035808>] warn_slowpath_fmt+0x3c/0x48
[ 2.961075] [<8069b824>] s3c_hsotg_init_fifo+0x168/0x1d0
[ 2.966398] [<8069d8fc>] s3c_hsotg_init+0x50/0x9c
[ 2.971095] [<806a0388>] dwc2_gadget_init+0x430/0x8c0
[ 2.976158] [<806a0df0>] dwc2_driver_probe+0x218/0x2a8
[ 2.981291] [<805b935c>] platform_drv_probe+0x64/0x120
[ 2.986440] [<805b783c>] really_probe+0xa0/0x278
[ 2.991050] [<805b7c78>] driver_probe_device+0x48/0x78
[ 2.996197] [<805b7d74>] __driver_attach+0xcc/0xd4
[ 3.000980] [<805b5b7c>] bus_for_each_dev+0x7c/0xc4
[ 3.005874] [<805b64f8>] bus_add_driver+0x180/0x240
[ 3.010743] [<805b8428>] driver_register+0xac/0x154
[ 3.015633] [<80ea9e04>] do_one_initcall+0x150/0x1f4
[ 3.020589] [<80eaa080>] kernel_init_freeable+0x1d8/0x298
[ 3.025998] [<80b23c5c>] kernel_init+0x28/0x158
[ 3.030522] [<800153ec>] ret_from_kernel_thread+0x14/0x1c
[ 3.035926]
[ 3.037412] ---[ end trace cb88537fdc8fa201 ]---

And during configuration transitions (e.g. adb -> mtp,adb) the following warning is no longer shown:

[ 311.726159] -----------[ cut here ]-----------
[ 311.730817] WARNING: CPU: 0 PID: 0 at drivers/usb/dwc2/gadget.c:1475 s3c_hsotg_rx_data+0x130/0x13c()
[ 311.739931] Modules linked in:
[ 311.742993] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.3+ #45
[ 311.750199] Stack : 00000000 80080370 00000000 00000004 00000006 00000000 00000000 00000000 00000000 00000000 80f05b02 00000042 80d61010 80e18e20 80d60000 8b408010 80e18927 80d0df6c 00000000 00000000 80f03614 80e18e20 80d60000 8b408010 00250182 80a54f54 80e20cc4 80e20cc8 00000000 00000000 80d14ab8 80dfbacc 80dfbacc 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ...
[ 311.785841] Call Trace:
[ 311.788292] [<8001ac28>] show_stack+0xc4/0xe0
[ 311.792650] [<80a56e58>] dump_stack+0x70/0xbc
[ 311.797008] [<80033c14>] warn_slowpath_common+0x88/0xb8
[ 311.802224] [<80033cc8>] warn_slowpath_null+0x18/0x24
[ 311.807266] [<80606a3c>] s3c_hsotg_rx_data+0x130/0x13c
[ 311.812397] [<8060afa4>] s3c_hsotg_irq+0x3b4/0x5e8
[ 311.817183] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0
[ 311.822745] [<80082d4c>] handle_irq_event+0x54/0x98
[ 311.827617] [<80086390>] handle_level_irq+0xe0/0x1c0
[ 311.832572] [<800820bc>] generic_handle_irq+0x3c/0x54
[ 311.837622] [<804bb680>] jz4740_cascade+0x78/0xac
[ 311.842317] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0
[ 311.847881] [<80086d18>] handle_percpu_irq+0x8c/0xbc
[ 311.852835] [<800820bc>] generic_handle_irq+0x3c/0x54
[ 311.857878] [<80016c8c>] do_IRQ+0x18/0x2c
[ 311.861879] [<80014c40>] ret_from_irq+0x0/0x4
[ 311.866227] [<80016b20>] mips_cpuidle_wait_enter+0x14/0x34
[ 311.871713] [<806d37b0>] cpuidle_enter_state+0x88/0x2c0
[ 311.876934] [<80074308>] cpu_startup_entry+0x36c/0x484
[ 311.882074] [<80e7dc04>] start_kernel+0x4b8/0x4e0
[ 311.886767]
[ 311.888253] --[ end trace dd7a60dcc5530db3 ]--

Change-Id: Ic8ac37a28913d4314371de0cd446f8a7cc45864d
Signed-off-by: Dragan Cecavac <dragan.cecavac@imgtec.com>
pcercuei referenced this issue in OpenDingux/linux on May 6, 2017
If a given cpu is not in cpu_present and cpu hotplug is disabled, arch can skip setting up the cpu_dev.

Arch cpuidle driver should pass correct cpu mask for registration, but failing to do so by the driver causes error to propagate and crash like this:

[ 30.076045] Unable to handle kernel paging request for data at address 0x00000048
[ 30.076100] Faulting instruction address: 0xc0000000007b2f30
cpu 0x4d: Vector: 300 (Data Access) at [c000003feb18b670]
pc: c0000000007b2f30: kobject_get+0x20/0x70
lr: c0000000007b3c94: kobject_add_internal+0x54/0x3f0
sp: c000003feb18b8f0
msr: 9000000000009033
dar: 48
dsisr: 40000000
current = 0xc000003fd2ed8300
paca = 0xc00000000fbab500 softe: 0 irq_happened: 0x01
pid = 1, comm = swapper/0
Linux version 4.11.0-rc2-svaidy+ (sv@sagarika) (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12) ) #10 SMP Sun Mar 19 00:08:09 IST 2017
enter ? for help
[c000003feb18b960] c0000000007b3c94 kobject_add_internal+0x54/0x3f0
[c000003feb18b9f0] c0000000007b43a4 kobject_init_and_add+0x64/0xa0
[c000003feb18ba70] c000000000e284f4 cpuidle_add_sysfs+0xb4/0x130
[c000003feb18baf0] c000000000e26038 cpuidle_register_device+0x118/0x1c0
[c000003feb18bb30] c000000000e26c48 cpuidle_register+0x78/0x120
[c000003feb18bbc0] c00000000168fd9c powernv_processor_idle_init+0x110/0x1c4
[c000003feb18bc40] c00000000000cff8 do_one_initcall+0x68/0x1d0
[c000003feb18bd00] c0000000016242f4 kernel_init_freeable+0x280/0x360
[c000003feb18bdc0] c00000000000d864 kernel_init+0x24/0x160
[c000003feb18be30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74

Validating cpu_dev fixes the crash and reports correct error message like:

[ 30.163506] Failed to register cpuidle device for cpu136
[ 30.173329] Registration of powernv driver failed.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
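The validation described in this commit message can be illustrated with a small userspace sketch. It is not the kernel patch itself: `model_device`, `model_cpu_devs`, `model_get_cpu_device()` and `model_register_idle_device()` are hypothetical stand-ins, and returning `-ENODEV` is an assumption about a reasonable error code. The point is only that the device pointer is checked before anything dereferences it, so a CPU that the arch never set up yields an error instead of a crash.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: a small table standing in for per-CPU devices. */
#define MODEL_NR_CPUS 4

struct model_device {
    int id;
};

/* Entries left NULL model CPUs for which arch skipped cpu_dev setup. */
static struct model_device *model_cpu_devs[MODEL_NR_CPUS];

/* Return the device for a CPU, or NULL if it was never set up. */
static struct model_device *model_get_cpu_device(unsigned int cpu)
{
    return cpu < MODEL_NR_CPUS ? model_cpu_devs[cpu] : NULL;
}

/*
 * Validate the device before using it. Without the NULL check, the
 * dereference below would fault, much like the trace quoted above.
 */
static int model_register_idle_device(unsigned int cpu)
{
    struct model_device *dev = model_get_cpu_device(cpu);

    if (!dev)
        return -ENODEV; /* assumed error code for a missing device */

    return dev->id >= 0 ? 0 : -EINVAL; /* stand-in for the real setup work */
}
```

With this shape, a caller that passes a CPU outside the populated set gets an error it can report, which matches the "Failed to register cpuidle device" message in the log above.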
pcercuei referenced this issue in OpenDingux/linux on May 6, 2017
mipsxx_pmu_handle_shared_irq() calls irq_work_run() while holding the pmuint_rwlock for read. irq_work_run() can, via perf_pending_event(), call try_to_wake_up() which can try to take rq->lock.

However, perf can also call perf_pmu_enable() (and thus take the pmuint_rwlock for write) while holding the rq->lock, from finish_task_switch() via perf_event_context_sched_in().

This leads to an ABBA deadlock:

PID: 3855 TASK: 8f7ce288 CPU: 2 COMMAND: "process"
#0 [89c39ac8] __delay at 803b5be4
#1 [89c39ac8] do_raw_spin_lock at 8008fdcc
#2 [89c39af8] try_to_wake_up at 8006e47c
#3 [89c39b38] pollwake at 8018eab0
#4 [89c39b68] __wake_up_common at 800879f4
#5 [89c39b98] __wake_up at 800880e4
#6 [89c39bc8] perf_event_wakeup at 8012109c
#7 [89c39be8] perf_pending_event at 80121184
#8 [89c39c08] irq_work_run_list at 801151f0
#9 [89c39c38] irq_work_run at 80115274
#10 [89c39c50] mipsxx_pmu_handle_shared_irq at 8002cc7c

PID: 1481 TASK: 8eaac6a8 CPU: 3 COMMAND: "process"
#0 [8de7f900] do_raw_write_lock at 800900e0
#1 [8de7f918] perf_event_context_sched_in at 80122310
#2 [8de7f938] __perf_event_task_sched_in at 80122608
#3 [8de7f958] finish_task_switch at 8006b8a4
#4 [8de7f998] __schedule at 805e4dc4
#5 [8de7f9f8] schedule at 805e5558
#6 [8de7fa10] schedule_hrtimeout_range_clock at 805e9984
#7 [8de7fa70] poll_schedule_timeout at 8018e8f8
#8 [8de7fa88] do_select at 8018f338
#9 [8de7fd88] core_sys_select at 8018f5cc
#10 [8de7fee0] sys_select at 8018f854
#11 [8de7ff28] syscall_common at 80028fc8

The lock seems to be there to protect the hardware counters so there is no need to hold it across irq_work_run().

Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
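The reasoning in this commit message (hold the lock only around the counter handling, then run the deferred work) can be sketched in a single-threaded userspace model. Everything here is hypothetical: `model_lock_held` is a plain flag standing in for pmuint_rwlock, `model_run_deferred_work()` stands in for irq_work_run(), and the handler body is invented for illustration. It shows the ordering only and is not the actual driver change.

```c
#include <stdbool.h>

/* Hypothetical model: a flag stands in for the read side of the rwlock. */
static bool model_lock_held;
static int model_work_ran_unlocked;

static void model_lock(void)
{
    model_lock_held = true;
}

static void model_unlock(void)
{
    model_lock_held = false;
}

/*
 * Stand-in for the deferred work. In the scenario above this path may
 * take a second lock, so it counts only the runs made without ours held.
 */
static void model_run_deferred_work(void)
{
    if (!model_lock_held)
        model_work_ran_unlocked++;
}

/* Service the counters under the lock, then run deferred work after it. */
static int model_handle_irq(int pending_counters)
{
    int handled = 0;

    model_lock();
    while (pending_counters-- > 0)
        handled++; /* stand-in for reading one hardware counter */
    model_unlock();

    model_run_deferred_work(); /* no lock held here, so no A-then-B nesting */
    return handled;
}
```

Because the deferred work never runs inside the locked region, one half of the A-then-B / B-then-A pairing disappears, which is the property the commit message relies on.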
miodragdinic pushed a commit to miodragdinic/CI20_linux that referenced this issue on May 24, 2017
Before this patch, using multiple active endpoints would not be possible and would actually be canceling each other out. The issue was discovered on Android when combining adb, mtp and ptp configurations together. This patch introduces proper behaviour for these cases.

Also, during the boot-up the following warning is no longer shown:

[ 2.879328] ------------[ cut here ]------------
[ 2.883983] WARNING: CPU: 0 PID: 1 at drivers/usb/dwc2/gadget.c:212 s3c_hsotg_init_fifo+0x168/0x1d0()
[ 2.893204] insufficient fifo memory
[ 2.896602] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.18.3+ #10
[ 2.904004] Stack : 00000000 800919a0 00000000 00000004 00000006 800913f4 00000000 00000000 00000000 00000000 80f75a12 00000042 80f75a12 00000042 00000006 00000000 80e42767 80d7c2e 00000001 00000000 80f73574 8bc90418 80ea0000 01000d00 80f06704 80b24c00 00000000 80035388 00000006 00000000 80d834a4 8bc99b04 8bc99b04 80e40000 00000000 00000000 00000000 00000000 00000000 00000000 ...
[ 2.939709] Call Trace:
[ 2.942174] [<8001bab0>] show_stack+0xd4/0xf0
[ 2.946528] [<80b26c40>] dump_stack+0x70/0xbc
[ 2.950880] [<800356bc>] warn_slowpath_common+0x90/0xe8
[ 2.956116] [<80035808>] warn_slowpath_fmt+0x3c/0x48
[ 2.961075] [<8069b824>] s3c_hsotg_init_fifo+0x168/0x1d0
[ 2.966398] [<8069d8fc>] s3c_hsotg_init+0x50/0x9c
[ 2.971095] [<806a0388>] dwc2_gadget_init+0x430/0x8c0
[ 2.976158] [<806a0df0>] dwc2_driver_probe+0x218/0x2a8
[ 2.981291] [<805b935c>] platform_drv_probe+0x64/0x120
[ 2.986440] [<805b783c>] really_probe+0xa0/0x278
[ 2.991050] [<805b7c78>] driver_probe_device+0x48/0x78
[ 2.996197] [<805b7d74>] __driver_attach+0xcc/0xd4
[ 3.000980] [<805b5b7c>] bus_for_each_dev+0x7c/0xc4
[ 3.005874] [<805b64f8>] bus_add_driver+0x180/0x240
[ 3.010743] [<805b8428>] driver_register+0xac/0x154
[ 3.015633] [<80ea9e04>] do_one_initcall+0x150/0x1f4
[ 3.020589] [<80eaa080>] kernel_init_freeable+0x1d8/0x298
[ 3.025998] [<80b23c5c>] kernel_init+0x28/0x158
[ 3.030522] [<800153ec>] ret_from_kernel_thread+0x14/0x1c
[ 3.035926]
[ 3.037412] ---[ end trace cb88537fdc8fa201 ]---

And during configuration transitions (e.g. adb -> mtp,adb) the following warning is no longer shown:

[ 311.726159] -----------[ cut here ]-----------
[ 311.730817] WARNING: CPU: 0 PID: 0 at drivers/usb/dwc2/gadget.c:1475 s3c_hsotg_rx_data+0x130/0x13c()
[ 311.739931] Modules linked in:
[ 311.742993] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.3+ #45
[ 311.750199] Stack : 00000000 80080370 00000000 00000004 00000006 00000000 00000000 00000000 00000000 00000000 80f05b02 00000042 80d61010 80e18e20 80d60000 8b408010 80e18927 80d0df6c 00000000 00000000 80f03614 80e18e20 80d60000 8b408010 00250182 80a54f54 80e20cc4 80e20cc8 00000000 00000000 80d14ab8 80dfbacc 80dfbacc 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ...
[ 311.785841] Call Trace:
[ 311.788292] [<8001ac28>] show_stack+0xc4/0xe0
[ 311.792650] [<80a56e58>] dump_stack+0x70/0xbc
[ 311.797008] [<80033c14>] warn_slowpath_common+0x88/0xb8
[ 311.802224] [<80033cc8>] warn_slowpath_null+0x18/0x24
[ 311.807266] [<80606a3c>] s3c_hsotg_rx_data+0x130/0x13c
[ 311.812397] [<8060afa4>] s3c_hsotg_irq+0x3b4/0x5e8
[ 311.817183] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0
[ 311.822745] [<80082d4c>] handle_irq_event+0x54/0x98
[ 311.827617] [<80086390>] handle_level_irq+0xe0/0x1c0
[ 311.832572] [<800820bc>] generic_handle_irq+0x3c/0x54
[ 311.837622] [<804bb680>] jz4740_cascade+0x78/0xac
[ 311.842317] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0
[ 311.847881] [<80086d18>] handle_percpu_irq+0x8c/0xbc
[ 311.852835] [<800820bc>] generic_handle_irq+0x3c/0x54
[ 311.857878] [<80016c8c>] do_IRQ+0x18/0x2c
[ 311.861879] [<80014c40>] ret_from_irq+0x0/0x4
[ 311.866227] [<80016b20>] mips_cpuidle_wait_enter+0x14/0x34
[ 311.871713] [<806d37b0>] cpuidle_enter_state+0x88/0x2c0
[ 311.876934] [<80074308>] cpu_startup_entry+0x36c/0x484
[ 311.882074] [<80e7dc04>] start_kernel+0x4b8/0x4e0
[ 311.886767]
[ 311.888253] --[ end trace dd7a60dcc5530db3 ]--

Change-Id: Ic8ac37a28913d4314371de0cd446f8a7cc45864d
Signed-off-by: Dragan Cecavac <dragan.cecavac@imgtec.com>
pcercuei referenced this issue in OpenDingux/linux on May 16, 2018
syzbot caught an infinite recursion in nsh_gso_segment().

Problem here is that we need to make sure the NSH header is of reasonable length.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48 max: 48!
48 locks held by syz-executor0/10189:
#0: (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
#1: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#1: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#2: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#2: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#3: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#3: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#4: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#4: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#5: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#5: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#6: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#6: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#7: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#7: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#8: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#8: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#9: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#9: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#10: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#10: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#11: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#11: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#12: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#12: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#13: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#13: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#14: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#14: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#15: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#15: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#16: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#16: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#17: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#17: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#18: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#18: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#19: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#19: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#20: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#20: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#21: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#21: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#22: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#22: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#23: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#23: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#24: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#24: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#25: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#25: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#26: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#26: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#27: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#27: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#28: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#28: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#29: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#29: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#30: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#30: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#31: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#31: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
dccp_close: ABORT with 65423 bytes unread
#32: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#32: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#33: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#33: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#34: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#34: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#35: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#35: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#36: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#36: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#37: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#37: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#38: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#38: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#39: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#39: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#40: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#40: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#41: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#41: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#42: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#42: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#43: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#43: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#44: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#44: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#45: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#45: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#46: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#46: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
#47: (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
#47: (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
INFO: lockdep is turned off.
CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ MIPS#26 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449 lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920 rcu_lock_acquire include/linux/rcupdate.h:246 [inline] rcu_read_lock include/linux/rcupdate.h:632 [inline] skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 
net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 
nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107 skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792 __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865 skb_gso_segment include/linux/netdevice.h:4025 [inline] validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118 validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168 sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312 qdisc_restart net/sched/sch_generic.c:399 [inline] __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410 __dev_xmit_skb net/core/dev.c:3243 [inline] __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551 dev_queue_xmit+0x17/0x20 net/core/dev.c:3616 packet_snd net/packet/af_packet.c:2951 [inline] packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:639 __sys_sendto+0x3d7/0x670 net/socket.c:1789 __do_sys_sendto net/socket.c:1801 [inline] __se_sys_sendto net/socket.c:1797 [inline] __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797 do_syscall_64+0x1b1/0x800 
arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: c411ed8 ("nsh: add GSO support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jiri Benc <jbenc@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
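The commit message says only that the NSH header must be "of reasonable length". One plausible reading is a bounds check on the advertised header length before the parser advances past it. The sketch below is a userspace model under that assumption: `MODEL_NSH_BASE_HDR_LEN` (8 bytes) and `model_nsh_len_ok()` are hypothetical names, and the exact conditions in the real fix may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed minimum size of the fixed part of the header, in bytes. */
#define MODEL_NSH_BASE_HDR_LEN 8u

/*
 * Accept the advertised header length only if it covers at least the
 * base header and does not run past the bytes actually available.
 * A length of zero would leave the parser looking at the same bytes
 * again, which is one way re-entry could keep repeating.
 */
static bool model_nsh_len_ok(size_t advertised_len, size_t available_len)
{
    if (available_len < MODEL_NSH_BASE_HDR_LEN)
        return false; /* not even a full base header present */
    if (advertised_len < MODEL_NSH_BASE_HDR_LEN)
        return false; /* too short to make forward progress */
    return advertised_len <= available_len;
}
```

A caller would reject the packet when the check fails rather than handing it on to the next segmentation layer.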
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue on Jun 6, 2018
[ Upstream commit ad0a45f ]

If a given cpu is not in cpu_present and cpu hotplug is disabled, arch can skip setting up the cpu_dev.

Arch cpuidle driver should pass correct cpu mask for registration, but failing to do so by the driver causes error to propagate and crash like this:

[ 30.076045] Unable to handle kernel paging request for data at address 0x00000048
[ 30.076100] Faulting instruction address: 0xc0000000007b2f30
cpu 0x4d: Vector: 300 (Data Access) at [c000003feb18b670]
pc: c0000000007b2f30: kobject_get+0x20/0x70
lr: c0000000007b3c94: kobject_add_internal+0x54/0x3f0
sp: c000003feb18b8f0
msr: 9000000000009033
dar: 48
dsisr: 40000000
current = 0xc000003fd2ed8300
paca = 0xc00000000fbab500 softe: 0 irq_happened: 0x01
pid = 1, comm = swapper/0
Linux version 4.11.0-rc2-svaidy+ (sv@sagarika) (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12) ) #10 SMP Sun Mar 19 00:08:09 IST 2017
enter ? for help
[c000003feb18b960] c0000000007b3c94 kobject_add_internal+0x54/0x3f0
[c000003feb18b9f0] c0000000007b43a4 kobject_init_and_add+0x64/0xa0
[c000003feb18ba70] c000000000e284f4 cpuidle_add_sysfs+0xb4/0x130
[c000003feb18baf0] c000000000e26038 cpuidle_register_device+0x118/0x1c0
[c000003feb18bb30] c000000000e26c48 cpuidle_register+0x78/0x120
[c000003feb18bbc0] c00000000168fd9c powernv_processor_idle_init+0x110/0x1c4
[c000003feb18bc40] c00000000000cff8 do_one_initcall+0x68/0x1d0
[c000003feb18bd00] c0000000016242f4 kernel_init_freeable+0x280/0x360
[c000003feb18bdc0] c00000000000d864 kernel_init+0x24/0x160
[c000003feb18be30] c00000000000b4e8 ret_from_kernel_thread+0x5c/0x74

Validating cpu_dev fixes the crash and reports correct error message like:

[ 30.163506] Failed to register cpuidle device for cpu136
[ 30.173329] Registration of powernv driver failed.

Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
[ rjw: Comment massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue on Jun 6, 2018
[ Upstream commit 2bbea6e ] when mounting an ISO filesystem sometimes (very rarely) the system hangs because of a race condition between two tasks. PID: 6766 TASK: ffff88007b2a6dd0 CPU: 0 COMMAND: "mount" #0 [ffff880078447ae0] __schedule at ffffffff8168d605 MIPS#1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49 MIPS#2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995 MIPS#3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef MIPS#4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod] MIPS#5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50 MIPS#6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3 MIPS#7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs] MIPS#8 [ffff880078447da8] mount_bdev at ffffffff81202570 MIPS#9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs] MIPS#10 [ffff880078447e28] mount_fs at ffffffff81202d09 MIPS#11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f MIPS#12 [ffff880078447ea8] do_mount at ffffffff81220fee MIPS#13 [ffff880078447f28] sys_mount at ffffffff812218d6 MIPS#14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49 RIP: 00007fd9ea914e9a RSP: 00007ffd5d9bf648 RFLAGS: 00010246 RAX: 00000000000000a5 RBX: ffffffff81698c49 RCX: 0000000000000010 RDX: 00007fd9ec2bc210 RSI: 00007fd9ec2bc290 RDI: 00007fd9ec2bcf30 RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000010 R10: 00000000c0ed0001 R11: 0000000000000206 R12: 00007fd9ec2bc040 R13: 00007fd9eb6b2380 R14: 00007fd9ec2bc210 R15: 00007fd9ec2bcf30 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b This task was trying to mount the cdrom. It allocated and configured a super_block struct and owned the write-lock for the super_block->s_umount rwsem. While exclusively owning the s_umount lock, it called sr_block_ioctl and waited to acquire the global sr_mutex lock. 
PID: 6785 TASK: ffff880078720fb0 CPU: 0 COMMAND: "systemd-udevd" #0 [ffff880078417898] __schedule at ffffffff8168d605 #1 [ffff880078417900] schedule at ffffffff8168dc59 #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605 #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838 #4 [ffff8800784179d0] down_read at ffffffff8168cde0 #5 [ffff8800784179e8] get_super at ffffffff81201cc7 #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de #7 [ffff880078417a40] flush_disk at ffffffff8123a94b #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50 #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom] #10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod] #11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86 #12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65 #13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b #14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7 #15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf #16 [ffff880078417d00] do_last at ffffffff8120d53d #17 [ffff880078417db0] path_openat at ffffffff8120e6b2 #18 [ffff880078417e48] do_filp_open at ffffffff8121082b #19 [ffff880078417f18] do_sys_open at ffffffff811fdd33 #20 [ffff880078417f70] sys_open at ffffffff811fde4e #21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49 RIP: 00007f29438b0c20 RSP: 00007ffc76624b78 RFLAGS: 00010246 RAX: 0000000000000002 RBX: ffffffff81698c49 RCX: 0000000000000000 RDX: 00007f2944a5fa70 RSI: 00000000000a0800 RDI: 00007f2944a5fa70 RBP: 00007f2944a5f540 R8: 0000000000000000 R9: 0000000000000020 R10: 00007f2943614c40 R11: 0000000000000246 R12: ffffffff811fde4e R13: ffff880078417f78 R14: 000000000000000c R15: 00007f2944a4b010 ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b This task tried to open the cdrom device, the sr_block_open function acquired the global sr_mutex lock.
The call to check_disk_change() then saw an event flag indicating a possible media change and tried to flush any cached data for the device. As part of the flush, it tried to acquire the super_block->s_umount lock associated with the cdrom device. This was the same super_block as created and locked by the previous task. The first task acquires the s_umount lock and then the sr_mutex_lock; the second task acquires the sr_mutex_lock and then the s_umount lock. This patch fixes the issue by moving check_disk_change() out of cdrom_open() and let the caller take care of it. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
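The lock-order inversion described above can be replayed in user space. The following is a minimal single-threaded sketch in C, not the kernel code: the two `atomic_flag` variables are hypothetical stand-ins for `s_umount` and `sr_mutex`, `replay_deadlocks()` and its helpers are names invented for illustration, and the sketch assumes that, once `check_disk_change()` is moved out of `cdrom_open()`, the caller runs the media check before it takes `sr_mutex`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two kernel locks named in the commit message. */
static atomic_flag s_umount = ATOMIC_FLAG_INIT;
static atomic_flag sr_mutex = ATOMIC_FLAG_INIT;

/* Try to take a lock; returns false where a blocking lock would have to wait. */
static bool try_lock(atomic_flag *l)
{
    return !atomic_flag_test_and_set(l);
}

static void unlock(atomic_flag *l)
{
    atomic_flag_clear(l);
}

/* Probe a lock without keeping it: true means a blocking acquire would wait. */
static bool would_wait(atomic_flag *l)
{
    if (!try_lock(l))
        return true;
    unlock(l);
    return false;
}

/*
 * Replay the interleaving from the two backtraces on a single thread.
 * check_inside_open selects the old shape (media check under sr_mutex)
 * or the assumed new shape (media check before sr_mutex is taken).
 * Returns true if both tasks end up waiting on each other.
 */
bool replay_deadlocks(bool check_inside_open)
{
    bool got, opener_waits, mounter_waits;

    got = try_lock(&s_umount);                 /* mount: owns s_umount */
    assert(got);

    if (check_inside_open) {
        got = try_lock(&sr_mutex);             /* open: takes sr_mutex first */
        assert(got);
        opener_waits = would_wait(&s_umount);  /* open: flush needs s_umount */
        mounter_waits = would_wait(&sr_mutex); /* mount: ioctl needs sr_mutex */
        unlock(&sr_mutex);
    } else {
        opener_waits = would_wait(&s_umount);  /* open: waits, holding nothing */
        mounter_waits = would_wait(&sr_mutex); /* mount: ioctl can proceed */
    }
    (void)got;

    unlock(&s_umount);
    return opener_waits && mounter_waits;
}
```

Calling `replay_deadlocks(true)` reports the cycle, while `replay_deadlocks(false)` does not: the opener still has to wait for `s_umount`, but it no longer holds `sr_mutex` while waiting, so the mounting task can finish and release `s_umount`.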
pcercuei
referenced
this issue
in OpenDingux/linux
Jul 24, 2018
Crash dump shows following instructions crash> bt PID: 0 TASK: ffffffffbe412480 CPU: 0 COMMAND: "swapper/0" #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1 #1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2 #2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c #3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a #4 [ffff891ee00039e0] no_context at ffffffffbd074643 #5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e #6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64 #7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a #8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8 #9 [ffff891ee0003b50] page_fault at ffffffffbda01925 [exception RIP: qlt_schedule_sess_for_deletion+15] RIP: ffffffffc02e526f RSP: ffff891ee0003c08 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0307847 RDX: 00000000000020e6 RSI: ffff891edbc377c8 RDI: 0000000000000000 RBP: ffff891ee0003c18 R8: ffffffffc02f0b20 R9: 0000000000000250 R10: 0000000000000258 R11: 000000000000b780 R12: ffff891ed9b43000 R13: 00000000000000f0 R14: 0000000000000006 R15: ffff891edbc377c8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx] #11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx] #12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx] #13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx] #14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59 #15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02 #16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90 #17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984 #18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5 #19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18 --- <IRQ stack> --- #20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e [exception RIP: unknown or invalid address] RIP: 
000000000000001f RSP: 0000000000000000 RFLAGS: fff3b8c2091ebb3f RAX: ffffbba5a0000200 RBX: 0000be8cdfa8f9fa RCX: 0000000000000018 RDX: 0000000000000101 RSI: 000000000000015d RDI: 0000000000000193 RBP: 0000000000000083 R8: ffffffffbe403e38 R9: 0000000000000002 R10: 0000000000000000 R11: ffffffffbe56b820 R12: ffff891ee001cf00 R13: ffffffffbd11c0a4 R14: ffffffffbe403d60 R15: 0000000000000001 ORIG_RAX: ffff891ee0022ac0 CS: 0000 SS: ffffffffffffffb9 bt: WARNING: possibly bogus exception frame #21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd #22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907 #23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3 #24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42 #25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3 #26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa #27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca #28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675 #29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb #30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5 Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login") Cc: <stable@vger.kernel.org> # v4.17+ Signed-off-by: Chuck Anderson <chuck.anderson@oracle.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pcercuei
referenced
this issue
in OpenDingux/linux
Aug 4, 2018
In the code path where only rcu read lock is held, e.g. in the route lookup code path, it is not safe to directly call fib6_info_hold() because the fib6_info may already have been deleted but still exists in the rcu grace period. Holding reference to it could cause double free and crash the kernel. This patch adds a new function fib6_info_hold_safe() and replace fib6_info_hold() in all necessary places. Syzbot reported 3 crash traces because of this. One of them is: 8021q: adding VLAN 0 to HW filter on device team0 IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready dst_release: dst:(____ptrval____) refcnt:-1 dst_release: dst:(____ptrval____) refcnt:-2 WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline] WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 dst_release: dst:(____ptrval____) refcnt:-1 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 panic+0x238/0x4e7 kernel/panic.c:184 dst_release: dst:(____ptrval____) refcnt:-2 dst_release: dst:(____ptrval____) refcnt:-3 __warn.cold.8+0x163/0x1ba kernel/panic.c:536 dst_release: dst:(____ptrval____) refcnt:-4 report_bug+0x252/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296 dst_release: dst:(____ptrval____) refcnt:-5 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992 RIP: 0010:dst_hold include/net/dst.h:239 [inline] RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 
8b 85 d0 fe ff ff 48 8d 78 RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293 RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6 RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005 RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338 R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720 R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980 ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768 udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 ___sys_sendmsg+0x51d/0x930 net/socket.c:2125 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220 __do_sys_sendmmsg net/socket.c:2249 [inline] __se_sys_sendmmsg net/socket.c:2246 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446ba9 Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9 RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003 RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843 R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. 
Fixes: 93531c6 ("net/ipv6: separate handling of FIB entries from dst based routes") Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
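The difference between an unconditional hold and a "safe" hold can be sketched in user space with C11 atomics. This is an illustrative sketch, not the kernel implementation: `struct obj` and the `obj_*` helpers are hypothetical names standing in for a reference-counted object such as `fib6_info`, and the compare-and-swap loop is just one common way to get increment-if-not-zero behaviour.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object. */
struct obj {
    atomic_int ref;
};

/*
 * Unconditional hold: only valid when the caller already owns a reference,
 * so the count cannot have reached zero underneath it.
 */
void obj_hold(struct obj *o)
{
    int old = atomic_fetch_add(&o->ref, 1);
    assert(old > 0); /* holding a dead object is the bug described above */
    (void)old;
}

/*
 * Conditional hold: take a reference only if the count is still nonzero.
 * A reader that found the object under a read-side lock must use this,
 * because the last reference may already have been dropped.
 */
bool obj_hold_safe(struct obj *o)
{
    int old = atomic_load(&o->ref);

    while (old != 0) {
        /* On failure the CAS reloads 'old' with the current count. */
        if (atomic_compare_exchange_weak(&o->ref, &old, old + 1))
            return true;
    }
    return false; /* object is already on its way to being freed */
}

/* Drop a reference; returns true when the caller dropped the last one. */
bool obj_release(struct obj *o)
{
    return atomic_fetch_sub(&o->ref, 1) == 1;
}
```

A lookup path would then treat a `false` return from `obj_hold_safe()` as "entry not found" instead of using the object, which avoids resurrecting a count that has already hit zero and later freeing the object twice.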
nemunaire
pushed a commit
to nemunaire/CI20_linux
that referenced
this issue
Aug 17, 2018
Before this patch, using multiple active endpoints would not be possible and would actually be canceling each other out. The issue was discovered on Android when combining adb, mtp and ptp configurations together. This patch introduces proper behaviour for these cases. Also, during the boot-up the following warning is no longer shown: [ 2.879328] ------------[ cut here ]------------ [ 2.883983] WARNING: CPU: 0 PID: 1 at drivers/usb/dwc2/gadget.c:212 s3c_hsotg_init_fifo+0x168/0x1d0() [ 2.893204] insufficient fifo memory [ 2.896602] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.18.3+ #10 [ 2.904004] Stack : 00000000 800919a0 00000000 00000004 00000006 800913f4 00000000 00000000 00000000 00000000 80f75a12 00000042 80f75a12 00000042 00000006 00000000 80e42767 80d7c2e 00000001 00000000 80f73574 8bc90418 80ea0000 01000d00 80f06704 80b24c00 00000000 80035388 00000006 00000000 80d834a4 8bc99b04 8bc99b04 80e40000 00000000 00000000 00000000 00000000 00000000 00000000 ... [ 2.939709] Call Trace: [ 2.942174] [<8001bab0>] show_stack+0xd4/0xf0 [ 2.946528] [<80b26c40>] dump_stack+0x70/0xbc [ 2.950880] [<800356bc>] warn_slowpath_common+0x90/0xe8 [ 2.956116] [<80035808>] warn_slowpath_fmt+0x3c/0x48 [ 2.961075] [<8069b824>] s3c_hsotg_init_fifo+0x168/0x1d0 [ 2.966398] [<8069d8fc>] s3c_hsotg_init+0x50/0x9c [ 2.971095] [<806a0388>] dwc2_gadget_init+0x430/0x8c0 [ 2.976158] [<806a0df0>] dwc2_driver_probe+0x218/0x2a8 [ 2.981291] [<805b935c>] platform_drv_probe+0x64/0x120 [ 2.986440] [<805b783c>] really_probe+0xa0/0x278 [ 2.991050] [<805b7c78>] driver_probe_device+0x48/0x78 [ 2.996197] [<805b7d74>] __driver_attach+0xcc/0xd4 [ 3.000980] [<805b5b7c>] bus_for_each_dev+0x7c/0xc4 [ 3.005874] [<805b64f8>] bus_add_driver+0x180/0x240 [ 3.010743] [<805b8428>] driver_register+0xac/0x154 [ 3.015633] [<80ea9e04>] do_one_initcall+0x150/0x1f4 [ 3.020589] [<80eaa080>] kernel_init_freeable+0x1d8/0x298 [ 3.025998] [<80b23c5c>] kernel_init+0x28/0x158 [ 3.030522] [<800153ec>] 
ret_from_kernel_thread+0x14/0x1c [ 3.035926] [ 3.037412] ---[ end trace cb88537fdc8fa201 ]--- And during configuration transitions (e.g. adb -> mtp,adb) the following warning is no longer shown: [ 311.726159] -----------[ cut here ]----------- [ 311.730817] WARNING: CPU: 0 PID: 0 at drivers/usb/dwc2/gadget.c:1475 s3c_hsotg_rx_data+0x130/0x13c() [ 311.739931] Modules linked in: [ 311.742993] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.3+ #45 [ 311.750199] Stack : 00000000 80080370 00000000 00000004 00000006 00000000 00000000 00000000 00000000 00000000 80f05b02 00000042 80d61010 80e18e20 80d60000 8b408010 80e18927 80d0df6c 00000000 00000000 80f03614 80e18e20 80d60000 8b408010 00250182 80a54f54 80e20cc4 80e20cc8 00000000 00000000 80d14ab8 80dfbacc 80dfbacc 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ... [ 311.785841] Call Trace: [ 311.788292] [<8001ac28>] show_stack+0xc4/0xe0 [ 311.792650] [<80a56e58>] dump_stack+0x70/0xbc [ 311.797008] [<80033c14>] warn_slowpath_common+0x88/0xb8 [ 311.802224] [<80033cc8>] warn_slowpath_null+0x18/0x24 [ 311.807266] [<80606a3c>] s3c_hsotg_rx_data+0x130/0x13c [ 311.812397] [<8060afa4>] s3c_hsotg_irq+0x3b4/0x5e8 [ 311.817183] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0 [ 311.822745] [<80082d4c>] handle_irq_event+0x54/0x98 [ 311.827617] [<80086390>] handle_level_irq+0xe0/0x1c0 [ 311.832572] [<800820bc>] generic_handle_irq+0x3c/0x54 [ 311.837622] [<804bb680>] jz4740_cascade+0x78/0xac [ 311.842317] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0 [ 311.847881] [<80086d18>] handle_percpu_irq+0x8c/0xbc [ 311.852835] [<800820bc>] generic_handle_irq+0x3c/0x54 [ 311.857878] [<80016c8c>] do_IRQ+0x18/0x2c [ 311.861879] [<80014c40>] ret_from_irq+0x0/0x4 [ 311.866227] [<80016b20>] mips_cpuidle_wait_enter+0x14/0x34 [ 311.871713] [<806d37b0>] cpuidle_enter_state+0x88/0x2c0 [ 311.876934] [<80074308>] cpu_startup_entry+0x36c/0x484 [ 311.882074] [<80e7dc04>] start_kernel+0x4b8/0x4e0 [ 311.886767] [ 311.888253] --[ end trace 
dd7a60dcc5530db3 ]-- Change-Id: Ic8ac37a28913d4314371de0cd446f8a7cc45864d Signed-off-by: Dragan Cecavac <dragan.cecavac@imgtec.com>
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Sep 11, 2018
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Sep 26, 2018
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Oct 14, 2018
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Nov 23, 2018
[ Upstream commit 4fde620 ] At f_audio_free_inst, it tries to access struct gaudio *card which is freed at f_audio_free, it causes below oops if the audio device is not there (do unload module may trigger the same problem). The gaudio_cleanup is related to function, so it is better move to f_audio_free. root@freescale ~$ modprobe g_audio [ 751.968931] g_audio gadget: unable to open sound control device file: /dev/snd/controlC0 [ 751.977134] g_audio gadget: we need at least one control device [ 751.988633] Unable to handle kernel paging request at virtual address 455f448e [ 751.995963] pgd = bd42c00 [ 751.998681] [455f448e] *pgd=00000000 [ 752.002383] Internal error: Oops: 5 [#1] SMP ARM [ 752.007008] Modules linked in: usb_f_uac1 g_audio(+) usb_f_mass_storage libcomposite configfs [last unloaded: g_mass_storage] [ 752.018427] CPU: 0 PID: 692 Comm: modprobe Not tainted 3.18.0-rc4-00345-g842f57b #10 [ 752.026176] task: bdb3ba80 ti: bd41a000 task.ti: bd41a000 [ 752.031590] PC is at filp_close+0xc/0x84 [ 752.035530] LR is at gaudio_cleanup+0x28/0x54 [usb_f_uac1] [ 752.041023] pc : [<800ec94c>] lr : [<7f03c63c>] psr: 20000013 [ 752.041023] sp : bd41bcc8 ip : bd41bce8 fp : bd41bce4 [ 752.052504] r10: 7f036234 r9 : 7f036220 r8 : 7f036500 [ 752.057732] r7 : bd456480 r6 : 7f036500 r5 : 7f03626c r4 : bd441000 [ 752.064264] r3 : 7f03b3dc r2 : 7f03cab0 r1 : 00000000 r0 : 455f4456 [ 752.070798] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [ 752.077938] Control: 10c5387d Table: bd42c04a DAC: 00000015 [ 752.083688] Process modprobe (pid: 692, stack limit = 0xbd41a240) [ 752.089786] Stack: (0xbd41bcc8 to 0xbd41c000) [ 752.094152] bcc0: 7f03b3dc bd441000 7f03626c 7f036500 bd41bcfc bd41bce8 [ 752.102337] bce0: 7f03c63c 800ec94c 7f03b3dc bdaa6b00 bd41bd14 bd41bd00 7f03b3f4 7f03c620 [ 752.110521] bd00: 7f03b3dc 7f03cbd4 bd41bd2c bd41bd18 7f00f88c 7f03b3e8 00000000 fffffffe [ 752.118705] bd20: bd41bd5c bd41bd30 7f0380d8 7f00f874 7f038000 bd456480 7f036364 
be392240 [ 752.126889] bd40: 00000000 7f00f620 7f00f638 bd41a008 bd41bd94 bd41bd60 7f00f6d4 7f03800c [ 752.135073] bd60: 00000001 00000000 8047438c be3a4000 7f036364 7f036364 7f00db28 7f00f620 [ 752.143257] bd80: 7f00f638 bd41a008 bd41bdb4 bd41bd98 804742ac 7f00f644 00000000 809adde0 [ 752.151442] bda0: 7f036364 7f036364 bd41bdcc bd41bdb8 804743c8 80474284 7f03633c 7f036200 [ 752.159626] bdc0: bd41bdf4 bd41bdd0 7f00d5b4 8047435c bd41a000 80974060 7f038158 00000000 [ 752.167811] bde0: 80974060 bdaa9940 bd41be04 bd41bdf8 7f03816c 7f00d518 bd41be8c bd41be08 [ 752.175995] be00: 80008a5c 7f038164 be001f00 7f0363c4 bd41bf48 00000000 bd41be54 bd41be28 [ 752.184179] be20: 800e9498 800e8e74 00000002 00000003 bd4129c0 c0a07000 00000001 7f0363c4 [ 752.192363] be40: bd41bf48 00000000 bd41be74 bd41be58 800de780 800e9320 bd41a000 7f0363d0 [ 752.200547] be60: 00000000 bd41a000 7f0363d0 00000000 bd41beec 7f0363c4 bd41bf48 00000000 [ 752.208731] be80: bd41bf44 bd41be90 80093e5 800089e0 ffff8000 00007fff 80091390 0000065f [ 752.216915] bea0: 00000000 c0a0834c bd41bf7c 00000086 bd41bf50 00000000 7f03651c 00000086 [ 752.225099] bec0: bd41a010 00c28758 800ddcc4 800ddae0 000000d2 bd412a00 bd41bf24 00000000 [ 752.233283] bee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 752.241467] bf00: 00000000 00000000 00000000 00000000 00000000 00000000 bd41bf44 000025b0 [ 752.249651] bf20: 00c28a08 00c28758 00000080 8000edc4 bd41a000 00000000 bd41bfa4 bd41bf48 [ 752.257835] bf40: 800943e4 800932ec c0a07000 000025b0 c0a07f8c c0a07ea4 c0a08e5c 0000051c [ 752.266019] bf60: 0000088c 00000000 00000000 00000000 00000018 00000019 00000010 0000000b [ 752.274203] bf80: 00000009 00000000 00000000 000025b0 00000000 00c28758 00000000 bd41bfa8 [ 752.282387] bfa0: 8000ec00 8009430c 000025b0 00000000 00c28a08 000025b0 00c28758 00c28980 [ 752.290571] bfc0: 000025b0 00000000 00c28758 00000080 000a6a78 00000007 00c28718 00c28980 [ 752.298756] bfe0: 7ebc1af0 7ebc1ae0 0001a32c 76e9c490 
60000010 00c28a08 22013510 ecebffff [ 752.306933] Backtrace: [ 752.309414] [<800ec940>] (filp_close) from [<7f03c63c>] (gaudio_cleanup+0x28/0x54 [usb_f_uac1]) [ 752.318115] r6:7f036500 r5:7f03626c r4:bd441000 r3:7f03b3dc [ 752.323851] [<7f03c614>] (gaudio_cleanup [usb_f_uac1]) from [<7f03b3f4>] (f_audio_free_inst+0x18/0x68 [usb_f_uac1]) [ 752.334288] r4:bdaa6b00 r3:7f03b3dc [ 752.337931] [<7f03b3dc>] (f_audio_free_inst [usb_f_uac1]) from [<7f00f88c>] (usb_put_function_instance+0x24/0x30 [libcomposite]) [ 752.349498] r4:7f03cbd4 r3:7f03b3dc [ 752.353127] [<7f00f868>] (usb_put_function_instance [libcomposite]) from [<7f0380d8>] (audio_bind+0xd8/0xfc [g_audio]) [ 752.363824] r4:fffffffe r3:00000000 [ 752.367456] [<7f038000>] (audio_bind [g_audio]) from [<7f00f6d4>] (composite_bind+0x9c/0x1e8 [libcomposite]) [ 752.377284] r10:bd41a008 r9:7f00f638 r8:7f00f620 r7:00000000 r6:be392240 r5:7f036364 [ 752.385193] r4:bd456480 r3:7f038000 [ 752.388825] [<7f00f638>] (composite_bind [libcomposite]) from [<804742ac>] (udc_bind_to_driver+0x34/0xd8) [ 752.398394] r10:bd41a008 r9:7f00f638 r8:7f00f620 r7:7f00db28 r6:7f036364 r5:7f036364 [ 752.406302] r4:be3a4000 [ 752.408860] [<80474278>] (udc_bind_to_driver) from [<804743c8>] (usb_gadget_probe_driver+0x78/0xa8) [ 752.417908] r6:7f036364 r5:7f036364 r4:809adde0 r3:00000000 [ 752.423649] [<80474350>] (usb_gadget_probe_driver) from [<7f00d5b4>] (usb_composite_probe+0xa8/0xd4 [libcomposite]) [ 752.434086] r5:7f036200 r4:7f03633c [ 752.437713] [<7f00d50c>] (usb_composite_probe [libcomposite]) from [<7f03816c>] (audio_driver_init+0x14/0x1c [g_audio]) [ 752.448498] r9:bdaa9940 r8:80974060 r7:00000000 r6:7f038158 r5:80974060 r4:bd41a000 [ 752.456330] [<7f038158>] (audio_driver_init [g_audio]) from [<80008a5c>] (do_one_initcall+0x88/0x1d4) [ 752.465564] [<800089d4>] (do_one_initcall) from [<80093e54>] (load_module+0xb74/0x1020) [ 752.473571] r10:00000000 r9:bd41bf48 r8:7f0363c4 r7:bd41beec r6:00000000 r5:7f0363d0 [ 752.481478] r4:bd41a000 [ 
752.484037] [<800932e0>] (load_module) from [<800943e4>] (SyS_init_module+0xe4/0xf8) [ 752.491781] r10:00000000 r9:bd41a000 r8:8000edc4 r7:00000080 r6:00c28758 r5:00c28a08 [ 752.499689] r4:000025b0 [ 752.502252] [<80094300>] (SyS_init_module) from [<8000ec00>] (ret_fast_syscall+0x0/0x48) [ 752.510345] r6:00c28758 r5:00000000 r4:000025b0 [ 752.515013] Code: 808475b4 e1a0c00d e92dd878 e24cb004 (e5904038) [ 752.521223] ---[ end trace 70babe34de4ab99b ]--- Segmentation fault Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Nov 23, 2018
Before this patch, multiple active endpoints could not be used together: they would cancel each other out. The issue was discovered on Android when combining the adb, mtp and ptp configurations. This patch introduces proper behaviour for these cases. Also, the following warning is no longer shown during boot-up:
[ 2.879328] ------------[ cut here ]------------
[ 2.883983] WARNING: CPU: 0 PID: 1 at drivers/usb/dwc2/gadget.c:212 s3c_hsotg_init_fifo+0x168/0x1d0()
[ 2.893204] insufficient fifo memory
[ 2.896602] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.18.3+ #10
[ 2.904004] Stack : 00000000 800919a0 00000000 00000004 00000006 800913f4 00000000 00000000 00000000 00000000 80f75a12 00000042 80f75a12 00000042 00000006 00000000 80e42767 80d7c2e 00000001 00000000 80f73574 8bc90418 80ea0000 01000d00 80f06704 80b24c00 00000000 80035388 00000006 00000000 80d834a4 8bc99b04 8bc99b04 80e40000 00000000 00000000 00000000 00000000 00000000 00000000 ...
[ 2.939709] Call Trace:
[ 2.942174] [<8001bab0>] show_stack+0xd4/0xf0
[ 2.946528] [<80b26c40>] dump_stack+0x70/0xbc
[ 2.950880] [<800356bc>] warn_slowpath_common+0x90/0xe8
[ 2.956116] [<80035808>] warn_slowpath_fmt+0x3c/0x48
[ 2.961075] [<8069b824>] s3c_hsotg_init_fifo+0x168/0x1d0
[ 2.966398] [<8069d8fc>] s3c_hsotg_init+0x50/0x9c
[ 2.971095] [<806a0388>] dwc2_gadget_init+0x430/0x8c0
[ 2.976158] [<806a0df0>] dwc2_driver_probe+0x218/0x2a8
[ 2.981291] [<805b935c>] platform_drv_probe+0x64/0x120
[ 2.986440] [<805b783c>] really_probe+0xa0/0x278
[ 2.991050] [<805b7c78>] driver_probe_device+0x48/0x78
[ 2.996197] [<805b7d74>] __driver_attach+0xcc/0xd4
[ 3.000980] [<805b5b7c>] bus_for_each_dev+0x7c/0xc4
[ 3.005874] [<805b64f8>] bus_add_driver+0x180/0x240
[ 3.010743] [<805b8428>] driver_register+0xac/0x154
[ 3.015633] [<80ea9e04>] do_one_initcall+0x150/0x1f4
[ 3.020589] [<80eaa080>] kernel_init_freeable+0x1d8/0x298
[ 3.025998] [<80b23c5c>] kernel_init+0x28/0x158
[ 3.030522] [<800153ec>] ret_from_kernel_thread+0x14/0x1c
[ 3.035926]
[ 3.037412] ---[ end trace cb88537fdc8fa201 ]---
And during configuration transitions (e.g. adb -> mtp,adb) the following warning is no longer shown:
[ 311.726159] -----------[ cut here ]-----------
[ 311.730817] WARNING: CPU: 0 PID: 0 at drivers/usb/dwc2/gadget.c:1475 s3c_hsotg_rx_data+0x130/0x13c()
[ 311.739931] Modules linked in:
[ 311.742993] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.3+ #45
[ 311.750199] Stack : 00000000 80080370 00000000 00000004 00000006 00000000 00000000 00000000 00000000 00000000 80f05b02 00000042 80d61010 80e18e20 80d60000 8b408010 80e18927 80d0df6c 00000000 00000000 80f03614 80e18e20 80d60000 8b408010 00250182 80a54f54 80e20cc4 80e20cc8 00000000 00000000 80d14ab8 80dfbacc 80dfbacc 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ...
[ 311.785841] Call Trace:
[ 311.788292] [<8001ac28>] show_stack+0xc4/0xe0
[ 311.792650] [<80a56e58>] dump_stack+0x70/0xbc
[ 311.797008] [<80033c14>] warn_slowpath_common+0x88/0xb8
[ 311.802224] [<80033cc8>] warn_slowpath_null+0x18/0x24
[ 311.807266] [<80606a3c>] s3c_hsotg_rx_data+0x130/0x13c
[ 311.812397] [<8060afa4>] s3c_hsotg_irq+0x3b4/0x5e8
[ 311.817183] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0
[ 311.822745] [<80082d4c>] handle_irq_event+0x54/0x98
[ 311.827617] [<80086390>] handle_level_irq+0xe0/0x1c0
[ 311.832572] [<800820bc>] generic_handle_irq+0x3c/0x54
[ 311.837622] [<804bb680>] jz4740_cascade+0x78/0xac
[ 311.842317] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0
[ 311.847881] [<80086d18>] handle_percpu_irq+0x8c/0xbc
[ 311.852835] [<800820bc>] generic_handle_irq+0x3c/0x54
[ 311.857878] [<80016c8c>] do_IRQ+0x18/0x2c
[ 311.861879] [<80014c40>] ret_from_irq+0x0/0x4
[ 311.866227] [<80016b20>] mips_cpuidle_wait_enter+0x14/0x34
[ 311.871713] [<806d37b0>] cpuidle_enter_state+0x88/0x2c0
[ 311.876934] [<80074308>] cpu_startup_entry+0x36c/0x484
[ 311.882074] [<80e7dc04>] start_kernel+0x4b8/0x4e0
[ 311.886767]
[ 311.888253] --[ end trace dd7a60dcc5530db3 ]--
Change-Id: Ic8ac37a28913d4314371de0cd446f8a7cc45864d
Signed-off-by: Dragan Cecavac <dragan.cecavac@imgtec.com>
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Nov 28, 2018
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Dec 11, 2018
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Jan 1, 2019
[ Upstream commit c5a94f4 ]
It was observed that a process blocked indefinitely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normally prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page() was entered.
When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared.
There is some uncertainty in this analysis, but it seems to fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting.
The customer who reported the hang also reports that the hang cannot be reproduced with this fix.
The backtrace for the blocked process looked like:
PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh"
#0 [ffff881ff43efbf8] schedule at ffffffff815e56f1
#1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed
#2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8
#3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e
#4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache]
#5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache]
#6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs]
#7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs]
#8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73
#9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs]
#10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756
#11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa
#12 [ffff881ff43eff18] sys_read at ffffffff811fda62
#13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
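The race described in this commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel code: the `Cookie` class, its `looking_up` flag, the `waiting` test hook, and the `run()` helper are all hypothetical stand-ins for the cookie, FSCACHE_COOKIE_LOOKING_UP, and the drop path, and a timeout stands in for the indefinite hang.

```python
import threading


class Cookie:
    """Toy stand-in for an fscache cookie (hypothetical, not kernel code)."""

    def __init__(self):
        self.cond = threading.Condition()
        self.looking_up = False            # models FSCACHE_COOKIE_LOOKING_UP
        self.backing_objects = ["obj"]     # models ->backing_objects
        self.waiting = threading.Event()   # test hook: reader is about to sleep

    def read_or_alloc_page(self, timeout):
        with self.cond:
            # First check: nothing to read from, so never start waiting.
            if not self.backing_objects:
                return "nobufs"
            self.waiting.set()
            # Sleep until the lookup flag is cleared and someone wakes us.
            if not self.cond.wait_for(lambda: not self.looking_up, timeout):
                return "hung"              # the timeout models the hang
            # Second check after waking, as the commit message describes.
            if not self.backing_objects:
                return "nobufs"
            return "ok"

    def drop_object(self, wake):
        with self.cond:
            self.backing_objects.clear()
            if wake:
                # Modelled fix: clear the flag and wake any waiter.
                self.looking_up = False
                self.cond.notify_all()


def run(wake, timeout=0.3):
    cookie = Cookie()
    cookie.looking_up = True               # flag was set inside the window
    results = []
    reader = threading.Thread(
        target=lambda: results.append(cookie.read_or_alloc_page(timeout)))
    reader.start()
    cookie.waiting.wait()                  # reader passed the first check
    cookie.drop_object(wake)               # runs once the reader is asleep
    reader.join()
    return results[0]
```

Calling `run(wake=False)` models the reported hang (the reader is never woken), while `run(wake=True)` lets the reader wake up, re-check the now-empty list, and give up cleanly.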
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Jan 1, 2019
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Jan 13, 2019
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Feb 1, 2019
pcercuei
referenced
this issue
in OpenDingux/linux
Feb 3, 2019
syzbot found that ax25 routes were not properly protected against concurrent use [1]. In this particular report the bug happened while copying ax25->digipeat. Fix this problem by making sure we call ax25_get_route() while ax25_route_lock is held, so that no modification could happen while using the route. The current two ax25_get_route() callers do not sleep, so this change should be fine. Once we do that, ax25_get_route() no longer needs to grab a reference on the found route. [1] ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline] BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113 Read of size 66 at addr ffff888066641a80 by task syz-executor2/531 ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x24/0x50 mm/kasan/common.c:130 memcpy include/linux/string.h:352 [inline] kmemdup+0x42/0x60 mm/util.c:113 kmemdup include/linux/string.h:425 [inline] ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224 __sys_connect+0x357/0x490 net/socket.c:1664 __do_sys_connect net/socket.c:1675 [inline] __se_sys_connect net/socket.c:1672 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1672 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458099 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c
24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099 RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4 R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff Allocated by task 526: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc mm/kasan/common.c:496 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504 ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609 kmalloc include/linux/slab.h:545 [inline] ax25_rt_add net/ax25/ax25_route.c:95 [inline] ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de Freed by task 550: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466 __cache_free mm/slab.c:3487 [inline] kfree+0xcf/0x230 mm/slab.c:3806 ax25_rt_add net/ax25/ax25_route.c:92 [inline] ax25_rt_ioctl+0x304/0x1270 
net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888066641a80 which belongs to the cache kmalloc-96 of size 96 The buggy address is located 0 bytes inside of 96-byte region [ffff888066641a80, ffff888066641ae0) The buggy address belongs to the page: page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0 flags: 0x1fffc0000000200(slab) ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0 raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ralf Baechle <ralf@linux-mips.org> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
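The locking rule described in this commit message (look the route up and use it inside one critical section, so no reference count is needed) can be sketched in userspace C. This is only an illustration under stated assumptions: `struct route`, `set_route()`, `copy_digipeat()` and the `atomic_flag` spinlock are hypothetical stand-ins, not the kernel's ax25 code or its `ax25_route_lock`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; these are not the kernel's names. */
struct route {
    struct route *next;
    int dest;
    char *digipeat;              /* heap data a writer may free and replace */
};

static atomic_flag route_lock = ATOMIC_FLAG_INIT;   /* minimal spinlock */
static struct route *route_list;

static void lock_routes(void)   { while (atomic_flag_test_and_set(&route_lock)) { } }
static void unlock_routes(void) { atomic_flag_clear(&route_lock); }

/* Caller must hold route_lock; the result is valid only while it is held. */
static struct route *find_route_locked(int dest)
{
    struct route *rt;

    for (rt = route_list; rt != NULL; rt = rt->next)
        if (rt->dest == dest)
            return rt;
    return NULL;
}

/* Writer: create or update a route, freeing the old digipeat data. */
int set_route(int dest, const char *digi)
{
    struct route *rt;
    char *copy;

    assert(digi != NULL);
    copy = malloc(strlen(digi) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, digi);

    lock_routes();
    rt = find_route_locked(dest);
    if (rt == NULL) {
        rt = calloc(1, sizeof(*rt));
        if (rt == NULL) {
            unlock_routes();
            free(copy);
            return -1;
        }
        rt->dest = dest;
        rt->next = route_list;
        route_list = rt;
    }
    free(rt->digipeat);          /* safe: readers only touch it under the lock */
    rt->digipeat = copy;
    unlock_routes();
    return 0;
}

/* Reader: look up and copy inside one critical section.
 * Returns a private copy the caller must free, or NULL if there is no route. */
char *copy_digipeat(int dest)
{
    struct route *rt;
    char *out = NULL;

    lock_routes();
    rt = find_route_locked(dest);
    if (rt != NULL && rt->digipeat != NULL) {
        out = malloc(strlen(rt->digipeat) + 1);
        if (out != NULL)
            strcpy(out, rt->digipeat);
    }
    unlock_routes();
    return out;
}
```

Because the reader never holds a route pointer after releasing the lock, a concurrent `set_route()` cannot free data that is still being copied. The trade-off, as the message notes for the real callers, is that the code inside the critical section must not sleep.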
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue on Feb 12, 2019
Before this patch, multiple active endpoints could not be used together; they would cancel each other out. The issue was discovered on Android when combining the adb, mtp and ptp configurations. This patch introduces proper behaviour for these cases. Also, during boot-up the following warning is no longer shown: [ 2.879328] ------------[ cut here ]------------ [ 2.883983] WARNING: CPU: 0 PID: 1 at drivers/usb/dwc2/gadget.c:212 s3c_hsotg_init_fifo+0x168/0x1d0() [ 2.893204] insufficient fifo memory [ 2.896602] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.18.3+ #10 [ 2.904004] Stack : 00000000 800919a0 00000000 00000004 00000006 800913f4 00000000 00000000 00000000 00000000 80f75a12 00000042 80f75a12 00000042 00000006 00000000 80e42767 80d7c2e 00000001 00000000 80f73574 8bc90418 80ea0000 01000d00 80f06704 80b24c00 00000000 80035388 00000006 00000000 80d834a4 8bc99b04 8bc99b04 80e40000 00000000 00000000 00000000 00000000 00000000 00000000 ... [ 2.939709] Call Trace: [ 2.942174] [<8001bab0>] show_stack+0xd4/0xf0 [ 2.946528] [<80b26c40>] dump_stack+0x70/0xbc [ 2.950880] [<800356bc>] warn_slowpath_common+0x90/0xe8 [ 2.956116] [<80035808>] warn_slowpath_fmt+0x3c/0x48 [ 2.961075] [<8069b824>] s3c_hsotg_init_fifo+0x168/0x1d0 [ 2.966398] [<8069d8fc>] s3c_hsotg_init+0x50/0x9c [ 2.971095] [<806a0388>] dwc2_gadget_init+0x430/0x8c0 [ 2.976158] [<806a0df0>] dwc2_driver_probe+0x218/0x2a8 [ 2.981291] [<805b935c>] platform_drv_probe+0x64/0x120 [ 2.986440] [<805b783c>] really_probe+0xa0/0x278 [ 2.991050] [<805b7c78>] driver_probe_device+0x48/0x78 [ 2.996197] [<805b7d74>] __driver_attach+0xcc/0xd4 [ 3.000980] [<805b5b7c>] bus_for_each_dev+0x7c/0xc4 [ 3.005874] [<805b64f8>] bus_add_driver+0x180/0x240 [ 3.010743] [<805b8428>] driver_register+0xac/0x154 [ 3.015633] [<80ea9e04>] do_one_initcall+0x150/0x1f4 [ 3.020589] [<80eaa080>] kernel_init_freeable+0x1d8/0x298 [ 3.025998] [<80b23c5c>] kernel_init+0x28/0x158 [ 3.030522] [<800153ec>] 
ret_from_kernel_thread+0x14/0x1c [ 3.035926] [ 3.037412] ---[ end trace cb88537fdc8fa201 ]--- And during configuration transitions (e.g. adb -> mtp,adb) the following warning is no longer shown: [ 311.726159] -----------[ cut here ]----------- [ 311.730817] WARNING: CPU: 0 PID: 0 at drivers/usb/dwc2/gadget.c:1475 s3c_hsotg_rx_data+0x130/0x13c() [ 311.739931] Modules linked in: [ 311.742993] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.3+ #45 [ 311.750199] Stack : 00000000 80080370 00000000 00000004 00000006 00000000 00000000 00000000 00000000 00000000 80f05b02 00000042 80d61010 80e18e20 80d60000 8b408010 80e18927 80d0df6c 00000000 00000000 80f03614 80e18e20 80d60000 8b408010 00250182 80a54f54 80e20cc4 80e20cc8 00000000 00000000 80d14ab8 80dfbacc 80dfbacc 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ... [ 311.785841] Call Trace: [ 311.788292] [<8001ac28>] show_stack+0xc4/0xe0 [ 311.792650] [<80a56e58>] dump_stack+0x70/0xbc [ 311.797008] [<80033c14>] warn_slowpath_common+0x88/0xb8 [ 311.802224] [<80033cc8>] warn_slowpath_null+0x18/0x24 [ 311.807266] [<80606a3c>] s3c_hsotg_rx_data+0x130/0x13c [ 311.812397] [<8060afa4>] s3c_hsotg_irq+0x3b4/0x5e8 [ 311.817183] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0 [ 311.822745] [<80082d4c>] handle_irq_event+0x54/0x98 [ 311.827617] [<80086390>] handle_level_irq+0xe0/0x1c0 [ 311.832572] [<800820bc>] generic_handle_irq+0x3c/0x54 [ 311.837622] [<804bb680>] jz4740_cascade+0x78/0xac [ 311.842317] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0 [ 311.847881] [<80086d18>] handle_percpu_irq+0x8c/0xbc [ 311.852835] [<800820bc>] generic_handle_irq+0x3c/0x54 [ 311.857878] [<80016c8c>] do_IRQ+0x18/0x2c [ 311.861879] [<80014c40>] ret_from_irq+0x0/0x4 [ 311.866227] [<80016b20>] mips_cpuidle_wait_enter+0x14/0x34 [ 311.871713] [<806d37b0>] cpuidle_enter_state+0x88/0x2c0 [ 311.876934] [<80074308>] cpu_startup_entry+0x36c/0x484 [ 311.882074] [<80e7dc04>] start_kernel+0x4b8/0x4e0 [ 311.886767] [ 311.888253] --[ end trace 
dd7a60dcc5530db3 ]-- Change-Id: Ic8ac37a28913d4314371de0cd446f8a7cc45864d Signed-off-by: Dragan Cecavac <dragan.cecavac@imgtec.com>
pcercuei referenced this issue in OpenDingux/linux on Feb 25, 2019
commit 63530ab upstream. syzbot found that ax25 routes were not properly protected against concurrent use [1]. In this particular report the bug happened while copying ax25->digipeat. Fix this problem by making sure we call ax25_get_route() while ax25_route_lock is held, so that no modification could happen while using the route. The current two ax25_get_route() callers do not sleep, so this change should be fine. Once we do that, ax25_get_route() no longer needs to grab a reference on the found route. [1] ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline] BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113 Read of size 66 at addr ffff888066641a80 by task syz-executor2/531 ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x24/0x50 mm/kasan/common.c:130 memcpy include/linux/string.h:352 [inline] kmemdup+0x42/0x60 mm/util.c:113 kmemdup include/linux/string.h:425 [inline] ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224 __sys_connect+0x357/0x490 net/socket.c:1664 __do_sys_connect net/socket.c:1675 [inline] __se_sys_connect net/socket.c:1672 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1672 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458099 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 
4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099 RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4 R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff Allocated by task 526: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc mm/kasan/common.c:496 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504 ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609 kmalloc include/linux/slab.h:545 [inline] ax25_rt_add net/ax25/ax25_route.c:95 [inline] ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de Freed by task 550: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466 __cache_free mm/slab.c:3487 [inline] kfree+0xcf/0x230 mm/slab.c:3806 ax25_rt_add net/ax25/ax25_route.c:92 [inline] 
ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888066641a80 which belongs to the cache kmalloc-96 of size 96 The buggy address is located 0 bytes inside of 96-byte region [ffff888066641a80, ffff888066641ae0) The buggy address belongs to the page: page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0 flags: 0x1fffc0000000200(slab) ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0 raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ralf Baechle <ralf@linux-mips.org> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pcercuei referenced this issue in OpenDingux/linux on May 6, 2019
There is a KASAN slab-out-of-bounds: BUG: KASAN: slab-out-of-bounds in _copy_from_iter_full+0x783/0xaa0 Read of size 80 at addr ffff88810c35e180 by task mount.cifs/539 CPU: 1 PID: 539 Comm: mount.cifs Not tainted 4.19 #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0xdd/0x12a print_address_description+0xa7/0x540 kasan_report+0x1ff/0x550 check_memory_region+0x2f1/0x310 memcpy+0x2f/0x80 _copy_from_iter_full+0x783/0xaa0 tcp_sendmsg_locked+0x1840/0x4140 tcp_sendmsg+0x37/0x60 inet_sendmsg+0x18c/0x490 sock_sendmsg+0xae/0x130 smb_send_kvec+0x29c/0x520 __smb_send_rqst+0x3ef/0xc60 smb_send_rqst+0x25a/0x2e0 compound_send_recv+0x9e8/0x2af0 cifs_send_recv+0x24/0x30 SMB2_open+0x35e/0x1620 open_shroot+0x27b/0x490 smb2_open_op_close+0x4e1/0x590 smb2_query_path_info+0x2ac/0x650 cifs_get_inode_info+0x1058/0x28f0 cifs_root_iget+0x3bb/0xf80 cifs_smb3_do_mount+0xe00/0x14c0 cifs_do_mount+0x15/0x20 mount_fs+0x5e/0x290 vfs_kern_mount+0x88/0x460 do_mount+0x398/0x31e0 ksys_mount+0xc6/0x150 __x64_sys_mount+0xea/0x190 do_syscall_64+0x122/0x590 entry_SYSCALL_64_after_hwframe+0x44/0xa9 It can be reproduced with the following steps: 1. Configure samba with: server max protocol = SMB2_10 2. mount -o vers=default When the mount version parameter is parsed, 'ops' and 'vals' are both set to smb30. If the negotiated result is smb21, only 'ops' is updated to smb21, while 'vals' stays smb30. When the lease context is added, the iov_base is allocated with the smb21 ops, but the iov_len is initialized with the smb30 value. Because the iov_len is longer than the buffer at iov_base, sending the message copies past the end of the array. We need to keep 'ops' and 'vals' consistent. 
Fixes: 9764c02 ("SMB3: Add support for multidialect negotiate (SMB2.1 and later)") Fixes: d5c7076 ("smb3: add smb3.1.1 to default dialect list") Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Steve French <stfrench@microsoft.com> CC: Stable <stable@vger.kernel.org> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
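The mismatch described above (buffer sized by one dialect's 'ops', length taken from another dialect's 'vals') can be sketched as follows. This is a simplified model, not the cifs code: the struct names, `server_set_dialect()`, `lease_len_fits()` and the sizes 32 and 64 are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-dialect tables; names and sizes are illustrative only. */
struct dialect_ops {
    size_t lease_buf_size;       /* bytes the ops allocate for a lease context */
};

struct dialect_values {
    size_t lease_ctx_len;        /* bytes the sender puts in iov_len */
};

struct dialect {
    const struct dialect_ops *ops;
    const struct dialect_values *vals;
};

static const struct dialect_ops    old_ops  = { 32 };
static const struct dialect_values old_vals = { 32 };
static const struct dialect_ops    new_ops  = { 64 };
static const struct dialect_values new_vals = { 64 };

const struct dialect dialect_old = { &old_ops, &old_vals };
const struct dialect dialect_new = { &new_ops, &new_vals };

struct server {
    const struct dialect_ops *ops;
    const struct dialect_values *vals;
};

/* Switch dialect by updating both pointers in one place, never just one. */
void server_set_dialect(struct server *srv, const struct dialect *d)
{
    assert(srv != NULL && d != NULL);
    srv->ops = d->ops;
    srv->vals = d->vals;
}

/* Nonzero when the length to send fits in the buffer that would be allocated. */
int lease_len_fits(const struct server *srv)
{
    return srv->vals->lease_ctx_len <= srv->ops->lease_buf_size;
}
```

Routing every dialect change through a single setter is one way to make the inconsistent state hard to reach; whether the actual patch is structured this way is not stated in the message.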
pcercuei referenced this issue in OpenDingux/linux on May 6, 2019
By calling maps__insert() we expect to get 2 references on the map, which we release within the maps__remove() call. However, if a map with the same name already exists, we currently don't bump the reference and can crash, like: Program received signal SIGABRT, Aborted. 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 #1 0x00007ffff75d0895 in abort () from /lib64/libc.so.6 #2 0x00007ffff75d0769 in __assert_fail_base.cold () from /lib64/libc.so.6 #3 0x00007ffff75de596 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000004fc006 in refcount_sub_and_test (i=1, r=0x1224e88) at tools/include/linux/refcount.h:131 #5 refcount_dec_and_test (r=0x1224e88) at tools/include/linux/refcount.h:148 #6 map__put (map=0x1224df0) at util/map.c:299 #7 0x00000000004fdb95 in __maps__remove (map=0x1224df0, maps=0xb17d80) at util/map.c:953 #8 maps__remove (maps=0xb17d80, map=0x1224df0) at util/map.c:959 #9 0x00000000004f7d8a in map_groups__remove (map=<optimized out>, mg=<optimized out>) at util/map_groups.h:65 #10 machine__process_ksymbol_unregister (sample=<optimized out>, event=0x7ffff7279670, machine=<optimized out>) at util/machine.c:728 #11 machine__process_ksymbol (machine=<optimized out>, event=0x7ffff7279670, sample=<optimized out>) at util/machine.c:741 #12 0x00000000004fffbb in perf_session__deliver_event (session=0xb11390, event=0x7ffff7279670, tool=0x7fffffffc7b0, file_offset=13936) at util/session.c:1362 #13 0x00000000005039bb in do_flush (show_progress=false, oe=0xb17e80) at util/ordered-events.c:243 #14 __ordered_events__flush (oe=0xb17e80, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:322 #15 0x00000000005005e4 in perf_session__process_user_event (session=session@entry=0xb11390, event=event@entry=0x7ffff72a4af8, ... Add the map to the list and get the reference even if we find a map with the same name. 
Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Fixes: 1e62856 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190416160127.30203-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
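The reference-count invariant behind this fix (insert takes one reference per index, remove drops exactly those) can be modelled in a few lines of C. This is a rough sketch under assumptions: the fixed-size arrays, `maps_insert()`, `maps_remove()` and the plain `int` counter are hypothetical simplifications, not perf's rb-tree based `struct maps` or its `refcount_t`.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAPS 16

/* Hypothetical, simplified stand-ins for a refcounted map and its container. */
struct map {
    int refcnt;
    const char *name;
};

struct maps {
    struct map *by_addr[MAX_MAPS];
    struct map *by_name[MAX_MAPS];
    size_t nr_addr, nr_name;
};

static void map_get(struct map *m) { m->refcnt++; }

static void map_put(struct map *m)
{
    assert(m->refcnt > 0);       /* dropping a reference nobody took is a bug */
    m->refcnt--;
}

/* Remove one pointer from an index, returning 1 if it was present. */
static int index_remove(struct map **index, size_t *nr, struct map *m)
{
    size_t i;

    for (i = 0; i < *nr; i++) {
        if (index[i] == m) {
            index[i] = index[--(*nr)];
            return 1;
        }
    }
    return 0;
}

/* Always take one reference per index, even if the name is already present. */
int maps_insert(struct maps *maps, struct map *m)
{
    if (maps->nr_addr == MAX_MAPS || maps->nr_name == MAX_MAPS)
        return -1;
    maps->by_addr[maps->nr_addr++] = m;
    map_get(m);
    maps->by_name[maps->nr_name++] = m;
    map_get(m);
    return 0;
}

/* Drop exactly the references that maps_insert() took. */
void maps_remove(struct maps *maps, struct map *m)
{
    if (index_remove(maps->by_addr, &maps->nr_addr, m))
        map_put(m);
    if (index_remove(maps->by_name, &maps->nr_name, m))
        map_put(m);
}
```

With two maps that share a name, both inserts take two references and both removes give them back, so the assertion in `map_put()` never fires. Skipping the by-name insert for a duplicate name while still dropping two references on removal is the imbalance the backtrace above shows.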
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue on Jun 16, 2019
commit 63530ab upstream. syzbot found that ax25 routes were not properly protected against concurrent use [1]. In this particular report the bug happened while copying ax25->digipeat. Fix this problem by making sure we call ax25_get_route() while ax25_route_lock is held, so that no modification could happen while using the route. The current two ax25_get_route() callers do not sleep, so this change should be fine. Once we do that, ax25_get_route() no longer needs to grab a reference on the found route. [1] ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline] BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113 Read of size 66 at addr ffff888066641a80 by task syz-executor2/531 ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x24/0x50 mm/kasan/common.c:130 memcpy include/linux/string.h:352 [inline] kmemdup+0x42/0x60 mm/util.c:113 kmemdup include/linux/string.h:425 [inline] ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224 __sys_connect+0x357/0x490 net/socket.c:1664 __do_sys_connect net/socket.c:1675 [inline] __se_sys_connect net/socket.c:1672 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1672 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458099 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 
ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099 RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4 R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff Allocated by task 526: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc mm/kasan/common.c:496 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504 ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609 kmalloc include/linux/slab.h:545 [inline] ax25_rt_add net/ax25/ax25_route.c:95 [inline] ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de Freed by task 550: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466 __cache_free mm/slab.c:3487 [inline] kfree+0xcf/0x230 mm/slab.c:3806 ax25_rt_add net/ax25/ax25_route.c:92 [inline] 
ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888066641a80 which belongs to the cache kmalloc-96 of size 96 The buggy address is located 0 bytes inside of 96-byte region [ffff888066641a80, ffff888066641ae0) The buggy address belongs to the page: page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0 flags: 0x1fffc0000000200(slab) ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0 raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ralf Baechle <ralf@linux-mips.org> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue on Jun 16, 2019
commit fa30dde upstream. We see the following NULL pointer dereference while running xfstests generic/475: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 PGD 8000000c84bad067 P4D 8000000c84bad067 PUD c84e62067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 7 PID: 9886 Comm: fsstress Kdump: loaded Not tainted 5.0.0-rc8 #10 RIP: 0010:ext4_do_update_inode+0x4ec/0x760 ... Call Trace: ? jbd2_journal_get_write_access+0x42/0x50 ? __ext4_journal_get_write_access+0x2c/0x70 ? ext4_truncate+0x186/0x3f0 ext4_mark_iloc_dirty+0x61/0x80 ext4_mark_inode_dirty+0x62/0x1b0 ext4_truncate+0x186/0x3f0 ? unmap_mapping_pages+0x56/0x100 ext4_setattr+0x817/0x8b0 notify_change+0x1df/0x430 do_truncate+0x5e/0x90 ? generic_permission+0x12b/0x1a0 This is triggered because the NULL pointer handle->h_transaction was dereferenced in function ext4_update_inode_fsync_trans(). I found that h_transaction was set to NULL in jbd2__journal_restart() but the handle failed to attach to a new transaction while the journal was aborted. Fix this by checking the handle before updating the inode. Fixes: b436b9b ("ext4: Wait for proper transaction commit on fsync") Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
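The idea of "checking the handle before updating the inode" can be sketched in plain C. This is a minimal model under assumptions: `struct handle`, `struct inode_info`, the `aborted` flag and `record_fsync_tid()` are hypothetical stand-ins, not the jbd2 or ext4 definitions, and the exact condition used by the real patch is not given in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the journal structures. */
struct transaction {
    unsigned int tid;
};

struct handle {
    struct transaction *transaction;   /* may be NULL after a failed restart */
    int aborted;
};

struct inode_info {
    unsigned int sync_tid;
};

/* Record the transaction id only when the handle is still usable.
 * Returns 1 if the tid was recorded, 0 if the handle had to be skipped. */
int record_fsync_tid(const struct handle *h, struct inode_info *ei)
{
    assert(ei != NULL);
    if (h == NULL || h->aborted || h->transaction == NULL)
        return 0;
    ei->sync_tid = h->transaction->tid;
    return 1;
}
```

Skipping the update leaves the previously recorded tid in place, which is the conservative choice when the journal can no longer accept new transactions.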
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue on Jun 16, 2019
[ Upstream commit d982b33 ] ================================================================= ==20875==ERROR: LeakSanitizer: detected memory leaks Direct leak of 1160 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) #1 0x55bd50005599 in zalloc util/util.h:23 #2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327 #3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216 #4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69 #5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358 #6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388 #7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583 #8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722 #9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520 #13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 19 byte(s) in 1 object(s) allocated from: #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30) #1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields") Link: http://lkml.kernel.org/r/20190316080556.3075-17-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> 
Signed-off-by: Sasha Levin <sashal@kernel.org>
nemunaire pushed a commit to nemunaire/CI20_linux that referenced this issue on Jun 16, 2019
Before this patch, multiple active endpoints could not be used together; they would cancel each other out. The issue was discovered on Android when combining the adb, mtp and ptp configurations. This patch introduces proper behaviour for these cases. Also, during boot-up the following warning is no longer shown: [ 2.879328] ------------[ cut here ]------------ [ 2.883983] WARNING: CPU: 0 PID: 1 at drivers/usb/dwc2/gadget.c:212 s3c_hsotg_init_fifo+0x168/0x1d0() [ 2.893204] insufficient fifo memory [ 2.896602] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 3.18.3+ #10 [ 2.904004] Stack : 00000000 800919a0 00000000 00000004 00000006 800913f4 00000000 00000000 00000000 00000000 80f75a12 00000042 80f75a12 00000042 00000006 00000000 80e42767 80d7c2e 00000001 00000000 80f73574 8bc90418 80ea0000 01000d00 80f06704 80b24c00 00000000 80035388 00000006 00000000 80d834a4 8bc99b04 8bc99b04 80e40000 00000000 00000000 00000000 00000000 00000000 00000000 ... [ 2.939709] Call Trace: [ 2.942174] [<8001bab0>] show_stack+0xd4/0xf0 [ 2.946528] [<80b26c40>] dump_stack+0x70/0xbc [ 2.950880] [<800356bc>] warn_slowpath_common+0x90/0xe8 [ 2.956116] [<80035808>] warn_slowpath_fmt+0x3c/0x48 [ 2.961075] [<8069b824>] s3c_hsotg_init_fifo+0x168/0x1d0 [ 2.966398] [<8069d8fc>] s3c_hsotg_init+0x50/0x9c [ 2.971095] [<806a0388>] dwc2_gadget_init+0x430/0x8c0 [ 2.976158] [<806a0df0>] dwc2_driver_probe+0x218/0x2a8 [ 2.981291] [<805b935c>] platform_drv_probe+0x64/0x120 [ 2.986440] [<805b783c>] really_probe+0xa0/0x278 [ 2.991050] [<805b7c78>] driver_probe_device+0x48/0x78 [ 2.996197] [<805b7d74>] __driver_attach+0xcc/0xd4 [ 3.000980] [<805b5b7c>] bus_for_each_dev+0x7c/0xc4 [ 3.005874] [<805b64f8>] bus_add_driver+0x180/0x240 [ 3.010743] [<805b8428>] driver_register+0xac/0x154 [ 3.015633] [<80ea9e04>] do_one_initcall+0x150/0x1f4 [ 3.020589] [<80eaa080>] kernel_init_freeable+0x1d8/0x298 [ 3.025998] [<80b23c5c>] kernel_init+0x28/0x158 [ 3.030522] [<800153ec>] 
ret_from_kernel_thread+0x14/0x1c [ 3.035926] [ 3.037412] ---[ end trace cb88537fdc8fa201 ]--- And during configuration transitions (e.g. adb -> mtp,adb) the following warning is no longer shown: [ 311.726159] -----------[ cut here ]----------- [ 311.730817] WARNING: CPU: 0 PID: 0 at drivers/usb/dwc2/gadget.c:1475 s3c_hsotg_rx_data+0x130/0x13c() [ 311.739931] Modules linked in: [ 311.742993] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 3.18.3+ #45 [ 311.750199] Stack : 00000000 80080370 00000000 00000004 00000006 00000000 00000000 00000000 00000000 00000000 80f05b02 00000042 80d61010 80e18e20 80d60000 8b408010 80e18927 80d0df6c 00000000 00000000 80f03614 80e18e20 80d60000 8b408010 00250182 80a54f54 80e20cc4 80e20cc8 00000000 00000000 80d14ab8 80dfbacc 80dfbacc 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ... [ 311.785841] Call Trace: [ 311.788292] [<8001ac28>] show_stack+0xc4/0xe0 [ 311.792650] [<80a56e58>] dump_stack+0x70/0xbc [ 311.797008] [<80033c14>] warn_slowpath_common+0x88/0xb8 [ 311.802224] [<80033cc8>] warn_slowpath_null+0x18/0x24 [ 311.807266] [<80606a3c>] s3c_hsotg_rx_data+0x130/0x13c [ 311.812397] [<8060afa4>] s3c_hsotg_irq+0x3b4/0x5e8 [ 311.817183] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0 [ 311.822745] [<80082d4c>] handle_irq_event+0x54/0x98 [ 311.827617] [<80086390>] handle_level_irq+0xe0/0x1c0 [ 311.832572] [<800820bc>] generic_handle_irq+0x3c/0x54 [ 311.837622] [<804bb680>] jz4740_cascade+0x78/0xac [ 311.842317] [<80082ab8>] handle_irq_event_percpu+0x90/0x2d0 [ 311.847881] [<80086d18>] handle_percpu_irq+0x8c/0xbc [ 311.852835] [<800820bc>] generic_handle_irq+0x3c/0x54 [ 311.857878] [<80016c8c>] do_IRQ+0x18/0x2c [ 311.861879] [<80014c40>] ret_from_irq+0x0/0x4 [ 311.866227] [<80016b20>] mips_cpuidle_wait_enter+0x14/0x34 [ 311.871713] [<806d37b0>] cpuidle_enter_state+0x88/0x2c0 [ 311.876934] [<80074308>] cpu_startup_entry+0x36c/0x484 [ 311.882074] [<80e7dc04>] start_kernel+0x4b8/0x4e0 [ 311.886767] [ 311.888253] --[ end trace 
dd7a60dcc5530db3 ]-- Change-Id: Ic8ac37a28913d4314371de0cd446f8a7cc45864d Signed-off-by: Dragan Cecavac <dragan.cecavac@imgtec.com>
gabrielesvelto pushed a commit to gabrielesvelto/CI20_linux that referenced this issue on Jan 17, 2020
commit 63530ab upstream. syzbot found that ax25 routes were not properly protected against concurrent use [1]. In this particular report the bug happened while copying ax25->digipeat. Fix this problem by making sure we call ax25_get_route() while ax25_route_lock is held, so that no modification could happen while using the route. The current two ax25_get_route() callers do not sleep, so this change should be fine. Once we do that, ax25_get_route() no longer needs to grab a reference on the found route. [1] ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline] BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113 Read of size 66 at addr ffff888066641a80 by task syz-executor2/531 ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1db/0x2d0 lib/dump_stack.c:113 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x123/0x190 mm/kasan/generic.c:191 memcpy+0x24/0x50 mm/kasan/common.c:130 memcpy include/linux/string.h:352 [inline] kmemdup+0x42/0x60 mm/util.c:113 kmemdup include/linux/string.h:425 [inline] ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224 __sys_connect+0x357/0x490 net/socket.c:1664 __do_sys_connect net/socket.c:1675 [inline] __se_sys_connect net/socket.c:1672 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1672 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x458099 Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 
ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099 RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005 RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000 ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4 R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff Allocated by task 526: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_kmalloc mm/kasan/common.c:496 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504 ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609 kmalloc include/linux/slab.h:545 [inline] ax25_rt_add net/ax25/ax25_route.c:95 [inline] ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de Freed by task 550: save_stack+0x45/0xd0 mm/kasan/common.c:73 set_track mm/kasan/common.c:85 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466 __cache_free mm/slab.c:3487 [inline] kfree+0xcf/0x230 mm/slab.c:3806 ax25_rt_add net/ax25/ax25_route.c:92 [inline] 
ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763 sock_do_ioctl+0xe2/0x400 net/socket.c:950 sock_ioctl+0x32f/0x6c0 net/socket.c:1074 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff888066641a80 which belongs to the cache kmalloc-96 of size 96 The buggy address is located 0 bytes inside of 96-byte region [ffff888066641a80, ffff888066641ae0) The buggy address belongs to the page: page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0 flags: 0x1fffc0000000200(slab) ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0 raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc >ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ^ ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ralf Baechle <ralf@linux-mips.org> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
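The locking rule described in this commit (look the route up and use it while the route lock is held, instead of returning a reference) can be illustrated in user space. This is only a minimal sketch under stated assumptions: a pthread mutex stands in for ax25_route_lock, and `struct route`, `route_add()`, `route_lookup_locked()` and `copy_digipeat()` are hypothetical names, not the kernel's code.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define DIGI_LEN 8

/* Hypothetical, simplified route entry. */
struct route {
    int dest;
    char digipeat[DIGI_LEN];
    struct route *next;
};

static pthread_mutex_t route_lock = PTHREAD_MUTEX_INITIALIZER;
static struct route *route_list;

/* Caller must hold route_lock; the result is valid only while it is held. */
static struct route *route_lookup_locked(int dest)
{
    struct route *r;

    for (r = route_list; r; r = r->next)
        if (r->dest == dest)
            return r;
    return NULL;
}

/* Writers take the same lock, so they cannot run during a lookup-and-copy. */
void route_add(struct route *r)
{
    pthread_mutex_lock(&route_lock);
    r->next = route_list;
    route_list = r;
    pthread_mutex_unlock(&route_lock);
}

/*
 * Copy the digipeater data out while the lock pins the route.
 * Returns 0 on success, -1 if no route matches.
 */
int copy_digipeat(int dest, char out[DIGI_LEN])
{
    struct route *r;
    int ret = -1;

    pthread_mutex_lock(&route_lock);
    r = route_lookup_locked(dest);
    if (r) {
        memcpy(out, r->digipeat, DIGI_LEN);
        ret = 0;
    }
    pthread_mutex_unlock(&route_lock);
    return ret;
}
```

Because the copy finishes before the lock is dropped, the caller never touches the route after it could have been freed, which is why no reference count is needed in this sketch. The trade-off is that the critical section must not sleep.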
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Jan 17, 2020
commit fa30dde upstream.

We see the following NULL pointer dereference while running xfstests generic/475:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
PGD 8000000c84bad067 P4D 8000000c84bad067 PUD c84e62067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 7 PID: 9886 Comm: fsstress Kdump: loaded Not tainted 5.0.0-rc8 #10
RIP: 0010:ext4_do_update_inode+0x4ec/0x760
...
Call Trace:
 ? jbd2_journal_get_write_access+0x42/0x50
 ? __ext4_journal_get_write_access+0x2c/0x70
 ? ext4_truncate+0x186/0x3f0
 ext4_mark_iloc_dirty+0x61/0x80
 ext4_mark_inode_dirty+0x62/0x1b0
 ext4_truncate+0x186/0x3f0
 ? unmap_mapping_pages+0x56/0x100
 ext4_setattr+0x817/0x8b0
 notify_change+0x1df/0x430
 do_truncate+0x5e/0x90
 ? generic_permission+0x12b/0x1a0

This is triggered because the NULL pointer handle->h_transaction was dereferenced in function ext4_update_inode_fsync_trans(). I found that h_transaction was set to NULL in jbd2__journal_restart() but then failed to attach to a new transaction because the journal was aborted. Fix this by checking the handle before updating the inode.

Fixes: b436b9b ("ext4: Wait for proper transaction commit on fsync")
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
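The idea of checking the handle before updating the inode can be sketched with mock types. Everything below is an assumption made for illustration: `struct handle`, `struct inode_info`, `handle_usable()` and `update_fsync_tid_sketch()` are hypothetical stand-ins and do not reproduce the jbd2 or ext4 definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the journal transaction and handle. */
struct transaction { unsigned int t_tid; };
struct handle {
    struct transaction *h_transaction; /* may be NULL after a failed restart */
    bool aborted;
};
struct inode_info { unsigned int i_sync_tid; };

/* A handle is only usable if it is attached to a live transaction. */
static bool handle_usable(const struct handle *h)
{
    return h && !h->aborted && h->h_transaction;
}

/*
 * Record the transaction id on the inode, but only when the handle is
 * usable. Returns false (and leaves the inode untouched) otherwise.
 */
bool update_fsync_tid_sketch(const struct handle *h, struct inode_info *ei)
{
    if (!handle_usable(h))
        return false;
    ei->i_sync_tid = h->h_transaction->t_tid;
    return true;
}
```

With the check in place, a handle whose restart failed is skipped instead of being dereferenced through a NULL `h_transaction`.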
gabrielesvelto
pushed a commit
to gabrielesvelto/CI20_linux
that referenced
this issue
Jan 17, 2020
[ Upstream commit d982b33 ]

=================================================================
==20875==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1160 byte(s) in 1 object(s) allocated from:
 #0 0x7f1b6fc84138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138)
 #1 0x55bd50005599 in zalloc util/util.h:23
 #2 0x55bd500068f5 in perf_evsel__newtp_idx util/evsel.c:327
 #3 0x55bd4ff810fc in perf_evsel__newtp /home/work/linux/tools/perf/util/evsel.h:216
 #4 0x55bd4ff81608 in test__perf_evsel__tp_sched_test tests/evsel-tp-sched.c:69
 #5 0x55bd4ff528e6 in run_test tests/builtin-test.c:358
 #6 0x55bd4ff52baf in test_and_print tests/builtin-test.c:388
 #7 0x55bd4ff543fe in __cmd_test tests/builtin-test.c:583
 #8 0x55bd4ff5572f in cmd_test tests/builtin-test.c:722
 #9 0x55bd4ffc4087 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302
 #10 0x55bd4ffc45c6 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354
 #11 0x55bd4ffc49ca in run_argv /home/changbin/work/linux/tools/perf/perf.c:398
 #12 0x55bd4ffc5138 in main /home/changbin/work/linux/tools/perf/perf.c:520
 #13 0x7f1b6e34809a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)

Indirect leak of 19 byte(s) in 1 object(s) allocated from:
 #0 0x7f1b6fc83f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30)
 #1 0x7f1b6e3ac30f in vasprintf (/lib/x86_64-linux-gnu/libc.so.6+0x8830f)

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fixes: 6a6cd11 ("perf test: Add test for the sched tracepoint format fields")
Link: http://lkml.kernel.org/r/20190316080556.3075-17-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
pcercuei
referenced
this issue
in OpenDingux/linux
Mar 31, 2020
When experimenting with the bpf_send_signal() helper in our production environment (5.2 based), we experienced a deadlock in NMI mode:

 #5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24
 #6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012
 #7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd
 #8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55
 #9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602
 #10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a
 #11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227
 #12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140
 #13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf
 #14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09
 #15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47
 #16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d
 #17 [ffffc9002219fc90] schedule at ffffffff81a3e219
 #18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9
 #19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529
 #20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc
 #21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c
 #22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602
 #23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068

The above call stack is very similar to an issue reported by Commit eac9153 ("bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()") by Song Liu. The only difference is that the bpf_send_signal() helper is involved instead of the bpf_get_stack() helper.

The above deadlock is triggered with a perf_sw_event. Similar to Commit eac9153, the nearly identical reproducer below uses the tracepoint sched/sched_switch so that the issue can be caught easily.
    /* stress_test.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <pthread.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <time.h>    /* nanosleep(), struct timespec */
    #include <unistd.h>  /* usleep(), close() */

    #define THREAD_COUNT 1000

    char *filename;

    /* Each worker loops forever over mmap/munmap, malloc/free and short sleeps. */
    void *worker(void *p)
    {
        void *ptr;
        int fd;
        char *pptr;

        fd = open(filename, O_RDONLY);
        if (fd < 0)
            return NULL;
        while (1) {
            struct timespec ts = {0, 1000 + rand() % 2000};

            ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
            usleep(1);
            if (ptr == MAP_FAILED) {
                printf("failed to mmap\n");
                break;
            }
            munmap(ptr, 4096 * 64);
            usleep(1);
            pptr = malloc(1);
            usleep(1);
            pptr[0] = 1;
            usleep(1);
            free(pptr);
            usleep(1);
            nanosleep(&ts, NULL);
        }
        close(fd);
        return NULL;
    }

    int main(int argc, char *argv[])
    {
        void *ptr;
        int i;
        pthread_t threads[THREAD_COUNT];

        if (argc < 2)
            return 0;
        filename = argv[1];

        for (i = 0; i < THREAD_COUNT; i++) {
            if (pthread_create(threads + i, NULL, worker, NULL)) {
                fprintf(stderr, "Error creating thread\n");
                return 0;
            }
        }

        for (i = 0; i < THREAD_COUNT; i++)
            pthread_join(threads[i], NULL);
        return 0;
    }

and the following commands:

1. run `stress_test /bin/ls` in one window
2. hack bcc trace.py with the following change:

    --- a/tools/trace.py
    +++ b/tools/trace.py
    @@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s);
             __data.tgid = __tgid;
             __data.pid = __pid;
             bpf_get_current_comm(&__data.comm, sizeof(__data.comm));
    +        bpf_send_signal(10);
     %s
     %s
             %s.perf_submit(%s, &__data, sizeof(__data));

3. in a different window run ./trace.py -p $(pidof stress_test) t:sched:sched_switch

The deadlock can be reproduced in our production system.

Similar to Song's fix, the fix is to delay sending the signal if irqs are disabled, to avoid deadlocks involving rq_lock. With this change, my stress test above no longer causes a deadlock in our production system.

I also implemented a scaled-down version of the reproducer in the selftest (a subsequent commit). With the latest bpf-next, it produces the following potential deadlock report.
[ 32.832450] -> #1 (&p->pi_lock){-.-.}:
[ 32.833100] _raw_spin_lock_irqsave+0x44/0x80
[ 32.833696] task_rq_lock+0x2c/0xa0
[ 32.834182] task_sched_runtime+0x59/0xd0
[ 32.834721] thread_group_cputime+0x250/0x270
[ 32.835304] thread_group_cputime_adjusted+0x2e/0x70
[ 32.835959] do_task_stat+0x8a7/0xb80
[ 32.836461] proc_single_show+0x51/0xb0
...
[ 32.839512] -> #0 (&(&sighand->siglock)->rlock){....}:
[ 32.840275] __lock_acquire+0x1358/0x1a20
[ 32.840826] lock_acquire+0xc7/0x1d0
[ 32.841309] _raw_spin_lock_irqsave+0x44/0x80
[ 32.841916] __lock_task_sighand+0x79/0x160
[ 32.842465] do_send_sig_info+0x35/0x90
[ 32.842977] bpf_send_signal+0xa/0x10
[ 32.843464] bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000
[ 32.844301] trace_call_bpf+0x115/0x270
[ 32.844809] perf_trace_run_bpf_submit+0x4a/0xc0
[ 32.845411] perf_trace_sched_switch+0x10f/0x180
[ 32.846014] __schedule+0x45d/0x880
[ 32.846483] schedule+0x5f/0xd0
...
[ 32.853148] Chain exists of:
[ 32.853148]   &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock
[ 32.853148]
[ 32.854451] Possible unsafe locking scenario:
[ 32.854451]
[ 32.855173]        CPU0                    CPU1
[ 32.855745]        ----                    ----
[ 32.856278]   lock(&rq->lock);
[ 32.856671]                                lock(&p->pi_lock);
[ 32.857332]                                lock(&rq->lock);
[ 32.857999]   lock(&(&sighand->siglock)->rlock);

Deadlock happens on CPU0 when it tries to acquire &sighand->siglock, which is already held by CPU1, while CPU1 tries to grab &rq->lock and cannot get it.

This is not exactly the call stack seen in our production environment, but the symptom is similar: both locks are acquired with spin_lock_irqsave(), and both cases involve rq_lock. The fix to delay sending the signal when irqs are disabled also fixed this issue.

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200304191104.2796501-1-yhs@fb.com
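The approach of delaying the signal while irqs are disabled can be modelled in plain C. This is a toy model under loud assumptions, not the kernel change itself: `irqs_off` stands in for the kernel's irq state, the fixed-size `pending` array stands in for whatever deferral mechanism the real fix uses, and `send_signal_safe()`, `run_deferred()` and `deliver_now()` are hypothetical names.

```c
#include <stdbool.h>

#define MAX_PENDING 16

static bool irqs_off;               /* stand-in for "irqs are disabled" */
static int pending[MAX_PENDING];    /* signals queued for later delivery */
static int npending;
static int delivered[MAX_PENDING];  /* record of what was actually sent */
static int ndelivered;

/* Stand-in for the real delivery path, which would take siglock etc. */
static void deliver_now(int sig)
{
    if (ndelivered < MAX_PENDING)
        delivered[ndelivered++] = sig;
}

/*
 * Send a signal, or queue it when delivery would be unsafe.
 * Returns 0 on success, -1 if the deferral queue is full.
 */
int send_signal_safe(int sig)
{
    if (irqs_off) {
        if (npending == MAX_PENDING)
            return -1;
        pending[npending++] = sig;  /* defer: take no further locks here */
        return 0;
    }
    deliver_now(sig);
    return 0;
}

/* Called later from a context where taking the locks is safe. */
void run_deferred(void)
{
    int i;

    for (i = 0; i < npending; i++)
        deliver_now(pending[i]);
    npending = 0;
}
```

The point of the sketch is only the ordering: while the caller may already hold a scheduler lock, nothing that needs the signal-handling locks runs, so the lock chain shown in the report cannot close.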
pcercuei
referenced
this issue
in OpenDingux/linux
Sep 17, 2020
…s metrics" test

Linux 5.9 introduced perf test case "Parse and process metrics" and on s390 this test case always dumps core:

[root@t35lp67 perf]# ./perf test -vvvv -F 67
67: Parse and process metrics :
--- start ---
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
parsing metric: inst_retired.any / cpu_clk_unhalted.thread
Segmentation fault (core dumped)
[root@t35lp67 perf]#

I debugged this core dump and gdb shows this call chain:

(gdb) where
 #0 0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6
 #1 0x000003ffabc293de in strcasestr () from /lib64/libc.so.6
 #2 0x0000000001102ba2 in match_metric(list=0x1e6ea20 "inst_retired.any", n=<optimized out>) at util/metricgroup.c:368
 #3 find_metric (map=<optimized out>, map=<optimized out>, metric=0x1e6ea20 "inst_retired.any") at util/metricgroup.c:765
 #4 __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0, metric_no_group=<optimized out>, m=<optimized out>) at util/metricgroup.c:844
 #5 resolve_metric (ids=0x0, map=0x0, metric_list=0x0, metric_no_group=<optimized out>) at util/metricgroup.c:881
 #6 metricgroup__add_metric (metric=<optimized out>, metric_no_group=metric_no_group@entry=false, events=<optimized out>, events@entry=0x3ffd84fb878, metric_list=0x0, metric_list@entry=0x3ffd84fb868, map=0x0) at util/metricgroup.c:943
 #7 0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>, metric_list=0x3ffd84fb868, events=0x3ffd84fb878, metric_no_group=<optimized out>, list=<optimized out>) at util/metricgroup.c:988
 #8 parse_groups (perf_evlist=perf_evlist@entry=0x1e70260, str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>, metric_no_merge=<optimized out>, fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>, metric_events=0x3ffd84fba58, map=0x1) at util/metricgroup.c:1040
 #9 0x0000000001103eb2 in metricgroup__parse_groups_test( evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>, str=str@entry=0x12f34b2 "IPC", metric_no_group=metric_no_group@entry=false, metric_no_merge=metric_no_merge@entry=false, metric_events=0x3ffd84fba58) at util/metricgroup.c:1082
 #10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0, ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC", vals=0x3ffd84fbad8, name=0x12f34b2 "IPC") at tests/parse-metric.c:159
 #11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8, name=0x12f34b2 "IPC") at tests/parse-metric.c:189
 #12 test_ipc () at tests/parse-metric.c:208
 ..... ..... omitted many more lines

This test case was added with commit 218ca91 ("perf tests: Add parse metric test for frontend metric").

When I compile with make DEBUG=y it works fine and I do not get a core dump.

It turned out that the function call chain listed above worked on a struct pmu_event array that requires a trailing element filled with zeroes, and that element was missing. The macro map_for_each_event() loops over that array and tests for members metric_expr/metric_name/metric_group being non-NULL. Adding this element fixes the issue.

Output after:

[root@t35lp46 perf]# ./perf test 67
67: Parse and process metrics : Ok
[root@t35lp46 perf]#

Committer notes:

As Ian remarks, this is not s390 specific:

<quote Ian>
This also shows up with address sanitizer on all architectures (perhaps change the patch title) and perhaps add a "Fixes: <commit>" tag.
=================================================================
==4718==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp 0x7ffd24327c58
READ of size 8 at 0x55c93b4d59e8 thread T0
 #0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2
 #1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9
 #2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9
 #3 0x55c93a1528db in metricgroup__add_metric tools/perf/util/metricgroup.c:943:9
 #4 0x55c93a151996 in metricgroup__add_metric_list tools/perf/util/metricgroup.c:988:9
 #5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8
 #6 0x55c93a1513e1 in metricgroup__parse_groups_test tools/perf/util/metricgroup.c:1082:9
 #7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8
 #8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9
 #9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2
 #10 0x55c93a00f1e8 in test__parse_metric tools/perf/tests/parse-metric.c:345:2
 #11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9
 #12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9
 #13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4
 #14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9
 #15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11
 #16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8
 #17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2
 #18 0x55c939e45f7e in main tools/perf/perf.c:539:3

0x55c93b4d59e8 is located 0 bytes to the right of global variable 'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25' (0x55c93b4d54a0) of size 1352
SUMMARY: AddressSanitizer: global-buffer-overflow tools/perf/util/metricgroup.c:764:2 in find_metric
Shadow bytes around the buggy address:
  0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9
  0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable: 00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone: fa
  Freed heap region: fd
  Stack left redzone: f1
  Stack mid redzone: f2
  Stack right redzone: f3
  Stack after return: f5
  Stack use after scope: f8
  Global redzone: f9
  Global init order: f6
  Poisoned by user: f7
  Container overflow: fc
  Array cookie: ac
  Intra object redzone: bb
  ASan internal: fe
  Left alloca redzone: ca
  Right alloca redzone: cb
  Shadow gap: cc
</quote>

I'm also adding the missing "Fixes" tag and setting just .name to NULL, as doing it that way is more compact (the compiler will zero out everything else) and the table iterators look for .name being NULL as the sentinel marking the end of the table.

Fixes: 0a507af ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200825071211.16959-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
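The sentinel convention discussed in this commit (a trailing array element whose pointers are all NULL, which the iterator treats as end-of-table) might look like the following in isolation. This is a sketch, not perf's code: `struct event_sketch` is a hypothetical cut-down stand-in for struct pmu_event, the "SUM" entry and its expression are made up, and `is_entry()`, `count_entries()` and `find_metric_expr()` are illustrative names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for perf's struct pmu_event. */
struct event_sketch {
    const char *name;
    const char *metric_expr;
    const char *metric_name;
};

/*
 * The last element sets only .name; the compiler zeroes the remaining
 * members, so every pointer in it is NULL and iteration can stop there.
 * Leaving it out would make the loops below read past the array.
 */
static const struct event_sketch test_events[] = {
    { .metric_expr = "inst_retired.any / cpu_clk_unhalted.thread",
      .metric_name = "IPC" },
    { .metric_expr = "a + b", .metric_name = "SUM" }, /* made-up entry */
    { .name = NULL },                                 /* sentinel */
};

/* True while the element is a real entry rather than the sentinel. */
static int is_entry(const struct event_sketch *e)
{
    return e->name || e->metric_expr || e->metric_name;
}

/* Count entries up to, but not including, the sentinel. */
size_t count_entries(const struct event_sketch *table)
{
    size_t n = 0;

    while (is_entry(&table[n]))
        n++;
    return n;
}

/* Return the expression for a metric name, or NULL if it is not found. */
const char *find_metric_expr(const struct event_sketch *table,
                             const char *metric)
{
    const struct event_sketch *e;

    for (e = table; is_entry(e); e++)
        if (e->metric_name && strcmp(e->metric_name, metric) == 0)
            return e->metric_expr;
    return NULL;
}
```

A lookup for a name that is not in the table is the case that matters: it walks every element, so it only terminates safely because the sentinel is present.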
pcercuei
referenced
this issue
in OpenDingux/linux
Sep 24, 2020
The aliases were never released causing the following leaks: Indirect leak of 1224 byte(s) in 9 object(s) allocated from: #0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628) #1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322 #2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778 #3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295 #4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367 #5 0x56332c76a09b in run_test tests/builtin-test.c:410 #6 0x56332c76a09b in test_and_print tests/builtin-test.c:440 #7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695 #8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807 #9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: 956a783 ("perf test: Test pmu-events aliases") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: John Garry <john.garry@huawei.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-11-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pcercuei
referenced
this issue
in OpenDingux/linux
Sep 24, 2020
evsel->unit borrows a pointer from a pmu event or alias instead of owning a string. But the tool event (duration_time) passed the result of strdup(), which caused a leak.

It was found by ASAN during the metric test:

Direct leak of 210 byte(s) in 70 object(s) allocated from:
 #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
 #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
 #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
 #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
 #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
 #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
 #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
 #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
 #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
 #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
 #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
 #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
 #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
 #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
 #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
 #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
 #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
 #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
 #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pcercuei
referenced
this issue
in OpenDingux/linux
Sep 24, 2020
test_generic_metric() failed to release the entries in the pctx. ASan reported the following leak (and more):

Direct leak of 128 byte(s) in 1 object(s) allocated from:
 #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
 #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
 #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
 #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
 #4 0x55f7e7341667 in expr__add_ref util/expr.c:120
 #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
 #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
 #7 0x55f7e712390b in compute_single tests/parse-metric.c:128
 #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
 #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
 #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
 #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
 #12 0x55f7e70be09b in run_test tests/builtin-test.c:410
 #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
 #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
 #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
 #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
 #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
 #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
 #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
 #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 6d432c4 ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pcercuei
referenced
this issue
in OpenDingux/linux
Sep 24, 2020
metricgroup__add_metric() can find multiple matches for a metric group, and any of them can fail. It can also fail partway through, for example in resolve_metric(), even for a single metric. In those cases, the intermediate list and ids are leaked, like this:

Direct leak of 3 byte(s) in 1 object(s) allocated from:
 #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
 #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
 #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
 #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
 #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
 #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
 #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
 #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
 #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
 #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
 #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
 #11 0x55f7e70be09b in run_test tests/builtin-test.c:410
 #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
 #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
 #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
 #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
 #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
 #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
 #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
 #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 83de0b7 ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
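The failure mode described in this commit (an operation that builds an intermediate list and then fails partway through) is commonly handled by building into a temporary list and freeing it on the error path. The sketch below shows that general pattern only; it is not the perf change. `struct node`, `add_all()`, `free_list()` and the `live_nodes` counter are hypothetical, and a NULL input name is used as a simulated mid-way failure.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char *name;
    struct node *next;
};

static int live_nodes; /* allocation counter so the cleanup can be checked */

/* Free every node in a list together with the string it owns. */
void free_list(struct node *head)
{
    while (head) {
        struct node *next = head->next;

        free(head->name);
        free(head);
        live_nodes--;
        head = next;
    }
}

/* Small strdup() replacement so the sketch needs only ISO C. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/*
 * Add copies of names[0..n) to *out. A NULL name simulates a failure in
 * the middle. Returns 0 on success; on failure returns -1, leaves *out
 * unchanged, and frees everything allocated by this call.
 */
int add_all(const char *const *names, size_t n, struct node **out)
{
    struct node *tmp = NULL, *tail;
    size_t i;

    for (i = 0; i < n; i++) {
        struct node *nd;

        if (!names[i])
            goto fail;
        nd = malloc(sizeof(*nd));
        if (!nd)
            goto fail;
        nd->name = dup_string(names[i]);
        if (!nd->name) {
            free(nd);
            goto fail;
        }
        nd->next = tmp;
        tmp = nd;
        live_nodes++;
    }
    if (tmp) {
        /* Success: splice the temporary list in front of the caller's. */
        for (tail = tmp; tail->next; tail = tail->next)
            ;
        tail->next = *out;
        *out = tmp;
    }
    return 0;

fail:
    free_list(tmp); /* release the intermediate results */
    return -1;
}
```

Keeping the intermediate results on a private list until the whole operation has succeeded means the error path has exactly one thing to free, and the caller's list is never left half-updated.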
pcercuei
referenced
this issue
in OpenDingux/linux
Sep 24, 2020
The following leaks were detected by ASAN: Indirect leak of 360 byte(s) in 9 object(s) allocated from: #0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e) #1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333 #2 0x560578f752fc in perf_pmu_parse util/pmu.y:59 #3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73 #4 0x560578e07045 in test__pmu tests/pmu.c:155 #5 0x560578de109b in run_test tests/builtin-test.c:410 #6 0x560578de109b in test_and_print tests/builtin-test.c:440 #7 0x560578de401a in __cmd_test tests/builtin-test.c:661 #8 0x560578de401a in cmd_test tests/builtin-test.c:807 #9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 #10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 #11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 #12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 #13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: cff7f95 ("perf tests: Move pmu tests into separate object") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200915031819.386559-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pcercuei
referenced
this issue
in OpenDingux/linux
Nov 3, 2020
Very sporadically I had test case btrfs/069 from fstests hanging (for years, it is not a recent regression), with the following traces in dmesg/syslog: [162301.160628] BTRFS info (device sdc): dev_replace from /dev/sdd (devid 2) to /dev/sdg started [162301.181196] BTRFS info (device sdc): scrub: finished on devid 4 with status: 0 [162301.287162] BTRFS info (device sdc): dev_replace from /dev/sdd (devid 2) to /dev/sdg finished [162513.513792] INFO: task btrfs-transacti:1356167 blocked for more than 120 seconds. [162513.514318] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.514522] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.514747] task:btrfs-transacti state:D stack: 0 pid:1356167 ppid: 2 flags:0x00004000 [162513.514751] Call Trace: [162513.514761] __schedule+0x5ce/0xd00 [162513.514765] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.514771] schedule+0x46/0xf0 [162513.514844] wait_current_trans+0xde/0x140 [btrfs] [162513.514850] ? finish_wait+0x90/0x90 [162513.514864] start_transaction+0x37c/0x5f0 [btrfs] [162513.514879] transaction_kthread+0xa4/0x170 [btrfs] [162513.514891] ? btrfs_cleanup_transaction+0x660/0x660 [btrfs] [162513.514894] kthread+0x153/0x170 [162513.514897] ? kthread_stop+0x2c0/0x2c0 [162513.514902] ret_from_fork+0x22/0x30 [162513.514916] INFO: task fsstress:1356184 blocked for more than 120 seconds. [162513.515192] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.515431] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.515680] task:fsstress state:D stack: 0 pid:1356184 ppid:1356177 flags:0x00004000 [162513.515682] Call Trace: [162513.515688] __schedule+0x5ce/0xd00 [162513.515691] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.515697] schedule+0x46/0xf0 [162513.515712] wait_current_trans+0xde/0x140 [btrfs] [162513.515716] ? 
finish_wait+0x90/0x90 [162513.515729] start_transaction+0x37c/0x5f0 [btrfs] [162513.515743] btrfs_attach_transaction_barrier+0x1f/0x50 [btrfs] [162513.515753] btrfs_sync_fs+0x61/0x1c0 [btrfs] [162513.515758] ? __ia32_sys_fdatasync+0x20/0x20 [162513.515761] iterate_supers+0x87/0xf0 [162513.515765] ksys_sync+0x60/0xb0 [162513.515768] __do_sys_sync+0xa/0x10 [162513.515771] do_syscall_64+0x33/0x80 [162513.515774] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.515781] RIP: 0033:0x7f5238f50bd7 [162513.515782] Code: Bad RIP value. [162513.515784] RSP: 002b:00007fff67b978e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a2 [162513.515786] RAX: ffffffffffffffda RBX: 000055b1fad2c560 RCX: 00007f5238f50bd7 [162513.515788] RDX: 00000000ffffffff RSI: 000000000daf0e74 RDI: 000000000000003a [162513.515789] RBP: 0000000000000032 R08: 000000000000000a R09: 00007f5239019be0 [162513.515791] R10: fffffffffffff24f R11: 0000000000000206 R12: 000000000000003a [162513.515792] R13: 00007fff67b97950 R14: 00007fff67b97906 R15: 000055b1fad1a340 [162513.515804] INFO: task fsstress:1356185 blocked for more than 120 seconds. [162513.516064] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.516329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.516617] task:fsstress state:D stack: 0 pid:1356185 ppid:1356177 flags:0x00000000 [162513.516620] Call Trace: [162513.516625] __schedule+0x5ce/0xd00 [162513.516628] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.516634] schedule+0x46/0xf0 [162513.516647] wait_current_trans+0xde/0x140 [btrfs] [162513.516650] ? finish_wait+0x90/0x90 [162513.516662] start_transaction+0x4d7/0x5f0 [btrfs] [162513.516679] btrfs_setxattr_trans+0x3c/0x100 [btrfs] [162513.516686] __vfs_setxattr+0x66/0x80 [162513.516691] __vfs_setxattr_noperm+0x70/0x200 [162513.516697] vfs_setxattr+0x6b/0x120 [162513.516703] setxattr+0x125/0x240 [162513.516709] ? lock_acquire+0xb1/0x480 [162513.516712] ? mnt_want_write+0x20/0x50 [162513.516721] ? 
rcu_read_lock_any_held+0x8e/0xb0 [162513.516723] ? preempt_count_add+0x49/0xa0 [162513.516725] ? __sb_start_write+0x19b/0x290 [162513.516727] ? preempt_count_add+0x49/0xa0 [162513.516732] path_setxattr+0xba/0xd0 [162513.516739] __x64_sys_setxattr+0x27/0x30 [162513.516741] do_syscall_64+0x33/0x80 [162513.516743] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.516745] RIP: 0033:0x7f5238f56d5a [162513.516746] Code: Bad RIP value. [162513.516748] RSP: 002b:00007fff67b97868 EFLAGS: 00000202 ORIG_RAX: 00000000000000bc [162513.516750] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f5238f56d5a [162513.516751] RDX: 000055b1fbb0d5a0 RSI: 00007fff67b978a0 RDI: 000055b1fbb0d470 [162513.516753] RBP: 000055b1fbb0d5a0 R08: 0000000000000001 R09: 00007fff67b97700 [162513.516754] R10: 0000000000000004 R11: 0000000000000202 R12: 0000000000000004 [162513.516756] R13: 0000000000000024 R14: 0000000000000001 R15: 00007fff67b978a0 [162513.516767] INFO: task fsstress:1356196 blocked for more than 120 seconds. [162513.517064] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.517365] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.517763] task:fsstress state:D stack: 0 pid:1356196 ppid:1356177 flags:0x00004000 [162513.517780] Call Trace: [162513.517786] __schedule+0x5ce/0xd00 [162513.517789] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.517796] schedule+0x46/0xf0 [162513.517810] wait_current_trans+0xde/0x140 [btrfs] [162513.517814] ? finish_wait+0x90/0x90 [162513.517829] start_transaction+0x37c/0x5f0 [btrfs] [162513.517845] btrfs_attach_transaction_barrier+0x1f/0x50 [btrfs] [162513.517857] btrfs_sync_fs+0x61/0x1c0 [btrfs] [162513.517862] ? 
__ia32_sys_fdatasync+0x20/0x20 [162513.517865] iterate_supers+0x87/0xf0 [162513.517869] ksys_sync+0x60/0xb0 [162513.517872] __do_sys_sync+0xa/0x10 [162513.517875] do_syscall_64+0x33/0x80 [162513.517878] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.517881] RIP: 0033:0x7f5238f50bd7 [162513.517883] Code: Bad RIP value. [162513.517885] RSP: 002b:00007fff67b978e8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a2 [162513.517887] RAX: ffffffffffffffda RBX: 000055b1fad2c560 RCX: 00007f5238f50bd7 [162513.517889] RDX: 0000000000000000 RSI: 000000007660add2 RDI: 0000000000000053 [162513.517891] RBP: 0000000000000032 R08: 0000000000000067 R09: 00007f5239019be0 [162513.517893] R10: fffffffffffff24f R11: 0000000000000206 R12: 0000000000000053 [162513.517895] R13: 00007fff67b97950 R14: 00007fff67b97906 R15: 000055b1fad1a340 [162513.517908] INFO: task fsstress:1356197 blocked for more than 120 seconds. [162513.518298] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.518672] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.519157] task:fsstress state:D stack: 0 pid:1356197 ppid:1356177 flags:0x00000000 [162513.519160] Call Trace: [162513.519165] __schedule+0x5ce/0xd00 [162513.519168] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.519174] schedule+0x46/0xf0 [162513.519190] wait_current_trans+0xde/0x140 [btrfs] [162513.519193] ? finish_wait+0x90/0x90 [162513.519206] start_transaction+0x4d7/0x5f0 [btrfs] [162513.519222] btrfs_create+0x57/0x200 [btrfs] [162513.519230] lookup_open+0x522/0x650 [162513.519246] path_openat+0x2b8/0xa50 [162513.519270] do_filp_open+0x91/0x100 [162513.519275] ? find_held_lock+0x32/0x90 [162513.519280] ? lock_acquired+0x33b/0x470 [162513.519285] ? do_raw_spin_unlock+0x4b/0xc0 [162513.519287] ? 
_raw_spin_unlock+0x29/0x40 [162513.519295] do_sys_openat2+0x20d/0x2d0 [162513.519300] do_sys_open+0x44/0x80 [162513.519304] do_syscall_64+0x33/0x80 [162513.519307] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.519309] RIP: 0033:0x7f5238f4a903 [162513.519310] Code: Bad RIP value. [162513.519312] RSP: 002b:00007fff67b97758 EFLAGS: 00000246 ORIG_RAX: 0000000000000055 [162513.519314] RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f5238f4a903 [162513.519316] RDX: 0000000000000000 RSI: 00000000000001b6 RDI: 000055b1fbb0d470 [162513.519317] RBP: 00007fff67b978c0 R08: 0000000000000001 R09: 0000000000000002 [162513.519319] R10: 00007fff67b974f7 R11: 0000000000000246 R12: 0000000000000013 [162513.519320] R13: 00000000000001b6 R14: 00007fff67b97906 R15: 000055b1fad1c620 [162513.519332] INFO: task btrfs:1356211 blocked for more than 120 seconds. [162513.519727] Not tainted 5.9.0-rc6-btrfs-next-69 #1 [162513.520115] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [162513.520508] task:btrfs state:D stack: 0 pid:1356211 ppid:1356178 flags:0x00004002 [162513.520511] Call Trace: [162513.520516] __schedule+0x5ce/0xd00 [162513.520519] ? _raw_spin_unlock_irqrestore+0x3c/0x60 [162513.520525] schedule+0x46/0xf0 [162513.520544] btrfs_scrub_pause+0x11f/0x180 [btrfs] [162513.520548] ? finish_wait+0x90/0x90 [162513.520562] btrfs_commit_transaction+0x45a/0xc30 [btrfs] [162513.520574] ? start_transaction+0xe0/0x5f0 [btrfs] [162513.520596] btrfs_dev_replace_finishing+0x6d8/0x711 [btrfs] [162513.520619] btrfs_dev_replace_by_ioctl.cold+0x1cc/0x1fd [btrfs] [162513.520639] btrfs_ioctl+0x2a25/0x36f0 [btrfs] [162513.520643] ? do_sigaction+0xf3/0x240 [162513.520645] ? find_held_lock+0x32/0x90 [162513.520648] ? do_sigaction+0xf3/0x240 [162513.520651] ? lock_acquired+0x33b/0x470 [162513.520655] ? _raw_spin_unlock_irq+0x24/0x50 [162513.520657] ? lockdep_hardirqs_on+0x7d/0x100 [162513.520660] ? _raw_spin_unlock_irq+0x35/0x50 [162513.520662] ? 
do_sigaction+0xf3/0x240 [162513.520671] ? __x64_sys_ioctl+0x83/0xb0 [162513.520672] __x64_sys_ioctl+0x83/0xb0 [162513.520677] do_syscall_64+0x33/0x80 [162513.520679] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [162513.520681] RIP: 0033:0x7fc3cd307d87 [162513.520682] Code: Bad RIP value. [162513.520684] RSP: 002b:00007ffe30a56bb8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [162513.520686] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fc3cd307d87 [162513.520687] RDX: 00007ffe30a57a30 RSI: 00000000ca289435 RDI: 0000000000000003 [162513.520689] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [162513.520690] R10: 0000000000000008 R11: 0000000000000202 R12: 0000000000000003 [162513.520692] R13: 0000557323a212e0 R14: 00007ffe30a5a520 R15: 0000000000000001 [162513.520703] Showing all locks held in the system: [162513.520712] 1 lock held by khungtaskd/54: [162513.520713] #0: ffffffffb40a91a0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x15/0x197 [162513.520728] 1 lock held by in:imklog/596: [162513.520729] #0: ffff8f3f0d781400 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x4d/0x60 [162513.520782] 1 lock held by btrfs-transacti/1356167: [162513.520784] #0: ffff8f3d810cc848 (&fs_info->transaction_kthread_mutex){+.+.}-{3:3}, at: transaction_kthread+0x4a/0x170 [btrfs] [162513.520798] 1 lock held by btrfs/1356190: [162513.520800] #0: ffff8f3d57644470 (sb_writers#15){.+.+}-{0:0}, at: mnt_want_write_file+0x22/0x60 [162513.520805] 1 lock held by fsstress/1356184: [162513.520806] #0: ffff8f3d576440e8 (&type->s_umount_key#62){++++}-{3:3}, at: iterate_supers+0x6f/0xf0 [162513.520811] 3 locks held by fsstress/1356185: [162513.520812] #0: ffff8f3d57644470 (sb_writers#15){.+.+}-{0:0}, at: mnt_want_write+0x20/0x50 [162513.520815] #1: ffff8f3d80a650b8 (&type->i_mutex_dir_key#10){++++}-{3:3}, at: vfs_setxattr+0x50/0x120 [162513.520820] #2: ffff8f3d57644690 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40e/0x5f0 [btrfs] [162513.520833] 1 lock held by 
fsstress/1356196:
[162513.520834] #0: ffff8f3d576440e8 (&type->s_umount_key#62){++++}-{3:3}, at: iterate_supers+0x6f/0xf0
[162513.520838] 3 locks held by fsstress/1356197:
[162513.520839] #0: ffff8f3d57644470 (sb_writers#15){.+.+}-{0:0}, at: mnt_want_write+0x20/0x50
[162513.520843] #1: ffff8f3d506465e8 (&type->i_mutex_dir_key#10){++++}-{3:3}, at: path_openat+0x2a7/0xa50
[162513.520846] #2: ffff8f3d57644690 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40e/0x5f0 [btrfs]
[162513.520858] 2 locks held by btrfs/1356211:
[162513.520859] #0: ffff8f3d810cde30 (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+.}-{3:3}, at: btrfs_dev_replace_finishing+0x52/0x711 [btrfs]
[162513.520877] #1: ffff8f3d57644690 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40e/0x5f0 [btrfs]

This was weird because the stack traces show that a transaction commit, triggered by a device replace operation, is blocked trying to pause any running scrubs, but there are no stack traces of blocked tasks doing a scrub.

After poking around with drgn, I noticed there was a scrub task that was constantly running and blocking for short periods of time:

    >>> t = find_task(prog, 1356190)
    >>> prog.stack_trace(t)
    #0  __schedule+0x5ce/0xcfc
    #1  schedule+0x46/0xe4
    #2  schedule_timeout+0x1df/0x475
    #3  btrfs_reada_wait+0xda/0x132
    #4  scrub_stripe+0x2a8/0x112f
    #5  scrub_chunk+0xcd/0x134
    #6  scrub_enumerate_chunks+0x29e/0x5ee
    #7  btrfs_scrub_dev+0x2d5/0x91b
    #8  btrfs_ioctl+0x7f5/0x36e7
    #9  __x64_sys_ioctl+0x83/0xb0
    #10 do_syscall_64+0x33/0x77
    #11 entry_SYSCALL_64+0x7c/0x156

Which corresponds to:

    int btrfs_reada_wait(void *handle)
    {
        struct reada_control *rc = handle;
        struct btrfs_fs_info *fs_info = rc->fs_info;

        while (atomic_read(&rc->elems)) {
            if (!atomic_read(&fs_info->reada_works_cnt))
                reada_start_machine(fs_info);
            wait_event_timeout(rc->wait, atomic_read(&rc->elems) == 0,
                               (HZ + 9) / 10);
        }
    (...)
So the counter "rc->elems" was set to 1 and never decreased to 0, causing the scrub task to loop forever in that function. Then I used the following script for drgn to check the readahead requests:

    $ cat dump_reada.py
    import sys
    import drgn
    from drgn import NULL, Object, cast, container_of, execscript, \
        reinterpret, sizeof
    from drgn.helpers.linux import *

    mnt_path = b"/home/fdmanana/btrfs-tests/scratch_1"

    mnt = None
    for mnt in for_each_mount(prog, dst = mnt_path):
        pass

    if mnt is None:
        sys.stderr.write(f'Error: mount point {mnt_path} not found\n')
        sys.exit(1)

    fs_info = cast('struct btrfs_fs_info *', mnt.mnt.mnt_sb.s_fs_info)

    def dump_re(re):
        nzones = re.nzones.value_()
        print(f're at {hex(re.value_())}')
        print(f'\t logical {re.logical.value_()}')
        print(f'\t refcnt {re.refcnt.value_()}')
        print(f'\t nzones {nzones}')
        for i in range(nzones):
            dev = re.zones[i].device
            name = dev.name.str.string_()
            print(f'\t\t dev id {dev.devid.value_()} name {name}')
        print()

    for _, e in radix_tree_for_each(fs_info.reada_tree):
        re = cast('struct reada_extent *', e)
        dump_re(re)

    $ drgn dump_reada.py
    re at 0xffff8f3da9d25ad8
             logical 38928384
             refcnt 1
             nzones 1
                     dev id 0 name b'/dev/sdd'
    $

So there was one readahead extent with a single zone corresponding to the source device of that last device replace operation logged in dmesg/syslog. Also the ID of that zone's device was 0 which is a special value set in the source device of a device replace operation when the operation finishes (constant BTRFS_DEV_REPLACE_DEVID set at btrfs_dev_replace_finishing()), confirming again that device /dev/sdd was the source of a device replace operation.
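To make the symptom concrete, the polling loop of btrfs_reada_wait() quoted earlier can be modeled in plain Python. This is only a toy sketch and not kernel code: `ReadaControl`, `reada_wait`, the two worker functions and the `max_polls` cap are hypothetical names introduced for illustration. The point it shows is that the loop exits only when the counter reaches 0, so a request whose readahead is never started keeps the waiter polling.

```python
from dataclasses import dataclass


@dataclass
class ReadaControl:
    """Toy stand-in for struct reada_control: only the 'elems' counter."""
    elems: int = 0


def reada_wait(rc, worker, max_polls=50):
    """Poll until rc.elems reaches 0, like the loop in btrfs_reada_wait().

    'worker' is called once per poll and models the readahead machine making
    progress. Returns the number of polls used, or None if the counter never
    reached 0 within max_polls (the kernel loop has no such cap).
    """
    polls = 0
    while rc.elems:
        if polls == max_polls:
            return None
        worker(rc)
        polls += 1
    return polls


def healthy_worker(rc):
    # Readahead for the extent completes, so the counter is decremented.
    rc.elems -= 1


def stuck_worker(rc):
    # The extent's only zone points at a device that is no longer in the
    # device list, so no readahead starts and the counter is left untouched.
    pass
```

With `healthy_worker` the wait finishes after one poll; with `stuck_worker` it only stops because of the artificial cap, which is the situation the scrub task was in.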
Normally there should be as many zones in the readahead extent as there are devices, and I wasn't expecting the extent to be in a block group with a 'single' profile, so I went and confirmed with the following drgn script that there weren't any single profile block groups:

    $ cat dump_block_groups.py
    import sys
    import drgn
    from drgn import NULL, Object, cast, container_of, execscript, \
        reinterpret, sizeof
    from drgn.helpers.linux import *

    mnt_path = b"/home/fdmanana/btrfs-tests/scratch_1"

    mnt = None
    for mnt in for_each_mount(prog, dst = mnt_path):
        pass

    if mnt is None:
        sys.stderr.write(f'Error: mount point {mnt_path} not found\n')
        sys.exit(1)

    fs_info = cast('struct btrfs_fs_info *', mnt.mnt.mnt_sb.s_fs_info)

    BTRFS_BLOCK_GROUP_DATA = (1 << 0)
    BTRFS_BLOCK_GROUP_SYSTEM = (1 << 1)
    BTRFS_BLOCK_GROUP_METADATA = (1 << 2)
    BTRFS_BLOCK_GROUP_RAID0 = (1 << 3)
    BTRFS_BLOCK_GROUP_RAID1 = (1 << 4)
    BTRFS_BLOCK_GROUP_DUP = (1 << 5)
    BTRFS_BLOCK_GROUP_RAID10 = (1 << 6)
    BTRFS_BLOCK_GROUP_RAID5 = (1 << 7)
    BTRFS_BLOCK_GROUP_RAID6 = (1 << 8)
    BTRFS_BLOCK_GROUP_RAID1C3 = (1 << 9)
    BTRFS_BLOCK_GROUP_RAID1C4 = (1 << 10)

    def bg_flags_string(bg):
        flags = bg.flags.value_()
        ret = ''
        if flags & BTRFS_BLOCK_GROUP_DATA:
            ret = 'data'
        if flags & BTRFS_BLOCK_GROUP_METADATA:
            if len(ret) > 0:
                ret += '|'
            ret += 'meta'
        if flags & BTRFS_BLOCK_GROUP_SYSTEM:
            if len(ret) > 0:
                ret += '|'
            ret += 'system'
        if flags & BTRFS_BLOCK_GROUP_RAID0:
            ret += ' raid0'
        elif flags & BTRFS_BLOCK_GROUP_RAID1:
            ret += ' raid1'
        elif flags & BTRFS_BLOCK_GROUP_DUP:
            ret += ' dup'
        elif flags & BTRFS_BLOCK_GROUP_RAID10:
            ret += ' raid10'
        elif flags & BTRFS_BLOCK_GROUP_RAID5:
            ret += ' raid5'
        elif flags & BTRFS_BLOCK_GROUP_RAID6:
            ret += ' raid6'
        elif flags & BTRFS_BLOCK_GROUP_RAID1C3:
            ret += ' raid1c3'
        elif flags & BTRFS_BLOCK_GROUP_RAID1C4:
            ret += ' raid1c4'
        else:
            ret += ' single'

        return ret

    def dump_bg(bg):
        print()
        print(f'block group at {hex(bg.value_())}')
        print(f'\t start {bg.start.value_()} length {bg.length.value_()}')
        print(f'\t flags {bg.flags.value_()} - {bg_flags_string(bg)}')

    bg_root = fs_info.block_group_cache_tree.address_of_()
    for bg in rbtree_inorder_for_each_entry('struct btrfs_block_group', bg_root, 'cache_node'):
        dump_bg(bg)

    $ drgn dump_block_groups.py

    block group at 0xffff8f3d673b0400
             start 22020096 length 16777216
             flags 258 - system raid6

    block group at 0xffff8f3d53ddb400
             start 38797312 length 536870912
             flags 260 - meta raid6

    block group at 0xffff8f3d5f4d9c00
             start 575668224 length 2147483648
             flags 257 - data raid6

    block group at 0xffff8f3d08189000
             start 2723151872 length 67108864
             flags 258 - system raid6

    block group at 0xffff8f3db70ff000
             start 2790260736 length 1073741824
             flags 260 - meta raid6

    block group at 0xffff8f3d5f4dd800
             start 3864002560 length 67108864
             flags 258 - system raid6

    block group at 0xffff8f3d67037000
             start 3931111424 length 2147483648
             flags 257 - data raid6
    $

So there were only 2 reasons left for having a readahead extent with a single zone: reada_find_zone(), called when creating a readahead extent, returned NULL either because we failed to find the corresponding block group or because a memory allocation failed. With some additional and custom tracing I figured out that on every further occurrence of the problem the block group had just been deleted when we were looping to create the zones for the readahead extent (at reada_find_extent()), so we ended up with only one zone in the readahead extent, corresponding to a device that ends up getting replaced.

So after figuring that out it became obvious why the hang happens:

1) Task A starts a scrub on any device of the filesystem, except for device /dev/sdd;

2) Task B starts a device replace with /dev/sdd as the source device;

3) Task A calls btrfs_reada_add() from scrub_stripe() and it is currently starting to scrub a stripe from block group X. This call to btrfs_reada_add() is the one for the extent tree.
   When btrfs_reada_add() calls reada_add_block(), it passes the logical address of the extent tree's root node as its 'logical' argument - a value of 38928384;

4) Task A then enters reada_find_extent(), called from reada_add_block(). It finds there isn't any existing readahead extent for the logical address 38928384, so it proceeds to the path of creating a new one. It calls btrfs_map_block() to find out which stripes exist for the block group X. On the first iteration of the for loop that iterates over the stripes, it finds the stripe for device /dev/sdd, so it creates one zone for that device and adds it to the readahead extent. Before getting into the second iteration of the loop, the cleanup kthread deletes block group X because it was empty. So in the iterations for the remaining stripes it does not add more zones to the readahead extent, since the calls to reada_find_zone() return NULL as they can no longer find block group X. As a result the new readahead extent has a single zone, corresponding to the device /dev/sdd;

5) Before task A returns to btrfs_reada_add() and queues the readahead job for the readahead work queue, task B finishes the device replace and at btrfs_dev_replace_finishing() swaps the device /dev/sdd with the new device /dev/sdg;

6) Task A returns to reada_add_block(), which increments the counter "->elems" of the reada_control structure allocated at btrfs_reada_add(). Then it returns back to btrfs_reada_add() and calls reada_start_machine(). This queues a job in the readahead work queue to run the function reada_start_machine_worker(), which calls __reada_start_machine(). At __reada_start_machine() we take the device list mutex and for each device found in the current device list, we call reada_start_machine_dev() to start the readahead work. However at this point the device /dev/sdd was already freed and is not in the device list anymore.
   This means the corresponding readahead for the extent at 38928384 is never started, and therefore the "->elems" counter of the reada_control structure allocated at btrfs_reada_add() never goes down to 0, causing the call to btrfs_reada_wait(), done by the scrub task, to wait forever.

Note that the readahead request can be made either after the device replace started or before it started. However, in practice it is very unlikely that a device replace is able to start after a readahead request is made and is able to complete before the readahead request completes - maybe only on a very small and nearly empty filesystem.

This hang however is not the only problem we can have with readahead and device removals. When the readahead extent has zones other than the one corresponding to the device that is being removed (either by a device replace or a device remove operation), we risk having a use-after-free on the device when dropping the last reference of the readahead extent.

For example if we create a readahead extent with two zones, one for the device /dev/sdd and one for the device /dev/sde:

1) Before the readahead worker starts, the device /dev/sdd is removed, and the corresponding btrfs_device structure is freed. However the readahead extent still has the zone pointing to the device structure;

2) When the readahead worker starts, it only finds device /dev/sde in the current device list of the filesystem;

3) It starts the readahead work, at reada_start_machine_dev(), using the device /dev/sde;

4) Then when it finishes reading the extent from device /dev/sde, it calls __readahead_hook() which ends up dropping the last reference on the readahead extent through the last call to reada_extent_put();

5) At reada_extent_put() it iterates over each zone of the readahead extent and attempts to delete an element from the device's 'reada_extents' radix tree, resulting in a use-after-free, as the device pointer of the zone for /dev/sdd is now stale.
   We can also access the device after dropping the last reference of a zone, through reada_zone_release(), also called by reada_extent_put().

A device remove suffers from the same problem. However, since it shrinks the device size down to zero before removing the device, it is very unlikely to still have readahead requests not completed by the time we free the device; the only possibility is if the device has very little space allocated.

While the hang problem is exclusive to scrub, since it is currently the only user of btrfs_reada_add() and btrfs_reada_wait(), the use-after-free problem affects any path that triggers readahead, which includes btree_readahead_hook() and __readahead_hook() (a readahead worker can trigger readahead for the children of a node) for example - any path that ends up calling reada_add_block() can trigger the use-after-free after a device is removed.

So fix this by waiting for any readahead requests for a device to complete before removing a device, ensuring that while waiting for existing ones no new ones can be made.

This problem has been around for a very long time - the readahead code was added in 2011, device remove exists since 2008 and device replace was introduced in 2013, so it is hard to pick a specific commit for a git Fixes tag.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
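The hang sequence and the idea behind the fix described in the commit message above can be illustrated with a small Python model. It is a sketch under stated assumptions, not the btrfs implementation: `Device`, `ToyFs`, the `no_reada` flag and every method name are hypothetical, and the real code works with radix trees, reference counts and work queues that are not modeled here. The model only tracks which devices are listed, how many queued readahead extents point at each device, and the "->elems" counter.

```python
class Device:
    def __init__(self, name):
        self.name = name
        self.pending = 0       # queued readahead extents with a zone on this device
        self.no_reada = False  # hypothetical flag: refuse new readahead requests


class ToyFs:
    def __init__(self, names):
        self.devices = {name: Device(name) for name in names}
        self.elems = 0         # models the reada_control "->elems" counter
        self.queue = []        # queued extents, each a list of zone devices

    def reada_add(self, names):
        """Queue one readahead extent with a zone per usable device."""
        zones = [self.devices[n] for n in names
                 if n in self.devices and not self.devices[n].no_reada]
        if not zones:
            return False
        for dev in zones:
            dev.pending += 1
        self.queue.append(zones)
        self.elems += 1
        return True

    def run_worker(self):
        """Start readahead only for extents with a zone on a listed device."""
        still_queued = []
        for zones in self.queue:
            if any(self.devices.get(dev.name) is dev for dev in zones):
                for dev in zones:
                    dev.pending -= 1
                self.elems -= 1
            else:
                still_queued.append(zones)  # never started: elems stays up
        self.queue = still_queued

    def remove_unsafe(self, name):
        # Old behaviour in this model: drop the device while extents reference it.
        del self.devices[name]

    def remove_after_draining(self, name):
        # Idea of the fix: block new requests, wait for existing ones, then remove.
        dev = self.devices[name]
        dev.no_reada = True
        while dev.pending:
            self.run_worker()
        del self.devices[name]


def hang_scenario():
    fs = ToyFs(["/dev/sdd", "/dev/sde"])
    fs.reada_add(["/dev/sdd"])       # extent ends up with a single zone
    fs.remove_unsafe("/dev/sdd")     # device replace finishes first
    fs.run_worker()
    return fs.elems


def fixed_scenario():
    fs = ToyFs(["/dev/sdd", "/dev/sde"])
    fs.reada_add(["/dev/sdd"])
    fs.remove_after_draining("/dev/sdd")
    refused = not fs.reada_add(["/dev/sdd"])
    return fs.elems, refused
```

In `hang_scenario()` the counter is left at 1, which is what keeps the scrub task waiting; in `fixed_scenario()` the counter drains to 0 before the device goes away and later requests for that device are refused.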
pcercuei referenced this issue in OpenDingux/linux on Nov 28, 2020
This fix is for a failure that occurred in the DWARF unwind perf test.

Stack unwinders may probe memory when looking for frames. Memory sanitizer will poison and track uninitialized memory on the stack, and on the heap if the value is copied to the heap. This can lead to false memory sanitizer failures for the use of an uninitialized value. Avoid this problem by removing the poison on the copied stack.

The full msan failure with track origins looks like:

==2168==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8
    #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #23 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22
    #1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13
    #2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #24 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9
    #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #23 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10
    #1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13
    #2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18
    #3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #25 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
    #1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2
    #2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9
    #3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6
    #4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #17 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events'
    #0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445

SUMMARY: MemorySanitizer: use-of-uninitialized-value elfutils/libdwfl/frame_unwind.c:648:8 in handle_cfi

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201113182053.754625-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
pcercuei referenced this issue in OpenDingux/linux on Dec 17, 2020
We hit this issue in our internal test. When enabling generic kasan, a kfree()'d object is put into per-cpu quarantine first. If the cpu goes offline, object still remains in the per-cpu quarantine. If we call kmem_cache_destroy() now, slub will report "Objects remaining" error.

  =============================================================================
  BUG test_module_slab (Not tainted): Objects remaining in test_module_slab on __kmem_cache_shutdown()
  -----------------------------------------------------------------------------

  Disabling lock debugging due to kernel taint
  INFO: Slab 0x(____ptrval____) objects=34 used=1 fp=0x(____ptrval____) flags=0x2ffff00000010200
  CPU: 3 PID: 176 Comm: cat Tainted: G B 5.10.0-rc1-00007-g4525c8781ec0-dirty #10
  Hardware name: linux,dummy-virt (DT)
  Call trace:
    dump_backtrace+0x0/0x2b0
    show_stack+0x18/0x68
    dump_stack+0xfc/0x168
    slab_err+0xac/0xd4
    __kmem_cache_shutdown+0x1e4/0x3c8
    kmem_cache_destroy+0x68/0x130
    test_version_show+0x84/0xf0
    module_attr_show+0x40/0x60
    sysfs_kf_seq_show+0x128/0x1c0
    kernfs_seq_show+0xa0/0xb8
    seq_read+0x1f0/0x7e8
    kernfs_fop_read+0x70/0x338
    vfs_read+0xe4/0x250
    ksys_read+0xc8/0x180
    __arm64_sys_read+0x44/0x58
    el0_svc_common.constprop.0+0xac/0x228
    do_el0_svc+0x38/0xa0
    el0_sync_handler+0x170/0x178
    el0_sync+0x174/0x180
  INFO: Object 0x(____ptrval____) @offset=15848
  INFO: Allocated in test_version_show+0x98/0xf0 age=8188 cpu=6 pid=172
    stack_trace_save+0x9c/0xd0
    set_track+0x64/0xf0
    alloc_debug_processing+0x104/0x1a0
    ___slab_alloc+0x628/0x648
    __slab_alloc.isra.0+0x2c/0x58
    kmem_cache_alloc+0x560/0x588
    test_version_show+0x98/0xf0
    module_attr_show+0x40/0x60
    sysfs_kf_seq_show+0x128/0x1c0
    kernfs_seq_show+0xa0/0xb8
    seq_read+0x1f0/0x7e8
    kernfs_fop_read+0x70/0x338
    vfs_read+0xe4/0x250
    ksys_read+0xc8/0x180
    __arm64_sys_read+0x44/0x58
    el0_svc_common.constprop.0+0xac/0x228
  kmem_cache_destroy test_module_slab: Slab cache still has objects

Register a cpu hotplug function to remove all objects in the offline per-cpu quarantine when cpu is going offline. Set a per-cpu variable to indicate this cpu is offline.

[qiang.zhang@windriver.com: fix slab double free when cpu-hotplug]
Link: https://lkml.kernel.org/r/20201204102206.20237-1-qiang.zhang@windriver.com
Link: https://lkml.kernel.org/r/1606895585-17382-2-git-send-email-Kuan-Ying.Lee@mediatek.com
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Signed-off-by: Zqiang <qiang.zhang@windriver.com>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Guangye Yang <guangye.yang@mediatek.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Matthias Brugger <matthias.bgg@gmail.com>
Cc: Nicholas Tang <nicholas.tang@mediatek.com>
Cc: Miles Chen <miles.chen@mediatek.com>
Cc: Qian Cai <qcai@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
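The approach described in this commit message (drain a CPU's quarantine when it goes offline, and mark the CPU so nothing new is queued there) can be illustrated with a small userspace model. This is only a sketch and not the kernel patch: every name below (`cpu_quarantine`, `quarantine_put`, `cpu_offline`, and so on) is hypothetical, objects are plain integers, and a full queue simply frees the object instead of spilling into a global queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS        4   /* hypothetical CPU count for the model */
#define QUARANTINE_CAP 8   /* hypothetical per-CPU queue capacity */

/* Userspace stand-in for a per-CPU quarantine (hypothetical layout). */
struct cpu_quarantine {
    int    objs[QUARANTINE_CAP]; /* ids of freed objects being held back */
    size_t count;
    bool   offline;              /* set once the CPU has gone offline */
};

static struct cpu_quarantine quarantine[NR_CPUS];
static size_t really_freed;      /* how many objects were actually released */

static void real_free(int obj)
{
    (void)obj;
    really_freed++;
}

/* Queue a freed object on `cpu`. Returns true if it was quarantined,
 * false if it was released immediately (CPU offline or queue full). */
bool quarantine_put(int cpu, int obj)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    struct cpu_quarantine *q = &quarantine[cpu];

    if (q->offline || q->count == QUARANTINE_CAP) {
        real_free(obj);
        return false;
    }
    q->objs[q->count++] = obj;
    return true;
}

/* Model of the hotplug "offline" callback: mark the CPU offline first,
 * then release everything still sitting in its quarantine. */
void cpu_offline(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    struct cpu_quarantine *q = &quarantine[cpu];

    q->offline = true;
    while (q->count > 0)
        real_free(q->objs[--q->count]);
}

/* Model of the hotplug "online" callback: allow queueing again. */
void cpu_online(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    quarantine[cpu].offline = false;
}

/* Objects still held in any per-CPU quarantine. A cache can only be
 * destroyed cleanly when none of its objects are counted here. */
size_t objects_remaining(void)
{
    size_t total = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        total += quarantine[cpu].count;
    return total;
}

size_t objects_freed(void)
{
    return really_freed;
}
```

In this model, an object queued on CPU 1 stays counted by `objects_remaining()` until `cpu_offline(1)` runs, which corresponds to the "Objects remaining" report above when the cache is destroyed before the quarantine is drained.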
Commit c172849 broke the userspace Bluetooth API by increasing the size of the `hci_conn_info` struct. This causes buffer overflows in userspace programs (notably, blueman crashes when doing almost anything). The patch was submitted upstream but was NAKed for this reason here:
https://lkml.org/lkml/2014/6/6/147
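
To make the failure mode concrete, here is a minimal sketch of why growing a struct that crosses the kernel/userspace boundary overflows existing binaries. The two layouts are simplified stand-ins: `conn_info_old` only approximates `hci_conn_info`, and the `extra` fields in `conn_info_new` are purely hypothetical, since the fields actually added by the commit are not listed in this issue. The helper names are also made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout an already-built userspace program was compiled against
 * (simplified, hypothetical stand-in for struct hci_conn_info). */
struct conn_info_old {
    uint16_t handle;
    uint8_t  bdaddr[6];
    uint8_t  type;
    uint8_t  out;
    uint16_t state;
    uint32_t link_mode;
};

/* The same struct after a kernel-side change appended fields.
 * The `extra` member is an assumption made for this example. */
struct conn_info_new {
    uint16_t handle;
    uint8_t  bdaddr[6];
    uint8_t  type;
    uint8_t  out;
    uint16_t state;
    uint32_t link_mode;
    uint32_t extra[3];
};

/* The ABI break in one line: the kernel now writes more bytes than
 * old binaries reserved. */
static_assert(sizeof(struct conn_info_new) > sizeof(struct conn_info_old),
              "new layout must be larger for the example to apply");

/* Bytes written past the end of a buffer sized for the old layout. */
size_t overflow_bytes(void)
{
    return sizeof(struct conn_info_new) - sizeof(struct conn_info_old);
}

/* Model of the kernel copying the new struct into a caller buffer.
 * Unlike the real copy-out, this one knows the buffer capacity and
 * refuses to overflow: returns 0 on success, -1 if it would not fit. */
int copy_out_checked(void *dst, size_t dst_cap,
                     const struct conn_info_new *src)
{
    if (dst_cap < sizeof *src)
        return -1;
    memcpy(dst, src, sizeof *src);
    return 0;
}
```

A program built against the old layout reserves only `sizeof(struct conn_info_old)` bytes, so an unchecked copy of the new struct would write `overflow_bytes()` bytes past its buffer. The kernel has no way to know which layout a binary was compiled against, which is why changing the size of an existing userspace-visible struct is treated as an ABI break.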