Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HWMON: Add OCC hwmon driver v5 #37

Closed
wants to merge 1 commit into from
Closed

Conversation

adamliyi
Copy link
Member

This patch add a hwmon driver for BMC to monitor POWER CPU sensors
via OCC (On-Chip-Controller).

Changes for this version:

  1. Fixed checkpatch.pl warnings and errors.
  2. Fixed issue from Joel Stanley's review
  3. Add a new sysfs interface: '/sys/class/i2c-adaptor/i2c-x/x-yyyy/online',
    to notify the driver OCC is active.
  4. Use device table to instantiate the occ-i2c device.
  5. Add a new hwmon sysfs interface: '/sys/class/hwmon/hwmonX/user_powercap',
    to send "Set User Power Cap" command to OCC.

Signed-off-by: Yi Li adamliyi@msn.com

This patch add a hwmon driver for BMC to monitor POWER CPU sensors
via OCC (On-Chip-Controller).

Changes for this version:
1. Fixed checkpatch.pl warnings and errors.
2. Fixed issue from Joel Stanley's review
3. Add a new sysfs interface: '/sys/class/i2c-adaptor/i2c-x/x-yyyy/online',
to notify the driver OCC is active.
4. Use device table to instantiate the occ-i2c device.
5. Add a new hwmon sysfs interface: '/sys/class/hwmon/hwmonX/user_powercap',
to send "Set User Power Cap" command to OCC.

Signed-off-by: Yi Li <adamliyi@msn.com>
nkskjames pushed a commit to nkskjames/linux that referenced this pull request Jan 13, 2016
Fixes the following lockdep splat:
[    1.244527] =============================================
[    1.245193] [ INFO: possible recursive locking detected ]
[    1.245193] 4.2.0-rc1+ openbmc#37 Not tainted
[    1.245193] ---------------------------------------------
[    1.245193] cp/742 is trying to acquire lock:
[    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
[    1.245193]
[    1.245193] but task is already holding lock:
[    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
[    1.245193]
[    1.245193] other info that might help us debug this:
[    1.245193]  Possible unsafe locking scenario:
[    1.245193]
[    1.245193]        CPU0
[    1.245193]        ----
[    1.245193]   lock(&sb->s_type->i_mutex_key#9);
[    1.245193]   lock(&sb->s_type->i_mutex_key#9);
[    1.245193]
[    1.245193]  *** DEADLOCK ***
[    1.245193]
[    1.245193]  May be due to missing lock nesting notation
[    1.245193]
[    1.245193] 2 locks held by cp/742:
[    1.245193]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811ad37f>] mnt_want_write+0x1f/0x50
[    1.245193]  openbmc#1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
[    1.245193]
[    1.245193] stack backtrace:
[    1.245193] CPU: 2 PID: 742 Comm: cp Not tainted 4.2.0-rc1+ openbmc#37
[    1.245193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140816_022509-build35 04/01/2014
[    1.245193]  ffffffff8252d530 ffff88007b023a38 ffffffff814f6f49 ffffffff810b56c5
[    1.245193]  ffff88007c30cc80 ffff88007b023af8 ffffffff810a150d ffff88007b023a68
[    1.245193]  000000008101302a ffff880000000000 00000008f447e23f ffffffff8252d500
[    1.245193] Call Trace:
[    1.245193]  [<ffffffff814f6f49>] dump_stack+0x4c/0x65
[    1.245193]  [<ffffffff810b56c5>] ? console_unlock+0x1c5/0x510
[    1.245193]  [<ffffffff810a150d>] __lock_acquire+0x1a6d/0x1ea0
[    1.245193]  [<ffffffff8109fa78>] ? __lock_is_held+0x58/0x80
[    1.245193]  [<ffffffff810a1a93>] lock_acquire+0xd3/0x270
[    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff814fc83b>] mutex_lock_nested+0x6b/0x3a0
[    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
[    1.245193]  [<ffffffff8128e286>] ubifs_create+0xa6/0x1f0
[    1.245193]  [<ffffffff81198e7f>] ? path_openat+0x3af/0x1280
[    1.245193]  [<ffffffff81195d15>] vfs_create+0x95/0xc0
[    1.245193]  [<ffffffff8119929c>] path_openat+0x7cc/0x1280
[    1.245193]  [<ffffffff8109ffe3>] ? __lock_acquire+0x543/0x1ea0
[    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
[    1.245193]  [<ffffffff81088c00>] ? calc_global_load_tick+0x60/0x90
[    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
[    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
[    1.245193]  [<ffffffff8119ac55>] do_filp_open+0x75/0xd0
[    1.245193]  [<ffffffff814ffd86>] ? _raw_spin_unlock+0x26/0x40
[    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
[    1.245193]  [<ffffffff81189bd9>] do_sys_open+0x129/0x200
[    1.245193]  [<ffffffff81189cc9>] SyS_open+0x19/0x20
[    1.245193]  [<ffffffff81500717>] entry_SYSCALL_64_fastpath+0x12/0x6f

While the lockdep splat is a false positive, becuase path_openat holds i_mutex
of the parent directory and ubifs_init_security() tries to acquire i_mutex
of a new inode, it reveals that taking i_mutex in ubifs_init_security() is
in vain because it is only being called in the inode allocation path
and therefore nobody else can see the inode yet.

Cc: stable@vger.kernel.org # 3.20-
Reported-and-tested-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-and-tested-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: dedekind1@gmail.com
@shenki
Copy link
Member

shenki commented Jan 27, 2016

This has been merged as of fe68a78

Note that there is still a lot of rework to be done before this driver can be sent upstream. This will be tracked in #44

@shenki shenki closed this Jan 27, 2016
shenki pushed a commit that referenced this pull request May 5, 2016
commit 3b65428 upstream.

Even though hid_hw_* checks that passed in data_len is less than
HID_MAX_BUFFER_SIZE it is not enough, as i2c-hid does not necessarily
allocate buffers of HID_MAX_BUFFER_SIZE but rather checks all device
reports and select largest size. In-kernel users normally just send as much
data as report needs, so there is no problem, but hidraw users can do
whatever they please:

BUG: KASAN: slab-out-of-bounds in memcpy+0x34/0x54 at addr ffffffc07135ea80
Write of size 4101 by task syz-executor/8747
CPU: 2 PID: 8747 Comm: syz-executor Tainted: G    BU         3.18.0 #37
Hardware name: Google Tegra210 Smaug Rev 1,3+ (DT)
Call trace:
[<ffffffc00020ebcc>] dump_backtrace+0x0/0x258 arch/arm64/kernel/traps.c:83
[<ffffffc00020ee40>] show_stack+0x1c/0x2c arch/arm64/kernel/traps.c:172
[<     inline     >] __dump_stack lib/dump_stack.c:15
[<ffffffc001958114>] dump_stack+0x90/0x140 lib/dump_stack.c:50
[<     inline     >] print_error_description mm/kasan/report.c:97
[<     inline     >] kasan_report_error mm/kasan/report.c:278
[<ffffffc0004597dc>] kasan_report+0x268/0x530 mm/kasan/report.c:305
[<ffffffc0004592e8>] __asan_storeN+0x20/0x150 mm/kasan/kasan.c:718
[<ffffffc0004594e0>] memcpy+0x30/0x54 mm/kasan/kasan.c:299
[<ffffffc001306354>] __i2c_hid_command+0x2b0/0x7b4 drivers/hid/i2c-hid/i2c-hid.c:178
[<     inline     >] i2c_hid_set_or_send_report drivers/hid/i2c-hid/i2c-hid.c:321
[<ffffffc0013079a0>] i2c_hid_output_raw_report.isra.2+0x3d4/0x4b8 drivers/hid/i2c-hid/i2c-hid.c:589
[<ffffffc001307ad8>] i2c_hid_output_report+0x54/0x68 drivers/hid/i2c-hid/i2c-hid.c:602
[<     inline     >] hid_hw_output_report include/linux/hid.h:1039
[<ffffffc0012cc7a0>] hidraw_send_report+0x400/0x414 drivers/hid/hidraw.c:154
[<ffffffc0012cc7f4>] hidraw_write+0x40/0x64 drivers/hid/hidraw.c:177
[<ffffffc0004681dc>] vfs_write+0x1d4/0x3cc fs/read_write.c:534
[<     inline     >] SYSC_pwrite64 fs/read_write.c:627
[<ffffffc000468984>] SyS_pwrite64+0xec/0x144 fs/read_write.c:614
Object at ffffffc07135ea80, in cache kmalloc-512
Object allocated with size 268 bytes.

Let's check data length against the buffer size before attempting to copy
data over.

Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki pushed a commit that referenced this pull request Jul 12, 2016
commit d246dcb upstream.

[   40.467381] =============================================
[   40.473013] [ INFO: possible recursive locking detected ]
[   40.478651] 4.6.0-08691-g7f3db9a #37 Not tainted
[   40.483466] ---------------------------------------------
[   40.489098] usb/733 is trying to acquire lock:
[   40.493734]  (&(&dev->lock)->rlock){-.....}, at: [<bf129288>] ep0_complete+0x18/0xdc [gadgetfs]
[   40.502882]
[   40.502882] but task is already holding lock:
[   40.508967]  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.517811]
[   40.517811] other info that might help us debug this:
[   40.524623]  Possible unsafe locking scenario:
[   40.524623]
[   40.530798]        CPU0
[   40.533346]        ----
[   40.535894]   lock(&(&dev->lock)->rlock);
[   40.540088]   lock(&(&dev->lock)->rlock);
[   40.544284]
[   40.544284]  *** DEADLOCK ***
[   40.544284]
[   40.550461]  May be due to missing lock nesting notation
[   40.550461]
[   40.557544] 2 locks held by usb/733:
[   40.561271]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<c02a6114>] __fdget_pos+0x40/0x48
[   40.569219]  #1:  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.578523]
[   40.578523] stack backtrace:
[   40.583075] CPU: 0 PID: 733 Comm: usb Not tainted 4.6.0-08691-g7f3db9a #37
[   40.590246] Hardware name: Generic AM33XX (Flattened Device Tree)
[   40.596625] [<c010ffbc>] (unwind_backtrace) from [<c010c1bc>] (show_stack+0x10/0x14)
[   40.604718] [<c010c1bc>] (show_stack) from [<c04207fc>] (dump_stack+0xb0/0xe4)
[   40.612267] [<c04207fc>] (dump_stack) from [<c01886ec>] (__lock_acquire+0xf68/0x1994)
[   40.620440] [<c01886ec>] (__lock_acquire) from [<c0189528>] (lock_acquire+0xd8/0x238)
[   40.628621] [<c0189528>] (lock_acquire) from [<c06ad6b4>] (_raw_spin_lock_irqsave+0x38/0x4c)
[   40.637440] [<c06ad6b4>] (_raw_spin_lock_irqsave) from [<bf129288>] (ep0_complete+0x18/0xdc [gadgetfs])
[   40.647339] [<bf129288>] (ep0_complete [gadgetfs]) from [<bf10a728>] (musb_g_giveback+0x118/0x1b0 [musb_hdrc])
[   40.657842] [<bf10a728>] (musb_g_giveback [musb_hdrc]) from [<bf108768>] (musb_g_ep0_queue+0x16c/0x188 [musb_hdrc])
[   40.668772] [<bf108768>] (musb_g_ep0_queue [musb_hdrc]) from [<bf12a944>] (ep0_read+0x544/0x5e0 [gadgetfs])
[   40.678963] [<bf12a944>] (ep0_read [gadgetfs]) from [<c0284470>] (__vfs_read+0x20/0x110)
[   40.687414] [<c0284470>] (__vfs_read) from [<c0285324>] (vfs_read+0x88/0x114)
[   40.694864] [<c0285324>] (vfs_read) from [<c0286150>] (SyS_read+0x44/0x9c)
[   40.702051] [<c0286150>] (SyS_read) from [<c0107820>] (ret_fast_syscall+0x0/0x1c)

This is caused by the spinlock bug in ep0_read().
Fix the two other deadlock sources in gadgetfs_setup() too.

Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amboar pushed a commit to amboar/linux that referenced this pull request Jul 12, 2016
[   40.467381] =============================================
[   40.473013] [ INFO: possible recursive locking detected ]
[   40.478651] 4.6.0-08691-g7f3db9a openbmc#37 Not tainted
[   40.483466] ---------------------------------------------
[   40.489098] usb/733 is trying to acquire lock:
[   40.493734]  (&(&dev->lock)->rlock){-.....}, at: [<bf129288>] ep0_complete+0x18/0xdc [gadgetfs]
[   40.502882]
[   40.502882] but task is already holding lock:
[   40.508967]  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.517811]
[   40.517811] other info that might help us debug this:
[   40.524623]  Possible unsafe locking scenario:
[   40.524623]
[   40.530798]        CPU0
[   40.533346]        ----
[   40.535894]   lock(&(&dev->lock)->rlock);
[   40.540088]   lock(&(&dev->lock)->rlock);
[   40.544284]
[   40.544284]  *** DEADLOCK ***
[   40.544284]
[   40.550461]  May be due to missing lock nesting notation
[   40.550461]
[   40.557544] 2 locks held by usb/733:
[   40.561271]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<c02a6114>] __fdget_pos+0x40/0x48
[   40.569219]  openbmc#1:  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.578523]
[   40.578523] stack backtrace:
[   40.583075] CPU: 0 PID: 733 Comm: usb Not tainted 4.6.0-08691-g7f3db9a openbmc#37
[   40.590246] Hardware name: Generic AM33XX (Flattened Device Tree)
[   40.596625] [<c010ffbc>] (unwind_backtrace) from [<c010c1bc>] (show_stack+0x10/0x14)
[   40.604718] [<c010c1bc>] (show_stack) from [<c04207fc>] (dump_stack+0xb0/0xe4)
[   40.612267] [<c04207fc>] (dump_stack) from [<c01886ec>] (__lock_acquire+0xf68/0x1994)
[   40.620440] [<c01886ec>] (__lock_acquire) from [<c0189528>] (lock_acquire+0xd8/0x238)
[   40.628621] [<c0189528>] (lock_acquire) from [<c06ad6b4>] (_raw_spin_lock_irqsave+0x38/0x4c)
[   40.637440] [<c06ad6b4>] (_raw_spin_lock_irqsave) from [<bf129288>] (ep0_complete+0x18/0xdc [gadgetfs])
[   40.647339] [<bf129288>] (ep0_complete [gadgetfs]) from [<bf10a728>] (musb_g_giveback+0x118/0x1b0 [musb_hdrc])
[   40.657842] [<bf10a728>] (musb_g_giveback [musb_hdrc]) from [<bf108768>] (musb_g_ep0_queue+0x16c/0x188 [musb_hdrc])
[   40.668772] [<bf108768>] (musb_g_ep0_queue [musb_hdrc]) from [<bf12a944>] (ep0_read+0x544/0x5e0 [gadgetfs])
[   40.678963] [<bf12a944>] (ep0_read [gadgetfs]) from [<c0284470>] (__vfs_read+0x20/0x110)
[   40.687414] [<c0284470>] (__vfs_read) from [<c0285324>] (vfs_read+0x88/0x114)
[   40.694864] [<c0285324>] (vfs_read) from [<c0286150>] (SyS_read+0x44/0x9c)
[   40.702051] [<c0286150>] (SyS_read) from [<c0107820>] (ret_fast_syscall+0x0/0x1c)

This is caused by the spinlock bug in ep0_read().
Fix the two other deadlock sources in gadgetfs_setup() too.

Cc: <stable@vger.kernel.org> # v3.16+
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
shenki pushed a commit that referenced this pull request Jul 22, 2016
commit d246dcb upstream.

[   40.467381] =============================================
[   40.473013] [ INFO: possible recursive locking detected ]
[   40.478651] 4.6.0-08691-g7f3db9a #37 Not tainted
[   40.483466] ---------------------------------------------
[   40.489098] usb/733 is trying to acquire lock:
[   40.493734]  (&(&dev->lock)->rlock){-.....}, at: [<bf129288>] ep0_complete+0x18/0xdc [gadgetfs]
[   40.502882]
[   40.502882] but task is already holding lock:
[   40.508967]  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.517811]
[   40.517811] other info that might help us debug this:
[   40.524623]  Possible unsafe locking scenario:
[   40.524623]
[   40.530798]        CPU0
[   40.533346]        ----
[   40.535894]   lock(&(&dev->lock)->rlock);
[   40.540088]   lock(&(&dev->lock)->rlock);
[   40.544284]
[   40.544284]  *** DEADLOCK ***
[   40.544284]
[   40.550461]  May be due to missing lock nesting notation
[   40.550461]
[   40.557544] 2 locks held by usb/733:
[   40.561271]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<c02a6114>] __fdget_pos+0x40/0x48
[   40.569219]  #1:  (&(&dev->lock)->rlock){-.....}, at: [<bf12a420>] ep0_read+0x20/0x5e0 [gadgetfs]
[   40.578523]
[   40.578523] stack backtrace:
[   40.583075] CPU: 0 PID: 733 Comm: usb Not tainted 4.6.0-08691-g7f3db9a #37
[   40.590246] Hardware name: Generic AM33XX (Flattened Device Tree)
[   40.596625] [<c010ffbc>] (unwind_backtrace) from [<c010c1bc>] (show_stack+0x10/0x14)
[   40.604718] [<c010c1bc>] (show_stack) from [<c04207fc>] (dump_stack+0xb0/0xe4)
[   40.612267] [<c04207fc>] (dump_stack) from [<c01886ec>] (__lock_acquire+0xf68/0x1994)
[   40.620440] [<c01886ec>] (__lock_acquire) from [<c0189528>] (lock_acquire+0xd8/0x238)
[   40.628621] [<c0189528>] (lock_acquire) from [<c06ad6b4>] (_raw_spin_lock_irqsave+0x38/0x4c)
[   40.637440] [<c06ad6b4>] (_raw_spin_lock_irqsave) from [<bf129288>] (ep0_complete+0x18/0xdc [gadgetfs])
[   40.647339] [<bf129288>] (ep0_complete [gadgetfs]) from [<bf10a728>] (musb_g_giveback+0x118/0x1b0 [musb_hdrc])
[   40.657842] [<bf10a728>] (musb_g_giveback [musb_hdrc]) from [<bf108768>] (musb_g_ep0_queue+0x16c/0x188 [musb_hdrc])
[   40.668772] [<bf108768>] (musb_g_ep0_queue [musb_hdrc]) from [<bf12a944>] (ep0_read+0x544/0x5e0 [gadgetfs])
[   40.678963] [<bf12a944>] (ep0_read [gadgetfs]) from [<c0284470>] (__vfs_read+0x20/0x110)
[   40.687414] [<c0284470>] (__vfs_read) from [<c0285324>] (vfs_read+0x88/0x114)
[   40.694864] [<c0285324>] (vfs_read) from [<c0286150>] (SyS_read+0x44/0x9c)
[   40.702051] [<c0286150>] (SyS_read) from [<c0107820>] (ret_fast_syscall+0x0/0x1c)

This is caused by the spinlock bug in ep0_read().
Fix the two other deadlock sources in gadgetfs_setup() too.

Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amboar pushed a commit to amboar/linux that referenced this pull request Nov 16, 2016
Commit 2518ac5 ("staging: wilc1000: Replace kthread with workqueue
for host interface") adds an unconditional destroy_workqueue() on the
wilc's "hif_workqueue" soon after its creation thereby rendering
it unusable. It then further attempts to queue work onto this
non-existing hif_worqueue and results in:

Unable to handle kernel NULL pointer dereference at virtual address 00000010
pgd = de478000
[00000010] *pgd=3eec0831, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [openbmc#1] ARM
Modules linked in: wilc1000_sdio(C) wilc1000(C)
CPU: 0 PID: 825 Comm: ifconfig Tainted: G         C      4.8.0-rc8+ openbmc#37
Hardware name: Atmel SAMA5
task: df56f800 task.stack: deeb0000
PC is at __queue_work+0x90/0x284
LR is at __queue_work+0x58/0x284
pc : [<c0126bb0>]    lr : [<c0126b78>]    psr: 600f0093
sp : deeb1aa0  ip : def22d78  fp : deea6000
r10: 00000000  r9 : c0a08150  r8 : c0a2f05
r7 : 00000001  r6 : dee9b600  r5 : def22d74  r4 : 00000000
r3 : 00000000  r2 : def22d74  r1 : 07ffffff  r0 : 00000000
Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment none
...
[<c0127060>] (__queue_work) from [<c0127298>] (queue_work_on+0x34/0x40)
[<c0127298>] (queue_work_on) from [<bf0076b4>] (wilc_enqueue_cmd+0x54/0x64 [wilc1000])
[<bf0076b4>] (wilc_enqueue_cmd [wilc1000]) from [<bf0082b4>] (wilc_set_wfi_drv_handler+0x48/0x70 [wilc1000])
[<bf0082b4>] (wilc_set_wfi_drv_handler [wilc1000]) from [<bf00509c>] (wilc_mac_open+0x214/0x250 [wilc1000])
[<bf00509c>] (wilc_mac_open [wilc1000]) from [<c04fde98>] (__dev_open+0xb8/0x11c)
[<c04fde98>] (__dev_open) from [<c04fe128>] (__dev_change_flags+0x94/0x158)
[<c04fe128>] (__dev_change_flags) from [<c04fe204>] (dev_change_flags+0x18/0x48)
[<c04fe204>] (dev_change_flags) from [<c0557d5c>] (devinet_ioctl+0x6b4/0x788)
[<c0557d5c>] (devinet_ioctl) from [<c04e40a0>] (sock_ioctl+0x154/0x2cc)
[<c04e40a0>] (sock_ioctl) from [<c01b16e0>] (do_vfs_ioctl+0x9c/0x878)
[<c01b16e0>] (do_vfs_ioctl) from [<c01b1ef0>] (SyS_ioctl+0x34/0x5c)
[<c01b1ef0>] (SyS_ioctl) from [<c0107520>] (ret_fast_syscall+0x0/0x3c)
Code: e5932004 e1520006 01a04003 0affffff (e5943010)
---[ end trace b612328adaa6bf20 ]---

This fix removes the unnecessary call to destroy_workqueue() while opening
the device to avoid the above kernel panic. The deinit routine already
does a good job of terminating the workqueue when no longer needed.

Reported-by: Nicolas Ferre <Nicolas.Ferre@microchip.com>
Fixes: 2518ac5 ("staging: wilc1000: Replace kthread with workqueue for host interface")
Cc: stable@vger.kernel.org # 4.8+
Signed-off-by: Aditya Shankar <Aditya.Shankar@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amboar pushed a commit to amboar/linux that referenced this pull request Nov 16, 2016
Vince Waver reported the following bug:

  WARNING: CPU: 0 PID: 21338 at arch/x86/mm/fault.c:435 vmalloc_fault+0x58/0x1f0
  CPU: 0 PID: 21338 Comm: perf_fuzzer Not tainted 4.8.0+ openbmc#37
  Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
  Call Trace:
   <NMI>  ? dump_stack+0x46/0x59
   ? __warn+0xd5/0xee
   ? vmalloc_fault+0x58/0x1f0
   ? __do_page_fault+0x6d/0x48e
   ? perf_log_throttle+0xa4/0xf4
   ? trace_page_fault+0x22/0x30
   ? __unwind_start+0x28/0x42
   ? perf_callchain_kernel+0x75/0xac
   ? get_perf_callchain+0x13a/0x1f0
   ? perf_callchain+0x6a/0x6c
   ? perf_prepare_sample+0x71/0x2eb
   ? perf_event_output_forward+0x1a/0x54
   ? __default_send_IPI_shortcut+0x10/0x2d
   ? __perf_event_overflow+0xfb/0x167
   ? x86_pmu_handle_irq+0x113/0x150
   ? native_read_msr+0x6/0x34
   ? perf_event_nmi_handler+0x22/0x39
   ? perf_ibs_nmi_handler+0x4a/0x51
   ? perf_event_nmi_handler+0x22/0x39
   ? nmi_handle+0x4d/0xf0
   ? perf_ibs_handle_irq+0x3d1/0x3d1
   ? default_do_nmi+0x3c/0xd5
   ? do_nmi+0x92/0x102
   ? end_repeat_nmi+0x1a/0x1e
   ? entry_SYSCALL_64_after_swapgs+0x12/0x4a
   ? entry_SYSCALL_64_after_swapgs+0x12/0x4a
   ? entry_SYSCALL_64_after_swapgs+0x12/0x4a
   <EOE> ^A4---[ end trace 632723104d47d31a ]---
  BUG: stack guard page was hit at ffffc90008500000 (stack is ffffc900084fc000..ffffc900084fffff)
  kernel stack overflow (page fault): 0000 [openbmc#1] SMP
  ...

The NMI hit in the entry code right after setting up the stack pointer
from 'cpu_current_top_of_stack', so the kernel stack was empty.  The
'guess' version of __unwind_start() attempted to dereference the "top of
stack" pointer, which is not actually *on* the stack.

Add a check in the guess unwinder to deal with an empty stack.  (The
frame pointer unwinder already has such a check.)

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 7c7900f ("x86/unwind: Add new unwind interface and implementations")
Link: http://lkml.kernel.org/r/20161024133127.e5evgeebdbohnmpb@treble
Signed-off-by: Ingo Molnar <mingo@kernel.org>
shenki pushed a commit that referenced this pull request Dec 10, 2018
[ Upstream commit 45611c6 ]

The bug is not easily reproducable, as it may occur very infrequently
(we had machines with 20minutes heavy downloading before it occurred)
However, on a virual machine (VMWare on Windows 10 host) it occurred
pretty frequently (1-2 seconds after a speedtest was started)

dev->tx_skb mab be freed via dev_kfree_skb_irq on a callback
before it is set.

This causes the following problems:
- double free of the skb or potential memory leak
- in dmesg: 'recvmsg bug' and 'recvmsg bug 2' and eventually
  general protection fault

Example dmesg output:
[  134.841986] ------------[ cut here ]------------
[  134.841987] recvmsg bug: copied 9C24A555 seq 9C24B557 rcvnxt 9C25A6B3 fl 0
[  134.841993] WARNING: CPU: 7 PID: 2629 at /build/linux-hwe-On9fm7/linux-hwe-4.15.0/net/ipv4/tcp.c:1865 tcp_recvmsg+0x44d/0xab0
[  134.841994] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[  134.842046] CPU: 7 PID: 2629 Comm: python Tainted: G        W  OE    4.15.0-34-generic #37~16.04.1-Ubuntu
[  134.842046] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  134.842048] RIP: 0010:tcp_recvmsg+0x44d/0xab0
[  134.842048] RSP: 0018:ffffa6630422bcc8 EFLAGS: 00010286
[  134.842049] RAX: 0000000000000000 RBX: ffff997616f4f200 RCX: 0000000000000006
[  134.842049] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff9976257d6490
[  134.842050] RBP: ffffa6630422bd98 R08: 0000000000000001 R09: 000000000004bba4
[  134.842050] R10: 0000000001e00c6f R11: 000000000004bba4 R12: ffff99760dee3000
[  134.842051] R13: 0000000000000000 R14: ffff99760dee3514 R15: 0000000000000000
[  134.842051] FS:  00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[  134.842052] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.842053] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[  134.842055] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  134.842055] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  134.842057] Call Trace:
[  134.842060]  ? aa_sk_perm+0x53/0x1a0
[  134.842064]  inet_recvmsg+0x51/0xc0
[  134.842066]  sock_recvmsg+0x43/0x50
[  134.842070]  SYSC_recvfrom+0xe4/0x160
[  134.842072]  ? __schedule+0x3de/0x8b0
[  134.842075]  ? ktime_get_ts64+0x4c/0xf0
[  134.842079]  SyS_recvfrom+0xe/0x10
[  134.842082]  do_syscall_64+0x73/0x130
[  134.842086]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  134.842086] RIP: 0033:0x7fe331f5a81d
[  134.842088] RSP: 002b:00007ffe8da98398 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[  134.842090] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007fe331f5a81d
[  134.842094] RDX: 00000000000003fb RSI: 0000000001e00874 RDI: 0000000000000003
[  134.842095] RBP: 00007fe32f642c70 R08: 0000000000000000 R09: 0000000000000000
[  134.842097] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe332347698
[  134.842099] R13: 0000000001b7e0a0 R14: 0000000001e00874 R15: 0000000000000000
[  134.842103] Code: 24 fd ff ff e9 cc fe ff ff 48 89 d8 41 8b 8c 24 10 05 00 00 44 8b 45 80 48 c7 c7 08 bd 59 8b 48 89 85 68 ff ff ff e8 b3 c4 7d ff <0f> 0b 48 8b 85 68 ff ff ff e9 e9 fe ff ff 41 8b 8c 24 10 05 00
[  134.842126] ---[ end trace b7138fc08c83147f ]---
[  134.842144] general protection fault: 0000 [#1] SMP PTI
[  134.842145] Modules linked in: ipheth(OE) kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd vmw_balloon intel_rapl_perf joydev input_leds serio_raw vmw_vsock_vmci_transport vsock shpchp i2c_piix4 mac_hid binfmt_misc vmw_vmci parport_pc ppdev lp parport autofs4 vmw_pvscsi vmxnet3 hid_generic usbhid hid vmwgfx ttm drm_kms_helper syscopyarea sysfillrect mptspi mptscsih sysimgblt ahci psmouse fb_sys_fops pata_acpi mptbase libahci e1000 drm scsi_transport_spi
[  134.842161] CPU: 7 PID: 2629 Comm: python Tainted: G        W  OE    4.15.0-34-generic #37~16.04.1-Ubuntu
[  134.842162] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  134.842164] RIP: 0010:tcp_close+0x2c6/0x440
[  134.842165] RSP: 0018:ffffa6630422bde8 EFLAGS: 00010202
[  134.842167] RAX: 0000000000000000 RBX: ffff99760dee3000 RCX: 0000000180400034
[  134.842168] RDX: 5c4afd407207a6c4 RSI: ffffe868495bd300 RDI: ffff997616f4f200
[  134.842169] RBP: ffffa6630422be08 R08: 0000000016f4d401 R09: 0000000180400034
[  134.842169] R10: ffffa6630422bd98 R11: 0000000000000000 R12: 000000000000600c
[  134.842170] R13: 0000000000000000 R14: ffff99760dee30c8 R15: ffff9975bd44fe00
[  134.842171] FS:  00007fe332347700(0000) GS:ffff9976257c0000(0000) knlGS:0000000000000000
[  134.842173] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  134.842174] CR2: 0000000001e41000 CR3: 000000020e9b4006 CR4: 00000000003606e0
[  134.842177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  134.842178] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  134.842179] Call Trace:
[  134.842181]  inet_release+0x42/0x70
[  134.842183]  __sock_release+0x42/0xb0
[  134.842184]  sock_close+0x15/0x20
[  134.842187]  __fput+0xea/0x220
[  134.842189]  ____fput+0xe/0x10
[  134.842191]  task_work_run+0x8a/0xb0
[  134.842193]  exit_to_usermode_loop+0xc4/0xd0
[  134.842195]  do_syscall_64+0xf4/0x130
[  134.842197]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[  134.842197] RIP: 0033:0x7fe331f5a560
[  134.842198] RSP: 002b:00007ffe8da982e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[  134.842200] RAX: 0000000000000000 RBX: 00007fe32f642c70 RCX: 00007fe331f5a560
[  134.842201] RDX: 00000000008f5320 RSI: 0000000001cd4b50 RDI: 0000000000000003
[  134.842202] RBP: 00007fe32f6500f8 R08: 000000000000003c R09: 00000000009343c0
[  134.842203] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe32f6500d0
[  134.842204] R13: 00000000008f5320 R14: 00000000008f5320 R15: 0000000001cd4770
[  134.842205] Code: c8 00 00 00 45 31 e4 49 39 fe 75 4d eb 50 83 ab d8 00 00 00 01 48 8b 17 48 8b 47 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 <48> 89 42 08 48 89 10 0f b6 57 34 8b 47 2c 2b 47 28 83 e2 01 80
[  134.842226] RIP: tcp_close+0x2c6/0x440 RSP: ffffa6630422bde8
[  134.842227] ---[ end trace b7138fc08c831480 ]---

The proposed patch eliminates a potential racing condition.
Before, usb_submit_urb was called and _after_ that, the skb was attached
(dev->tx_skb). So, on a callback it was possible, however unlikely that the
skb was freed before it was set. That way (because dev->tx_skb was not set
to NULL after it was freed), it could happen that a skb from a earlier
transmission was freed a second time (and the skb we should have freed did
not get freed at all)

Now we free the skb directly in ipheth_tx(). It is not passed to the
callback anymore, eliminating the posibility of a double free of the same
skb. Depending on the retval of usb_submit_urb() we use dev_kfree_skb_any()
respectively dev_consume_skb_any() to free the skb.

Signed-off-by: Oliver Zweigle <Oliver.Zweigle@faro.com>
Signed-off-by: Bernd Eckstein <3ernd.Eckstein@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki pushed a commit that referenced this pull request Feb 13, 2019
[ Upstream commit 9b1f19d ]

Similarly to commit 276bdb8 ("dccp: check ccid before dereferencing")
it is wise to test for a NULL ccid.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 16 Comm: ksoftirqd/1 Not tainted 5.0.0-rc3+ #37
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:ccid_hc_tx_parse_options net/dccp/ccid.h:205 [inline]
RIP: 0010:dccp_parse_options+0x8d9/0x12b0 net/dccp/options.c:233
Code: c5 0f b6 75 b3 80 38 00 0f 85 d6 08 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 45 b8 4c 8b b8 f8 07 00 00 4c 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 95 08 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b
kobject: 'loop5' (0000000080f78fc1): kobject_uevent_env
RSP: 0018:ffff8880a94df0b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880858ac723 RCX: dffffc0000000000
RDX: 0000000000000100 RSI: 0000000000000007 RDI: 0000000000000001
RBP: ffff8880a94df140 R08: 0000000000000001 R09: ffff888061b83a80
R10: ffffed100c370752 R11: ffff888061b83a97 R12: 0000000000000026
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0defa33518 CR3: 000000008db5e000 CR4: 00000000001406e0
kobject: 'loop5' (0000000080f78fc1): fill_kobj_path: path = '/devices/virtual/block/loop5'
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 dccp_rcv_state_process+0x2b6/0x1af6 net/dccp/input.c:654
 dccp_v4_do_rcv+0x100/0x190 net/dccp/ipv4.c:688
 sk_backlog_rcv include/net/sock.h:936 [inline]
 __sk_receive_skb+0x3a9/0xea0 net/core/sock.c:473
 dccp_v4_rcv+0x10cb/0x1f80 net/dccp/ipv4.c:880
 ip_protocol_deliver_rcu+0xb6/0xa20 net/ipv4/ip_input.c:208
 ip_local_deliver_finish+0x23b/0x390 net/ipv4/ip_input.c:234
 NF_HOOK include/linux/netfilter.h:289 [inline]
 NF_HOOK include/linux/netfilter.h:283 [inline]
 ip_local_deliver+0x1f0/0x740 net/ipv4/ip_input.c:255
 dst_input include/net/dst.h:450 [inline]
 ip_rcv_finish+0x1f4/0x2f0 net/ipv4/ip_input.c:414
 NF_HOOK include/linux/netfilter.h:289 [inline]
 NF_HOOK include/linux/netfilter.h:283 [inline]
 ip_rcv+0xed/0x620 net/ipv4/ip_input.c:524
 __netif_receive_skb_one_core+0x160/0x210 net/core/dev.c:4973
 __netif_receive_skb+0x2c/0x1c0 net/core/dev.c:5083
 process_backlog+0x206/0x750 net/core/dev.c:5923
 napi_poll net/core/dev.c:6346 [inline]
 net_rx_action+0x76d/0x1930 net/core/dev.c:6412
 __do_softirq+0x30b/0xb11 kernel/softirq.c:292
 run_ksoftirqd kernel/softirq.c:654 [inline]
 run_ksoftirqd+0x8e/0x110 kernel/softirq.c:646
 smpboot_thread_fn+0x6ab/0xa10 kernel/smpboot.c:164
 kthread+0x357/0x430 kernel/kthread.c:246
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
Modules linked in:
---[ end trace 58a0ba03bea2c376 ]---
RIP: 0010:ccid_hc_tx_parse_options net/dccp/ccid.h:205 [inline]
RIP: 0010:dccp_parse_options+0x8d9/0x12b0 net/dccp/options.c:233
Code: c5 0f b6 75 b3 80 38 00 0f 85 d6 08 00 00 48 b9 00 00 00 00 00 fc ff df 48 8b 45 b8 4c 8b b8 f8 07 00 00 4c 89 f8 48 c1 e8 03 <80> 3c 08 00 0f 85 95 08 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b
RSP: 0018:ffff8880a94df0b8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8880858ac723 RCX: dffffc0000000000
RDX: 0000000000000100 RSI: 0000000000000007 RDI: 0000000000000001
RBP: ffff8880a94df140 R08: 0000000000000001 R09: ffff888061b83a80
R10: ffffed100c370752 R11: ffff888061b83a97 R12: 0000000000000026
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0defa33518 CR3: 0000000009871000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amboar pushed a commit to amboar/linux that referenced this pull request Sep 19, 2019
Donald reported this sequence:
  ip next add id 1 blackhole
  ip next add id 2 blackhole
  ip ro add 1.1.1.1/32 nhid 1
  ip ro add 1.1.1.2/32 nhid 2

would cause a crash. Backtrace is:

[  151.302790] general protection fault: 0000 [openbmc#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  151.304043] CPU: 1 PID: 277 Comm: ip Not tainted 5.3.0-rc5+ openbmc#37
[  151.305078] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
[  151.306526] RIP: 0010:fib_add_nexthop+0x8b/0x2aa
[  151.307343] Code: 35 f7 81 48 8d 14 01 c7 02 f1 f1 f1 f1 c7 42 04 01 f4 f4 f4 48 89 f2 48 c1 ea 03 65 48 8b 0c 25 28 00 00 00 48 89 4d d0 31 c9 <80> 3c 02 00 74 08 48 89 f7 e8 1a e8 53 ff be 08 00 00 00 4c 89 e7
[  151.310549] RSP: 0018:ffff888116c27340 EFLAGS: 00010246
[  151.311469] RAX: dffffc0000000000 RBX: ffff8881154ece00 RCX: 0000000000000000
[  151.312713] RDX: 0000000000000004 RSI: 0000000000000020 RDI: ffff888115649b40
[  151.313968] RBP: ffff888116c273d8 R08: ffffed10221e3757 R09: ffff888110f1bab8
[  151.315212] R10: 0000000000000001 R11: ffff888110f1bab3 R12: ffff888115649b40
[  151.316456] R13: 0000000000000020 R14: ffff888116c273b0 R15: ffff888115649b40
[  151.317707] FS:  00007f60b4d8d800(0000) GS:ffff88811ac00000(0000) knlGS:0000000000000000
[  151.319113] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  151.320119] CR2: 0000555671ffdc00 CR3: 00000001136ba005 CR4: 0000000000020ee0
[  151.321367] Call Trace:
[  151.321820]  ? fib_nexthop_info+0x635/0x635
[  151.322572]  fib_dump_info+0xaa4/0xde0
[  151.323247]  ? fib_create_info+0x2431/0x2431
[  151.324008]  ? napi_alloc_frag+0x2a/0x2a
[  151.324711]  rtmsg_fib+0x2c4/0x3be
[  151.325339]  fib_table_insert+0xe2f/0xeee
...

fib_dump_info incorrectly has nhs = 0 for blackhole nexthops, so it
believes the nexthop object is a multipath group (nhs != 1) and ends
up down the nexthop_mpath_fill_node() path which is wrong for a
blackhole.

The blackhole check in nexthop_num_path is leftover from early days
of the blackhole implementation which did not initialize the device.
In the end the design was simpler (fewer special case checks) to set
the device to loopback in nh_info, so the check in nexthop_num_path
should have been removed.

Fixes: 430a049 ("nexthop: Add support for nexthop groups")
Reported-by: Donald Sharp <sharpd@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
amboar pushed a commit to amboar/linux that referenced this pull request Nov 26, 2020
[ Upstream commit c54e14d ]

In xfs_growfs_rt(), we enlarge bitmap and summary files by allocating
new blocks for both files. For each of the new blocks allocated, we
allocate an xfs_buf, zero the payload, log the contents and commit the
transaction. Hence these buffers will eventually find themselves
appended to list at xfs_ail->ail_buf_list.

Later, xfs_growfs_rt() loops across all of the new blocks belonging to
the bitmap inode to set the bitmap values to 1. In doing so, it
allocates a new transaction and invokes the following sequence of
functions,
  - xfs_rtfree_range()
    - xfs_rtmodify_range()
      - xfs_rtbuf_get()
        We pass '&xfs_rtbuf_ops' as the ops pointer to xfs_trans_read_buf().
        - xfs_trans_read_buf()
	  We find the xfs_buf of interest in per-ag hash table, invoke
	  xfs_buf_reverify() which ends up assigning '&xfs_rtbuf_ops' to
	  xfs_buf->b_ops.

On the other hand, if xfs_growfs_rt_alloc() had allocated a few blocks
for the bitmap inode and returned with an error, all the xfs_bufs
corresponding to the new bitmap blocks that have been allocated would
continue to be on xfs_ail->ail_buf_list list without ever having a
non-NULL value assigned to their b_ops members. An AIL flush operation
would then trigger the following warning message to be printed on the
console,

  XFS (loop0): _xfs_buf_ioapply: no buf ops on daddr 0x58 len 8
  00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  CPU: 3 PID: 449 Comm: xfsaild/loop0 Not tainted 5.8.0-rc4-chandan-00038-g4d8c2b9de9ab-dirty openbmc#37
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
  Call Trace:
   dump_stack+0x57/0x70
   _xfs_buf_ioapply+0x37c/0x3b0
   ? xfs_rw_bdev+0x1e0/0x1e0
   ? xfs_buf_delwri_submit_buffers+0xd4/0x210
   __xfs_buf_submit+0x6d/0x1f0
   xfs_buf_delwri_submit_buffers+0xd4/0x210
   xfsaild+0x2c8/0x9e0
   ? __switch_to_asm+0x42/0x70
   ? xfs_trans_ail_cursor_first+0x80/0x80
   kthread+0xfe/0x140
   ? kthread_park+0x90/0x90
   ret_from_fork+0x22/0x30

This message indicates that the xfs_buf had its b_ops member set to
NULL.

This commit fixes the issue by assigning "&xfs_rtbuf_ops" to b_ops
member of each of the xfs_bufs logged by xfs_growfs_rt_alloc().

Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request May 19, 2022
commit acb16b3 upstream.

We received a report[1] of kernel crashes when Cilium is used in XDP
mode with virtio_net after updating to newer kernels. After
investigating the reason it turned out that when using mergeable bufs
with an XDP program which adjusts xdp.data or xdp.data_meta page_to_buf()
calculates the build_skb address wrong because the offset can become less
than the headroom so it gets the address of the previous page (-X bytes
depending on how lower offset is):
 page_to_skb: page addr ffff9eb2923e2000 buf ffff9eb2923e1ffc offset 252 headroom 256

This is a pr_err() I added in the beginning of page_to_skb which clearly
shows offset that is less than headroom by adding 4 bytes of metadata
via an xdp prog. The calculations done are:
 receive_mergeable():
 headroom = VIRTIO_XDP_HEADROOM; // VIRTIO_XDP_HEADROOM == 256 bytes
 offset = xdp.data - page_address(xdp_page) -
          vi->hdr_len - metasize;

 page_to_skb():
 p = page_address(page) + offset;
 ...
 buf = p - headroom;

Now buf goes -4 bytes from the page's starting address as can be seen
above which is set as skb->head and skb->data by build_skb later. Depending
on what's done with the skb (when it's freed most often) we get all kinds
of corruptions and BUG_ON() triggers in mm[2]. We have to recalculate
the new headroom after the xdp program has run, similar to how offset
and len are recalculated. Headroom is directly related to
data_hard_start, data and data_meta, so we use them to get the new size.
The result is correct (similar pr_err() in page_to_skb, one case of
xdp_page and one case of virtnet buf):
 a) Case with 4 bytes of metadata
 [  115.949641] page_to_skb: page addr ffff8b4dcfad2000 offset 252 headroom 252
 [  121.084105] page_to_skb: page addr ffff8b4dcf018000 offset 20732 headroom 252
 b) Case of pushing data +32 bytes
 [  153.181401] page_to_skb: page addr ffff8b4dd0c4d000 offset 288 headroom 288
 [  158.480421] page_to_skb: page addr ffff8b4dd00b0000 offset 24864 headroom 288
 c) Case of pushing data -33 bytes
 [  835.906830] page_to_skb: page addr ffff8b4dd3270000 offset 223 headroom 223
 [  840.839910] page_to_skb: page addr ffff8b4dcdd68000 offset 12511 headroom 223

Offset and headroom are equal because offset points to the start of
reserved bytes for the virtio_net header which are at buf start +
headroom, while data points at buf start + vnet hdr size + headroom so
when data or data_meta are adjusted by the xdp prog both the headroom size
and the offset change equally. We can use data_hard_start to compute the
new headroom after the xdp prog (linearized / page start case, the
virtnet buf case is similar just with bigger base offset):
 xdp.data_hard_start = page_address + vnet_hdr
 xdp.data = page_address + vnet_hdr + headroom
 new headroom after xdp prog = xdp.data - xdp.data_hard_start - metasize

An example reproducer xdp prog[3] is below.

[1] cilium/cilium#19453

[2] Two of the many traces:
 [   40.437400] BUG: Bad page state in process swapper/0  pfn:14940
 [   40.916726] BUG: Bad page state in process systemd-resolve  pfn:053b7
 [   41.300891] kernel BUG at include/linux/mm.h:720!
 [   41.301801] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 [   41.302784] CPU: 1 PID: 1181 Comm: kubelet Kdump: loaded Tainted: G    B   W         5.18.0-rc1+ #37
 [   41.304458] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
 [   41.306018] RIP: 0010:page_frag_free+0x79/0xe0
 [   41.306836] Code: 00 00 75 ea 48 8b 07 a9 00 00 01 00 74 e0 48 8b 47 48 48 8d 50 ff a8 01 48 0f 45 fa eb d0 48 c7 c6 18 b8 30 a6 e8 d7 f8 fc ff <0f> 0b 48 8d 78 ff eb bc 48 8b 07 a9 00 00 01 00 74 3a 66 90 0f b6
 [   41.310235] RSP: 0018:ffffac05c2a6bc78 EFLAGS: 00010292
 [   41.311201] RAX: 000000000000003e RBX: 0000000000000000 RCX: 0000000000000000
 [   41.312502] RDX: 0000000000000001 RSI: ffffffffa6423004 RDI: 00000000ffffffff
 [   41.313794] RBP: ffff993c98823600 R08: 0000000000000000 R09: 00000000ffffdfff
 [   41.315089] R10: ffffac05c2a6ba68 R11: ffffffffa698ca28 R12: ffff993c98823600
 [   41.316398] R13: ffff993c86311ebc R14: 0000000000000000 R15: 000000000000005c
 [   41.317700] FS:  00007fe13fc56740(0000) GS:ffff993cdd900000(0000) knlGS:0000000000000000
 [   41.319150] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [   41.320152] CR2: 000000c00008a000 CR3: 0000000014908000 CR4: 0000000000350ee0
 [   41.321387] Call Trace:
 [   41.321819]  <TASK>
 [   41.322193]  skb_release_data+0x13f/0x1c0
 [   41.322902]  __kfree_skb+0x20/0x30
 [   41.343870]  tcp_recvmsg_locked+0x671/0x880
 [   41.363764]  tcp_recvmsg+0x5e/0x1c0
 [   41.384102]  inet_recvmsg+0x42/0x100
 [   41.406783]  ? sock_recvmsg+0x1d/0x70
 [   41.428201]  sock_read_iter+0x84/0xd0
 [   41.445592]  ? 0xffffffffa3000000
 [   41.462442]  new_sync_read+0x148/0x160
 [   41.479314]  ? 0xffffffffa3000000
 [   41.496937]  vfs_read+0x138/0x190
 [   41.517198]  ksys_read+0x87/0xc0
 [   41.535336]  do_syscall_64+0x3b/0x90
 [   41.551637]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [   41.568050] RIP: 0033:0x48765b
 [   41.583955] Code: e8 4a 35 fe ff eb 88 cc cc cc cc cc cc cc cc e8 fb 7a fe ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
 [   41.632818] RSP: 002b:000000c000a2f5b8 EFLAGS: 00000212 ORIG_RAX: 0000000000000000
 [   41.664588] RAX: ffffffffffffffda RBX: 000000c000062000 RCX: 000000000048765b
 [   41.681205] RDX: 0000000000005e54 RSI: 000000c000e66000 RDI: 0000000000000016
 [   41.697164] RBP: 000000c000a2f608 R08: 0000000000000001 R09: 00000000000001b4
 [   41.713034] R10: 00000000000000b6 R11: 0000000000000212 R12: 00000000000000e9
 [   41.728755] R13: 0000000000000001 R14: 000000c000a92000 R15: ffffffffffffffff
 [   41.744254]  </TASK>
 [   41.758585] Modules linked in: br_netfilter bridge veth netconsole virtio_net

 and

 [   33.524802] BUG: Bad page state in process systemd-network  pfn:11e60
 [   33.528617] page ffffe05dc0147b00 ffffe05dc04e7a00 ffff8ae9851ec000 (1) len 82 offset 252 metasize 4 hroom 0 hdr_len 12 data ffff8ae9851ec10c data_meta ffff8ae9851ec108 data_end ffff8ae9851ec14e
 [   33.529764] page:000000003792b5ba refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x11e60
 [   33.532463] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
 [   33.532468] raw: 000fffffc0000000 0000000000000000 dead000000000122 0000000000000000
 [   33.532470] raw: 0000000000000000 0000000000000000 00000000fffffdff 0000000000000000
 [   33.532471] page dumped because: nonzero mapcount
 [   33.532472] Modules linked in: br_netfilter bridge veth netconsole virtio_net
 [   33.532479] CPU: 0 PID: 791 Comm: systemd-network Kdump: loaded Not tainted 5.18.0-rc1+ #37
 [   33.532482] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1.fc35 04/01/2014
 [   33.532484] Call Trace:
 [   33.532496]  <TASK>
 [   33.532500]  dump_stack_lvl+0x45/0x5a
 [   33.532506]  bad_page.cold+0x63/0x94
 [   33.532510]  free_pcp_prepare+0x290/0x420
 [   33.532515]  free_unref_page+0x1b/0x100
 [   33.532518]  skb_release_data+0x13f/0x1c0
 [   33.532524]  kfree_skb_reason+0x3e/0xc0
 [   33.532527]  ip6_mc_input+0x23c/0x2b0
 [   33.532531]  ip6_sublist_rcv_finish+0x83/0x90
 [   33.532534]  ip6_sublist_rcv+0x22b/0x2b0

[3] XDP program to reproduce(xdp_pass.c):
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>

 SEC("xdp_pass")
 int xdp_pkt_pass(struct xdp_md *ctx)
 {
          bpf_xdp_adjust_head(ctx, -(int)32);
          return XDP_PASS;
 }

 char _license[] SEC("license") = "GPL";

 compile: clang -O2 -g -Wall -target bpf -c xdp_pass.c -o xdp_pass.o
 load on virtio_net: ip link set enp1s0 xdpdrv obj xdp_pass.o sec xdp_pass

CC: stable@vger.kernel.org
CC: Jason Wang <jasowang@redhat.com>
CC: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
CC: Daniel Borkmann <daniel@iogearbox.net>
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: virtualization@lists.linux-foundation.org
Fixes: 8fb7da9 ("virtio_net: get build_skb() buf by data ptr")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20220425103703.3067292-1-razor@blackwall.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
shenki pushed a commit that referenced this pull request Sep 21, 2022
[ Upstream commit 12f3519 ]

This change fixes the following kernel NULL pointer dereference
which is reproduced by blktests srp/007 occasionally.

BUG: kernel NULL pointer dereference, address: 0000000000000170
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 9 Comm: kworker/0:1H Kdump: loaded Not tainted 6.0.0-rc1+ #37
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
Workqueue:  0x0 (kblockd)
RIP: 0010:srp_recv_done+0x176/0x500 [ib_srp]
Code: 00 4d 85 ff 0f 84 52 02 00 00 48 c7 82 80 02 00 00 00 00 00 00 4c 89 df 4c 89 14 24 e8 53 d3 4a f6 4c 8b 14 24 41 0f b6 42 13 <41> 89 87 70 01 00 00 41 0f b6 52 12 f6 c2 02 74 44 41 8b 42 1c b9
RSP: 0018:ffffaef7c0003e28 EFLAGS: 00000282
RAX: 0000000000000000 RBX: ffff9bc9486dea60 RCX: 0000000000000000
RDX: 0000000000000102 RSI: ffffffffb76bbd0e RDI: 00000000ffffffff
RBP: ffff9bc980099a00 R08: 0000000000000001 R09: 0000000000000001
R10: ffff9bca53ef0000 R11: ffff9bc980099a10 R12: ffff9bc956e14000
R13: ffff9bc9836b9cb0 R14: ffff9bc9557b4480 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9bc97ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000170 CR3: 0000000007e04000 CR4: 00000000000006f0
Call Trace:
 <IRQ>
 __ib_process_cq+0xb7/0x280 [ib_core]
 ib_poll_handler+0x2b/0x130 [ib_core]
 irq_poll_softirq+0x93/0x150
 __do_softirq+0xee/0x4b8
 irq_exit_rcu+0xf7/0x130
 sysvec_apic_timer_interrupt+0x8e/0xc0
 </IRQ>

Fixes: ad215aa ("RDMA/srp: Make struct scsi_cmnd and struct srp_request adjacent")
Link: https://lore.kernel.org/r/20220831081626.18712-1-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
thangqn-ampere pushed a commit to ampere-openbmc/linux that referenced this pull request Oct 4, 2022
[ Upstream commit 12f3519 ]

This change fixes the following kernel NULL pointer dereference
which is reproduced by blktests srp/007 occasionally.

BUG: kernel NULL pointer dereference, address: 0000000000000170
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 9 Comm: kworker/0:1H Kdump: loaded Not tainted 6.0.0-rc1+ openbmc#37
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
Workqueue:  0x0 (kblockd)
RIP: 0010:srp_recv_done+0x176/0x500 [ib_srp]
Code: 00 4d 85 ff 0f 84 52 02 00 00 48 c7 82 80 02 00 00 00 00 00 00 4c 89 df 4c 89 14 24 e8 53 d3 4a f6 4c 8b 14 24 41 0f b6 42 13 <41> 89 87 70 01 00 00 41 0f b6 52 12 f6 c2 02 74 44 41 8b 42 1c b9
RSP: 0018:ffffaef7c0003e28 EFLAGS: 00000282
RAX: 0000000000000000 RBX: ffff9bc9486dea60 RCX: 0000000000000000
RDX: 0000000000000102 RSI: ffffffffb76bbd0e RDI: 00000000ffffffff
RBP: ffff9bc980099a00 R08: 0000000000000001 R09: 0000000000000001
R10: ffff9bca53ef0000 R11: ffff9bc980099a10 R12: ffff9bc956e14000
R13: ffff9bc9836b9cb0 R14: ffff9bc9557b4480 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9bc97ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000170 CR3: 0000000007e04000 CR4: 00000000000006f0
Call Trace:
 <IRQ>
 __ib_process_cq+0xb7/0x280 [ib_core]
 ib_poll_handler+0x2b/0x130 [ib_core]
 irq_poll_softirq+0x93/0x150
 __do_softirq+0xee/0x4b8
 irq_exit_rcu+0xf7/0x130
 sysvec_apic_timer_interrupt+0x8e/0xc0
 </IRQ>

Fixes: ad215aa ("RDMA/srp: Make struct scsi_cmnd and struct srp_request adjacent")
Link: https://lore.kernel.org/r/20220831081626.18712-1-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Oct 24, 2022
…utdown

[ Upstream commit 316ae95 ]

lpuart_dma_shutdown tears down lpuart dma, but lpuart_flush_buffer can
still occur which in turn tries to access dma apis if lpuart_dma_tx_use
flag is true. At this point since dma is torn down, these dma apis can
abort. Set lpuart_dma_tx_use and the corresponding rx flag
lpuart_dma_rx_use to false in lpuart_dma_shutdown so that dmas are not
accessed after they are relinquished.

Otherwise, when try to kill btattach, kernel may panic. This patch may
fix this issue.
root@imx8ulpevk:~# btattach -B /dev/ttyLP2 -S 115200
^C[   90.182296] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[   90.189806] Modules linked in: moal(O) mlan(O)
[   90.194258] CPU: 0 PID: 503 Comm: btattach Tainted: G           O      5.15.32-06136-g34eecdf2f9e4 #37
[   90.203554] Hardware name: NXP i.MX8ULP 9X9 EVK (DT)
[   90.208513] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   90.215470] pc : fsl_edma3_disable_request+0x8/0x60
[   90.220358] lr : fsl_edma3_terminate_all+0x34/0x20c
[   90.225237] sp : ffff800013f0bac0
[   90.228548] x29: ffff800013f0bac0 x28: 0000000000000001 x27: ffff000008404800
[   90.235681] x26: ffff000008404960 x25: ffff000008404a08 x24: ffff000008404a00
[   90.242813] x23: ffff000008404a60 x22: 0000000000000002 x21: 0000000000000000
[   90.249946] x20: ffff800013f0baf8 x19: ffff00000559c800 x18: 0000000000000000
[   90.257078] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[   90.264211] x14: 0000000000000003 x13: 0000000000000000 x12: 0000000000000040
[   90.271344] x11: ffff00000600c248 x10: ffff800013f0bb10 x9 : ffff000057bcb090
[   90.278477] x8 : fffffc0000241a08 x7 : ffff00000534ee00 x6 : ffff000008404804
[   90.285609] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff0000055b3480
[   90.292742] x2 : ffff8000135c0000 x1 : ffff00000534ee00 x0 : ffff00000559c800
[   90.299876] Call trace:
[   90.302321]  fsl_edma3_disable_request+0x8/0x60
[   90.306851]  lpuart_flush_buffer+0x40/0x160
[   90.311037]  uart_flush_buffer+0x88/0x120
[   90.315050]  tty_driver_flush_buffer+0x20/0x30
[   90.319496]  hci_uart_flush+0x44/0x90
[   90.323162]  +0x34/0x12c
[   90.327253]  tty_ldisc_close+0x38/0x70
[   90.331005]  tty_ldisc_release+0xa8/0x190
[   90.335018]  tty_release_struct+0x24/0x8c
[   90.339022]  tty_release+0x3ec/0x4c0
[   90.342593]  __fput+0x70/0x234
[   90.345652]  ____fput+0x14/0x20
[   90.348790]  task_work_run+0x84/0x17c
[   90.352455]  do_exit+0x310/0x96c
[   90.355688]  do_group_exit+0x3c/0xa0
[   90.359259]  __arm64_sys_exit_group+0x1c/0x20
[   90.363609]  invoke_syscall+0x48/0x114
[   90.367362]  el0_svc_common.constprop.0+0xd4/0xfc
[   90.372068]  do_el0_svc+0x2c/0x94
[   90.375379]  el0_svc+0x28/0x80
[   90.378438]  el0t_64_sync_handler+0xa8/0x130
[   90.382711]  el0t_64_sync+0x1a0/0x1a4
[   90.386376] Code: 17ffffda d503201f d503233f f9409802 (b9400041)
[   90.392467] ---[ end trace 2f60524b4a43f1f6 ]---
[   90.397073] note: btattach[503] exited with preempt_count 1
[   90.402636] Fixing recursive fault but reboot is needed!

Fixes: 6250cc3 ("tty: serial: fsl_lpuart: Use scatter/gather DMA for Tx")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Thara Gopinath <tgopinath@microsoft.com>
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Link: https://lore.kernel.org/r/20220920111703.1532-1-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Nov 2, 2022
…utdown

[ Upstream commit 316ae95 ]

lpuart_dma_shutdown tears down lpuart dma, but lpuart_flush_buffer can
still occur which in turn tries to access dma apis if lpuart_dma_tx_use
flag is true. At this point since dma is torn down, these dma apis can
abort. Set lpuart_dma_tx_use and the corresponding rx flag
lpuart_dma_rx_use to false in lpuart_dma_shutdown so that dmas are not
accessed after they are relinquished.

Otherwise, when try to kill btattach, kernel may panic. This patch may
fix this issue.
root@imx8ulpevk:~# btattach -B /dev/ttyLP2 -S 115200
^C[   90.182296] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[   90.189806] Modules linked in: moal(O) mlan(O)
[   90.194258] CPU: 0 PID: 503 Comm: btattach Tainted: G           O      5.15.32-06136-g34eecdf2f9e4 #37
[   90.203554] Hardware name: NXP i.MX8ULP 9X9 EVK (DT)
[   90.208513] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   90.215470] pc : fsl_edma3_disable_request+0x8/0x60
[   90.220358] lr : fsl_edma3_terminate_all+0x34/0x20c
[   90.225237] sp : ffff800013f0bac0
[   90.228548] x29: ffff800013f0bac0 x28: 0000000000000001 x27: ffff000008404800
[   90.235681] x26: ffff000008404960 x25: ffff000008404a08 x24: ffff000008404a00
[   90.242813] x23: ffff000008404a60 x22: 0000000000000002 x21: 0000000000000000
[   90.249946] x20: ffff800013f0baf8 x19: ffff00000559c800 x18: 0000000000000000
[   90.257078] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[   90.264211] x14: 0000000000000003 x13: 0000000000000000 x12: 0000000000000040
[   90.271344] x11: ffff00000600c248 x10: ffff800013f0bb10 x9 : ffff000057bcb090
[   90.278477] x8 : fffffc0000241a08 x7 : ffff00000534ee00 x6 : ffff000008404804
[   90.285609] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff0000055b3480
[   90.292742] x2 : ffff8000135c0000 x1 : ffff00000534ee00 x0 : ffff00000559c800
[   90.299876] Call trace:
[   90.302321]  fsl_edma3_disable_request+0x8/0x60
[   90.306851]  lpuart_flush_buffer+0x40/0x160
[   90.311037]  uart_flush_buffer+0x88/0x120
[   90.315050]  tty_driver_flush_buffer+0x20/0x30
[   90.319496]  hci_uart_flush+0x44/0x90
[   90.323162]  +0x34/0x12c
[   90.327253]  tty_ldisc_close+0x38/0x70
[   90.331005]  tty_ldisc_release+0xa8/0x190
[   90.335018]  tty_release_struct+0x24/0x8c
[   90.339022]  tty_release+0x3ec/0x4c0
[   90.342593]  __fput+0x70/0x234
[   90.345652]  ____fput+0x14/0x20
[   90.348790]  task_work_run+0x84/0x17c
[   90.352455]  do_exit+0x310/0x96c
[   90.355688]  do_group_exit+0x3c/0xa0
[   90.359259]  __arm64_sys_exit_group+0x1c/0x20
[   90.363609]  invoke_syscall+0x48/0x114
[   90.367362]  el0_svc_common.constprop.0+0xd4/0xfc
[   90.372068]  do_el0_svc+0x2c/0x94
[   90.375379]  el0_svc+0x28/0x80
[   90.378438]  el0t_64_sync_handler+0xa8/0x130
[   90.382711]  el0t_64_sync+0x1a0/0x1a4
[   90.386376] Code: 17ffffda d503201f d503233f f9409802 (b9400041)
[   90.392467] ---[ end trace 2f60524b4a43f1f6 ]---
[   90.397073] note: btattach[503] exited with preempt_count 1
[   90.402636] Fixing recursive fault but reboot is needed!

Fixes: 6250cc3 ("tty: serial: fsl_lpuart: Use scatter/gather DMA for Tx")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Thara Gopinath <tgopinath@microsoft.com>
Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
Link: https://lore.kernel.org/r/20220920111703.1532-1-sherry.sun@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
shenki pushed a commit that referenced this pull request Aug 18, 2023
commit 511b90e upstream.

Despite commit 0ad529d ("mptcp: fix possible divide by zero in
recvmsg()"), the mptcp protocol is still prone to a race between
disconnect() (or shutdown) and accept.

The root cause is that the mentioned commit checks the msk-level
flag, but mptcp_stream_accept() does acquire the msk-level lock,
as it can rely directly on the first subflow lock.

As reported by Christoph than can lead to a race where an msk
socket is accepted after that mptcp_subflow_queue_clean() releases
the listener socket lock and just before it takes destructive
actions leading to the following splat:

BUG: kernel NULL pointer dereference, address: 0000000000000012
PGD 5a4ca067 P4D 5a4ca067 PUD 37d4c067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 10955 Comm: syz-executor.5 Not tainted 6.5.0-rc1-gdc7b257ee5dd #37
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
RIP: 0010:mptcp_stream_accept+0x1ee/0x2f0 include/net/inet_sock.h:330
Code: 0a 09 00 48 8b 1b 4c 39 e3 74 07 e8 bc 7c 7f fe eb a1 e8 b5 7c 7f fe 4c 8b 6c 24 08 eb 05 e8 a9 7c 7f fe 49 8b 85 d8 09 00 00 <0f> b6 40 12 88 44 24 07 0f b6 6c 24 07 bf 07 00 00 00 89 ee e8 89
RSP: 0018:ffffc90000d07dc0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888037e8d020 RCX: ffff88803b093300
RDX: 0000000000000000 RSI: ffffffff833822c5 RDI: ffffffff8333896a
RBP: 0000607f82031520 R08: ffff88803b093300 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000003e83 R12: ffff888037e8d020
R13: ffff888037e8c680 R14: ffff888009af7900 R15: ffff888009af6880
FS:  00007fc26d708640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000012 CR3: 0000000066bc5001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 do_accept+0x1ae/0x260 net/socket.c:1872
 __sys_accept4+0x9b/0x110 net/socket.c:1913
 __do_sys_accept4 net/socket.c:1954 [inline]
 __se_sys_accept4 net/socket.c:1951 [inline]
 __x64_sys_accept4+0x20/0x30 net/socket.c:1951
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Address the issue by temporary removing the pending request socket
from the accept queue, so that racing accept() can't touch them.

After depleting the msk - the ssk still exists, as plain TCP sockets,
re-insert them into the accept queue, so that later inet_csk_listen_stop()
will complete the tcp socket disposal.

Fixes: 2a6a870 ("mptcp: stops worker on unaccepted sockets at listener close")
Cc: stable@vger.kernel.org
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: multipath-tcp/mptcp_net-next#423
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230803-upstream-net-20230803-misc-fixes-6-5-v1-4-6671b1ab11cc@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
amboar pushed a commit that referenced this pull request Apr 11, 2024
[ Upstream commit f7442a6 ]

The mlxbf_gige driver encounters a NULL pointer exception in
mlxbf_gige_open() when kdump is enabled.  The sequence to reproduce
the exception is as follows:
a) enable kdump
b) trigger kdump via "echo c > /proc/sysrq-trigger"
c) kdump kernel executes
d) kdump kernel loads mlxbf_gige module
e) the mlxbf_gige module runs its open() as the
   the "oob_net0" interface is brought up
f) mlxbf_gige module will experience an exception
   during its open(), something like:

     Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
     Mem abort info:
       ESR = 0x0000000086000004
       EC = 0x21: IABT (current EL), IL = 32 bits
       SET = 0, FnV = 0
       EA = 0, S1PTW = 0
       FSC = 0x04: level 0 translation fault
     user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000
     [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
     Internal error: Oops: 0000000086000004 [#1] SMP
     CPU: 0 PID: 812 Comm: NetworkManager Tainted: G           OE     5.15.0-1035-bluefield #37-Ubuntu
     Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024
     pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : 0x0
     lr : __napi_poll+0x40/0x230
     sp : ffff800008003e00
     x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff
     x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8
     x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000
     x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000
     x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0
     x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c
     x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398
     x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2
     x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100
     x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238
     Call trace:
      0x0
      net_rx_action+0x178/0x360
      __do_softirq+0x15c/0x428
      __irq_exit_rcu+0xac/0xec
      irq_exit+0x18/0x2c
      handle_domain_irq+0x6c/0xa0
      gic_handle_irq+0xec/0x1b0
      call_on_irq_stack+0x20/0x2c
      do_interrupt_handler+0x5c/0x70
      el1_interrupt+0x30/0x50
      el1h_64_irq_handler+0x18/0x2c
      el1h_64_irq+0x7c/0x80
      __setup_irq+0x4c0/0x950
      request_threaded_irq+0xf4/0x1bc
      mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige]
      mlxbf_gige_open+0x5c/0x170 [mlxbf_gige]
      __dev_open+0x100/0x220
      __dev_change_flags+0x16c/0x1f0
      dev_change_flags+0x2c/0x70
      do_setlink+0x220/0xa40
      __rtnl_newlink+0x56c/0x8a0
      rtnl_newlink+0x58/0x84
      rtnetlink_rcv_msg+0x138/0x3c4
      netlink_rcv_skb+0x64/0x130
      rtnetlink_rcv+0x20/0x30
      netlink_unicast+0x2ec/0x360
      netlink_sendmsg+0x278/0x490
      __sock_sendmsg+0x5c/0x6c
      ____sys_sendmsg+0x290/0x2d4
      ___sys_sendmsg+0x84/0xd0
      __sys_sendmsg+0x70/0xd0
      __arm64_sys_sendmsg+0x2c/0x40
      invoke_syscall+0x78/0x100
      el0_svc_common.constprop.0+0x54/0x184
      do_el0_svc+0x30/0xac
      el0_svc+0x48/0x160
      el0t_64_sync_handler+0xa4/0x12c
      el0t_64_sync+0x1a4/0x1a8
     Code: bad PC value
     ---[ end trace 7d1c3f3bf9d81885 ]---
     Kernel panic - not syncing: Oops: Fatal exception in interrupt
     Kernel Offset: 0x2870a7a00000 from 0xffff800008000000
     PHYS_OFFSET: 0x80000000
     CPU features: 0x0,000005c1,a3332a5a
     Memory Limit: none
     ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

The exception happens because there is a pending RX interrupt before the
call to request_irq(RX IRQ) executes.  Then, the RX IRQ handler fires
immediately after this request_irq() completes. The RX IRQ handler runs
"napi_schedule()" before NAPI is fully initialized via "netif_napi_add()"
and "napi_enable()", both which happen later in the open() logic.

The logic in mlxbf_gige_open() must fully initialize NAPI before any calls
to request_irq() execute.

Fixes: f92e186 ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: David Thompson <davthompson@nvidia.com>
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Link: https://lore.kernel.org/r/20240325183627.7641-1-davthompson@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
amboar pushed a commit to amboar/linux that referenced this pull request Apr 15, 2024
The mlxbf_gige driver encounters a NULL pointer exception in
mlxbf_gige_open() when kdump is enabled.  The sequence to reproduce
the exception is as follows:
a) enable kdump
b) trigger kdump via "echo c > /proc/sysrq-trigger"
c) kdump kernel executes
d) kdump kernel loads mlxbf_gige module
e) the mlxbf_gige module runs its open() as the
   the "oob_net0" interface is brought up
f) mlxbf_gige module will experience an exception
   during its open(), something like:

     Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
     Mem abort info:
       ESR = 0x0000000086000004
       EC = 0x21: IABT (current EL), IL = 32 bits
       SET = 0, FnV = 0
       EA = 0, S1PTW = 0
       FSC = 0x04: level 0 translation fault
     user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000
     [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
     Internal error: Oops: 0000000086000004 [openbmc#1] SMP
     CPU: 0 PID: 812 Comm: NetworkManager Tainted: G           OE     5.15.0-1035-bluefield openbmc#37-Ubuntu
     Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024
     pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
     pc : 0x0
     lr : __napi_poll+0x40/0x230
     sp : ffff800008003e00
     x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff
     x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8
     x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000
     x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000
     x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0
     x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c
     x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398
     x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2
     x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100
     x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238
     Call trace:
      0x0
      net_rx_action+0x178/0x360
      __do_softirq+0x15c/0x428
      __irq_exit_rcu+0xac/0xec
      irq_exit+0x18/0x2c
      handle_domain_irq+0x6c/0xa0
      gic_handle_irq+0xec/0x1b0
      call_on_irq_stack+0x20/0x2c
      do_interrupt_handler+0x5c/0x70
      el1_interrupt+0x30/0x50
      el1h_64_irq_handler+0x18/0x2c
      el1h_64_irq+0x7c/0x80
      __setup_irq+0x4c0/0x950
      request_threaded_irq+0xf4/0x1bc
      mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige]
      mlxbf_gige_open+0x5c/0x170 [mlxbf_gige]
      __dev_open+0x100/0x220
      __dev_change_flags+0x16c/0x1f0
      dev_change_flags+0x2c/0x70
      do_setlink+0x220/0xa40
      __rtnl_newlink+0x56c/0x8a0
      rtnl_newlink+0x58/0x84
      rtnetlink_rcv_msg+0x138/0x3c4
      netlink_rcv_skb+0x64/0x130
      rtnetlink_rcv+0x20/0x30
      netlink_unicast+0x2ec/0x360
      netlink_sendmsg+0x278/0x490
      __sock_sendmsg+0x5c/0x6c
      ____sys_sendmsg+0x290/0x2d4
      ___sys_sendmsg+0x84/0xd0
      __sys_sendmsg+0x70/0xd0
      __arm64_sys_sendmsg+0x2c/0x40
      invoke_syscall+0x78/0x100
      el0_svc_common.constprop.0+0x54/0x184
      do_el0_svc+0x30/0xac
      el0_svc+0x48/0x160
      el0t_64_sync_handler+0xa4/0x12c
      el0t_64_sync+0x1a4/0x1a8
     Code: bad PC value
     ---[ end trace 7d1c3f3bf9d81885 ]---
     Kernel panic - not syncing: Oops: Fatal exception in interrupt
     Kernel Offset: 0x2870a7a00000 from 0xffff800008000000
     PHYS_OFFSET: 0x80000000
     CPU features: 0x0,000005c1,a3332a5a
     Memory Limit: none
     ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

The exception happens because there is a pending RX interrupt before the
call to request_irq(RX IRQ) executes.  Then, the RX IRQ handler fires
immediately after this request_irq() completes. The RX IRQ handler runs
"napi_schedule()" before NAPI is fully initialized via "netif_napi_add()"
and "napi_enable()", both which happen later in the open() logic.

The logic in mlxbf_gige_open() must fully initialize NAPI before any calls
to request_irq() execute.

Fixes: f92e186 ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: David Thompson <davthompson@nvidia.com>
Reviewed-by: Asmaa Mnebhi <asmaa@nvidia.com>
Link: https://lore.kernel.org/r/20240325183627.7641-1-davthompson@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants