Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shell/2600 #1

Merged
merged 2 commits into from
Jun 9, 2014
Merged

Shell/2600 #1

merged 2 commits into from
Jun 9, 2014

Conversation

dbnicholson
Copy link
Member

Changes needed to fix package generation. Without these, the debian rules were picking up the wrong version and creating packages with the wrong names. This was causing the updated packages to be rejected from the apt repo by reprepro since files of the same name but different checksum already existed.

[endlessm/eos-shell#2600]

OBS adds an additional version suffix at the beginning of the build in
debian/changelog. This breaks the Ubuntu flow where all real changes go
into debian.master/changelog. Likewise, the version detection in the
rules (unlike the debian packaging tools) read debian.master/changelog.
We need to preserve the OBS version, so keep the two files in sync
during clean.

[endlessm/eos-shell#2600]
With the previous hack to keep the two changelogs in sync, the debian
packaging will now regard the version OBS generates correctly. That
means that when OBS is running, the previous version will be endlessX
and the current version will be endlessXbemY.

In that case, we need to change the module ABI directory name (or add a
new module directory) every time we do a release so that the the ABI
checker will find the "previous" version file.

[endlessm/eos-shell#2600]
rshuler pushed a commit that referenced this pull request Jun 9, 2014
@rshuler rshuler merged commit 9e62de8 into endless Jun 9, 2014
@rshuler rshuler deleted the shell/2600 branch June 9, 2014 14:34
dsd pushed a commit that referenced this pull request Aug 11, 2014
Dave Jones got the following lockdep splat:

>  ======================================================
>  [ INFO: possible circular locking dependency detected ]
>  3.12.0-rc3+ #92 Not tainted
>  -------------------------------------------------------
>  trinity-child2/15191 is trying to acquire lock:
>   (&rdp->nocb_wq){......}, at: [<ffffffff8108ff43>] __wake_up+0x23/0x50
>
> but task is already holding lock:
>   (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&ctx->lock){-.-...}:
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
>         [<ffffffff811500ff>] __perf_event_task_sched_out+0x2df/0x5e0
>         [<ffffffff81091b83>] perf_event_task_sched_out+0x93/0xa0
>         [<ffffffff81732052>] __schedule+0x1d2/0xa20
>         [<ffffffff81732f30>] preempt_schedule_irq+0x50/0xb0
>         [<ffffffff817352b6>] retint_kernel+0x26/0x30
>         [<ffffffff813eed04>] tty_flip_buffer_push+0x34/0x50
>         [<ffffffff813f0504>] pty_write+0x54/0x60
>         [<ffffffff813e900d>] n_tty_write+0x32d/0x4e0
>         [<ffffffff813e5838>] tty_write+0x158/0x2d0
>         [<ffffffff811c4850>] vfs_write+0xc0/0x1f0
>         [<ffffffff811c52cc>] SyS_write+0x4c/0xa0
>         [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
>
> -> #2 (&rq->lock){-.-.-.}:
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff81733f90>] _raw_spin_lock+0x40/0x80
>         [<ffffffff810980b2>] wake_up_new_task+0xc2/0x2e0
>         [<ffffffff81054336>] do_fork+0x126/0x460
>         [<ffffffff81054696>] kernel_thread+0x26/0x30
>         [<ffffffff8171ff93>] rest_init+0x23/0x140
>         [<ffffffff81ee1e4b>] start_kernel+0x3f6/0x403
>         [<ffffffff81ee1571>] x86_64_start_reservations+0x2a/0x2c
>         [<ffffffff81ee1664>] x86_64_start_kernel+0xf1/0xf4
>
> -> #1 (&p->pi_lock){-.-.-.}:
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
>         [<ffffffff810979d1>] try_to_wake_up+0x31/0x350
>         [<ffffffff81097d62>] default_wake_function+0x12/0x20
>         [<ffffffff81084af8>] autoremove_wake_function+0x18/0x40
>         [<ffffffff8108ea38>] __wake_up_common+0x58/0x90
>         [<ffffffff8108ff59>] __wake_up+0x39/0x50
>         [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
>         [<ffffffff81111450>] __call_rcu+0x140/0x820
>         [<ffffffff81111b8d>] call_rcu+0x1d/0x20
>         [<ffffffff81093697>] cpu_attach_domain+0x287/0x360
>         [<ffffffff81099d7e>] build_sched_domains+0xe5e/0x10a0
>         [<ffffffff81efa7fc>] sched_init_smp+0x3b7/0x47a
>         [<ffffffff81ee1f4e>] kernel_init_freeable+0xf6/0x202
>         [<ffffffff817200be>] kernel_init+0xe/0x190
>         [<ffffffff8173d22c>] ret_from_fork+0x7c/0xb0
>
> -> #0 (&rdp->nocb_wq){......}:
>         [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
>         [<ffffffff810cc243>] lock_acquire+0x93/0x200
>         [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
>         [<ffffffff8108ff43>] __wake_up+0x23/0x50
>         [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
>         [<ffffffff81111450>] __call_rcu+0x140/0x820
>         [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
>         [<ffffffff81149abf>] put_ctx+0x4f/0x70
>         [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
>         [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
>         [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
>         [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
>         [<ffffffff8173d4e4>] tracesys+0xdd/0xe2
>
> other info that might help us debug this:
>
> Chain exists of:
>   &rdp->nocb_wq --> &rq->lock --> &ctx->lock
>
>   Possible unsafe locking scenario:
>
>         CPU0                    CPU1
>         ----                    ----
>    lock(&ctx->lock);
>                                 lock(&rq->lock);
>                                 lock(&ctx->lock);
>    lock(&rdp->nocb_wq);
>
>  *** DEADLOCK ***
>
> 1 lock held by trinity-child2/15191:
>  #0:  (&ctx->lock){-.-...}, at: [<ffffffff81154c19>] perf_event_exit_task+0x109/0x230
>
> stack backtrace:
> CPU: 2 PID: 15191 Comm: trinity-child2 Not tainted 3.12.0-rc3+ #92
>  ffffffff82565b70 ffff880070c2dbf8 ffffffff8172a363 ffffffff824edf40
>  ffff880070c2dc38 ffffffff81726741 ffff880070c2dc90 ffff88022383b1c0
>  ffff88022383aac0 0000000000000000 ffff88022383b188 ffff88022383b1c0
> Call Trace:
>  [<ffffffff8172a363>] dump_stack+0x4e/0x82
>  [<ffffffff81726741>] print_circular_bug+0x200/0x20f
>  [<ffffffff810cb7ca>] __lock_acquire+0x191a/0x1be0
>  [<ffffffff810c6439>] ? get_lock_stats+0x19/0x60
>  [<ffffffff8100b2f4>] ? native_sched_clock+0x24/0x80
>  [<ffffffff810cc243>] lock_acquire+0x93/0x200
>  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
>  [<ffffffff8173419b>] _raw_spin_lock_irqsave+0x4b/0x90
>  [<ffffffff8108ff43>] ? __wake_up+0x23/0x50
>  [<ffffffff8108ff43>] __wake_up+0x23/0x50
>  [<ffffffff8110d4f8>] __call_rcu_nocb_enqueue+0xa8/0xc0
>  [<ffffffff81111450>] __call_rcu+0x140/0x820
>  [<ffffffff8109bc8f>] ? local_clock+0x3f/0x50
>  [<ffffffff81111bb0>] kfree_call_rcu+0x20/0x30
>  [<ffffffff81149abf>] put_ctx+0x4f/0x70
>  [<ffffffff81154c3e>] perf_event_exit_task+0x12e/0x230
>  [<ffffffff81056b8d>] do_exit+0x30d/0xcc0
>  [<ffffffff810c9af5>] ? trace_hardirqs_on_caller+0x115/0x1e0
>  [<ffffffff810c9bcd>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff8105893c>] do_group_exit+0x4c/0xc0
>  [<ffffffff810589c4>] SyS_exit_group+0x14/0x20
>  [<ffffffff8173d4e4>] tracesys+0xdd/0xe2

The underlying problem is that perf is invoking call_rcu() with the
scheduler locks held, but in NOCB mode, call_rcu() will with high
probability invoke the scheduler -- which just might want to use its
locks.  The reason that call_rcu() needs to invoke the scheduler is
to wake up the corresponding rcuo callback-offload kthread, which
does the job of starting up a grace period and invoking the callbacks
afterwards.

One solution (championed on a related problem by Lai Jiangshan) is to
simply defer the wakeup to some point where scheduler locks are no longer
held.  Since we don't want to unnecessarily incur the cost of such
deferral, the task before us is threefold:

1.	Determine when it is likely that a relevant scheduler lock is held.

2.	Defer the wakeup in such cases.

3.	Ensure that all deferred wakeups eventually happen, preferably
	sooner rather than later.

We use irqs_disabled_flags() as a proxy for relevant scheduler locks
being held.  This works because the relevant locks are always acquired
with interrupts disabled.  We may defer more often than needed, but that
is at least safe.

The wakeup deferral is tracked via a new field in the per-CPU and
per-RCU-flavor rcu_data structure, namely ->nocb_defer_wakeup.

This flag is checked by the RCU core processing.  The __rcu_pending()
function now checks this flag, which causes rcu_check_callbacks()
to initiate RCU core processing at each scheduling-clock interrupt
where this flag is set.  Of course this is not sufficient because
scheduling-clock interrupts are often turned off (the things we used to
be able to count on!).  So the flags are also checked on entry to any
state that RCU considers to be idle, which includes both NO_HZ_IDLE idle
state and NO_HZ_FULL user-mode-execution state.

This approach should allow call_rcu() to be invoked regardless of what
locks you might be holding, the key word being "should".

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
___cfg80211_scan_done() can be called in some cases
(e.g. on NETDEV_DOWN) before the low level driver
notified scan completion (which is indicated by
passing leak=true).

Clearing rdev->scan_req in this case is buggy, as
scan_done_wk might have already being queued/running
(and can't be flushed as it takes rtnl()).

If a new scan will be requested at this stage, the
scan_done_wk will try freeing it (instead of the
previous scan), and this will later result in
a use after free.

Simply remove the "leak" option, and replace it with
a standard WARN_ON.

An example backtrace after such crash:
Unable to handle kernel paging request at virtual address fffffee5
pgd = c0004000
[fffffee5] *pgd=9fdf6821, *pte=00000000, *ppte=00000000
Internal error: Oops: 17 [#1] SMP ARM
PC is at cfg80211_scan_done+0x28/0xc4 [cfg80211]
LR is at __ieee80211_scan_completed+0xe4/0x2dc [mac80211]
[<bf0077b0>] (cfg80211_scan_done+0x28/0xc4 [cfg80211])
[<bf0973d4>] (__ieee80211_scan_completed+0xe4/0x2dc [mac80211])
[<bf0982cc>] (ieee80211_scan_work+0x94/0x4f0 [mac80211])
[<c005fd10>] (process_one_work+0x1b0/0x4a8)
[<c0060404>] (worker_thread+0x138/0x37c)
[<c0066d70>] (kthread+0xa4/0xb0)

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
dsd pushed a commit that referenced this pull request Aug 11, 2014
After the previous fix, there still has another ASSERT failure if turning
off any type of quota while fsstress is running at the same time.

Backtrace in this case:

[   50.867897] XFS: Assertion failed: XFS_IS_GQUOTA_ON(mp), file: fs/xfs/xfs_qm.c, line: 2118
[   50.867924] ------------[ cut here ]------------
... <snip>
[   50.867957] Kernel BUG at ffffffffa0b55a32 [verbose debug info unavailable]
[   50.867999] invalid opcode: 0000 [#1] SMP
[   50.869407] Call Trace:
[   50.869446]  [<ffffffffa0bc408a>] xfs_qm_vop_create_dqattach+0x19a/0x2d0 [xfs]
[   50.869512]  [<ffffffffa0b9cc45>] xfs_create+0x5c5/0x6a0 [xfs]
[   50.869564]  [<ffffffffa0b5307c>] xfs_vn_mknod+0xac/0x1d0 [xfs]
[   50.869615]  [<ffffffffa0b531d6>] xfs_vn_mkdir+0x16/0x20 [xfs]
[   50.869655]  [<ffffffff811becd5>] vfs_mkdir+0x95/0x130
[   50.869689]  [<ffffffff811bf63a>] SyS_mkdirat+0xaa/0xe0
[   50.869723]  [<ffffffff811bf689>] SyS_mkdir+0x19/0x20
[   50.869757]  [<ffffffff8170f7dd>] system_call_fastpath+0x1a/0x1f
[   50.869793] Code: 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 <snip>
[   50.870003] RIP  [<ffffffffa0b55a32>] assfail+0x22/0x30 [xfs]
[   50.870050]  RSP <ffff88002941fd60>
[   50.879251] ---[ end trace c93a2b342341c65b ]---

We're hitting the ASSERT(XFS_IS_*QUOTA_ON(mp)) in xfs_qm_vop_create_dqattach(),
however the assertion itself is not right IMHO.  While performing quota off, we
firstly clear the XFS_*QUOTA_ACTIVE bit(s) from struct xfs_mount without taking
any special locks, see xfs_qm_scall_quotaoff().  Hence there is no guarantee
that the desired quota is still active.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
dsd pushed a commit that referenced this pull request Aug 11, 2014
ttyA has ld associated to n_gsm, when ttyA is closing, it triggers
to release gsmttyB's ld data dlci[B], then race would happen if gsmttyB
is opening in parallel.

(Note: This patch set differs from previous set in that it uses mutex
instead of spin lock to avoid race, so that it avoids sleeping in automic
context)

Here are race cases we found recently in test:

CASE #1
====================================================================
releasing dlci[B] race with gsmtty_install(gsmttyB), then panic
in gsmtty_open(gsmttyB), as below:

 tty_release(ttyA)                  tty_open(gsmttyB)
     |                                   |
   -----                           gsmtty_install(gsmttyB)
     |                                   |
   -----                    gsm_dlci_alloc(gsmttyB) => alloc dlci[B]
 tty_ldisc_release(ttyA)               -----
     |                                   |
 gsm_dlci_release(dlci[B])             -----
     |                                   |
 gsm_dlci_free(dlci[B])                -----
     |                                   |
   -----                           gsmtty_open(gsmttyB)

 gsmtty_open()
 {
     struct gsm_dlci *dlci = tty->driver_data; => here it uses dlci[B]
     ...
 }

 In gsmtty_open(gsmttyA), it uses dlci[B] which was release, so hit a panic.
=====================================================================

CASE #2
=====================================================================
releasing dlci[0] race with gsmtty_install(gsmttyB), then panic
in gsmtty_open(), as below:

 tty_release(ttyA)                  tty_open(gsmttyB)
     |                                   |
   -----                           gsmtty_install(gsmttyB)
     |                                   |
   -----                    gsm_dlci_alloc(gsmttyB) => alloc dlci[B]
     |                                   |
   -----                         gsmtty_open(gsmttyB) fail
     |                                   |
   -----                           tty_release(gsmttyB)
     |                                   |
   -----                           gsmtty_close(gsmttyB)
     |                                   |
   -----                        gsmtty_detach_dlci(dlci[B])
     |                                   |
   -----                             dlci_put(dlci[B])
     |                                   |
 tty_ldisc_release(ttyA)               -----
     |                                   |
 gsm_dlci_release(dlci[0])             -----
     |                                   |
 gsm_dlci_free(dlci[0])                -----
     |                                   |
   -----                             dlci_put(dlci[0])

 In gsmtty_detach_dlci(dlci[B]), it tries to use dlci[0] which was released,
 then hit panic.
=====================================================================

IMHO, n_gsm tty operations would refer released ldisc,  as long as
gsm_dlci_release() has chance to release ldisc data when some gsmtty operations
are ongoing..

This patch is try to avoid it by:

1) in n_gsm driver, use a global gsm mutex lock to avoid gsm_dlci_release() run in
parallel with gsmtty_install();

2) Increase dlci's ref count in gsmtty_install() instead of in gsmtty_open(), the
purpose is to prevent gsm_dlci_release() releasing dlci after gsmtty_install()
allocats dlci but before gsmtty_open increases dlci's ref count;

3) Decrease dlci's ref count in gsmtty_remove(), a tty framework API, this is the
opposite process of step 2).

Signed-off-by: Chao Bi <chao.bi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
vb2_fop_release does not hold the lock although it is modifying the
queue->owner field.
This could lead to race conditions on the vb2_perform_io function
when multiple applications are accessing the video device via
read/write API:
[ 308.297741] BUG: unable to handle kernel NULL pointer dereference at
0000000000000260
[ 308.297759] IP: [<ffffffffa07a9fd2>] vb2_perform_fileio+0x372/0x610
[videobuf2_core]
[ 308.297794] PGD 159719067 PUD 158119067 PMD 0
[ 308.297812] Oops: 0000 #1 SMP
[ 308.297826] Modules linked in: qt5023_video videobuf2_dma_sg
qtec_xform videobuf2_vmalloc videobuf2_memops videobuf2_core
qtec_white qtec_mem gpio_xilinx qtec_cmosis qtec_pcie fglrx(PO)
spi_xilinx spi_bitbang qt5023
[ 308.297888] CPU: 1 PID: 2189 Comm: java Tainted: P O 3.11.0-qtec-standard #1
[ 308.297919] Hardware name: QTechnology QT5022/QT5022, BIOS
PM_2.1.0.309 X64 05/23/2013
[ 308.297952] task: ffff8801564e1690 ti: ffff88014dc02000 task.ti:
ffff88014dc02000
[ 308.297962] RIP: 0010:[<ffffffffa07a9fd2>] [<ffffffffa07a9fd2>]
vb2_perform_fileio+0x372/0x610 [videobuf2_core]
[ 308.297985] RSP: 0018:ffff88014dc03df8 EFLAGS: 00010202
[ 308.297995] RAX: 0000000000000000 RBX: ffff880158a23000 RCX: dead000000100100
[ 308.298003] RDX: 0000000000000000 RSI: dead000000200200 RDI: 0000000000000000
[ 308.298012] RBP: ffff88014dc03e58 R08: 0000000000000000 R09: 0000000000000001
[ 308.298020] R10: ffffea00051e8380 R11: ffff88014dc03fd8 R12: ffff880158a23070
[ 308.298029] R13: ffff8801549040b8 R14: 0000000000198000 R15: 0000000001887e60
[ 308.298040] FS: 00007f65130d5700(0000) GS:ffff88015ed00000(0000)
knlGS:0000000000000000
[ 308.298049] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 308.298057] CR2: 0000000000000260 CR3: 0000000159630000 CR4: 00000000000007e0
[ 308.298064] Stack:
[ 308.298071] ffff880156416c00 0000000000198000 0000000000000000
ffff880100000001
[ 308.298087] ffff88014dc03f50 00000000810a79ca 0002000000000001
ffff880154904718
[ 308.298101] ffff880156416c00 0000000000198000 ffff880154904338
ffff88014dc03f50
[ 308.298116] Call Trace:
[ 308.298143] [<ffffffffa07aa3c4>] vb2_read+0x14/0x20 [videobuf2_core]
[ 308.298198] [<ffffffffa07aa494>] vb2_fop_read+0xc4/0x120 [videobuf2_core]
[ 308.298252] [<ffffffff8154ee9e>] v4l2_read+0x7e/0xc0
[ 308.298296] [<ffffffff8116e639>] vfs_read+0xa9/0x160
[ 308.298312] [<ffffffff8116e882>] SyS_read+0x52/0xb0
[ 308.298328] [<ffffffff81784179>] tracesys+0xd0/0xd5
[ 308.298335] Code: e5 d6 ff ff 83 3d be 24 00 00 04 89 c2 4c 8b 45 b0
44 8b 4d b8 0f 8f 20 02 00 00 85 d2 75 32 83 83 78 03 00 00 01 4b 8b
44 c5 48 <8b> 88 60 02 00 00 85 c9 0f 84 b0 00 00 00 8b 40 58 89 c2 41
89
[ 308.298487] RIP [<ffffffffa07a9fd2>] vb2_perform_fileio+0x372/0x610
[videobuf2_core]
[ 308.298507] RSP <ffff88014dc03df8>
[ 308.298514] CR2: 0000000000000260
[ 308.298526] ---[ end trace e8f01717c96d1e41 ]---

Signed-off-by: Ricardo Ribalda <ricardo.ribalda@gmail.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
dsd pushed a commit that referenced this pull request Aug 11, 2014
This patches fixes the following warning by replacing smp_processor_id()
with raw_smp_processor_id():

[   11.120893] BUG: using smp_processor_id() in preemptible [00000000] code: arping/3510
[   11.120913] caller is .packet_sendmsg+0xc14/0xe68
[   11.120920] CPU: 13 PID: 3510 Comm: arping Not tainted 3.13.0-rc3-next-20131211-dirty #1
[   11.120926] Call Trace:
[   11.120932] [c0000001f803f6f0] [c0000000000138dc] .show_stack+0x110/0x25c (unreliable)
[   11.120942] [c0000001f803f7e0] [c00000000083dd24] .dump_stack+0xa0/0x37c
[   11.120951] [c0000001f803f870] [c000000000493fd4] .debug_smp_processor_id+0xfc/0x12c
[   11.120959] [c0000001f803f900] [c0000000007eba78] .packet_sendmsg+0xc14/0xe68
[   11.120968] [c0000001f803fa80] [c000000000700968] .sock_sendmsg+0xa0/0xe0
[   11.120975] [c0000001f803fbf0] [c0000000007014d8] .SyS_sendto+0x100/0x148
[   11.120983] [c0000001f803fd60] [c0000000006fff10] .SyS_socketcall+0x1c4/0x2e8
[   11.120990] [c0000001f803fe30] [c00000000000a1e4] syscall_exit+0x0/0x9c

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dsd pushed a commit that referenced this pull request Aug 11, 2014
Adding clocks from a kernel module can cause a NULL pointer
dereference if the parent of a clock is added after the clock is
added. This happens because __clk_init() iterates over the list
of orphans and reparents the orphans to the clock being
registered before creating the debugfs entry for the clock.
Create the debugfs entry first before reparenting the orphans.

Unable to handle kernel NULL pointer dereference at virtual address 00000028
pgd = ef3e4000
[00000028] *pgd=bf810831
Internal error: Oops: 17 [#1] PREEMPT SMP ARM
Modules linked in: mmcc_8960(+)
CPU: 0 PID: 52 Comm: modprobe Not tainted 3.12.0-rc2-00023-g1021a28-dirty #659
task: ef319200 ti: ef3a6000 task.ti: ef3a6000
PC is at lock_rename+0x24/0xc4
LR is at debugfs_rename+0x34/0x208
pc : [<c0317238>]    lr : [<c047dfe4>]    psr: 00000013
sp : ef3a7b88  ip : ef3a7ba8  fp : ef3a7ba4
r10: ef3d51cc  r9 : ef3bc680  r8 : ef3d5210
r7 : ef3bc640  r6 : eee287e0  r5 : eee287e0  r4 : 00000000
r3 : ef3bc640  r2 : 00000000  r1 : eee287e0  r0 : 00000000
Flags: nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5787d  Table: af3e406a  DAC: 00000015
Process modprobe (pid: 52, stack limit = 0xef3a6240)
Stack: (0xef3a7b88 to 0xef3a8000)
7b80:                   ef3bc640 ee4047e0 00000000 eee287e0 ef3a7bec ef3a7ba8
7ba0: c047dfe4 c0317220 ef3bc680 ef3d51cc ef3a7bdc ef3a7bc0 c06e29d0 c0268784
7bc0: c08946e8 ef3d5210 00000000 ef3bc700 ef3d5290 ef3d5210 ef3bc680 ef3d51cc
7be0: ef3a7c0c ef3a7bf0 c05b9e9c c047dfbc 00000000 00000000 ef3d5210 ef3d5290
7c00: ef3a7c24 ef3a7c10 c05baebc c05b9e30 00000001 00000001 ef3a7c64 ef3a7c28
7c20: c05bb124 c05bae9c bf000cd8 ef3bc7c0 000000d0 c0ff129c bf001774 00000002
7c40: ef3bc740 ef3d5290 ef0f9a10 bf001774 bf00042c 00000061 ef3a7c8c ef3a7c68
7c60: c05bb480 c05baed8 bf001774 ef3d5290 ef0f9a10 bf001774 ef38bc10 ef0f9a00
7c80: ef3a7cac ef3a7c90 c05bb5a8 c05bb3a0 bf001774 00000062 ef0f9a10 ef38bc18
7ca0: ef3a7cec ef3a7cb0 bf00010c c05bb56c 00000000 ef38ba00 00000000 ef3d60d0
7cc0: ef3a7cdc c0fefc24 ef0f9a10 c0a091c0 bf000d24 00000000 bf0029f0 bf006000
7ce0: ef3a7cfc ef3a7cf0 c05156c0 bf000040 ef3a7d2c ef3a7d00 c0513f5c c05156a8
7d00: ef3a7d2c ef0f9a10 ef0f9a10 bf000d24 ef0f9a44 c09ca588 00000000 bf006000
7d20: ef3a7d4c ef3a7d30 c05142b8 c0513ecc ef0fd25c 00000000 bf000d24 c0514214
7d40: ef3a7d74 ef3a7d50 c0512030 c0514220 ef0050a8 ef0fd250 ef0050f8 bf000d24
7d60: ef37c100 c09ed150 ef3a7d84 ef3a7d78 c05139c8 c0511fd8 ef3a7
7d80: c051344c c05139a8 bf000864 c09ca588 ef3a7db4 bf000d24 bf002
7da0: c09ca588 00000000 ef3a7dcc ef3a7db8 c05149dc c0513360 ef3a7
7dc0: ef3a7ddc ef3a7dd0 c0515914 c0514960 ef3a7dec ef3a7de0 bf006
7de0: ef3a7e74 ef3a7df0 c0208800 bf00600c ef3a7e1c ef3a7e00 c04c5
7e00: ffffffff c09d46c4 00000000 bf0029a8 ef3a7e34 ef3a7e20 c024c
7e20: ffffffff c09d46c4 ef3a7e5c ef3a7e38 c024e2fc c024ce40 00000
7e40: ef3a7f48 bf0029b4 bf0029a8 271aeb1c ef3a7f48 bf0029a8 00000001 ef383c00
7e60: bf0029f0 00000001 ef3a7f3c ef3a7e78 c028fac4 c0208718 bf0029b4 00007fff
7e80: c028cd58 000000d2 f0065000 00000000 ef3a7ebc 00000000 00000000 bf0029b4
7ea0: 00000000 bf0029ac bf0029b4 ef3a6000 ef3a7efc c08bf128 00000000 00000000
7ec0: 00000000 00000000 00000000 00000000 6e72656b 00006c65 00000000 00000000
7ee0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
7f00: 00000000 00000000 00000000 271aeb1c ef3a7f2c 00016376 b6f38008 001d3774
7f20: 00000080 c020f968 ef3a6000 00000000 ef3a7fa4 ef3a7f40 c02904dc c028e178
7f40: c020f898 010ccfa8 f0065000 00016376 f0073f60 f0073d7d f007a1e8 00002b24
7f60: 000039e4 00000000 00000000 00000000 0000002f 00000030 00000019 00000016
7f80: 00000012 00000000 00000000 010de1b2 b6f38008 010ccfa8 00000000 ef3a7fa8
7fa0: c020f6c0 c0290434 010de1b2 b6f38008 b6f38008 00016376 001d3774 00000000
7fc0: 010de1b2 b6f38008 010ccfa8 00000080 010de1b2 bedb6f90 010de1c9 0001d8dc
7fe0: 0000000c bedb674c 0001ce30 000094c4 60000010 b6f38008 00000008 0000001d
[<c0317238>] (lock_rename+0x24/0xc4) from [<c047dfe4>] (debugfs_rename+0x34/0x208)
[<c047dfe4>] (debugfs_rename+0x34/0x208) from [<c05b9e9c>] (clk_debug_reparent+0x78/0xc0)
[<c05baebc>] (__clk_reparent+0x2c/0x3c) from [<c05bb124>] (__clk_init+0x258/0x4c8)
[<c05bb124>] (__clk_init+0x258/0x4c8) from [<c05bb480>] (_clk_register+0xec/0x1cc)
[<c05bb480>] (_clk_register+0xec/0x1cc) from [<c05bb5a8>] (devm_clk_register+0x48/0x7c)
[<c05bb5a8>] (devm_clk_register+0x48/0x7c) from [<bf00010c>] (msm_mmcc_8960_probe+0xd8/0x190 [mmcc_8960])
[<bf00010c>] (msm_mmcc_8960_probe+0xd8/0x190 [mmcc_8960]) from [<c05156c0>] (platform_drv_probe+0x24/0x28)
[<c05156c0>] (platform_drv_probe+0x24/0x28) from [<c0513f5c>] (driver_probe_device+0x9c/0x354)
[<c0513f5c>] (driver_probe_device+0x9c/0x354) from [<c05142b8>] (__driver_attach+0xa4/0xa8)
[<c05142b8>] (__driver_attach+0xa4/0xa8) from [<c0512030>] (bus_for_each_dev+0x64/0x98)
[<c0512030>] (bus_for_each_dev+0x64/0x98) from [<c05139c8>] (driver_attach+0x2c/0x30)
[<c05139c8>] (driver_attach+0x2c/0x30) from [<c051344c>] (bus_add_driver+0xf8/0x2a8)
[<c051344c>] (bus_add_driver+0xf8/0x2a8) from [<c05149dc>] (driver_register+0x88/0x104)
[<c05149dc>] (driver_register+0x88/0x104) from [<c0515914>] (__platform_driver_register+0x58/0x6c)
[<c0515914>] (__platform_driver_register+0x58/0x6c) from [<bf006018>] (msm_mmcc_8960_driver_init+0x18/0x24 [mmcc_8960])
[<bf006018>] (msm_mmcc_8960_driver_init+0x18/0x24 [mmcc_8960]) from [<c0208800>] (do_one_initcall+0xf4/0x1b8)
[<c0208800>] (do_one_initcall+0xf4/0x1b8) from [<c028fac4>] (load_module+0x1958/0x22bc)
[<c028fac4>] (load_module+0x1958/0x22bc) from [<c02904dc>] (SyS_init_module+0xb4/0x120)
[<c02904dc>] (SyS_init_module+0xb4/0x120) from [<c020f6c0>] (ret_fast_syscall+0x0/0x48)
Code: e1500001 e1a04000 e1a05001 0a000021 (e5903028)

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
iwl_nvm_init() return value wasn't checked in some
path, which resulted in the following panic (if
there was some issue with the nvm):

Unable to handle kernel NULL pointer dereference at virtual address 00000004
pgd = d0460000
[00000004] *pgd=00000000
Internal error: Oops: 5 [#1] PREEMPT SMP
Modules linked in: iwlmvm(+) iwlwifi mac80211 cfg80211 compat [last unloaded: compat]
PC is at iwl_mvm_mac_setup_register+0x12c/0x460 [iwlmvm]
LR is at 0x2710
pc : [<bf50dd4c>]    lr : [<00002710>]    psr: 20800013
sp : d00cfe18  ip : 0000081e  fp : d006b908
r10: d0711408  r9 : bf532e64  r8 : d006b5bc
r7 : d01af000  r6 : bf39cefc  r5 : d006ab00  r4 : d006b5a4
r3 : 00000001  r2 : 00000000  r1 : d006a120  r0 : d006b860

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
dsd pushed a commit that referenced this pull request Aug 11, 2014
Commit 4d9b109(tty: Prevent deadlock in n_gsm driver)
tried to close all the virtual ports synchronously before closing the
phycial ports, so the tty_vhangup() is used.

But the tty_unlock/lock() is wrong:
tty_release
 tty_ldisc_release
  tty_lock_pair(tty, o_tty)  < == Here the tty is for physical port
  tty_ldisc_kill
    gsmld_close
      gsm_cleanup_mux
        gsm_dlci_release
          tty = tty_port_tty_get(&dlci->port)
                            < == Here the tty(s) are for virtual port

They are different ttys, so before tty_vhangup(virtual tty), do not need
to call the tty_unlock(virtual tty) at all which causes unbalanced unlock
warning.

When enabling mutex debugging option, we will hit the below warning also:
[   99.276903] =====================================
[   99.282172] [ BUG: bad unlock balance detected! ]
[   99.287442] 3.10.20-261976-gaec5ba0 #44 Tainted: G           O
[   99.293972] -------------------------------------
[   99.299240] mmgr/152 is trying to release lock (&tty->legacy_mutex) at:
[   99.306693] [<c1b2dcad>] mutex_unlock+0xd/0x10
[   99.311669] but there are no more locks to release!
[   99.317131]
[   99.317131] other info that might help us debug this:
[   99.324440] 3 locks held by mmgr/152:
[   99.328542]  #0:  (&tty->legacy_mutex/1){......}, at: [<c1b30ab0>] tty_lock_nested+0x40/0x90
[   99.338116]  #1:  (&tty->ldisc_mutex){......}, at: [<c15dbd02>] tty_ldisc_kill+0x22/0xd0
[   99.347284]  #2:  (&gsm->mutex){......}, at: [<c15e3d83>] gsm_cleanup_mux+0x73/0x170
[   99.356060]
[   99.356060] stack backtrace:
[   99.360932] CPU: 0 PID: 152 Comm: mmgr Tainted: G           O 3.10.20-261976-gaec5ba0 #44
[   99.370086]  ef4a4de0 ef4a4de0 ef4c1d98 c1b27b91 ef4c1db8 c1292655 c1dd10f5 c1b2dcad
[   99.378921]  c1b2dcad ef4a4de0 ef4a528c ffffffff ef4c1dfc c12930dd 00000246 00000000
[   99.387754]  00000000 00000000 c15e1926 00000000 00000001 ddfa7530 00000003 c1b2dcad
[   99.396588] Call Trace:
[   99.399326]  [<c1b27b91>] dump_stack+0x16/0x18
[   99.404307]  [<c1292655>] print_unlock_imbalance_bug+0xe5/0xf0
[   99.410840]  [<c1b2dcad>] ? mutex_unlock+0xd/0x10
[   99.416110]  [<c1b2dcad>] ? mutex_unlock+0xd/0x10
[   99.421382]  [<c12930dd>] lock_release_non_nested+0x1cd/0x210
[   99.427818]  [<c15e1926>] ? gsm_destroy_network+0x36/0x130
[   99.433964]  [<c1b2dcad>] ? mutex_unlock+0xd/0x10
[   99.439235]  [<c12931a2>] lock_release+0x82/0x1c0
[   99.444505]  [<c1b2dcad>] ? mutex_unlock+0xd/0x10
[   99.449776]  [<c1b2dcad>] ? mutex_unlock+0xd/0x10
[   99.455047]  [<c1b2dc2f>] __mutex_unlock_slowpath+0x5f/0xd0
[   99.461288]  [<c1b2dcad>] mutex_unlock+0xd/0x10
[   99.466365]  [<c1b30bb1>] tty_unlock+0x21/0x50
[   99.471345]  [<c15e3dd1>] gsm_cleanup_mux+0xc1/0x170
[   99.476906]  [<c15e44d2>] gsmld_close+0x52/0x90
[   99.481983]  [<c15db905>] tty_ldisc_close.isra.1+0x35/0x50
[   99.488127]  [<c15dbd0c>] tty_ldisc_kill+0x2c/0xd0
[   99.493494]  [<c15dc7af>] tty_ldisc_release+0x2f/0x50
[   99.499152]  [<c15d572c>] tty_release+0x37c/0x4b0
[   99.504424]  [<c1b2dcad>] ? mutex_unlock+0xd/0x10
[   99.509695]  [<c1b2dcad>] ? mutex_unlock+0xd/0x10
[   99.514967]  [<c1372f6e>] ? eventpoll_release_file+0x7e/0x90
[   99.521307]  [<c1335849>] __fput+0xd9/0x200
[   99.525996]  [<c133597d>] ____fput+0xd/0x10
[   99.530685]  [<c125c731>] task_work_run+0x81/0xb0
[   99.535957]  [<c12019e9>] do_notify_resume+0x49/0x70
[   99.541520]  [<c1b30dc4>] work_notifysig+0x29/0x31
[   99.546897] ------------[ cut here ]------------

So here we can call tty_vhangup() directly which is for virtual port.

Reviewed-by: Chao Bi <chao.bi@intel.com>
Signed-off-by: Liu, Chuansheng <chuansheng.liu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
Command "tcrypt sec=1 mode=403" give the follwoing error for Polling
mode:
root@am335x-evm:/# insmod tcrypt.ko sec=1 mode=403
[...]

[  346.982754] test 15 ( 4096 byte blocks, 1024 bytes per update,   4 updates):   4352 opers/sec,  17825792 bytes/sec
[  347.992661] test 16 ( 4096 byte blocks, 4096 bytes per update,   1 updates):   7095 opers/sec,  29061120 bytes/sec
[  349.002667] test 17 ( 8192 byte blocks,   16 bytes per update, 512 updates):
[  349.010882] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[  349.020037] pgd = ddeac000
[  349.022884] [00000000] *pgd=9dcb4831, *pte=00000000, *ppte=00000000
[  349.029816] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
[  349.035482] Modules linked in: tcrypt(+)
[  349.039617] CPU: 0 PID: 1473 Comm: insmod Not tainted 3.12.4-01566-g6279006-dirty #38
[  349.047832] task: dda91540 ti: ddcd2000 task.ti: ddcd2000
[  349.053517] PC is at omap_sham_xmit_dma+0x6c/0x238
[  349.058544] LR is at omap_sham_xmit_dma+0x38/0x238
[  349.063570] pc : [<c04eb7cc>]    lr : [<c04eb798>]    psr: 20000013
[  349.063570] sp : ddcd3c78  ip : 00000000  fp : 9d8980b8
[  349.075610] r10: 00000000  r9 : 00000000  r8 : 00000000
[  349.081090] r7 : 00001000  r6 : dd898000  r5 : 00000040  r4 : ddb10550
[  349.087935] r3 : 00000004  r2 : 00000010  r1 : 53100080  r0 : 00000000
[  349.094783] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[  349.102268] Control: 10c5387d  Table: 9deac019  DAC: 00000015
[  349.108294] Process insmod (pid: 1473, stack limit = 0xddcd2248)

[...]

This is because polling_mode is not enabled for ctx without FLAGS_FINUP.

For polling mode the bufcnt is made 0 unconditionally. But it should be made 0
only if it is a final update or a total is not zero(This condition is similar
to what is done in DMA case). Because of this wrong hashes are produced.

Fixing the same.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
dsd pushed a commit that referenced this pull request Aug 11, 2014
As part of normal operaions, the hrtimer subsystem frequently calls
into the timekeeping code, creating a locking order of
  hrtimer locks -> timekeeping locks

clock_was_set_delayed() was suppoed to allow us to avoid deadlocks
between the timekeeping the hrtimer subsystem, so that we could
notify the hrtimer subsytem the time had changed while holding
the timekeeping locks. This was done by scheduling delayed work
that would run later once we were out of the timekeeing code.

But unfortunately the lock chains are complex enoguh that in
scheduling delayed work, we end up eventually trying to grab
an hrtimer lock.

Sasha Levin noticed this in testing when the new seqlock lockdep
enablement triggered the following (somewhat abrieviated) message:

[  251.100221] ======================================================
[  251.100221] [ INFO: possible circular locking dependency detected ]
[  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted
[  251.101967] -------------------------------------------------------
[  251.101967] kworker/10:1/4506 is trying to acquire lock:
[  251.101967]  (timekeeper_seq){----..}, at: [<ffffffff81160e96>] retrigger_next_event+0x56/0x70
[  251.101967]
[  251.101967] but task is already holding lock:
[  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] which lock already depends on the new lock.
[  251.101967]
[  251.101967]
[  251.101967] the existing dependency chain (in reverse order) is:
[  251.101967]
-> #5 (hrtimer_bases.lock#11){-.-...}:
[snipped]
-> #4 (&rt_b->rt_runtime_lock){-.-...}:
[snipped]
-> #3 (&rq->lock){-.-.-.}:
[snipped]
-> #2 (&p->pi_lock){-.-.-.}:
[snipped]
-> #1 (&(&pool->lock)->rlock){-.-...}:
[  251.101967]        [<ffffffff81194803>] validate_chain+0x6c3/0x7b0
[  251.101967]        [<ffffffff81194d9d>] __lock_acquire+0x4ad/0x580
[  251.101967]        [<ffffffff81194ff2>] lock_acquire+0x182/0x1d0
[  251.101967]        [<ffffffff84398500>] _raw_spin_lock+0x40/0x80
[  251.101967]        [<ffffffff81153e69>] __queue_work+0x1a9/0x3f0
[  251.101967]        [<ffffffff81154168>] queue_work_on+0x98/0x120
[  251.101967]        [<ffffffff81161351>] clock_was_set_delayed+0x21/0x30
[  251.101967]        [<ffffffff811c4bd1>] do_adjtimex+0x111/0x160
[  251.101967]        [<ffffffff811e2711>] compat_sys_adjtimex+0x41/0x70
[  251.101967]        [<ffffffff843a4b49>] ia32_sysret+0x0/0x5
[  251.101967]
-> #0 (timekeeper_seq){----..}:
[snipped]
[  251.101967] other info that might help us debug this:
[  251.101967]
[  251.101967] Chain exists of:
  timekeeper_seq --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock#11

[  251.101967]  Possible unsafe locking scenario:
[  251.101967]
[  251.101967]        CPU0                    CPU1
[  251.101967]        ----                    ----
[  251.101967]   lock(hrtimer_bases.lock#11);
[  251.101967]                                lock(&rt_b->rt_runtime_lock);
[  251.101967]                                lock(hrtimer_bases.lock#11);
[  251.101967]   lock(timekeeper_seq);
[  251.101967]
[  251.101967]  *** DEADLOCK ***
[  251.101967]
[  251.101967] 3 locks held by kworker/10:1/4506:
[  251.101967]  #0:  (events){.+.+.+}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #1:  (hrtimer_work){+.+...}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #2:  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] stack backtrace:
[  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053
[  251.101967] Workqueue: events clock_was_set_work

So the best solution is to avoid calling clock_was_set_delayed() while
holding the timekeeping lock, and instead using a flag variable to
decide if we should call clock_was_set() once we've released the locks.

This works for the case here, where the do_adjtimex() was the deadlock
trigger point. Unfortuantely, in update_wall_time() we still hold
the jiffies lock, which would deadlock with the ipi triggered by
clock_was_set(), preventing us from calling it even after we drop the
timekeeping lock. So instead call clock_was_set_delayed() at that point.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: stable <stable@vger.kernel.org> #3.10+
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
The user has the option of disabling the platform driver:
00:02.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)

which is used to unplug the emulated drivers (IDE, Realtek 8169, etc)
and allow the PV drivers to take over. If the user wishes
to disable that they can set:

  xen_platform_pci=0
  (in the guest config file)

or
  xen_emul_unplug=never
  (on the Linux command line)

except it does not work properly. The PV drivers still try to
load and since the Xen platform driver is not run - and it
has not initialized the grant tables, most of the PV drivers
stumble upon:

input: Xen Virtual Keyboard as /devices/virtual/input/input5
input: Xen Virtual Pointer as /devices/virtual/input/input6M
------------[ cut here ]------------
kernel BUG at /home/konrad/ssd/konrad/linux/drivers/xen/grant-table.c:1206!
invalid opcode: 0000 [#1] SMP
Modules linked in: xen_kbdfront(+) xenfs xen_privcmd
CPU: 6 PID: 1389 Comm: modprobe Not tainted 3.13.0-rc1upstream-00021-ga6c892b-dirty #1
Hardware name: Xen HVM domU, BIOS 4.4-unstable 11/26/2013
RIP: 0010:[<ffffffff813ddc40>]  [<ffffffff813ddc40>] get_free_entries+0x2e0/0x300
Call Trace:
 [<ffffffff8150d9a3>] ? evdev_connect+0x1e3/0x240
 [<ffffffff813ddd0e>] gnttab_grant_foreign_access+0x2e/0x70
 [<ffffffffa0010081>] xenkbd_connect_backend+0x41/0x290 [xen_kbdfront]
 [<ffffffffa0010a12>] xenkbd_probe+0x2f2/0x324 [xen_kbdfront]
 [<ffffffff813e5757>] xenbus_dev_probe+0x77/0x130
 [<ffffffff813e7217>] xenbus_frontend_dev_probe+0x47/0x50
 [<ffffffff8145e9a9>] driver_probe_device+0x89/0x230
 [<ffffffff8145ebeb>] __driver_attach+0x9b/0xa0
 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230
 [<ffffffff8145eb50>] ? driver_probe_device+0x230/0x230
 [<ffffffff8145cf1c>] bus_for_each_dev+0x8c/0xb0
 [<ffffffff8145e7d9>] driver_attach+0x19/0x20
 [<ffffffff8145e260>] bus_add_driver+0x1a0/0x220
 [<ffffffff8145f1ff>] driver_register+0x5f/0xf0
 [<ffffffff813e55c5>] xenbus_register_driver_common+0x15/0x20
 [<ffffffff813e76b3>] xenbus_register_frontend+0x23/0x40
 [<ffffffffa0015000>] ? 0xffffffffa0014fff
 [<ffffffffa001502b>] xenkbd_init+0x2b/0x1000 [xen_kbdfront]
 [<ffffffff81002049>] do_one_initcall+0x49/0x170

.. snip..

which is hardly nice. This patch fixes this by having each
PV driver check for:
 - if running in PV, then it is fine to execute (as that is their
   native environment).
 - if running in HVM, check if user wanted 'xen_emul_unplug=never',
   in which case bail out and don't load any PV drivers.
 - if running in HVM, and if PCI device 5853:0001 (xen_platform_pci)
   does not exist, then bail out and not load PV drivers.
 - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=ide-disks',
   then bail out for all PV devices _except_ the block one.
   Ditto for the network one ('nics').
 - (v2) if running in HVM, and if the user wanted 'xen_emul_unplug=unnecessary'
   then load block PV driver, and also setup the legacy IDE paths.
   In (v3) make it actually load PV drivers.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it
Reported-by: Anthony PERARD <anthony.perard@citrix.com>
Reported-and-Tested-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v2: Add extra logic to handle the myrid ways 'xen_emul_unplug'
can be used per Ian and Stefano suggestion]
[v3: Make the unnecessary case work properly]
[v4: s/disks/ide-disks/ spotted by Fabio]
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> [for PCI parts]
CC: stable@vger.kernel.org
dsd pushed a commit that referenced this pull request Aug 11, 2014
When disconnecting it is possible that the l2cap_conn pointer is already
NULL when bt_6lowpan_del_conn() is entered. Looking at l2cap_conn_del
also verifies this as there's a NULL check there too. This patch adds
the missing NULL check without which the following bug may occur:

BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<c131e9c7>] bt_6lowpan_del_conn+0x19/0x12a
*pde = 00000000
Oops: 0000 [#1] SMP
CPU: 1 PID: 52 Comm: kworker/u5:1 Not tainted 3.12.0+ #196
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: hci0 hci_rx_work
task: f6259b00 ti: f48c0000 task.ti: f48c0000
EIP: 0060:[<c131e9c7>] EFLAGS: 00010282 CPU: 1
EIP is at bt_6lowpan_del_conn+0x19/0x12a
EAX: 00000000 EBX: ef094e10 ECX: 00000000 EDX: 00000016
ESI: 00000000 EDI: f48c1e60 EBP: f48c1e50 ESP: f48c1e34
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 30c65000 CR4: 00000690
Stack:
 f4d38000 00000000 f4d38000 00000002 ef094e10 00000016 f48c1e60 f48c1e70
 c1316bed f48c1e84 c1316bed 00000000 00000001 ef094e10 f48c1e84 f48c1ed0
 c1303cc6 c1303c7b f31f331a c1303cc6 f6e7d1c0 f3f8ea16 f3f8f380 f4d38008
Call Trace:
 [<c1316bed>] l2cap_disconn_cfm+0x3f/0x5b
 [<c1316bed>] ? l2cap_disconn_cfm+0x3f/0x5b
 [<c1303cc6>] hci_event_packet+0x645/0x2117
 [<c1303c7b>] ? hci_event_packet+0x5fa/0x2117
 [<c1303cc6>] ? hci_event_packet+0x645/0x2117
 [<c12681bd>] ? __kfree_skb+0x65/0x68
 [<c12681eb>] ? kfree_skb+0x2b/0x2e
 [<c130d3fb>] ? hci_send_to_sock+0x18d/0x199
 [<c12fa327>] hci_rx_work+0xf9/0x295
 [<c12fa327>] ? hci_rx_work+0xf9/0x295
 [<c1036d25>] process_one_work+0x128/0x1df
 [<c1346a39>] ? _raw_spin_unlock_irq+0x8/0x12
 [<c1036d25>] ? process_one_work+0x128/0x1df
 [<c103713a>] worker_thread+0x127/0x1c4
 [<c1037013>] ? rescuer_thread+0x216/0x216
 [<c103aec6>] kthread+0x88/0x8d
 [<c1040000>] ? task_rq_lock+0x37/0x6e
 [<c13474b7>] ret_from_kernel_thread+0x1b/0x28
 [<c103ae3e>] ? __kthread_parkme+0x50/0x50
Code: 05 b8 f4 ff ff ff 8d 65 f4 5b 5e 5f 5d 8d 67 f8 5f c3 57 8d 7c 24 08 83 e4 f8 ff 77 fc 55 89 e5 57 56f
EIP: [<c131e9c7>] bt_6lowpan_del_conn+0x19/0x12a SS:ESP 0068:f48c1e34
CR2: 0000000000000000

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
Upon unload of the brcmfmac driver it gave a kernel warning because
cfg80211 still believed to be connected to an AP. The brcmfmac had
already transitioned to disconnected state during unload. This patch
adds informing cfg80211 about this transition. This will get rid of
warning from cfg80211 seen upon module unload:

 ------------[ cut here ]------------
 WARNING: CPU: 3 PID: 24303 at net/wireless/core.c:952
	cfg80211_netdev_notifier_call+0x193/0x640 [cfg80211]()
 Modules linked in: brcmfmac(O-) brcmutil(O) cfg80211(O) ... [last unloaded: bcma]
 CPU: 3 PID: 24303 Comm: rmmod Tainted: G        W  O 3.13.0-rc4-wl-testing-x64-00002-gb472b6d-dirty #1
 Hardware name: Dell Inc. Latitude E6410/07XJP9, BIOS A07 02/15/2011
  00000000000003b8 ffff8800b211faf8 ffffffff815a7fcd 0000000000000007
  0000000000000000 ffff8800b211fb38 ffffffff8104819c ffff880000000000
  ffff8800c889d008 ffff8800b2000220 ffff8800c889a000 ffff8800c889d018
 Call Trace:
  [<ffffffff815a7fcd>] dump_stack+0x46/0x58
  [<ffffffff8104819c>] warn_slowpath_common+0x8c/0xc0
  [<ffffffff810481ea>] warn_slowpath_null+0x1a/0x20
  [<ffffffffa173fd83>] cfg80211_netdev_notifier_call+0x193/0x640 [cfg80211]
  [<ffffffff81521ca8>] ? arp_ifdown+0x18/0x20
  [<ffffffff8152d75a>] ? fib_disable_ip+0x3a/0x50
  [<ffffffff815b143d>] notifier_call_chain+0x4d/0x70
  [<ffffffff8106d6e6>] raw_notifier_call_chain+0x16/0x20
  [<ffffffff814b9ae0>] call_netdevice_notifiers_info+0x40/0x70
  [<ffffffff814b9b26>] call_netdevice_notifiers+0x16/0x20
  [<ffffffff814bb59d>] rollback_registered_many+0x17d/0x280
  [<ffffffff814bb74d>] rollback_registered+0x2d/0x40
  [<ffffffff814bb7c8>] unregister_netdevice_queue+0x68/0xd0
  [<ffffffff814bb9c0>] unregister_netdev+0x20/0x30
  [<ffffffffa180069e>] brcmf_del_if+0xce/0x180 [brcmfmac]
  [<ffffffffa1800b3c>] brcmf_detach+0x6c/0xe0 [brcmfmac]

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
dsd pushed a commit that referenced this pull request Aug 11, 2014
For devices with a separated audio-only interface (em2860), call
em28xx_init_extension() only once.

That fixes a bug with Kworld 305U (eb1a:e305):

    [  658.730715] em2860 #0: V4L2 video device registered as video1
    [  658.730728] em2860 #0: V4L2 VBI device registered as vbi0
    [  658.736907] em2860 #0: Remote control support is not available for this card.
    [  658.736965] em2860 #1: Remote control support is not available for this card.
    [  658.737230] ------------[ cut here ]------------
    [  658.737246] WARNING: CPU: 2 PID: 60 at lib/list_debug.c:36 __list_add+0x8a/0xc0()
    [  658.737256] list_add double add: new=ffff8800a9a40410, prev=ffff8800a9a40410, next=ffffffffa08720d0.
    [  658.737266] Modules linked in: tuner_xc2028 netconsole rc_hauppauge em28xx_rc rc_core tuner_simple tuner_types tda9887 tda8290 tuner tvp5150 msp3400 em28xx_v4l em28xx tveeprom
 v4l2_common fuse ccm nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6tabl
e_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack iptable_mangle iptable_security bnep iptable_raw vfat fat arc4 iwldvm mac80211 x86_pkg_temp_thermal coretemp kvm_intel nfsd iwlwifi snd_hda_codec_hdmi kvm snd_hda
_codec_realtek snd_hda_intel snd_hda_codec auth_rpcgss nfs_acl cfg80211 lockd snd_hwdep snd_seq btusb sunrpc crc32_pclmul bluetooth crc32c_intel snd_seq_device snd_pcm uvcvideo r8169
 ghash_clmulni_intel videobuf2_vmalloc videobuf2_memops videobuf2_core snd_page_alloc snd_timer snd videodev mei_me iTCO_wdt mii shpchp joydev mei media iTCO_vendor_support lpc_ich m
icrocode soundcore rfkill serio_raw i2c_i801 mfd_core nouveau i915 ttm i2c_algo_bit drm_kms_helper drm i2c_core mxm_wmi wmi video
    [  658.738601] CPU: 2 PID: 60 Comm: kworker/2:1 Not tainted 3.13.0-rc1+ #18
    [  658.738611] Hardware name: SAMSUNG ELECTRONICS CO., LTD. 550P5C/550P7C/SAMSUNG_NP1234567890, BIOS P04ABI.013.130220.dg 02/20/2013
    [  658.738624] Workqueue: events request_module_async [em28xx]
    [  658.738646]  0000000000000009 ffff8802209dfc68 ffffffff816a3c96 ffff8802209dfcb0
    [  658.738700]  ffff8802209dfca0 ffffffff8106aaad ffff8800a9a40410 ffffffffa08720d0
    [  658.738754]  ffff8800a9a40410 0000000000000000 0000000000000080 ffff8802209dfd00
    [  658.738814] Call Trace:
    [  658.738836]  [<ffffffff816a3c96>] dump_stack+0x45/0x56
    [  658.738851]  [<ffffffff8106aaad>] warn_slowpath_common+0x7d/0xa0
    [  658.738864]  [<ffffffff8106ab1c>] warn_slowpath_fmt+0x4c/0x50
    [  658.738880]  [<ffffffffa0868a7d>] ? em28xx_init_extension+0x1d/0x80 [em28xx]
    [  658.738898]  [<ffffffff81343b8a>] __list_add+0x8a/0xc0
    [  658.738911]  [<ffffffffa0868a98>] em28xx_init_extension+0x38/0x80 [em28xx]
    [  658.738927]  [<ffffffffa086a059>] request_module_async+0x19/0x110 [em28xx]
    [  658.738942]  [<ffffffff810873b5>] process_one_work+0x1f5/0x510
    [  658.738954]  [<ffffffff81087353>] ? process_one_work+0x193/0x510
    [  658.738967]  [<ffffffff810880bb>] worker_thread+0x11b/0x3a0
    [  658.738979]  [<ffffffff81087fa0>] ? manage_workers.isra.24+0x2b0/0x2b0
    [  658.738992]  [<ffffffff8108ea2f>] kthread+0xff/0x120
    [  658.739005]  [<ffffffff8108e930>] ? kthread_create_on_node+0x250/0x250
    [  658.739017]  [<ffffffff816b517c>] ret_from_fork+0x7c/0xb0
    [  658.739029]  [<ffffffff8108e930>] ? kthread_create_on_node+0x250/0x250
    [  658.739040] ---[ end trace c1acd24b354108de ]---
    [  658.739051] em2860 #1: Remote control support is not available for this card.
    [  658.742407] em28xx-audio.c: probing for em28xx Audio Vendor Class
    [  658.742429] em28xx-audio.c: Copyright (C) 2006 Markus Rechberger
    [  658.742440] em28xx-audio.c: Copyright (C) 2007-2011 Mauro Carvalho Chehab
    [  658.744798] em28xx-audio.c: probing for em28xx Audio Vendor Class
    [  658.744823] em28xx-audio.c: Copyright (C) 2006 Markus Rechberger
    [  658.744836] em28xx-audio.c: Copyright (C) 2007-2011 Mauro Carvalho Chehab
    [  658.746849] em28xx-audio.c: probing for em28xx Audio Vendor Class
    [  658.746863] em28xx-audio.c: Copyright (C) 2006 Markus Rechberger
    [  658.746874] em28xx-audio.c: Copyright (C) 2007-2011 Mauro Carvalho Chehab
    ...

Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
dsd pushed a commit that referenced this pull request Aug 11, 2014
…rt_entries()

[   89.237347] BUG: unable to handle kernel paging request at ffff880096326000
[   89.237369] IP: [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237382] PGD 2272067 PUD 25df0e067 PMD 25de5c067 PTE 8000000096326060
[   89.237394] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[   89.237404] CPU: 1 PID: 1981 Comm: gem_concurrent_ Not tainted 3.13.0-rc4+ #639
[   89.237411] Hardware name: Intel Corporation 2012 Client Platform/Emerald Lake 2, BIOS ACRVMBY1.86C.0078.P00.1201161002 01/16/2012
[   89.237420] task: ffff88024c038030 ti: ffff88024b130000 task.ti: ffff88024b130000
[   89.237425] RIP: 0010:[<ffffffff81347227>]  [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237435] RSP: 0018:ffff88024b131ae0  EFLAGS: 00010286
[   89.237440] RAX: ffff880096325000 RBX: 0000000000000400 RCX: 0000000000001000
[   89.237445] RDX: 0000000000000200 RSI: 0000000000000001 RDI: 0000000000000010
[   89.237451] RBP: ffff88024b131b30 R08: ffff88024cc3aef0 R09: 0000000000000000
[   89.237456] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88024cc3ae00
[   89.237462] R13: ffff88024a578000 R14: 0000000000000001 R15: ffff88024a578ffc
[   89.237469] FS:  00007ff5475d8900(0000) GS:ffff88025d020000(0000) knlGS:0000000000000000
[   89.237475] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.237480] CR2: ffff880096326000 CR3: 000000024d531000 CR4: 00000000001407e0
[   89.237485] Stack:
[   89.237488]  ffff880000000000 0000020000000000 ffff88024b23f2c0 0000000100000000
[   89.237499]  0000000000000001 000000000007ffff ffff8801e7bf5ac0 ffff8801e7bf5ac0
[   89.237510]  ffff88024cc3ae00 ffff880248a2ee40 ffff88024b131b58 ffffffff813455ed
[   89.237521] Call Trace:
[   89.237528]  [<ffffffff813455ed>] ppgtt_bind_vma+0x3d/0x60
[   89.237534]  [<ffffffff8133d8dc>] i915_gem_object_pin+0x55c/0x6a0
[   89.237541]  [<ffffffff8134275b>] i915_gem_execbuffer_reserve_vma.isra.14+0x5b/0x110
[   89.237548]  [<ffffffff81342a88>] i915_gem_execbuffer_reserve+0x278/0x2c0
[   89.237555]  [<ffffffff81343d29>] i915_gem_do_execbuffer.isra.22+0x699/0x1250
[   89.237562]  [<ffffffff81344d91>] ? i915_gem_execbuffer2+0x51/0x290
[   89.237569]  [<ffffffff81344de6>] i915_gem_execbuffer2+0xa6/0x290
[   89.237575]  [<ffffffff813014f2>] drm_ioctl+0x4d2/0x610
[   89.237582]  [<ffffffff81080bf1>] ? cpuacct_account_field+0xa1/0xc0
[   89.237588]  [<ffffffff81080b55>] ? cpuacct_account_field+0x5/0xc0
[   89.237597]  [<ffffffff811371c0>] do_vfs_ioctl+0x300/0x520
[   89.237603]  [<ffffffff810757a1>] ? vtime_account_user+0x91/0xa0
[   89.237610]  [<ffffffff810e40eb>] ?  context_tracking_user_exit+0x9b/0xe0
[   89.237617]  [<ffffffff81083d7d>] ? trace_hardirqs_on+0xd/0x10
[   89.237623]  [<ffffffff81137425>] SyS_ioctl+0x45/0x80
[   89.237630]  [<ffffffff815afffa>] tracesys+0xd4/0xd9
[   89.237634] Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 83 45 bc 01 49 8b 84 24 78 01 00 00 65 ff 0c 25 e0 b8 00 00 8b 55 bc <4c> 8b 2c d0 65 ff 04 25 e0 b8 00 00 49 8b 45 00 48 c1 e8 2d 48
[   89.237741] RIP  [<ffffffff81347227>] gen6_ppgtt_insert_entries+0x117/0x170
[   89.237749]  RSP <ffff88024b131ae0>
[   89.237753] CR2: ffff880096326000
[   89.237758] ---[ end trace 27416ba8b18d496c ]---

This bug dates back to the original introduction of the
gen6_ppgtt_insert_entries()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Dropped cc: stable since without full ppgtt there's no way
we'll access the last page directory with this function since that
range is occupied (only in the allocator) with the ppgtt pdes. Without
aliasing we can start to use that range and blow up.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
dsd pushed a commit that referenced this pull request Aug 11, 2014
Jon Maloy says:

====================
tipc: link setup and failover improvements

This series consists of four unrelated commits with different purposes.

- Commit #1 is purely cosmetic and pedagogic, hopefully making the
  failover/tunneling logics slightly easier to understand.
- Commit #2 fixes a bug that has always been in the code, but was not
  discovered until very recently.
- Commit #3 fixes a non-fatal race issue in the neighbour discovery
  code.
- Commit #4 removes an unnecessary indirection step during link
  startup.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
dsd pushed a commit that referenced this pull request Aug 11, 2014
This fixes the following warning:

  BUG: sleeping function called from invalid context at mm/slub.c:940
  in_atomic(): 1, irqs_disabled(): 1, pid: 17, name: khubd
  CPU: 0 PID: 17 Comm: khubd Not tainted 3.12.0-00004-g938dd60-dirty #1

   __might_sleep+0xbe/0xc0
   kmem_cache_alloc_trace+0x36/0x170
   c67x00_urb_enqueue+0x5c/0x254
   usb_hcd_submit_urb+0x66e/0x724
   usb_submit_urb+0x2ac/0x308
   usb_start_wait_urb+0x2c/0xb8
   usb_control_msg+0x8c/0xa8
   hub_port_init+0x191/0x718
   hub_thread+0x804/0xe14
   kthread+0x72/0x78
   ret_from_kernel_thread+0x8/0xc

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
Avoid that bus_unregister() triggers a use-after-free with
CONFIG_DEBUG_KOBJECT_RELEASE=y. This patch avoids that the
following sequence triggers a kernel crash with memory poisoning
enabled:
* bus_register()
* driver_register()
* driver_unregister()
* bus_unregister()

The above sequence causes the bus private data to be freed from
inside the bus_unregister() call although it is not guaranteed in
that function that the reference count on the bus private data has
dropped to zero. As an example, with CONFIG_DEBUG_KOBJECT_RELEASE=y
the ${bus}/drivers kobject is still holding a reference on
bus->p->subsys.kobj via its parent pointer at the time the bus
private data is freed. Fix this by deferring freeing the bus private
data until the last kobject_put() call on bus->p->subsys.kobj.

The kernel oops triggered by the above sequence and with memory
poisoning enabled and that is fixed by this patch is as follows:

general protection fault: 0000 [#1] PREEMPT SMP
CPU: 3 PID: 2711 Comm: kworker/3:32 Tainted: G        W  O 3.13.0-rc4-debug+ #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: events kobject_delayed_cleanup
task: ffff880037f866d0 ti: ffff88003b638000 task.ti: ffff88003b638000
Call Trace:
 [<ffffffff81263105>] ? kobject_get_path+0x25/0x100
 [<ffffffff81264354>] kobject_uevent_env+0x134/0x600
 [<ffffffff8126482b>] kobject_uevent+0xb/0x10
 [<ffffffff81262fa2>] kobject_delayed_cleanup+0xc2/0x1b0
 [<ffffffff8106c047>] process_one_work+0x217/0x700
 [<ffffffff8106bfdb>] ? process_one_work+0x1ab/0x700
 [<ffffffff8106c64b>] worker_thread+0x11b/0x3a0
 [<ffffffff8106c530>] ? process_one_work+0x700/0x700
 [<ffffffff81074b70>] kthread+0xf0/0x110
 [<ffffffff81074a80>] ? insert_kthread_work+0x80/0x80
 [<ffffffff815673bc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81074a80>] ? insert_kthread_work+0x80/0x80
Code: 89 f8 48 89 e5 f6 82 c0 27 63 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 c0 27 63 81 20 75 f0 5d c3 66 0f 1f 44 00 00 <80> 3f 00 55 48 89 e5 74 15 48 89 f8 0f 1f 40 00 48 83 c0 01 80
RIP  [<ffffffff81267ed0>] strlen+0x0/0x30
 RSP <ffff88003b639c70>
---[ end trace 210f883ef80376aa ]---

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
In function free_dmar_iommu(), it sets IRQ handler data to NULL
before calling free_irq(), which will cause invalid memory access
because free_irq() will access IRQ handler data when calling
function dmar_msi_mask(). So only set IRQ handler data to NULL
after calling free_irq().

Sample stack dump:
[   13.094010] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[   13.103215] IP: [<ffffffff810a97cd>] __lock_acquire+0x4d/0x12a0
[   13.110104] PGD 0
[   13.112614] Oops: 0000 [#1] SMP
[   13.116585] Modules linked in:
[   13.120260] CPU: 60 PID: 1 Comm: swapper/0 Tainted: G        W    3.13.0-rc1-gerry+ #9
[   13.129367] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[   13.142555] task: ffff88042dd38010 ti: ffff88042dd32000 task.ti: ffff88042dd32000
[   13.151179] RIP: 0010:[<ffffffff810a97cd>]  [<ffffffff810a97cd>] __lock_acquire+0x4d/0x12a0
[   13.160867] RSP: 0000:ffff88042dd33b78  EFLAGS: 00010046
[   13.166969] RAX: 0000000000000046 RBX: 0000000000000002 RCX: 0000000000000000
[   13.175122] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000048
[   13.183274] RBP: ffff88042dd33bd8 R08: 0000000000000002 R09: 0000000000000001
[   13.191417] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88042dd38010
[   13.199571] R13: 0000000000000000 R14: 0000000000000048 R15: 0000000000000000
[   13.207725] FS:  0000000000000000(0000) GS:ffff88103f200000(0000) knlGS:0000000000000000
[   13.217014] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   13.223596] CR2: 0000000000000048 CR3: 0000000001a0b000 CR4: 00000000000407e0
[   13.231747] Stack:
[   13.234160]  0000000000000004 0000000000000046 ffff88042dd33b98 ffffffff810a567d
[   13.243059]  ffff88042dd33c08 ffffffff810bb14c ffffffff828995a0 0000000000000046
[   13.251969]  0000000000000000 0000000000000000 0000000000000002 0000000000000000
[   13.260862] Call Trace:
[   13.263775]  [<ffffffff810a567d>] ? trace_hardirqs_off+0xd/0x10
[   13.270571]  [<ffffffff810bb14c>] ? vprintk_emit+0x23c/0x570
[   13.277058]  [<ffffffff810ab1e3>] lock_acquire+0x93/0x120
[   13.283269]  [<ffffffff814623f7>] ? dmar_msi_mask+0x47/0x70
[   13.289677]  [<ffffffff8156b449>] _raw_spin_lock_irqsave+0x49/0x90
[   13.296748]  [<ffffffff814623f7>] ? dmar_msi_mask+0x47/0x70
[   13.303153]  [<ffffffff814623f7>] dmar_msi_mask+0x47/0x70
[   13.309354]  [<ffffffff810c0d93>] irq_shutdown+0x53/0x60
[   13.315467]  [<ffffffff810bdd9d>] __free_irq+0x26d/0x280
[   13.321580]  [<ffffffff810be920>] free_irq+0xf0/0x180
[   13.327395]  [<ffffffff81466591>] free_dmar_iommu+0x271/0x2b0
[   13.333996]  [<ffffffff810a947d>] ? trace_hardirqs_on+0xd/0x10
[   13.340696]  [<ffffffff81461a17>] free_iommu+0x17/0x50
[   13.346597]  [<ffffffff81dc75a5>] init_dmars+0x691/0x77a
[   13.352711]  [<ffffffff81dc7afd>] intel_iommu_init+0x351/0x438
[   13.359400]  [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[   13.365806]  [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[   13.372114]  [<ffffffff81000342>] do_one_initcall+0x122/0x180
[   13.378707]  [<ffffffff81077738>] ? parse_args+0x1e8/0x320
[   13.385016]  [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[   13.392100]  [<ffffffff81d84833>] ? do_early_param+0x88/0x88
[   13.398596]  [<ffffffff8154f8b0>] ? rest_init+0xd0/0xd0
[   13.404614]  [<ffffffff8154f8be>] kernel_init+0xe/0x130
[   13.410626]  [<ffffffff81574d6c>] ret_from_fork+0x7c/0xb0
[   13.416829]  [<ffffffff8154f8b0>] ? rest_init+0xd0/0xd0
[   13.422842] Code: ec 99 00 85 c0 8b 05 53 05 a5 00 41 0f 45 d8 85 c0 0f 84 ff 00 00 00 8b 05 99 f9 7e 01 49 89 fe 41 89 f7 85 c0 0f 84 03 01 00 00 <49> 8b 06 be 01 00 00 00 48 3d c0 0e 01 82 0f 44 de 41 83 ff 01
[   13.450191] RIP  [<ffffffff810a97cd>] __lock_acquire+0x4d/0x12a0
[   13.458598]  RSP <ffff88042dd33b78>
[   13.462671] CR2: 0000000000000048
[   13.466551] ---[ end trace c5bd26a37c81d760 ]---

Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
…ices

Data structure drhd->iommu is shared between DMA remapping driver and
interrupt remapping driver, so DMA remapping driver shouldn't release
drhd->iommu when it failed to initialize IOMMU devices. Otherwise it
may cause invalid memory access to the interrupt remapping driver.

Sample stack dump:
[   13.315090] BUG: unable to handle kernel paging request at ffffc9000605a088
[   13.323221] IP: [<ffffffff81461bac>] qi_submit_sync+0x15c/0x400
[   13.330107] PGD 82f81e067 PUD c2f81e067 PMD 82e846067 PTE 0
[   13.336818] Oops: 0002 [#1] SMP
[   13.340757] Modules linked in:
[   13.344422] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.13.0-rc1-gerry+ #7
[   13.352474] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T,                                               BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[   13.365659] Workqueue: events work_for_cpu_fn
[   13.370774] task: ffff88042ddf00d0 ti: ffff88042ddee000 task.ti: ffff88042dde                                              e000
[   13.379389] RIP: 0010:[<ffffffff81461bac>]  [<ffffffff81461bac>] qi_submit_sy                                              nc+0x15c/0x400
[   13.389055] RSP: 0000:ffff88042ddef940  EFLAGS: 00010002
[   13.395151] RAX: 00000000000005e0 RBX: 0000000000000082 RCX: 0000000200000025
[   13.403308] RDX: ffffc9000605a000 RSI: 0000000000000010 RDI: ffff88042ddb8610
[   13.411446] RBP: ffff88042ddef9a0 R08: 00000000000005d0 R09: 0000000000000001
[   13.419599] R10: 0000000000000000 R11: 000000000000005d R12: 000000000000005c
[   13.427742] R13: ffff88102d84d300 R14: 0000000000000174 R15: ffff88042ddb4800
[   13.435877] FS:  0000000000000000(0000) GS:ffff88043de00000(0000) knlGS:00000                                              00000000000
[   13.445168] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   13.451749] CR2: ffffc9000605a088 CR3: 0000000001a0b000 CR4: 00000000000407f0
[   13.459895] Stack:
[   13.462297]  ffff88042ddb85d0 000000000000005d ffff88042ddef9b0 0000000000000                                              5d0
[   13.471147]  00000000000005c0 ffff88042ddb8000 000000000000005c 0000000000000                                              015
[   13.480001]  ffff88042ddb4800 0000000000000282 ffff88042ddefa40 ffff88042ddef                                              ac0
[   13.488855] Call Trace:
[   13.491771]  [<ffffffff8146848d>] modify_irte+0x9d/0xd0
[   13.497778]  [<ffffffff8146886d>] intel_setup_ioapic_entry+0x10d/0x290
[   13.505250]  [<ffffffff810a92a6>] ? trace_hardirqs_on_caller+0x16/0x1e0
[   13.512824]  [<ffffffff810346b0>] ? default_init_apic_ldr+0x60/0x60
[   13.519998]  [<ffffffff81468be0>] setup_ioapic_remapped_entry+0x20/0x30
[   13.527566]  [<ffffffff8103683a>] io_apic_setup_irq_pin+0x12a/0x2c0
[   13.534742]  [<ffffffff8136673b>] ? acpi_pci_irq_find_prt_entry+0x2b9/0x2d8
[   13.544102]  [<ffffffff81037fd5>] io_apic_setup_irq_pin_once+0x85/0xa0
[   13.551568]  [<ffffffff8103816f>] ? mp_find_ioapic_pin+0x8f/0xf0
[   13.558434]  [<ffffffff81038044>] io_apic_set_pci_routing+0x34/0x70
[   13.565621]  [<ffffffff8102f4cf>] mp_register_gsi+0xaf/0x1c0
[   13.572111]  [<ffffffff8102f5ee>] acpi_register_gsi_ioapic+0xe/0x10
[   13.579286]  [<ffffffff8102f33f>] acpi_register_gsi+0xf/0x20
[   13.585779]  [<ffffffff81366b86>] acpi_pci_irq_enable+0x171/0x1e3
[   13.592764]  [<ffffffff8146d771>] pcibios_enable_device+0x31/0x40
[   13.599744]  [<ffffffff81320e9b>] do_pci_enable_device+0x3b/0x60
[   13.606633]  [<ffffffff81322248>] pci_enable_device_flags+0xc8/0x120
[   13.613887]  [<ffffffff813222f3>] pci_enable_device+0x13/0x20
[   13.620484]  [<ffffffff8132fa7e>] pcie_port_device_register+0x1e/0x510
[   13.627947]  [<ffffffff810a92a6>] ? trace_hardirqs_on_caller+0x16/0x1e0
[   13.635510]  [<ffffffff810a947d>] ? trace_hardirqs_on+0xd/0x10
[   13.642189]  [<ffffffff813302b8>] pcie_portdrv_probe+0x58/0xc0
[   13.648877]  [<ffffffff81323ba5>] local_pci_probe+0x45/0xa0
[   13.655266]  [<ffffffff8106bc44>] work_for_cpu_fn+0x14/0x20
[   13.661656]  [<ffffffff8106fa79>] process_one_work+0x369/0x710
[   13.668334]  [<ffffffff8106fa02>] ? process_one_work+0x2f2/0x710
[   13.675215]  [<ffffffff81071d56>] ? worker_thread+0x46/0x690
[   13.681714]  [<ffffffff81072194>] worker_thread+0x484/0x690
[   13.688109]  [<ffffffff81071d10>] ? cancel_delayed_work_sync+0x20/0x20
[   13.695576]  [<ffffffff81079c60>] kthread+0xf0/0x110
[   13.701300]  [<ffffffff8108e7bf>] ? local_clock+0x3f/0x50
[   13.707492]  [<ffffffff81079b70>] ? kthread_create_on_node+0x250/0x250
[   13.714959]  [<ffffffff81574d2c>] ret_from_fork+0x7c/0xb0
[   13.721152]  [<ffffffff81079b70>] ? kthread_create_on_node+0x250/0x250

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
The below tells us the static_key conversion has a problem; since the
exact point of clearing that flag isn't too important, delay the flip
and use a workqueue to process it.

[ ] TSC synchronization [CPU#0 -> CPU#22]:
[ ] Measured 8 cycles TSC warp between CPUs, turning off TSC clock.
[ ]
[ ] ======================================================
[ ] [ INFO: possible circular locking dependency detected ]
[ ] 3.13.0-rc3-01745-g848b0d0322cb-dirty #637 Not tainted
[ ] -------------------------------------------------------
[ ] swapper/0/1 is trying to acquire lock:
[ ]  (jump_label_mutex){+.+...}, at: [<ffffffff8115a637>] jump_label_lock+0x17/0x20
[ ]
[ ] but task is already holding lock:
[ ]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109408b>] cpu_hotplug_begin+0x2b/0x60
[ ]
[ ] which lock already depends on the new lock.
[ ]
[ ]
[ ] the existing dependency chain (in reverse order) is:
[ ]
[ ] -> #1 (cpu_hotplug.lock){+.+.+.}:
[ ]        [<ffffffff810def00>] lock_acquire+0x90/0x130
[ ]        [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0
[ ]        [<ffffffff81093fdc>] get_online_cpus+0x3c/0x60
[ ]        [<ffffffff8104cc67>] arch_jump_label_transform+0x37/0x130
[ ]        [<ffffffff8115a3cf>] __jump_label_update+0x5f/0x80
[ ]        [<ffffffff8115a48d>] jump_label_update+0x9d/0xb0
[ ]        [<ffffffff8115aa6d>] static_key_slow_inc+0x9d/0xb0
[ ]        [<ffffffff810c0f65>] sched_feat_set+0xf5/0x100
[ ]        [<ffffffff810c5bdc>] set_numabalancing_state+0x2c/0x30
[ ]        [<ffffffff81d12f3d>] numa_policy_init+0x1af/0x1b7
[ ]        [<ffffffff81cebdf4>] start_kernel+0x35d/0x41f
[ ]        [<ffffffff81ceb5a5>] x86_64_start_reservations+0x2a/0x2c
[ ]        [<ffffffff81ceb6a2>] x86_64_start_kernel+0xfb/0xfe
[ ]
[ ] -> #0 (jump_label_mutex){+.+...}:
[ ]        [<ffffffff810de141>] __lock_acquire+0x1701/0x1eb0
[ ]        [<ffffffff810def00>] lock_acquire+0x90/0x130
[ ]        [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0
[ ]        [<ffffffff8115a637>] jump_label_lock+0x17/0x20
[ ]        [<ffffffff8115aa3b>] static_key_slow_inc+0x6b/0xb0
[ ]        [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20
[ ]        [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70
[ ]        [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150
[ ]        [<ffffffff81076612>] native_cpu_up+0x3a2/0x890
[ ]        [<ffffffff810941cb>] _cpu_up+0xdb/0x160
[ ]        [<ffffffff810942c9>] cpu_up+0x79/0x90
[ ]        [<ffffffff81d0af6b>] smp_init+0x60/0x8c
[ ]        [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197
[ ]        [<ffffffff8164e32e>] kernel_init+0xe/0x130
[ ]        [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0
[ ]
[ ] other info that might help us debug this:
[ ]
[ ]  Possible unsafe locking scenario:
[ ]
[ ]        CPU0                    CPU1
[ ]        ----                    ----
[ ]   lock(cpu_hotplug.lock);
[ ]                                lock(jump_label_mutex);
[ ]                                lock(cpu_hotplug.lock);
[ ]   lock(jump_label_mutex);
[ ]
[ ]  *** DEADLOCK ***
[ ]
[ ] 2 locks held by swapper/0/1:
[ ]  #0:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81094037>] cpu_maps_update_begin+0x17/0x20
[ ]  #1:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109408b>] cpu_hotplug_begin+0x2b/0x60
[ ]
[ ] stack backtrace:
[ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc3-01745-g848b0d0322cb-dirty #637
[ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
[ ]  ffffffff82c9c270 ffff880236843bb8 ffffffff8165c5f5 ffffffff82c9c270
[ ]  ffff880236843bf8 ffffffff81658c02 ffff880236843c80 ffff8802368586a0
[ ]  ffff880236858678 0000000000000001 0000000000000002 ffff880236858000
[ ] Call Trace:
[ ]  [<ffffffff8165c5f5>] dump_stack+0x4e/0x7a
[ ]  [<ffffffff81658c02>] print_circular_bug+0x1f9/0x207
[ ]  [<ffffffff810de141>] __lock_acquire+0x1701/0x1eb0
[ ]  [<ffffffff816680ff>] ? __atomic_notifier_call_chain+0x8f/0xb0
[ ]  [<ffffffff810def00>] lock_acquire+0x90/0x130
[ ]  [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20
[ ]  [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20
[ ]  [<ffffffff81661f83>] mutex_lock_nested+0x63/0x3e0
[ ]  [<ffffffff8115a637>] ? jump_label_lock+0x17/0x20
[ ]  [<ffffffff8115a637>] jump_label_lock+0x17/0x20
[ ]  [<ffffffff8115aa3b>] static_key_slow_inc+0x6b/0xb0
[ ]  [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20
[ ]  [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70
[ ]  [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150
[ ]  [<ffffffff81076612>] native_cpu_up+0x3a2/0x890
[ ]  [<ffffffff810941cb>] _cpu_up+0xdb/0x160
[ ]  [<ffffffff810942c9>] cpu_up+0x79/0x90
[ ]  [<ffffffff81d0af6b>] smp_init+0x60/0x8c
[ ]  [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197
[ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
[ ]  [<ffffffff8164e32e>] kernel_init+0xe/0x130
[ ]  [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0
[ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
[ ] ------------[ cut here ]------------
[ ] WARNING: CPU: 0 PID: 1 at /usr/src/linux-2.6/kernel/smp.c:374 smp_call_function_many+0xad/0x300()
[ ] Modules linked in:
[ ] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.13.0-rc3-01745-g848b0d0322cb-dirty #637
[ ] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
[ ]  0000000000000009 ffff880236843be0 ffffffff8165c5f5 0000000000000000
[ ]  ffff880236843c18 ffffffff81093d8c 0000000000000000 0000000000000000
[ ]  ffffffff81ccd1a0 ffffffff810ca951 0000000000000000 ffff880236843c28
[ ] Call Trace:
[ ]  [<ffffffff8165c5f5>] dump_stack+0x4e/0x7a
[ ]  [<ffffffff81093d8c>] warn_slowpath_common+0x8c/0xc0
[ ]  [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0
[ ]  [<ffffffff81093dda>] warn_slowpath_null+0x1a/0x20
[ ]  [<ffffffff8110b72d>] smp_call_function_many+0xad/0x300
[ ]  [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30
[ ]  [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30
[ ]  [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0
[ ]  [<ffffffff8110ba96>] smp_call_function+0x46/0x80
[ ]  [<ffffffff8104f200>] ? arch_unregister_cpu+0x30/0x30
[ ]  [<ffffffff8110bb3c>] on_each_cpu+0x3c/0xa0
[ ]  [<ffffffff810ca950>] ? sched_clock_idle_sleep_event+0x20/0x20
[ ]  [<ffffffff810ca951>] ? sched_clock_tick+0x1/0xa0
[ ]  [<ffffffff8104f964>] text_poke_bp+0x64/0xd0
[ ]  [<ffffffff810ca950>] ? sched_clock_idle_sleep_event+0x20/0x20
[ ]  [<ffffffff8104ccde>] arch_jump_label_transform+0xae/0x130
[ ]  [<ffffffff8115a3cf>] __jump_label_update+0x5f/0x80
[ ]  [<ffffffff8115a48d>] jump_label_update+0x9d/0xb0
[ ]  [<ffffffff8115aa6d>] static_key_slow_inc+0x9d/0xb0
[ ]  [<ffffffff810ca775>] clear_sched_clock_stable+0x15/0x20
[ ]  [<ffffffff810503b3>] mark_tsc_unstable+0x23/0x70
[ ]  [<ffffffff810772cb>] check_tsc_sync_source+0x14b/0x150
[ ]  [<ffffffff81076612>] native_cpu_up+0x3a2/0x890
[ ]  [<ffffffff810941cb>] _cpu_up+0xdb/0x160
[ ]  [<ffffffff810942c9>] cpu_up+0x79/0x90
[ ]  [<ffffffff81d0af6b>] smp_init+0x60/0x8c
[ ]  [<ffffffff81cebf42>] kernel_init_freeable+0x8c/0x197
[ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
[ ]  [<ffffffff8164e32e>] kernel_init+0xe/0x130
[ ]  [<ffffffff8166beec>] ret_from_fork+0x7c/0xb0
[ ]  [<ffffffff8164e320>] ? rest_init+0xd0/0xd0
[ ] ---[ end trace 6ff1df5620c49d26 ]---
[ ] tsc: Marking TSC unstable due to check_tsc_sync_source failed

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-v55fgqj3nnyqnngmvuu8ep6h@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
dsd pushed a commit that referenced this pull request Aug 11, 2014
Sometimes we may meet the following lockdep issue.
The root cause is .set_clock callback is executed with spin_lock_irqsave
in sdhci_do_set_ios. However, the IMX set_clock callback will try to access
clk_get_rate which is using a mutex lock.

The fix avoids access mutex in .set_clock callback by initializing the
pltfm_host->clock at probe time and use it later instead of calling
clk_get_rate again in atomic context.

[ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ]
3.13.0-rc1+ #285 Not tainted
------------------------------------------------------
kworker/u8:1/29 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
 (prepare_lock){+.+...}, at: [<80480b08>] clk_prepare_lock+0x44/0xe4

and this task is already holding:
 (&(&host->lock)->rlock#2){-.-...}, at: [<804611f4>] sdhci_do_set_ios+0x20/0x720
which would create a new lock dependency:
 (&(&host->lock)->rlock#2){-.-...} -> (prepare_lock){+.+...}

but this new dependency connects a HARDIRQ-irq-safe lock:
 (&(&host->lock)->rlock#2){-.-...}
... which became HARDIRQ-irq-safe at:
  [<8005f030>] mark_lock+0x140/0x6ac
  [<80060760>] __lock_acquire+0xb30/0x1cbc
  [<800620d0>] lock_acquire+0x70/0x84
  [<8061d2f0>] _raw_spin_lock+0x30/0x40
  [<80460668>] sdhci_irq+0x24/0xa68
  [<8006b1d4>] handle_irq_event_percpu+0x54/0x18c
  [<8006b350>] handle_irq_event+0x44/0x64
  [<8006e50c>] handle_fasteoi_irq+0xa0/0x170
  [<8006a8f0>] generic_handle_irq+0x30/0x44
  [<8000f238>] handle_IRQ+0x54/0xbc
  [<8000864c>] gic_handle_irq+0x30/0x64
  [<80013024>] __irq_svc+0x44/0x5c
  [<80614c58>] printk+0x38/0x40
  [<804622a8>] sdhci_add_host+0x844/0xbcc
  [<80464948>] sdhci_esdhc_imx_probe+0x378/0x67c
  [<8032ee88>] platform_drv_probe+0x20/0x50
  [<8032d48c>] driver_probe_device+0x118/0x234
  [<8032d690>] __driver_attach+0x9c/0xa0
  [<8032b89c>] bus_for_each_dev+0x68/0x9c
  [<8032cf44>] driver_attach+0x20/0x28
  [<8032cbc8>] bus_add_driver+0x148/0x1f4
  [<8032dce0>] driver_register+0x80/0x100
  [<8032ee54>] __platform_driver_register+0x50/0x64
  [<8084b094>] sdhci_esdhc_imx_driver_init+0x18/0x20
  [<80008980>] do_one_initcall+0x108/0x16c
  [<8081cca4>] kernel_init_freeable+0x10c/0x1d0
  [<80611c50>] kernel_init+0x10/0x120
  [<8000e9c8>] ret_from_fork+0x14/0x2c

to a HARDIRQ-irq-unsafe lock:
 (prepare_lock){+.+...}
... which became HARDIRQ-irq-unsafe at:
...  [<8005f030>] mark_lock+0x140/0x6ac
  [<8005f604>] mark_held_locks+0x68/0x12c
  [<8005f780>] trace_hardirqs_on_caller+0xb8/0x1d8
  [<8005f8b4>] trace_hardirqs_on+0x14/0x18
  [<8061a130>] mutex_trylock+0x180/0x20c
  [<80480ad8>] clk_prepare_lock+0x14/0xe4
  [<804816a4>] clk_notifier_register+0x28/0xf0
  [<80015120>] twd_clk_init+0x50/0x68
  [<80008980>] do_one_initcall+0x108/0x16c
  [<8081cca4>] kernel_init_freeable+0x10c/0x1d0
  [<80611c50>] kernel_init+0x10/0x120
  [<8000e9c8>] ret_from_fork+0x14/0x2c

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(prepare_lock);
                               local_irq_disable();
                               lock(&(&host->lock)->rlock#2);
                               lock(prepare_lock);
  <Interrupt>
    lock(&(&host->lock)->rlock#2);

 *** DEADLOCK ***

3 locks held by kworker/u8:1/29:
 #0:  (kmmcd){.+.+.+}, at: [<8003db18>] process_one_work+0x128/0x468
 #1:  ((&(&host->detect)->work)){+.+.+.}, at: [<8003db18>] process_one_work+0x128/0x468
 #2:  (&(&host->lock)->rlock#2){-.-...}, at: [<804611f4>] sdhci_do_set_ios+0x20/0x720

the dependencies between HARDIRQ-irq-safe lock and the holding lock:
-> (&(&host->lock)->rlock#2){-.-...} ops: 330 {
   IN-HARDIRQ-W at:
                    [<8005f030>] mark_lock+0x140/0x6ac
                    [<80060760>] __lock_acquire+0xb30/0x1cbc
                    [<800620d0>] lock_acquire+0x70/0x84
                    [<8061d2f0>] _raw_spin_lock+0x30/0x40
                    [<80460668>] sdhci_irq+0x24/0xa68
                    [<8006b1d4>] handle_irq_event_percpu+0x54/0x18c
                    [<8006b350>] handle_irq_event+0x44/0x64
                    [<8006e50c>] handle_fasteoi_irq+0xa0/0x170
                    [<8006a8f0>] generic_handle_irq+0x30/0x44
                    [<8000f238>] handle_IRQ+0x54/0xbc
                    [<8000864c>] gic_handle_irq+0x30/0x64
                    [<80013024>] __irq_svc+0x44/0x5c
                    [<80614c58>] printk+0x38/0x40
                    [<804622a8>] sdhci_add_host+0x844/0xbcc
                    [<80464948>] sdhci_esdhc_imx_probe+0x378/0x67c
                    [<8032ee88>] platform_drv_probe+0x20/0x50
                    [<8032d48c>] driver_probe_device+0x118/0x234
                    [<8032d690>] __driver_attach+0x9c/0xa0
                    [<8032b89c>] bus_for_each_dev+0x68/0x9c
                    [<8032cf44>] driver_attach+0x20/0x28
                    [<8032cbc8>] bus_add_driver+0x148/0x1f4
                    [<8032dce0>] driver_register+0x80/0x100
                    [<8032ee54>] __platform_driver_register+0x50/0x64
                    [<8084b094>] sdhci_esdhc_imx_driver_init+0x18/0x20
                    [<80008980>] do_one_initcall+0x108/0x16c
                    [<8081cca4>] kernel_init_freeable+0x10c/0x1d0
                    [<80611c50>] kernel_init+0x10/0x120
                    [<8000e9c8>] ret_from_fork+0x14/0x2c
   IN-SOFTIRQ-W at:
                    [<8005f030>] mark_lock+0x140/0x6ac
                    [<80060204>] __lock_acquire+0x5d4/0x1cbc
                    [<800620d0>] lock_acquire+0x70/0x84
                    [<8061d40c>] _raw_spin_lock_irqsave+0x40/0x54
                    [<8045e4a4>] sdhci_tasklet_finish+0x1c/0x120
                    [<8002b538>] tasklet_action+0xa0/0x15c
                    [<8002b778>] __do_softirq+0x118/0x290
                    [<8002bcf4>] irq_exit+0xb4/0x10c
                    [<8000f240>] handle_IRQ+0x5c/0xbc
                    [<8000864c>] gic_handle_irq+0x30/0x64
                    [<80013024>] __irq_svc+0x44/0x5c
                    [<80614c58>] printk+0x38/0x40
                    [<804622a8>] sdhci_add_host+0x844/0xbcc
                    [<80464948>] sdhci_esdhc_imx_probe+0x378/0x67c
                    [<8032ee88>] platform_drv_probe+0x20/0x50
                    [<8032d48c>] driver_probe_device+0x118/0x234
                    [<8032d690>] __driver_attach+0x9c/0xa0
                    [<8032b89c>] bus_for_each_dev+0x68/0x9c
                    [<8032cf44>] driver_attach+0x20/0x28
                    [<8032cbc8>] bus_add_driver+0x148/0x1f4
                    [<8032dce0>] driver_register+0x80/0x100
                    [<8032ee54>] __platform_driver_register+0x50/0x64
                    [<8084b094>] sdhci_esdhc_imx_driver_init+0x18/0x20
                    [<80008980>] do_one_initcall+0x108/0x16c
                    [<8081cca4>] kernel_init_freeable+0x10c/0x1d0
                    [<80611c50>] kernel_init+0x10/0x120
                    [<8000e9c8>] ret_from_fork+0x14/0x2c
   INITIAL USE at:
                   [<8005f030>] mark_lock+0x140/0x6ac
                   [<8005ff0c>] __lock_acquire+0x2dc/0x1cbc
                   [<800620d0>] lock_acquire+0x70/0x84
                   [<8061d40c>] _raw_spin_lock_irqsave+0x40/0x54
                   [<804611f4>] sdhci_do_set_ios+0x20/0x720
                   [<80461924>] sdhci_set_ios+0x30/0x3c
                   [<8044cea0>] mmc_power_up+0x6c/0xd0
                   [<8044dac4>] mmc_start_host+0x60/0x70
                   [<8044eb3c>] mmc_add_host+0x60/0x88
                   [<8046225c>] sdhci_add_host+0x7f8/0xbcc
                   [<80464948>] sdhci_esdhc_imx_probe+0x378/0x67c
                   [<8032ee88>] platform_drv_probe+0x20/0x50
                   [<8032d48c>] driver_probe_device+0x118/0x234
                   [<8032d690>] __driver_attach+0x9c/0xa0
                   [<8032b89c>] bus_for_each_dev+0x68/0x9c
                   [<8032cf44>] driver_attach+0x20/0x28
                   [<8032cbc8>] bus_add_driver+0x148/0x1f4
                   [<8032dce0>] driver_register+0x80/0x100
                   [<8032ee54>] __platform_driver_register+0x50/0x64
                   [<8084b094>] sdhci_esdhc_imx_driver_init+0x18/0x20
                   [<80008980>] do_one_initcall+0x108/0x16c
                   [<8081cca4>] kernel_init_freeable+0x10c/0x1d0
                   [<80611c50>] kernel_init+0x10/0x120
                   [<8000e9c8>] ret_from_fork+0x14/0x2c
 }
 ... key      at: [<80e040e8>] __key.26952+0x0/0x8
 ... acquired at:
   [<8005eb60>] check_usage+0x3d0/0x5c0
   [<8005edac>] check_irq_usage+0x5c/0xb8
   [<80060d38>] __lock_acquire+0x1108/0x1cbc
   [<800620d0>] lock_acquire+0x70/0x84
   [<8061a210>] mutex_lock_nested+0x54/0x3c0
   [<80480b08>] clk_prepare_lock+0x44/0xe4
   [<8048188c>] clk_get_rate+0x14/0x64
   [<8046374c>] esdhc_pltfm_set_clock+0x20/0x2a4
   [<8045d70c>] sdhci_set_clock+0x4c/0x498
   [<80461518>] sdhci_do_set_ios+0x344/0x720
   [<80461924>] sdhci_set_ios+0x30/0x3c
   [<8044c390>] __mmc_set_clock+0x44/0x60
   [<8044cd4c>] mmc_set_clock+0x10/0x14
   [<8044f8f4>] mmc_init_card+0x1b4/0x1520
   [<80450f00>] mmc_attach_mmc+0xb4/0x194
   [<8044da08>] mmc_rescan+0x294/0x2f0
   [<8003db94>] process_one_work+0x1a4/0x468
   [<8003e850>] worker_thread+0x118/0x3e0
   [<80044de0>] kthread+0xd4/0xf0
   [<8000e9c8>] ret_from_fork+0x14/0x2c

the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock:
-> (prepare_lock){+.+...} ops: 395 {
   HARDIRQ-ON-W at:
                    [<8005f030>] mark_lock+0x140/0x6ac
                    [<8005f604>] mark_held_locks+0x68/0x12c
                    [<8005f780>] trace_hardirqs_on_caller+0xb8/0x1d8
                    [<8005f8b4>] trace_hardirqs_on+0x14/0x18
                    [<8061a130>] mutex_trylock+0x180/0x20c
                    [<80480ad8>] clk_prepare_lock+0x14/0xe4
                    [<804816a4>] clk_notifier_register+0x28/0xf0
                    [<80015120>] twd_clk_init+0x50/0x68
                    [<80008980>] do_one_initcall+0x108/0x16c
                    [<8081cca4>] kernel_init_freeable+0x10c/0x1d0
                    [<80611c50>] kernel_init+0x10/0x120
                    [<8000e9c8>] ret_from_fork+0x14/0x2c
   SOFTIRQ-ON-W at:
                    [<8005f030>] mark_lock+0x140/0x6ac
                    [<8005f604>] mark_held_locks+0x68/0x12c
                    [<8005f7c8>] trace_hardirqs_on_caller+0x100/0x1d8
                    [<8005f8b4>] trace_hardirqs_on+0x14/0x18
                    [<8061a130>] mutex_trylock+0x180/0x20c
                    [<80480ad8>] clk_prepare_lock+0x14/0xe4
                    [<804816a4>] clk_notifier_register+0x28/0xf0
                    [<80015120>] twd_clk_init+0x50/0x68
                    [<80008980>] do_one_initcall+0x108/0x16c
                    [<8081cca4>] kernel_init_freeable+0x10c/0x1d0
                    [<80611c50>] kernel_init+0x10/0x120
                    [<8000e9c8>] ret_from_fork+0x14/0x2c
   INITIAL USE at:
                   [<8005f030>] mark_lock+0x140/0x6ac
                   [<8005ff0c>] __lock_acquire+0x2dc/0x1cbc
                   [<800620d0>] lock_acquire+0x70/0x84
                   [<8061a0c8>] mutex_trylock+0x118/0x20c
                   [<80480ad8>] clk_prepare_lock+0x14/0xe4
                   [<80482af8>] __clk_init+0x1c/0x45c
                   [<8048306c>] _clk_register+0xd0/0x170
                   [<80483148>] clk_register+0x3c/0x7c
                   [<80483b4c>] clk_register_fixed_rate+0x88/0xd8
                   [<80483c04>] of_fixed_clk_setup+0x68/0x94
                   [<8084c6fc>] of_clk_init+0x44/0x68
                   [<808202b0>] time_init+0x2c/0x38
                   [<8081ca14>] start_kernel+0x1e4/0x368
                   [<10008074>] 0x10008074
 }
 ... key      at: [<808afebc>] prepare_lock+0x38/0x48
 ... acquired at:
   [<8005eb94>] check_usage+0x404/0x5c0
   [<8005edac>] check_irq_usage+0x5c/0xb8
   [<80060d38>] __lock_acquire+0x1108/0x1cbc
   [<800620d0>] lock_acquire+0x70/0x84
   [<8061a210>] mutex_lock_nested+0x54/0x3c0
   [<80480b08>] clk_prepare_lock+0x44/0xe4
   [<8048188c>] clk_get_rate+0x14/0x64
   [<8046374c>] esdhc_pltfm_set_clock+0x20/0x2a4
   [<8045d70c>] sdhci_set_clock+0x4c/0x498
   [<80461518>] sdhci_do_set_ios+0x344/0x720
   [<80461924>] sdhci_set_ios+0x30/0x3c
   [<8044c390>] __mmc_set_clock+0x44/0x60
   [<8044cd4c>] mmc_set_clock+0x10/0x14
   [<8044f8f4>] mmc_init_card+0x1b4/0x1520
   [<80450f00>] mmc_attach_mmc+0xb4/0x194
   [<8044da08>] mmc_rescan+0x294/0x2f0
   [<8003db94>] process_one_work+0x1a4/0x468
   [<8003e850>] worker_thread+0x118/0x3e0
   [<80044de0>] kthread+0xd4/0xf0
   [<8000e9c8>] ret_from_fork+0x14/0x2c

stack backtrace:
CPU: 2 PID: 29 Comm: kworker/u8:1 Not tainted 3.13.0-rc1+ #285
Workqueue: kmmcd mmc_rescan
Backtrace:
[<80012160>] (dump_backtrace+0x0/0x10c) from [<80012438>] (show_stack+0x18/0x1c)
 r6:00000000 r5:00000000 r4:8088ecc8 r3:bfa11200
[<80012420>] (show_stack+0x0/0x1c) from [<80616b14>] (dump_stack+0x84/0x9c)
[<80616a90>] (dump_stack+0x0/0x9c) from [<8005ebb4>] (check_usage+0x424/0x5c0)
 r5:80979940 r4:bfa29b44
[<8005e790>] (check_usage+0x0/0x5c0) from [<8005edac>] (check_irq_usage+0x5c/0xb8)
[<8005ed50>] (check_irq_usage+0x0/0xb8) from [<80060d38>] (__lock_acquire+0x1108/0x1cbc)
 r8:bfa115e8 r7:80df9884 r6:80dafa9c r5:00000003 r4:bfa115d0
[<8005fc30>] (__lock_acquire+0x0/0x1cbc) from [<800620d0>] (lock_acquire+0x70/0x84)
[<80062060>] (lock_acquire+0x0/0x84) from [<8061a210>] (mutex_lock_nested+0x54/0x3c0)
 r7:bfa11200 r6:80dafa9c r5:00000000 r4:80480b08
[<8061a1bc>] (mutex_lock_nested+0x0/0x3c0) from [<80480b08>] (clk_prepare_lock+0x44/0xe4)
[<80480ac4>] (clk_prepare_lock+0x0/0xe4) from [<8048188c>] (clk_get_rate+0x14/0x64)
 r6:03197500 r5:bf0e9aa8 r4:bf827400 r3:808ae128
[<80481878>] (clk_get_rate+0x0/0x64) from [<8046374c>] (esdhc_pltfm_set_clock+0x20/0x2a4)
 r5:bf0e9aa8 r4:bf0e9c40
[<8046372c>] (esdhc_pltfm_set_clock+0x0/0x2a4) from [<8045d70c>] (sdhci_set_clock+0x4c/0x498)
[<8045d6c0>] (sdhci_set_clock+0x0/0x498) from [<80461518>] (sdhci_do_set_ios+0x344/0x720)
 r8:0000003b r7:20000113 r6:bf0e9d68 r5:bf0e9aa8 r4:bf0e9c40
r3:00000000
[<804611d4>] (sdhci_do_set_ios+0x0/0x720) from [<80461924>] (sdhci_set_ios+0x30/0x3c)
 r9:00000004 r8:bf131000 r7:bf131048 r6:00000000 r5:bf0e9aa8
r4:bf0e9800
[<804618f4>] (sdhci_set_ios+0x0/0x3c) from [<8044c390>] (__mmc_set_clock+0x44/0x60)
 r5:03197500 r4:bf0e9800
[<8044c34c>] (__mmc_set_clock+0x0/0x60) from [<8044cd4c>] (mmc_set_clock+0x10/0x14)
 r5:00000000 r4:bf0e9800
[<8044cd3c>] (mmc_set_clock+0x0/0x14) from [<8044f8f4>] (mmc_init_card+0x1b4/0x1520)
[<8044f740>] (mmc_init_card+0x0/0x1520) from [<80450f00>] (mmc_attach_mmc+0xb4/0x194)
[<80450e4c>] (mmc_attach_mmc+0x0/0x194) from [<8044da08>] (mmc_rescan+0x294/0x2f0)
 r5:8065f358 r4:bf0e9af8
[<8044d774>] (mmc_rescan+0x0/0x2f0) from [<8003db94>] (process_one_work+0x1a4/0x468)
 r8:00000000 r7:bfa29eb0 r6:bf80dc00 r5:bf0e9af8 r4:bf9e3f00
r3:8044d774
[<8003d9f0>] (process_one_work+0x0/0x468) from [<8003e850>] (worker_thread+0x118/0x3e0)
[<8003e738>] (worker_thread+0x0/0x3e0) from [<80044de0>] (kthread+0xd4/0xf0)
[<80044d0c>] (kthread+0x0/0xf0) from [<8000e9c8>] (ret_from_fork+0x14/0x2c)
 r7:00000000 r6:00000000 r5:80044d0c r4:bf9e7f00

Fixes: 0ddf03c mmc: esdhc-imx: parse max-frequency from devicetree
Signed-off-by: Dong Aisheng <b29396@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Tested-by: Philippe De Muyter <phdm@macqel.be>
Cc: stable <stable@vger.kernel.org> # 3.13
Signed-off-by: Chris Ball <chris@printf.net>
dsd pushed a commit that referenced this pull request Aug 11, 2014
In https://bugzilla.kernel.org/show_bug.cgi?id=67561, a locking dependency is reported
when b43 is used with hostapd, and rfkill is used to kill the radio output.

The lockdep splat (in part) is as follows:

======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0 #1 Not tainted
-------------------------------------------------------
rfkill/10040 is trying to acquire lock:
 (rtnl_mutex){+.+.+.}, at: [<ffffffff8146f282>] rtnl_lock+0x12/0x20

but task is already holding lock:
 (rfkill_global_mutex){+.+.+.}, at: [<ffffffffa04832ca>] rfkill_fop_write+0x6a/0x170 [rfkill]

--snip--

Chain exists of:
  rtnl_mutex --> misc_mtx --> rfkill_global_mutex

The fix is to move the initialization of the hardware random number generator
outside the code range covered by the rtnl_mutex.

Reported-by: yury <urykhy@gmail.com>
Tested-by: yury <urykhy@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
dsd pushed a commit that referenced this pull request Aug 11, 2014
* init the sts flag to 0 (missed)
* fix write the real bit not sts value
* Set PORTCS_STS and DEVLC_STS only if sts = 1

[Peter Chen: This one and the next patch fix the problem occurred imx27
and imx31, and imx27 and imx31 usb support are enabled until 3.14, so
these two patches isn't needed for -stable]

Signed-off-by: Chris Ruehl <chris.ruehl@gtsys.com.hk>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1978240

[ Upstream commit 7ba2d9d ]

Resource dump menu may span over more than a single page, support it.
Otherwise, menu read may result in a memory access violation: reading
outside of the allocated page.
Note that page format of the first menu page contains menu headers while
the proceeding menu pages contain only records.

The KASAN logs are as follows:
BUG: KASAN: slab-out-of-bounds in strcmp+0x9b/0xb0
Read of size 1 at addr ffff88812b2e1fd0 by task systemd-udevd/496

CPU: 5 PID: 496 Comm: systemd-udevd Tainted: G    B  5.16.0_for_upstream_debug_2022_01_10_23_12 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x57/0x7d
 print_address_description.constprop.0+0x1f/0x140
 ? strcmp+0x9b/0xb0
 ? strcmp+0x9b/0xb0
 kasan_report.cold+0x83/0xdf
 ? strcmp+0x9b/0xb0
 strcmp+0x9b/0xb0
 mlx5_rsc_dump_init+0x4ab/0x780 [mlx5_core]
 ? mlx5_rsc_dump_destroy+0x80/0x80 [mlx5_core]
 ? lockdep_hardirqs_on_prepare+0x286/0x400
 ? raw_spin_unlock_irqrestore+0x47/0x50
 ? aomic_notifier_chain_register+0x32/0x40
 mlx5_load+0x104/0x2e0 [mlx5_core]
 mlx5_init_one+0x41b/0x610 [mlx5_core]
 ....
The buggy address belongs to the object at ffff88812b2e0000
 which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 4048 bytes to the right of
 4096-byte region [ffff88812b2e0000, ffff88812b2e1000)
The buggy address belongs to the page:
page:000000009d69807a refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88812b2e6000 pfn:0x12b2e0
head:000000009d69807a order:3 compound_mapcount:0 compound_pincount:0
flags: 0x8000000000010200(slab|head|zone=2)
raw: 8000000000010200 0000000000000000 dead000000000001 ffff888100043040
raw: ffff88812b2e6000 0000000080040000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88812b2e1e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88812b2e1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88812b2e1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                 ^
 ffff88812b2e2000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88812b2e2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Fixes: 12206b1 ("net/mlx5: Add support for resource dump")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
[ Upstream commit fe83f5e ]

The commit in Fixes started adding INT3 after RETs as a mitigation
against straight-line speculation.

The fastop SETcc implementation in kvm's insn emulator uses macro magic
to generate all possible SETcc functions and to jump to them when
emulating the respective instruction.

However, it hardcodes the size and alignment of those functions to 4: a
three-byte SETcc insn and a single-byte RET. BUT, with SLS, there's an
INT3 that gets slapped after the RET, which brings the whole scheme out
of alignment:

  15:   0f 90 c0                seto   %al
  18:   c3                      ret
  19:   cc                      int3
  1a:   0f 1f 00                nopl   (%rax)
  1d:   0f 91 c0                setno  %al
  20:   c3                      ret
  21:   cc                      int3
  22:   0f 1f 00                nopl   (%rax)
  25:   0f 92 c0                setb   %al
  28:   c3                      ret
  29:   cc                      int3

and this explodes like this:

  int3: 0000 [#1] PREEMPT SMP PTI
  CPU: 0 PID: 2435 Comm: qemu-system-x86 Not tainted 5.17.0-rc8-sls #1
  Hardware name: Dell Inc. Precision WorkStation T3400  /0TP412, BIOS A14 04/30/2012
  RIP: 0010:setc+0x5/0x8 [kvm]
  Code: 00 00 0f 1f 00 0f b6 05 43 24 06 00 c3 cc 0f 1f 80 00 00 00 00 0f 90 c0 c3 cc 0f \
	  1f 00 0f 91 c0 c3 cc 0f 1f 00 0f 92 c0 c3 cc <0f> 1f 00 0f 93 c0 c3 cc 0f 1f 00 \
	  0f 94 c0 c3 cc 0f 1f 00 0f 95 c0
  Call Trace:
   <TASK>
   ? x86_emulate_insn [kvm]
   ? x86_emulate_instruction [kvm]
   ? vmx_handle_exit [kvm_intel]
   ? kvm_arch_vcpu_ioctl_run [kvm]
   ? kvm_vcpu_ioctl [kvm]
   ? __x64_sys_ioctl
   ? do_syscall_64
   ? entry_SYSCALL_64_after_hwframe
   </TASK>

Raise the alignment value when SLS is enabled and use a macro for that
instead of hard-coding naked numbers.

Fixes: e463a09 ("x86: Add straight-line-speculation mitigation")
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Link: https://lore.kernel.org/r/YjGzJwjrvxg5YZ0Z@audible.transient.net
[Add a comment and a bit of safety checking, since this is going to be changed
 again for IBT support. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CVE-2022-29900
CVE-2022-29901
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
The return thunk call makes the fastop functions larger, just like IBT
does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled.

Otherwise, functions will be incorrectly aligned and when computing their
position for differently sized operators, they will executed in the middle
or end of a function, which may as well be an int3, leading to a crash
like:

[   36.091116] int3: 0000 [#1] SMP NOPTI
[   36.091119] CPU: 3 PID: 1371 Comm: qemu-system-x86 Not tainted 5.15.0-41-generic #44
[   36.091120] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
[   36.091121] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.091185] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.091186] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.091188] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.091188] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.091189] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.091190] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.091190] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.091191] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.091192] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.091192] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.091195] PKRU: 55555554
[   36.091195] Call Trace:
[   36.091197]  <TASK>
[   36.091198]  ? fastop+0x5a/0xa0 [kvm]
[   36.091222]  x86_emulate_insn+0x7b8/0xe90 [kvm]
[   36.091244]  x86_emulate_instruction+0x2f4/0x630 [kvm]
[   36.091263]  ? kvm_arch_vcpu_load+0x7c/0x230 [kvm]
[   36.091283]  ? vmx_prepare_switch_to_host+0xf7/0x190 [kvm_intel]
[   36.091290]  complete_emulated_mmio+0x297/0x320 [kvm]
[   36.091310]  kvm_arch_vcpu_ioctl_run+0x32f/0x550 [kvm]
[   36.091330]  kvm_vcpu_ioctl+0x29e/0x6d0 [kvm]
[   36.091344]  ? kvm_vcpu_ioctl+0x120/0x6d0 [kvm]
[   36.091357]  ? __fget_files+0x86/0xc0
[   36.091362]  ? __fget_files+0x86/0xc0
[   36.091363]  __x64_sys_ioctl+0x92/0xd0
[   36.091366]  do_syscall_64+0x59/0xc0
[   36.091369]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091370]  ? do_syscall_64+0x69/0xc0
[   36.091371]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091372]  ? __x64_sys_writev+0x1c/0x30
[   36.091374]  ? do_syscall_64+0x69/0xc0
[   36.091374]  ? exit_to_user_mode_prepare+0x37/0xb0
[   36.091378]  ? syscall_exit_to_user_mode+0x27/0x50
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091379]  ? do_syscall_64+0x69/0xc0
[   36.091380]  ? do_syscall_64+0x69/0xc0
[   36.091381]  ? do_syscall_64+0x69/0xc0
[   36.091381]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[   36.091384] RIP: 0033:0x7efdfe6d1aff
[   36.091390] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[   36.091391] RSP: 002b:00007efdfce8c460 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   36.091393] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007efdfe6d1aff
[   36.091393] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000c
[   36.091394] RBP: 0000558f1609e220 R08: 0000558f13fb8190 R09: 00000000ffffffff
[   36.091394] R10: 0000558f16b5e950 R11: 0000000000000246 R12: 0000000000000000
[   36.091394] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[   36.091396]  </TASK>
[   36.091397] Modules linked in: isofs nls_iso8859_1 kvm_intel joydev kvm input_leds serio_raw sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_devintf ipmi_msghandler drm msr ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net net_failover crypto_simd ahci xhci_pci cryptd psmouse virtio_blk libahci xhci_pci_renesas failover
[   36.123271] ---[ end trace db3c0ab5a48fabcc ]---
[   36.123272] RIP: 0010:xaddw_ax_dx+0x9/0x10 [kvm]
[   36.123319] Code: 00 0f bb d0 c3 cc cc cc cc 48 0f bb d0 c3 cc cc cc cc 0f 1f 80 00 00 00 00 0f c0 d0 c3 cc cc cc cc 66 0f c1 d0 c3 cc cc cc cc <0f> 1f 80 00 00 00 00 0f c1 d0 c3 cc cc cc cc 48 0f c1 d0 c3 cc cc
[   36.123320] RSP: 0018:ffffb1f541143c98 EFLAGS: 00000202
[   36.123321] RAX: 0000000089abcdef RBX: 0000000000000001 RCX: 0000000000000000
[   36.123321] RDX: 0000000076543210 RSI: ffffffffc073c6d0 RDI: 0000000000000200
[   36.123322] RBP: ffffb1f541143ca0 R08: ffff9f1803350a70 R09: 0000000000000002
[   36.123322] R10: ffff9f1803350a70 R11: 0000000000000000 R12: ffff9f1803350a70
[   36.123323] R13: ffffffffc077fee0 R14: 0000000000000000 R15: 0000000000000000
[   36.123323] FS:  00007efdfce8d640(0000) GS:ffff9f187dd80000(0000) knlGS:0000000000000000
[   36.123324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.123325] CR2: 0000000000000000 CR3: 0000000009b62002 CR4: 0000000000772ee0
[   36.123327] PKRU: 55555554
[   36.123328] Kernel panic - not syncing: Fatal exception in interrupt
[   36.123410] Kernel Offset: 0x1400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   36.135305] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Fixes: aa3d480 ("x86: Use return-thunk in asm code")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Message-Id: <20220713171241.184026-1-cascardo@canonical.com>
Tested-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(backported from commit 84e7051)
[cascardo: factor out ENDBR_INSN_SIZE as ENDBR is not used]
CVE-2022-29900
CVE-2022-29901
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
When running with return thunks enabled under 32-bit EFI, the system
crashes with:

  kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
  BUG: unable to handle page fault for address: 000000005bc02900
  #PF: supervisor instruction fetch in kernel mode
  #PF: error_code(0x0011) - permissions violation
  PGD 18f7063 P4D 18f7063 PUD 18ff063 PMD 190e063 PTE 800000005bc02063
  Oops: 0011 [#1] PREEMPT SMP PTI
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc6+ #166
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:0x5bc02900
  Code: Unable to access opcode bytes at RIP 0x5bc028d6.
  RSP: 0018:ffffffffb3203e10 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000048
  RDX: 000000000190dfac RSI: 0000000000001710 RDI: 000000007eae823b
  RBP: ffffffffb3203e70 R08: 0000000001970000 R09: ffffffffb3203e28
  R10: 747563657865206c R11: 6c6977203a696665 R12: 0000000000001710
  R13: 0000000000000030 R14: 0000000001970000 R15: 0000000000000001
  FS:  0000000000000000(0000) GS:ffff8e013ca00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0018 ES: 0018 CR0: 0000000080050033
  CR2: 000000005bc02900 CR3: 0000000001930000 CR4: 00000000000006f0
  Call Trace:
   ? efi_set_virtual_address_map+0x9c/0x175
   efi_enter_virtual_mode+0x4a6/0x53e
   start_kernel+0x67c/0x71e
   x86_64_start_reservations+0x24/0x2a
   x86_64_start_kernel+0xe9/0xf4
   secondary_startup_64_no_verify+0xe5/0xeb

That's because it cannot jump to the return thunk from the 32-bit code.

Using a naked RET and marking it as safe allows the system to proceed
booting.

Fixes: aa3d480 ("x86: Use return-thunk in asm code")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: <stable@vger.kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 51a6fa0)
CVE-2022-29900
CVE-2022-29901
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1980278

[ Upstream commit 486b9ee ]

Function ice_plug_aux_dev() assigns pf->adev field too early prior
aux device initialization and on other side ice_unplug_aux_dev()
starts aux device deinit and at the end assigns NULL to pf->adev.
This is wrong because pf->adev should always be non-NULL only when
aux device is fully initialized and ready. This wrong order causes
a crash when ice_send_event_to_aux() call occurs because that function
depends on non-NULL value of pf->adev and does not assume that
aux device is half-initialized or half-destroyed.
After order correction the race window is tiny but it is still there,
as Leon mentioned and manipulation with pf->adev needs to be protected
by mutex.

Fix (un-)plugging functions so pf->adev field is set after aux device
init and prior aux device destroy and protect pf->adev assignment by
new mutex. This mutex is also held during ice_send_event_to_aux()
call to ensure that aux device is valid during that call.
Note that device lock used ice_send_event_to_aux() needs to be kept
to avoid race with aux drv unload.

Reproducer:
cycle=1
while :;do
        echo "#### Cycle: $cycle"

        ip link set ens7f0 mtu 9000
        ip link add bond0 type bond mode 1 miimon 100
        ip link set bond0 up
        ifenslave bond0 ens7f0
        ip link set bond0 mtu 9000
        ethtool -L ens7f0 combined 1
        ip link del bond0
        ip link set ens7f0 mtu 1500
        sleep 1

        let cycle++
done

In short when the device is added/removed to/from bond the aux device
is unplugged/plugged. When MTU of the device is changed an event is
sent to aux device asynchronously. This can race with (un)plugging
operation and because pf->adev is set too early (plug) or too late
(unplug) the function ice_send_event_to_aux() can touch uninitialized
or destroyed fields. In the case of crash below pf->adev->dev.mutex.

Crash:
[   53.372066] bond0: (slave ens7f0): making interface the new active one
[   53.378622] bond0: (slave ens7f0): Enslaving as an active interface with an u
p link
[   53.386294] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[   53.549104] bond0: (slave ens7f1): Enslaving as a backup interface with an up
 link
[   54.118906] ice 0000:ca:00.0 ens7f0: Number of in use tx queues changed inval
idating tc mappings. Priority traffic classification disabled!
[   54.233374] ice 0000:ca:00.1 ens7f1: Number of in use tx queues changed inval
idating tc mappings. Priority traffic classification disabled!
[   54.248204] bond0: (slave ens7f0): Releasing backup interface
[   54.253955] bond0: (slave ens7f1): making interface the new active one
[   54.274875] bond0: (slave ens7f1): Releasing backup interface
[   54.289153] bond0 (unregistering): Released all slaves
[   55.383179] MII link monitoring set to 100 ms
[   55.398696] bond0: (slave ens7f0): making interface the new active one
[   55.405241] BUG: kernel NULL pointer dereference, address: 0000000000000080
[   55.405289] bond0: (slave ens7f0): Enslaving as an active interface with an u
p link
[   55.412198] #PF: supervisor write access in kernel mode
[   55.412200] #PF: error_code(0x0002) - not-present page
[   55.412201] PGD 25d2ad067 P4D 0
[   55.412204] Oops: 0002 [#1] PREEMPT SMP NOPTI
[   55.412207] CPU: 0 PID: 403 Comm: kworker/0:2 Kdump: loaded Tainted: G S
           5.17.0-13579-g57f2d6540f03 #1
[   55.429094] bond0: (slave ens7f1): Enslaving as a backup interface with an up
 link
[   55.430224] Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.4.4 10/07/
2021
[   55.430226] Workqueue: ice ice_service_task [ice]
[   55.468169] RIP: 0010:mutex_unlock+0x10/0x20
[   55.472439] Code: 0f b1 13 74 96 eb e0 4c 89 ee eb d8 e8 79 54 ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 65 48 8b 04 25 40 ef 01 00 31 d2 <f0> 48 0f b1 17 75 01 c3 e9 e3 fe ff ff 0f 1f 00 0f 1f 44 00 00 48
[   55.491186] RSP: 0018:ff4454230d7d7e28 EFLAGS: 00010246
[   55.496413] RAX: ff1a79b208b08000 RBX: ff1a79b2182e8880 RCX: 0000000000000001
[   55.503545] RDX: 0000000000000000 RSI: ff4454230d7d7db0 RDI: 0000000000000080
[   55.510678] RBP: ff1a79d1c7e48b68 R08: ff4454230d7d7db0 R09: 0000000000000041
[   55.517812] R10: 00000000000000a5 R11: 00000000000006e6 R12: ff1a79d1c7e48bc0
[   55.524945] R13: 0000000000000000 R14: ff1a79d0ffc305c0 R15: 0000000000000000
[   55.532076] FS:  0000000000000000(0000) GS:ff1a79d0ffc00000(0000) knlGS:0000000000000000
[   55.540163] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   55.545908] CR2: 0000000000000080 CR3: 00000003487ae003 CR4: 0000000000771ef0
[   55.553041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   55.560173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   55.567305] PKRU: 55555554
[   55.570018] Call Trace:
[   55.572474]  <TASK>
[   55.574579]  ice_service_task+0xaab/0xef0 [ice]
[   55.579130]  process_one_work+0x1c5/0x390
[   55.583141]  ? process_one_work+0x390/0x390
[   55.587326]  worker_thread+0x30/0x360
[   55.590994]  ? process_one_work+0x390/0x390
[   55.595180]  kthread+0xe6/0x110
[   55.598325]  ? kthread_complete_and_exit+0x20/0x20
[   55.603116]  ret_from_fork+0x1f/0x30
[   55.606698]  </TASK>

Fixes: f9f5301 ("ice: Register auxiliary device to provide RDMA")
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1980278

[ Upstream commit c61711c ]

We are accessing "desc->ops" in sof_pci_probe without checking "desc"
pointer. This results in NULL pointer exception if pci_id->driver_data
i.e desc pointer isn't defined in sof device probe:

BUG: kernel NULL pointer dereference, address: 0000000000000060
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:sof_pci_probe+0x1e/0x17f [snd_sof_pci]
Code: Unable to access opcode bytes at RIP 0xffffffffc043dff4.
RSP: 0018:ffffac4b03b9b8d8 EFLAGS: 00010246

Add NULL pointer check for sof_dev_desc pointer to avoid such exception.

Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Ajit Kumar Pandey <AjitKumar.Pandey@amd.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Link: https://lore.kernel.org/r/20220426183357.102155-1-pierre-louis.bossart@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1980278

commit 478d134 upstream.

Kernel panic when injecting memory_failure for the global huge_zero_page,
when CONFIG_DEBUG_VM is enabled, as follows.

  Injecting memory failure for pfn 0x109ff9 at process virtual address 0x20ff9000
  page:00000000fb053fc3 refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x109e00
  head:00000000fb053fc3 order:9 compound_mapcount:0 compound_pincount:0
  flags: 0x17fffc000010001(locked|head|node=0|zone=2|lastcpupid=0x1ffff)
  raw: 017fffc000010001 0000000000000000 dead000000000122 0000000000000000
  raw: 0000000000000000 0000000000000000 00000002ffffffff 0000000000000000
  page dumped because: VM_BUG_ON_PAGE(is_huge_zero_page(head))
  ------------[ cut here ]------------
  kernel BUG at mm/huge_memory.c:2499!
  invalid opcode: 0000 [#1] PREEMPT SMP PTI
  CPU: 6 PID: 553 Comm: split_bug Not tainted 5.18.0-rc1+ #11
  Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 3288b3c 04/01/2014
  RIP: 0010:split_huge_page_to_list+0x66a/0x880
  Code: 84 9b fb ff ff 48 8b 7c 24 08 31 f6 e8 9f 5d 2a 00 b8 b8 02 00 00 e9 e8 fb ff ff 48 c7 c6 e8 47 3c 82 4c b
  RSP: 0018:ffffc90000dcbdf8 EFLAGS: 00010246
  RAX: 000000000000003c RBX: 0000000000000001 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffffff823e4c4f RDI: 00000000ffffffff
  RBP: ffff88843fffdb40 R08: 0000000000000000 R09: 00000000fffeffff
  R10: ffffc90000dcbc48 R11: ffffffff82d68448 R12: ffffea0004278000
  R13: ffffffff823c6203 R14: 0000000000109ff9 R15: ffffea000427fe40
  FS:  00007fc375a26740(0000) GS:ffff88842fd80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007fc3757c9290 CR3: 0000000102174006 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
  try_to_split_thp_page+0x3a/0x130
  memory_failure+0x128/0x800
  madvise_inject_error.cold+0x8b/0xa1
  __x64_sys_madvise+0x54/0x60
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae
  RIP: 0033:0x7fc3754f8bf9
  Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 8
  RSP: 002b:00007ffeda93a1d8 EFLAGS: 00000217 ORIG_RAX: 000000000000001c
  RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc3754f8bf9
  RDX: 0000000000000064 RSI: 0000000000003000 RDI: 0000000020ff9000
  RBP: 00007ffeda93a200 R08: 0000000000000000 R09: 0000000000000000
  R10: 00000000ffffffff R11: 0000000000000217 R12: 0000000000400490
  R13: 00007ffeda93a2e0 R14: 0000000000000000 R15: 0000000000000000

We think that raising BUG is overkilling for splitting huge_zero_page, the
huge_zero_page can't be met from normal paths other than memory failure,
but memory failure is a valid caller.  So we tend to replace the BUG to
WARN + returning -EBUSY, and thus the panic above won't happen again.

Link: https://lkml.kernel.org/r/f35f8b97377d5d3ede1bc5ac3114da888c57cbce.1651052574.git.xuyu@linux.alibaba.com
Fixes: d173d54 ("mm/memory-failure.c: skip huge_zero_page in memory_failure()")
Fixes: 6a46079 ("HWPOISON: The high level memory error handler in the VM v7")
Signed-off-by: Xu Yu <xuyu@linux.alibaba.com>
Suggested-by: Yang Shi <shy828301@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Yang Shi <shy828301@gmail.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1980278

commit e333eed upstream.

Since commit f1131b9 ("net: phy: micrel: use
kszphy_suspend()/kszphy_resume for irq aware devices") the following
NULL pointer dereference is observed on a board with KSZ8061:

 # udhcpc -i eth0
udhcpc: started, v1.35.0
8<--- cut here ---
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = f73cef4e
[00000008] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 196 Comm: ifconfig Not tainted 5.15.37-dirty #94
Hardware name: Freescale i.MX6 SoloX (Device Tree)
PC is at kszphy_config_reset+0x10/0x114
LR is at kszphy_resume+0x24/0x64
...

The KSZ8061 phy_driver structure does not have the .probe/..driver_data
fields, which means that priv is not allocated.

This causes the NULL pointer dereference inside kszphy_config_reset().

Fix the problem by using the generic suspend/resume functions as before.

Another alternative would be to provide the .probe and .driver_data
information into the structure, but to be on the safe side, let's
just restore Ethernet functionality by using the generic suspend/resume.

Cc: stable@vger.kernel.org
Fixes: f1131b9 ("net: phy: micrel: use kszphy_suspend()/kszphy_resume for irq aware devices")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220504143104.1286960-1-festevam@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1980278

commit 1825b93 upstream.

The following VM_BUG_ON_FOLIO() is triggered when memory error event
happens on the (thp/folio) pages which are about to be freed:

  [ 1160.232771] page:00000000b36a8a0f refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
  [ 1160.236916] page:00000000b36a8a0f refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x16a000
  [ 1160.240684] flags: 0x57ffffc0800000(hwpoison|node=1|zone=2|lastcpupid=0x1fffff)
  [ 1160.243458] raw: 0057ffffc0800000 dead000000000100 dead000000000122 0000000000000000
  [ 1160.246268] raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
  [ 1160.249197] page dumped because: VM_BUG_ON_FOLIO(!folio_test_large(folio))
  [ 1160.251815] ------------[ cut here ]------------
  [ 1160.253438] kernel BUG at include/linux/mm.h:788!
  [ 1160.256162] invalid opcode: 0000 [#1] PREEMPT SMP PTI
  [ 1160.258172] CPU: 2 PID: 115368 Comm: mceinj.sh Tainted: G            E     5.18.0-rc1-v5.18-rc1-220404-2353-005-g83111+ #3
  [ 1160.262049] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
  [ 1160.265103] RIP: 0010:dump_page.cold+0x27e/0x2bd
  [ 1160.266757] Code: fe ff ff 48 c7 c6 81 f1 5a 98 e9 4c fe ff ff 48 c7 c6 a1 95 59 98 e9 40 fe ff ff 48 c7 c6 50 bf 5a 98 48 89 ef e8 9d 04 6d ff <0f> 0b 41 f7 c4 ff 0f 00 00 0f 85 9f fd ff ff 49 8b 04 24 a9 00 00
  [ 1160.273180] RSP: 0018:ffffaa2c4d59fd18 EFLAGS: 00010292
  [ 1160.274969] RAX: 000000000000003e RBX: 0000000000000001 RCX: 0000000000000000
  [ 1160.277263] RDX: 0000000000000001 RSI: ffffffff985995a1 RDI: 00000000ffffffff
  [ 1160.279571] RBP: ffffdc9c45a80000 R08: 0000000000000000 R09: 00000000ffffdfff
  [ 1160.281794] R10: ffffaa2c4d59fb08 R11: ffffffff98940d08 R12: ffffdc9c45a80000
  [ 1160.283920] R13: ffffffff985b6f94 R14: 0000000000000000 R15: ffffdc9c45a80000
  [ 1160.286641] FS:  00007eff54ce1740(0000) GS:ffff99c67bd00000(0000) knlGS:0000000000000000
  [ 1160.289498] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [ 1160.291106] CR2: 00005628381a5f68 CR3: 0000000104712003 CR4: 0000000000170ee0
  [ 1160.293031] Call Trace:
  [ 1160.293724]  <TASK>
  [ 1160.294334]  get_hwpoison_page+0x47d/0x570
  [ 1160.295474]  memory_failure+0x106/0xaa0
  [ 1160.296474]  ? security_capable+0x36/0x50
  [ 1160.297524]  hard_offline_page_store+0x43/0x80
  [ 1160.298684]  kernfs_fop_write_iter+0x11c/0x1b0
  [ 1160.299829]  new_sync_write+0xf9/0x160
  [ 1160.300810]  vfs_write+0x209/0x290
  [ 1160.301835]  ksys_write+0x4f/0xc0
  [ 1160.302718]  do_syscall_64+0x3b/0x90
  [ 1160.303664]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [ 1160.304981] RIP: 0033:0x7eff54b018b7

As shown in the RIP address, this VM_BUG_ON in folio_entire_mapcount() is
called from dump_page("hwpoison: unhandlable page") in get_any_page().
The below explains the mechanism of the race:

  CPU 0                                       CPU 1

    memory_failure
      get_hwpoison_page
        get_any_page
          dump_page
            compound = PageCompound
                                                free_pages_prepare
                                                  page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP
            folio_entire_mapcount
              VM_BUG_ON_FOLIO(!folio_test_large(folio))

So replace dump_page() with safer one, pr_err().

Link: https://lkml.kernel.org/r/20220427053220.719866-1-naoya.horiguchi@linux.dev
Fixes: 74e8ee4 ("mm: Turn head_compound_mapcount() into folio_entire_mapcount()")
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981375

[ Upstream commit a4a6f3c ]

nvme_mpath_init_identify() invoked from nvme_init_identify() fetches a
fresh ANA log from the ctrl.  This is essential to have an up to date
path states for both existing namespaces and for those scan_work may
discover once the ctrl is up.

This happens in the following cases:
  1) A new ctrl is being connected.
  2) An existing ctrl is successfully reconnected.
  3) An existing ctrl is being reset.

While in (1) ctrl->namespaces is empty, (2 & 3) may have namespaces, and
nvme_read_ana_log() may call nvme_update_ns_ana_state().

This result in a hang when the ANA state of an existing namespace changes
and makes the disk live: nvme_mpath_set_live() issues IO to the namespace
through the ctrl, which does NOT have IO queues yet.

See sample hang below.

Solution:
- nvme_update_ns_ana_state() to call set_live only if ctrl is live
- nvme_read_ana_log() call from nvme_mpath_init_identify()
  therefore only fetches and parses the ANA log;
  any erros in this process will fail the ctrl setup as appropriate;
- a separate function nvme_mpath_update()
  is called in nvme_start_ctrl();
  this parses the ANA log without fetching it.
  At this point the ctrl is live,
  therefore, disks can be set live normally.

Sample failure:
    nvme nvme0: starting error recovery
    nvme nvme0: Reconnecting in 10 seconds...
    block nvme0n6: no usable path - requeuing I/O
    INFO: task kworker/u8:3:312 blocked for more than 122 seconds.
          Tainted: G            E     5.14.5-1.el7.elrepo.x86_64 #1
    Workqueue: nvme-wq nvme_tcp_reconnect_ctrl_work [nvme_tcp]
    Call Trace:
     __schedule+0x2a2/0x7e0
     schedule+0x4e/0xb0
     io_schedule+0x16/0x40
     wait_on_page_bit_common+0x15c/0x3e0
     do_read_cache_page+0x1e0/0x410
     read_cache_page+0x12/0x20
     read_part_sector+0x46/0x100
     read_lba+0x121/0x240
     efi_partition+0x1d2/0x6a0
     bdev_disk_changed.part.0+0x1df/0x430
     bdev_disk_changed+0x18/0x20
     blkdev_get_whole+0x77/0xe0
     blkdev_get_by_dev+0xd2/0x3a0
     __device_add_disk+0x1ed/0x310
     device_add_disk+0x13/0x20
     nvme_mpath_set_live+0x138/0x1b0 [nvme_core]
     nvme_update_ns_ana_state+0x2b/0x30 [nvme_core]
     nvme_update_ana_state+0xca/0xe0 [nvme_core]
     nvme_parse_ana_log+0xac/0x170 [nvme_core]
     nvme_read_ana_log+0x7d/0xe0 [nvme_core]
     nvme_mpath_init_identify+0x105/0x150 [nvme_core]
     nvme_init_identify+0x2df/0x4d0 [nvme_core]
     nvme_init_ctrl_finish+0x8d/0x3b0 [nvme_core]
     nvme_tcp_setup_ctrl+0x337/0x390 [nvme_tcp]
     nvme_tcp_reconnect_ctrl_work+0x24/0x40 [nvme_tcp]
     process_one_work+0x1bd/0x360
     worker_thread+0x50/0x3d0

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981375

[ Upstream commit 4503cc7 ]

Do not allow to write timestamps on RX rings if PF is being configured.
When PF is being configured RX rings can be freed or rebuilt. If at the
same time timestamps are updated, the kernel will crash by dereferencing
null RX ring pointer.

PID: 1449   TASK: ff187d28ed658040  CPU: 34  COMMAND: "ice-ptp-0000:51"
 #0 [ff1966a94a713bb0] machine_kexec at ffffffff9d05a0be
 #1 [ff1966a94a713c08] __crash_kexec at ffffffff9d192e9d
 #2 [ff1966a94a713cd0] crash_kexec at ffffffff9d1941bd
 #3 [ff1966a94a713ce8] oops_end at ffffffff9d01bd54
 #4 [ff1966a94a713d08] no_context at ffffffff9d06bda4
 #5 [ff1966a94a713d60] __bad_area_nosemaphore at ffffffff9d06c10c
 #6 [ff1966a94a713da8] do_page_fault at ffffffff9d06cae4
 #7 [ff1966a94a713de0] page_fault at ffffffff9da0107e
    [exception RIP: ice_ptp_update_cached_phctime+91]
    RIP: ffffffffc076db8b  RSP: ff1966a94a713e98  RFLAGS: 00010246
    RAX: 16e3db9c6b7ccae4  RBX: ff187d269dd3c180  RCX: ff187d269cd4d018
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ff187d269cfcc644   R8: ff187d339b9641b0   R9: 0000000000000000
    R10: 0000000000000002  R11: 0000000000000000  R12: ff187d269cfcc648
    R13: ffffffff9f128784  R14: ffffffff9d101b70  R15: ff187d269cfcc640
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ff1966a94a713ea0] ice_ptp_periodic_work at ffffffffc076dbef [ice]
 #9 [ff1966a94a713ee0] kthread_worker_fn at ffffffff9d101c1b
 #10 [ff1966a94a713f10] kthread at ffffffff9d101b4d
 #11 [ff1966a94a713f50] ret_from_fork at ffffffff9da0023f

Fixes: 77a7811 ("ice: enable receive hardware timestamping")
Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Tested-by: Dave Cain <dcain@redhat.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
… Clang CFI

BugLink: https://bugs.launchpad.net/bugs/1981649

commit d2a02e3 upstream.

blake2s_compress_generic is weakly aliased by blake2s_compress. The
current harness for function selection uses a function pointer, which is
ordinarily inlined and resolved at compile time. But when Clang's CFI is
enabled, CFI still triggers when making an indirect call via a weak
symbol. This seems like a bug in Clang's CFI, as though it's bucketing
weak symbols and strong symbols differently. It also only seems to
trigger when "full LTO" mode is used, rather than "thin LTO".

[    0.000000][    T0] Kernel panic - not syncing: CFI failure (target: blake2s_compress_generic+0x0/0x1444)
[    0.000000][    T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.16.0-mainline-06981-g076c855b846e #1
[    0.000000][    T0] Hardware name: MT6873 (DT)
[    0.000000][    T0] Call trace:
[    0.000000][    T0]  dump_backtrace+0xfc/0x1dc
[    0.000000][    T0]  dump_stack_lvl+0xa8/0x11c
[    0.000000][    T0]  panic+0x194/0x464
[    0.000000][    T0]  __cfi_check_fail+0x54/0x58
[    0.000000][    T0]  __cfi_slowpath_diag+0x354/0x4b0
[    0.000000][    T0]  blake2s_update+0x14c/0x178
[    0.000000][    T0]  _extract_entropy+0xf4/0x29c
[    0.000000][    T0]  crng_initialize_primary+0x24/0x94
[    0.000000][    T0]  rand_initialize+0x2c/0x6c
[    0.000000][    T0]  start_kernel+0x2f8/0x65c
[    0.000000][    T0]  __primary_switched+0xc4/0x7be4
[    0.000000][    T0] Rebooting in 5 seconds..

Nonetheless, the function pointer method isn't so terrific anyway, so
this patch replaces it with a simple boolean, which also gets inlined
away. This successfully works around the Clang bug.

In general, I'm not too keen on all of the indirection involved here; it
clearly does more harm than good. Hopefully the whole thing can get
cleaned up down the road when lib/crypto is overhauled more
comprehensively. But for now, we go with a simple bandaid.

Fixes: 6048fdc ("lib/crypto: blake2s: include as built-in")
Link: ClangBuiltLinux/linux#1567
Reported-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Miles Chen <miles.chen@mediatek.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981862

[ Upstream commit 2e40316 ]

Will reported the following splat when running with Protected KVM
enabled:

[    2.427181] ------------[ cut here ]------------
[    2.427668] WARNING: CPU: 3 PID: 1 at arch/arm64/kvm/mmu.c:489 __create_hyp_private_mapping+0x118/0x1ac
[    2.428424] Modules linked in:
[    2.429040] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc2-00084-g8635adc4efc7 #1
[    2.429589] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[    2.430286] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    2.430734] pc : __create_hyp_private_mapping+0x118/0x1ac
[    2.431091] lr : create_hyp_exec_mappings+0x40/0x80
[    2.431377] sp : ffff80000803baf0
[    2.431597] x29: ffff80000803bb00 x28: 0000000000000000 x27: 0000000000000000
[    2.432156] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[    2.432561] x23: ffffcd96c343b000 x22: 0000000000000000 x21: ffff80000803bb40
[    2.433004] x20: 0000000000000004 x19: 0000000000001800 x18: 0000000000000000
[    2.433343] x17: 0003e68cf7efdd70 x16: 0000000000000004 x15: fffffc81f602a2c8
[    2.434053] x14: ffffdf8380000000 x13: ffffcd9573200000 x12: ffffcd96c343b000
[    2.434401] x11: 0000000000000004 x10: ffffcd96c1738000 x9 : 0000000000000004
[    2.434812] x8 : ffff80000803bb40 x7 : 7f7f7f7f7f7f7f7f x6 : 544f422effff306b
[    2.435136] x5 : 000000008020001e x4 : ffff207d80a88c00 x3 : 0000000000000005
[    2.435480] x2 : 0000000000001800 x1 : 000000014f4ab800 x0 : 000000000badca11
[    2.436149] Call trace:
[    2.436600]  __create_hyp_private_mapping+0x118/0x1ac
[    2.437576]  create_hyp_exec_mappings+0x40/0x80
[    2.438180]  kvm_init_vector_slots+0x180/0x194
[    2.458941]  kvm_arch_init+0x80/0x274
[    2.459220]  kvm_init+0x48/0x354
[    2.459416]  arm_init+0x20/0x2c
[    2.459601]  do_one_initcall+0xbc/0x238
[    2.459809]  do_initcall_level+0x94/0xb4
[    2.460043]  do_initcalls+0x54/0x94
[    2.460228]  do_basic_setup+0x1c/0x28
[    2.460407]  kernel_init_freeable+0x110/0x178
[    2.460610]  kernel_init+0x20/0x1a0
[    2.460817]  ret_from_fork+0x10/0x20
[    2.461274] ---[ end trace 0000000000000000 ]---

Indeed, the Protected KVM mode promotes __create_hyp_private_mapping()
to a hypercall as EL1 no longer has access to the hypervisor's stage-1
page-table. However, the call from kvm_init_vector_slots() happens after
pKVM has been initialized on the primary CPU, but before it has been
initialized on secondaries. As such, if the KVM initcall procedure is
migrated from one CPU to another in this window, the hypercall may end up
running on a CPU for which EL2 has not been initialized.

Fortunately, the pKVM hypervisor doesn't rely on the host to re-map the
vectors in the private range, so the hypercall in question is in fact
superfluous. Skip it when pKVM is enabled.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
[maz: simplified the checks slightly]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220513092607.35233-1-qperret@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

[ Upstream commit 194d250 ]

drm_cvt_mode may return NULL and we should check it.

This bug is found by syzkaller:

FAULT_INJECTION stacktrace:
[  168.567394] FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
[  168.567403] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567406] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567408] Call trace:
[  168.567414]  dump_backtrace+0x0/0x310
[  168.567418]  show_stack+0x28/0x38
[  168.567423]  dump_stack+0xec/0x15c
[  168.567427]  should_fail+0x3ac/0x3d0
[  168.567437]  __should_failslab+0xb8/0x120
[  168.567441]  should_failslab+0x28/0xc0
[  168.567445]  kmem_cache_alloc_trace+0x50/0x640
[  168.567454]  drm_mode_create+0x40/0x90
[  168.567458]  drm_cvt_mode+0x48/0xc78
[  168.567477]  virtio_gpu_conn_get_modes+0xa8/0x140 [virtio_gpu]
[  168.567485]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567492]  drm_mode_getconnector+0x2e0/0xa70
[  168.567496]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567514]  drm_ioctl+0x558/0x6d0
[  168.567522]  do_vfs_ioctl+0x160/0xf30
[  168.567525]  ksys_ioctl+0x98/0xd8
[  168.567530]  __arm64_sys_ioctl+0x50/0xc8
[  168.567536]  el0_svc_common+0xc8/0x320
[  168.567540]  el0_svc_handler+0xf8/0x160
[  168.567544]  el0_svc+0x10/0x218

KASAN stacktrace:
[  168.567561] BUG: KASAN: null-ptr-deref in virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567565] Read of size 4 at addr 0000000000000054 by task syz/6425
[  168.567566]
[  168.567571] CPU: 1 PID: 6425 Comm: syz Kdump: loaded Not tainted 4.19.90-vhulk2201.1.0.h1035.kasan.eulerosv2r10.aarch64 #1
[  168.567573] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  168.567575] Call trace:
[  168.567578]  dump_backtrace+0x0/0x310
[  168.567582]  show_stack+0x28/0x38
[  168.567586]  dump_stack+0xec/0x15c
[  168.567591]  kasan_report+0x244/0x2f0
[  168.567594]  __asan_load4+0x58/0xb0
[  168.567607]  virtio_gpu_conn_get_modes+0xb4/0x140 [virtio_gpu]
[  168.567612]  drm_helper_probe_single_connector_modes+0x3a4/0xd80
[  168.567617]  drm_mode_getconnector+0x2e0/0xa70
[  168.567621]  drm_ioctl_kernel+0x11c/0x1d8
[  168.567624]  drm_ioctl+0x558/0x6d0
[  168.567628]  do_vfs_ioctl+0x160/0xf30
[  168.567632]  ksys_ioctl+0x98/0xd8
[  168.567636]  __arm64_sys_ioctl+0x50/0xc8
[  168.567641]  el0_svc_common+0xc8/0x320
[  168.567645]  el0_svc_handler+0xf8/0x160
[  168.567649]  el0_svc+0x10/0x218

Signed-off-by: Liu Zixian <liuzixian4@huawei.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20220322091730.1653-1-liuzixian4@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

[ Upstream commit e68cb83 ]

If bitmap area contains invalid data, kernel will crash then mdadm
triggers "Segmentation fault".
This is cluster-md speical bug. In non-clustered env, mdadm will
handle broken metadata case. In clustered array, only kernel space
handles bitmap slot info. But even this bug only happened in clustered
env, current sanity check is wrong, the code should be changed.

How to trigger: (faulty injection)

dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sda
dd if=/dev/zero bs=1M count=1 oflag=direct of=/dev/sdb
mdadm -C /dev/md0 -b clustered -e 1.2 -n 2 -l mirror /dev/sda /dev/sdb
mdadm -Ss
echo aaa > magic.txt
 == below modifying slot 2 bitmap data ==
dd if=magic.txt of=/dev/sda seek=16384 bs=1 count=3 <== destroy magic
dd if=/dev/zero of=/dev/sda seek=16436 bs=1 count=4 <== ZERO chunksize
mdadm -A /dev/md0 /dev/sda /dev/sdb
 == kernel crashes. mdadm outputs "Segmentation fault" ==

Reason of kernel crash:

In md_bitmap_read_sb (called by md_bitmap_create), bad bitmap magic didn't
block chunksize assignment, and zero value made DIV_ROUND_UP_SECTOR_T()
trigger "divide error".

Crash log:

kernel: md: md0 stopped.
kernel: md/raid1:md0: not clean -- starting background reconstruction
kernel: md/raid1:md0: active with 2 out of 2 mirrors
kernel: dlm: ... ...
kernel: md-cluster: Joined cluster 44810aba-38bb-e6b8-daca-bc97a0b254aa slot 1
kernel: md0: invalid bitmap file superblock: bad magic
kernel: md_bitmap_copy_from_slot can't get bitmap from slot 2
kernel: md-cluster: Could not gather bitmaps from slot 2
kernel: divide error: 0000 [#1] SMP NOPTI
kernel: CPU: 0 PID: 1603 Comm: mdadm Not tainted 5.14.6-1-default
kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]
kernel: RSP: 0018:ffffc22ac0843ba0 EFLAGS: 00010246
kernel: ... ...
kernel: Call Trace:
kernel:  ? dlm_lock_sync+0xd0/0xd0 [md_cluster 77fe..7a0]
kernel:  md_bitmap_copy_from_slot+0x2c/0x290 [md_mod 24ea..d3a]
kernel:  load_bitmaps+0xec/0x210 [md_cluster 77fe..7a0]
kernel:  md_bitmap_load+0x81/0x1e0 [md_mod 24ea..d3a]
kernel:  do_md_run+0x30/0x100 [md_mod 24ea..d3a]
kernel:  md_ioctl+0x1290/0x15a0 [md_mod 24ea....d3a]
kernel:  ? mddev_unlock+0xaa/0x130 [md_mod 24ea..d3a]
kernel:  ? blkdev_ioctl+0xb1/0x2b0
kernel:  block_ioctl+0x3b/0x40
kernel:  __x64_sys_ioctl+0x7f/0xb0
kernel:  do_syscall_64+0x59/0x80
kernel:  ? exit_to_user_mode_prepare+0x1ab/0x230
kernel:  ? syscall_exit_to_user_mode+0x18/0x40
kernel:  ? do_syscall_64+0x69/0x80
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
kernel: RIP: 0033:0x7f4a15fa722b
kernel: ... ...
kernel: ---[ end trace 8afa7612f559c868 ]---
kernel: RIP: 0010:md_bitmap_create+0x1d1/0x850 [md_mod]

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

[ Upstream commit 161c64d ]

When ath11k modules are removed using rmmod with spectral scan enabled,
crash is observed. Different crash trace is observed for each crash.

Send spectral scan disable WMI command to firmware before cleaning
the spectral dbring in the spectral_deinit API to avoid this crash.

call trace from one of the crash observed:
[ 1252.880802] Unable to handle kernel NULL pointer dereference at virtual address 00000008
[ 1252.882722] pgd = 0f42e886
[ 1252.890955] [00000008] *pgd=00000000
[ 1252.893478] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 1253.093035] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.89 #0
[ 1253.115261] Hardware name: Generic DT based system
[ 1253.121149] PC is at ath11k_spectral_process_data+0x434/0x574 [ath11k]
[ 1253.125940] LR is at 0x88e31017
[ 1253.132448] pc : [<7f9387b8>]    lr : [<88e31017>]    psr: a0000193
[ 1253.135488] sp : 80d01bc8  ip : 00000001  fp : 970e0000
[ 1253.141737] r10: 88e31000  r9 : 970ec000  r8 : 00000080
[ 1253.146946] r7 : 94734040  r6 : a0000113  r5 : 00000057  r4 : 00000000
[ 1253.152159] r3 : e18cb694  r2 : 00000217  r1 : 1df1f000  r0 : 00000001
[ 1253.158755] Flags: NzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 1253.165266] Control: 10c0383d  Table: 5e71006a  DAC: 00000055
[ 1253.172472] Process swapper/0 (pid: 0, stack limit = 0x60870141)
[ 1253.458055] [<7f9387b8>] (ath11k_spectral_process_data [ath11k]) from [<7f917fdc>] (ath11k_dbring_buffer_release_event+0x214/0x2e4 [ath11k])
[ 1253.466139] [<7f917fdc>] (ath11k_dbring_buffer_release_event [ath11k]) from [<7f8ea3c4>] (ath11k_wmi_tlv_op_rx+0x1840/0x29cc [ath11k])
[ 1253.478807] [<7f8ea3c4>] (ath11k_wmi_tlv_op_rx [ath11k]) from [<7f8fe868>] (ath11k_htc_rx_completion_handler+0x180/0x4e0 [ath11k])
[ 1253.490699] [<7f8fe868>] (ath11k_htc_rx_completion_handler [ath11k]) from [<7f91308c>] (ath11k_ce_per_engine_service+0x2c4/0x3b4 [ath11k])
[ 1253.502386] [<7f91308c>] (ath11k_ce_per_engine_service [ath11k]) from [<7f9a4198>] (ath11k_pci_ce_tasklet+0x28/0x80 [ath11k_pci])
[ 1253.514811] [<7f9a4198>] (ath11k_pci_ce_tasklet [ath11k_pci]) from [<8032227c>] (tasklet_action_common.constprop.2+0x64/0xe8)
[ 1253.526476] [<8032227c>] (tasklet_action_common.constprop.2) from [<803021e8>] (__do_softirq+0x130/0x2d0)
[ 1253.537756] [<803021e8>] (__do_softirq) from [<80322610>] (irq_exit+0xcc/0xe8)
[ 1253.547304] [<80322610>] (irq_exit) from [<8036a4a4>] (__handle_domain_irq+0x60/0xb4)
[ 1253.554428] [<8036a4a4>] (__handle_domain_irq) from [<805eb348>] (gic_handle_irq+0x4c/0x90)
[ 1253.562321] [<805eb348>] (gic_handle_irq) from [<80301a78>] (__irq_svc+0x58/0x8c)

Tested-on: QCN6122 hw1.0 AHB WLAN.HK.2.6.0.1-00851-QCAHKSWPL_SILICONZ-1

Signed-off-by: Hari Chandrakanthan <quic_haric@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/1649396345-349-1-git-send-email-quic_haric@quicinc.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

[ Upstream commit b72a4af ]

Double free crash is observed when FW recovery(caused by wmi
timeout/crash) is followed by immediate suspend event. The FW recovery
is triggered by ath10k_core_restart() which calls driver clean up via
ath10k_halt(). When the suspend event occurs between the FW recovery,
the restart worker thread is put into frozen state until suspend completes.
The suspend event triggers ath10k_stop() which again triggers ath10k_halt()
The double invocation of ath10k_halt() causes ath10k_htt_rx_free() to be
called twice(Note: ath10k_htt_rx_alloc was not called by restart worker
thread because of its frozen state), causing the crash.

To fix this, during the suspend flow, skip call to ath10k_halt() in
ath10k_stop() when the current driver state is ATH10K_STATE_RESTARTING.
Also, for driver state ATH10K_STATE_RESTARTING, call
ath10k_wait_for_suspend() in ath10k_stop(). This is because call to
ath10k_wait_for_suspend() is skipped later in
[ath10k_halt() > ath10k_core_stop()] for the driver state
ATH10K_STATE_RESTARTING.

The frozen restart worker thread will be cancelled during resume when the
device comes out of suspend.

Below is the crash stack for reference:

[  428.469167] ------------[ cut here ]------------
[  428.469180] kernel BUG at mm/slub.c:4150!
[  428.469193] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  428.469219] Workqueue: events_unbound async_run_entry_fn
[  428.469230] RIP: 0010:kfree+0x319/0x31b
[  428.469241] RSP: 0018:ffffa1fac015fc30 EFLAGS: 00010246
[  428.469247] RAX: ffffedb10419d108 RBX: ffff8c05262b0000
[  428.469252] RDX: ffff8c04a8c07000 RSI: 0000000000000000
[  428.469256] RBP: ffffa1fac015fc78 R08: 0000000000000000
[  428.469276] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  428.469285] Call Trace:
[  428.469295]  ? dma_free_attrs+0x5f/0x7d
[  428.469320]  ath10k_core_stop+0x5b/0x6f
[  428.469336]  ath10k_halt+0x126/0x177
[  428.469352]  ath10k_stop+0x41/0x7e
[  428.469387]  drv_stop+0x88/0x10e
[  428.469410]  __ieee80211_suspend+0x297/0x411
[  428.469441]  rdev_suspend+0x6e/0xd0
[  428.469462]  wiphy_suspend+0xb1/0x105
[  428.469483]  ? name_show+0x2d/0x2d
[  428.469490]  dpm_run_callback+0x8c/0x126
[  428.469511]  ? name_show+0x2d/0x2d
[  428.469517]  __device_suspend+0x2e7/0x41b
[  428.469523]  async_suspend+0x1f/0x93
[  428.469529]  async_run_entry_fn+0x3d/0xd1
[  428.469535]  process_one_work+0x1b1/0x329
[  428.469541]  worker_thread+0x213/0x372
[  428.469547]  kthread+0x150/0x15f
[  428.469552]  ? pr_cont_work+0x58/0x58
[  428.469558]  ? kthread_blkcg+0x31/0x31

Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1
Co-developed-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220426221859.v2.1.I650b809482e1af8d0156ed88b5dc2677a0711d46@changeid
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

[ Upstream commit b6b1c3c ]

RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32-bit big
endian mode (MSR[SF,LE] unset).

The change in MSR is done in enter_rtas() in a relatively complex way,
since the MSR value could be hardcoded.

Furthermore, a panic has been reported when hitting the watchdog interrupt
while running in RTAS, this leads to the following stack trace:

  watchdog: CPU 24 Hard LOCKUP
  watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
  ...
  Supported: No, Unreleased kernel
  CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
  NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
  REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
  MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
  CFAR: 000000000000011c IRQMASK: 1
  GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
  GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
  GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
  GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
  GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
  GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
  GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
  NIP [000000001fb41050] 0x1fb41050
  LR [000000001fb4104c] 0x1fb4104c
  Call Trace:
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  Oops: Unrecoverable System Reset, sig: 6 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  ...
  Supported: No, Unreleased kernel
  CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
  NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
  REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
  MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
  CFAR: 000000000000011c IRQMASK: 1
  GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
  GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
  GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
  GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
  GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
  GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
  GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
  GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
  NIP [000000001fb41050] 0x1fb41050
  LR [000000001fb4104c] 0x1fb4104c
  Call Trace:
  Instruction dump:
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
  ---[ end trace 3ddec07f638c34a2 ]---

This happens because MSR[RI] is unset when entering RTAS but there is no
valid reason to not set it here.

RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
"7.2.1 Machine State":

  R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect
  its own critical regions from recursion by setting the MSR[RI] bit to
  0 when in the critical regions.

Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
hardcoded value meaning real mode, 32 bits big endian mode and Recoverable
Interrupt is loaded. In the case MSR[S] is set, it will remain set while
entering RTAS as only urfid can unset it (thanks Fabiano).

In addition a check is added in do_enter_rtas() to detect calls made with
MSR[RI] unset, as we are forcing it on later.

This patch has been tested on the following machines:
Power KVM Guest
  P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
PowerVM LPAR
  P8 9119-MME (FW860.A1)
  p9 9008-22L (FW950.00)
  P10 9080-HEX (FW1010.00)

Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220504101244.12107-1-ldufour@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

[ Upstream commit 365ab7e ]

When removing the max9286 module we get a kernel oops:

Unable to handle kernel paging request at virtual address 000000aa00000094
Mem abort info:
  ESR = 0x96000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x04: level 0 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000880d85000
[000000aa00000094] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Modules linked in: fsl_jr_uio caam_jr rng_core libdes caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine max9271 authenc crct10dif_ce mxc_jpeg_encdec
CPU: 2 PID: 713 Comm: rmmod Tainted: G         C        5.15.5-00057-gaebcd29c8ed7-dirty #5
Hardware name: Freescale i.MX8QXP MEK (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : i2c_mux_del_adapters+0x24/0xf0
lr : max9286_remove+0x28/0xd0 [max9286]
sp : ffff800013a9bbf0
x29: ffff800013a9bbf0 x28: ffff00080b6da940 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
x23: ffff000801a5b970 x22: ffff0008048b0890 x21: ffff800009297000
x20: ffff0008048b0f70 x19: 000000aa00000064 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000014 x13: 0000000000000000 x12: ffff000802da49e8
x11: ffff000802051918 x10: ffff000802da4920 x9 : ffff000800030098
x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff6364626d
x5 : 8080808000000000 x4 : 0000000000000000 x3 : 0000000000000000
x2 : ffffffffffffffff x1 : ffff00080b6da940 x0 : 0000000000000000
Call trace:
 i2c_mux_del_adapters+0x24/0xf0
 max9286_remove+0x28/0xd0 [max9286]
 i2c_device_remove+0x40/0x110
 __device_release_driver+0x188/0x234
 driver_detach+0xc4/0x150
 bus_remove_driver+0x60/0xe0
 driver_unregister+0x34/0x64
 i2c_del_driver+0x58/0xa0
 max9286_i2c_driver_exit+0x1c/0x490 [max9286]
 __arm64_sys_delete_module+0x194/0x260
 invoke_syscall+0x48/0x114
 el0_svc_common.constprop.0+0xd4/0xfc
 do_el0_svc+0x2c/0x94
 el0_svc+0x28/0x80
 el0t_64_sync_handler+0xa8/0x130
 el0t_64_sync+0x1a0/0x1a4

The Oops happens because the I2C client data does not point to
max9286_priv anymore but to v4l2_subdev. The change happened in
max9286_init() which calls v4l2_i2c_subdev_init() later on...

Besides fixing the max9286_remove() function, remove the call to
i2c_set_clientdata() in max9286_probe(), to avoid confusion, and make
the necessary changes to max9286_init() so that it doesn't have to use
i2c_get_clientdata() in order to fetch the pointer to priv.

Fixes: 66d8c9d ("media: i2c: Add MAX9286 driver")
Signed-off-by: Laurentiu Palcu <laurentiu.palcu@oss.nxp.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

[ Upstream commit 6b9dbed ]

pty_write() invokes kmalloc() which may invoke a normal printk() to print
failure message.  This can cause a deadlock in the scenario reported by
syz-bot below:

       CPU0              CPU1                    CPU2
       ----              ----                    ----
                         lock(console_owner);
                                                 lock(&port_lock_key);
  lock(&port->lock);
                         lock(&port_lock_key);
                                                 lock(&port->lock);
  lock(console_owner);

As commit dbdda84 ("printk: Add console owner and waiter logic to
load balance console writes") said, such deadlock can be prevented by
using printk_deferred() in kmalloc() (which is invoked in the section
guarded by the port->lock).  But there are too many printk() on the
kmalloc() path, and kmalloc() can be called from anywhere, so changing
printk() to printk_deferred() is too complicated and inelegant.

Therefore, this patch chooses to specify __GFP_NOWARN to kmalloc(), so
that printk() will not be called, and this deadlock problem can be
avoided.

Syzbot reported the following lockdep error:

======================================================
WARNING: possible circular locking dependency detected
5.4.143-00237-g08ccc19a-dirty #10 Not tainted
------------------------------------------------------
syz-executor.4/29420 is trying to acquire lock:
ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1752 [inline]
ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: vprintk_emit+0x2ca/0x470 kernel/printk/printk.c:2023

but task is already holding lock:
ffff8880119c9158 (&port->lock){-.-.}-{2:2}, at: pty_write+0xf4/0x1f0 drivers/tty/pty.c:120

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&port->lock){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159
       tty_port_tty_get drivers/tty/tty_port.c:288 [inline]          		<-- lock(&port->lock);
       tty_port_default_wakeup+0x1d/0xb0 drivers/tty/tty_port.c:47
       serial8250_tx_chars+0x530/0xa80 drivers/tty/serial/8250/8250_port.c:1767
       serial8250_handle_irq.part.0+0x31f/0x3d0 drivers/tty/serial/8250/8250_port.c:1854
       serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1827 [inline] 	<-- lock(&port_lock_key);
       serial8250_default_handle_irq+0xb2/0x220 drivers/tty/serial/8250/8250_port.c:1870
       serial8250_interrupt+0xfd/0x200 drivers/tty/serial/8250/8250_core.c:126
       __handle_irq_event_percpu+0x109/0xa50 kernel/irq/handle.c:156
       [...]

-> #1 (&port_lock_key){-.-.}-{2:2}:
       __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
       _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159
       serial8250_console_write+0x184/0xa40 drivers/tty/serial/8250/8250_port.c:3198
										<-- lock(&port_lock_key);
       call_console_drivers kernel/printk/printk.c:1819 [inline]
       console_unlock+0x8cb/0xd00 kernel/printk/printk.c:2504
       vprintk_emit+0x1b5/0x470 kernel/printk/printk.c:2024			<-- lock(console_owner);
       vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394
       printk+0xba/0xed kernel/printk/printk.c:2084
       register_console+0x8b3/0xc10 kernel/printk/printk.c:2829
       univ8250_console_init+0x3a/0x46 drivers/tty/serial/8250/8250_core.c:681
       console_init+0x49d/0x6d3 kernel/printk/printk.c:2915
       start_kernel+0x5e9/0x879 init/main.c:713
       secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241

-> #0 (console_owner){....}-{0:0}:
       [...]
       lock_acquire+0x127/0x340 kernel/locking/lockdep.c:4734
       console_trylock_spinning kernel/printk/printk.c:1773 [inline]		<-- lock(console_owner);
       vprintk_emit+0x307/0x470 kernel/printk/printk.c:2023
       vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394
       printk+0xba/0xed kernel/printk/printk.c:2084
       fail_dump lib/fault-inject.c:45 [inline]
       should_fail+0x67b/0x7c0 lib/fault-inject.c:144
       __should_failslab+0x152/0x1c0 mm/failslab.c:33
       should_failslab+0x5/0x10 mm/slab_common.c:1224
       slab_pre_alloc_hook mm/slab.h:468 [inline]
       slab_alloc_node mm/slub.c:2723 [inline]
       slab_alloc mm/slub.c:2807 [inline]
       __kmalloc+0x72/0x300 mm/slub.c:3871
       kmalloc include/linux/slab.h:582 [inline]
       tty_buffer_alloc+0x23f/0x2a0 drivers/tty/tty_buffer.c:175
       __tty_buffer_request_room+0x156/0x2a0 drivers/tty/tty_buffer.c:273
       tty_insert_flip_string_fixed_flag+0x93/0x250 drivers/tty/tty_buffer.c:318
       tty_insert_flip_string include/linux/tty_flip.h:37 [inline]
       pty_write+0x126/0x1f0 drivers/tty/pty.c:122				<-- lock(&port->lock);
       n_tty_write+0xa7a/0xfc0 drivers/tty/n_tty.c:2356
       do_tty_write drivers/tty/tty_io.c:961 [inline]
       tty_write+0x512/0x930 drivers/tty/tty_io.c:1045
       __vfs_write+0x76/0x100 fs/read_write.c:494
       [...]

other info that might help us debug this:

Chain exists of:
  console_owner --> &port_lock_key --> &port->lock

Link: https://lkml.kernel.org/r/20220511061951.1114-2-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20220510113809.80626-2-zhengqi.arch@bytedance.com
Fixes: b6da31b ("tty: Fix data race in tty_insert_flip_string_fixed_flag")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Jiri Slaby <jirislaby@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit 6213f5d upstream.

Let's avoid false-alarmed lockdep warning.

[   58.914674] [T1501146] -> #2 (&sb->s_type->i_mutex_key#20){+.+.}-{3:3}:
[   58.915975] [T1501146] system_server:        down_write+0x7c/0xe0
[   58.916738] [T1501146] system_server:        f2fs_quota_sync+0x60/0x1a8
[   58.917563] [T1501146] system_server:        block_operations+0x16c/0x43c
[   58.918410] [T1501146] system_server:        f2fs_write_checkpoint+0x114/0x318
[   58.919312] [T1501146] system_server:        f2fs_issue_checkpoint+0x178/0x21c
[   58.920214] [T1501146] system_server:        f2fs_sync_fs+0x48/0x6c
[   58.920999] [T1501146] system_server:        f2fs_do_sync_file+0x334/0x738
[   58.921862] [T1501146] system_server:        f2fs_sync_file+0x30/0x48
[   58.922667] [T1501146] system_server:        __arm64_sys_fsync+0x84/0xf8
[   58.923506] [T1501146] system_server:        el0_svc_common.llvm.12821150825140585682+0xd8/0x20c
[   58.924604] [T1501146] system_server:        do_el0_svc+0x28/0xa0
[   58.925366] [T1501146] system_server:        el0_svc+0x24/0x38
[   58.926094] [T1501146] system_server:        el0_sync_handler+0x88/0xec
[   58.926920] [T1501146] system_server:        el0_sync+0x1b4/0x1c0

[   58.927681] [T1501146] -> #1 (&sbi->cp_global_sem){+.+.}-{3:3}:
[   58.928889] [T1501146] system_server:        down_write+0x7c/0xe0
[   58.929650] [T1501146] system_server:        f2fs_write_checkpoint+0xbc/0x318
[   58.930541] [T1501146] system_server:        f2fs_issue_checkpoint+0x178/0x21c
[   58.931443] [T1501146] system_server:        f2fs_sync_fs+0x48/0x6c
[   58.932226] [T1501146] system_server:        sync_filesystem+0xac/0x130
[   58.933053] [T1501146] system_server:        generic_shutdown_super+0x38/0x150
[   58.933958] [T1501146] system_server:        kill_block_super+0x24/0x58
[   58.934791] [T1501146] system_server:        kill_f2fs_super+0xcc/0x124
[   58.935618] [T1501146] system_server:        deactivate_locked_super+0x90/0x120
[   58.936529] [T1501146] system_server:        deactivate_super+0x74/0xac
[   58.937356] [T1501146] system_server:        cleanup_mnt+0x128/0x168
[   58.938150] [T1501146] system_server:        __cleanup_mnt+0x18/0x28
[   58.938944] [T1501146] system_server:        task_work_run+0xb8/0x14c
[   58.939749] [T1501146] system_server:        do_notify_resume+0x114/0x1e8
[   58.940595] [T1501146] system_server:        work_pending+0xc/0x5f0

[   58.941375] [T1501146] -> #0 (&sbi->gc_lock){+.+.}-{3:3}:
[   58.942519] [T1501146] system_server:        __lock_acquire+0x1270/0x2868
[   58.943366] [T1501146] system_server:        lock_acquire+0x114/0x294
[   58.944169] [T1501146] system_server:        down_write+0x7c/0xe0
[   58.944930] [T1501146] system_server:        f2fs_issue_checkpoint+0x13c/0x21c
[   58.945831] [T1501146] system_server:        f2fs_sync_fs+0x48/0x6c
[   58.946614] [T1501146] system_server:        f2fs_do_sync_file+0x334/0x738
[   58.947472] [T1501146] system_server:        f2fs_ioc_commit_atomic_write+0xc8/0x14c
[   58.948439] [T1501146] system_server:        __f2fs_ioctl+0x674/0x154c
[   58.949253] [T1501146] system_server:        f2fs_ioctl+0x54/0x88
[   58.950018] [T1501146] system_server:        __arm64_sys_ioctl+0xa8/0x110
[   58.950865] [T1501146] system_server:        el0_svc_common.llvm.12821150825140585682+0xd8/0x20c
[   58.951965] [T1501146] system_server:        do_el0_svc+0x28/0xa0
[   58.952727] [T1501146] system_server:        el0_svc+0x24/0x38
[   58.953454] [T1501146] system_server:        el0_sync_handler+0x88/0xec
[   58.954279] [T1501146] system_server:        el0_sync+0x1b4/0x1c0

Cc: stable@vger.kernel.org
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit c1cee4a upstream.

It can happen that the parent of a bfqq changes between the moment we
decide two queues are worth to merge (and set bic->stable_merge_bfqq)
and the moment bfq_setup_merge() is called. This can happen e.g. because
the process submitted IO for a different cgroup and thus bfqq got
reparented. It can even happen that the bfqq we are merging with has
parent cgroup that is already offline and going to be destroyed in which
case the merge can lead to use-after-free issues such as:

BUG: KASAN: use-after-free in __bfq_deactivate_entity+0x9cb/0xa50
Read of size 8 at addr ffff88800693c0c0 by task runc:[2:INIT]/10544

CPU: 0 PID: 10544 Comm: runc:[2:INIT] Tainted: G            E     5.15.2-0.g5fb85fd-default #1 openSUSE Tumbleweed (unreleased) f1f3b891c72369aebecd2e43e4641a6358867c70
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
Call Trace:
 <IRQ>
 dump_stack_lvl+0x46/0x5a
 print_address_description.constprop.0+0x1f/0x140
 ? __bfq_deactivate_entity+0x9cb/0xa50
 kasan_report.cold+0x7f/0x11b
 ? __bfq_deactivate_entity+0x9cb/0xa50
 __bfq_deactivate_entity+0x9cb/0xa50
 ? update_curr+0x32f/0x5d0
 bfq_deactivate_entity+0xa0/0x1d0
 bfq_del_bfqq_busy+0x28a/0x420
 ? resched_curr+0x116/0x1d0
 ? bfq_requeue_bfqq+0x70/0x70
 ? check_preempt_wakeup+0x52b/0xbc0
 __bfq_bfqq_expire+0x1a2/0x270
 bfq_bfqq_expire+0xd16/0x2160
 ? try_to_wake_up+0x4ee/0x1260
 ? bfq_end_wr_async_queues+0xe0/0xe0
 ? _raw_write_unlock_bh+0x60/0x60
 ? _raw_spin_lock_irq+0x81/0xe0
 bfq_idle_slice_timer+0x109/0x280
 ? bfq_dispatch_request+0x4870/0x4870
 __hrtimer_run_queues+0x37d/0x700
 ? enqueue_hrtimer+0x1b0/0x1b0
 ? kvm_clock_get_cycles+0xd/0x10
 ? ktime_get_update_offsets_now+0x6f/0x280
 hrtimer_interrupt+0x2c8/0x740

Fix the problem by checking that the parent of the two bfqqs we are
merging in bfq_setup_merge() is the same.

Link: https://lore.kernel.org/linux-block/20211125172809.GC19572@quack2.suse.cz/
CC: stable@vger.kernel.org
Fixes: 430a67f ("block, bfq: merge bursts of newly-created queues")
Tested-by: "yukuai (C)" <yukuai3@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220401102752.8599-2-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit f87c7a4 upstream.

Hulk Robot reported a BUG_ON:
 ==================================================================
 EXT4-fs error (device loop3): ext4_mb_generate_buddy:805: group 0,
 block bitmap and bg descriptor inconsistent: 25 vs 31513 free clusters
 kernel BUG at fs/ext4/ext4_jbd2.c:53!
 invalid opcode: 0000 [#1] SMP KASAN PTI
 CPU: 0 PID: 25371 Comm: syz-executor.3 Not tainted 5.10.0+ #1
 RIP: 0010:ext4_put_nojournal fs/ext4/ext4_jbd2.c:53 [inline]
 RIP: 0010:__ext4_journal_stop+0x10e/0x110 fs/ext4/ext4_jbd2.c:116
 [...]
 Call Trace:
  ext4_write_inline_data_end+0x59a/0x730 fs/ext4/inline.c:795
  generic_perform_write+0x279/0x3c0 mm/filemap.c:3344
  ext4_buffered_write_iter+0x2e3/0x3d0 fs/ext4/file.c:270
  ext4_file_write_iter+0x30a/0x11c0 fs/ext4/file.c:520
  do_iter_readv_writev+0x339/0x3c0 fs/read_write.c:732
  do_iter_write+0x107/0x430 fs/read_write.c:861
  vfs_writev fs/read_write.c:934 [inline]
  do_pwritev+0x1e5/0x380 fs/read_write.c:1031
 [...]
 ==================================================================

Above issue may happen as follows:
           cpu1                     cpu2
__________________________|__________________________
do_pwritev
 vfs_writev
  do_iter_write
   ext4_file_write_iter
    ext4_buffered_write_iter
     generic_perform_write
      ext4_da_write_begin
                           vfs_fallocate
                            ext4_fallocate
                             ext4_convert_inline_data
                              ext4_convert_inline_data_nolock
                               ext4_destroy_inline_data_nolock
                                clear EXT4_STATE_MAY_INLINE_DATA
                               ext4_map_blocks
                                ext4_ext_map_blocks
                                 ext4_mb_new_blocks
                                  ext4_mb_regular_allocator
                                   ext4_mb_good_group_nolock
                                    ext4_mb_init_group
                                     ext4_mb_init_cache
                                      ext4_mb_generate_buddy  --> error
       ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
                                ext4_restore_inline_data
                                 set EXT4_STATE_MAY_INLINE_DATA
       ext4_block_write_begin
      ext4_da_write_end
       ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)
       ext4_write_inline_data_end
        handle=NULL
        ext4_journal_stop(handle)
         __ext4_journal_stop
          ext4_put_nojournal(handle)
           ref_cnt = (unsigned long)handle
           BUG_ON(ref_cnt == 0)  ---> BUG_ON

The lock held by ext4_convert_inline_data is xattr_sem, but the lock
held by generic_perform_write is i_rwsem. Therefore, the two locks can
be concurrent.

To solve above issue, we add inode_lock() for ext4_convert_inline_data().
At the same time, move ext4_convert_inline_data() in front of
ext4_punch_hole(), remove similar handling from ext4_punch_hole().

Fixes: 0c8d414 ("ext4: let fallocate handle inline data correctly")
Cc: stable@vger.kernel.org
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220428134031.4153381-1-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit f4534c9 upstream.

We got issue as follows:
EXT4-fs error (device loop0) in ext4_reserve_inode_write:5741: Out of memory
EXT4-fs error (device loop0): ext4_setattr:5462: inode #13: comm syz-executor.0: mark_inode_dirty error
EXT4-fs error (device loop0) in ext4_setattr:5519: Out of memory
EXT4-fs error (device loop0): ext4_ind_map_blocks:595: inode #13: comm syz-executor.0: Can't allocate blocks for non-extent mapped inodes with bigalloc
------------[ cut here ]------------
WARNING: CPU: 1 PID: 4361 at fs/ext4/file.c:301 ext4_file_write_iter+0x11c9/0x1220
Modules linked in:
CPU: 1 PID: 4361 Comm: syz-executor.0 Not tainted 5.10.0+ #1
RIP: 0010:ext4_file_write_iter+0x11c9/0x1220
RSP: 0018:ffff924d80b27c00 EFLAGS: 00010282
RAX: ffffffff815a3379 RBX: 0000000000000000 RCX: 000000003b000000
RDX: ffff924d81601000 RSI: 00000000000009cc RDI: 00000000000009cd
RBP: 000000000000000d R08: ffffffffbc5a2c6b R09: 0000902e0e52a96f
R10: ffff902e2b7c1b40 R11: ffff902e2b7c1b40 R12: 000000000000000a
R13: 0000000000000001 R14: ffff902e0e52aa10 R15: ffffffffffffff8b
FS:  00007f81a7f65700(0000) GS:ffff902e3bc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff600400 CR3: 000000012db88001 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 do_iter_readv_writev+0x2e5/0x360
 do_iter_write+0x112/0x4c0
 do_pwritev+0x1e5/0x390
 __x64_sys_pwritev2+0x7e/0xa0
 do_syscall_64+0x37/0x50
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Above issue may happen as follows:
Assume
inode.i_size=4096
EXT4_I(inode)->i_disksize=4096

step 1: set inode->i_isize = 8192
ext4_setattr
  if (attr->ia_size != inode->i_size)
    EXT4_I(inode)->i_disksize = attr->ia_size;
    rc = ext4_mark_inode_dirty
       ext4_reserve_inode_write
          ext4_get_inode_loc
            __ext4_get_inode_loc
              sb_getblk --> return -ENOMEM
   ...
   if (!error)  ->will not update i_size
     i_size_write(inode, attr->ia_size);
Now:
inode.i_size=4096
EXT4_I(inode)->i_disksize=8192

step 2: Direct write 4096 bytes
ext4_file_write_iter
 ext4_dio_write_iter
   iomap_dio_rw ->return error
 if (extend)
   ext4_handle_inode_extension
     WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize);
->Then trigger warning.

To solve above issue, if mark inode dirty failed in ext4_setattr just
set 'EXT4_I(inode)->i_disksize' with old value.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20220326065351.761952-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit ef09ed5 upstream.

we got issue as follows:
EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls
------------[ cut here ]------------
kernel BUG at fs/ext4/inode.c:2708!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155
RIP: 0010:ext4_writepages+0x1977/0x1c10
RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000
RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002
RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000
R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001
R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028
FS:  00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 do_writepages+0x130/0x3a0
 filemap_fdatawrite_wbc+0x83/0xa0
 filemap_flush+0xab/0xe0
 ext4_alloc_da_blocks+0x51/0x120
 __ext4_ioctl+0x1534/0x3210
 __x64_sys_ioctl+0x12c/0x170
 do_syscall_64+0x3b/0x90

It may happen as follows:
1. write inline_data inode
vfs_write
  new_sync_write
    ext4_file_write_iter
      ext4_buffered_write_iter
        generic_perform_write
          ext4_da_write_begin
            ext4_da_write_inline_data_begin -> If inline data size too
            small will allocate block to write, then mapping will has
            dirty page
                ext4_da_convert_inline_data_to_extent ->clear EXT4_STATE_MAY_INLINE_DATA
2. fallocate
do_vfs_ioctl
  ioctl_preallocate
    vfs_fallocate
      ext4_fallocate
        ext4_convert_inline_data
          ext4_convert_inline_data_nolock
            ext4_map_blocks -> fail will goto restore data
            ext4_restore_inline_data
              ext4_create_inline_data
              ext4_write_inline_data
              ext4_set_inode_state -> set inode EXT4_STATE_MAY_INLINE_DATA
3. writepages
__ext4_ioctl
  ext4_alloc_da_blocks
    filemap_flush
      filemap_fdatawrite_wbc
        do_writepages
          ext4_writepages
            if (ext4_has_inline_data(inode))
              BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))

The root cause of this issue is we destory inline data until call
ext4_writepages under delay allocation mode.  But there maybe already
convert from inline to extent.  To solve this issue, we call
filemap_flush first..

Cc: stable@kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220516122634.1690462-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit 863e0d8 upstream.

When user_dlm_destroy_lock failed, it didn't clean up the flags it set
before exit.  For USER_LOCK_IN_TEARDOWN, if this function fails because of
lock is still in used, next time when unlink invokes this function, it
will return succeed, and then unlink will remove inode and dentry if lock
is not in used(file closed), but the dlm lock is still linked in dlm lock
resource, then when bast come in, it will trigger a panic due to
user-after-free.  See the following panic call trace.  To fix this,
USER_LOCK_IN_TEARDOWN should be reverted if fail.  And also error should
be returned if USER_LOCK_IN_TEARDOWN is set to let user know that unlink
fail.

For the case of ocfs2_dlm_unlock failure, besides USER_LOCK_IN_TEARDOWN,
USER_LOCK_BUSY is also required to be cleared.  Even though spin lock is
released in between, but USER_LOCK_IN_TEARDOWN is still set, for
USER_LOCK_BUSY, if before every place that waits on this flag,
USER_LOCK_IN_TEARDOWN is checked to bail out, that will make sure no flow
waits on the busy flag set by user_dlm_destroy_lock(), then we can
simplely revert USER_LOCK_BUSY when ocfs2_dlm_unlock fails.  Fix
user_dlm_cluster_lock() which is the only function not following this.

[  941.336392] (python,26174,16):dlmfs_unlink:562 ERROR: unlink
004fb0000060000b5a90b8c847b72e1, error -16 from destroy
[  989.757536] ------------[ cut here ]------------
[  989.757709] kernel BUG at fs/ocfs2/dlmfs/userdlm.c:173!
[  989.757876] invalid opcode: 0000 [#1] SMP
[  989.758027] Modules linked in: ksplice_2zhuk2jr_ib_ipoib_new(O)
ksplice_2zhuk2jr(O) mptctl mptbase xen_netback xen_blkback xen_gntalloc
xen_gntdev xen_evtchn cdc_ether usbnet mii ocfs2 jbd2 rpcsec_gss_krb5
auth_rpcgss nfsv4 nfsv3 nfs_acl nfs fscache lockd grace ocfs2_dlmfs
ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc
fcoe libfcoe libfc scsi_transport_fc sunrpc ipmi_devintf bridge stp llc
rds_rdma rds bonding ib_sdp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
rdma_cm ib_cm iw_cm falcon_lsm_serviceable(PE) falcon_nf_netcontain(PE)
mlx4_vnic falcon_kal(E) falcon_lsm_pinned_13402(E) mlx4_ib ib_sa ib_mad
ib_core ib_addr xenfs xen_privcmd dm_multipath iTCO_wdt iTCO_vendor_support
pcspkr sb_edac edac_core i2c_i801 lpc_ich mfd_core ipmi_ssif i2c_core ipmi_si
ipmi_msghandler
[  989.760686]  ioatdma sg ext3 jbd mbcache sd_mod ahci libahci ixgbe dca ptp
pps_core vxlan udp_tunnel ip6_udp_tunnel megaraid_sas mlx4_core crc32c_intel
be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio
libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi wmi
dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
ksplice_2zhuk2jr_ib_ipoib_old]
[  989.761987] CPU: 10 PID: 19102 Comm: dlm_thread Tainted: P           OE
4.1.12-124.57.1.el6uek.x86_64 #2
[  989.762290] Hardware name: Oracle Corporation ORACLE SERVER
X5-2/ASM,MOTHERBOARD,1U, BIOS 30350100 06/17/2021
[  989.762599] task: ffff880178af6200 ti: ffff88017f7c8000 task.ti:
ffff88017f7c8000
[  989.762848] RIP: e030:[<ffffffffc07d4316>]  [<ffffffffc07d4316>]
__user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs]
[  989.763185] RSP: e02b:ffff88017f7cbcb8  EFLAGS: 00010246
[  989.763353] RAX: 0000000000000000 RBX: ffff880174d48008 RCX:
0000000000000003
[  989.763565] RDX: 0000000000120012 RSI: 0000000000000003 RDI:
ffff880174d48170
[  989.763778] RBP: ffff88017f7cbcc8 R08: ffff88021f4293b0 R09:
0000000000000000
[  989.763991] R10: ffff880179c8c000 R11: 0000000000000003 R12:
ffff880174d48008
[  989.764204] R13: 0000000000000003 R14: ffff880179c8c000 R15:
ffff88021db7a000
[  989.764422] FS:  0000000000000000(0000) GS:ffff880247480000(0000)
knlGS:ffff880247480000
[  989.764685] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[  989.764865] CR2: ffff8000007f6800 CR3: 0000000001ae0000 CR4:
0000000000042660
[  989.765081] Stack:
[  989.765167]  0000000000000003 ffff880174d48040 ffff88017f7cbd18
ffffffffc07d455f
[  989.765442]  ffff88017f7cbd88 ffffffff816fb639 ffff88017f7cbd38
ffff8800361b5600
[  989.765717]  ffff88021db7a000 ffff88021f429380 0000000000000003
ffffffffc0453020
[  989.765991] Call Trace:
[  989.766093]  [<ffffffffc07d455f>] user_bast+0x5f/0xf0 [ocfs2_dlmfs]
[  989.766287]  [<ffffffff816fb639>] ? schedule_timeout+0x169/0x2d0
[  989.766475]  [<ffffffffc0453020>] ? o2dlm_lock_ast_wrapper+0x20/0x20
[ocfs2_stack_o2cb]
[  989.766738]  [<ffffffffc045303a>] o2dlm_blocking_ast_wrapper+0x1a/0x20
[ocfs2_stack_o2cb]
[  989.767010]  [<ffffffffc0864ec6>] dlm_do_local_bast+0x46/0xe0 [ocfs2_dlm]
[  989.767217]  [<ffffffffc084f5cc>] ? dlm_lockres_calc_usage+0x4c/0x60
[ocfs2_dlm]
[  989.767466]  [<ffffffffc08501f1>] dlm_thread+0xa31/0x1140 [ocfs2_dlm]
[  989.767662]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
[  989.767834]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
[  989.768006]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
[  989.768178]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
[  989.768349]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
[  989.768521]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
[  989.768693]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
[  989.768893]  [<ffffffff816f78ce>] ? __schedule+0x23e/0x810
[  989.769067]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
[  989.769241]  [<ffffffff810ce4d0>] ? wait_woken+0x90/0x90
[  989.769411]  [<ffffffffc084f7c0>] ? dlm_kick_thread+0x80/0x80 [ocfs2_dlm]
[  989.769617]  [<ffffffff810a8bbb>] kthread+0xcb/0xf0
[  989.769774]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
[  989.769945]  [<ffffffff816f78da>] ? __schedule+0x24a/0x810
[  989.770117]  [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180
[  989.770321]  [<ffffffff816fdaa1>] ret_from_fork+0x61/0x90
[  989.770492]  [<ffffffff810a8af0>] ? kthread_create_on_node+0x180/0x180
[  989.770689] Code: d0 00 00 00 f0 45 7d c0 bf 00 20 00 00 48 89 83 c0 00 00
00 48 89 83 c8 00 00 00 e8 55 c1 8c c0 83 4b 04 10 48 83 c4 08 5b 5d c3 <0f>
0b 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 55 41 54 53 48 83
[  989.771892] RIP  [<ffffffffc07d4316>]
__user_dlm_queue_lockres.part.4+0x76/0x80 [ocfs2_dlmfs]
[  989.772174]  RSP <ffff88017f7cbcb8>
[  989.772704] ---[ end trace ebd1e38cebcc93a8 ]---
[  989.772907] Kernel panic - not syncing: Fatal exception
[  989.773173] Kernel Offset: disabled

Link: https://lkml.kernel.org/r/20220518235224.87100-2-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit 31ab27b upstream.

Submitting a cs with 0 chunks, causes an oops later, found trying
to execute the wrong userspace driver.

MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo

[172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8
[172536.665188] #PF: supervisor read access in kernel mode
[172536.665189] #PF: error_code(0x0000) - not-present page
[172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0
[172536.665195] Oops: 0000 [#1] SMP NOPTI
[172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P           O      5.10.81 #1-NixOS
[172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015
[172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu]
[172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10
[172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246
[172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68
[172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38
[172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40
[172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28
[172536.665283] FS:  00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000
[172536.665284] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0
[172536.665287] Call Trace:
[172536.665322]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665332]  drm_ioctl_kernel+0xaa/0xf0 [drm]
[172536.665338]  drm_ioctl+0x201/0x3b0 [drm]
[172536.665369]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665372]  ? selinux_file_ioctl+0x135/0x230
[172536.665399]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[172536.665403]  __x64_sys_ioctl+0x83/0xb0
[172536.665406]  do_syscall_64+0x33/0x40
[172536.665409]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit 7d54c15 upstream.

We see the following GPF when register_ftrace_direct fails:

[ ] general protection fault, probably for non-canonical address \
  0x200000000000010: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI
[...]
[ ] RIP: 0010:ftrace_find_rec_direct+0x53/0x70
[ ] Code: 48 c1 e0 03 48 03 42 08 48 8b 10 31 c0 48 85 d2 74 [...]
[ ] RSP: 0018:ffffc9000138bc10 EFLAGS: 00010206
[ ] RAX: 0000000000000000 RBX: ffffffff813e0df0 RCX: 000000000000003b
[ ] RDX: 0200000000000000 RSI: 000000000000000c RDI: ffffffff813e0df0
[ ] RBP: ffffffffa00a3000 R08: ffffffff81180ce0 R09: 0000000000000001
[ ] R10: ffffc9000138bc18 R11: 0000000000000001 R12: ffffffff813e0df0
[ ] R13: ffffffff813e0df0 R14: ffff888171b56400 R15: 0000000000000000
[ ] FS:  00007fa9420c7780(0000) GS:ffff888ff6a00000(0000) knlGS:000000000
[ ] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ ] CR2: 000000000770d000 CR3: 0000000107d50003 CR4: 0000000000370ee0
[ ] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ ] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ ] Call Trace:
[ ]  <TASK>
[ ]  register_ftrace_direct+0x54/0x290
[ ]  ? render_sigset_t+0xa0/0xa0
[ ]  bpf_trampoline_update+0x3f5/0x4a0
[ ]  ? 0xffffffffa00a3000
[ ]  bpf_trampoline_link_prog+0xa9/0x140
[ ]  bpf_tracing_prog_attach+0x1dc/0x450
[ ]  bpf_raw_tracepoint_open+0x9a/0x1e0
[ ]  ? find_held_lock+0x2d/0x90
[ ]  ? lock_release+0x150/0x430
[ ]  __sys_bpf+0xbd6/0x2700
[ ]  ? lock_is_held_type+0xd8/0x130
[ ]  __x64_sys_bpf+0x1c/0x20
[ ]  do_syscall_64+0x3a/0x80
[ ]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ ] RIP: 0033:0x7fa9421defa9
[ ] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 9 f8 [...]
[ ] RSP: 002b:00007ffed743bd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
[ ] RAX: ffffffffffffffda RBX: 00000000069d2480 RCX: 00007fa9421defa9
[ ] RDX: 0000000000000078 RSI: 00007ffed743bd80 RDI: 0000000000000011
[ ] RBP: 00007ffed743be00 R08: 0000000000bb7270 R09: 0000000000000000
[ ] R10: 00000000069da210 R11: 0000000000000246 R12: 0000000000000001
[ ] R13: 00007ffed743c4b0 R14: 00000000069d2480 R15: 0000000000000001
[ ]  </TASK>
[ ] Modules linked in: klp_vm(OK)
[ ] ---[ end trace 0000000000000000 ]---

One way to trigger this is:
  1. load a livepatch that patches kernel function xxx;
  2. run bpftrace -e 'kfunc:xxx {}', this will fail (expected for now);
  3. repeat #2 => gpf.

This is because the entry is added to direct_functions, but not removed.
Fix this by remove the entry from direct_functions when
register_ftrace_direct fails.

Also remove the last trailing space from ftrace.c, so we don't have to
worry about it anymore.

Link: https://lkml.kernel.org/r/20220524170839.900849-1-song@kernel.org

Cc: stable@vger.kernel.org
Fixes: 763e34e ("ftrace: Add register_ftrace_direct()")
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
BugLink: https://bugs.launchpad.net/bugs/1981864

commit 0f2571a upstream.

In normal stop process, it does like this:
   do_md_stop
      |
   __md_stop (pers->free(); mddev->private=NULL)
      |
   md_free (free mddev)
__md_stop sets mddev->private to NULL after pers->free. The raid device
will be stopped and mddev memory is free. But in reshape, it doesn't
free the mddev and mddev will still be used in new raid.

In reshape, it first sets mddev->private to new_pers and then runs
old_pers->free(). Now raid0 sets mddev->private to NULL in raid0_free.
The new raid can't work anymore. It will panic when dereference
mddev->private because of NULL pointer dereference.

It can panic like this:
[63010.814972] kernel BUG at drivers/md/raid10.c:928!
[63010.819778] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[63010.825011] CPU: 3 PID: 44437 Comm: md0_resync Kdump: loaded Not tainted 5.14.0-86.el9.x86_64 #1
[63010.833789] Hardware name: Dell Inc. PowerEdge R6415/07YXFK, BIOS 1.15.0 09/11/2020
[63010.841440] RIP: 0010:raise_barrier+0x161/0x170 [raid10]
[63010.865508] RSP: 0018:ffffc312408bbc10 EFLAGS: 00010246
[63010.870734] RAX: 0000000000000000 RBX: ffffa00bf7d39800 RCX: 0000000000000000
[63010.877866] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffa00bf7d39800
[63010.884999] RBP: 0000000000000000 R08: fffffa4945e74400 R09: 0000000000000000
[63010.892132] R10: ffffa00eed02f798 R11: 0000000000000000 R12: ffffa00bbc435200
[63010.899266] R13: ffffa00bf7d39800 R14: 0000000000000400 R15: 0000000000000003
[63010.906399] FS:  0000000000000000(0000) GS:ffffa00eed000000(0000) knlGS:0000000000000000
[63010.914485] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[63010.920229] CR2: 00007f5cfbe99828 CR3: 0000000105efe000 CR4: 00000000003506e0
[63010.927363] Call Trace:
[63010.929822]  ? bio_reset+0xe/0x40
[63010.933144]  ? raid10_alloc_init_r10buf+0x60/0xa0 [raid10]
[63010.938629]  raid10_sync_request+0x756/0x1610 [raid10]
[63010.943770]  md_do_sync.cold+0x3e4/0x94c
[63010.947698]  md_thread+0xab/0x160
[63010.951024]  ? md_write_inc+0x50/0x50
[63010.954688]  kthread+0x149/0x170
[63010.957923]  ? set_kthread_struct+0x40/0x40
[63010.962107]  ret_from_fork+0x22/0x30

Removing the code that sets mddev->private to NULL in raid0 can fix
problem.

Fixes: 0c031fd (md: Move alloc/free acct bioset in to personality)
Reported-by: Fine Fan <ffan@redhat.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
starnight pushed a commit that referenced this pull request Aug 18, 2022
We would like to make chip_info table const, but 8821c uses one field as
a variable, and causes core dump. To fix this, move the field to another
struct that can be read and written.

BUG: unable to handle page fault for address: ffffffffc09f52f4
PGD 5b5215067 P4D 5b5215067 PUD 5b5217067 PMD 111f61067 PTE 8000000111e07161
Oops: 0003 [#1] PREEMPT SMP NOPTI
CPU: 6 PID: 436 Comm: NetworkManager Not tainted 5.18.0-rc7-debug-01822-g89d8f53ff6e7 #1 5cac31ca93432e53341863abfb3332fd98b144da
Hardware name: HP HP Desktop M01-F1xxx/87D6, BIOS F.12 12/17/2020
RIP: 0010:rtw8821c_phy_set_param+0x262/0x380 [rtw88_8821c]
Code: e8 53 f3 c0 d6 48 8b 43 10 4c 8b 63 38 be 24 0a 00 00 48 89 df 48
 8b 40 68 e8 3a f3 c0 d6 89 e9 be 28 0a 00 00 48 89 df d3 e8 <41> 89 84
 24 54 01 00 00 48 8b 43 10 4c 8b 63 38 48 8b 40 68 e8 15
RSP: 0018:ffffb08c417cb6f0 EFLAGS: 00010286
RAX: 0000000064b80c1c RBX: ffff93d15a0120e0 RCX: 0000000000000000
RDX: 0000000034028211 RSI: 0000000000000a28 RDI: ffff93d15a0120e0
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000006 R12: ffffffffc09f51a0
R13: ffff93d15a0156d0 R14: 0000000000000000 R15: 0000000000000001
FS:  00007f4e9b73d1c0(0000) GS:ffff93d83ab80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffc09f52f4 CR3: 0000000103b9e000 CR4: 0000000000350ee0
Call Trace:
 <TASK>
 rtw_core_start+0xbd/0x190 [rtw88_core de79d6bdfd083d102030858972032e5706726279]
 rtw_ops_start+0x26/0x40 [rtw88_core de79d6bdfd083d102030858972032e5706726279]
 drv_start+0x42/0x100 [mac80211 21e803d0ad10691f64c6c81ecc24c0c6c36e5d58]
 ieee80211_do_open+0x2fb/0x900 [mac80211 21e803d0ad10691f64c6c81ecc24c0c6c36e5d58]
 ieee80211_open+0x67/0x80 [mac80211 21e803d0ad10691f64c6c81ecc24c0c6c36e5d58]
 __dev_open+0xdd/0x180
 [...]

Fixes: 89d8f53 ("wifi: rtw88: Fix Sparse warning for rtw8821c_hw_spec")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220608020312.9663-1-pkshih@realtek.com
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants