su without password via /proc/pid/mem write #2

Closed
andrew-aladev opened this Issue Jan 23, 2012 · 1 comment

Projects

None yet

2 participants

@andrew-aladev

http://blog.zx2c4.com/749
your fs/proc/base.c is vulnerable too

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
Arvid Brodin usb/isp1760: Simpler queue head list code.
Small code refactoring to ease the real fix in patch #2.

Signed-off-by: Arvid Brodin <arvid.brodin@enea.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
e08f6a2
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
Sarang Radke [SCSI] qla4xxx: fix call trace on rmmod with ql4xdontresethba=1
abort all active commands from eh_host_reset in-case
of ql4xdontresethba=1

Fix following call trace:-
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: qla4_8xxx_disable_msix: qla4xxx (rsp_q)
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A disabled
Nov 21 14:50:47 172.17.140.111 slab error in kmem_cache_destroy(): cache `qla4xxx_srbs': Can't free all objects
Nov 21 14:50:47 172.17.140.111 Pid: 9154, comm: rmmod Tainted: G           O 3.2.0-rc2+ #2
Nov 21 14:50:47 172.17.140.111 Call Trace:
Nov 21 14:50:47 172.17.140.111  [<c051231a>] ? kmem_cache_destroy+0x9a/0xb0
Nov 21 14:50:47 172.17.140.111  [<c0489c4a>] ? sys_delete_module+0x14a/0x210
Nov 21 14:50:47 172.17.140.111  [<c04fd552>] ? do_munmap+0x202/0x280
Nov 21 14:50:47 172.17.140.111  [<c04a6d4e>] ? audit_syscall_entry+0x1ae/0x1d0
Nov 21 14:50:47 172.17.140.111  [<c083019f>] ? sysenter_do_call+0x12/0x28
Nov 21 14:51:50 172.17.140.111 SLAB: cache with size 64 has lost its name
Nov 21 14:51:50 172.17.140.111 iscsi: registered transport (qla4xxx)
Nov 21 14:51:50 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A -> GSI 28 (level, low) -> IRQ 28

Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
8a28896
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
@borntraeger borntraeger [S390] kvm: fix sleeping function ... at mm/page_alloc.c:2260
commit cc77245
    [S390] fix list corruption in gmap reverse mapping

added a potential dead lock:

BUG: sleeping function called from invalid context at mm/page_alloc.c:2260
in_atomic(): 1, irqs_disabled(): 0, pid: 1108, name: qemu-system-s39
3 locks held by qemu-system-s39/1108:
 #0:  (&kvm->slots_lock){+.+.+.}, at: [<000003e004866542>] kvm_set_memory_region+0x3a/0x6c [kvm]
 #1:  (&mm->mmap_sem){++++++}, at: [<0000000000123790>] gmap_map_segment+0x9c/0x298
 #2:  (&(&mm->page_table_lock)->rlock){+.+.+.}, at: [<00000000001237a8>] gmap_map_segment+0xb4/0x298
CPU: 0 Not tainted 3.1.3 #45
Process qemu-system-s39 (pid: 1108, task: 00000004f8b3cb30, ksp: 00000004fd5978d0)
00000004fd5979a0 00000004fd597920 0000000000000002 0000000000000000
       00000004fd5979c0 00000004fd597938 00000004fd597938 0000000000617e96
       0000000000000000 00000004f8b3cf58 0000000000000000 0000000000000000
       000000000000000d 000000000000000c 00000004fd597988 0000000000000000
       0000000000000000 0000000000100a18 00000004fd597920 00000004fd597960
Call Trace:
([<0000000000100926>] show_trace+0xee/0x144)
 [<0000000000131f3a>] __might_sleep+0x12a/0x158
 [<0000000000217fb4>] __alloc_pages_nodemask+0x224/0xadc
 [<0000000000123086>] gmap_alloc_table+0x46/0x114
 [<000000000012395c>] gmap_map_segment+0x268/0x298
 [<000003e00486b014>] kvm_arch_commit_memory_region+0x44/0x6c [kvm]
 [<000003e004866414>] __kvm_set_memory_region+0x3b0/0x4a4 [kvm]
 [<000003e004866554>] kvm_set_memory_region+0x4c/0x6c [kvm]
 [<000003e004867c7a>] kvm_vm_ioctl+0x14a/0x314 [kvm]
 [<0000000000292100>] do_vfs_ioctl+0x94/0x588
 [<0000000000292688>] SyS_ioctl+0x94/0xac
 [<000000000061e124>] sysc_noemu+0x22/0x28
 [<000003fffcd5e7ca>] 0x3fffcd5e7ca
3 locks held by qemu-system-s39/1108:
 #0:  (&kvm->slots_lock){+.+.+.}, at: [<000003e004866542>] kvm_set_memory_region+0x3a/0x6c [kvm]
 #1:  (&mm->mmap_sem){++++++}, at: [<0000000000123790>] gmap_map_segment+0x9c/0x298
 #2:  (&(&mm->page_table_lock)->rlock){+.+.+.}, at: [<00000000001237a8>] gmap_map_segment+0xb4/0x298

Fix this by freeing the lock on the alloc path. This is ok, since the
gmap table is never freed until we call gmap_free, so the table we are
walking cannot go.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
c86cce2
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
@htejun htejun mempool: fix and document synchronization and memory barrier usage
mempool_alloc/free() use undocumented smp_mb()'s.  The code is slightly
broken and misleading.

The lockless part is in mempool_free().  It wants to determine whether the
item being freed needs to be returned to the pool or backing allocator
without grabbing pool->lock.  Two things need to be guaranteed for correct
operation.

1. pool->curr_nr + #allocated should never dip below pool->min_nr.
2. Waiters shouldn't be left dangling.

For #1, The only necessary condition is that curr_nr visible at free is
from after the allocation of the element being freed (details in the
comment).  For most cases, this is true without any barrier but there can
be fringe cases where the allocated pointer is passed to the freeing task
without going through memory barriers.  To cover this case, wmb is
necessary before returning from allocation and rmb is necessary before
reading curr_nr.  IOW,

	ALLOCATING TASK			FREEING TASK

	update pool state after alloc;
	wmb();
	pass pointer to freeing task;
					read pointer;
					rmb();
					read pool state to free;

The current code doesn't have wmb after pool update during allocation and
may theoretically, on machines where unlock doesn't behave as full wmb,
lead to pool depletion and deadlock.  smp_wmb() needs to be added after
successful allocation from reserved elements and smp_mb() in
mempool_free() can be replaced with smp_rmb().

For #2, the waiter needs to add itself to waitqueue and then check the
wait condition and the waker needs to update the wait condition and then
wake up.  Because waitqueue operations always go through full spinlock
synchronization, there is no need for extra memory barriers.

Furthermore, mempool_alloc() is already holding pool->lock when it decides
that it needs to wait.  There is no reason to do unlock - add waitqueue -
test condition again.  It can simply add itself to waitqueue while holding
pool->lock and then unlock and sleep.

This patch adds smp_wmb() after successful allocation from reserved pool,
replaces smp_mb() in mempool_free() with smp_rmb() and extend pool->lock
over waitqueue addition.  More importantly, it explains what memory
barriers do and how the lockless testing is correct.

-v2: Oleg pointed out that unlock doesn't imply wmb.  Added explicit
     smp_wmb() after successful allocation from reserved pool and
     updated comments accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5b99054
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
Glauber Costa net: introduce res_counter_charge_nofail() for socket allocations
There is a case in __sk_mem_schedule(), where an allocation
is beyond the maximum, but yet we are allowed to proceed.
It happens under the following condition:

	sk->sk_wmem_queued + size >= sk->sk_sndbuf

The network code won't revert the allocation in this case,
meaning that at some point later it'll try to do it. Since
this is never communicated to the underlying res_counter
code, there is an inbalance in res_counter uncharge operation.

I see two ways of fixing this:

1) storing the information about those allocations somewhere
   in memcg, and then deducting from that first, before
   we start draining the res_counter,
2) providing a slightly different allocation function for
   the res_counter, that matches the original behavior of
   the network code more closely.

I decided to go for #2 here, believing it to be more elegant,
since #1 would require us to do basically that, but in a more
obscure way.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
CC: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Laurent Chavey <chavey@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0e90b31
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
@ming1 ming1 HID: usbhid: fix dead lock between open and disconect
There is no reason to hold hiddev->existancelock before
calling usb_deregister_dev, so move it out of the lock.

The patch fixes the lockdep warning below.

[ 5733.386271] ======================================================
[ 5733.386274] [ INFO: possible circular locking dependency detected ]
[ 5733.386278] 3.2.0-custom-next-20120111+ #1 Not tainted
[ 5733.386281] -------------------------------------------------------
[ 5733.386284] khubd/186 is trying to acquire lock:
[ 5733.386288]  (minor_rwsem){++++.+}, at: [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386311]
[ 5733.386312] but task is already holding lock:
[ 5733.386315]  (&hiddev->existancelock){+.+...}, at: [<ffffffffa0094d17>] hiddev_disconnect+0x26/0x87 [usbhid]
[ 5733.386328]
[ 5733.386329] which lock already depends on the new lock.
[ 5733.386330]
[ 5733.386333]
[ 5733.386334] the existing dependency chain (in reverse order) is:
[ 5733.386336]
[ 5733.386337] -> #1 (&hiddev->existancelock){+.+...}:
[ 5733.386346]        [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386357]        [<ffffffff813df961>] __mutex_lock_common+0x60/0x465
[ 5733.386366]        [<ffffffff813dfe4d>] mutex_lock_nested+0x36/0x3b
[ 5733.386371]        [<ffffffffa0094ad6>] hiddev_open+0x113/0x193 [usbhid]
[ 5733.386378]        [<ffffffffa0011971>] usb_open+0x66/0xc2 [usbcore]
[ 5733.386390]        [<ffffffff8111a8b5>] chrdev_open+0x12b/0x154
[ 5733.386402]        [<ffffffff811159a8>] __dentry_open.isra.16+0x20b/0x355
[ 5733.386408]        [<ffffffff811165dc>] nameidata_to_filp+0x43/0x4a
[ 5733.386413]        [<ffffffff81122ed5>] do_last+0x536/0x570
[ 5733.386419]        [<ffffffff8112300b>] path_openat+0xce/0x301
[ 5733.386423]        [<ffffffff81123327>] do_filp_open+0x33/0x81
[ 5733.386427]        [<ffffffff8111664d>] do_sys_open+0x6a/0xfc
[ 5733.386431]        [<ffffffff811166fb>] sys_open+0x1c/0x1e
[ 5733.386434]        [<ffffffff813e7c79>] system_call_fastpath+0x16/0x1b
[ 5733.386441]
[ 5733.386441] -> #0 (minor_rwsem){++++.+}:
[ 5733.386448]        [<ffffffff8108255d>] __lock_acquire+0xa80/0xd74
[ 5733.386454]        [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386458]        [<ffffffff813e01f5>] down_write+0x44/0x77
[ 5733.386464]        [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386475]        [<ffffffffa0094d2d>] hiddev_disconnect+0x3c/0x87 [usbhid]
[ 5733.386483]        [<ffffffff8132df51>] hid_disconnect+0x3f/0x54
[ 5733.386491]        [<ffffffff8132dfb4>] hid_device_remove+0x4e/0x7a
[ 5733.386496]        [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386502]        [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386507]        [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386512]        [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386519]        [<ffffffff8132def3>] hid_destroy_device+0x1e/0x3d
[ 5733.386525]        [<ffffffffa00916b0>] usbhid_disconnect+0x36/0x42 [usbhid]
[ 5733.386530]        [<ffffffffa000fb60>] usb_unbind_interface+0x57/0x11f [usbcore]
[ 5733.386542]        [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386547]        [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386552]        [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386557]        [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386562]        [<ffffffffa000de61>] usb_disable_device+0xa8/0x1d8 [usbcore]
[ 5733.386573]        [<ffffffffa0006bd2>] usb_disconnect+0xab/0x11f [usbcore]
[ 5733.386583]        [<ffffffffa0008aa0>] hub_thread+0x73b/0x1157 [usbcore]
[ 5733.386593]        [<ffffffff8105dc0f>] kthread+0x95/0x9d
[ 5733.386601]        [<ffffffff813e90b4>] kernel_thread_helper+0x4/0x10
[ 5733.386607]
[ 5733.386608] other info that might help us debug this:
[ 5733.386609]
[ 5733.386612]  Possible unsafe locking scenario:
[ 5733.386613]
[ 5733.386615]        CPU0                    CPU1
[ 5733.386618]        ----                    ----
[ 5733.386620]   lock(&hiddev->existancelock);
[ 5733.386625]                                lock(minor_rwsem);
[ 5733.386630]                                lock(&hiddev->existancelock);
[ 5733.386635]   lock(minor_rwsem);
[ 5733.386639]
[ 5733.386640]  *** DEADLOCK ***
[ 5733.386641]
[ 5733.386644] 6 locks held by khubd/186:
[ 5733.386646]  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffffa00084af>] hub_thread+0x14a/0x1157 [usbcore]
[ 5733.386661]  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffffa0006b77>] usb_disconnect+0x50/0x11f [usbcore]
[ 5733.386677]  #2:  (hcd->bandwidth_mutex){+.+.+.}, at: [<ffffffffa0006bc8>] usb_disconnect+0xa1/0x11f [usbcore]
[ 5733.386693]  #3:  (&__lockdep_no_validate__){......}, at: [<ffffffff812c09bb>] device_release_driver+0x18/0x2d
[ 5733.386704]  #4:  (&__lockdep_no_validate__){......}, at: [<ffffffff812c09bb>] device_release_driver+0x18/0x2d
[ 5733.386714]  #5:  (&hiddev->existancelock){+.+...}, at: [<ffffffffa0094d17>] hiddev_disconnect+0x26/0x87 [usbhid]
[ 5733.386727]
[ 5733.386727] stack backtrace:
[ 5733.386731] Pid: 186, comm: khubd Not tainted 3.2.0-custom-next-20120111+ #1
[ 5733.386734] Call Trace:
[ 5733.386741]  [<ffffffff81062881>] ? up+0x34/0x3b
[ 5733.386747]  [<ffffffff813d9ef3>] print_circular_bug+0x1f8/0x209
[ 5733.386752]  [<ffffffff8108255d>] __lock_acquire+0xa80/0xd74
[ 5733.386756]  [<ffffffff810808b4>] ? trace_hardirqs_on_caller+0x15d/0x1a3
[ 5733.386763]  [<ffffffff81043a3f>] ? vprintk+0x3f4/0x419
[ 5733.386774]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386779]  [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386789]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386797]  [<ffffffff813e01f5>] down_write+0x44/0x77
[ 5733.386807]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386818]  [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386825]  [<ffffffffa0094d2d>] hiddev_disconnect+0x3c/0x87 [usbhid]
[ 5733.386830]  [<ffffffff8132df51>] hid_disconnect+0x3f/0x54
[ 5733.386834]  [<ffffffff8132dfb4>] hid_device_remove+0x4e/0x7a
[ 5733.386839]  [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386844]  [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386848]  [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386854]  [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386859]  [<ffffffff8132def3>] hid_destroy_device+0x1e/0x3d
[ 5733.386865]  [<ffffffffa00916b0>] usbhid_disconnect+0x36/0x42 [usbhid]
[ 5733.386876]  [<ffffffffa000fb60>] usb_unbind_interface+0x57/0x11f [usbcore]
[ 5733.386882]  [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386886]  [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386890]  [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386895]  [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386905]  [<ffffffffa000de61>] usb_disable_device+0xa8/0x1d8 [usbcore]
[ 5733.386916]  [<ffffffffa0006bd2>] usb_disconnect+0xab/0x11f [usbcore]
[ 5733.386921]  [<ffffffff813dff82>] ? __mutex_unlock_slowpath+0x130/0x141
[ 5733.386929]  [<ffffffffa0008aa0>] hub_thread+0x73b/0x1157 [usbcore]
[ 5733.386935]  [<ffffffff8106a51d>] ? finish_task_switch+0x78/0x150
[ 5733.386941]  [<ffffffff8105e396>] ? __init_waitqueue_head+0x4c/0x4c
[ 5733.386950]  [<ffffffffa0008365>] ? usb_remote_wakeup+0x56/0x56 [usbcore]
[ 5733.386955]  [<ffffffff8105dc0f>] kthread+0x95/0x9d
[ 5733.386961]  [<ffffffff813e90b4>] kernel_thread_helper+0x4/0x10
[ 5733.386966]  [<ffffffff813e24b8>] ? retint_restore_args+0x13/0x13
[ 5733.386970]  [<ffffffff8105db7a>] ? __init_kthread_worker+0x55/0x55
[ 5733.386974]  [<ffffffff813e90b0>] ? gs_change+0x13/0x13

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ba18311
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
Stefano Stabellini xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
CC: stable@kernel.org #2.6.37 and onwards
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
207d543
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
Mel Gorman mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff00  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0bf380b
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
@torvalds torvalds Merge tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/…
…git/tiwai/sound

sound fixes #2 for 3.3-rc3

A collection of small fixes, mostly for regressions.
In addition, a few ASoC wm8994 updates are included, too.

* tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: wm8994: Disable line output discharge prior to ramping VMID
  ASoC: wm8994: Fix typo in VMID ramp setting
  ALSA: oxygen, virtuoso: fix exchanged L/R volumes of aux and CD inputs
  ALSA: usb-audio: add Edirol UM-3G support
  ALSA: hda - add support for Uniwill ECS M31EI notebook
  ALSA: hda - Fix error handling in patch_ca0132.c
  ASoC: wm8994: Enabling VMID should take a runtime PM reference
  ALSA: hda/realtek - Fix a wrong condition
  ALSA: emu8000: Remove duplicate linux/moduleparam.h include from emu8000_patch.c
  ALSA: hda/realtek - Add missing Bass and CLFE as vmaster slaves
  ASoC: wm_hubs: Correct line input to line output 2 paths
  ASoC: cs42l73: Fix Output [X|A|V]SP_SCLK Sourcing Mode setting for master mode
  ASoC: wm8962: Fix word length configuration
  ASoC: core: Better support for idle_bias_off suspend ignores
  ASoC: wm8994: Remove ASoC level register cache sync
  ASoC: wm_hubs: Fix routing of input PGAs to line output mixer
e862f2e
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
@yizou yizou ixgbe: do not update real num queues when netdev is going away
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when
upper layer driver, e.g., FCoE protocol stack is monitoring the netdev
event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove
extra queues allocated for FCoE, the associated txq sysfs kobjects are already
removed, and trying to update the real num queues would cause something like
below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 #7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9d837ea
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
Andre Guedes Bluetooth: Fix potential deadlock
We don't need to use the _sync variant in hci_conn_hold and
hci_conn_put to cancel conn->disc_work delayed work. This way
we avoid potential deadlocks like this one reported by lockdep.

======================================================
[ INFO: possible circular locking dependency detected ]
3.2.0+ #1 Not tainted
-------------------------------------------------------
kworker/u:1/17 is trying to acquire lock:
 (&hdev->lock){+.+.+.}, at: [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]

but task is already holding lock:
 ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((&(&conn->disc_work)->work)){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff81034ed1>] wait_on_work+0x3d/0xaa
       [<ffffffff81035b54>] __cancel_work_timer+0xac/0xef
       [<ffffffff81035ba4>] cancel_delayed_work_sync+0xd/0xf
       [<ffffffffa00554b0>] smp_chan_create+0xde/0xe6 [bluetooth]
       [<ffffffffa0056160>] smp_conn_security+0xa3/0x12d [bluetooth]
       [<ffffffffa0053640>] l2cap_connect_cfm+0x237/0x2e8 [bluetooth]
       [<ffffffffa004239c>] hci_proto_connect_cfm+0x2d/0x6f [bluetooth]
       [<ffffffffa0046ea5>] hci_event_packet+0x29d1/0x2d60 [bluetooth]
       [<ffffffffa003dde3>] hci_rx_work+0xd0/0x2e1 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

-> #1 (slock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e553a>] _raw_spin_lock_bh+0x36/0x6a
       [<ffffffff81244d56>] lock_sock_nested+0x24/0x7f
       [<ffffffffa004d96f>] lock_sock+0xb/0xd [bluetooth]
       [<ffffffffa0052906>] l2cap_chan_connect+0xa9/0x26f [bluetooth]
       [<ffffffffa00545f8>] l2cap_sock_connect+0xb3/0xff [bluetooth]
       [<ffffffff81243b48>] sys_connect+0x69/0x8a
       [<ffffffff812e6579>] system_call_fastpath+0x16/0x1b

-> #0 (&hdev->lock){+.+.+.}:
       [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
       [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
       [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

other info that might help us debug this:

Chain exists of:
  &hdev->lock --> slock-AF_BLUETOOTH-BTPROTO_L2CAP --> (&(&conn->disc_work)->work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((&(&conn->disc_work)->work));
                               lock(slock-AF_BLUETOOTH-BTPROTO_L2CAP);
                               lock((&(&conn->disc_work)->work));
  lock(&hdev->lock);

 *** DEADLOCK ***

2 locks held by kworker/u:1/17:
 #0:  (hdev->name){.+.+.+}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf
 #1:  ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

stack backtrace:
Pid: 17, comm: kworker/u:1 Not tainted 3.2.0+ #1
Call Trace:
 [<ffffffff812e06c6>] print_circular_bug+0x1f8/0x209
 [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
 [<ffffffff81021ef2>] ? arch_local_irq_restore+0x6/0xd
 [<ffffffff81022bc7>] ? vprintk+0x3f9/0x41e
 [<ffffffff81057444>] lock_acquire+0x8a/0xa7
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff81190fd6>] ? __dynamic_pr_debug+0x6d/0x6f
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff8105320f>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
 [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff810357af>] process_one_work+0x178/0x2bf
 [<ffffffff81035751>] ? process_one_work+0x11a/0x2bf
 [<ffffffff81055af3>] ? lock_acquired+0x1d0/0x1df
 [<ffffffffa00410f3>] ? hci_acl_disconn+0x65/0x65 [bluetooth]
 [<ffffffff81036178>] worker_thread+0xce/0x152
 [<ffffffff810407ed>] ? finish_task_switch+0x45/0xc5
 [<ffffffff810360aa>] ? manage_workers.isra.25+0x16a/0x16a
 [<ffffffff81039a03>] kthread+0x95/0x9d
 [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10
 [<ffffffff812e5db4>] ? retint_restore_args+0x13/0x13
 [<ffffffff8103996e>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff812e7750>] ? gs_change+0x13/0x13

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Reviewed-by: Ulisses Furquim <ulisses@profusion.mobi>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
a51cd2b
@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012
Yevgeny Petrilin mlx4: Replacing pool_lock with mutex
Under the spinlock we call request_irq(), which allocates memory with GFP_KERNEL,
This causes the following trace when DEBUG_SPINLOCK is enabled, it can cause
the following trace:

 BUG: spinlock wrong CPU on CPU#2, ethtool/2595
 lock: ffff8801f9cbc2b0, .magic: dead4ead, .owner: ethtool/2595, .owner_cpu: 0
 Pid: 2595, comm: ethtool Not tainted 3.0.18 #2
 Call Trace:
 spin_bug+0xa2/0xf0
 do_raw_spin_unlock+0x71/0xa0
 _raw_spin_unlock+0xe/0x10
 mlx4_assign_eq+0x12b/0x190 [mlx4_core]
 mlx4_en_activate_cq+0x252/0x2d0 [mlx4_en]
 ? mlx4_en_activate_rx_rings+0x227/0x370 [mlx4_en]
 mlx4_en_start_port+0x189/0xb90 [mlx4_en]
 mlx4_en_set_ringparam+0x29a/0x340 [mlx4_en]
 dev_ethtool+0x816/0xb10
 ? dev_get_by_name_rcu+0xa4/0xe0
 dev_ioctl+0x2b5/0x470
 handle_mm_fault+0x1cd/0x2d0
 sock_do_ioctl+0x5d/0x70
 sock_ioctl+0x79/0x2f0
 do_vfs_ioctl+0x8c/0x340
 sys_ioctl+0xa1/0xb0
 system_call_fastpath+0x16/0x1b

Replacing with mutex, which is enough in this case.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
730c41d
@bootc bootc pushed a commit to bootc/linux-rpi-orig that referenced this issue May 8, 2012
Mel Gorman mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff00  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9da11af
@raspberrypi raspberrypi referenced this issue May 12, 2012
Closed

JIRA SW-5991 #20

@PeterHuewe PeterHuewe pushed a commit to PeterHuewe/linux that referenced this issue May 18, 2012
Kirill A. Shutemov hwmon: (coretemp) fix oops on cpu unplug
commit b704871 upstream.

coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5f991cd
@andatche andatche pushed a commit to andatche/raspi-linux that referenced this issue May 29, 2012
Kirill A. Shutemov hwmon: (coretemp) fix oops on cpu unplug
commit b704871 upstream.

coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
f1f1ed7
@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012
@bebarino bebarino ks8851: Fix mutex deadlock in ks8851_net_stop()
There is a potential deadlock scenario when the ks8851 driver
is removed. The interrupt handler schedules a workqueue which
acquires a mutex that ks8851_net_stop() also acquires before
flushing the workqueue. Previously lockdep wouldn't be able
to find this problem but now that it has the support we can
trigger this lockdep warning by rmmoding the driver after
an ifconfig up.

Fix the possible deadlock by disabling the interrupts in
the chip and then release the lock across the workqueue
flushing. The mutex is only there to proect the registers
anyway so this should be ok.

=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.21-00021-g8b33780-dirty #2911
-------------------------------------------------------
rmmod/125 is trying to acquire lock:
 ((&ks->irq_work)){+.+...}, at: [<c019e0b8>] flush_work+0x0/0xac

but task is already holding lock:
 (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&ks->lock){+.+...}:
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c083dbec>] mutex_lock_nested+0x68/0x3dc
       [<bf00bd48>] ks8851_irq_work+0x24/0x46c [ks8851]
       [<c019c580>] process_one_work+0x2d8/0x518
       [<c019cb98>] worker_thread+0x220/0x3a0
       [<c01a2ad4>] kthread+0x88/0x94
       [<c0107008>] kernel_thread_exit+0x0/0x8

-> #0 ((&ks->irq_work)){+.+...}:
       [<c01b7984>] validate_chain+0x914/0x1018
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c019e104>] flush_work+0x4c/0xac
       [<bf00b858>] ks8851_net_stop+0x6c/0x138 [ks8851]
       [<c06b209c>] __dev_close_many+0x98/0xcc
       [<c06b2174>] dev_close_many+0x68/0xd0
       [<c06b22ec>] rollback_registered_many+0xcc/0x2b8
       [<c06b2554>] rollback_registered+0x28/0x34
       [<c06b25b8>] unregister_netdevice_queue+0x58/0x7c
       [<c06b25f4>] unregister_netdev+0x18/0x20
       [<bf00c1f4>] ks8851_remove+0x64/0xb4 [ks8851]
       [<c049ddf0>] spi_drv_remove+0x18/0x1c
       [<c0468e98>] __device_release_driver+0x7c/0xbc
       [<c0468f64>] driver_detach+0x8c/0xb4
       [<c0467f00>] bus_remove_driver+0xb8/0xe8
       [<c01c1d20>] sys_delete_module+0x1e8/0x27c
       [<c0105ec0>] ret_fast_syscall+0x0/0x3c

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ks->lock);
                               lock((&ks->irq_work));
                               lock(&ks->lock);
  lock((&ks->irq_work));

 *** DEADLOCK ***

4 locks held by rmmod/125:
 #0:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f44>] driver_detach+0x6c/0xb4
 #1:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f50>] driver_detach+0x78/0xb4
 #2:  (rtnl_mutex){+.+.+.}, at: [<c06b25e8>] unregister_netdev+0xc/0x20
 #3:  (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c5a9993
@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012
Konrad Rzeszutek Wilk xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.
There are exactly four users of __monitor and __mwait:

 - cstate.c (which allows acpi_processor_ffh_cstate_enter to be called
   when the cpuidle API drivers are used. However patch
   "cpuidle: replace xen access to x86 pm_idle and default_idle"
   provides a mechanism to disable the cpuidle and use safe_halt.
 - smpboot (which allows mwait_play_dead to be called). However
   safe_halt is always used so we skip that.
 - intel_idle (same deal as above).
 - acpi_pad.c. This the one that we do not want to run as we
   will hit the below crash.

Why do we want to expose MWAIT_LEAF in the first place?
We want it for the xen-acpi-processor driver - which uploads
C-states to the hypervisor. If MWAIT_LEAF is set, the cstate.c
sets the proper address in the C-states so that the hypervisor
can benefit from using the MWAIT functionality. And that is
the sole reason for using it.

Without this patch, if a module performs mwait or monitor we
get this:

invalid opcode: 0000 [#1] SMP
CPU 2
.. snip..
Pid: 5036, comm: insmod Tainted: G           O 3.4.0-rc2upstream-dirty #2 Intel Corporation S2600CP/S2600CP
RIP: e030:[<ffffffffa000a017>]  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
RSP: e02b:ffff8801c298bf18  EFLAGS: 00010282
RAX: ffff8801c298a010 RBX: ffffffffa03b2000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8801c29800d8 RDI: ffff8801ff097200
RBP: ffff8801c298bf18 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffffffffa000a000 R14: 0000005148db7294 R15: 0000000000000003
FS:  00007fbb364f2700(0000) GS:ffff8801ff08c000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000179f038 CR3: 00000001c9469000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process insmod (pid: 5036, threadinfo ffff8801c298a000, task ffff8801c29cd7e0)
Stack:
 ffff8801c298bf48 ffffffff81002124 ffffffffa03b2000 00000000000081fd
 000000000178f010 000000000178f030 ffff8801c298bf78 ffffffff810c41e6
 00007fff3fb30db9 00007fff3fb30db9 00000000000081fd 0000000000010000
Call Trace:
 [<ffffffff81002124>] do_one_initcall+0x124/0x170
 [<ffffffff810c41e6>] sys_init_module+0xc6/0x220
 [<ffffffff815b15b9>] system_call_fastpath+0x16/0x1b
Code: <0f> 01 c8 31 c0 0f 01 c9 c9 c3 00 00 00 00 00 00 00 00 00 00 00 00
RIP  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
 RSP <ffff8801c298bf18>
---[ end trace 16582fc8a3d1e29a ]---
Kernel panic - not syncing: Fatal exception

With this module (which is what acpi_pad.c would hit):

MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>");
MODULE_DESCRIPTION("mwait_check_and_back");
MODULE_LICENSE("GPL");
MODULE_VERSION();

static int __init mwait_check_init(void)
{
	__monitor((void *)&current_thread_info()->flags, 0, 0);
	__mwait(0, 0);
	return 0;
}
static void __exit mwait_check_exit(void)
{
}
module_init(mwait_check_init);
module_exit(mwait_check_exit);

Reported-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
df88b2d
@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012
Kirill A. Shutemov hwmon: (coretemp) fix oops on cpu unplug
coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 3.0+
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
b704871
@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012
James Bottomley [PARISC] fix PA1.1 oops on boot
All PA1.1 systems have been oopsing on boot since

commit f311847
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Wed Dec 22 10:22:11 2010 -0600

    parisc: flush pages through tmpalias space

because a PA2.0 instruction was accidentally introduced into the PA1.1 TLB
insertion interruption path when it was consolidated with the do_alias macro.
Fix the do_alias macro only to use PA2.0 instructions if compiled for 64 bit.
Cc: stable@vger.kernel.org  #2.6.39+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
5e18558
@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012
John David Anglin [PARISC] fix crash in flush_icache_page_asm on PA1.1
As pointed out by serveral people, PA1.1 only has a type 26 instruction
meaning that the space register must be explicitly encoded.  Not giving an
explicit space means that the compiler uses the type 24 version which is PA2.0
only resulting in an illegal instruction crash.

This regression was caused by

    commit f311847
    Author: James Bottomley <James.Bottomley@HansenPartnership.com>
    Date:   Wed Dec 22 10:22:11 2010 -0600

        parisc: flush pages through tmpalias space

Reported-by: Helge Deller <deller@gmx.de>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org	#2.6.39+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
207f583
@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012
Andrea Arcangeli mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race …
…condition

commit 26c1917 upstream.

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2d363e9
@popcornmix popcornmix pushed a commit to popcornmix/linux that referenced this issue Aug 16, 2012
@jtlayton jtlayton cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
commit 3cf003c upstream.

Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
bb761c9
@popcornmix popcornmix pushed a commit to popcornmix/linux that referenced this issue Aug 16, 2012
@jtlayton jtlayton nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
1c88c58
@popcornmix
Collaborator

The kernel update to 3.2.27 should have solved this. Please reopen if you don't think this is fixed.

@popcornmix popcornmix closed this Aug 20, 2012
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 20, 2012
@mgrzeschik mgrzeschik usb: chipidea: udc: don't stall endpoint if request list is empty in …
…isr_tr_complete_low

When attaching an imx28 or imx53 in USB gadget mode to a Windows host and
starting a rndis connection we see this message every 4-10 seconds:

    g_ether gadget: high speed config #2: RNDIS

Analysis shows that each time this message is printed, the rndis connection is
re-establish due to a reset because of a stalled endpoint (ep 0, dir 1). The
endpoint is stalled because the reqeust complete bit on that endpoint is set,
but in isr_tr_complete_low() the endpoint request list (mEp->qh.queue) is
empty.

This patch removed this check, because the code doesn't take the following
situation into account:

The loop over all endpoints in isr_tr_complete_handler() will call ep_nuke() on
both ep0/dir0 and ep/dir1 in the first loop. Pending reqeusts will be flushed
and completed here. There seems to be a race condition, the request is nuked,
but the request complete bit will be set, too. The subsequent check (in
ep0/dir1's loop cycle) for endpoint request list (mEp->qh.queue) empty will
fail.

Both other mainline chipidea drivers (mv_udc_core.c and fsl_udc_core.c) don't
have this check.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
db89960
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 20, 2012
Eddie Wai [SCSI] bnx2i: Fixed NULL ptr deference for 1G bnx2 Linux iSCSI offload
This patch fixes the following kernel panic invoked by uninitialized fields
in the chip initialization for the 1G bnx2 iSCSI offload.

One of the bits in the chip initialization is being used by the latest
firmware to control overflow packets.  When this control bit gets enabled
erroneously, it would ultimately result in a bad packet placement which would
cause the bnx2 driver to dereference a NULL ptr in the placement handler.

This can happen under certain stress I/O environment under the Linux
iSCSI offload operation.

This change only affects Broadcom's 5709 chipset.

Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
 [<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5
Pid: 0, comm: swapper Tainted: G     ---- 2.6.18-333.el5debug #2
RIP: 0010:[<ffffffff881f0e7d>]  [<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5
RSP: 0018:ffff8101b575bd50  EFLAGS: 00010216
RAX: 0000000000000005 RBX: ffff81007c5fb180 RCX: 0000000000000000
RDX: 0000000000000ffc RSI: 00000000817e8000 RDI: 0000000000000220
RBP: ffff81015bbd7ec0 R08: ffff8100817e9000 R09: 0000000000000000
R10: ffff81007c5fb180 R11: 00000000000000c8 R12: 000000007a25a010
R13: 0000000000000000 R14: 0000000000000005 R15: ffff810159f80558
FS:  0000000000000000(0000) GS:ffff8101afebc240(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 0000000000201000 CR4: 00000000000006a0
Process swapper (pid: 0, threadinfo ffff8101b5754000, task ffff8101afebd820)
Stack:  000000000000000b ffff810159f80000 0000000000000040 ffff810159f80520
 ffff810159f80500 00cf00cf8008e84b ffffc200100939e0 ffff810009035b20
 0000502900000000 000000be00000001 ffff8100817e7810 00d08101b575bea8
Call Trace:
 <IRQ>  [<ffffffff8008e0d0>] show_schedstat+0x1c2/0x25b
 [<ffffffff881f1886>] :bnx2:bnx2_poll+0xf6/0x231
 [<ffffffff8000c9b9>] net_rx_action+0xac/0x1b1
 [<ffffffff800125a0>] __do_softirq+0x89/0x133
 [<ffffffff8005e30c>] call_softirq+0x1c/0x28
 [<ffffffff8006d5de>] do_softirq+0x2c/0x7d
 [<ffffffff8006d46e>] do_IRQ+0xee/0xf7
 [<ffffffff8005d625>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff801a5780>] acpi_processor_idle_simple+0x1c5/0x341
 [<ffffffff801a573d>] acpi_processor_idle_simple+0x182/0x341
 [<ffffffff801a55bb>] acpi_processor_idle_simple+0x0/0x341
 [<ffffffff80049560>] cpu_idle+0x95/0xb8
 [<ffffffff80078b1c>] start_secondary+0x479/0x488

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d653220
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 20, 2012
Ben Myers xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
4026c9f
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 20, 2012
Tushar Behera ARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate
The spinlock clocks_lock can be held during ISR, hence it is not safe to
hold that lock with disabling interrupts.

It fixes following potential deadlock.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.6.0-rc4+ #2 Not tainted
---------------------------------------------------------
swapper/0/1 just changed the state of lock:
 (&(&host->lock)->rlock){-.....}, at: [<c027fb0d>] sdhci_irq+0x15/0x564
but this lock took another, HARDIRQ-unsafe lock in the past:
 (clocks_lock){+.+...}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(clocks_lock);
                               local_irq_disable();
                               lock(&(&host->lock)->rlock);
                               lock(clocks_lock);
  <Interrupt>
    lock(&(&host->lock)->rlock);

 *** DEADLOCK ***

Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
d6838a6
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 20, 2012
@minipli minipli xfrm_user: return error pointer instead of NULL #2
When dump_one_policy() returns an error, e.g. because of a too small
buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns
NULL instead of an error pointer. But its caller expects an error
pointer and therefore continues to operate on a NULL skbuff.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c254637
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
Ben Myers xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
0ba6e53
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
Vinicius Costa Gomes Bluetooth: Fix not removing power_off delayed work
For example, when a usb reset is received (I could reproduce it
running something very similar to this[1] in a loop) it could be
that the device is unregistered while the power_off delayed work
is still scheduled to run.

Backtrace:

WARNING: at lib/debugobjects.c:261 debug_print_object+0x7c/0x8d()
Hardware name: To Be Filled By O.E.M.
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x26
Modules linked in: nouveau mxm_wmi btusb wmi bluetooth ttm coretemp drm_kms_helper
Pid: 2114, comm: usb-reset Not tainted 3.5.0bt-next #2
Call Trace:
 [<ffffffff8124cc00>] ? free_obj_work+0x57/0x91
 [<ffffffff81058f88>] warn_slowpath_common+0x7e/0x97
 [<ffffffff81059035>] warn_slowpath_fmt+0x41/0x43
 [<ffffffff8124ccb6>] debug_print_object+0x7c/0x8d
 [<ffffffff8106e3ec>] ? __queue_work+0x259/0x259
 [<ffffffff8124d63e>] ? debug_check_no_obj_freed+0x6f/0x1b5
 [<ffffffff8124d667>] debug_check_no_obj_freed+0x98/0x1b5
 [<ffffffffa00aa031>] ? bt_host_release+0x10/0x1e [bluetooth]
 [<ffffffff810fc035>] kfree+0x90/0xe6
 [<ffffffffa00aa031>] bt_host_release+0x10/0x1e [bluetooth]
 [<ffffffff812ec2f9>] device_release+0x4a/0x7e
 [<ffffffff8123ef57>] kobject_release+0x11d/0x154
 [<ffffffff8123ed98>] kobject_put+0x4a/0x4f
 [<ffffffff812ec0d9>] put_device+0x12/0x14
 [<ffffffffa009472b>] hci_free_dev+0x22/0x26 [bluetooth]
 [<ffffffffa0280dd0>] btusb_disconnect+0x96/0x9f [btusb]
 [<ffffffff813581b4>] usb_unbind_interface+0x57/0x106
 [<ffffffff812ef988>] __device_release_driver+0x83/0xd6
 [<ffffffff812ef9fb>] device_release_driver+0x20/0x2d
 [<ffffffff813582a7>] usb_driver_release_interface+0x44/0x7b
 [<ffffffff81358795>] usb_forced_unbind_intf+0x45/0x4e
 [<ffffffff8134f959>] usb_reset_device+0xa6/0x12e
 [<ffffffff8135df86>] usbdev_do_ioctl+0x319/0xe20
 [<ffffffff81203244>] ? avc_has_perm_flags+0xc9/0x12e
 [<ffffffff812031a0>] ? avc_has_perm_flags+0x25/0x12e
 [<ffffffff81050101>] ? do_page_fault+0x31e/0x3a1
 [<ffffffff8135eaa6>] usbdev_ioctl+0x9/0xd
 [<ffffffff811126b1>] vfs_ioctl+0x21/0x34
 [<ffffffff81112f7b>] do_vfs_ioctl+0x408/0x44b
 [<ffffffff81208d45>] ? file_has_perm+0x76/0x81
 [<ffffffff8111300f>] sys_ioctl+0x51/0x76
 [<ffffffff8158db22>] system_call_fastpath+0x16/0x1b

[1] http://cpansearch.perl.org/src/DPAVLIN/Biblio-RFID-0.03/examples/usbreset.c

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
78c04c0
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
@mcgrof mcgrof cfg80211: fix possible circular lock on reg_regdb_search()
When call_crda() is called we kick off a witch hunt search
for the same regulatory domain on our internal regulatory
database and that work gets kicked off on a workqueue, this
is done while the cfg80211_mutex is held. If that workqueue
kicks off it will first lock reg_regdb_search_mutex and
later cfg80211_mutex but to ensure two CPUs will not contend
against cfg80211_mutex the right thing to do is to have the
reg_regdb_search() wait until the cfg80211_mutex is let go.

The lockdep report is pasted below.

cfg80211: Calling CRDA to update world regulatory domain

======================================================
[ INFO: possible circular locking dependency detected ]
3.3.8 #3 Tainted: G           O
-------------------------------------------------------
kworker/0:1/235 is trying to acquire lock:
 (cfg80211_mutex){+.+...}, at: [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

but task is already holding lock:
 (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (reg_regdb_search_mutex){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<81645778>] is_world_regdom+0x9f8/0xc74 [cfg80211]

-> #1 (reg_mutex#2){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<8164539c>] is_world_regdom+0x61c/0xc74 [cfg80211]

-> #0 (cfg80211_mutex){+.+...}:
       [<800a77b8>] __lock_acquire+0x10d4/0x17bc
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex#2 --> reg_regdb_search_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(reg_regdb_search_mutex);
                               lock(reg_mutex#2);
                               lock(reg_regdb_search_mutex);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

3 locks held by kworker/0:1/235:
 #0:  (events){.+.+..}, at: [<80089a00>] process_one_work+0x230/0x460
 #1:  (reg_regdb_work){+.+...}, at: [<80089a00>] process_one_work+0x230/0x460
 #2:  (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

stack backtrace:
Call Trace:
[<80290fd4>] dump_stack+0x8/0x34
[<80291bc4>] print_circular_bug+0x2ac/0x2d8
[<800a77b8>] __lock_acquire+0x10d4/0x17bc
[<800a8384>] lock_acquire+0x60/0x88
[<802950a8>] mutex_lock_nested+0x54/0x31c
[<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

Reported-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
a85d0d7
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
Konrad Rzeszutek Wilk xen/boot: Disable BIOS SMP MP table search.
As the initial domain we are able to search/map certain regions
of memory to harvest configuration data. For all low-level we
use ACPI tables - for interrupts we use exclusively ACPI _PRT
(so DSDT) and MADT for INT_SRC_OVR.

The SMP MP table is not used at all. As a matter of fact we do
not even support machines that only have SMP MP but no ACPI tables.

Lets follow how Moorestown does it and just disable searching
for BIOS SMP tables.

This also fixes an issue on HP Proliant BL680c G5 and DL380 G6:

9f->100 for 1:1 PTE
Freeing 9f-100 pfn range: 97 pages freed
1-1 mapping on 9f->100
.. snip..
e820: BIOS-provided physical RAM map:
Xen: [mem 0x0000000000000000-0x000000000009efff] usable
Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved
Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable
.. snip..
Scan for SMP in [mem 0x00000000-0x000003ff]
Scan for SMP in [mem 0x0009fc00-0x0009ffff]
Scan for SMP in [mem 0x000f0000-0x000fffff]
found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0]
(XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0
(XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff81ac07e2>] xen_set_pte_init+0x66/0x71
. snip..
Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5
.. snip..
Call Trace:
 [<ffffffff81ad31c6>] __early_ioremap+0x18a/0x248
 [<ffffffff81624731>] ? printk+0x48/0x4a
 [<ffffffff81ad32ac>] early_ioremap+0x13/0x15
 [<ffffffff81acc140>] get_mpc_size+0x2f/0x67
 [<ffffffff81acc284>] smp_scan_config+0x10c/0x136
 [<ffffffff81acc2e4>] default_find_smp_config+0x36/0x5a
 [<ffffffff81ac3085>] setup_arch+0x5b3/0xb5b
 [<ffffffff81624731>] ? printk+0x48/0x4a
 [<ffffffff81abca7f>] start_kernel+0x90/0x390
 [<ffffffff81abc356>] x86_64_start_reservations+0x131/0x136
 [<ffffffff81abfa83>] xen_start_kernel+0x65f/0x661
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

which is that ioremap would end up mapping 0xff using _PAGE_IOMAP
(which is what early_ioremap sticks as a flag) - which meant
we would get MFN 0xFF (pte ff461, which is OK), and then it would
also map 0x100 (b/c ioremap tries to get page aligned request, and
it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page)
as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP
bypasses the P2M lookup we would happily set the PTE to 1000461.
Xen would deny the request since we do not have access to the
Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example
0x80140.

CC: stable@vger.kernel.org
Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
bd49940
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
@wfarnsworth wfarnsworth ARM: 7524/1: support syscall tracing
As specified by ftrace-design.txt, TIF_SYSCALL_TRACEPOINT was
added, as well as NR_syscalls in asm/unistd.h.  Additionally,
__sys_trace was modified to call trace_sys_enter and
trace_sys_exit when appropriate.

Tests #2 - #4 of "perf test" now complete successfully.

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Signed-off-by: Wade Farnsworth <wade_farnsworth@mentor.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
1f66e06
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
navin staging: usbip: stub_dev: Fixed oops during removal of usbip_host
stub_shutdown_connection() should set kernel thread pointers to NULL after killing them.
so that at the time of usbip_host removal stub_shutdown_connection() doesn't try to kill kernel threads
which are already killed.

[ 1504.312158] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1504.315833] IP: [<ffffffff8108186f>] exit_creds+0x1f/0x70
[ 1504.317688] PGD 41f1c067 PUD 41d0f067 PMD 0
[ 1504.319611] Oops: 0000 [#2] SMP
[ 1504.321480] Modules linked in: vhci_hcd(O) usbip_host(O-) usbip_core(O) uas usb_storage joydev parport_pc bnep rfcomm ppdev binfmt_misc
snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi arc4 snd_rawmidi snd_seq_midi_event
snd_seq ath9k mac80211 ath9k_common ath9k_hw snd_timer snd_seq_device snd ath i915 drm_kms_helper drm psmouse cfg80211 coretemp soundcore
lpc_ich i2c_algo_bit snd_page_alloc video intel_ips wmi btusb mac_hid bluetooth ideapad_laptop lp sparse_keymap serio_raw mei microcode parport r8169
[ 1504.329666] CPU 1
[ 1504.329687] Pid: 2434, comm: usbip_eh Tainted: G      D    O 3.6.0-rc31+ #2 LENOVO 20042                           /Base Board Product Name
[ 1504.333502] RIP: 0010:[<ffffffff8108186f>]  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[ 1504.335371] RSP: 0018:ffff880041c7fdd0  EFLAGS: 00010282
[ 1504.337226] RAX: 0000000000000000 RBX: ffff880041c2db40 RCX: ffff880041e4ae50
[ 1504.339101] RDX: 0000000000000044 RSI: 0000000000000286 RDI: 0000000000000000
[ 1504.341027] RBP: ffff880041c7fde0 R08: ffff880041c7e000 R09: 0000000000000000
[ 1504.342934] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 1504.344840] R13: 0000000000000000 R14: ffff88004d448000 R15: ffff880041c7fea0
[ 1504.346743] FS:  0000000000000000(0000) GS:ffff880077440000(0000) knlGS:0000000000000000
[ 1504.348671] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1504.350612] CR2: 0000000000000000 CR3: 0000000041d0d000 CR4: 00000000000007e0
[ 1504.352723] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1504.354734] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1504.356737] Process usbip_eh (pid: 2434, threadinfo ffff880041c7e000, task ffff88004d448000)
[ 1504.358711] Stack:
[ 1504.360635]  ffff880041c7fde0 ffff880041c2db40 ffff880041c7fe00 ffffffff81052aaa
[ 1504.362589]  ffff880041c2db40 0000000000000000 ffff880041c7fe30 ffffffff8107a148
[ 1504.364539]  ffff880041e4ae10 ffff880041e4ae00 ffff88004d448000 ffff88004d448000
[ 1504.366470] Call Trace:
[ 1504.368368]  [<ffffffff81052aaa>] __put_task_struct+0x4a/0x140
[ 1504.370307]  [<ffffffff8107a148>] kthread_stop+0x108/0x110
[ 1504.370312]  [<ffffffffa040da4e>] stub_shutdown_connection+0x3e/0x1b0 [usbip_host]
[ 1504.370315]  [<ffffffffa03fd920>] event_handler_loop+0x70/0x140 [usbip_core]
[ 1504.370318]  [<ffffffff8107ad30>] ? add_wait_queue+0x60/0x60
[ 1504.370320]  [<ffffffffa03fd8b0>] ? usbip_stop_eh+0x30/0x30 [usbip_core]
[ 1504.370322]  [<ffffffff8107a453>] kthread+0x93/0xa0
[ 1504.370327]  [<ffffffff81670e84>] kernel_thread_helper+0x4/0x10
[ 1504.370330]  [<ffffffff8107a3c0>] ? flush_kthread_worker+0xb0/0xb0
[ 1504.370332]  [<ffffffff81670e80>] ? gs_change+0x13/0x13
[ 1504.370351] Code: 0b 0f 0b 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 58 04 00 00 48 89 fb 48 8b bf 50 04 00 00
<8b> 00 48 c7 83 50 04 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0
[ 1504.370353] RIP  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[ 1504.370353]  RSP <ffff880041c7fdd0>
[ 1504.370354] CR2: 0000000000000000
[ 1504.401376] ---[ end trace 1971ce612a16727a ]---

Signed-off-by: navin patidar <navinp@cdac.in>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
b7caecb
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
navin staging: usbip: vhci_hcd: Fixed oops during removal of vhci_hcd
In response to "usbip detach -p [port_number]" user command,vhci_shutdown_connection() gets
executed which kills tcp_tx,tcp_rx kernel threads but doesn't set thread pointers to NULL.
so, at the time of vhci_hcd removal vhci_shutdown_connection() again tries to kill kernel
threads which are already killed.

[  312.220259] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  312.220313] IP: [<ffffffff8108186f>] exit_creds+0x1f/0x70
[  312.220349] PGD 5b7be067 PUD 5b7dc067 PMD 0
[  312.220388] Oops: 0000 [#1] SMP
[  312.220415] Modules linked in: vhci_hcd(O-) usbip_host(O) usbip_core(O) uas usb_storage joydev parport_pc bnep rfcomm ppdev binfmt_misc
snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi arc4 snd_rawmidi snd_seq_midi_event snd_seq
ath9k mac80211 ath9k_common ath9k_hw snd_timer snd_seq_device snd ath i915 drm_kms_helper drm psmouse cfg80211 coretemp soundcore lpc_ich i2c_algo_bit
snd_page_alloc video intel_ips wmi btusb mac_hid bluetooth ideapad_laptop lp sparse_keymap serio_raw mei microcode parport r8169
[  312.220862] CPU 0
[  312.220882] Pid: 2095, comm: usbip_eh Tainted: G           O 3.6.0-rc3+ #2 LENOVO 20042                           /Base Board Product Name
[  312.220938] RIP: 0010:[<ffffffff8108186f>]  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[  312.220979] RSP: 0018:ffff88004d6ebdb0  EFLAGS: 00010282
[  312.221008] RAX: 0000000000000000 RBX: ffff880041c4db40 RCX: ffff88005ad52230
[  312.221041] RDX: 0000000000000034 RSI: 0000000000000286 RDI: 0000000000000000
[  312.221074] RBP: ffff88004d6ebdc0 R08: ffff88004d6ea000 R09: 0000000000000000
[  312.221107] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  312.221144] R13: 0000000000000000 R14: ffff88005ad521d8 R15: ffff88004d6ebea0
[  312.221177] FS:  0000000000000000(0000) GS:ffff880077400000(0000) knlGS:0000000000000000
[  312.221214] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  312.221241] CR2: 0000000000000000 CR3: 000000005b7b9000 CR4: 00000000000007f0
[  312.221277] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  312.221309] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  312.221342] Process usbip_eh (pid: 2095, threadinfo ffff88004d6ea000, task ffff88004d44db40)
[  312.221384] Stack:
[  312.221396]  ffff88004d6ebdc0 ffff880041c4db40 ffff88004d6ebde0 ffffffff81052aaa
[  312.221442]  ffff880041c4db40 0000000000000000 ffff88004d6ebe10 ffffffff8107a148
[  312.221487]  ffff88005ad521f0 ffff88005ad52228 ffff88004d44db40 ffff88005ad521d8
[  312.221536] Call Trace:
[  312.221556]  [<ffffffff81052aaa>] __put_task_struct+0x4a/0x140
[  312.221587]  [<ffffffff8107a148>] kthread_stop+0x108/0x110
[  312.221616]  [<ffffffffa0412569>] vhci_shutdown_connection+0x49/0x2b4 [vhci_hcd]
[  312.221657]  [<ffffffffa0139920>] event_handler_loop+0x70/0x140 [usbip_core]
[  312.221692]  [<ffffffff8107ad30>] ? add_wait_queue+0x60/0x60
[  312.221721]  [<ffffffffa01398b0>] ? usbip_stop_eh+0x30/0x30 [usbip_core]
[  312.221754]  [<ffffffff8107a453>] kthread+0x93/0xa0
[  312.221784]  [<ffffffff81670e84>] kernel_thread_helper+0x4/0x10
[  312.221814]  [<ffffffff8107a3c0>] ? flush_kthread_worker+0xb0/0xb0
[  312.223589]  [<ffffffff81670e80>] ? gs_change+0x13/0x13
[  312.225360] Code: 0b 0f 0b 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 58 04 00 00 48 89 fb 48 8b bf 50 04 00 00
<8b> 00 48 c7 83 50 04 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0
[  312.229663] RIP  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[  312.231696]  RSP <ffff88004d6ebdb0>
[  312.233648] CR2: 0000000000000000
[  312.256464] ---[ end trace 1971ce612a167279 ]---

Signed-off-by: navin patidar <navinp@cdac.in>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cbf7d12
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
Steffen Maier [SCSI] zfcp: Make trace record tags unique
Duplicate fssrh_2 from a54ca0f
"[SCSI] zfcp: Redesign of the debug tracing for HBA records."
complicates distinction of generic status read response from
local link up.
Duplicate fsscth1 from 2c55b75
"[SCSI] zfcp: Redesign of the debug tracing for SAN records."
complicates distinction of good common transport response from
invalid port handle.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.38+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
0100998
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
Steffen Maier [SCSI] zfcp: Do not wakeup while suspended
If the mapping of FCP device bus ID and corresponding subchannel
is modified while the Linux image is suspended, the resume of FCP
devices can fail. During resume, zfcp gets callbacks from cio regarding
the modified subchannels but they can be arbitrarily mixed with the
restore/resume callback. Since the cio callbacks would trigger
adapter recovery, zfcp could wakeup before the resume callback.
Therefore, ignore the cio callbacks regarding subchannels while
being suspended. We can safely do so, since zfcp does not deal itself
with subchannels. For problem determination purposes, we still trace the
ignored callback events.

The following kernel messages could be seen on resume:

kernel: <WWPN>: parent <FCP device bus ID> should not be sleeping

As part of adapter reopen recovery, zfcp performs auto port scanning
which can erroneously try to register new remote ports with
scsi_transport_fc and the device core code complains about the parent
(adapter) still sleeping.

kernel: zfcp.3dff9c: <FCP device bus ID>:\
 Setting up the QDIO connection to the FCP adapter failed
<last kernel message repeated 3 more times>
kernel: zfcp.574d43: <FCP device bus ID>:\
 ERP cannot recover an error on the FCP device

In such cases, the adapter gave up recovery and remained blocked along
with its child objects: remote ports and LUNs/scsi devices. Even the
adapter shutdown as part of giving up recovery failed because the ccw
device state remained disconnected. Later, the corresponding remote
ports ran into dev_loss_tmo. As a result, the LUNs were erroneously
not available again after resume.

Even a manually triggered adapter recovery (e.g. sysfs attribute
failed, or device offline/online via sysfs) could not recover the
adapter due to the remaining disconnected state of the corresponding
ccw device.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.32+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
cb45214
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
@JuliaLawall JuliaLawall [SCSI] zfcp: remove invalid reference to list iterator variable
If list_for_each_entry, etc complete a traversal of the list, the iterator
variable ends up pointing to an address at an offset from the list head,
and not a meaningful structure.  Thus this value should not be used after
the end of the iterator.  Replace port->adapter->scsi_host by
adapter->scsi_host.

This problem was found using Coccinelle (http://coccinelle.lip6.fr/).

Oversight in upsteam commit of v2.6.37
a1ca483
"[SCSI] zfcp: Move ACL/CFDC code to zfcp_cfdc.c"
which merged the content of zfcp_erp_port_access_changed().

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.37+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
ca579c9
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
Steffen Maier [SCSI] zfcp: restore refcount check on port_remove
Upstream commit f3450c7
"[SCSI] zfcp: Replace local reference counting with common kref"
accidentally dropped a reference count check before tearing down
zfcp_ports that are potentially in use by zfcp_units.
Even remote ports in use can be removed causing
unreachable garbage objects zfcp_ports with zfcp_units.
Thus units won't come back even after a manual port_rescan.
The kref of zfcp_port->dev.kobj is already used by the driver core.
We cannot re-use it to track the number of zfcp_units.
Re-introduce our own counter for units per port
and check on port_remove.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.33+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d99b601
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012
Martin Peschke [SCSI] zfcp: only access zfcp_scsi_dev for valid scsi_device
__scsi_remove_device (e.g. due to dev_loss_tmo) calls
zfcp_scsi_slave_destroy which in turn sends a close LUN FSF request to
the adapter. After 30 seconds without response,
zfcp_erp_timeout_handler kicks the ERP thread failing the close LUN
ERP action. zfcp_erp_wait in zfcp_erp_lun_shutdown_wait and thus
zfcp_scsi_slave_destroy returns and then scsi_device is no longer
valid. Sometime later the response to the close LUN FSF request may
finally come in. However, commit
b62a8d9
"[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit"
introduced a number of attempts to unconditionally access struct
zfcp_scsi_dev through struct scsi_device causing a use-after-free.
This leads to an Oops due to kernel page fault in one of:
zfcp_fsf_abort_fcp_command_handler, zfcp_fsf_open_lun_handler,
zfcp_fsf_close_lun_handler, zfcp_fsf_req_trace,
zfcp_fsf_fcp_handler_common.
Move dereferencing of zfcp private data zfcp_scsi_dev allocated in
scsi_device via scsi_transport_reserve_device after the check for
potentially aborted FSF request and thus no longer valid scsi_device.
Only then assign sdev_to_zfcp(sdev) to the local auto variable struct
zfcp_scsi_dev *zfcp_sdev.

Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.37+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d436de8
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 3, 2012
Szymon Janc NFC: Use dynamic initialization for rwlocks
If rwlock is dynamically allocated but statically initialized it is
missing proper lockdep annotation.

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 3352, comm: neard Not tainted 3.5.0-999-nfc+ #2
Call Trace:
[<ffffffff810c8526>] __lock_acquire+0x8f6/0x1bf0
[<ffffffff81739045>] ? printk+0x4d/0x4f
[<ffffffff810c9eed>] lock_acquire+0x9d/0x220
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81746724>] _raw_read_lock+0x44/0x60
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81702bfe>] nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff817034a7>] nfc_llcp_get_sdp_ssap+0xa7/0x1b0
[<ffffffff81706353>] llcp_sock_bind+0x173/0x210
[<ffffffff815d9c94>] sys_bind+0xe4/0x100
[<ffffffff8139209e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8174ea69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
fe235b5
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Andre Guedes Bluetooth: Fix potential deadlock
We don't need to use the _sync variant in hci_conn_hold and
hci_conn_put to cancel conn->disc_work delayed work. This way
we avoid potential deadlocks like this one reported by lockdep.

======================================================
[ INFO: possible circular locking dependency detected ]
3.2.0+ #1 Not tainted
-------------------------------------------------------
kworker/u:1/17 is trying to acquire lock:
 (&hdev->lock){+.+.+.}, at: [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]

but task is already holding lock:
 ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((&(&conn->disc_work)->work)){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff81034ed1>] wait_on_work+0x3d/0xaa
       [<ffffffff81035b54>] __cancel_work_timer+0xac/0xef
       [<ffffffff81035ba4>] cancel_delayed_work_sync+0xd/0xf
       [<ffffffffa00554b0>] smp_chan_create+0xde/0xe6 [bluetooth]
       [<ffffffffa0056160>] smp_conn_security+0xa3/0x12d [bluetooth]
       [<ffffffffa0053640>] l2cap_connect_cfm+0x237/0x2e8 [bluetooth]
       [<ffffffffa004239c>] hci_proto_connect_cfm+0x2d/0x6f [bluetooth]
       [<ffffffffa0046ea5>] hci_event_packet+0x29d1/0x2d60 [bluetooth]
       [<ffffffffa003dde3>] hci_rx_work+0xd0/0x2e1 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

-> #1 (slock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e553a>] _raw_spin_lock_bh+0x36/0x6a
       [<ffffffff81244d56>] lock_sock_nested+0x24/0x7f
       [<ffffffffa004d96f>] lock_sock+0xb/0xd [bluetooth]
       [<ffffffffa0052906>] l2cap_chan_connect+0xa9/0x26f [bluetooth]
       [<ffffffffa00545f8>] l2cap_sock_connect+0xb3/0xff [bluetooth]
       [<ffffffff81243b48>] sys_connect+0x69/0x8a
       [<ffffffff812e6579>] system_call_fastpath+0x16/0x1b

-> #0 (&hdev->lock){+.+.+.}:
       [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
       [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
       [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

other info that might help us debug this:

Chain exists of:
  &hdev->lock --> slock-AF_BLUETOOTH-BTPROTO_L2CAP --> (&(&conn->disc_work)->work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((&(&conn->disc_work)->work));
                               lock(slock-AF_BLUETOOTH-BTPROTO_L2CAP);
                               lock((&(&conn->disc_work)->work));
  lock(&hdev->lock);

 *** DEADLOCK ***

2 locks held by kworker/u:1/17:
 #0:  (hdev->name){.+.+.+}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf
 #1:  ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

stack backtrace:
Pid: 17, comm: kworker/u:1 Not tainted 3.2.0+ #1
Call Trace:
 [<ffffffff812e06c6>] print_circular_bug+0x1f8/0x209
 [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
 [<ffffffff81021ef2>] ? arch_local_irq_restore+0x6/0xd
 [<ffffffff81022bc7>] ? vprintk+0x3f9/0x41e
 [<ffffffff81057444>] lock_acquire+0x8a/0xa7
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff81190fd6>] ? __dynamic_pr_debug+0x6d/0x6f
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff8105320f>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
 [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff810357af>] process_one_work+0x178/0x2bf
 [<ffffffff81035751>] ? process_one_work+0x11a/0x2bf
 [<ffffffff81055af3>] ? lock_acquired+0x1d0/0x1df
 [<ffffffffa00410f3>] ? hci_acl_disconn+0x65/0x65 [bluetooth]
 [<ffffffff81036178>] worker_thread+0xce/0x152
 [<ffffffff810407ed>] ? finish_task_switch+0x45/0xc5
 [<ffffffff810360aa>] ? manage_workers.isra.25+0x16a/0x16a
 [<ffffffff81039a03>] kthread+0x95/0x9d
 [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10
 [<ffffffff812e5db4>] ? retint_restore_args+0x13/0x13
 [<ffffffff8103996e>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff812e7750>] ? gs_change+0x13/0x13

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Reviewed-by: Ulisses Furquim <ulisses@profusion.mobi>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2f304d1
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Luck, Tony x86: Remove some noise from boot log when starting cpus
Printing the "start_ip" for every secondary cpu is very noisy on a large
system - and doesn't add any value. Drop this message.

Console log before:
Booting Node   0, Processors  #1
smpboot cpu 1: start_ip = 96000
 #2
smpboot cpu 2: start_ip = 96000
 #3
smpboot cpu 3: start_ip = 96000
 #4
smpboot cpu 4: start_ip = 96000
       ...
 #31
smpboot cpu 31: start_ip = 96000
Brought up 32 CPUs

Console log after:
Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
Booting Node   0, Processors  #16 #17 #18 #19 #20 #21 #22 #23 Ok.
Booting Node   1, Processors  #24 #25 #26 #27 #28 #29 #30 #31
Brought up 32 CPUs

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
140f190
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@djbw djbw [SCSI] libsas: fix lifetime of SAS_HA_FROZEN
Until all sas_tasks are known to no longer be in-flight this flag gates late
completions from colliding with error handling.  However, it must be cleared
prior to the submission of scsi_send_eh_cmnd() requests, otherwise those
commands will never be completed correctly.

This was spotted by slub debug:
 =============================================================================
 BUG sas_task: Objects remaining on kmem_cache_close()
 -----------------------------------------------------------------------------

 INFO: Slab 0xffffea001f0eba00 objects=34 used=1 fp=0xffff8807c3aecb00 flags=0x8000000000004080
 Pid: 22919, comm: modprobe Not tainted 3.2.0-isci+ #2
 Call Trace:
  [<ffffffff810fcdcd>] slab_err+0xb0/0xd2
  [<ffffffff810e1c50>] ? free_percpu+0x31/0x117
  [<ffffffff81100122>] ? kzalloc+0x14/0x16
  [<ffffffff81100122>] ? kzalloc+0x14/0x16
  [<ffffffff81100486>] kmem_cache_destroy+0x11d/0x270
  [<ffffffffa0112bdc>] sas_class_exit+0x10/0x12 [libsas]
  [<ffffffff81078fba>] sys_delete_module+0x1c4/0x23c
  [<ffffffff814797ba>] ? sysret_check+0x2e/0x69
  [<ffffffff8126479e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff81479782>] system_call_fastpath+0x16/0x1b
 INFO: Object 0xffff8807c3aed280 @offset=21120
 INFO: Allocated in sas_alloc_task+0x22/0x90 [libsas] age=4615311 cpu=2 pid=12966
  __slab_alloc.clone.3+0x1d1/0x234
  kmem_cache_alloc+0x52/0x10d
  sas_alloc_task+0x22/0x90 [libsas]
  sas_queuecommand+0x20e/0x230 [libsas]
  scsi_send_eh_cmnd+0xd1/0x30c
  scsi_eh_try_stu+0x4f/0x6b
  scsi_eh_ready_devs+0xba/0x6ef
  sas_scsi_recover_host+0xa35/0xab1 [libsas]
  scsi_error_handler+0x14b/0x5fa
  kthread+0x9d/0xa5
  kernel_thread_helper+0x4/0x10

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
8402347
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Srivatsa S. Bhat x86, mce: Fix rcu splat in drain_mce_log_buffer()
While booting, the following message is seen:

[   21.665087] ===============================
[   21.669439] [ INFO: suspicious RCU usage. ]
[   21.673798] 3.2.0-0.0.0.28.36b5ec9-default #2 Not tainted
[   21.681353] -------------------------------
[   21.685864] arch/x86/kernel/cpu/mcheck/mce.c:194 suspicious rcu_dereference_index_check() usage!
[   21.695013]
[   21.695014] other info that might help us debug this:
[   21.695016]
[   21.703488]
[   21.703489] rcu_scheduler_active = 1, debug_locks = 1
[   21.710426] 3 locks held by modprobe/2139:
[   21.714754]  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff8133afd3>] __driver_attach+0x53/0xa0
[   21.725020]  #1:
[   21.725323] ioatdma: Intel(R) QuickData Technology Driver 4.00
[   21.733206]  (&__lockdep_no_validate__){......}, at: [<ffffffff8133afe1>] __driver_attach+0x61/0xa0
[   21.743015]  #2:  (i7core_edac_lock){+.+.+.}, at: [<ffffffffa01cfa5f>] i7core_probe+0x1f/0x5c0 [i7core_edac]
[   21.753708]
[   21.753709] stack backtrace:
[   21.758429] Pid: 2139, comm: modprobe Not tainted 3.2.0-0.0.0.28.36b5ec9-default #2
[   21.768253] Call Trace:
[   21.770838]  [<ffffffff810977cd>] lockdep_rcu_suspicious+0xcd/0x100
[   21.777366]  [<ffffffff8101aa41>] drain_mcelog_buffer+0x191/0x1b0
[   21.783715]  [<ffffffff8101aa78>] mce_register_decode_chain+0x18/0x20
[   21.790430]  [<ffffffffa01cf8db>] i7core_register_mci+0x2fb/0x3e4 [i7core_edac]
[   21.798003]  [<ffffffffa01cfb14>] i7core_probe+0xd4/0x5c0 [i7core_edac]
[   21.804809]  [<ffffffff8129566b>] local_pci_probe+0x5b/0xe0
[   21.810631]  [<ffffffff812957c9>] __pci_device_probe+0xd9/0xe0
[   21.816650]  [<ffffffff813362e4>] ? get_device+0x14/0x20
[   21.822178]  [<ffffffff81296916>] pci_device_probe+0x36/0x60
[   21.828061]  [<ffffffff8133ac8a>] really_probe+0x7a/0x2b0
[   21.833676]  [<ffffffff8133af23>] driver_probe_device+0x63/0xc0
[   21.839868]  [<ffffffff8133b01b>] __driver_attach+0x9b/0xa0
[   21.845718]  [<ffffffff8133af80>] ? driver_probe_device+0xc0/0xc0
[   21.852027]  [<ffffffff81339168>] bus_for_each_dev+0x68/0x90
[   21.857876]  [<ffffffff8133aa3c>] driver_attach+0x1c/0x20
[   21.863462]  [<ffffffff8133a64d>] bus_add_driver+0x16d/0x2b0
[   21.869377]  [<ffffffff8133b6dc>] driver_register+0x7c/0x160
[   21.875220]  [<ffffffff81296bda>] __pci_register_driver+0x6a/0xf0
[   21.881494]  [<ffffffffa01fe000>] ? 0xffffffffa01fdfff
[   21.886846]  [<ffffffffa01fe047>] i7core_init+0x47/0x1000 [i7core_edac]
[   21.893737]  [<ffffffff810001ce>] do_one_initcall+0x3e/0x180
[   21.899670]  [<ffffffff810a9b95>] sys_init_module+0xc5/0x220
[   21.905542]  [<ffffffff8149bc39>] system_call_fastpath+0x16/0x1b

Fix this by using ACCESS_ONCE() instead of rcu_dereference_check_mce()
over mcelog.next. Since the access to each entry is controlled by the
->finished field, ACCESS_ONCE() should work just fine. An rcu_dereference
is unnecessary here.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
b11e3d7
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@paulgortmaker paulgortmaker device.h: audit and cleanup users in main include dir
The <linux/device.h> header includes a lot of stuff, and
it in turn gets a lot of use just for the basic "struct device"
which appears so often.

Clean up the users as follows:

1) For those headers only needing "struct device" as a pointer
in fcn args, replace the include with exactly that.

2) For headers not really using anything from device.h, simply
delete the include altogether.

3) For headers relying on getting device.h implicitly before
being included themselves, now explicitly include device.h

4) For files in which doing #1 or #2 uncovers an implicit
dependency on some other header, fix by explicitly adding
the required header(s).

Any C files that were implicitly relying on device.h to be
present have already been dealt with in advance.

Total removals from #1 and #2: 51.  Total additions coming
from #3: 9.  Total other implicit dependencies from #4: 7.

As of 3.3-rc1, there were 110, so a net removal of 42 gives
about a 38% reduction in device.h presence in include/*

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
313162d
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@dhowells dhowells Disintegrate asm/system.h for Blackfin [ver #2]
Disintegrate asm/system.h for Blackfin.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: uclinux-dist-devel@blackfin.uclinux.org
Signed-off-by: Bob Liu <lliubbo@gmail.com>
3bed8d6
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@torvalds torvalds Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…
…/git/lliubbo/blackfin

Pull blackfin updates from Bob Liu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin: (24 commits)
  blackfin: clean up string bfin_dma_5xx after rename.
  blackfin:dma: rename bfin_dma_5xx.c to bfin_dma.c
  bf548: ssm2602: Add ssm2602 platform data into bf548 ezkit board file.
  Blackfin: s/#if CONFIG/#ifdef CONFIG/
  Blackfin: pnav: delete duplicate linux/export.h include
  bf561: add ppi DLEN macro for 10bits to 16bits
  arch: blackfin: udpate defconfig
  Disintegrate asm/system.h for Blackfin [ver #2]
  arch/blackfin: don't generate random mac in bfin_get_ether_addr()
  Blackfin: wire up new process_vm syscalls
  blackfin: cleanup anomaly workarounds
  blackfin: update default defconfig
  blackfin: thread_info: add suspend flag
  bfin: add bfin_ad73311_machine platform device
  blackfin: bf537: stamp: update board file for 193x
  blackfin: kgdb: skip hardware watchpoint test
  bf548: add ppi interrupt mask and blanking clocks
  blackfin: bf561: forgot CSYNC in get_core_lock_noflush
  spi/bfin_spi: drop bits_per_word from client data
  blackfin: cplb-mpu: fix page mask table overflow
  ...
4f5b1af
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@torvalds torvalds Merge branch 'amba' of git://git.linaro.org/people/rmk/linux-arm
Pull #2 ARM updates from Russell King:
 "Further ARM AMBA primecell updates which aren't included directly in
  the previous commit.  I wanted to keep these separate as they're
  touching stuff outside arch/arm/."

* 'amba' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7362/1: AMBA: Add module_amba_driver() helper macro for amba_driver
  ARM: 7335/1: mach-u300: do away with MMC config files
  ARM: 7280/1: mmc: mmci: Cache MMCICLOCK and MMCIPOWER register
  ARM: 7309/1: realview: fix unconnected interrupts on EB11MP
  ARM: 7230/1: mmc: mmci: Fix PIO read for small SDIO packets
  ARM: 7227/1: mmc: mmci: Prepare for SDIO before setting up DMA job
  ARM: 7223/1: mmc: mmci: Fixup use of runtime PM and use autosuspend
  ARM: 7221/1: mmc: mmci: Change from using legacy suspend
  ARM: 7219/1: mmc: mmci: Change vdd_handler to a generic ios_handler
  ARM: 7218/1: mmc: mmci: Provide option to configure bus signal direction
  ARM: 7217/1: mmc: mmci: Put power register deviations in variant data
  ARM: 7216/1: mmc: mmci: Do not release spinlock in request_end
  ARM: 7215/1: mmc: mmci: Increase max_segs from 16 to 128
0d19eac
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@paulgortmaker paulgortmaker blackfin: fix cmpxchg build fails from system.h fallout
Commit 3bed8d6 ("Disintegrate asm/system.h for Blackfin [ver #2]")
introduced arch/blackfin/include/asm/cmpxchg.h but has it also including
the asm-generic one which causes this:

  CC      arch/blackfin/kernel/asm-offsets.s
  In file included from arch/blackfin/include/asm/cmpxchg.h:125:0,
                 from arch/blackfin/include/asm/atomic.h:10,
                 from include/linux/atomic.h:4,
                 from include/linux/spinlock.h:384,
                 from include/linux/seqlock.h:29,
                 from include/linux/time.h:8,
                 from include/linux/timex.h:56,
                 from include/linux/sched.h:57,
                 from arch/blackfin/kernel/asm-offsets.c:10:
  include/asm-generic/cmpxchg.h:24:15: error: redefinition of '__xchg'
  arch/blackfin/include/asm/cmpxchg.h:82:29: note: previous definition of '__xchg' was here
  make[2]: *** [arch/blackfin/kernel/asm-offsets.s] Error 1

It really only needs two simple defines from asm-generic, so just use
those instead.

Cc: Bob Liu <lliubbo@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1512cdc
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@djbw djbw kobject: provide more diagnostic info for kobject_add_internal() fail…
…ures

1/ convert open-coded KERN_ERR+dump_stack() to WARN(), so that automated
   tools pick up this warning.

2/ include the 'child' and 'parent' kobject names.  This information was
   useful for tracking down the case where scsi invoked device_del() on a
   parent object and subsequently invoked device_add() on a child.  Now the
   warning looks like:

     kobject_add_internal failed for target8:0:16 (error: -2 parent: end_device-8:0:24)
     Pid: 2942, comm: scsi_scan_8 Not tainted 3.3.0-rc7-isci+ #2
     Call Trace:
      [<ffffffff8125e551>] kobject_add_internal+0x1c1/0x1f3
      [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
      [<ffffffff8125e659>] kobject_add_varg+0x41/0x50
      [<ffffffff8125e723>] kobject_add+0x64/0x66
      [<ffffffff8131124b>] device_add+0x12d/0x63a
      [<ffffffff8125e0ef>] ? kobject_put+0x4c/0x50
      [<ffffffff8132f370>] scsi_sysfs_add_sdev+0x4e/0x28a
      [<ffffffff8132dce3>] do_scan_async+0x9c/0x145

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
282029c
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@rolandd rolandd IB/uverbs: Lock SRQ / CQ / PD objects in a consistent order
Since XRC support was added, the uverbs code has locked SRQ, CQ and PD
objects needed during QP and SRQ creation in different orders
depending on the the code path.  This leads to the (at least
theoretical) possibility of deadlock, and triggers the lockdep splat
below.

Fix this by making sure we always lock the SRQ first, then CQs and
finally the PD.

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.4.0-rc5+ #34 Not tainted
    -------------------------------------------------------
    ibv_srq_pingpon/2484 is trying to acquire lock:
     (SRQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    but task is already holding lock:
     (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #2 (CQ-uobj){+++++.}:
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00b16c3>] ib_uverbs_create_qp+0x180/0x684 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    -> #1 (PD-uobj){++++++}:
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00af8ad>] __uverbs_create_xsrq+0x96/0x386 [ib_uverbs]
           [<ffffffffa00b31b9>] ib_uverbs_detach_mcast+0x1cd/0x1e6 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    -> #0 (SRQ-uobj){+++++.}:
           [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    other info that might help us debug this:

    Chain exists of:
      SRQ-uobj --> PD-uobj --> CQ-uobj

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(CQ-uobj);
                                   lock(PD-uobj);
                                   lock(CQ-uobj);
      lock(SRQ-uobj);

     *** DEADLOCK ***

    3 locks held by ibv_srq_pingpon/2484:
     #0:  (QP-uobj){+.+...}, at: [<ffffffffa00b162c>] ib_uverbs_create_qp+0xe9/0x684 [ib_uverbs]
     #1:  (PD-uobj){++++++}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
     #2:  (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    stack backtrace:
    Pid: 2484, comm: ibv_srq_pingpon Not tainted 3.4.0-rc5+ #34
    Call Trace:
     [<ffffffff8137eff0>] print_circular_bug+0x1f8/0x209
     [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
     [<ffffffffa00af37c>] ? __idr_get_uobj+0x20/0x5e [ib_uverbs]
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffff81070eee>] ? lock_release+0x166/0x189
     [<ffffffff81384f28>] down_read+0x34/0x43
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
     [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
     [<ffffffff81070fec>] ? lock_acquire+0xdb/0xfe
     [<ffffffff81070c09>] ? lock_release_non_nested+0x94/0x213
     [<ffffffff810d470f>] ? might_fault+0x40/0x90
     [<ffffffff810d470f>] ? might_fault+0x40/0x90
     [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
     [<ffffffff810fe47f>] vfs_write+0xa7/0xee
     [<ffffffff810ff736>] ? fget_light+0x3b/0x99
     [<ffffffff810fe65f>] sys_write+0x45/0x69
     [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
5909ce5
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@shefty shefty RDMA/cma: Fix lockdep false positive recursive locking
The following lockdep problem was reported by Or Gerlitz <ogerlitz@mellanox.com>:

    [ INFO: possible recursive locking detected ]
    3.3.0-32035-g1b2649e-dirty #4 Not tainted
    ---------------------------------------------
    kworker/5:1/418 is trying to acquire lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0138a41>] rdma_destroy_i    d+0x33/0x1f0 [rdma_cm]

    but task is already holding lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_ca    llback+0x24/0x45 [rdma_cm]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&id_priv->handler_mutex);
      lock(&id_priv->handler_mutex);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    3 locks held by kworker/5:1/418:
     #0:  (ib_cm){.+.+.+}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a    6
     #1:  ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81042ac1>] process_on    e_work+0x210/0x4a6
     #2:  (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disab    le_callback+0x24/0x45 [rdma_cm]

    stack backtrace:
    Pid: 418, comm: kworker/5:1 Not tainted 3.3.0-32035-g1b2649e-dirty #4
    Call Trace:
     [<ffffffff8102b0fb>] ? console_unlock+0x1f4/0x204
     [<ffffffff81068771>] __lock_acquire+0x16b5/0x174e
     [<ffffffff8106461f>] ? save_trace+0x3f/0xb3
     [<ffffffff810688fa>] lock_acquire+0xf0/0x116
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81364351>] mutex_lock_nested+0x64/0x2ce
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff81065abc>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffffa0139c02>] cma_req_handler+0x418/0x644 [rdma_cm]
     [<ffffffffa012ee88>] cm_process_work+0x32/0x119 [ib_cm]
     [<ffffffffa0130299>] cm_req_handler+0x928/0x982 [ib_cm]
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffffa0130326>] cm_work_handler+0x33/0xfe5 [ib_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffff81042b6e>] process_one_work+0x2bd/0x4a6
     [<ffffffff81042ac1>] ? process_one_work+0x210/0x4a6
     [<ffffffff813669f3>] ? _raw_spin_unlock_irq+0x2b/0x40
     [<ffffffff8104316e>] worker_thread+0x1d6/0x350
     [<ffffffff81042f98>] ? rescuer_thread+0x241/0x241
     [<ffffffff81046a32>] kthread+0x84/0x8c
     [<ffffffff8136e854>] kernel_thread_helper+0x4/0x10
     [<ffffffff81366d59>] ? retint_restore_args+0xe/0xe
     [<ffffffff810469ae>] ? __init_kthread_worker+0x56/0x56
     [<ffffffff8136e850>] ? gs_change+0xb/0xb

The actual locking is fine, since we're dealing with different locks,
but from the same lock class.  cma_disable_callback() acquires the
listening id mutex, whereas rdma_destroy_id() acquires the mutex for
the new connection id.  To fix this, delay the call to
rdma_destroy_id() until we've released the listening id mutex.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
b6cec8a
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Ben Myers xfs: protect xfs_sync_worker with s_umount semaphore
xfs_sync_worker checks the MS_ACTIVE flag in s_flags to avoid doing
work during mount and unmount.  This flag can be cleared by unmount
after the xfs_sync_worker checks it but before the work is completed.
The has caused crashes in the completion handler for the dummy
transaction commited by xfs_sync_worker:

PID: 27544  TASK: ffff88013544e040  CPU: 3   COMMAND: "kworker/3:0"
 #0 [ffff88016fdff930] machine_kexec at ffffffff810244e9
 #1 [ffff88016fdff9a0] crash_kexec at ffffffff8108d053
 #2 [ffff88016fdffa70] oops_end at ffffffff813ad1b8
 #3 [ffff88016fdffaa0] no_context at ffffffff8102bd48
 #4 [ffff88016fdffaf0] __bad_area_nosemaphore at ffffffff8102c04d
 #5 [ffff88016fdffb40] bad_area_nosemaphore at ffffffff8102c12e
 #6 [ffff88016fdffb50] do_page_fault at ffffffff813afaee
 #7 [ffff88016fdffc60] page_fault at ffffffff813ac635
    [exception RIP: xlog_get_lowest_lsn+0x30]
    RIP: ffffffffa04a9910  RSP: ffff88016fdffd10  RFLAGS: 00010246
    RAX: ffffc90014e48000  RBX: ffff88014d879980  RCX: ffff88014d879980
    RDX: ffff8802214ee4c0  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88016fdffd10   R8: ffff88014d879a80   R9: 0000000000000000
    R10: 0000000000000001  R11: 0000000000000000  R12: ffff8802214ee400
    R13: ffff88014d879980  R14: 0000000000000000  R15: ffff88022fd96605
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88016fdffd18] xlog_state_do_callback at ffffffffa04aa186 [xfs]
 #9 [ffff88016fdffd98] xlog_state_done_syncing at ffffffffa04aa568 [xfs]

Protect xfs_sync_worker by using the s_umount semaphore at the read
level to provide exclusion with unmount while work is progressing.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
1307bbd
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@shirishpargaonkar shirishpargaonkar cifs: Include backup intent search flags during searches {try #2)
As observed and suggested by Tushar Gosavi...

---------
readdir calls these function to send TRANS2_FIND_FIRST and
TRANS2_FIND_NEXT command to the server. The current cifs module is
not specifying CIFS_SEARCH_BACKUP_SEARCH flag while sending these
command when backupuid/backupgid is specified. This can be resolved
by specifying CIFS_SEARCH_BACKUP_SEARCH flag.
---------

Cc: <stable@kernel.org>
Reported-and-Tested-by: Tushar Gosavi <tugosavi@in.ibm.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2608bee
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Meenakshi Venkataraman iwlwifi: update BT traffic load states correctly
When BT traffic load changes from its
previous state, a new LQ command needs to be
sent down to the firmware. This needs to
be done only once per change. The state
variable that keeps track of this change is
last_bt_traffic_load. However, it was not
being updated when the change had been
handled. Not updating this variable was
causing a flood of advanced BT config
commands to be sent to the firmware. Fix
this.

Cc: stable@vger.kernel.org #2.6.38+
Signed-off-by: Meenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
882dde8
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Meenakshi Venkataraman iwlwifi: do not use shadow registers by default
Shadow registers in the device are meant to
allow the driver to update certain device
registers without needing to wake up all
components of the device. However, using
this feature in the device causes
communication between the driver and the
device to become unreliable, resulting in
host command timeouts.

Disable this feature by default till a fix is
available for the bug.

Cc: stable@vger.kernel.org #2.6.38+
Signed-off-by: Meenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
66a7707
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@taoma-tm taoma-tm ext4: protect group inode free counting with group lock
Now when we set the group inode free count, we don't have a proper
group lock so that multiple threads may decrease the inode free
count at the same time. And e2fsck will complain something like:

Free inodes count wrong for group #1 (1, counted=0).
Fix? no

Free inodes count wrong for group #2 (3, counted=0).
Fix? no

Directories count wrong for group #2 (780, counted=779).
Fix? no

Free inodes count wrong for group #3 (2272, counted=2273).
Fix? no

So this patch try to protect it with the ext4_lock_group.

btw, it is found by xfstests test case 269 and the volume is
mkfsed with the parameter
"-O ^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,ext_attr"
and I have run it 100 times and the error in e2fsck doesn't
show up again.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
6f2e9f0
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@torvalds torvalds Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS updates from Steve French.

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6: (29 commits)
  cifs: fix oops while traversing open file list (try #4)
  cifs: Fix comment as d_alloc_root() is replaced by d_make_root()
  CIFS: Introduce SMB2 mounts as vers=2.1
  CIFS: Introduce SMB2 Kconfig option
  CIFS: Move add/set_credits and get_credits_field to ops structure
  CIFS: Move protocol specific demultiplex thread calls to ops struct
  CIFS: Move protocol specific part from cifs_readv_receive to ops struct
  CIFS: Move header_size/max_header_size to ops structure
  CIFS: Move protocol specific part from SendReceive2 to ops struct
  cifs: Include backup intent search flags during searches {try #2)
  CIFS: Separate protocol specific part from setlk
  CIFS: Separate protocol specific part from getlk
  CIFS: Separate protocol specific lock type handling
  CIFS: Convert lock type to 32 bit variable
  CIFS: Move locks to cifsFileInfo structure
  cifs: convert send_nt_cancel into a version specific op
  cifs: add a smb_version_operations/values structures and a smb_version enum
  cifs: remove the vers= and version= synonyms for ver=
  cifs: add warning about change in default cache semantics in 3.7
  cifs: display cache= option in /proc/mounts
  ...
442a9ff
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Andrea Arcangeli mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race …
…condition

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
26c1917
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@dhowells dhowells FRV: Shrink TIF_WORK_MASK [ver #2]
Shrink TIF_WORK_MASK so that it will fit in the 12-bit signed immediate
operand field of an ANDI instruction.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1e5ef91
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@dhowells dhowells FRV: Optimise the system call exit path in entry.S [ver #2]
Optimise the system call exit path in entry.S by packing some instructions.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a2eddc7
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@torvalds torvalds Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…
…/git/viro/signal

Pull third pile of signal handling patches from Al Viro:
 "This time it's mostly helpers and conversions to them; there's a lot
  of stuff remaining in the tree, but that'll either go in -rc2
  (isolated bug fixes, ideally via arch maintainers' trees) or will sit
  there until the next cycle."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  x86: get rid of calling do_notify_resume() when returning to kernel mode
  blackfin: check __get_user() return value
  whack-a-mole with TIF_FREEZE
  FRV: Optimise the system call exit path in entry.S [ver #2]
  FRV: Shrink TIF_WORK_MASK [ver #2]
  FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions
  new helper: signal_delivered()
  powerpc: get rid of restore_sigmask()
  most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set
  set_restore_sigmask() is never called without SIGPENDING (and never should be)
  TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set
  don't call try_to_freeze() from do_signal()
  pull clearing RESTORE_SIGMASK into block_sigmask()
  sh64: failure to build sigframe != signal without handler
  openrisc: tracehook_signal_handler() is supposed to be called on success
  new helper: sigmask_to_save()
  new helper: restore_saved_sigmask()
  new helpers: {clear,test,test_and_clear}_restore_sigmask()
  HAVE_RESTORE_SIGMASK is defined on all architectures now
86c47b7
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@sgruszka sgruszka rt2x00: use atomic variable for seqno
Remove spinlock as atomic_t can be used instead. Note we use only 16
lower bits, upper bits are changed but we impilcilty cast to u16.

This fix possible deadlock on IBSS mode reproted by lockdep:

=================================
[ INFO: inconsistent lock state ]
3.4.0-wl+ #4 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/u:2/30374 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&intf->seqlock)->rlock){+.?...}, at: [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
{IN-SOFTIRQ-W} state was registered at:
  [<c04978ab>] __lock_acquire+0x47b/0x1050
  [<c0498504>] lock_acquire+0x84/0xf0
  [<c0835733>] _raw_spin_lock+0x33/0x40
  [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
  [<f9979f2a>] rt2x00queue_write_tx_frame+0x1a/0x300 [rt2x00lib]
  [<f997834f>] rt2x00mac_tx+0x7f/0x380 [rt2x00lib]
  [<f98fe363>] __ieee80211_tx+0x1b3/0x300 [mac80211]
  [<f98ffdf5>] ieee80211_tx+0x105/0x130 [mac80211]
  [<f99000dd>] ieee80211_xmit+0xad/0x100 [mac80211]
  [<f9900519>] ieee80211_subif_start_xmit+0x2d9/0x930 [mac80211]
  [<c0782e87>] dev_hard_start_xmit+0x307/0x660
  [<c079bb71>] sch_direct_xmit+0xa1/0x1e0
  [<c0784bb3>] dev_queue_xmit+0x183/0x730
  [<c078c27a>] neigh_resolve_output+0xfa/0x1e0
  [<c07b436a>] ip_finish_output+0x24a/0x460
  [<c07b4897>] ip_output+0xb7/0x100
  [<c07b2d60>] ip_local_out+0x20/0x60
  [<c07e01ff>] igmpv3_sendpack+0x4f/0x60
  [<c07e108f>] igmp_ifc_timer_expire+0x29f/0x330
  [<c04520fc>] run_timer_softirq+0x15c/0x2f0
  [<c0449e3e>] __do_softirq+0xae/0x1e0
irq event stamp: 18380437
hardirqs last  enabled at (18380437): [<c0526027>] __slab_alloc.clone.3+0x67/0x5f0
hardirqs last disabled at (18380436): [<c0525ff3>] __slab_alloc.clone.3+0x33/0x5f0
softirqs last  enabled at (18377616): [<c0449eb3>] __do_softirq+0x123/0x1e0
softirqs last disabled at (18377611): [<c041278d>] do_softirq+0x9d/0xe0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&intf->seqlock)->rlock);
  <Interrupt>
    lock(&(&intf->seqlock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:2/30374:
 #0:  (wiphy_name(local->hw.wiphy)){++++.+}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #1:  ((&sdata->work)){+.+.+.}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #2:  (&ifibss->mtx){+.+.+.}, at: [<f98f005b>] ieee80211_ibss_work+0x1b/0x470 [mac80211]
 #3:  (&intf->beacon_skb_mutex){+.+...}, at: [<f997a644>] rt2x00queue_update_beacon+0x24/0x50 [rt2x00lib]

stack backtrace:
Pid: 30374, comm: kworker/u:2 Not tainted 3.4.0-wl+ #4
Call Trace:
 [<c04962a6>] print_usage_bug+0x1f6/0x220
 [<c0496a12>] mark_lock+0x2c2/0x300
 [<c0495ff0>] ? check_usage_forwards+0xc0/0xc0
 [<c04978ec>] __lock_acquire+0x4bc/0x1050
 [<c0527890>] ? __kmalloc_track_caller+0x1c0/0x1d0
 [<c0777fb6>] ? copy_skb_header+0x26/0x90
 [<c0498504>] lock_acquire+0x84/0xf0
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<c0835733>] _raw_spin_lock+0x33/0x40
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f997a5cf>] rt2x00queue_update_beacon_locked+0x5f/0xb0 [rt2x00lib]
 [<f997a64d>] rt2x00queue_update_beacon+0x2d/0x50 [rt2x00lib]
 [<f9977e3a>] rt2x00mac_bss_info_changed+0x1ca/0x200 [rt2x00lib]
 [<f9977c70>] ? rt2x00mac_remove_interface+0x70/0x70 [rt2x00lib]
 [<f98e4dd0>] ieee80211_bss_info_change_notify+0xe0/0x1d0 [mac80211]
 [<f98ef7b8>] __ieee80211_sta_join_ibss+0x3b8/0x610 [mac80211]
 [<c0496ab4>] ? mark_held_locks+0x64/0xc0
 [<c0440012>] ? virt_efi_query_capsule_caps+0x12/0x50
 [<f98efb09>] ieee80211_sta_join_ibss+0xf9/0x140 [mac80211]
 [<f98f0456>] ieee80211_ibss_work+0x416/0x470 [mac80211]
 [<c0496d8b>] ? trace_hardirqs_on+0xb/0x10
 [<c077683b>] ? skb_dequeue+0x4b/0x70
 [<f98f207f>] ieee80211_iface_work+0x13f/0x230 [mac80211]
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<c045d015>] process_one_work+0x185/0x3f0
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<f98f1f40>] ? ieee80211_teardown_sdata+0xa0/0xa0 [mac80211]
 [<c045ed86>] worker_thread+0x116/0x270
 [<c045ec70>] ? manage_workers+0x1e0/0x1e0
 [<c0462f64>] kthread+0x84/0x90
 [<c0462ee0>] ? __init_kthread_worker+0x60/0x60
 [<c083d382>] kernel_thread_helper+0x6/0x10

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
e5851da
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@tiwai tiwai ALSA: usb-audio: Fix substream assignments
In 3.5 kernel, the endpoint is assigned dynamically for the
substreams, but the PCM assignment still checks the presence of the
endpoint pointer.  This ended up in duplicated PCM substream creations
at probing time, resulting in kernel warnings like:

WARNING: at fs/proc/generic.c:586 proc_register+0x169/0x1a6()
Pid: 1152, comm: modprobe Not tainted 3.5.0-rc1-00110-g71fae7e #2
Call Trace:
 [<ffffffff8102a400>] warn_slowpath_common+0x83/0x9c
 [<ffffffff8102a4bc>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff813829ad>] ? add_preempt_count+0x39/0x3b
 [<ffffffff811292f0>] proc_register+0x169/0x1a6
 [<ffffffff8112962e>] create_proc_entry+0x74/0x8c
 [<ffffffffa018eb63>] snd_info_register+0x3e/0xc3 [snd]
 [<ffffffffa01fde2e>] snd_pcm_new_stream+0xb1/0x404 [snd_pcm]
 [<ffffffffa024861f>] snd_usb_add_audio_stream+0xd2/0x230 [snd_usb_audio]
 [<ffffffffa0241d33>] ? snd_usb_parse_audio_format+0x252/0x34f [snd_usb_audio]
 [<ffffffff810d6b17>] ? kmem_cache_alloc_trace+0xab/0xbb
 [<ffffffffa0248c29>] snd_usb_parse_audio_interface+0x4ac/0x567 [snd_usb_audio]
 [<ffffffffa023f0ff>] snd_usb_create_stream+0xe9/0x125 [snd_usb_audio]
 [<ffffffffa023f9b1>] usb_audio_probe+0x62a/0x72c [snd_usb_audio]
 .....

This patch fixes the regression by checking the fixed endpoint number
for each substream instead of the endpoint pointer.

Reported-and-tested-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
8260ef0
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Chen Gong edac: avoid mce decoding crash after edac driver unloaded
Some edac drivers register themselves as mce decoders via
notifier_chain. But in current notifier_chain implementation logic,
it doesn't accept same notifier registered twice. If so, it will be
wrong when adding/removing the element from the list. For example,
on one SandyBridge platform, remove module sb_edac and then trigger
one error, it will hit oops because it has no mce decoder registered
but related notifier_chain still points to an invalid callback
function. Here is an example:

Call Trace:
 [<ffffffff8150ef6a>] atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff8102b936>] mce_log+0x46/0x180
 [<ffffffff8102eaea>] apei_mce_report_mem_error+0x4a/0x60
 [<ffffffff812e19d2>] ghes_do_proc+0x192/0x210
 [<ffffffff812e2066>] ghes_proc+0x46/0x70
 [<ffffffff812e20d8>] ghes_notify_sci+0x48/0x80
 [<ffffffff8150ef05>] notifier_call_chain+0x55/0x80
 [<ffffffff81076f1a>] __blocking_notifier_call_chain+0x5a/0x80
 [<ffffffff812aea11>] ? acpi_os_wait_events_complete+0x23/0x23
 [<ffffffff81076f56>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff812ddc4d>] acpi_hed_notify+0x19/0x1b
 [<ffffffff812b16bd>] acpi_device_notify+0x19/0x1b
 [<ffffffff812beb38>] acpi_ev_notify_dispatch+0x67/0x7f
 [<ffffffff812aea3a>] acpi_os_execute_deferred+0x29/0x36
 [<ffffffff81069dc2>] process_one_work+0x132/0x450
 [<ffffffff8106bbcb>] worker_thread+0x17b/0x3c0
 [<ffffffff8106ba50>] ? manage_workers+0x120/0x120
 [<ffffffff81070aee>] kthread+0x9e/0xb0
 [<ffffffff81514724>] kernel_thread_helper+0x4/0x10
 [<ffffffff81070a50>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff81514720>] ? gs_change+0x13/0x13
Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41
0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d <4c> 8b
79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41
RIP  [<ffffffff8150eef6>] notifier_call_chain+0x46/0x80
 RSP <ffff88042868fb20>
CR2: ffffffffa01af838
---[ end trace 0100930068e73e6f ]---
BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [<ffffffff810705b0>] kthread_data+0x10/0x20
PGD 1a0d067 PUD 1a0e067 PMD 0
Oops: 0000 [#2] SMP

Only i7core_edac and sb_edac have such issues because they have more
than one memory controller which means they have to register mce
decoder many times.

Cc: <stable@vger.kernel.org> # 3.2 and upper
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
e35fca4
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@elp elp cfg80211: fix potential deadlock in regulatory
reg_timeout_work() calls restore_regulatory_settings() which
takes cfg80211_mutex.

reg_set_request_processed() already holds cfg80211_mutex
before calling cancel_delayed_work_sync(reg_timeout),
so it might deadlock.

Call the async cancel_delayed_work instead, in order
to avoid the potential deadlock.

This is the relevant lockdep warning:

cfg80211: Calling CRDA for country: XX

======================================================
[ INFO: possible circular locking dependency detected ]
3.4.0-rc5-wl+ #26 Not tainted
-------------------------------------------------------
kworker/0:2/1391 is trying to acquire lock:
 (cfg80211_mutex){+.+.+.}, at: [<bf28ae00>] restore_regulatory_settings+0x34/0x418 [cfg80211]

but task is already holding lock:
 ((reg_timeout).work){+.+...}, at: [<c0059e94>] process_one_work+0x1f0/0x480

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((reg_timeout).work){+.+...}:
       [<c008fd44>] validate_chain+0xb94/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c005b600>] wait_on_work+0x4c/0x154
       [<c005c000>] __cancel_work_timer+0xd4/0x11c
       [<c005c064>] cancel_delayed_work_sync+0x1c/0x20
       [<bf28b274>] reg_set_request_processed+0x50/0x78 [cfg80211]
       [<bf28bd84>] set_regdom+0x550/0x600 [cfg80211]
       [<bf294cd8>] nl80211_set_reg+0x218/0x258 [cfg80211]
       [<c03c7738>] genl_rcv_msg+0x1a8/0x1e8
       [<c03c6a00>] netlink_rcv_skb+0x5c/0xc0
       [<c03c7584>] genl_rcv+0x28/0x34
       [<c03c6720>] netlink_unicast+0x15c/0x228
       [<c03c6c7c>] netlink_sendmsg+0x218/0x298
       [<c03933c8>] sock_sendmsg+0xa4/0xc0
       [<c039406c>] __sys_sendmsg+0x1e4/0x268
       [<c0394228>] sys_sendmsg+0x4c/0x70
       [<c0013840>] ret_fast_syscall+0x0/0x3c

-> #1 (reg_mutex){+.+.+.}:
       [<c008fd44>] validate_chain+0xb94/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c04734dc>] mutex_lock_nested+0x48/0x320
       [<bf28b2cc>] reg_todo+0x30/0x538 [cfg80211]
       [<c0059f44>] process_one_work+0x2a0/0x480
       [<c005a4b4>] worker_thread+0x1bc/0x2bc
       [<c0061148>] kthread+0x98/0xa4
       [<c0014af4>] kernel_thread_exit+0x0/0x8

-> #0 (cfg80211_mutex){+.+.+.}:
       [<c008ed58>] print_circular_bug+0x68/0x2cc
       [<c008fb28>] validate_chain+0x978/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c04734dc>] mutex_lock_nested+0x48/0x320
       [<bf28ae00>] restore_regulatory_settings+0x34/0x418 [cfg80211]
       [<bf28b200>] reg_timeout_work+0x1c/0x20 [cfg80211]
       [<c0059f44>] process_one_work+0x2a0/0x480
       [<c005a4b4>] worker_thread+0x1bc/0x2bc
       [<c0061148>] kthread+0x98/0xa4
       [<c0014af4>] kernel_thread_exit+0x0/0x8

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex --> (reg_timeout).work

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((reg_timeout).work);
                               lock(reg_mutex);
                               lock((reg_timeout).work);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

2 locks held by kworker/0:2/1391:
 #0:  (events){.+.+.+}, at: [<c0059e94>] process_one_work+0x1f0/0x480
 #1:  ((reg_timeout).work){+.+...}, at: [<c0059e94>] process_one_work+0x1f0/0x480

stack backtrace:
[<c001b928>] (unwind_backtrace+0x0/0x12c) from [<c0471d3c>] (dump_stack+0x20/0x24)
[<c0471d3c>] (dump_stack+0x20/0x24) from [<c008ef70>] (print_circular_bug+0x280/0x2cc)
[<c008ef70>] (print_circular_bug+0x280/0x2cc) from [<c008fb28>] (validate_chain+0x978/0x10f0)
[<c008fb28>] (validate_chain+0x978/0x10f0) from [<c0090b68>] (__lock_acquire+0x8c8/0x9b0)
[<c0090b68>] (__lock_acquire+0x8c8/0x9b0) from [<c0090d40>] (lock_acquire+0xf0/0x114)
[<c0090d40>] (lock_acquire+0xf0/0x114) from [<c04734dc>] (mutex_lock_nested+0x48/0x320)
[<c04734dc>] (mutex_lock_nested+0x48/0x320) from [<bf28ae00>] (restore_regulatory_settings+0x34/0x418 [cfg80211])
[<bf28ae00>] (restore_regulatory_settings+0x34/0x418 [cfg80211]) from [<bf28b200>] (reg_timeout_work+0x1c/0x20 [cfg80211])
[<bf28b200>] (reg_timeout_work+0x1c/0x20 [cfg80211]) from [<c0059f44>] (process_one_work+0x2a0/0x480)
[<c0059f44>] (process_one_work+0x2a0/0x480) from [<c005a4b4>] (worker_thread+0x1bc/0x2bc)
[<c005a4b4>] (worker_thread+0x1bc/0x2bc) from [<c0061148>] (kthread+0x98/0xa4)
[<c0061148>] (kthread+0x98/0xa4) from [<c0014af4>] (kernel_thread_exit+0x0/0x8)
cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)

Cc: stable@kernel.org
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
fe20b39
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Borislav Petkov x86/smp: Fix topology checks on AMD MCM CPUs
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical" nodes in the same physical socket which is reflected in the
CPU-to-node map on x86 with NUMA.

Which makes this check wrong on the above topologies so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  #6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  #7 #8 #9 #10 #11 Ok.
[    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
[    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
161270f
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Don Zickus watchdog: Quiet down the boot messages
A bunch of bugzillas have complained how noisy the nmi_watchdog
is during boot-up especially with its expected failure cases
(like virt and bios resource contention).

This is my attempt to quiet them down and keep it less confusing
for the end user.  What I did is print the message for cpu0 and
save it for future comparisons.  If future cpus have an
identical message as cpu0, then don't print the redundant info.
However, if a future cpu has a different message, happily print
that loudly.

Before the change, you would see something like:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog enabled, takes one hw-pmu counter.
    Booting Node   0, Processors  #1
    NMI watchdog enabled, takes one hw-pmu counter.
     #2
    NMI watchdog enabled, takes one hw-pmu counter.
     #3 Ok.
    NMI watchdog enabled, takes one hw-pmu counter.
    Brought up 4 CPUs
    Total of 4 processors activated (22607.24 BogoMIPS).

After the change, it is simplified to:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
    Booting Node   0, Processors  #1 #2 #3 Ok.
    Brought up 4 CPUs

V2: little changes based on Joe Perches' feedback
V3: printk cleanup based on Ingo's feedback; checkpatch fix
V4: keep printk as one long line
V5: Ingo fix ups

Reported-and-tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: nzimmer@sgi.com
Cc: joe@perches.com
Link: http://lkml.kernel.org/r/1339594548-17227-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
a702704
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@bmork bmork net: qmi_wwan: fix Gobi device probing
Ignoring interfaces with additional descriptors is not a reliable
method for locating the correct interface on Gobi devices.  There
is at least one device where this method fails:
https://bbs.archlinux.org/viewtopic.php?id=143506

The result is that the AT command port (interface #2) is hidden
from qcserial, preventing traditional serial modem usage:

[   15.562552] qmi_wwan 4-1.6:1.0: cdc-wdm0: USB WDM device
[   15.562691] qmi_wwan 4-1.6:1.0: wwan0: register 'qmi_wwan' at usb-0000:00:1d.0-1.6, Qualcomm Gobi wwan/QMI device, 1e:df:3c:3a:4e:3b
[   15.563383] qmi_wwan: probe of 4-1.6:1.1 failed with error -22
[   15.564189] qmi_wwan 4-1.6:1.2: cdc-wdm1: USB WDM device
[   15.564302] qmi_wwan 4-1.6:1.2: wwan1: register 'qmi_wwan' at usb-0000:00:1d.0-1.6, Qualcomm Gobi wwan/QMI device, 1e:df:3c:3a:4e:3b
[   15.564328] qmi_wwan: probe of 4-1.6:1.3 failed with error -22
[   15.569376] qcserial 4-1.6:1.1: Qualcomm USB modem converter detected
[   15.569440] usb 4-1.6: Qualcomm USB modem converter now attached to ttyUSB0
[   15.570372] qcserial 4-1.6:1.3: Qualcomm USB modem converter detected
[   15.570430] usb 4-1.6: Qualcomm USB modem converter now attached to ttyUSB1

Use static interface numbers taken from the interface map in
qcserial for all Gobi devices instead:

	Gobi 1K USB layout:
	0: serial port (doesn't respond)
	1: serial port (doesn't respond)
	2: AT-capable modem port
	3: QMI/net

	Gobi 2K+ USB layout:
	0: QMI/net
	1: DM/DIAG (use libqcdm from ModemManager for communication)
	2: AT-capable modem port
	3: NMEA

This should be more reliable over all, and will also prevent the
noisy "probe failed" messages.  The whitelisting logic is expected
to be replaced by direct interface number matching in 3.6.

Reported-by: Heinrich Siebmanns (Harvey) <H.Siebmanns@t-online.de>
Cc: <stable@vger.kernel.org> # v3.4: 0000188 USB: qmi_wwan: Make forced int 4 whitelist generic
Cc: <stable@vger.kernel.org> # v3.4: f7142e6 USB: qmi_wwan: Add ZTE (Vodafone) K3520-Z
Cc: <stable@vger.kernel.org> # v3.4
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
b9f90eb
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Eric Dumazet net: l2tp_eth: use LLTX to avoid LOCKDEP splats
Denys Fedoryshchenko reported a LOCKDEP issue with l2tp code.

[ 8683.927442] ======================================================
[ 8683.927555] [ INFO: possible circular locking dependency detected ]
[ 8683.927672] 3.4.1-build-0061 #14 Not tainted
[ 8683.927782] -------------------------------------------------------
[ 8683.927895] swapper/0/0 is trying to acquire lock:
[ 8683.928007]  (slock-AF_INET){+.-...}, at: [<e0fc73ec>]
l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]
[ 8683.928121] but task is already holding lock:
[ 8683.928121]  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>]
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] which lock already depends on the new lock.
[ 8683.928121]
[ 8683.928121]
[ 8683.928121] the existing dependency chain (in reverse order) is:
[ 8683.928121]
[ 8683.928121] -> #1 (_xmit_ETHER#2){+.-...}:
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<c0304e0c>] ip_send_reply+0xf2/0x1ce
[ 8683.928121]        [<c0317dbc>] tcp_v4_send_reset+0x153/0x16f
[ 8683.928121]        [<c0317f4a>] tcp_v4_do_rcv+0x172/0x194
[ 8683.928121]        [<c031929b>] tcp_v4_rcv+0x387/0x5a0
[ 8683.928121]        [<c03001d0>] ip_local_deliver_finish+0x13a/0x1e9
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c030075b>] ip_local_deliver+0x41/0x45
[ 8683.928121]        [<c03005dd>] ip_rcv_finish+0x31a/0x33c
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c0300960>] ip_rcv+0x201/0x23d
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02deae8>] netif_receive_skb+0x4e/0x7d
[ 8683.928121]        [<e08d5ef3>] rtl8139_poll+0x243/0x33d [8139too]
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] -> #0 (slock-AF_INET){+.-...}:
[ 8683.928121]        [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e
[l2tp_core]
[ 8683.928121]        [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f
[l2tp_eth]
[ 8683.928121]        [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]        [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]        [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]        [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]        [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] other info that might help us debug this:
[ 8683.928121]
[ 8683.928121]  Possible unsafe locking scenario:
[ 8683.928121]
[ 8683.928121]        CPU0                    CPU1
[ 8683.928121]        ----                    ----
[ 8683.928121]   lock(_xmit_ETHER#2);
[ 8683.928121]                                lock(slock-AF_INET);
[ 8683.928121]                                lock(_xmit_ETHER#2);
[ 8683.928121]   lock(slock-AF_INET);
[ 8683.928121]
[ 8683.928121]  *** DEADLOCK ***
[ 8683.928121]
[ 8683.928121] 3 locks held by swapper/0/0:
[ 8683.928121]  #0:  (rcu_read_lock){.+.+..}, at: [<c02dbc10>]
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #1:  (rcu_read_lock_bh){.+....}, at: [<c02dbc10>]
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #2:  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>]
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] stack backtrace:
[ 8683.928121] Pid: 0, comm: swapper/0 Not tainted 3.4.1-build-0061 #14
[ 8683.928121] Call Trace:
[ 8683.928121]  [<c034bdd2>] ? printk+0x18/0x1a
[ 8683.928121]  [<c0158904>] print_circular_bug+0x1ac/0x1b6
[ 8683.928121]  [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]  [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f [l2tp_eth]
[ 8683.928121]  [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]  [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]  [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]  [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]  [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]  [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]  [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]  [<c012b23a>] ? local_bh_enable+0xd/0xd
[ 8683.928121]  <IRQ>  [<c012b4d0>] ? irq_exit+0x41/0x91
[ 8683.928121]  [<c0103c6f>] ? do_IRQ+0x79/0x8d
[ 8683.928121]  [<c0157ea1>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 8683.928121]  [<c034ef6e>] ? common_interrupt+0x2e/0x34
[ 8683.928121]  [<c0108a33>] ? default_idle+0x23/0x38
[ 8683.928121]  [<c01091a8>] ? cpu_idle+0x55/0x6f
[ 8683.928121]  [<c033df25>] ? rest_init+0xa1/0xa7
[ 8683.928121]  [<c033de84>] ? __read_lock_failed+0x14/0x14
[ 8683.928121]  [<c0498745>] ? start_kernel+0x303/0x30a
[ 8683.928121]  [<c0498209>] ? repair_env_string+0x51/0x51
[ 8683.928121]  [<c04980a8>] ? i386_start_kernel+0xa8/0xaf

It appears that like most virtual devices, l2tp should be converted to
LLTX mode.

This patch takes care of statistics using atomic_long in both RX and TX
paths, and fix a bug in l2tp_eth_dev_recv(), which was caching skb->data
before a pskb_may_pull() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Cc: James Chapman <jchapman@katalix.com>
Cc: Hong zhi guo <honkiko@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a2842a1
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@bebarino bebarino regulator: Fix recursive mutex lockdep warning
A recursive lockdep warning occurs if you call
regulator_set_optimum_mode() on a regulator with a supply because
there is no nesting annotation for the rdev->mutex. To avoid this
warning, get the supply's load before locking the regulator's
mutex to avoid grabbing the same class of lock twice.

=============================================
[ INFO: possible recursive locking detected ]
3.4.0 #3257 Tainted: G        W
---------------------------------------------
swapper/0/1 is trying to acquire lock:
 (&rdev->mutex){+.+.+.}, at: [<c036e9e0>] regulator_get_voltage+0x18/0x38

but task is already holding lock:
 (&rdev->mutex){+.+.+.}, at: [<c036ef38>] regulator_set_optimum_mode+0x24/0x224

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&rdev->mutex);
  lock(&rdev->mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by swapper/0/1:
 #0:  (&__lockdep_no_validate__){......}, at: [<c03dbb48>] __driver_attach+0x40/0x8c
 #1:  (&__lockdep_no_validate__){......}, at: [<c03dbb58>] __driver_attach+0x50/0x8c
 #2:  (&rdev->mutex){+.+.+.}, at: [<c036ef38>] regulator_set_optimum_mode+0x24/0x224

stack backtrace:
[<c001521c>] (unwind_backtrace+0x0/0x12c) from [<c00cc4d4>] (validate_chain+0x760/0x1080)
[<c00cc4d4>] (validate_chain+0x760/0x1080) from [<c00cd744>] (__lock_acquire+0x950/0xa10)
[<c00cd744>] (__lock_acquire+0x950/0xa10) from [<c00cd990>] (lock_acquire+0x18c/0x1e8)
[<c00cd990>] (lock_acquire+0x18c/0x1e8) from [<c080c248>] (mutex_lock_nested+0x68/0x3c4)
[<c080c248>] (mutex_lock_nested+0x68/0x3c4) from [<c036e9e0>] (regulator_get_voltage+0x18/0x38)
[<c036e9e0>] (regulator_get_voltage+0x18/0x38) from [<c036efb8>] (regulator_set_optimum_mode+0xa4/0x224)
...

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
d92d95b
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@aholler aholler leds: heartbeat: fix bug on panic
With commit 49dca5a I introduced
a bug (visible if CONFIG_PROVE_RCU is enabled) which occures when a panic
has happened:

[ 1526.520230] ===============================
[ 1526.520230] [ INFO: suspicious RCU usage. ]
[ 1526.520230] 3.5.0-rc1+ #12 Not tainted
[ 1526.520230] -------------------------------
[ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
[ 1526.520230]
[ 1526.520230] other info that might help us debug this:
[ 1526.520230]
[ 1526.520230]
[ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
[ 1526.520230] 3 locks held by net.agent/3279:
[ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
[ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
[ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
[ 1526.520230]
[ 1526.520230] stack backtrace:
[ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
[ 1526.520230] Call Trace:
[ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
[ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
[ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
[ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
[ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
[ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
[ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
[ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
[ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
[ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
[ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
[ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3

So in case of a panic, now just turn of the LED. Other approaches like
scheduling a work to unregister the trigger aren't working because there
isn't much which still runs after a panic occured (except timers).

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
fb31fbe
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Dmitry Lifshitz [media] tvp5150: fix kernel crash if chip is unavailable
tvp5150 driver probe function doesn't check if the chip is present.
Thus the driver can be loaded without having a device.
This is dangerous and can cause kernel crash like this:

Kernel BUG at c03c0964 [verbose debug info unavailable]
Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
Modules linked in:
CPU: 0    Tainted: G        W     (3.4.0-cm-t3730+ #2)
PC is at media_entity_create_link+0xe4/0xf4
LR is at isp_register_entities+0x228/0x2f4
pc : [<c03c0964>]    lr : [<c03f3b30>]    psr: 60000013
sp : cf02de50  ip : 00000000  fp : c079405c
r10: 00000000  r9 : 00000000  r8 : cf33c800
r7 : c0794834  r6 : 00000000  r5 : 00000000  r4 : cf365b48
r3 : 00000000  r2 : cf365b48  r1 : 00000000  r0 : cf33c800
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 80004019  DAC: 00000015
Process swapper (pid: 1, stack limit = 0xcf02c2f0)
Stack: (0xcf02de50 to 0xcf02e000)
de40:                                     cf360000 cf366f28 cf366218 c0794834
de60: cf365b48 00000000 cf33c800 c03f3b30 00000000 00000000 cf360890 cf3668a0
de80: 00000003 cf360000 c0785a58 00000000 cf360528 00000000 00000000 00003fff
dea0: cf360500 c03f4cdc c06cc1f4 cf360000 c0785a58 c0d27808 c07d55ec c0785a58
dec0: c031f0e0 c07d55ec c0776900 000000bb 00000000 c032040c c03203f4 c031ef0c
dee0: c0785a58 c07d55ec c0785a8c c031f0e0 c075e670 c031f0c8 cf02deb8 c0785a58
df00: c07d55ec c031f174 c07d55ec 00000000 cf02df18 c031d7a0 cf01d4a8 cf068b10
df20: 00000000 c07d55ec c07c74d0 cf34bcc0 00000000 c031ded8 c0672340 c054cd38
df40: 00000000 c07e68c0 c07d55ec 00000000 00000000 c075e670 c0776900 000000bb
df60: 00000000 c031f770 c07e68c0 00000007 c07e68c0 00000000 c075e670 c0008790
df80: 000000bb 00000006 00000006 c066e650 cf02dfa4 c07689b8 c07689b8 00000007
dfa0: c07e68c0 c073f2e8 c07689c0 000000bb 00000000 c073f2bc 00000006 00000006
dfc0: c073f2e8 00000000 c077649c c077649c c00150cc 00000013 00000000 00000000
dfe0: 00000000 c073f3cc cf02dfe8 00000000 c073f368 c00150cc 00000000 00000000
[<c03c0964>] (media_entity_create_link+0xe4/0xf4) from [<c03f3b30>] (isp_register_entities+0x228/0x2f4)
[<c03f3b30>] (isp_register_entities+0x228/0x2f4) from [<c03f4cdc>] (isp_probe+0x7ac/0x9b8)
[<c03f4cdc>] (isp_probe+0x7ac/0x9b8) from [<c032040c>] (platform_drv_probe+0x18/0x1c)
[<c032040c>] (platform_drv_probe+0x18/0x1c) from [<c031ef0c>] (really_probe+0x64/0x1d8)
[<c031ef0c>] (really_probe+0x64/0x1d8) from [<c031f0c8>] (driver_probe_device+0x48/0x60)
[<c031f0c8>] (driver_probe_device+0x48/0x60) from [<c031f174>] (__driver_attach+0x94/0x98)
[<c031f174>] (__driver_attach+0x94/0x98) from [<c031d7a0>] (bus_for_each_dev+0x54/0x80)
[<c031d7a0>] (bus_for_each_dev+0x54/0x80) from [<c031ded8>] (bus_add_driver+0xac/0x2a8)
[<c031ded8>] (bus_add_driver+0xac/0x2a8) from [<c031f770>] (driver_register+0x78/0x180)
[<c031f770>] (driver_register+0x78/0x180) from [<c0008790>] (do_one_initcall+0x34/0x184)
[<c0008790>] (do_one_initcall+0x34/0x184) from [<c073f2bc>] (do_basic_setup+0x9c/0xc8)
[<c073f2bc>] (do_basic_setup+0x9c/0xc8) from [<c073f3cc>] (kernel_init+0x64/0xec)
[<c073f3cc>] (kernel_init+0x64/0xec) from [<c00150cc>] (kernel_thread_exit+0x0/0x8)
Code: e1c812b6 e8bd87f0 e7f001f2 eafffffe (e7f001f2)

---[ end trace 3ed3c618b26ff3e8 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

This patch fixes the tvp5150_read() function to return an error in case
the I2C transaction fails.
tvp5150_probe() and other relevant driver callbacks changed to check the
status of the I2C read operations.
In case of a read error throw an error message with v4l2_err()
instead of v4l2_dbg().

[mchehab@redhat.com: Fix a small typo breaking compilation]
Signed-off-by: Dmitry Lifshitz <lifshitz@compulab.co.il>
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
8cd0d4c
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@bebarino bebarino ARM: 7457/1: smp: Fix suspicious RCU originating from cpu_die()
While running hotplug tests I ran into this RCU splat

===============================
[ INFO: suspicious RCU usage. ]
3.4.0 #3275 Tainted: G        W
-------------------------------
include/linux/rcupdate.h:729 rcu_read_lock() used illegally while idle!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
4 locks held by swapper/2/0:
 #0:  ((cpu_died).wait.lock){......}, at: [<c00ab128>] complete+0x1c/0x5c
 #1:  (&p->pi_lock){-.-.-.}, at: [<c00b275c>] try_to_wake_up+0x2c/0x388
 #2:  (&rq->lock){-.-.-.}, at: [<c00b2860>] try_to_wake_up+0x130/0x388
 #3:  (rcu_read_lock){.+.+..}, at: [<c00abe5c>] cpuacct_charge+0x28/0x1f4

stack backtrace:
[<c001521c>] (unwind_backtrace+0x0/0x12c) from [<c00abec8>] (cpuacct_charge+0x94/0x1f4)
[<c00abec8>] (cpuacct_charge+0x94/0x1f4) from [<c00b395c>] (update_curr+0x24c/0x2c8)
[<c00b395c>] (update_curr+0x24c/0x2c8) from [<c00b59c4>] (enqueue_task_fair+0x50/0x194)
[<c00b59c4>] (enqueue_task_fair+0x50/0x194) from [<c00afea4>] (enqueue_task+0x30/0x34)
[<c00afea4>] (enqueue_task+0x30/0x34) from [<c00b0908>] (ttwu_activate+0x14/0x38)
[<c00b0908>] (ttwu_activate+0x14/0x38) from [<c00b28a8>] (try_to_wake_up+0x178/0x388)
[<c00b28a8>] (try_to_wake_up+0x178/0x388) from [<c00a82a0>] (__wake_up_common+0x34/0x78)
[<c00a82a0>] (__wake_up_common+0x34/0x78) from [<c00ab154>] (complete+0x48/0x5c)
[<c00ab154>] (complete+0x48/0x5c) from [<c07db7cc>] (cpu_die+0x2c/0x58)
[<c07db7cc>] (cpu_die+0x2c/0x58) from [<c000f954>] (cpu_idle+0x64/0xfc)
[<c000f954>] (cpu_idle+0x64/0xfc) from [<80208160>] (0x80208160)

When a cpu is marked offline during its idle thread it calls
cpu_die() during an RCU idle period. cpu_die() calls complete()
to notify the killing process that the cpu has died. complete()
calls into the scheduler code and eventually grabs an RCU read
lock in cpuacct_charge().

Mark complete() as RCU_NONIDLE so that RCU pays attention to this
CPU for the duration of the complete() function even though it's
in idle.

Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ff081e0
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@jtlayton jtlayton cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Cc: <stable@vger.kernel.org>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
3cf003c
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@jtlayton jtlayton nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
5cf02d0
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
@sgruszka sgruszka sched: fix divide by zero at {thread_group,task}_times
On architectures where cputime_t is 64 bit type, is possible to trigger
divide by zero on do_div(temp, (__force u32) total) line, if total is a
non zero number but has lower 32 bit's zeroed. Removing casting is not
a good solution since some do_div() implementations do cast to u32
internally.

This problem can be triggered in practice on very long lived processes:

  PID: 2331   TASK: ffff880472814b00  CPU: 2   COMMAND: "oraagent.bin"
   #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b
   #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2
   #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00
   #3 [ffff880472a51cd0] die at ffffffff8100f26b
   #4 [ffff880472a51d00] do_trap at ffffffff814f03f4
   #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff
   #6 [ffff880472a51e00] divide_error at ffffffff8100be7b
      [exception RIP: thread_group_times+0x56]
      RIP: ffffffff81056a16  RSP: ffff880472a51eb8  RFLAGS: 00010046
      RAX: bc3572c9fe12d194  RBX: ffff880874150800  RCX: 0000000110266fad
      RDX: 0000000000000000  RSI: ffff880472a51eb8  RDI: 001038ae7d9633dc
      RBP: ffff880472a51ef8   R8: 00000000b10a3a64   R9: ffff880874150800
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: ffff880472a51f08
      R13: ffff880472a51f10  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d
   #8 [ffff880472a51f40] sys_times at ffffffff81088524
   #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2
      RIP: 0000003808caac3a  RSP: 00007fcba27ab6d8  RFLAGS: 00000202
      RAX: 0000000000000064  RBX: ffffffff8100b0f2  RCX: 0000000000000000
      RDX: 00007fcba27ab6e0  RSI: 000000000076d58e  RDI: 00007fcba27ab6e0
      RBP: 00007fcba27ab700   R8: 0000000000000020   R9: 000000000000091b
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: 00007fff9ca41940
      R13: 0000000000000000  R14: 00007fcba27ac9c0  R15: 00007fff9ca41940
      ORIG_RAX: 0000000000000064  CS: 0033  SS: 002b

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120808092714.GA3580@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
bea6832
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Eric Dumazet tcp: fix possible socket refcount problem
Commit 6f458df (tcp: improve latencies of timer triggered events)
added bug leading to following trace :

[ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.131726]
[ 2866.132188] =========================
[ 2866.132281] [ BUG: held lock freed! ]
[ 2866.132281] 3.6.0-rc1+ #622 Not tainted
[ 2866.132281] -------------------------
[ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there!
[ 2866.132281]  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281] 4 locks held by kworker/0:1/652:
[ 2866.132281]  #0:  (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #1:  ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #2:  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281]  #3:  (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f
[ 2866.132281]
[ 2866.132281] stack backtrace:
[ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622
[ 2866.132281] Call Trace:
[ 2866.132281]  <IRQ>  [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159
[ 2866.132281]  [<ffffffff818a0839>] ? __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a
[ 2866.132281]  [<ffffffff818a0839>] __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff818a08c0>] sk_free+0x1c/0x1e
[ 2866.132281]  [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56
[ 2866.132281]  [<ffffffff81078082>] run_timer_softirq+0x218/0x35f
[ 2866.132281]  [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f
[ 2866.132281]  [<ffffffff810f5831>] ? rb_commit+0x58/0x85
[ 2866.132281]  [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148
[ 2866.132281]  [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9
[ 2866.132281]  [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e
[ 2866.132281]  [<ffffffff81a1227c>] call_softirq+0x1c/0x30
[ 2866.132281]  [<ffffffff81039f38>] do_softirq+0x4a/0xa6
[ 2866.132281]  [<ffffffff81070f2b>] irq_exit+0x51/0xad
[ 2866.132281]  [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4
[ 2866.132281]  [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f
[ 2866.132281]  <EOI>  [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1
[ 2866.132281]  [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56
[ 2866.132281]  [<ffffffff81078692>] mod_timer+0x178/0x1a9
[ 2866.132281]  [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26
[ 2866.132281]  [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4
[ 2866.132281]  [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70
[ 2866.132281]  [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4
[ 2866.132281]  [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1
[ 2866.132281]  [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a
[ 2866.132281]  [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6
[ 2866.132281]  [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5
[ 2866.132281]  [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4
[ 2866.132281]  [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5
[ 2866.132281]  [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f
[ 2866.132281]  [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43
[ 2866.132281]  [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80
[ 2866.132281]  [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0
[ 2866.132281]  [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61
[ 2866.132281]  [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1
[ 2866.132281]  [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff81999d92>] call_transmit+0x1c5/0x20e
[ 2866.132281]  [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34
[ 2866.132281]  [<ffffffff810835d6>] process_one_work+0x24d/0x47f
[ 2866.132281]  [<ffffffff81083567>] ? process_one_work+0x1de/0x47f
[ 2866.132281]  [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225
[ 2866.132281]  [<ffffffff81083a6d>] worker_thread+0x236/0x317
[ 2866.132281]  [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f
[ 2866.132281]  [<ffffffff8108b7b8>] kthread+0x9a/0xa2
[ 2866.132281]  [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10
[ 2866.132281]  [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13
[ 2866.132281]  [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a
[ 2866.132281]  [<ffffffff81a12180>] ? gs_change+0x13/0x13
[ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.309689] =============================================================================
[ 2866.310254] BUG TCP (Not tainted): Object already free
[ 2866.310254] -----------------------------------------------------------------------------
[ 2866.310254]

The bug comes from the fact that timer set in sk_reset_timer() can run
before we actually do the sock_hold(). socket refcount reaches zero and
we free the socket too soon.

timer handler is not allowed to reduce socket refcnt if socket is owned
by the user, or we need to change sk_reset_timer() implementation.

We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED
or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags

Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED
was used instead of TCP_DELACK_TIMER_DEFERRED.

For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED,
even if not fired from a timer.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
144d56e
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Bruno Prémont fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
Switch to kmalloc(,GFP_ATOMIC) in bit_putcs to fix below trace:

[    9.771812] BUG: sleeping function called from invalid context at /usr/src/linux-git/mm/slub.c:943
[    9.771814] in_atomic(): 1, irqs_disabled(): 1, pid: 1063, name: mount
[    9.771818] Pid: 1063, comm: mount Not tainted 3.5.0-jupiter-00003-g8d858b1-dirty #2
[    9.771819] Call Trace:
[    9.771838]  [<c104f79b>] __might_sleep+0xcb/0xe0
[    9.771844]  [<c10c00d4>] __kmalloc+0xb4/0x1c0
[    9.771851]  [<c1041d4a>] ? queue_work+0x1a/0x30
[    9.771854]  [<c1041dcf>] ? queue_delayed_work+0xf/0x30
[    9.771862]  [<c1205832>] ? bit_putcs+0xf2/0x3e0
[    9.771865]  [<c1041e01>] ? schedule_delayed_work+0x11/0x20
[    9.771868]  [<c1205832>] bit_putcs+0xf2/0x3e0
[    9.771875]  [<c12002b8>] ? get_color.clone.14+0x28/0x100
[    9.771878]  [<c1200d2f>] fbcon_putcs+0x11f/0x130
[    9.771882]  [<c1205740>] ? bit_clear+0xe0/0xe0
[    9.771885]  [<c1200f6d>] fbcon_redraw.clone.21+0x11d/0x160
[    9.771889]  [<c120383d>] fbcon_scroll+0x79d/0xe10
[    9.771892]  [<c12002b8>] ? get_color.clone.14+0x28/0x100
[    9.771897]  [<c124c0b4>] scrup+0x64/0xd0
[    9.771900]  [<c124c22b>] lf+0x2b/0x60
[    9.771903]  [<c124cc95>] vt_console_print+0x1d5/0x2f0
[    9.771907]  [<c124cac0>] ? register_vt_notifier+0x20/0x20
[    9.771913]  [<c102b335>] call_console_drivers.clone.5+0xa5/0xc0
[    9.771916]  [<c102c58e>] console_unlock+0x2fe/0x3c0
[    9.771920]  [<c102ca16>] vprintk_emit+0x2e6/0x300
[    9.771924]  [<c13f01ae>] printk+0x38/0x3a
[    9.771931]  [<c112e8fe>] reiserfs_remount+0x2ae/0x3e0
[    9.771934]  [<c112e650>] ? reiserfs_fill_super+0xb00/0xb00
[    9.771939]  [<c10ca0ab>] do_remount_sb+0xab/0x150
[    9.771943]  [<c1034476>] ? ns_capable+0x46/0x70
[    9.771948]  [<c10e059c>] do_mount+0x20c/0x6b0
[    9.771955]  [<c10a7044>] ? strndup_user+0x34/0x50
[    9.771958]  [<c10e0acc>] sys_mount+0x6c/0xa0
[    9.771964]  [<c13f2557>] sysenter_do_call+0x12/0x26

According to comment in bit_putcs() that kammloc() call only happens
when fbcon is drawing to a monochrome framebuffer (which is my case with
hid-picolcd).

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
72caa5f
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Lauri Hintsala mmc: mxs-mmc: fix deadlock caused by recursion loop
Release the lock before mmc_signal_sdio_irq is called by
mxs_mmc_enable_sdio_irq.

Backtrace:
[   65.470000] =============================================
[   65.470000] [ INFO: possible recursive locking detected ]
[   65.470000] 3.5.0-rc5 #2 Not tainted
[   65.470000] ---------------------------------------------
[   65.470000] ksdioirqd/mmc0/73 is trying to acquire lock:
[   65.470000]  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] but task is already holding lock:
[   65.470000]  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] other info that might help us debug this:
[   65.470000]  Possible unsafe locking scenario:
[   65.470000]
[   65.470000]        CPU0
[   65.470000]        ----
[   65.470000]   lock(&(&host->lock)->rlock#2);
[   65.470000]   lock(&(&host->lock)->rlock#2);
[   65.470000]
[   65.470000]  *** DEADLOCK ***
[   65.470000]
[   65.470000]  May be due to missing lock nesting notation
[   65.470000]
[   65.470000] 1 lock held by ksdioirqd/mmc0/73:
[   65.470000]  #0:  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] stack backtrace:
[   65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98)
[   65.470000] [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98) from [<c005d3f8>] (lock_acquire+0xa0/0x108)
[   65.470000] [<c005d3f8>] (lock_acquire+0xa0/0x108) from [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c)
[   65.470000] [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[   65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[   65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[   65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[   65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)
[   65.470000] BUG: spinlock lockup suspected on CPU#0, ksdioirqd/mmc0/73
[   65.470000]  lock: 0xc3358724, .magic: dead4ead, .owner: ksdioirqd/mmc0/73, .owner_cpu: 0
[   65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c01b46b0>] (do_raw_spin_lock+0x100/0x144)
[   65.470000] [<c01b46b0>] (do_raw_spin_lock+0x100/0x144) from [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c)
[   65.470000] [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[   65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[   65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[   65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[   65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)

Reported-by: Attila Kinali <attila@kinali.ch>
Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
fc108d2
@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012
Eric Dumazet l2tp: fix a lockdep splat
Fixes following lockdep splat :

[ 1614.734896] =============================================
[ 1614.734898] [ INFO: possible recursive locking detected ]
[ 1614.734901] 3.6.0-rc3+ #782 Not tainted
[ 1614.734903] ---------------------------------------------
[ 1614.734905] swapper/11/0 is trying to acquire lock:
[ 1614.734907]  (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.734920]
[ 1614.734920] but task is already holding lock:
[ 1614.734922]  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734932]
[ 1614.734932] other info that might help us debug this:
[ 1614.734935]  Possible unsafe locking scenario:
[ 1614.734935]
[ 1614.734937]        CPU0
[ 1614.734938]        ----
[ 1614.734940]   lock(slock-AF_INET);
[ 1614.734943]   lock(slock-AF_INET);
[ 1614.734946]
[ 1614.734946]  *** DEADLOCK ***
[ 1614.734946]
[ 1614.734949]  May be due to missing lock nesting notation
[ 1614.734949]
[ 1614.734952] 7 locks held by swapper/11/0:
[ 1614.734954]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00
[ 1614.734964]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0
[ 1614.734972]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230
[ 1614.734982]  #3:  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734989]  #4:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680
[ 1614.734997]  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890
[ 1614.735004]  #6:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00
[ 1614.735012]
[ 1614.735012] stack backtrace:
[ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ #782
[ 1614.735018] Call Trace:
[ 1614.735020]  <IRQ>  [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10
[ 1614.735033]  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
[ 1614.735037]  [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130
[ 1614.735042]  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
[ 1614.735047]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735051]  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
[ 1614.735060]  [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50
[ 1614.735065]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735069]  [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735075]  [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
[ 1614.735079]  [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70
[ 1614.735083]  [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70
[ 1614.735087]  [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00
[ 1614.735093]  [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290
[ 1614.735098]  [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00
[ 1614.735102]  [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70
[ 1614.735106]  [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0
[ 1614.735111]  [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280
[ 1614.735117]  [<ffffffff8160a7e2>] arp_xmit+0x22/0x60
[ 1614.735121]  [<ffffffff8160a863>] arp_send+0x43/0x50
[ 1614.735125]  [<ffffffff8160b82f>] arp_solicit+0x18f/0x450
[ 1614.735132]  [<ffffffff8159d9da>] neigh_probe+0x4a/0x70
[ 1614.735137]  [<ffffffff815a191a>] __neigh_event_send+0xea/0x300
[ 1614.735141]  [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260
[ 1614.735146]  [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890
[ 1614.735150]  [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890
[ 1614.735154]  [<ffffffff815dae79>] ip_output+0x59/0xf0
[ 1614.735158]  [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0
[ 1614.735162]  [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680
[ 1614.735165]  [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0
[ 1614.735172]  [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60
[ 1614.735177]  [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620
[ 1614.735181]  [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960
[ 1614.735185]  [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0
[ 1614.735189]  [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0
[ 1614.735194]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735199]  [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230
[ 1614.735203]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735208]  [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0
[ 1614.735213]  [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0
[ 1614.735217]  [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0
[ 1614.735221]  [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0
[ 1614.735225]  [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90
[ 1614.735229]  [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730
[ 1614.735233]  [<ffffffff815d425d>] ip_rcv+0x21d/0x300
[ 1614.735237]  [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00
[ 1614.735241]  [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00
[ 1614.735245]  [<ffffffff81593368>] process_backlog+0xb8/0x180
[ 1614.735249]  [<ffffffff81593cf9>] net_rx_action+0x159/0x330
[ 1614.735257]  [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0
[ 1614.735264]  [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30
[ 1614.735270]  [<ffffffff8175419c>] call_softirq+0x1c/0x30
[ 1614.735278]  [<ffffffff8100425d>] do_softirq+0x8d/0xc0
[ 1614.735282]  [<ffffffff8104983e>] irq_exit+0xae/0xe0
[ 1614.735287]  [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99
[ 1614.735291]  [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80
[ 1614.735293]  <EOI>  [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10
[ 1614.735306]  [<ffffffff81336f85>] ? intel_idle+0xf5/0x150
[ 1614.735310]  [<ffffffff81336f7e>] ? intel_idle+0xee/0x150
[ 1614.735317]  [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20
[ 1614.735321]  [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630
[ 1614.735327]  [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0
[ 1614.735333]  [<ffffffff8173762e>] start_secondary+0x220/0x222

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
37159ef
@dudducat dudducat referenced this issue Oct 20, 2012
Closed

usb issue #136

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 26, 2012
Eric Dumazet bonding: set qdisc_tx_busylock to avoid LOCKDEP splat
If a qdisc is installed on a bonding device, its possible to get
following lockdep splat under stress :

 =============================================
 [ INFO: possible recursive locking detected ]
 3.6.0+ #211 Not tainted
 ---------------------------------------------
 ping/4876 is trying to acquire lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 but task is already holding lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by ping/4876:
  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e5030>] raw_sendmsg+0x600/0xc30
  #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815ba4bd>] ip_finish_output+0x12d/0x870
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830
  #3:  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  #4:  (&bond->lock){++.?..}, at: [<ffffffffa02128c1>] bond_start_xmit+0x31/0x4b0 [bonding]
  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830

 stack backtrace:
 Pid: 4876, comm: ping Not tainted 3.6.0+ #211
 Call Trace:
  [<ffffffff810a0145>] __lock_acquire+0x715/0x1b80
  [<ffffffff810a256b>] ? mark_held_locks+0x9b/0x100
  [<ffffffff810a1bf2>] lock_acquire+0x92/0x1d0
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff81726b7c>] _raw_spin_lock+0x3c/0x50
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff8106264d>] ? rcu_read_lock_bh_held+0x5d/0x90
  [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffffa0212a6a>] bond_start_xmit+0x1da/0x4b0 [bonding]
  [<ffffffff815796d0>] dev_hard_start_xmit+0x240/0x6b0
  [<ffffffff81597c6e>] sch_direct_xmit+0xfe/0x2a0
  [<ffffffff8157a249>] dev_queue_xmit+0x199/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffff815ba96f>] ip_finish_output+0x5df/0x870
  [<ffffffff815ba4bd>] ? ip_finish_output+0x12d/0x870
  [<ffffffff815bb964>] ip_output+0x54/0xf0
  [<ffffffff815bad48>] ip_local_out+0x28/0x90
  [<ffffffff815bc444>] ip_send_skb+0x14/0x50
  [<ffffffff815bc4b2>] ip_push_pending_frames+0x32/0x40
  [<ffffffff815e536a>] raw_sendmsg+0x93a/0xc30
  [<ffffffff8128d570>] ? selinux_file_send_sigiotask+0x1f0/0x1f0
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6855>] inet_sendmsg+0x125/0x240
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8155cddb>] sock_sendmsg+0xab/0xe0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff8155d18c>] __sys_sendmsg+0x37c/0x390
  [<ffffffff81195b2a>] ? fsnotify+0x2ca/0x7e0
  [<ffffffff811958e8>] ? fsnotify+0x88/0x7e0
  [<ffffffff81361f36>] ? put_ldisc+0x56/0xd0
  [<ffffffff8116f98a>] ? fget_light+0x3da/0x510
  [<ffffffff8155f6c4>] sys_sendmsg+0x44/0x80
  [<ffffffff8172fc22>] system_call_fastpath+0x16/0x1b

Avoid this problem using a distinct lock_class_key for bonding
devices.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
49ee492
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 26, 2012
@dmonakhov dmonakhov ext4: fix ext4_flush_completed_IO wait semantics
BUG #1) All places where we call ext4_flush_completed_IO are broken
    because buffered io and DIO/AIO goes through three stages
    1) submitted io,
    2) completed io (in i_completed_io_list) conversion pended
    3) finished  io (conversion done)
    And by calling ext4_flush_completed_IO we will flush only
    requests which were in (2) stage, which is wrong because:
     1) punch_hole and truncate _must_ wait for all outstanding unwritten io
      regardless to it's state.
     2) fsync and nolock_dio_read should also wait because there is
        a time window between end_page_writeback() and ext4_add_complete_io()
        As result integrity fsync is broken in case of buffered write
        to fallocated region:
        fsync                                      blkdev_completion
	 ->filemap_write_and_wait_range
                                                   ->ext4_end_bio
                                                     ->end_page_writeback
          <-- filemap_write_and_wait_range return
	 ->ext4_flush_completed_IO
   	 sees empty i_completed_io_list but pended
   	 conversion still exist
                                                     ->ext4_add_complete_io

BUG #2) Race window becomes wider due to the 'ext4: completed_io
locking cleanup V4' patch series

This patch make following changes:
1) ext4_flush_completed_io() now first try to flush completed io and when
   wait for any outstanding unwritten io via ext4_unwritten_wait()
2) Rename function to more appropriate name.
3) Assert that all callers of ext4_flush_unwritten_io should hold i_mutex to
   prevent endless wait

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
c278531
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 26, 2012
Jiri Kosina mm, slab: release slab_mutex earlier in kmem_cache_destroy()
Commit 1331e7a ("rcu: Remove _rcu_barrier() dependency on
__stop_machine()") introduced slab_mutex -> cpu_hotplug.lock dependency
through kmem_cache_destroy() -> rcu_barrier() -> _rcu_barrier() ->
get_online_cpus().

Lockdep thinks that this might actually result in ABBA deadlock,
and reports it as below:

=== [ cut here ] ===
 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.6.0-rc5-00004-g0d8ee37 #143 Not tainted
 -------------------------------------------------------
 kworker/u:2/40 is trying to acquire lock:
  (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff810f2126>] _rcu_barrier+0x26/0x1e0

 but task is already holding lock:
  (slab_mutex){+.+.+.}, at: [<ffffffff81176e15>] kmem_cache_destroy+0x45/0xe0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (slab_mutex){+.+.+.}:
        [<ffffffff810ae1e2>] validate_chain+0x632/0x720
        [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
        [<ffffffff810ae921>] lock_acquire+0x121/0x190
        [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
        [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
        [<ffffffff81558cb5>] cpuup_callback+0x2f/0xbe
        [<ffffffff81564b83>] notifier_call_chain+0x93/0x140
        [<ffffffff81076f89>] __raw_notifier_call_chain+0x9/0x10
        [<ffffffff8155719d>] _cpu_up+0xba/0x14e
        [<ffffffff815572ed>] cpu_up+0xbc/0x117
        [<ffffffff81ae05e3>] smp_init+0x6b/0x9f
        [<ffffffff81ac47d6>] kernel_init+0x147/0x1dc
        [<ffffffff8156ab44>] kernel_thread_helper+0x4/0x10

 -> #1 (cpu_hotplug.lock){+.+.+.}:
        [<ffffffff810ae1e2>] validate_chain+0x632/0x720
        [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
        [<ffffffff810ae921>] lock_acquire+0x121/0x190
        [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
        [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
        [<ffffffff81049197>] get_online_cpus+0x37/0x50
        [<ffffffff810f21bb>] _rcu_barrier+0xbb/0x1e0
        [<ffffffff810f22f0>] rcu_barrier_sched+0x10/0x20
        [<ffffffff810f2309>] rcu_barrier+0x9/0x10
        [<ffffffff8118c129>] deactivate_locked_super+0x49/0x90
        [<ffffffff8118cc01>] deactivate_super+0x61/0x70
        [<ffffffff811aaaa7>] mntput_no_expire+0x127/0x180
        [<ffffffff811ab49e>] sys_umount+0x6e/0xd0
        [<ffffffff81569979>] system_call_fastpath+0x16/0x1b

 -> #0 (rcu_sched_state.barrier_mutex){+.+...}:
        [<ffffffff810adb4e>] check_prev_add+0x3de/0x440
        [<ffffffff810ae1e2>] validate_chain+0x632/0x720
        [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
        [<ffffffff810ae921>] lock_acquire+0x121/0x190
        [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
        [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
        [<ffffffff810f2126>] _rcu_barrier+0x26/0x1e0
        [<ffffffff810f22f0>] rcu_barrier_sched+0x10/0x20
        [<ffffffff810f2309>] rcu_barrier+0x9/0x10
        [<ffffffff81176ea1>] kmem_cache_destroy+0xd1/0xe0
        [<ffffffffa04c3154>] nf_conntrack_cleanup_net+0xe4/0x110 [nf_conntrack]
        [<ffffffffa04c31aa>] nf_conntrack_cleanup+0x2a/0x70 [nf_conntrack]
        [<ffffffffa04c42ce>] nf_conntrack_net_exit+0x5e/0x80 [nf_conntrack]
        [<ffffffff81454b79>] ops_exit_list+0x39/0x60
        [<ffffffff814551ab>] cleanup_net+0xfb/0x1b0
        [<ffffffff8106917b>] process_one_work+0x26b/0x4c0
        [<ffffffff81069f3e>] worker_thread+0x12e/0x320
        [<ffffffff8106f73e>] kthread+0x9e/0xb0
        [<ffffffff8156ab44>] kernel_thread_helper+0x4/0x10

 other info that might help us debug this:

 Chain exists of:
   rcu_sched_state.barrier_mutex --> cpu_hotplug.lock --> slab_mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(slab_mutex);
                                lock(cpu_hotplug.lock);
                                lock(slab_mutex);
   lock(rcu_sched_state.barrier_mutex);

  *** DEADLOCK ***
=== [ cut here ] ===

This is actually a false positive. Lockdep has no way of knowing the fact
that the ABBA can actually never happen, because of special semantics of
cpu_hotplug.refcount and its handling in cpu_hotplug_begin(); the mutual
exclusion there is not achieved through mutex, but through
cpu_hotplug.refcount.

The "neither cpu_up() nor cpu_down() will proceed past cpu_hotplug_begin()
until everyone who called get_online_cpus() will call put_online_cpus()"
semantics is totally invisible to lockdep.

This patch therefore moves the unlock of slab_mutex so that rcu_barrier()
is being called with it unlocked. It has two advantages:

- it slightly reduces hold time of slab_mutex; as it's used to protect
  the cachep list, it's not necessary to hold it over kmem_cache_free()
  call any more
- it silences the lockdep false positive warning, as it avoids lockdep ever
  learning about slab_mutex -> cpu_hotplug.lock dependency

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
210ed9d
@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 26, 2012
@torvalds torvalds Merge tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/gi…
…t/tiwai/sound

Pull sound updates #2 from Takashi Iwai:
 "This update contains a few cleanup works, regression/stable fixes
  gathered since the last pull request.

   - Clean up with generic hd-audio jack handling code by David
     Henningsson
   - A few regression fixes for standardized HD-audio auto-parser
   - Misc clean-up and small fixes"

* tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - do not detect jack on internal speakers for Realtek
  ALSA: hda - Fix missing beep on ASUS X43U notebook
  ALSA: hda - Remove AZX_DCAPS_POSFIX_COMBO
  ALSA: hda - Warn an allocation for an uninitialized array
  ALSA: hda/cirrus - Add missing init/free of hda_gen_spec
  ALSA: hda - Fix memory leaks at error path in patch_cirrus.c
  ALSA: hda - Add missing hda_gen_spec to struct via_spec
  ALSA: hda - remove "Mic Jack Mode" for headset jacks (Latitude Exx30)
  ALSA: hda - make Cirrus codec use generic unsol event handler
  ALSA: hda - make VIA codec use generic unsol event handler
  ALSA: hda - Remove dead GPIO code for VIA codec
  ALSA: usb-audio: Add TASCAM US122 MKII playback
2fc07ef
@popcornmix popcornmix pushed a commit that referenced this issue Dec 5, 2012
Saurabh Mohan ipv4/ip_vti.c: VTI fix post-decryption forwarding
[ Upstream commit b294200 ]

With the latest kernel there are two things that must be done post decryption
 so that the packet are forwarded.
 1. Remove the mark from the packet. This will cause the packet to not match
 the ipsec-policy again. However doing this causes the post-decryption check to
 fail also and the packet will get dropped. (cat /proc/net/xfrm_stat).
 2. Remove the sp association in the skbuff so that no policy check is done on
 the packet for VTI tunnels.

Due to #2 above we must now do a security-policy check in the vti rcv path
prior to resetting the mark in the skbuff.

Signed-off-by: Saurabh Mohan <saurabh.mohan@vyatta.com>
Reported-by: Ruben Herold <ruben@puettmann.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
b31143d
@popcornmix popcornmix pushed a commit that referenced this issue Dec 5, 2012
Szymon Janc NFC: Use dynamic initialization for rwlocks
commit fe235b5 upstream.

If rwlock is dynamically allocated but statically initialized it is
missing proper lockdep annotation.

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 3352, comm: neard Not tainted 3.5.0-999-nfc+ #2
Call Trace:
[<ffffffff810c8526>] __lock_acquire+0x8f6/0x1bf0
[<ffffffff81739045>] ? printk+0x4d/0x4f
[<ffffffff810c9eed>] lock_acquire+0x9d/0x220
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81746724>] _raw_read_lock+0x44/0x60
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81702bfe>] nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff817034a7>] nfc_llcp_get_sdp_ssap+0xa7/0x1b0
[<ffffffff81706353>] llcp_sock_bind+0x173/0x210
[<ffffffff815d9c94>] sys_bind+0xe4/0x100
[<ffffffff8139209e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8174ea69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc0e742
@popcornmix popcornmix pushed a commit that referenced this issue Dec 5, 2012
Petr Matousek KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-…
…2012-4461)

commit 6d1068b upstream.

On hosts without the XSAVE support unprivileged local user can trigger
oops similar to the one below by setting X86_CR4_OSXSAVE bit in guest
cr4 register using KVM_SET_SREGS ioctl and later issuing KVM_RUN
ioctl.

invalid opcode: 0000 [#2] SMP
Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables
...
Pid: 24935, comm: zoog_kvm_monito Tainted: G      D      3.2.0-3-686-pae
EIP: 0060:[<f8b9550c>] EFLAGS: 00210246 CPU: 0
EIP is at kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm]
EAX: 00000001 EBX: 000f387e ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: ef5a0060 ESP: d7c63e70
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process zoog_kvm_monito (pid: 24935, ti=d7c62000 task=ed84a0c0
task.ti=d7c62000)
Stack:
 00000001 f70a1200 f8b940a9 ef5a0060 00000000 00200202 f8769009 00000000
 ef5a0060 000f387e eda5c020 8722f9c8 00015bae 00000000 ed84a0c0 ed84a0c0
 c12bf02d 0000ae80 ef7f8740 fffffffb f359b740 ef5a0060 f8b85dc1 0000ae80
Call Trace:
 [<f8b940a9>] ? kvm_arch_vcpu_ioctl_set_sregs+0x2fe/0x308 [kvm]
...
 [<c12bfb44>] ? syscall_call+0x7/0xb
Code: 89 e8 e8 14 ee ff ff ba 00 00 04 00 89 e8 e8 98 48 ff ff 85 c0 74
1e 83 7d 48 00 75 18 8b 85 08 07 00 00 31 c9 8b 95 0c 07 00 00 <0f> 01
d1 c7 45 48 01 00 00 00 c7 45 1c 01 00 00 00 0f ae f0 89
EIP: [<f8b9550c>] kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm] SS:ESP
0068:d7c63e70

QEMU first retrieves the supported features via KVM_GET_SUPPORTED_CPUID
and then sets them later. So guest's X86_FEATURE_XSAVE should be masked
out on hosts without X86_FEATURE_XSAVE, making kvm_set_cr4 with
X86_CR4_OSXSAVE fail. Userspaces that allow specifying guest cpuid with
X86_FEATURE_XSAVE even on hosts that do not support it, might be
susceptible to this attack from inside the guest as well.

Allow setting X86_CR4_OSXSAVE bit only if host has XSAVE support.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
e467125
@popcornmix popcornmix added a commit that referenced this issue Mar 31, 2016
@T-X T-X net: fix bridge multicast packet checksum validation
We need to update the skb->csum after pulling the skb, otherwise
an unnecessary checksum (re)computation can ocure for IGMP/MLD packets
in the bridge code. Additionally this fixes the following splats for
network devices / bridge ports with support for and enabled RX checksum
offloading:

[...]
[   43.986968] eth0: hw csum failure
[   43.990344] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.4.0 #2
[   43.996193] Hardware name: BCM2709
[   43.999647] [<800204e0>] (unwind_backtrace) from [<8001cf14>] (show_stack+0x10/0x14)
[   44.007432] [<8001cf14>] (show_stack) from [<801ab614>] (dump_stack+0x80/0x90)
[   44.014695] [<801ab614>] (dump_stack) from [<802e4548>] (__skb_checksum_complete+0x6c/0xac)
[   44.023090] [<802e4548>] (__skb_checksum_complete) from [<803a055c>] (ipv6_mc_validate_checksum+0x104/0x178)
[   44.032959] [<803a055c>] (ipv6_mc_validate_checksum) from [<802e111c>] (skb_checksum_trimmed+0x130/0x188)
[   44.042565] [<802e111c>] (skb_checksum_trimmed) from [<803a06e8>] (ipv6_mc_check_mld+0x118/0x338)
[   44.051501] [<803a06e8>] (ipv6_mc_check_mld) from [<803b2c98>] (br_multicast_rcv+0x5dc/0xd00)
[   44.060077] [<803b2c98>] (br_multicast_rcv) from [<803aa510>] (br_handle_frame_finish+0xac/0x51c)
[...]

Fixes: 9afd85c ("net: Export IGMP/MLD message validation code")
Reported-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
966e306
@popcornmix popcornmix pushed a commit that referenced this issue Apr 6, 2016
Yang Shi arm64: replace read_lock to rcu lock in call_step_hook
[ Upstream commit cf0a254 ]

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8

CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[<ffff8000000885e8>] dump_backtrace+0x0/0x128
[<ffff800000088734>] show_stack+0x24/0x30
[<ffff80000079a7c4>] dump_stack+0x80/0xa0
[<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0
[<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40
[<ffff8000007a2268>] rt_read_lock+0x40/0x58
[<ffff800000085328>] single_step_handler+0x38/0xd8
[<ffff800000082368>] do_debug_exception+0x58/0xb8
Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
[<ffff8000000833f4>] el1_dbg+0x18/0x6c

This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in
call_break_hook"), but comes to single_step_handler.

This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
143cf26
@popcornmix popcornmix pushed a commit that referenced this issue Apr 11, 2016
@bvanassche bvanassche scsi_dh_alua: Fix a recently introduced deadlock
While retesting the SRP initiator I ran the command "rmmod mlx4_ib"
while I/O was in progress. That command triggers SCSI device removal
indirectly. Avoid that this action triggers the following deadlock:

=================================
[ INFO: inconsistent lock state ]
4.6.0-rc0-dbg+ #2 Tainted: G           O
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
multipathd/484 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&pg->lock)->rlock){+.?...}, at: [<ffffffffa04f50a2>] alua_bus_detach+0x52/0xa0 [scsi_dh_alua]
{IN-SOFTIRQ-W} state was registered at:
  [<ffffffff810a64a9>] __lock_acquire+0x7e9/0x1ad0
  [<ffffffff810a7fd0>] lock_acquire+0x60/0x80
  [<ffffffff8159910e>] _raw_spin_lock_irqsave+0x3e/0x60
  [<ffffffffa04f5131>] alua_rtpg_queue+0x41/0x1d0 [scsi_dh_alua]
  [<ffffffffa04f5531>] alua_check+0xe1/0x220 [scsi_dh_alua]
  [<ffffffffa04f5709>] alua_check_sense+0x99/0xb0 [scsi_dh_alua]
  [<ffffffff813f0d01>] scsi_check_sense+0x71/0x3f0
  [<ffffffff813f2f8b>] scsi_decide_disposition+0x18b/0x1d0
  [<ffffffff813f6e52>] scsi_softirq_done+0x52/0x140
  [<ffffffff812a26f2>] blk_done_softirq+0x52/0x90
  [<ffffffff8105bc1f>] __do_softirq+0x10f/0x230
  [<ffffffff8105bec8>] irq_exit+0xa8/0xb0
  [<ffffffff8101a675>] do_IRQ+0x65/0x110
  [<ffffffff8159a2c9>] ret_from_intr+0x0/0x19
  [<ffffffff811732f1>] kmem_cache_alloc+0x151/0x190
  [<ffffffff8118e534>] create_object+0x34/0x2d0
  [<ffffffff8158eaa6>] kmemleak_alloc_percpu+0x56/0xd0
  [<ffffffff8113ab0d>] pcpu_alloc+0x38d/0x660
  [<ffffffff8113aded>] __alloc_percpu_gfp+0xd/0x10
  [<ffffffff812e56a5>] __percpu_counter_init+0x55/0xb0
  [<ffffffff812b4989>] blkg_alloc+0x79/0x230
  [<ffffffff812b6756>] blkcg_init_queue+0x26/0x1d0
  [<ffffffff81297eed>] blk_alloc_queue_node+0x27d/0x2e0
  [<ffffffffa017766c>] dm_create+0x20c/0x570 [dm_mod]
  [<ffffffffa017e356>] dev_create+0x56/0x2c0 [dm_mod]
  [<ffffffffa017dcae>] ctl_ioctl+0x26e/0x520 [dm_mod]
  [<ffffffffa017df6e>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
  [<ffffffff811aa8ee>] do_vfs_ioctl+0x8e/0x660
  [<ffffffff811aaefc>] SyS_ioctl+0x3c/0x70
  [<ffffffff81599929>] entry_SYSCALL_64_fastpath+0x1c/0xac
irq event stamp: 4290931
hardirqs last  enabled at (4290931): [ 1662.892772]
[<ffffffff81599341>] _raw_spin_unlock_irqrestore+0x31/0x50
hardirqs last disabled at (4290930): [<ffffffff815990e7>] _raw_spin_lock_irqsave+0x17/0x60
softirqs last  enabled at (4290774): [<ffffffff8105bcdb>] __do_softirq+0x1cb/0x230
softirqs last disabled at (4289831): [<ffffffff8105bec8>] irq_exit+0xa8/0xb0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&pg->lock)->rlock);
  <Interrupt>
    lock(&(&pg->lock)->rlock);

 *** DEADLOCK ***

2 locks held by multipathd/484:
 #0:  (&bdev->bd_mutex){+.+.+.}, at: [<ffffffff811d1cc3>] __blkdev_put+0x33/0x360
 #1:  (sd_ref_mutex){+.+...}, at: [<ffffffff81400afc>] scsi_disk_put+0x1c/0x40

stack backtrace:
CPU: 6 PID: 484 Comm: multipathd Tainted: G           O    4.6.0-rc0-dbg+ #2
Call Trace:
 [<ffffffff812bd115>] dump_stack+0x67/0x92
 [<ffffffff810a5175>] print_usage_bug+0x215/0x240
 [<ffffffff810a56ea>] mark_lock+0x54a/0x610
 [<ffffffff810a6505>] __lock_acquire+0x845/0x1ad0
 [<ffffffff810a7fd0>] lock_acquire+0x60/0x80
 [<ffffffff81598f23>] _raw_spin_lock+0x33/0x50
 [<ffffffffa04f50a2>] alua_bus_detach+0x52/0xa0 [scsi_dh_alua]
 [<ffffffff813ff6f7>] scsi_dh_release_device+0x17/0x50
 [<ffffffff813fb8da>] scsi_device_dev_release_usercontext+0x2a/0x120
 [<ffffffff810701f0>] execute_in_process_context+0x80/0x90
 [<ffffffff813fb8a7>] scsi_device_dev_release+0x17/0x20
 [<ffffffff813c8cfd>] device_release+0x2d/0x90
 [<ffffffff812bfa8a>] kobject_release+0x7a/0x190
 [<ffffffff812bf946>] kobject_put+0x26/0x50
 [<ffffffff813c8ee2>] put_device+0x12/0x20
 [<ffffffff813edc86>] scsi_device_put+0x26/0x30
 [<ffffffff81400b0d>] scsi_disk_put+0x2d/0x40
 [<ffffffff81400b68>] sd_release+0x48/0xb0
 [<ffffffff811d1f2e>] __blkdev_put+0x29e/0x360
 [<ffffffff811d24b9>] blkdev_put+0x49/0x170
 [<ffffffff811d2600>] blkdev_close+0x20/0x30
 [<ffffffff81198f48>] __fput+0xe8/0x1f0
 [<ffffffff81199089>] ____fput+0x9/0x10
 [<ffffffff81075d9e>] task_work_run+0x6e/0xa0
 [<ffffffff81001119>] exit_to_usermode_loop+0xa9/0xb0
 [<ffffffff81001590>] syscall_return_slowpath+0xb0/0xc0
 [<ffffffff815999b7>] entry_SYSCALL_64_fastpath+0xaa/0xac

Fixes: cb0a168 (scsi_dh_alua: update 'access_state' field)
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
38c3159
@giraldeau giraldeau pushed a commit to giraldeau/linux that referenced this issue Apr 12, 2016
Yang Shi arm64: replace read_lock to rcu lock in call_step_hook
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8

CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[<ffff8000000885e8>] dump_backtrace+0x0/0x128
[<ffff800000088734>] show_stack+0x24/0x30
[<ffff80000079a7c4>] dump_stack+0x80/0xa0
[<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0
[<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40
[<ffff8000007a2268>] rt_read_lock+0x40/0x58
[<ffff800000085328>] single_step_handler+0x38/0xd8
[<ffff800000082368>] do_debug_exception+0x58/0xb8
Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
[<ffff8000000833f4>] el1_dbg+0x18/0x6c

This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in
call_break_hook"), but comes to single_step_handler.

This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.

Cc: stable-rt@vger.kernel.org
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
2fcb3f1
@giraldeau giraldeau pushed a commit to giraldeau/linux that referenced this issue Apr 12, 2016
@paulgortmaker paulgortmaker list_bl: Make list head locking RT safe
As per changes in include/linux/jbd_common.h for avoiding the
bit_spin_locks on RT ("fs: jbd/jbd2: Make state lock and journal
head lock rt safe") we do the same thing here.

We use the non atomic __set_bit and __clear_bit inside the scope of
the lock to preserve the ability of the existing LIST_DEBUG code to
use the zero'th bit in the sanity checks.

As a bit spinlock, we had no lockdep visibility into the usage
of the list head locking.  Now, if we were to implement it as a
standard non-raw spinlock, we would see:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:658
in_atomic(): 1, irqs_disabled(): 0, pid: 122, name: udevd
5 locks held by udevd/122:
 #0:  (&sb->s_type->i_mutex_key#7/1){+.+.+.}, at: [<ffffffff811967e8>] lock_rename+0xe8/0xf0
 #1:  (rename_lock){+.+...}, at: [<ffffffff811a277c>] d_move+0x2c/0x60
 #2:  (&dentry->d_lock){+.+...}, at: [<ffffffff811a0763>] dentry_lock_for_move+0xf3/0x130
 #3:  (&dentry->d_lock/2){+.+...}, at: [<ffffffff811a0734>] dentry_lock_for_move+0xc4/0x130
 #4:  (&dentry->d_lock/3){+.+...}, at: [<ffffffff811a0747>] dentry_lock_for_move+0xd7/0x130
Pid: 122, comm: udevd Not tainted 3.4.47-rt62 #7
Call Trace:
 [<ffffffff810b9624>] __might_sleep+0x134/0x1f0
 [<ffffffff817a24d4>] rt_spin_lock+0x24/0x60
 [<ffffffff811a0c4c>] __d_shrink+0x5c/0xa0
 [<ffffffff811a1b2d>] __d_drop+0x1d/0x40
 [<ffffffff811a24be>] __d_move+0x8e/0x320
 [<ffffffff811a278e>] d_move+0x3e/0x60
 [<ffffffff81199598>] vfs_rename+0x198/0x4c0
 [<ffffffff8119b093>] sys_renameat+0x213/0x240
 [<ffffffff817a2de5>] ? _raw_spin_unlock+0x35/0x60
 [<ffffffff8107781c>] ? do_page_fault+0x1ec/0x4b0
 [<ffffffff817a32ca>] ? retint_swapgs+0xe/0x13
 [<ffffffff813eb0e6>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8119b0db>] sys_rename+0x1b/0x20
 [<ffffffff817a3b96>] system_call_fastpath+0x1a/0x1f

Since we are only taking the lock during short lived list operations,
lets assume for now that it being raw won't be a significant latency
concern.


Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
aa8f4fc
@giraldeau giraldeau pushed a commit to giraldeau/linux that referenced this issue Apr 12, 2016
Yang Shi mm/memcontrol: Don't call schedule_work_on in preemption disabled con…
…text

The following trace is triggered when running ltp oom test cases:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 17188, name: oom03
Preemption disabled at:[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0

CPU: 2 PID: 17188 Comm: oom03 Not tainted 3.10.10-rt3 #2
Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010
ffff88007684d730 ffff880070df9b58 ffffffff8169918d ffff880070df9b70
ffffffff8106db31 ffff88007688b4a0 ffff880070df9b88 ffffffff8169d9c0
ffff88007688b4a0 ffff880070df9bc8 ffffffff81059da1 0000000170df9bb0
Call Trace:
[<ffffffff8169918d>] dump_stack+0x19/0x1b
[<ffffffff8106db31>] __might_sleep+0xf1/0x170
[<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50
[<ffffffff81059da1>] queue_work_on+0x61/0x100
[<ffffffff8112b361>] drain_all_stock+0xe1/0x1c0
[<ffffffff8112ba70>] mem_cgroup_reclaim+0x90/0xe0
[<ffffffff8112beda>] __mem_cgroup_try_charge+0x41a/0xc40
[<ffffffff810f1c91>] ? release_pages+0x1b1/0x1f0
[<ffffffff8106f200>] ? sched_exec+0x40/0xb0
[<ffffffff8112cc87>] mem_cgroup_charge_common+0x37/0x70
[<ffffffff8112e2c6>] mem_cgroup_newpage_charge+0x26/0x30
[<ffffffff8110af68>] handle_pte_fault+0x618/0x840
[<ffffffff8103ecf6>] ? unpin_current_cpu+0x16/0x70
[<ffffffff81070f94>] ? migrate_enable+0xd4/0x200
[<ffffffff8110cde5>] handle_mm_fault+0x145/0x1e0
[<ffffffff810301e1>] __do_page_fault+0x1a1/0x4c0
[<ffffffff8169c9eb>] ? preempt_schedule_irq+0x4b/0x70
[<ffffffff8169e3b7>] ? retint_kernel+0x37/0x40
[<ffffffff8103053e>] do_page_fault+0xe/0x10
[<ffffffff8169e4c2>] page_fault+0x22/0x30

So, to prevent schedule_work_on from being called in preempt disabled context,
replace the pair of get/put_cpu() to get/put_cpu_light().


Signed-off-by: Yang Shi <yang.shi@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
49f2b68
@giraldeau giraldeau pushed a commit to giraldeau/linux that referenced this issue Apr 12, 2016
Yang Shi hrtimer: Move schedule_work call to helper thread
When run ltp leapsec_timer test, the following call trace is caught:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
Preemption disabled at:[<ffffffff810857f3>] cpu_startup_entry+0x133/0x310

CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.10-rt3 #2
Hardware name: Intel Corporation Calpella platform/MATXM-CORE-411-B, BIOS 4.6.3 08/18/2010
ffffffff81c2f800 ffff880076843e40 ffffffff8169918d ffff880076843e58
ffffffff8106db31 ffff88007684b4a0 ffff880076843e70 ffffffff8169d9c0
ffff88007684b4a0 ffff880076843eb0 ffffffff81059da1 0000001876851200
Call Trace:
<IRQ>  [<ffffffff8169918d>] dump_stack+0x19/0x1b
[<ffffffff8106db31>] __might_sleep+0xf1/0x170
[<ffffffff8169d9c0>] rt_spin_lock+0x20/0x50
[<ffffffff81059da1>] queue_work_on+0x61/0x100
[<ffffffff81065aa1>] clock_was_set_delayed+0x21/0x30
[<ffffffff810883be>] do_timer+0x40e/0x660
[<ffffffff8108f487>] tick_do_update_jiffies64+0xf7/0x140
[<ffffffff8108fe42>] tick_check_idle+0x92/0xc0
[<ffffffff81044327>] irq_enter+0x57/0x70
[<ffffffff816a040e>] smp_apic_timer_interrupt+0x3e/0x9b
[<ffffffff8169f80a>] apic_timer_interrupt+0x6a/0x70
<EOI>  [<ffffffff8155ea1c>] ? cpuidle_enter_state+0x4c/0xc0
[<ffffffff8155eb68>] cpuidle_idle_call+0xd8/0x2d0
[<ffffffff8100b59e>] arch_cpu_idle+0xe/0x30
[<ffffffff8108585e>] cpu_startup_entry+0x19e/0x310
[<ffffffff8168efa2>] start_secondary+0x1ad/0x1b0

The clock_was_set_delayed is called in hard IRQ handler (timer interrupt), which
calls schedule_work.

Under PREEMPT_RT_FULL, schedule_work calls spinlocks which could sleep, so it's
not safe to call schedule_work in interrupt context.

Reference upstream commit b68d61c705ef02384c0538b8d9374545097899ca
(rt,ntp: Move call to schedule_delayed_work() to helper thread)
from git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git, which
makes a similar change.

add a helper thread which does the call to schedule_work and wake up that
thread instead of calling schedule_work directly.


Signed-off-by: Yang Shi <yang.shi@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
4503283
@giraldeau giraldeau pushed a commit to giraldeau/linux that referenced this issue Apr 12, 2016
Sebastian Andrzej Siewior block: blk-mq: Use swait
| BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:914
| in_atomic(): 1, irqs_disabled(): 0, pid: 255, name: kworker/u257:6
| 5 locks held by kworker/u257:6/255:
|  #0:  ("events_unbound"){.+.+.+}, at: [<ffffffff8108edf1>] process_one_work+0x171/0x5e0
|  #1:  ((&entry->work)){+.+.+.}, at: [<ffffffff8108edf1>] process_one_work+0x171/0x5e0
|  #2:  (&shost->scan_mutex){+.+.+.}, at: [<ffffffffa000faa3>] __scsi_add_device+0xa3/0x130 [scsi_mod]
|  #3:  (&set->tag_list_lock){+.+...}, at: [<ffffffff812f09fa>] blk_mq_init_queue+0x96a/0xa50
|  #4:  (rcu_read_lock_sched){......}, at: [<ffffffff8132887d>] percpu_ref_kill_and_confirm+0x1d/0x120
| Preemption disabled at:[<ffffffff812eff76>] blk_mq_freeze_queue_start+0x56/0x70
|
| CPU: 2 PID: 255 Comm: kworker/u257:6 Not tainted 3.18.7-rt0+ #1
| Workqueue: events_unbound async_run_entry_fn
|  0000000000000003 ffff8800bc29f998 ffffffff815b3a12 0000000000000000
|  0000000000000000 ffff8800bc29f9b8 ffffffff8109aa16 ffff8800bc29fa28
|  ffff8800bc5d1bc8 ffff8800bc29f9e8 ffffffff815b8dd4 ffff880000000000
| Call Trace:
|  [<ffffffff815b3a12>] dump_stack+0x4f/0x7c
|  [<ffffffff8109aa16>] __might_sleep+0x116/0x190
|  [<ffffffff815b8dd4>] rt_spin_lock+0x24/0x60
|  [<ffffffff810b6089>] __wake_up+0x29/0x60
|  [<ffffffff812ee06e>] blk_mq_usage_counter_release+0x1e/0x20
|  [<ffffffff81328966>] percpu_ref_kill_and_confirm+0x106/0x120
|  [<ffffffff812eff76>] blk_mq_freeze_queue_start+0x56/0x70
|  [<ffffffff812f0000>] blk_mq_update_tag_set_depth+0x40/0xd0
|  [<ffffffff812f0a1c>] blk_mq_init_queue+0x98c/0xa50
|  [<ffffffffa000dcf0>] scsi_mq_alloc_queue+0x20/0x60 [scsi_mod]
|  [<ffffffffa000ea35>] scsi_alloc_sdev+0x2f5/0x370 [scsi_mod]
|  [<ffffffffa000f494>] scsi_probe_and_add_lun+0x9e4/0xdd0 [scsi_mod]
|  [<ffffffffa000fb26>] __scsi_add_device+0x126/0x130 [scsi_mod]
|  [<ffffffffa013033f>] ata_scsi_scan_host+0xaf/0x200 [libata]
|  [<ffffffffa012b5b6>] async_port_probe+0x46/0x60 [libata]
|  [<ffffffff810978fb>] async_run_entry_fn+0x3b/0xf0
|  [<ffffffff8108ee81>] process_one_work+0x201/0x5e0

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
8e4fe8d
@giraldeau giraldeau pushed a commit to giraldeau/linux that referenced this issue Apr 12, 2016
@paulgortmaker paulgortmaker sas-ata/isci: dont't disable interrupts in qc_issue handler
On 3.14-rt we see the following trace on Canoe Pass for
SCSI_ISCI "Intel(R) C600 Series Chipset SAS Controller"
when the sas qc_issue handler is run:

 BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:905
 in_atomic(): 0, irqs_disabled(): 1, pid: 432, name: udevd
 CPU: 11 PID: 432 Comm: udevd Not tainted 3.14.28-rt22 #2
 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
 ffff880fab500000 ffff880fa9f239c0 ffffffff81a2d273 0000000000000000
 ffff880fa9f239d8 ffffffff8107f023 ffff880faac23dc0 ffff880fa9f239f0
 ffffffff81a33cc0 ffff880faaeb1400 ffff880fa9f23a40 ffffffff815de891
 Call Trace:
 [<ffffffff81a2d273>] dump_stack+0x4e/0x7a
 [<ffffffff8107f023>] __might_sleep+0xe3/0x160
 [<ffffffff81a33cc0>] rt_spin_lock+0x20/0x50
 [<ffffffff815de891>] isci_task_execute_task+0x171/0x2f0  <-----
 [<ffffffff815cfecb>] sas_ata_qc_issue+0x25b/0x2a0
 [<ffffffff81606363>] ata_qc_issue+0x1f3/0x370
 [<ffffffff8160c600>] ? ata_scsi_invalid_field+0x40/0x40
 [<ffffffff8160c8f5>] ata_scsi_translate+0xa5/0x1b0
 [<ffffffff8160efc6>] ata_sas_queuecmd+0x86/0x280
 [<ffffffff815ce446>] sas_queuecommand+0x196/0x230
 [<ffffffff81081fad>] ? get_parent_ip+0xd/0x50
 [<ffffffff815b05a4>] scsi_dispatch_cmd+0xb4/0x210
 [<ffffffff815b7744>] scsi_request_fn+0x314/0x530

and gdb shows:

(gdb) list * isci_task_execute_task+0x171
0xffffffff815ddfb1 is in isci_task_execute_task (drivers/scsi/isci/task.c:138).
133             dev_dbg(&ihost->pdev->dev, "%s: num=%d\n", __func__, num);
134
135             for_each_sas_task(num, task) {
136                     enum sci_status status = SCI_FAILURE;
137
138                     spin_lock_irqsave(&ihost->scic_lock, flags);    <-----
139                     idev = isci_lookup_device(task->dev);
140                     io_ready = isci_device_io_ready(idev, task);
141                     tag = isci_alloc_tag(ihost);
142                     spin_unlock_irqrestore(&ihost->scic_lock, flags);
(gdb)

In addition to the scic_lock, the function also contains locking of
the task_state_lock -- which is clearly not a candidate for raw lock
conversion.  As can be seen by the comment nearby, we really should
be running the qc_issue code with interrupts enabled anyway.


Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
bb1d5ef
@giraldeau giraldeau pushed a commit to giraldeau/linux that referenced this issue Apr 12, 2016
@umgwanakikbuti umgwanakikbuti hotplug: Use set_cpus_allowed_ptr() in sync_unplug_thread()
do_set_cpus_allowed() is not safe vs ->sched_class change.

crash> bt
PID: 11676  TASK: ffff88026f979da0  CPU: 22  COMMAND: "sync_unplug/22"
 #0 [ffff880274d25bc8] machine_kexec at ffffffff8103b41c
 #1 [ffff880274d25c18] crash_kexec at ffffffff810d881a
 #2 [ffff880274d25cd8] oops_end at ffffffff81525818
 #3 [ffff880274d25cf8] do_invalid_op at ffffffff81003096
 #4 [ffff880274d25d90] invalid_op at ffffffff8152d3de
    [exception RIP: set_cpus_allowed_rt+18]
    RIP: ffffffff8109e012  RSP: ffff880274d25e48  RFLAGS: 00010202
    RAX: ffffffff8109e000  RBX: ffff88026f979da0  RCX: ffff8802770cb6e8
    RDX: 0000000000000000  RSI: ffffffff81add700  RDI: ffff88026f979da0
    RBP: ffff880274d25e78   R8: ffffffff816112e0   R9: 0000000000000001
    R10: 0000000000000001  R11: 0000000000011940  R12: ffff88026f979da0
    R13: ffff8802770cb6d0  R14: ffff880274d25fd8  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #5 [ffff880274d25e60] do_set_cpus_allowed at ffffffff8108e65f
 #6 [ffff880274d25e80] sync_unplug_thread at ffffffff81058c08
 #7 [ffff880274d25ed8] kthread at ffffffff8107cad6
 #8 [ffff880274d25f50] ret_from_fork at ffffffff8152bbbc
crash> task_struct ffff88026f979da0 | grep class
  sched_class = 0xffffffff816111e0 <fair_sched_class+64>,

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
1bf0638
@giraldeau giraldeau added a commit to giraldeau/linux that referenced this issue Apr 12, 2016
@T-X T-X net: fix bridge multicast packet checksum validation
We need to update the skb->csum after pulling the skb, otherwise
an unnecessary checksum (re)computation can ocure for IGMP/MLD packets
in the bridge code. Additionally this fixes the following splats for
network devices / bridge ports with support for and enabled RX checksum
offloading:

[...]
[   43.986968] eth0: hw csum failure
[   43.990344] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.4.0 #2
[   43.996193] Hardware name: BCM2709
[   43.999647] [<800204e0>] (unwind_backtrace) from [<8001cf14>] (show_stack+0x10/0x14)
[   44.007432] [<8001cf14>] (show_stack) from [<801ab614>] (dump_stack+0x80/0x90)
[   44.014695] [<801ab614>] (dump_stack) from [<802e4548>] (__skb_checksum_complete+0x6c/0xac)
[   44.023090] [<802e4548>] (__skb_checksum_complete) from [<803a055c>] (ipv6_mc_validate_checksum+0x104/0x178)
[   44.032959] [<803a055c>] (ipv6_mc_validate_checksum) from [<802e111c>] (skb_checksum_trimmed+0x130/0x188)
[   44.042565] [<802e111c>] (skb_checksum_trimmed) from [<803a06e8>] (ipv6_mc_check_mld+0x118/0x338)
[   44.051501] [<803a06e8>] (ipv6_mc_check_mld) from [<803b2c98>] (br_multicast_rcv+0x5dc/0xd00)
[   44.060077] [<803b2c98>] (br_multicast_rcv) from [<803aa510>] (br_handle_frame_finish+0xac/0x51c)
[...]

Fixes: 9afd85c ("net: Export IGMP/MLD message validation code")
Reported-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
24a02d2
@popcornmix popcornmix pushed a commit that referenced this issue Apr 12, 2016
@snitm snitm dm: fix excessive dm-mq context switching
commit 6acfe68 upstream.

Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower
than if an underlying null_blk device were used directly.  One of the
reasons for this drop in performance is that blk_insert_clone_request()
was calling blk_mq_insert_request() with @async=true.  This forced the
use of kblockd_schedule_delayed_work_on() to run the blk-mq hw queues
which ushered in ping-ponging between process context (fio in this case)
and kblockd's kworker to submit the cloned request.  The ftrace
function_graph tracer showed:

  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...
  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...

Fixing blk_insert_clone_request()'s blk_mq_insert_request() call to
_not_ use kblockd to submit the cloned requests isn't enough to
eliminate the observed context switches.

In addition to this dm-mq specific blk-core fix, there are 2 DM core
fixes to dm-mq that (when paired with the blk-core fix) completely
eliminate the observed context switching:

1)  don't blk_mq_run_hw_queues in blk-mq request completion

    Motivated by desire to reduce overhead of dm-mq, punting to kblockd
    just increases context switches.

    In my testing against a really fast null_blk device there was no benefit
    to running blk_mq_run_hw_queues() on completion (and no other blk-mq
    driver does this).  So hopefully this change doesn't induce the need for
    yet another revert like commit 621739b !

2)  use blk_mq_complete_request() in dm_complete_request()

    blk_complete_request() doesn't offer the traditional q->mq_ops vs
    .request_fn branching pattern that other historic block interfaces
    do (e.g. blk_get_request).  Using blk_mq_complete_request() for
    blk-mq requests is important for performance.  It should be noted
    that, like blk_complete_request(), blk_mq_complete_request() doesn't
    natively handle partial completions -- but the request-based
    DM-multipath target does provide the required partial completion
    support by dm.c:end_clone_bio() triggering requeueing of the request
    via dm-mpath.c:multipath_end_io()'s return of DM_ENDIO_REQUEUE.

dm-mq fix #2 is _much_ more important than #1 for eliminating the
context switches.
Before: cpu          : usr=15.10%, sys=59.39%, ctx=7905181, majf=0, minf=475
After:  cpu          : usr=20.60%, sys=79.35%, ctx=2008, majf=0, minf=472

With these changes multithreaded async read IOPs improved from ~950K
to ~1350K for this dm-mq stacked on null_blk test-case.  The raw read
IOPs of the underlying null_blk device for the same workload is ~1950K.

Fixes: 7fb4898 ("block: add blk-mq support to blk_insert_cloned_request()")
Fixes: bfebd1c ("dm: add full blk-mq support to request-based DM")
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7f4c67d
@popcornmix popcornmix pushed a commit that referenced this issue Apr 12, 2016
@snitm snitm dm: fix excessive dm-mq context switching
commit 6acfe68 upstream.

Request-based DM's blk-mq support (dm-mq) was reported to be 50% slower
than if an underlying null_blk device were used directly.  One of the
reasons for this drop in performance is that blk_insert_clone_request()
was calling blk_mq_insert_request() with @async=true.  This forced the
use of kblockd_schedule_delayed_work_on() to run the blk-mq hw queues
which ushered in ping-ponging between process context (fio in this case)
and kblockd's kworker to submit the cloned request.  The ftrace
function_graph tracer showed:

  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...
  kworker-2013  =>   fio-12190
  fio-12190    =>  kworker-2013
  ...

Fixing blk_insert_clone_request()'s blk_mq_insert_request() call to
_not_ use kblockd to submit the cloned requests isn't enough to
eliminate the observed context switches.

In addition to this dm-mq specific blk-core fix, there are 2 DM core
fixes to dm-mq that (when paired with the blk-core fix) completely
eliminate the observed context switching:

1)  don't blk_mq_run_hw_queues in blk-mq request completion

    Motivated by desire to reduce overhead of dm-mq, punting to kblockd
    just increases context switches.

    In my testing against a really fast null_blk device there was no benefit
    to running blk_mq_run_hw_queues() on completion (and no other blk-mq
    driver does this).  So hopefully this change doesn't induce the need for
    yet another revert like commit 621739b !

2)  use blk_mq_complete_request() in dm_complete_request()

    blk_complete_request() doesn't offer the traditional q->mq_ops vs
    .request_fn branching pattern that other historic block interfaces
    do (e.g. blk_get_request).  Using blk_mq_complete_request() for
    blk-mq requests is important for performance.  It should be noted
    that, like blk_complete_request(), blk_mq_complete_request() doesn't
    natively handle partial completions -- but the request-based
    DM-multipath target does provide the required partial completion
    support by dm.c:end_clone_bio() triggering requeueing of the request
    via dm-mpath.c:multipath_end_io()'s return of DM_ENDIO_REQUEUE.

dm-mq fix #2 is _much_ more important than #1 for eliminating the
context switches.
Before: cpu          : usr=15.10%, sys=59.39%, ctx=7905181, majf=0, minf=475
After:  cpu          : usr=20.60%, sys=79.35%, ctx=2008, majf=0, minf=472

With these changes multithreaded async read IOPs improved from ~950K
to ~1350K for this dm-mq stacked on null_blk test-case.  The raw read
IOPs of the underlying null_blk device for the same workload is ~1950K.

Fixes: 7fb4898 ("block: add blk-mq support to blk_insert_cloned_request()")
Fixes: bfebd1c ("dm: add full blk-mq support to request-based DM")
Reported-by: Sagi Grimberg <sagig@dev.mellanox.co.il>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5504a47
@popcornmix popcornmix added a commit that referenced this issue Apr 13, 2016
@T-X T-X net: fix bridge multicast packet checksum validation
We need to update the skb->csum after pulling the skb, otherwise
an unnecessary checksum (re)computation can ocure for IGMP/MLD packets
in the bridge code. Additionally this fixes the following splats for
network devices / bridge ports with support for and enabled RX checksum
offloading:

[...]
[   43.986968] eth0: hw csum failure
[   43.990344] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.4.0 #2
[   43.996193] Hardware name: BCM2709
[   43.999647] [<800204e0>] (unwind_backtrace) from [<8001cf14>] (show_stack+0x10/0x14)
[   44.007432] [<8001cf14>] (show_stack) from [<801ab614>] (dump_stack+0x80/0x90)
[   44.014695] [<801ab614>] (dump_stack) from [<802e4548>] (__skb_checksum_complete+0x6c/0xac)
[   44.023090] [<802e4548>] (__skb_checksum_complete) from [<803a055c>] (ipv6_mc_validate_checksum+0x104/0x178)
[   44.032959] [<803a055c>] (ipv6_mc_validate_checksum) from [<802e111c>] (skb_checksum_trimmed+0x130/0x188)
[   44.042565] [<802e111c>] (skb_checksum_trimmed) from [<803a06e8>] (ipv6_mc_check_mld+0x118/0x338)
[   44.051501] [<803a06e8>] (ipv6_mc_check_mld) from [<803b2c98>] (br_multicast_rcv+0x5dc/0xd00)
[   44.060077] [<803b2c98>] (br_multicast_rcv) from [<803aa510>] (br_handle_frame_finish+0xac/0x51c)
[...]

Fixes: 9afd85c ("net: Export IGMP/MLD message validation code")
Reported-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
8855928
@popcornmix popcornmix pushed a commit that referenced this issue Apr 18, 2016
@bmork bmork drm/i915: fix deadlock on lid open
commit e2c8b87 moved modeset locking inside resume/suspend
functions, but missed a code path only executed on lid close/open
on older hardware. The result was a deadlock when closing and
opening the lid without suspending on such hardware:

 =============================================
 [ INFO: possible recursive locking detected ]
 4.6.0-rc1 #385 Not tainted
 ---------------------------------------------
 kworker/0:3/88 is trying to acquire lock:
  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa063e6a4>] intel_display_resume+0x4a/0x12f [i915]

 but task is already holding lock:
  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa02d0d4f>] drm_modeset_lock_all+0x3e/0xa6 [drm]

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&dev->mode_config.mutex);
   lock(&dev->mode_config.mutex);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 7 locks held by kworker/0:3/88:
  #0:  ("kacpi_notify"){++++.+}, at: [<ffffffff81068dfc>] process_one_work+0x14a/0x50b
  #1:  ((&dpc->work)#2){+.+.+.}, at: [<ffffffff81068dfc>] process_one_work+0x14a/0x50b
  #2:  ((acpi_lid_notifier).rwsem){++++.+}, at: [<ffffffff8106f874>] __blocking_notifier_call_chain+0x34/0x65
  #3:  (&dev_priv->modeset_restore_lock){+.+.+.}, at: [<ffffffffa0664cf6>] intel_lid_notify+0x3c/0xd9 [i915]
  #4:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa02d0d4f>] drm_modeset_lock_all+0x3e/0xa6 [drm]
  #5:  (crtc_ww_class_acquire){+.+.+.}, at: [<ffffffffa02d0d59>] drm_modeset_lock_all+0x48/0xa6 [drm]
  #6:  (crtc_ww_class_mutex){+.+.+.}, at: [<ffffffffa02d0b2a>] modeset_lock+0x13c/0x1cd [drm]

 stack backtrace:
 CPU: 0 PID: 88 Comm: kworker/0:3 Not tainted 4.6.0-rc1 #385
 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011
 Workqueue: kacpi_notify acpi_os_execute_deferred
  0000000000000000 ffff88022fd5f990 ffffffff8124af06 ffffffff825b39c0
  ffffffff825b39c0 ffff88022fd5fa60 ffffffff8108f547 ffff88022fd5fa70
  000000008108e817 ffff880230236cc0 0000000000000000 ffffffff825b39c0
 Call Trace:
  [<ffffffff8124af06>] dump_stack+0x67/0x90
  [<ffffffff8108f547>] __lock_acquire+0xdb5/0xf71
  [<ffffffff8108bd2c>] ? look_up_lock_class+0xbe/0x10a
  [<ffffffff8108fae2>] lock_acquire+0x137/0x1cb
  [<ffffffff8108fae2>] ? lock_acquire+0x137/0x1cb
  [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
  [<ffffffff8148202f>] mutex_lock_nested+0x7e/0x3a4
  [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
  [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
  [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
  [<ffffffffa063e6a4>] intel_display_resume+0x4a/0x12f [i915]
  [<ffffffffa063e6a4>] ? intel_display_resume+0x4a/0x12f [i915]
  [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
  [<ffffffffa02d0b2a>] ? modeset_lock+0x13c/0x1cd [drm]
  [<ffffffffa02d0bf7>] ? drm_modeset_lock+0x17/0x24 [drm]
  [<ffffffffa02d0c8b>] ? drm_modeset_lock_all_ctx+0x87/0xa1 [drm]
  [<ffffffffa0664d6a>] intel_lid_notify+0xb0/0xd9 [i915]
  [<ffffffff8106f4c6>] notifier_call_chain+0x4a/0x6c
  [<ffffffff8106f88d>] __blocking_notifier_call_chain+0x4d/0x65
  [<ffffffff8106f8b9>] blocking_notifier_call_chain+0x14/0x16
  [<ffffffffa0011215>] acpi_lid_send_state+0x83/0xad [button]
  [<ffffffffa00112a6>] acpi_button_notify+0x41/0x132 [button]
  [<ffffffff812b07df>] acpi_device_notify+0x19/0x1b
  [<ffffffff812c8570>] acpi_ev_notify_dispatch+0x49/0x64
  [<ffffffff812ab9fb>] acpi_os_execute_deferred+0x14/0x20
  [<ffffffff81068f17>] process_one_work+0x265/0x50b
  [<ffffffff810696f5>] worker_thread+0x1fc/0x2dd
  [<ffffffff810694f9>] ? rescuer_thread+0x309/0x309
  [<ffffffff810694f9>] ? rescuer_thread+0x309/0x309
  [<ffffffff8106e2d6>] kthread+0xe0/0xe8
  [<ffffffff8107bc47>] ? local_clock+0x19/0x22
  [<ffffffff81484f42>] ret_from_fork+0x22/0x40
  [<ffffffff8106e1f6>] ? kthread_create_on_node+0x1b5/0x1b5

Fixes: e2c8b87 ("drm/i915: Use atomic helpers for suspend, v2.")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459328913-13719-1-git-send-email-bjorn@mork.no
(cherry picked from commit 9f54d4b)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
42bf7b4
@davet321 davet321 pushed a commit to davet321/rpi-linux that referenced this issue Apr 21, 2016
Yang Shi arm64: replace read_lock to rcu lock in call_step_hook
commit cf0a254 upstream.

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8

CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[<ffff8000000885e8>] dump_backtrace+0x0/0x128
[<ffff800000088734>] show_stack+0x24/0x30
[<ffff80000079a7c4>] dump_stack+0x80/0xa0
[<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0
[<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40
[<ffff8000007a2268>] rt_read_lock+0x40/0x58
[<ffff800000085328>] single_step_handler+0x38/0xd8
[<ffff800000082368>] do_debug_exception+0x58/0xb8
Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
[<ffff8000000833f4>] el1_dbg+0x18/0x6c

This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in
call_break_hook"), but comes to single_step_handler.

This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
c8ed225
@popcornmix popcornmix pushed a commit that referenced this issue Apr 22, 2016
@T-X T-X net: fix bridge multicast packet checksum validation
[ Upstream commit 9b36881 ]

We need to update the skb->csum after pulling the skb, otherwise
an unnecessary checksum (re)computation can ocure for IGMP/MLD packets
in the bridge code. Additionally this fixes the following splats for
network devices / bridge ports with support for and enabled RX checksum
offloading:

[...]
[   43.986968] eth0: hw csum failure
[   43.990344] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.4.0 #2
[   43.996193] Hardware name: BCM2709
[   43.999647] [<800204e0>] (unwind_backtrace) from [<8001cf14>] (show_stack+0x10/0x14)
[   44.007432] [<8001cf14>] (show_stack) from [<801ab614>] (dump_stack+0x80/0x90)
[   44.014695] [<801ab614>] (dump_stack) from [<802e4548>] (__skb_checksum_complete+0x6c/0xac)
[   44.023090] [<802e4548>] (__skb_checksum_complete) from [<803a055c>] (ipv6_mc_validate_checksum+0x104/0x178)
[   44.032959] [<803a055c>] (ipv6_mc_validate_checksum) from [<802e111c>] (skb_checksum_trimmed+0x130/0x188)
[   44.042565] [<802e111c>] (skb_checksum_trimmed) from [<803a06e8>] (ipv6_mc_check_mld+0x118/0x338)
[   44.051501] [<803a06e8>] (ipv6_mc_check_mld) from [<803b2c98>] (br_multicast_rcv+0x5dc/0xd00)
[   44.060077] [<803b2c98>] (br_multicast_rcv) from [<803aa510>] (br_handle_frame_finish+0xac/0x51c)
[...]

Fixes: 9afd85c ("net: Export IGMP/MLD message validation code")
Reported-by: Álvaro Fernández Rojas <noltari@gmail.com>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
44bc7d1
@popcornmix popcornmix pushed a commit that referenced this issue Apr 22, 2016
Yang Shi arm64: replace read_lock to rcu lock in call_step_hook
commit cf0a254 upstream.

BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh
Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8

CPU: 3 PID: 383 Comm: sh Tainted: G        W       4.1.13-rt13 #2
Hardware name: Freescale Layerscape 2085a RDB Board (DT)
Call trace:
[<ffff8000000885e8>] dump_backtrace+0x0/0x128
[<ffff800000088734>] show_stack+0x24/0x30
[<ffff80000079a7c4>] dump_stack+0x80/0xa0
[<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0
[<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40
[<ffff8000007a2268>] rt_read_lock+0x40/0x58
[<ffff800000085328>] single_step_handler+0x38/0xd8
[<ffff800000082368>] do_debug_exception+0x58/0xb8
Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0)
7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000
7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000
7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000
7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000
7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000
7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000
7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff
7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff
7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000
[<ffff8000000833f4>] el1_dbg+0x18/0x6c

This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in
call_break_hook"), but comes to single_step_handler.

This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
f6dffe7
@popcornmix popcornmix pushed a commit that referenced this issue Apr 25, 2016
@davem330 davem330 Merge branch 'tipc-fixes'
Jon Maloy says:

====================
tipc: name distributor pernet queue

Commit #1 fixes a potential issue with deferred binding table
updates being pushed to the wrong namespace.

Commit #2 solves a problem with deferred binding table updates
remaining in the the defer queue after the issuing node has gone
down.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10c3c02
@popcornmix popcornmix pushed a commit that referenced this issue Apr 25, 2016
Adrian Hunter perf intel-pt: Fix segfault tracing transactions
Tracing a workload that uses transactions gave a seg fault as follows:

  perf record -e intel_pt// workload
  perf report
  Program received signal SIGSEGV, Segmentation fault.
  0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
  	at util/intel-pt.c:929
  929 ptq->last_branch_rb->nr = 0;
  (gdb) p ptq->last_branch_rb
  $1 = (struct branch_stack *) 0x0
  (gdb) up
  1148 intel_pt_reset_last_branch_rb(ptq);
  (gdb) l
  1143 if (ret)
  1144 pr_err("Intel Processor Trace: failed to deliver transaction event
  1145 ret);
  1146
  1147 if (pt->synth_opts.callchain)
  1148 intel_pt_reset_last_branch_rb(ptq);
  1149
  1150 return ret;
  1151 }
  1152
  (gdb) p pt->synth_opts.callchain
  $2 = true
  (gdb)
  (gdb) bt
   #0 0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
   #1 0x000000000054c1e0 in intel_pt_synth_transaction_sample (ptq=0x1a36110)
   #2 0x000000000054c5b2 in intel_pt_sample (ptq=0x1a36110)

Caused by checking the 'callchain' flag when it should have been the
'last_branch' flag.  Fix that.

Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: f14445e ("perf intel-pt: Support generating branch stack")
Link: http://lkml.kernel.org/r/1460977068-11566-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
1342e0b