Skip to content
This repository

su without password via /proc/pid/mem write #2

Closed
andrew-aladev opened this Issue January 23, 2012 · 1 comment

2 participants

Andrew Aladjev popcornmix
Andrew Aladjev

http://blog.zx2c4.com/749
your fs/proc/base.c is vulnerable too

Richo Healey richo referenced this issue from a commit in richo/linux November 23, 2011
usb/isp1760: Simpler queue head list code.
Small code refactoring to ease the real fix in patch #2.

Signed-off-by: Arvid Brodin <arvid.brodin@enea.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
e08f6a2
Richo Healey richo referenced this issue from a commit in richo/linux December 06, 2011
[SCSI] qla4xxx: fix call trace on rmmod with ql4xdontresethba=1
abort all active commands from eh_host_reset in-case
of ql4xdontresethba=1

Fix following call trace:-
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: qla4_8xxx_disable_msix: qla4xxx (rsp_q)
Nov 21 14:50:47 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A disabled
Nov 21 14:50:47 172.17.140.111 slab error in kmem_cache_destroy(): cache `qla4xxx_srbs': Can't free all objects
Nov 21 14:50:47 172.17.140.111 Pid: 9154, comm: rmmod Tainted: G           O 3.2.0-rc2+ #2
Nov 21 14:50:47 172.17.140.111 Call Trace:
Nov 21 14:50:47 172.17.140.111  [<c051231a>] ? kmem_cache_destroy+0x9a/0xb0
Nov 21 14:50:47 172.17.140.111  [<c0489c4a>] ? sys_delete_module+0x14a/0x210
Nov 21 14:50:47 172.17.140.111  [<c04fd552>] ? do_munmap+0x202/0x280
Nov 21 14:50:47 172.17.140.111  [<c04a6d4e>] ? audit_syscall_entry+0x1ae/0x1d0
Nov 21 14:50:47 172.17.140.111  [<c083019f>] ? sysenter_do_call+0x12/0x28
Nov 21 14:51:50 172.17.140.111 SLAB: cache with size 64 has lost its name
Nov 21 14:51:50 172.17.140.111 iscsi: registered transport (qla4xxx)
Nov 21 14:51:50 172.17.140.111 qla4xxx 0000:13:00.4: PCI INT A -> GSI 28 (level, low) -> IRQ 28

Signed-off-by: Sarang Radke <sarang.radke@qlogic.com>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
8a28896
Richo Healey richo referenced this issue from a commit in richo/linux December 27, 2011
Christian Borntraeger [S390] kvm: fix sleeping function ... at mm/page_alloc.c:2260
commit cc77245
    [S390] fix list corruption in gmap reverse mapping

added a potential dead lock:

BUG: sleeping function called from invalid context at mm/page_alloc.c:2260
in_atomic(): 1, irqs_disabled(): 0, pid: 1108, name: qemu-system-s39
3 locks held by qemu-system-s39/1108:
 #0:  (&kvm->slots_lock){+.+.+.}, at: [<000003e004866542>] kvm_set_memory_region+0x3a/0x6c [kvm]
 #1:  (&mm->mmap_sem){++++++}, at: [<0000000000123790>] gmap_map_segment+0x9c/0x298
 #2:  (&(&mm->page_table_lock)->rlock){+.+.+.}, at: [<00000000001237a8>] gmap_map_segment+0xb4/0x298
CPU: 0 Not tainted 3.1.3 #45
Process qemu-system-s39 (pid: 1108, task: 00000004f8b3cb30, ksp: 00000004fd5978d0)
00000004fd5979a0 00000004fd597920 0000000000000002 0000000000000000
       00000004fd5979c0 00000004fd597938 00000004fd597938 0000000000617e96
       0000000000000000 00000004f8b3cf58 0000000000000000 0000000000000000
       000000000000000d 000000000000000c 00000004fd597988 0000000000000000
       0000000000000000 0000000000100a18 00000004fd597920 00000004fd597960
Call Trace:
([<0000000000100926>] show_trace+0xee/0x144)
 [<0000000000131f3a>] __might_sleep+0x12a/0x158
 [<0000000000217fb4>] __alloc_pages_nodemask+0x224/0xadc
 [<0000000000123086>] gmap_alloc_table+0x46/0x114
 [<000000000012395c>] gmap_map_segment+0x268/0x298
 [<000003e00486b014>] kvm_arch_commit_memory_region+0x44/0x6c [kvm]
 [<000003e004866414>] __kvm_set_memory_region+0x3b0/0x4a4 [kvm]
 [<000003e004866554>] kvm_set_memory_region+0x4c/0x6c [kvm]
 [<000003e004867c7a>] kvm_vm_ioctl+0x14a/0x314 [kvm]
 [<0000000000292100>] do_vfs_ioctl+0x94/0x588
 [<0000000000292688>] SyS_ioctl+0x94/0xac
 [<000000000061e124>] sysc_noemu+0x22/0x28
 [<000003fffcd5e7ca>] 0x3fffcd5e7ca
3 locks held by qemu-system-s39/1108:
 #0:  (&kvm->slots_lock){+.+.+.}, at: [<000003e004866542>] kvm_set_memory_region+0x3a/0x6c [kvm]
 #1:  (&mm->mmap_sem){++++++}, at: [<0000000000123790>] gmap_map_segment+0x9c/0x298
 #2:  (&(&mm->page_table_lock)->rlock){+.+.+.}, at: [<00000000001237a8>] gmap_map_segment+0xb4/0x298

Fix this by freeing the lock on the alloc path. This is ok, since the
gmap table is never freed until we call gmap_free, so the table we are
walking cannot go.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
c86cce2
Richo Healey richo referenced this issue from a commit in richo/linux January 10, 2012
mempool: fix and document synchronization and memory barrier usage
mempool_alloc/free() use undocumented smp_mb()'s.  The code is slightly
broken and misleading.

The lockless part is in mempool_free().  It wants to determine whether the
item being freed needs to be returned to the pool or backing allocator
without grabbing pool->lock.  Two things need to be guaranteed for correct
operation.

1. pool->curr_nr + #allocated should never dip below pool->min_nr.
2. Waiters shouldn't be left dangling.

For #1, The only necessary condition is that curr_nr visible at free is
from after the allocation of the element being freed (details in the
comment).  For most cases, this is true without any barrier but there can
be fringe cases where the allocated pointer is passed to the freeing task
without going through memory barriers.  To cover this case, wmb is
necessary before returning from allocation and rmb is necessary before
reading curr_nr.  IOW,

	ALLOCATING TASK			FREEING TASK

	update pool state after alloc;
	wmb();
	pass pointer to freeing task;
					read pointer;
					rmb();
					read pool state to free;

The current code doesn't have wmb after pool update during allocation and
may theoretically, on machines where unlock doesn't behave as full wmb,
lead to pool depletion and deadlock.  smp_wmb() needs to be added after
successful allocation from reserved elements and smp_mb() in
mempool_free() can be replaced with smp_rmb().

For #2, the waiter needs to add itself to waitqueue and then check the
wait condition and the waker needs to update the wait condition and then
wake up.  Because waitqueue operations always go through full spinlock
synchronization, there is no need for extra memory barriers.

Furthermore, mempool_alloc() is already holding pool->lock when it decides
that it needs to wait.  There is no reason to do unlock - add waitqueue -
test condition again.  It can simply add itself to waitqueue while holding
pool->lock and then unlock and sleep.

This patch adds smp_wmb() after successful allocation from reserved pool,
replaces smp_mb() in mempool_free() with smp_rmb() and extend pool->lock
over waitqueue addition.  More importantly, it explains what memory
barriers do and how the lockless testing is correct.

-v2: Oleg pointed out that unlock doesn't imply wmb.  Added explicit
     smp_wmb() after successful allocation from reserved pool and
     updated comments accordingly.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5b99054
Richo Healey richo referenced this issue from a commit in richo/linux January 20, 2012
net: introduce res_counter_charge_nofail() for socket allocations
There is a case in __sk_mem_schedule(), where an allocation
is beyond the maximum, but yet we are allowed to proceed.
It happens under the following condition:

	sk->sk_wmem_queued + size >= sk->sk_sndbuf

The network code won't revert the allocation in this case,
meaning that at some point later it'll try to do it. Since
this is never communicated to the underlying res_counter
code, there is an inbalance in res_counter uncharge operation.

I see two ways of fixing this:

1) storing the information about those allocations somewhere
   in memcg, and then deducting from that first, before
   we start draining the res_counter,
2) providing a slightly different allocation function for
   the res_counter, that matches the original behavior of
   the network code more closely.

I decided to go for #2 here, believing it to be more elegant,
since #1 would require us to do basically that, but in a more
obscure way.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
CC: Tejun Heo <tj@kernel.org>
CC: Li Zefan <lizf@cn.fujitsu.com>
CC: Laurent Chavey <chavey@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
0e90b31
Richo Healey richo referenced this issue from a commit in richo/linux January 12, 2012
ming1 HID: usbhid: fix dead lock between open and disconect
There is no reason to hold hiddev->existancelock before
calling usb_deregister_dev, so move it out of the lock.

The patch fixes the lockdep warning below.

[ 5733.386271] ======================================================
[ 5733.386274] [ INFO: possible circular locking dependency detected ]
[ 5733.386278] 3.2.0-custom-next-20120111+ #1 Not tainted
[ 5733.386281] -------------------------------------------------------
[ 5733.386284] khubd/186 is trying to acquire lock:
[ 5733.386288]  (minor_rwsem){++++.+}, at: [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386311]
[ 5733.386312] but task is already holding lock:
[ 5733.386315]  (&hiddev->existancelock){+.+...}, at: [<ffffffffa0094d17>] hiddev_disconnect+0x26/0x87 [usbhid]
[ 5733.386328]
[ 5733.386329] which lock already depends on the new lock.
[ 5733.386330]
[ 5733.386333]
[ 5733.386334] the existing dependency chain (in reverse order) is:
[ 5733.386336]
[ 5733.386337] -> #1 (&hiddev->existancelock){+.+...}:
[ 5733.386346]        [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386357]        [<ffffffff813df961>] __mutex_lock_common+0x60/0x465
[ 5733.386366]        [<ffffffff813dfe4d>] mutex_lock_nested+0x36/0x3b
[ 5733.386371]        [<ffffffffa0094ad6>] hiddev_open+0x113/0x193 [usbhid]
[ 5733.386378]        [<ffffffffa0011971>] usb_open+0x66/0xc2 [usbcore]
[ 5733.386390]        [<ffffffff8111a8b5>] chrdev_open+0x12b/0x154
[ 5733.386402]        [<ffffffff811159a8>] __dentry_open.isra.16+0x20b/0x355
[ 5733.386408]        [<ffffffff811165dc>] nameidata_to_filp+0x43/0x4a
[ 5733.386413]        [<ffffffff81122ed5>] do_last+0x536/0x570
[ 5733.386419]        [<ffffffff8112300b>] path_openat+0xce/0x301
[ 5733.386423]        [<ffffffff81123327>] do_filp_open+0x33/0x81
[ 5733.386427]        [<ffffffff8111664d>] do_sys_open+0x6a/0xfc
[ 5733.386431]        [<ffffffff811166fb>] sys_open+0x1c/0x1e
[ 5733.386434]        [<ffffffff813e7c79>] system_call_fastpath+0x16/0x1b
[ 5733.386441]
[ 5733.386441] -> #0 (minor_rwsem){++++.+}:
[ 5733.386448]        [<ffffffff8108255d>] __lock_acquire+0xa80/0xd74
[ 5733.386454]        [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386458]        [<ffffffff813e01f5>] down_write+0x44/0x77
[ 5733.386464]        [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386475]        [<ffffffffa0094d2d>] hiddev_disconnect+0x3c/0x87 [usbhid]
[ 5733.386483]        [<ffffffff8132df51>] hid_disconnect+0x3f/0x54
[ 5733.386491]        [<ffffffff8132dfb4>] hid_device_remove+0x4e/0x7a
[ 5733.386496]        [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386502]        [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386507]        [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386512]        [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386519]        [<ffffffff8132def3>] hid_destroy_device+0x1e/0x3d
[ 5733.386525]        [<ffffffffa00916b0>] usbhid_disconnect+0x36/0x42 [usbhid]
[ 5733.386530]        [<ffffffffa000fb60>] usb_unbind_interface+0x57/0x11f [usbcore]
[ 5733.386542]        [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386547]        [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386552]        [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386557]        [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386562]        [<ffffffffa000de61>] usb_disable_device+0xa8/0x1d8 [usbcore]
[ 5733.386573]        [<ffffffffa0006bd2>] usb_disconnect+0xab/0x11f [usbcore]
[ 5733.386583]        [<ffffffffa0008aa0>] hub_thread+0x73b/0x1157 [usbcore]
[ 5733.386593]        [<ffffffff8105dc0f>] kthread+0x95/0x9d
[ 5733.386601]        [<ffffffff813e90b4>] kernel_thread_helper+0x4/0x10
[ 5733.386607]
[ 5733.386608] other info that might help us debug this:
[ 5733.386609]
[ 5733.386612]  Possible unsafe locking scenario:
[ 5733.386613]
[ 5733.386615]        CPU0                    CPU1
[ 5733.386618]        ----                    ----
[ 5733.386620]   lock(&hiddev->existancelock);
[ 5733.386625]                                lock(minor_rwsem);
[ 5733.386630]                                lock(&hiddev->existancelock);
[ 5733.386635]   lock(minor_rwsem);
[ 5733.386639]
[ 5733.386640]  *** DEADLOCK ***
[ 5733.386641]
[ 5733.386644] 6 locks held by khubd/186:
[ 5733.386646]  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffffa00084af>] hub_thread+0x14a/0x1157 [usbcore]
[ 5733.386661]  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffffa0006b77>] usb_disconnect+0x50/0x11f [usbcore]
[ 5733.386677]  #2:  (hcd->bandwidth_mutex){+.+.+.}, at: [<ffffffffa0006bc8>] usb_disconnect+0xa1/0x11f [usbcore]
[ 5733.386693]  #3:  (&__lockdep_no_validate__){......}, at: [<ffffffff812c09bb>] device_release_driver+0x18/0x2d
[ 5733.386704]  #4:  (&__lockdep_no_validate__){......}, at: [<ffffffff812c09bb>] device_release_driver+0x18/0x2d
[ 5733.386714]  #5:  (&hiddev->existancelock){+.+...}, at: [<ffffffffa0094d17>] hiddev_disconnect+0x26/0x87 [usbhid]
[ 5733.386727]
[ 5733.386727] stack backtrace:
[ 5733.386731] Pid: 186, comm: khubd Not tainted 3.2.0-custom-next-20120111+ #1
[ 5733.386734] Call Trace:
[ 5733.386741]  [<ffffffff81062881>] ? up+0x34/0x3b
[ 5733.386747]  [<ffffffff813d9ef3>] print_circular_bug+0x1f8/0x209
[ 5733.386752]  [<ffffffff8108255d>] __lock_acquire+0xa80/0xd74
[ 5733.386756]  [<ffffffff810808b4>] ? trace_hardirqs_on_caller+0x15d/0x1a3
[ 5733.386763]  [<ffffffff81043a3f>] ? vprintk+0x3f4/0x419
[ 5733.386774]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386779]  [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386789]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386797]  [<ffffffff813e01f5>] down_write+0x44/0x77
[ 5733.386807]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386818]  [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386825]  [<ffffffffa0094d2d>] hiddev_disconnect+0x3c/0x87 [usbhid]
[ 5733.386830]  [<ffffffff8132df51>] hid_disconnect+0x3f/0x54
[ 5733.386834]  [<ffffffff8132dfb4>] hid_device_remove+0x4e/0x7a
[ 5733.386839]  [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386844]  [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386848]  [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386854]  [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386859]  [<ffffffff8132def3>] hid_destroy_device+0x1e/0x3d
[ 5733.386865]  [<ffffffffa00916b0>] usbhid_disconnect+0x36/0x42 [usbhid]
[ 5733.386876]  [<ffffffffa000fb60>] usb_unbind_interface+0x57/0x11f [usbcore]
[ 5733.386882]  [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386886]  [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386890]  [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386895]  [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386905]  [<ffffffffa000de61>] usb_disable_device+0xa8/0x1d8 [usbcore]
[ 5733.386916]  [<ffffffffa0006bd2>] usb_disconnect+0xab/0x11f [usbcore]
[ 5733.386921]  [<ffffffff813dff82>] ? __mutex_unlock_slowpath+0x130/0x141
[ 5733.386929]  [<ffffffffa0008aa0>] hub_thread+0x73b/0x1157 [usbcore]
[ 5733.386935]  [<ffffffff8106a51d>] ? finish_task_switch+0x78/0x150
[ 5733.386941]  [<ffffffff8105e396>] ? __init_waitqueue_head+0x4c/0x4c
[ 5733.386950]  [<ffffffffa0008365>] ? usb_remote_wakeup+0x56/0x56 [usbcore]
[ 5733.386955]  [<ffffffff8105dc0f>] kthread+0x95/0x9d
[ 5733.386961]  [<ffffffff813e90b4>] kernel_thread_helper+0x4/0x10
[ 5733.386966]  [<ffffffff813e24b8>] ? retint_restore_args+0x13/0x13
[ 5733.386970]  [<ffffffff8105db7a>] ? __init_kthread_worker+0x55/0x55
[ 5733.386974]  [<ffffffff813e90b0>] ? gs_change+0x13/0x13

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ba18311
Richo Healey richo referenced this issue from a commit in richo/linux January 30, 2012
sstabellini xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback
CC: stable@kernel.org #2.6.37 and onwards
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
207d543
Richo Healey richo referenced this issue from a commit in richo/linux February 03, 2012
mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0bf380b
Richo Healey richo referenced this issue from a commit in richo/linux February 08, 2012
Linus Torvalds Merge tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/…
…git/tiwai/sound

sound fixes #2 for 3.3-rc3

A collection of small fixes, mostly for regressions.
In addition, a few ASoC wm8994 updates are included, too.

* tag 'sound-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: wm8994: Disable line output discharge prior to ramping VMID
  ASoC: wm8994: Fix typo in VMID ramp setting
  ALSA: oxygen, virtuoso: fix exchanged L/R volumes of aux and CD inputs
  ALSA: usb-audio: add Edirol UM-3G support
  ALSA: hda - add support for Uniwill ECS M31EI notebook
  ALSA: hda - Fix error handling in patch_ca0132.c
  ASoC: wm8994: Enabling VMID should take a runtime PM reference
  ALSA: hda/realtek - Fix a wrong condition
  ALSA: emu8000: Remove duplicate linux/moduleparam.h include from emu8000_patch.c
  ALSA: hda/realtek - Add missing Bass and CLFE as vmaster slaves
  ASoC: wm_hubs: Correct line input to line output 2 paths
  ASoC: cs42l73: Fix Output [X|A|V]SP_SCLK Sourcing Mode setting for master mode
  ASoC: wm8962: Fix word length configuration
  ASoC: core: Better support for idle_bias_off suspend ignores
  ASoC: wm8994: Remove ASoC level register cache sync
  ASoC: wm_hubs: Fix routing of input PGAs to line output mixer
e862f2e
Richo Healey richo referenced this issue from a commit in richo/linux January 07, 2012
yizou ixgbe: do not update real num queues when netdev is going away
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when
upper layer driver, e.g., FCoE protocol stack is monitoring the netdev
event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove
extra queues allocated for FCoE, the associated txq sysfs kobjects are already
removed, and trying to update the real num queues would cause something like
below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 #7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9d837ea
Richo Healey richo referenced this issue from a commit in richo/linux January 27, 2012
Andre Guedes Bluetooth: Fix potential deadlock
We don't need to use the _sync variant in hci_conn_hold and
hci_conn_put to cancel conn->disc_work delayed work. This way
we avoid potential deadlocks like this one reported by lockdep.

======================================================
[ INFO: possible circular locking dependency detected ]
3.2.0+ #1 Not tainted
-------------------------------------------------------
kworker/u:1/17 is trying to acquire lock:
 (&hdev->lock){+.+.+.}, at: [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]

but task is already holding lock:
 ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((&(&conn->disc_work)->work)){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff81034ed1>] wait_on_work+0x3d/0xaa
       [<ffffffff81035b54>] __cancel_work_timer+0xac/0xef
       [<ffffffff81035ba4>] cancel_delayed_work_sync+0xd/0xf
       [<ffffffffa00554b0>] smp_chan_create+0xde/0xe6 [bluetooth]
       [<ffffffffa0056160>] smp_conn_security+0xa3/0x12d [bluetooth]
       [<ffffffffa0053640>] l2cap_connect_cfm+0x237/0x2e8 [bluetooth]
       [<ffffffffa004239c>] hci_proto_connect_cfm+0x2d/0x6f [bluetooth]
       [<ffffffffa0046ea5>] hci_event_packet+0x29d1/0x2d60 [bluetooth]
       [<ffffffffa003dde3>] hci_rx_work+0xd0/0x2e1 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

-> #1 (slock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e553a>] _raw_spin_lock_bh+0x36/0x6a
       [<ffffffff81244d56>] lock_sock_nested+0x24/0x7f
       [<ffffffffa004d96f>] lock_sock+0xb/0xd [bluetooth]
       [<ffffffffa0052906>] l2cap_chan_connect+0xa9/0x26f [bluetooth]
       [<ffffffffa00545f8>] l2cap_sock_connect+0xb3/0xff [bluetooth]
       [<ffffffff81243b48>] sys_connect+0x69/0x8a
       [<ffffffff812e6579>] system_call_fastpath+0x16/0x1b

-> #0 (&hdev->lock){+.+.+.}:
       [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
       [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
       [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

other info that might help us debug this:

Chain exists of:
  &hdev->lock --> slock-AF_BLUETOOTH-BTPROTO_L2CAP --> (&(&conn->disc_work)->work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((&(&conn->disc_work)->work));
                               lock(slock-AF_BLUETOOTH-BTPROTO_L2CAP);
                               lock((&(&conn->disc_work)->work));
  lock(&hdev->lock);

 *** DEADLOCK ***

2 locks held by kworker/u:1/17:
 #0:  (hdev->name){.+.+.+}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf
 #1:  ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

stack backtrace:
Pid: 17, comm: kworker/u:1 Not tainted 3.2.0+ #1
Call Trace:
 [<ffffffff812e06c6>] print_circular_bug+0x1f8/0x209
 [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
 [<ffffffff81021ef2>] ? arch_local_irq_restore+0x6/0xd
 [<ffffffff81022bc7>] ? vprintk+0x3f9/0x41e
 [<ffffffff81057444>] lock_acquire+0x8a/0xa7
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff81190fd6>] ? __dynamic_pr_debug+0x6d/0x6f
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff8105320f>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
 [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff810357af>] process_one_work+0x178/0x2bf
 [<ffffffff81035751>] ? process_one_work+0x11a/0x2bf
 [<ffffffff81055af3>] ? lock_acquired+0x1d0/0x1df
 [<ffffffffa00410f3>] ? hci_acl_disconn+0x65/0x65 [bluetooth]
 [<ffffffff81036178>] worker_thread+0xce/0x152
 [<ffffffff810407ed>] ? finish_task_switch+0x45/0xc5
 [<ffffffff810360aa>] ? manage_workers.isra.25+0x16a/0x16a
 [<ffffffff81039a03>] kthread+0x95/0x9d
 [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10
 [<ffffffff812e5db4>] ? retint_restore_args+0x13/0x13
 [<ffffffff8103996e>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff812e7750>] ? gs_change+0x13/0x13

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Reviewed-by: Ulisses Furquim <ulisses@profusion.mobi>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
a51cd2b
Richo Healey richo referenced this issue from a commit in richo/linux February 21, 2012
mlx4: Replacing pool_lock with mutex
Under the spinlock we call request_irq(), which allocates memory with GFP_KERNEL,
This causes the following trace when DEBUG_SPINLOCK is enabled, it can cause
the following trace:

 BUG: spinlock wrong CPU on CPU#2, ethtool/2595
 lock: ffff8801f9cbc2b0, .magic: dead4ead, .owner: ethtool/2595, .owner_cpu: 0
 Pid: 2595, comm: ethtool Not tainted 3.0.18 #2
 Call Trace:
 spin_bug+0xa2/0xf0
 do_raw_spin_unlock+0x71/0xa0
 _raw_spin_unlock+0xe/0x10
 mlx4_assign_eq+0x12b/0x190 [mlx4_core]
 mlx4_en_activate_cq+0x252/0x2d0 [mlx4_en]
 ? mlx4_en_activate_rx_rings+0x227/0x370 [mlx4_en]
 mlx4_en_start_port+0x189/0xb90 [mlx4_en]
 mlx4_en_set_ringparam+0x29a/0x340 [mlx4_en]
 dev_ethtool+0x816/0xb10
 ? dev_get_by_name_rcu+0xa4/0xe0
 dev_ioctl+0x2b5/0x470
 handle_mm_fault+0x1cd/0x2d0
 sock_do_ioctl+0x5d/0x70
 sock_ioctl+0x79/0x2f0
 do_vfs_ioctl+0x8c/0x340
 sys_ioctl+0xa1/0xb0
 system_call_fastpath+0x16/0x1b

Replacing with mutex, which is enough in this case.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
730c41d
Chris Boot bootc referenced this issue from a commit in bootc/linux-rpi-orig February 03, 2012
mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9da11af
Raspberry Pi raspberrypi referenced this issue May 12, 2012
Closed

JIRA SW-5991 #20

Peter Huewe PeterHuewe referenced this issue from a commit in PeterHuewe/linux April 30, 2012
hwmon: (coretemp) fix oops on cpu unplug
commit b704871 upstream.

coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5f991cd
Ben Arblaster andatche referenced this issue from a commit in andatche/raspi-linux April 30, 2012
hwmon: (coretemp) fix oops on cpu unplug
commit b704871 upstream.

coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
f1f1ed7
Erik Hemming erique referenced this issue from a commit in erique/rpi_linux April 18, 2012
Stephen Boyd ks8851: Fix mutex deadlock in ks8851_net_stop()
There is a potential deadlock scenario when the ks8851 driver
is removed. The interrupt handler schedules a workqueue which
acquires a mutex that ks8851_net_stop() also acquires before
flushing the workqueue. Previously lockdep wouldn't be able
to find this problem but now that it has the support we can
trigger this lockdep warning by rmmoding the driver after
an ifconfig up.

Fix the possible deadlock by disabling the interrupts in
the chip and then release the lock across the workqueue
flushing. The mutex is only there to proect the registers
anyway so this should be ok.

=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.21-00021-g8b33780-dirty #2911
-------------------------------------------------------
rmmod/125 is trying to acquire lock:
 ((&ks->irq_work)){+.+...}, at: [<c019e0b8>] flush_work+0x0/0xac

but task is already holding lock:
 (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&ks->lock){+.+...}:
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c083dbec>] mutex_lock_nested+0x68/0x3dc
       [<bf00bd48>] ks8851_irq_work+0x24/0x46c [ks8851]
       [<c019c580>] process_one_work+0x2d8/0x518
       [<c019cb98>] worker_thread+0x220/0x3a0
       [<c01a2ad4>] kthread+0x88/0x94
       [<c0107008>] kernel_thread_exit+0x0/0x8

-> #0 ((&ks->irq_work)){+.+...}:
       [<c01b7984>] validate_chain+0x914/0x1018
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c019e104>] flush_work+0x4c/0xac
       [<bf00b858>] ks8851_net_stop+0x6c/0x138 [ks8851]
       [<c06b209c>] __dev_close_many+0x98/0xcc
       [<c06b2174>] dev_close_many+0x68/0xd0
       [<c06b22ec>] rollback_registered_many+0xcc/0x2b8
       [<c06b2554>] rollback_registered+0x28/0x34
       [<c06b25b8>] unregister_netdevice_queue+0x58/0x7c
       [<c06b25f4>] unregister_netdev+0x18/0x20
       [<bf00c1f4>] ks8851_remove+0x64/0xb4 [ks8851]
       [<c049ddf0>] spi_drv_remove+0x18/0x1c
       [<c0468e98>] __device_release_driver+0x7c/0xbc
       [<c0468f64>] driver_detach+0x8c/0xb4
       [<c0467f00>] bus_remove_driver+0xb8/0xe8
       [<c01c1d20>] sys_delete_module+0x1e8/0x27c
       [<c0105ec0>] ret_fast_syscall+0x0/0x3c

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ks->lock);
                               lock((&ks->irq_work));
                               lock(&ks->lock);
  lock((&ks->irq_work));

 *** DEADLOCK ***

4 locks held by rmmod/125:
 #0:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f44>] driver_detach+0x6c/0xb4
 #1:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f50>] driver_detach+0x78/0xb4
 #2:  (rtnl_mutex){+.+.+.}, at: [<c06b25e8>] unregister_netdev+0xc/0x20
 #3:  (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c5a9993
Erik Hemming erique referenced this issue from a commit in erique/rpi_linux April 26, 2012
xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won't be loaded.
There are exactly four users of __monitor and __mwait:

 - cstate.c (which allows acpi_processor_ffh_cstate_enter to be called
   when the cpuidle API drivers are used. However patch
   "cpuidle: replace xen access to x86 pm_idle and default_idle"
   provides a mechanism to disable the cpuidle and use safe_halt.
 - smpboot (which allows mwait_play_dead to be called). However
   safe_halt is always used so we skip that.
 - intel_idle (same deal as above).
 - acpi_pad.c. This the one that we do not want to run as we
   will hit the below crash.

Why do we want to expose MWAIT_LEAF in the first place?
We want it for the xen-acpi-processor driver - which uploads
C-states to the hypervisor. If MWAIT_LEAF is set, the cstate.c
sets the proper address in the C-states so that the hypervisor
can benefit from using the MWAIT functionality. And that is
the sole reason for using it.

Without this patch, if a module performs mwait or monitor we
get this:

invalid opcode: 0000 [#1] SMP
CPU 2
.. snip..
Pid: 5036, comm: insmod Tainted: G           O 3.4.0-rc2upstream-dirty #2 Intel Corporation S2600CP/S2600CP
RIP: e030:[<ffffffffa000a017>]  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
RSP: e02b:ffff8801c298bf18  EFLAGS: 00010282
RAX: ffff8801c298a010 RBX: ffffffffa03b2000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8801c29800d8 RDI: ffff8801ff097200
RBP: ffff8801c298bf18 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
R13: ffffffffa000a000 R14: 0000005148db7294 R15: 0000000000000003
FS:  00007fbb364f2700(0000) GS:ffff8801ff08c000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000179f038 CR3: 00000001c9469000 CR4: 0000000000002660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process insmod (pid: 5036, threadinfo ffff8801c298a000, task ffff8801c29cd7e0)
Stack:
 ffff8801c298bf48 ffffffff81002124 ffffffffa03b2000 00000000000081fd
 000000000178f010 000000000178f030 ffff8801c298bf78 ffffffff810c41e6
 00007fff3fb30db9 00007fff3fb30db9 00000000000081fd 0000000000010000
Call Trace:
 [<ffffffff81002124>] do_one_initcall+0x124/0x170
 [<ffffffff810c41e6>] sys_init_module+0xc6/0x220
 [<ffffffff815b15b9>] system_call_fastpath+0x16/0x1b
Code: <0f> 01 c8 31 c0 0f 01 c9 c9 c3 00 00 00 00 00 00 00 00 00 00 00 00
RIP  [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check]
 RSP <ffff8801c298bf18>
---[ end trace 16582fc8a3d1e29a ]---
Kernel panic - not syncing: Fatal exception

With this module (which is what acpi_pad.c would hit):

MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>");
MODULE_DESCRIPTION("mwait_check_and_back");
MODULE_LICENSE("GPL");
MODULE_VERSION();

static int __init mwait_check_init(void)
{
	__monitor((void *)&current_thread_info()->flags, 0, 0);
	__mwait(0, 0);
	return 0;
}
static void __exit mwait_check_exit(void)
{
}
module_init(mwait_check_init);
module_exit(mwait_check_exit);

Reported-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
df88b2d
Erik Hemming erique referenced this issue from a commit in erique/rpi_linux April 30, 2012
hwmon: (coretemp) fix oops on cpu unplug
coretemp tries to access core_data array beyond bounds on cpu unplug if
core id of the cpu if more than NUM_REAL_CORES-1.

BUG: unable to handle kernel NULL pointer dereference at 000000000000013c
IP: [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
PGD 673e5a067 PUD 66e9b3067 PMD 0
Oops: 0000 [#1] SMP
CPU 79
Modules linked in: sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter nf_conntrack_ipv4 nf_defrag_ipv4 ip6_tables xt_state nf_conntrack coretemp crc32c_intel asix tpm_tis pcspkr usbnet iTCO_wdt i2c_i801 microcode mii joydev tpm i2c_core iTCO_vendor_support tpm_bios i7core_edac igb ioatdma edac_core dca megaraid_sas [last unloaded: oprofile]

Pid: 3315, comm: set-cpus Tainted: G        W    3.4.0-rc5+ #2 QCI QSSC-S4R/QSSC-S4R
RIP: 0010:[<ffffffffa00159af>]  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
RSP: 0018:ffff880472fb3d48  EFLAGS: 00010246
RAX: 0000000000000124 RBX: 0000000000000034 RCX: 00000000ffffffff
RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
RBP: ffff880472fb3d88 R08: ffff88077fcd36c0 R09: 0000000000000001
R10: ffffffff8184bc48 R11: 0000000000000000 R12: ffff880273095800
R13: 0000000000000013 R14: ffff8802730a1810 R15: 0000000000000000
FS:  00007f694a20f720(0000) GS:ffff88077fcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000013c CR3: 000000067209b000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process set-cpus (pid: 3315, threadinfo ffff880472fb2000, task ffff880471fa0000)
Stack:
 ffff880277b4c308 0000000000000003 ffff880472fb3d88 0000000000000005
 0000000000000034 00000000ffffffd1 ffffffff81cadc70 ffff880472fb3e14
 ffff880472fb3dc8 ffffffff8161f48d ffff880471fa0000 0000000000000034
Call Trace:
 [<ffffffff8161f48d>] notifier_call_chain+0x4d/0x70
 [<ffffffff8107f1be>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff81059d30>] __cpu_notify+0x20/0x40
 [<ffffffff815fa251>] _cpu_down+0x81/0x270
 [<ffffffff815fa477>] cpu_down+0x37/0x50
 [<ffffffff815fd6a3>] store_online+0x63/0xc0
 [<ffffffff813c7078>] dev_attr_store+0x18/0x30
 [<ffffffff811f02cf>] sysfs_write_file+0xef/0x170
 [<ffffffff81180443>] vfs_write+0xb3/0x180
 [<ffffffff8118076a>] sys_write+0x4a/0x90
 [<ffffffff816236a9>] system_call_fastpath+0x16/0x1b
Code: 48 c7 c7 94 60 01 a0 44 0f b7 ac 10 ac 00 00 00 31 c0 e8 41 b7 5f e1 41 83 c5 02 49 63 c5 49 8b 44 c4 10 48 85 c0 74 56 45 31 ff <39> 58 18 75 4e eb 1f 49 63 d7 4c 89 f7 48 89 45 c8 48 6b d2 28
RIP  [<ffffffffa00159af>] coretemp_cpu_callback+0x93/0x1ba [coretemp]
 RSP <ffff880472fb3d48>
CR2: 000000000000013c

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 3.0+
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
b704871
Erik Hemming erique referenced this issue from a commit in erique/rpi_linux May 15, 2012
James Bottomley [PARISC] fix PA1.1 oops on boot
All PA1.1 systems have been oopsing on boot since

commit f311847
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date:   Wed Dec 22 10:22:11 2010 -0600

    parisc: flush pages through tmpalias space

because a PA2.0 instruction was accidentally introduced into the PA1.1 TLB
insertion interruption path when it was consolidated with the do_alias macro.
Fix the do_alias macro only to use PA2.0 instructions if compiled for 64 bit.
Cc: stable@vger.kernel.org  #2.6.39+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
5e18558
Erik Hemming erique referenced this issue from a commit in erique/rpi_linux May 16, 2012
[PARISC] fix crash in flush_icache_page_asm on PA1.1
As pointed out by serveral people, PA1.1 only has a type 26 instruction
meaning that the space register must be explicitly encoded.  Not giving an
explicit space means that the compiler uses the type 24 version which is PA2.0
only resulting in an illegal instruction crash.

This regression was caused by

    commit f311847
    Author: James Bottomley <James.Bottomley@HansenPartnership.com>
    Date:   Wed Dec 22 10:22:11 2010 -0600

        parisc: flush pages through tmpalias space

Reported-by: Helge Deller <deller@gmx.de>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org	#2.6.39+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
207f583
Erik Hemming erique referenced this issue from a commit in erique/rpi_linux May 29, 2012
mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race …
…condition

commit 26c1917 upstream.

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2d363e9
popcornmix popcornmix referenced this issue from a commit May 15, 2012
shirishpargaonkar cifs: Include backup intent search flags during searches {try #2)
commit 2608bee upstream.

As observed and suggested by Tushar Gosavi...

---------
readdir calls these function to send TRANS2_FIND_FIRST and
TRANS2_FIND_NEXT command to the server. The current cifs module is
not specifying CIFS_SEARCH_BACKUP_SEARCH flag while sending these
command when backupuid/backupgid is specified. This can be resolved
by specifying CIFS_SEARCH_BACKUP_SEARCH flag.
---------

Reported-and-Tested-by: Tushar Gosavi <tugosavi@in.ibm.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
90f560e
popcornmix popcornmix referenced this issue from a commit June 01, 2012
Stanislaw Gruszka rt2x00: use atomic variable for seqno
commit e5851da upstream.

Remove spinlock as atomic_t can be used instead. Note we use only 16
lower bits, upper bits are changed but we impilcilty cast to u16.

This fix possible deadlock on IBSS mode reproted by lockdep:

=================================
[ INFO: inconsistent lock state ]
3.4.0-wl+ #4 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/u:2/30374 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&intf->seqlock)->rlock){+.?...}, at: [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
{IN-SOFTIRQ-W} state was registered at:
  [<c04978ab>] __lock_acquire+0x47b/0x1050
  [<c0498504>] lock_acquire+0x84/0xf0
  [<c0835733>] _raw_spin_lock+0x33/0x40
  [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
  [<f9979f2a>] rt2x00queue_write_tx_frame+0x1a/0x300 [rt2x00lib]
  [<f997834f>] rt2x00mac_tx+0x7f/0x380 [rt2x00lib]
  [<f98fe363>] __ieee80211_tx+0x1b3/0x300 [mac80211]
  [<f98ffdf5>] ieee80211_tx+0x105/0x130 [mac80211]
  [<f99000dd>] ieee80211_xmit+0xad/0x100 [mac80211]
  [<f9900519>] ieee80211_subif_start_xmit+0x2d9/0x930 [mac80211]
  [<c0782e87>] dev_hard_start_xmit+0x307/0x660
  [<c079bb71>] sch_direct_xmit+0xa1/0x1e0
  [<c0784bb3>] dev_queue_xmit+0x183/0x730
  [<c078c27a>] neigh_resolve_output+0xfa/0x1e0
  [<c07b436a>] ip_finish_output+0x24a/0x460
  [<c07b4897>] ip_output+0xb7/0x100
  [<c07b2d60>] ip_local_out+0x20/0x60
  [<c07e01ff>] igmpv3_sendpack+0x4f/0x60
  [<c07e108f>] igmp_ifc_timer_expire+0x29f/0x330
  [<c04520fc>] run_timer_softirq+0x15c/0x2f0
  [<c0449e3e>] __do_softirq+0xae/0x1e0
irq event stamp: 18380437
hardirqs last  enabled at (18380437): [<c0526027>] __slab_alloc.clone.3+0x67/0x5f0
hardirqs last disabled at (18380436): [<c0525ff3>] __slab_alloc.clone.3+0x33/0x5f0
softirqs last  enabled at (18377616): [<c0449eb3>] __do_softirq+0x123/0x1e0
softirqs last disabled at (18377611): [<c041278d>] do_softirq+0x9d/0xe0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&intf->seqlock)->rlock);
  <Interrupt>
    lock(&(&intf->seqlock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:2/30374:
 #0:  (wiphy_name(local->hw.wiphy)){++++.+}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #1:  ((&sdata->work)){+.+.+.}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #2:  (&ifibss->mtx){+.+.+.}, at: [<f98f005b>] ieee80211_ibss_work+0x1b/0x470 [mac80211]
 #3:  (&intf->beacon_skb_mutex){+.+...}, at: [<f997a644>] rt2x00queue_update_beacon+0x24/0x50 [rt2x00lib]

stack backtrace:
Pid: 30374, comm: kworker/u:2 Not tainted 3.4.0-wl+ #4
Call Trace:
 [<c04962a6>] print_usage_bug+0x1f6/0x220
 [<c0496a12>] mark_lock+0x2c2/0x300
 [<c0495ff0>] ? check_usage_forwards+0xc0/0xc0
 [<c04978ec>] __lock_acquire+0x4bc/0x1050
 [<c0527890>] ? __kmalloc_track_caller+0x1c0/0x1d0
 [<c0777fb6>] ? copy_skb_header+0x26/0x90
 [<c0498504>] lock_acquire+0x84/0xf0
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<c0835733>] _raw_spin_lock+0x33/0x40
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f997a5cf>] rt2x00queue_update_beacon_locked+0x5f/0xb0 [rt2x00lib]
 [<f997a64d>] rt2x00queue_update_beacon+0x2d/0x50 [rt2x00lib]
 [<f9977e3a>] rt2x00mac_bss_info_changed+0x1ca/0x200 [rt2x00lib]
 [<f9977c70>] ? rt2x00mac_remove_interface+0x70/0x70 [rt2x00lib]
 [<f98e4dd0>] ieee80211_bss_info_change_notify+0xe0/0x1d0 [mac80211]
 [<f98ef7b8>] __ieee80211_sta_join_ibss+0x3b8/0x610 [mac80211]
 [<c0496ab4>] ? mark_held_locks+0x64/0xc0
 [<c0440012>] ? virt_efi_query_capsule_caps+0x12/0x50
 [<f98efb09>] ieee80211_sta_join_ibss+0xf9/0x140 [mac80211]
 [<f98f0456>] ieee80211_ibss_work+0x416/0x470 [mac80211]
 [<c0496d8b>] ? trace_hardirqs_on+0xb/0x10
 [<c077683b>] ? skb_dequeue+0x4b/0x70
 [<f98f207f>] ieee80211_iface_work+0x13f/0x230 [mac80211]
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<c045d015>] process_one_work+0x185/0x3f0
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<f98f1f40>] ? ieee80211_teardown_sdata+0xa0/0xa0 [mac80211]
 [<c045ed86>] worker_thread+0x116/0x270
 [<c045ec70>] ? manage_workers+0x1e0/0x1e0
 [<c0462f64>] kthread+0x84/0x90
 [<c0462ee0>] ? __init_kthread_worker+0x60/0x60
 [<c083d382>] kernel_thread_helper+0x6/0x10

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
6c69679
popcornmix popcornmix referenced this issue from a commit May 08, 2012
edac: avoid mce decoding crash after edac driver unloaded
commit e35fca4 upstream.

Some edac drivers register themselves as mce decoders via
notifier_chain. But in current notifier_chain implementation logic,
it doesn't accept same notifier registered twice. If so, it will be
wrong when adding/removing the element from the list. For example,
on one SandyBridge platform, remove module sb_edac and then trigger
one error, it will hit oops because it has no mce decoder registered
but related notifier_chain still points to an invalid callback
function. Here is an example:

Call Trace:
 [<ffffffff8150ef6a>] atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff8102b936>] mce_log+0x46/0x180
 [<ffffffff8102eaea>] apei_mce_report_mem_error+0x4a/0x60
 [<ffffffff812e19d2>] ghes_do_proc+0x192/0x210
 [<ffffffff812e2066>] ghes_proc+0x46/0x70
 [<ffffffff812e20d8>] ghes_notify_sci+0x48/0x80
 [<ffffffff8150ef05>] notifier_call_chain+0x55/0x80
 [<ffffffff81076f1a>] __blocking_notifier_call_chain+0x5a/0x80
 [<ffffffff812aea11>] ? acpi_os_wait_events_complete+0x23/0x23
 [<ffffffff81076f56>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff812ddc4d>] acpi_hed_notify+0x19/0x1b
 [<ffffffff812b16bd>] acpi_device_notify+0x19/0x1b
 [<ffffffff812beb38>] acpi_ev_notify_dispatch+0x67/0x7f
 [<ffffffff812aea3a>] acpi_os_execute_deferred+0x29/0x36
 [<ffffffff81069dc2>] process_one_work+0x132/0x450
 [<ffffffff8106bbcb>] worker_thread+0x17b/0x3c0
 [<ffffffff8106ba50>] ? manage_workers+0x120/0x120
 [<ffffffff81070aee>] kthread+0x9e/0xb0
 [<ffffffff81514724>] kernel_thread_helper+0x4/0x10
 [<ffffffff81070a50>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff81514720>] ? gs_change+0x13/0x13
Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41
0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d <4c> 8b
79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41
RIP  [<ffffffff8150eef6>] notifier_call_chain+0x46/0x80
 RSP <ffff88042868fb20>
CR2: ffffffffa01af838
---[ end trace 0100930068e73e6f ]---
BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [<ffffffff810705b0>] kthread_data+0x10/0x20
PGD 1a0d067 PUD 1a0e067 PMD 0
Oops: 0000 [#2] SMP

Only i7core_edac and sb_edac have such issues because they have more
than one memory controller which means they have to register mce
decoder many times.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
[bwh: Backported to 3.2: drivers call atomic_notifier_chain_{,un}register()
 directly]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
4ce3278
popcornmix popcornmix referenced this issue from a commit May 29, 2012
mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race …
…condition

commit 26c1917 upstream.

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
02d1854
popcornmix popcornmix referenced this issue from a commit June 12, 2012
elp cfg80211: fix potential deadlock in regulatory
commit fe20b39 upstream.

reg_timeout_work() calls restore_regulatory_settings() which
takes cfg80211_mutex.

reg_set_request_processed() already holds cfg80211_mutex
before calling cancel_delayed_work_sync(reg_timeout),
so it might deadlock.

Call the async cancel_delayed_work instead, in order
to avoid the potential deadlock.

This is the relevant lockdep warning:

cfg80211: Calling CRDA for country: XX

======================================================
[ INFO: possible circular locking dependency detected ]
3.4.0-rc5-wl+ #26 Not tainted
-------------------------------------------------------
kworker/0:2/1391 is trying to acquire lock:
 (cfg80211_mutex){+.+.+.}, at: [<bf28ae00>] restore_regulatory_settings+0x34/0x418 [cfg80211]

but task is already holding lock:
 ((reg_timeout).work){+.+...}, at: [<c0059e94>] process_one_work+0x1f0/0x480

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((reg_timeout).work){+.+...}:
       [<c008fd44>] validate_chain+0xb94/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c005b600>] wait_on_work+0x4c/0x154
       [<c005c000>] __cancel_work_timer+0xd4/0x11c
       [<c005c064>] cancel_delayed_work_sync+0x1c/0x20
       [<bf28b274>] reg_set_request_processed+0x50/0x78 [cfg80211]
       [<bf28bd84>] set_regdom+0x550/0x600 [cfg80211]
       [<bf294cd8>] nl80211_set_reg+0x218/0x258 [cfg80211]
       [<c03c7738>] genl_rcv_msg+0x1a8/0x1e8
       [<c03c6a00>] netlink_rcv_skb+0x5c/0xc0
       [<c03c7584>] genl_rcv+0x28/0x34
       [<c03c6720>] netlink_unicast+0x15c/0x228
       [<c03c6c7c>] netlink_sendmsg+0x218/0x298
       [<c03933c8>] sock_sendmsg+0xa4/0xc0
       [<c039406c>] __sys_sendmsg+0x1e4/0x268
       [<c0394228>] sys_sendmsg+0x4c/0x70
       [<c0013840>] ret_fast_syscall+0x0/0x3c

-> #1 (reg_mutex){+.+.+.}:
       [<c008fd44>] validate_chain+0xb94/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c04734dc>] mutex_lock_nested+0x48/0x320
       [<bf28b2cc>] reg_todo+0x30/0x538 [cfg80211]
       [<c0059f44>] process_one_work+0x2a0/0x480
       [<c005a4b4>] worker_thread+0x1bc/0x2bc
       [<c0061148>] kthread+0x98/0xa4
       [<c0014af4>] kernel_thread_exit+0x0/0x8

-> #0 (cfg80211_mutex){+.+.+.}:
       [<c008ed58>] print_circular_bug+0x68/0x2cc
       [<c008fb28>] validate_chain+0x978/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c04734dc>] mutex_lock_nested+0x48/0x320
       [<bf28ae00>] restore_regulatory_settings+0x34/0x418 [cfg80211]
       [<bf28b200>] reg_timeout_work+0x1c/0x20 [cfg80211]
       [<c0059f44>] process_one_work+0x2a0/0x480
       [<c005a4b4>] worker_thread+0x1bc/0x2bc
       [<c0061148>] kthread+0x98/0xa4
       [<c0014af4>] kernel_thread_exit+0x0/0x8

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex --> (reg_timeout).work

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((reg_timeout).work);
                               lock(reg_mutex);
                               lock((reg_timeout).work);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

2 locks held by kworker/0:2/1391:
 #0:  (events){.+.+.+}, at: [<c0059e94>] process_one_work+0x1f0/0x480
 #1:  ((reg_timeout).work){+.+...}, at: [<c0059e94>] process_one_work+0x1f0/0x480

stack backtrace:
[<c001b928>] (unwind_backtrace+0x0/0x12c) from [<c0471d3c>] (dump_stack+0x20/0x24)
[<c0471d3c>] (dump_stack+0x20/0x24) from [<c008ef70>] (print_circular_bug+0x280/0x2cc)
[<c008ef70>] (print_circular_bug+0x280/0x2cc) from [<c008fb28>] (validate_chain+0x978/0x10f0)
[<c008fb28>] (validate_chain+0x978/0x10f0) from [<c0090b68>] (__lock_acquire+0x8c8/0x9b0)
[<c0090b68>] (__lock_acquire+0x8c8/0x9b0) from [<c0090d40>] (lock_acquire+0xf0/0x114)
[<c0090d40>] (lock_acquire+0xf0/0x114) from [<c04734dc>] (mutex_lock_nested+0x48/0x320)
[<c04734dc>] (mutex_lock_nested+0x48/0x320) from [<bf28ae00>] (restore_regulatory_settings+0x34/0x418 [cfg80211])
[<bf28ae00>] (restore_regulatory_settings+0x34/0x418 [cfg80211]) from [<bf28b200>] (reg_timeout_work+0x1c/0x20 [cfg80211])
[<bf28b200>] (reg_timeout_work+0x1c/0x20 [cfg80211]) from [<c0059f44>] (process_one_work+0x2a0/0x480)
[<c0059f44>] (process_one_work+0x2a0/0x480) from [<c005a4b4>] (worker_thread+0x1bc/0x2bc)
[<c005a4b4>] (worker_thread+0x1bc/0x2bc) from [<c0061148>] (kthread+0x98/0xa4)
[<c0061148>] (kthread+0x98/0xa4) from [<c0014af4>] (kernel_thread_exit+0x0/0x8)
cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
d120768
popcornmix popcornmix referenced this issue from a commit in popcornmix/linux July 11, 2012
cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
commit 3cf003c upstream.

Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
bb761c9
popcornmix popcornmix referenced this issue from a commit in popcornmix/linux July 23, 2012
nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
1c88c58
popcornmix
Owner

The kernel update to 3.2.27 should have solved this. Please reopen if you don't think this is fixed.

popcornmix popcornmix closed this August 20, 2012
chrisw2 chrisw2 referenced this issue from a commit December 13, 2011
lockdep/waitqueues: Add better annotation
 -> #2 (&tty->write_wait){-.-...}:

is a lot more informative than:

 -> #2 (key#19){-.....}:

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-8zpopbny51023rdb0qq67eye@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
f07fdec
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 12, 2012
usb: chipidea: udc: don't stall endpoint if request list is empty in …
…isr_tr_complete_low

When attaching an imx28 or imx53 in USB gadget mode to a Windows host and
starting a rndis connection we see this message every 4-10 seconds:

    g_ether gadget: high speed config #2: RNDIS

Analysis shows that each time this message is printed, the rndis connection is
re-establish due to a reset because of a stalled endpoint (ep 0, dir 1). The
endpoint is stalled because the reqeust complete bit on that endpoint is set,
but in isr_tr_complete_low() the endpoint request list (mEp->qh.queue) is
empty.

This patch removed this check, because the code doesn't take the following
situation into account:

The loop over all endpoints in isr_tr_complete_handler() will call ep_nuke() on
both ep0/dir0 and ep/dir1 in the first loop. Pending reqeusts will be flushed
and completed here. There seems to be a race condition, the request is nuked,
but the request complete bit will be set, too. The subsequent check (in
ep0/dir1's loop cycle) for endpoint request list (mEp->qh.queue) empty will
fail.

Both other mainline chipidea drivers (mv_udc_core.c and fsl_udc_core.c) don't
have this check.

Cc: stable <stable@vger.kernel.org>
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
db89960
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi August 21, 2012
[SCSI] bnx2i: Fixed NULL ptr deference for 1G bnx2 Linux iSCSI offload
This patch fixes the following kernel panic invoked by uninitialized fields
in the chip initialization for the 1G bnx2 iSCSI offload.

One of the bits in the chip initialization is being used by the latest
firmware to control overflow packets.  When this control bit gets enabled
erroneously, it would ultimately result in a bad packet placement which would
cause the bnx2 driver to dereference a NULL ptr in the placement handler.

This can happen under certain stress I/O environment under the Linux
iSCSI offload operation.

This change only affects Broadcom's 5709 chipset.

Unable to handle kernel NULL pointer dereference at 0000000000000008 RIP:
 [<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5
Pid: 0, comm: swapper Tainted: G     ---- 2.6.18-333.el5debug #2
RIP: 0010:[<ffffffff881f0e7d>]  [<ffffffff881f0e7d>] :bnx2:bnx2_poll_work+0xd0d/0x13c5
RSP: 0018:ffff8101b575bd50  EFLAGS: 00010216
RAX: 0000000000000005 RBX: ffff81007c5fb180 RCX: 0000000000000000
RDX: 0000000000000ffc RSI: 00000000817e8000 RDI: 0000000000000220
RBP: ffff81015bbd7ec0 R08: ffff8100817e9000 R09: 0000000000000000
R10: ffff81007c5fb180 R11: 00000000000000c8 R12: 000000007a25a010
R13: 0000000000000000 R14: 0000000000000005 R15: ffff810159f80558
FS:  0000000000000000(0000) GS:ffff8101afebc240(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 0000000000201000 CR4: 00000000000006a0
Process swapper (pid: 0, threadinfo ffff8101b5754000, task ffff8101afebd820)
Stack:  000000000000000b ffff810159f80000 0000000000000040 ffff810159f80520
 ffff810159f80500 00cf00cf8008e84b ffffc200100939e0 ffff810009035b20
 0000502900000000 000000be00000001 ffff8100817e7810 00d08101b575bea8
Call Trace:
 <IRQ>  [<ffffffff8008e0d0>] show_schedstat+0x1c2/0x25b
 [<ffffffff881f1886>] :bnx2:bnx2_poll+0xf6/0x231
 [<ffffffff8000c9b9>] net_rx_action+0xac/0x1b1
 [<ffffffff800125a0>] __do_softirq+0x89/0x133
 [<ffffffff8005e30c>] call_softirq+0x1c/0x28
 [<ffffffff8006d5de>] do_softirq+0x2c/0x7d
 [<ffffffff8006d46e>] do_IRQ+0xee/0xf7
 [<ffffffff8005d625>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff801a5780>] acpi_processor_idle_simple+0x1c5/0x341
 [<ffffffff801a573d>] acpi_processor_idle_simple+0x182/0x341
 [<ffffffff801a55bb>] acpi_processor_idle_simple+0x0/0x341
 [<ffffffff80049560>] cpu_idle+0x95/0xb8
 [<ffffffff80078b1c>] start_secondary+0x479/0x488

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d653220
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 13, 2012
xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a05 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
4026c9f
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 18, 2012
ARM: SAMSUNG: Use spin_lock_{irqsave,irqrestore} in clk_set_rate
The spinlock clocks_lock can be held during ISR, hence it is not safe to
hold that lock with disabling interrupts.

It fixes following potential deadlock.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.6.0-rc4+ #2 Not tainted
---------------------------------------------------------
swapper/0/1 just changed the state of lock:
 (&(&host->lock)->rlock){-.....}, at: [<c027fb0d>] sdhci_irq+0x15/0x564
but this lock took another, HARDIRQ-unsafe lock in the past:
 (clocks_lock){+.+...}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(clocks_lock);
                               local_irq_disable();
                               lock(&(&host->lock)->rlock);
                               lock(clocks_lock);
  <Interrupt>
    lock(&(&host->lock)->rlock);

 *** DEADLOCK ***

Signed-off-by: Tushar Behera <tushar.behera@linaro.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
d6838a6
Stephen Warren swarren referenced this issue from a commit September 19, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit September 19, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit September 19, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit September 19, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit September 19, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 14, 2012
minipli xfrm_user: return error pointer instead of NULL #2
When dump_one_policy() returns an error, e.g. because of a too small
buffer to dump the whole xfrm policy, xfrm_policy_netlink() returns
NULL instead of an error pointer. But its caller expects an error
pointer and therefore continues to operate on a NULL skbuff.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c254637
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 13, 2012
xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a05 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
0ba6e53
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 14, 2012
Bluetooth: Fix not removing power_off delayed work
For example, when a usb reset is received (I could reproduce it
running something very similar to this[1] in a loop) it could be
that the device is unregistered while the power_off delayed work
is still scheduled to run.

Backtrace:

WARNING: at lib/debugobjects.c:261 debug_print_object+0x7c/0x8d()
Hardware name: To Be Filled By O.E.M.
ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x26
Modules linked in: nouveau mxm_wmi btusb wmi bluetooth ttm coretemp drm_kms_helper
Pid: 2114, comm: usb-reset Not tainted 3.5.0bt-next #2
Call Trace:
 [<ffffffff8124cc00>] ? free_obj_work+0x57/0x91
 [<ffffffff81058f88>] warn_slowpath_common+0x7e/0x97
 [<ffffffff81059035>] warn_slowpath_fmt+0x41/0x43
 [<ffffffff8124ccb6>] debug_print_object+0x7c/0x8d
 [<ffffffff8106e3ec>] ? __queue_work+0x259/0x259
 [<ffffffff8124d63e>] ? debug_check_no_obj_freed+0x6f/0x1b5
 [<ffffffff8124d667>] debug_check_no_obj_freed+0x98/0x1b5
 [<ffffffffa00aa031>] ? bt_host_release+0x10/0x1e [bluetooth]
 [<ffffffff810fc035>] kfree+0x90/0xe6
 [<ffffffffa00aa031>] bt_host_release+0x10/0x1e [bluetooth]
 [<ffffffff812ec2f9>] device_release+0x4a/0x7e
 [<ffffffff8123ef57>] kobject_release+0x11d/0x154
 [<ffffffff8123ed98>] kobject_put+0x4a/0x4f
 [<ffffffff812ec0d9>] put_device+0x12/0x14
 [<ffffffffa009472b>] hci_free_dev+0x22/0x26 [bluetooth]
 [<ffffffffa0280dd0>] btusb_disconnect+0x96/0x9f [btusb]
 [<ffffffff813581b4>] usb_unbind_interface+0x57/0x106
 [<ffffffff812ef988>] __device_release_driver+0x83/0xd6
 [<ffffffff812ef9fb>] device_release_driver+0x20/0x2d
 [<ffffffff813582a7>] usb_driver_release_interface+0x44/0x7b
 [<ffffffff81358795>] usb_forced_unbind_intf+0x45/0x4e
 [<ffffffff8134f959>] usb_reset_device+0xa6/0x12e
 [<ffffffff8135df86>] usbdev_do_ioctl+0x319/0xe20
 [<ffffffff81203244>] ? avc_has_perm_flags+0xc9/0x12e
 [<ffffffff812031a0>] ? avc_has_perm_flags+0x25/0x12e
 [<ffffffff81050101>] ? do_page_fault+0x31e/0x3a1
 [<ffffffff8135eaa6>] usbdev_ioctl+0x9/0xd
 [<ffffffff811126b1>] vfs_ioctl+0x21/0x34
 [<ffffffff81112f7b>] do_vfs_ioctl+0x408/0x44b
 [<ffffffff81208d45>] ? file_has_perm+0x76/0x81
 [<ffffffff8111300f>] sys_ioctl+0x51/0x76
 [<ffffffff8158db22>] system_call_fastpath+0x16/0x1b

[1] http://cpansearch.perl.org/src/DPAVLIN/Biblio-RFID-0.03/examples/usbreset.c

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
78c04c0
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 14, 2012
Luis R. Rodriguez cfg80211: fix possible circular lock on reg_regdb_search()
When call_crda() is called we kick off a witch hunt search
for the same regulatory domain on our internal regulatory
database and that work gets kicked off on a workqueue, this
is done while the cfg80211_mutex is held. If that workqueue
kicks off it will first lock reg_regdb_search_mutex and
later cfg80211_mutex but to ensure two CPUs will not contend
against cfg80211_mutex the right thing to do is to have the
reg_regdb_search() wait until the cfg80211_mutex is let go.

The lockdep report is pasted below.

cfg80211: Calling CRDA to update world regulatory domain

======================================================
[ INFO: possible circular locking dependency detected ]
3.3.8 #3 Tainted: G           O
-------------------------------------------------------
kworker/0:1/235 is trying to acquire lock:
 (cfg80211_mutex){+.+...}, at: [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

but task is already holding lock:
 (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (reg_regdb_search_mutex){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<81645778>] is_world_regdom+0x9f8/0xc74 [cfg80211]

-> #1 (reg_mutex#2){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<8164539c>] is_world_regdom+0x61c/0xc74 [cfg80211]

-> #0 (cfg80211_mutex){+.+...}:
       [<800a77b8>] __lock_acquire+0x10d4/0x17bc
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex#2 --> reg_regdb_search_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(reg_regdb_search_mutex);
                               lock(reg_mutex#2);
                               lock(reg_regdb_search_mutex);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

3 locks held by kworker/0:1/235:
 #0:  (events){.+.+..}, at: [<80089a00>] process_one_work+0x230/0x460
 #1:  (reg_regdb_work){+.+...}, at: [<80089a00>] process_one_work+0x230/0x460
 #2:  (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

stack backtrace:
Call Trace:
[<80290fd4>] dump_stack+0x8/0x34
[<80291bc4>] print_circular_bug+0x2ac/0x2d8
[<800a77b8>] __lock_acquire+0x10d4/0x17bc
[<800a8384>] lock_acquire+0x60/0x88
[<802950a8>] mutex_lock_nested+0x54/0x31c
[<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

Reported-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
a85d0d7
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 19, 2012
xen/boot: Disable BIOS SMP MP table search.
As the initial domain we are able to search/map certain regions
of memory to harvest configuration data. For all low-level we
use ACPI tables - for interrupts we use exclusively ACPI _PRT
(so DSDT) and MADT for INT_SRC_OVR.

The SMP MP table is not used at all. As a matter of fact we do
not even support machines that only have SMP MP but no ACPI tables.

Lets follow how Moorestown does it and just disable searching
for BIOS SMP tables.

This also fixes an issue on HP Proliant BL680c G5 and DL380 G6:

9f->100 for 1:1 PTE
Freeing 9f-100 pfn range: 97 pages freed
1-1 mapping on 9f->100
.. snip..
e820: BIOS-provided physical RAM map:
Xen: [mem 0x0000000000000000-0x000000000009efff] usable
Xen: [mem 0x000000000009f400-0x00000000000fffff] reserved
Xen: [mem 0x0000000000100000-0x00000000cfd1dfff] usable
.. snip..
Scan for SMP in [mem 0x00000000-0x000003ff]
Scan for SMP in [mem 0x0009fc00-0x0009ffff]
Scan for SMP in [mem 0x000f0000-0x000fffff]
found SMP MP-table at [mem 0x000f4fa0-0x000f4faf] mapped at [ffff8800000f4fa0]
(XEN) mm.c:908:d0 Error getting mfn 100 (pfn 5555555555555555) from L1 entry 0000000000100461 for l1e_owner=0, pg_owner=0
(XEN) mm.c:4995:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel NULL pointer dereference at           (null)
IP: [<ffffffff81ac07e2>] xen_set_pte_init+0x66/0x71
. snip..
Pid: 0, comm: swapper Not tainted 3.6.0-rc6upstream-00188-gb6fb969-dirty #2 HP ProLiant BL680c G5
.. snip..
Call Trace:
 [<ffffffff81ad31c6>] __early_ioremap+0x18a/0x248
 [<ffffffff81624731>] ? printk+0x48/0x4a
 [<ffffffff81ad32ac>] early_ioremap+0x13/0x15
 [<ffffffff81acc140>] get_mpc_size+0x2f/0x67
 [<ffffffff81acc284>] smp_scan_config+0x10c/0x136
 [<ffffffff81acc2e4>] default_find_smp_config+0x36/0x5a
 [<ffffffff81ac3085>] setup_arch+0x5b3/0xb5b
 [<ffffffff81624731>] ? printk+0x48/0x4a
 [<ffffffff81abca7f>] start_kernel+0x90/0x390
 [<ffffffff81abc356>] x86_64_start_reservations+0x131/0x136
 [<ffffffff81abfa83>] xen_start_kernel+0x65f/0x661
(XEN) Domain 0 crashed: 'noreboot' set - not rebooting.

which is that ioremap would end up mapping 0xff using _PAGE_IOMAP
(which is what early_ioremap sticks as a flag) - which meant
we would get MFN 0xFF (pte ff461, which is OK), and then it would
also map 0x100 (b/c ioremap tries to get page aligned request, and
it was trying to map 0xf4fa0 + PAGE_SIZE - so it mapped the next page)
as _PAGE_IOMAP. Since 0x100 is actually a RAM page, and the _PAGE_IOMAP
bypasses the P2M lookup we would happily set the PTE to 1000461.
Xen would deny the request since we do not have access to the
Machine Frame Number (MFN) of 0x100. The P2M[0x100] is for example
0x80140.

CC: stable@vger.kernel.org
Fixes-Oracle-Bugzilla: https://bugzilla.oracle.com/bugzilla/show_bug.cgi?id=13665
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
bd49940
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 07, 2012
wfarnsworth ARM: 7524/1: support syscall tracing
As specified by ftrace-design.txt, TIF_SYSCALL_TRACEPOINT was
added, as well as NR_syscalls in asm/unistd.h.  Additionally,
__sys_trace was modified to call trace_sys_enter and
trace_sys_exit when appropriate.

Tests #2 - #4 of "perf test" now complete successfully.

Signed-off-by: Steven Walter <stevenrwalter@gmail.com>
Signed-off-by: Wade Farnsworth <wade_farnsworth@mentor.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
1f66e06
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 19, 2012
staging: usbip: stub_dev: Fixed oops during removal of usbip_host
stub_shutdown_connection() should set kernel thread pointers to NULL after killing them.
so that at the time of usbip_host removal stub_shutdown_connection() doesn't try to kill kernel threads
which are already killed.

[ 1504.312158] BUG: unable to handle kernel NULL pointer dereference at           (null)
[ 1504.315833] IP: [<ffffffff8108186f>] exit_creds+0x1f/0x70
[ 1504.317688] PGD 41f1c067 PUD 41d0f067 PMD 0
[ 1504.319611] Oops: 0000 [#2] SMP
[ 1504.321480] Modules linked in: vhci_hcd(O) usbip_host(O-) usbip_core(O) uas usb_storage joydev parport_pc bnep rfcomm ppdev binfmt_misc
snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi arc4 snd_rawmidi snd_seq_midi_event
snd_seq ath9k mac80211 ath9k_common ath9k_hw snd_timer snd_seq_device snd ath i915 drm_kms_helper drm psmouse cfg80211 coretemp soundcore
lpc_ich i2c_algo_bit snd_page_alloc video intel_ips wmi btusb mac_hid bluetooth ideapad_laptop lp sparse_keymap serio_raw mei microcode parport r8169
[ 1504.329666] CPU 1
[ 1504.329687] Pid: 2434, comm: usbip_eh Tainted: G      D    O 3.6.0-rc31+ #2 LENOVO 20042                           /Base Board Product Name
[ 1504.333502] RIP: 0010:[<ffffffff8108186f>]  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[ 1504.335371] RSP: 0018:ffff880041c7fdd0  EFLAGS: 00010282
[ 1504.337226] RAX: 0000000000000000 RBX: ffff880041c2db40 RCX: ffff880041e4ae50
[ 1504.339101] RDX: 0000000000000044 RSI: 0000000000000286 RDI: 0000000000000000
[ 1504.341027] RBP: ffff880041c7fde0 R08: ffff880041c7e000 R09: 0000000000000000
[ 1504.342934] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[ 1504.344840] R13: 0000000000000000 R14: ffff88004d448000 R15: ffff880041c7fea0
[ 1504.346743] FS:  0000000000000000(0000) GS:ffff880077440000(0000) knlGS:0000000000000000
[ 1504.348671] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1504.350612] CR2: 0000000000000000 CR3: 0000000041d0d000 CR4: 00000000000007e0
[ 1504.352723] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1504.354734] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1504.356737] Process usbip_eh (pid: 2434, threadinfo ffff880041c7e000, task ffff88004d448000)
[ 1504.358711] Stack:
[ 1504.360635]  ffff880041c7fde0 ffff880041c2db40 ffff880041c7fe00 ffffffff81052aaa
[ 1504.362589]  ffff880041c2db40 0000000000000000 ffff880041c7fe30 ffffffff8107a148
[ 1504.364539]  ffff880041e4ae10 ffff880041e4ae00 ffff88004d448000 ffff88004d448000
[ 1504.366470] Call Trace:
[ 1504.368368]  [<ffffffff81052aaa>] __put_task_struct+0x4a/0x140
[ 1504.370307]  [<ffffffff8107a148>] kthread_stop+0x108/0x110
[ 1504.370312]  [<ffffffffa040da4e>] stub_shutdown_connection+0x3e/0x1b0 [usbip_host]
[ 1504.370315]  [<ffffffffa03fd920>] event_handler_loop+0x70/0x140 [usbip_core]
[ 1504.370318]  [<ffffffff8107ad30>] ? add_wait_queue+0x60/0x60
[ 1504.370320]  [<ffffffffa03fd8b0>] ? usbip_stop_eh+0x30/0x30 [usbip_core]
[ 1504.370322]  [<ffffffff8107a453>] kthread+0x93/0xa0
[ 1504.370327]  [<ffffffff81670e84>] kernel_thread_helper+0x4/0x10
[ 1504.370330]  [<ffffffff8107a3c0>] ? flush_kthread_worker+0xb0/0xb0
[ 1504.370332]  [<ffffffff81670e80>] ? gs_change+0x13/0x13
[ 1504.370351] Code: 0b 0f 0b 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 58 04 00 00 48 89 fb 48 8b bf 50 04 00 00
<8b> 00 48 c7 83 50 04 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0
[ 1504.370353] RIP  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[ 1504.370353]  RSP <ffff880041c7fdd0>
[ 1504.370354] CR2: 0000000000000000
[ 1504.401376] ---[ end trace 1971ce612a16727a ]---

Signed-off-by: navin patidar <navinp@cdac.in>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
b7caecb
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 19, 2012
staging: usbip: vhci_hcd: Fixed oops during removal of vhci_hcd
In response to "usbip detach -p [port_number]" user command,vhci_shutdown_connection() gets
executed which kills tcp_tx,tcp_rx kernel threads but doesn't set thread pointers to NULL.
so, at the time of vhci_hcd removal vhci_shutdown_connection() again tries to kill kernel
threads which are already killed.

[  312.220259] BUG: unable to handle kernel NULL pointer dereference at           (null)
[  312.220313] IP: [<ffffffff8108186f>] exit_creds+0x1f/0x70
[  312.220349] PGD 5b7be067 PUD 5b7dc067 PMD 0
[  312.220388] Oops: 0000 [#1] SMP
[  312.220415] Modules linked in: vhci_hcd(O-) usbip_host(O) usbip_core(O) uas usb_storage joydev parport_pc bnep rfcomm ppdev binfmt_misc
snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi arc4 snd_rawmidi snd_seq_midi_event snd_seq
ath9k mac80211 ath9k_common ath9k_hw snd_timer snd_seq_device snd ath i915 drm_kms_helper drm psmouse cfg80211 coretemp soundcore lpc_ich i2c_algo_bit
snd_page_alloc video intel_ips wmi btusb mac_hid bluetooth ideapad_laptop lp sparse_keymap serio_raw mei microcode parport r8169
[  312.220862] CPU 0
[  312.220882] Pid: 2095, comm: usbip_eh Tainted: G           O 3.6.0-rc3+ #2 LENOVO 20042                           /Base Board Product Name
[  312.220938] RIP: 0010:[<ffffffff8108186f>]  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[  312.220979] RSP: 0018:ffff88004d6ebdb0  EFLAGS: 00010282
[  312.221008] RAX: 0000000000000000 RBX: ffff880041c4db40 RCX: ffff88005ad52230
[  312.221041] RDX: 0000000000000034 RSI: 0000000000000286 RDI: 0000000000000000
[  312.221074] RBP: ffff88004d6ebdc0 R08: ffff88004d6ea000 R09: 0000000000000000
[  312.221107] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[  312.221144] R13: 0000000000000000 R14: ffff88005ad521d8 R15: ffff88004d6ebea0
[  312.221177] FS:  0000000000000000(0000) GS:ffff880077400000(0000) knlGS:0000000000000000
[  312.221214] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  312.221241] CR2: 0000000000000000 CR3: 000000005b7b9000 CR4: 00000000000007f0
[  312.221277] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  312.221309] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  312.221342] Process usbip_eh (pid: 2095, threadinfo ffff88004d6ea000, task ffff88004d44db40)
[  312.221384] Stack:
[  312.221396]  ffff88004d6ebdc0 ffff880041c4db40 ffff88004d6ebde0 ffffffff81052aaa
[  312.221442]  ffff880041c4db40 0000000000000000 ffff88004d6ebe10 ffffffff8107a148
[  312.221487]  ffff88005ad521f0 ffff88005ad52228 ffff88004d44db40 ffff88005ad521d8
[  312.221536] Call Trace:
[  312.221556]  [<ffffffff81052aaa>] __put_task_struct+0x4a/0x140
[  312.221587]  [<ffffffff8107a148>] kthread_stop+0x108/0x110
[  312.221616]  [<ffffffffa0412569>] vhci_shutdown_connection+0x49/0x2b4 [vhci_hcd]
[  312.221657]  [<ffffffffa0139920>] event_handler_loop+0x70/0x140 [usbip_core]
[  312.221692]  [<ffffffff8107ad30>] ? add_wait_queue+0x60/0x60
[  312.221721]  [<ffffffffa01398b0>] ? usbip_stop_eh+0x30/0x30 [usbip_core]
[  312.221754]  [<ffffffff8107a453>] kthread+0x93/0xa0
[  312.221784]  [<ffffffff81670e84>] kernel_thread_helper+0x4/0x10
[  312.221814]  [<ffffffff8107a3c0>] ? flush_kthread_worker+0xb0/0xb0
[  312.223589]  [<ffffffff81670e80>] ? gs_change+0x13/0x13
[  312.225360] Code: 0b 0f 0b 66 0f 1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 48 8b 87 58 04 00 00 48 89 fb 48 8b bf 50 04 00 00
<8b> 00 48 c7 83 50 04 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0
[  312.229663] RIP  [<ffffffff8108186f>] exit_creds+0x1f/0x70
[  312.231696]  RSP <ffff88004d6ebdb0>
[  312.233648] CR2: 0000000000000000
[  312.256464] ---[ end trace 1971ce612a167279 ]---

Signed-off-by: navin patidar <navinp@cdac.in>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cbf7d12
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 04, 2012
[SCSI] zfcp: Make trace record tags unique
Duplicate fssrh_2 from a54ca0f
"[SCSI] zfcp: Redesign of the debug tracing for HBA records."
complicates distinction of generic status read response from
local link up.
Duplicate fsscth1 from 2c55b75
"[SCSI] zfcp: Redesign of the debug tracing for SAN records."
complicates distinction of good common transport response from
invalid port handle.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.38+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
0100998
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 04, 2012
[SCSI] zfcp: Do not wakeup while suspended
If the mapping of FCP device bus ID and corresponding subchannel
is modified while the Linux image is suspended, the resume of FCP
devices can fail. During resume, zfcp gets callbacks from cio regarding
the modified subchannels but they can be arbitrarily mixed with the
restore/resume callback. Since the cio callbacks would trigger
adapter recovery, zfcp could wakeup before the resume callback.
Therefore, ignore the cio callbacks regarding subchannels while
being suspended. We can safely do so, since zfcp does not deal itself
with subchannels. For problem determination purposes, we still trace the
ignored callback events.

The following kernel messages could be seen on resume:

kernel: <WWPN>: parent <FCP device bus ID> should not be sleeping

As part of adapter reopen recovery, zfcp performs auto port scanning
which can erroneously try to register new remote ports with
scsi_transport_fc and the device core code complains about the parent
(adapter) still sleeping.

kernel: zfcp.3dff9c: <FCP device bus ID>:\
 Setting up the QDIO connection to the FCP adapter failed
<last kernel message repeated 3 more times>
kernel: zfcp.574d43: <FCP device bus ID>:\
 ERP cannot recover an error on the FCP device

In such cases, the adapter gave up recovery and remained blocked along
with its child objects: remote ports and LUNs/scsi devices. Even the
adapter shutdown as part of giving up recovery failed because the ccw
device state remained disconnected. Later, the corresponding remote
ports ran into dev_loss_tmo. As a result, the LUNs were erroneously
not available again after resume.

Even a manually triggered adapter recovery (e.g. sysfs attribute
failed, or device offline/online via sysfs) could not recover the
adapter due to the remaining disconnected state of the corresponding
ccw device.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.32+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
cb45214
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 04, 2012
JuliaLawall [SCSI] zfcp: remove invalid reference to list iterator variable
If list_for_each_entry, etc complete a traversal of the list, the iterator
variable ends up pointing to an address at an offset from the list head,
and not a meaningful structure.  Thus this value should not be used after
the end of the iterator.  Replace port->adapter->scsi_host by
adapter->scsi_host.

This problem was found using Coccinelle (http://coccinelle.lip6.fr/).

Oversight in upsteam commit of v2.6.37
a1ca483
"[SCSI] zfcp: Move ACL/CFDC code to zfcp_cfdc.c"
which merged the content of zfcp_erp_port_access_changed().

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.37+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
ca579c9
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 04, 2012
[SCSI] zfcp: restore refcount check on port_remove
Upstream commit f3450c7
"[SCSI] zfcp: Replace local reference counting with common kref"
accidentally dropped a reference count check before tearing down
zfcp_ports that are potentially in use by zfcp_units.
Even remote ports in use can be removed causing
unreachable garbage objects zfcp_ports with zfcp_units.
Thus units won't come back even after a manual port_rescan.
The kref of zfcp_port->dev.kobj is already used by the driver core.
We cannot re-use it to track the number of zfcp_units.
Re-introduce our own counter for units per port
and check on port_remove.

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.33+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d99b601
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 04, 2012
[SCSI] zfcp: only access zfcp_scsi_dev for valid scsi_device
__scsi_remove_device (e.g. due to dev_loss_tmo) calls
zfcp_scsi_slave_destroy which in turn sends a close LUN FSF request to
the adapter. After 30 seconds without response,
zfcp_erp_timeout_handler kicks the ERP thread failing the close LUN
ERP action. zfcp_erp_wait in zfcp_erp_lun_shutdown_wait and thus
zfcp_scsi_slave_destroy returns and then scsi_device is no longer
valid. Sometime later the response to the close LUN FSF request may
finally come in. However, commit
b62a8d9
"[SCSI] zfcp: Use SCSI device data zfcp_scsi_dev instead of zfcp_unit"
introduced a number of attempts to unconditionally access struct
zfcp_scsi_dev through struct scsi_device causing a use-after-free.
This leads to an Oops due to kernel page fault in one of:
zfcp_fsf_abort_fcp_command_handler, zfcp_fsf_open_lun_handler,
zfcp_fsf_close_lun_handler, zfcp_fsf_req_trace,
zfcp_fsf_fcp_handler_common.
Move dereferencing of zfcp private data zfcp_scsi_dev allocated in
scsi_device via scsi_transport_reserve_device after the check for
potentially aborted FSF request and thus no longer valid scsi_device.
Only then assign sdev_to_zfcp(sdev) to the local auto variable struct
zfcp_scsi_dev *zfcp_sdev.

Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.37+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d436de8
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 25, 2012
NFC: Use dynamic initialization for rwlocks
If rwlock is dynamically allocated but statically initialized it is
missing proper lockdep annotation.

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 3352, comm: neard Not tainted 3.5.0-999-nfc+ #2
Call Trace:
[<ffffffff810c8526>] __lock_acquire+0x8f6/0x1bf0
[<ffffffff81739045>] ? printk+0x4d/0x4f
[<ffffffff810c9eed>] lock_acquire+0x9d/0x220
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81746724>] _raw_read_lock+0x44/0x60
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81702bfe>] nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff817034a7>] nfc_llcp_get_sdp_ssap+0xa7/0x1b0
[<ffffffff81706353>] llcp_sock_bind+0x173/0x210
[<ffffffff815d9c94>] sys_bind+0xe4/0x100
[<ffffffff8139209e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8174ea69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
fe235b5
popcornmix popcornmix referenced this issue from a commit January 27, 2012
Andre Guedes Bluetooth: Fix potential deadlock
We don't need to use the _sync variant in hci_conn_hold and
hci_conn_put to cancel conn->disc_work delayed work. This way
we avoid potential deadlocks like this one reported by lockdep.

======================================================
[ INFO: possible circular locking dependency detected ]
3.2.0+ #1 Not tainted
-------------------------------------------------------
kworker/u:1/17 is trying to acquire lock:
 (&hdev->lock){+.+.+.}, at: [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]

but task is already holding lock:
 ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((&(&conn->disc_work)->work)){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff81034ed1>] wait_on_work+0x3d/0xaa
       [<ffffffff81035b54>] __cancel_work_timer+0xac/0xef
       [<ffffffff81035ba4>] cancel_delayed_work_sync+0xd/0xf
       [<ffffffffa00554b0>] smp_chan_create+0xde/0xe6 [bluetooth]
       [<ffffffffa0056160>] smp_conn_security+0xa3/0x12d [bluetooth]
       [<ffffffffa0053640>] l2cap_connect_cfm+0x237/0x2e8 [bluetooth]
       [<ffffffffa004239c>] hci_proto_connect_cfm+0x2d/0x6f [bluetooth]
       [<ffffffffa0046ea5>] hci_event_packet+0x29d1/0x2d60 [bluetooth]
       [<ffffffffa003dde3>] hci_rx_work+0xd0/0x2e1 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

-> #1 (slock-AF_BLUETOOTH-BTPROTO_L2CAP){+.+...}:
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e553a>] _raw_spin_lock_bh+0x36/0x6a
       [<ffffffff81244d56>] lock_sock_nested+0x24/0x7f
       [<ffffffffa004d96f>] lock_sock+0xb/0xd [bluetooth]
       [<ffffffffa0052906>] l2cap_chan_connect+0xa9/0x26f [bluetooth]
       [<ffffffffa00545f8>] l2cap_sock_connect+0xb3/0xff [bluetooth]
       [<ffffffff81243b48>] sys_connect+0x69/0x8a
       [<ffffffff812e6579>] system_call_fastpath+0x16/0x1b

-> #0 (&hdev->lock){+.+.+.}:
       [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
       [<ffffffff81057444>] lock_acquire+0x8a/0xa7
       [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
       [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
       [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
       [<ffffffff810357af>] process_one_work+0x178/0x2bf
       [<ffffffff81036178>] worker_thread+0xce/0x152
       [<ffffffff81039a03>] kthread+0x95/0x9d
       [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10

other info that might help us debug this:

Chain exists of:
  &hdev->lock --> slock-AF_BLUETOOTH-BTPROTO_L2CAP --> (&(&conn->disc_work)->work)

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((&(&conn->disc_work)->work));
                               lock(slock-AF_BLUETOOTH-BTPROTO_L2CAP);
                               lock((&(&conn->disc_work)->work));
  lock(&hdev->lock);

 *** DEADLOCK ***

2 locks held by kworker/u:1/17:
 #0:  (hdev->name){.+.+.+}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf
 #1:  ((&(&conn->disc_work)->work)){+.+...}, at: [<ffffffff81035751>] process_one_work+0x11a/0x2bf

stack backtrace:
Pid: 17, comm: kworker/u:1 Not tainted 3.2.0+ #1
Call Trace:
 [<ffffffff812e06c6>] print_circular_bug+0x1f8/0x209
 [<ffffffff81056d06>] __lock_acquire+0xa80/0xd74
 [<ffffffff81021ef2>] ? arch_local_irq_restore+0x6/0xd
 [<ffffffff81022bc7>] ? vprintk+0x3f9/0x41e
 [<ffffffff81057444>] lock_acquire+0x8a/0xa7
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff812e3870>] __mutex_lock_common+0x48/0x38e
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff81190fd6>] ? __dynamic_pr_debug+0x6d/0x6f
 [<ffffffffa0041155>] ? hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff8105320f>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffff812e3c75>] mutex_lock_nested+0x2a/0x31
 [<ffffffffa0041155>] hci_conn_timeout+0x62/0x158 [bluetooth]
 [<ffffffff810357af>] process_one_work+0x178/0x2bf
 [<ffffffff81035751>] ? process_one_work+0x11a/0x2bf
 [<ffffffff81055af3>] ? lock_acquired+0x1d0/0x1df
 [<ffffffffa00410f3>] ? hci_acl_disconn+0x65/0x65 [bluetooth]
 [<ffffffff81036178>] worker_thread+0xce/0x152
 [<ffffffff810407ed>] ? finish_task_switch+0x45/0xc5
 [<ffffffff810360aa>] ? manage_workers.isra.25+0x16a/0x16a
 [<ffffffff81039a03>] kthread+0x95/0x9d
 [<ffffffff812e7754>] kernel_thread_helper+0x4/0x10
 [<ffffffff812e5db4>] ? retint_restore_args+0x13/0x13
 [<ffffffff8103996e>] ? __init_kthread_worker+0x55/0x55
 [<ffffffff812e7750>] ? gs_change+0x13/0x13

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@openbossa.org>
Reviewed-by: Ulisses Furquim <ulisses@profusion.mobi>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2f304d1
popcornmix popcornmix referenced this issue from a commit February 22, 2012
x86: Remove some noise from boot log when starting cpus
Printing the "start_ip" for every secondary cpu is very noisy on a large
system - and doesn't add any value. Drop this message.

Console log before:
Booting Node   0, Processors  #1
smpboot cpu 1: start_ip = 96000
 #2
smpboot cpu 2: start_ip = 96000
 #3
smpboot cpu 3: start_ip = 96000
 #4
smpboot cpu 4: start_ip = 96000
       ...
 #31
smpboot cpu 31: start_ip = 96000
Brought up 32 CPUs

Console log after:
Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
Booting Node   0, Processors  #16 #17 #18 #19 #20 #21 #22 #23 Ok.
Booting Node   1, Processors  #24 #25 #26 #27 #28 #29 #30 #31
Brought up 32 CPUs

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
140f190
popcornmix popcornmix referenced this issue from a commit January 20, 2012
djbw [SCSI] libsas: fix lifetime of SAS_HA_FROZEN
Until all sas_tasks are known to no longer be in-flight this flag gates late
completions from colliding with error handling.  However, it must be cleared
prior to the submission of scsi_send_eh_cmnd() requests, otherwise those
commands will never be completed correctly.

This was spotted by slub debug:
 =============================================================================
 BUG sas_task: Objects remaining on kmem_cache_close()
 -----------------------------------------------------------------------------

 INFO: Slab 0xffffea001f0eba00 objects=34 used=1 fp=0xffff8807c3aecb00 flags=0x8000000000004080
 Pid: 22919, comm: modprobe Not tainted 3.2.0-isci+ #2
 Call Trace:
  [<ffffffff810fcdcd>] slab_err+0xb0/0xd2
  [<ffffffff810e1c50>] ? free_percpu+0x31/0x117
  [<ffffffff81100122>] ? kzalloc+0x14/0x16
  [<ffffffff81100122>] ? kzalloc+0x14/0x16
  [<ffffffff81100486>] kmem_cache_destroy+0x11d/0x270
  [<ffffffffa0112bdc>] sas_class_exit+0x10/0x12 [libsas]
  [<ffffffff81078fba>] sys_delete_module+0x1c4/0x23c
  [<ffffffff814797ba>] ? sysret_check+0x2e/0x69
  [<ffffffff8126479e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [<ffffffff81479782>] system_call_fastpath+0x16/0x1b
 INFO: Object 0xffff8807c3aed280 @offset=21120
 INFO: Allocated in sas_alloc_task+0x22/0x90 [libsas] age=4615311 cpu=2 pid=12966
  __slab_alloc.clone.3+0x1d1/0x234
  kmem_cache_alloc+0x52/0x10d
  sas_alloc_task+0x22/0x90 [libsas]
  sas_queuecommand+0x20e/0x230 [libsas]
  scsi_send_eh_cmnd+0xd1/0x30c
  scsi_eh_try_stu+0x4f/0x6b
  scsi_eh_ready_devs+0xba/0x6ef
  sas_scsi_recover_host+0xa35/0xab1 [libsas]
  scsi_error_handler+0x14b/0x5fa
  kthread+0x9d/0xa5
  kernel_thread_helper+0x4/0x10

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
8402347
popcornmix popcornmix referenced this issue from a commit March 07, 2012
Srivatsa S. Bhat x86, mce: Fix rcu splat in drain_mce_log_buffer()
While booting, the following message is seen:

[   21.665087] ===============================
[   21.669439] [ INFO: suspicious RCU usage. ]
[   21.673798] 3.2.0-0.0.0.28.36b5ec9-default #2 Not tainted
[   21.681353] -------------------------------
[   21.685864] arch/x86/kernel/cpu/mcheck/mce.c:194 suspicious rcu_dereference_index_check() usage!
[   21.695013]
[   21.695014] other info that might help us debug this:
[   21.695016]
[   21.703488]
[   21.703489] rcu_scheduler_active = 1, debug_locks = 1
[   21.710426] 3 locks held by modprobe/2139:
[   21.714754]  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff8133afd3>] __driver_attach+0x53/0xa0
[   21.725020]  #1:
[   21.725323] ioatdma: Intel(R) QuickData Technology Driver 4.00
[   21.733206]  (&__lockdep_no_validate__){......}, at: [<ffffffff8133afe1>] __driver_attach+0x61/0xa0
[   21.743015]  #2:  (i7core_edac_lock){+.+.+.}, at: [<ffffffffa01cfa5f>] i7core_probe+0x1f/0x5c0 [i7core_edac]
[   21.753708]
[   21.753709] stack backtrace:
[   21.758429] Pid: 2139, comm: modprobe Not tainted 3.2.0-0.0.0.28.36b5ec9-default #2
[   21.768253] Call Trace:
[   21.770838]  [<ffffffff810977cd>] lockdep_rcu_suspicious+0xcd/0x100
[   21.777366]  [<ffffffff8101aa41>] drain_mcelog_buffer+0x191/0x1b0
[   21.783715]  [<ffffffff8101aa78>] mce_register_decode_chain+0x18/0x20
[   21.790430]  [<ffffffffa01cf8db>] i7core_register_mci+0x2fb/0x3e4 [i7core_edac]
[   21.798003]  [<ffffffffa01cfb14>] i7core_probe+0xd4/0x5c0 [i7core_edac]
[   21.804809]  [<ffffffff8129566b>] local_pci_probe+0x5b/0xe0
[   21.810631]  [<ffffffff812957c9>] __pci_device_probe+0xd9/0xe0
[   21.816650]  [<ffffffff813362e4>] ? get_device+0x14/0x20
[   21.822178]  [<ffffffff81296916>] pci_device_probe+0x36/0x60
[   21.828061]  [<ffffffff8133ac8a>] really_probe+0x7a/0x2b0
[   21.833676]  [<ffffffff8133af23>] driver_probe_device+0x63/0xc0
[   21.839868]  [<ffffffff8133b01b>] __driver_attach+0x9b/0xa0
[   21.845718]  [<ffffffff8133af80>] ? driver_probe_device+0xc0/0xc0
[   21.852027]  [<ffffffff81339168>] bus_for_each_dev+0x68/0x90
[   21.857876]  [<ffffffff8133aa3c>] driver_attach+0x1c/0x20
[   21.863462]  [<ffffffff8133a64d>] bus_add_driver+0x16d/0x2b0
[   21.869377]  [<ffffffff8133b6dc>] driver_register+0x7c/0x160
[   21.875220]  [<ffffffff81296bda>] __pci_register_driver+0x6a/0xf0
[   21.881494]  [<ffffffffa01fe000>] ? 0xffffffffa01fdfff
[   21.886846]  [<ffffffffa01fe047>] i7core_init+0x47/0x1000 [i7core_edac]
[   21.893737]  [<ffffffff810001ce>] do_one_initcall+0x3e/0x180
[   21.899670]  [<ffffffff810a9b95>] sys_init_module+0xc5/0x220
[   21.905542]  [<ffffffff8149bc39>] system_call_fastpath+0x16/0x1b

Fix this by using ACCESS_ONCE() instead of rcu_dereference_check_mce()
over mcelog.next. Since the access to each entry is controlled by the
->finished field, ACCESS_ONCE() should work just fine. An rcu_dereference
is unnecessary here.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
b11e3d7
popcornmix popcornmix referenced this issue from a commit January 30, 2012
device.h: audit and cleanup users in main include dir
The <linux/device.h> header includes a lot of stuff, and
it in turn gets a lot of use just for the basic "struct device"
which appears so often.

Clean up the users as follows:

1) For those headers only needing "struct device" as a pointer
in fcn args, replace the include with exactly that.

2) For headers not really using anything from device.h, simply
delete the include altogether.

3) For headers relying on getting device.h implicitly before
being included themselves, now explicitly include device.h

4) For files in which doing #1 or #2 uncovers an implicit
dependency on some other header, fix by explicitly adding
the required header(s).

Any C files that were implicitly relying on device.h to be
present have already been dealt with in advance.

Total removals from #1 and #2: 51.  Total additions coming
from #3: 9.  Total other implicit dependencies from #4: 7.

As of 3.3-rc1, there were 110, so a net removal of 42 gives
about a 38% reduction in device.h presence in include/*

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
313162d
popcornmix popcornmix referenced this issue from a commit March 12, 2012
Disintegrate asm/system.h for Blackfin [ver #2]
Disintegrate asm/system.h for Blackfin.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: uclinux-dist-devel@blackfin.uclinux.org
Signed-off-by: Bob Liu <lliubbo@gmail.com>
3bed8d6
popcornmix popcornmix referenced this issue from a commit March 22, 2012
Linus Torvalds Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…
…/git/lliubbo/blackfin

Pull blackfin updates from Bob Liu

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin: (24 commits)
  blackfin: clean up string bfin_dma_5xx after rename.
  blackfin:dma: rename bfin_dma_5xx.c to bfin_dma.c
  bf548: ssm2602: Add ssm2602 platform data into bf548 ezkit board file.
  Blackfin: s/#if CONFIG/#ifdef CONFIG/
  Blackfin: pnav: delete duplicate linux/export.h include
  bf561: add ppi DLEN macro for 10bits to 16bits
  arch: blackfin: udpate defconfig
  Disintegrate asm/system.h for Blackfin [ver #2]
  arch/blackfin: don't generate random mac in bfin_get_ether_addr()
  Blackfin: wire up new process_vm syscalls
  blackfin: cleanup anomaly workarounds
  blackfin: update default defconfig
  blackfin: thread_info: add suspend flag
  bfin: add bfin_ad73311_machine platform device
  blackfin: bf537: stamp: update board file for 193x
  blackfin: kgdb: skip hardware watchpoint test
  bf548: add ppi interrupt mask and blanking clocks
  blackfin: bf561: forgot CSYNC in get_core_lock_noflush
  spi/bfin_spi: drop bits_per_word from client data
  blackfin: cplb-mpu: fix page mask table overflow
  ...
4f5b1af
popcornmix popcornmix referenced this issue from a commit March 23, 2012
Linus Torvalds Merge branch 'amba' of git://git.linaro.org/people/rmk/linux-arm
Pull #2 ARM updates from Russell King:
 "Further ARM AMBA primecell updates which aren't included directly in
  the previous commit.  I wanted to keep these separate as they're
  touching stuff outside arch/arm/."

* 'amba' of git://git.linaro.org/people/rmk/linux-arm:
  ARM: 7362/1: AMBA: Add module_amba_driver() helper macro for amba_driver
  ARM: 7335/1: mach-u300: do away with MMC config files
  ARM: 7280/1: mmc: mmci: Cache MMCICLOCK and MMCIPOWER register
  ARM: 7309/1: realview: fix unconnected interrupts on EB11MP
  ARM: 7230/1: mmc: mmci: Fix PIO read for small SDIO packets
  ARM: 7227/1: mmc: mmci: Prepare for SDIO before setting up DMA job
  ARM: 7223/1: mmc: mmci: Fixup use of runtime PM and use autosuspend
  ARM: 7221/1: mmc: mmci: Change from using legacy suspend
  ARM: 7219/1: mmc: mmci: Change vdd_handler to a generic ios_handler
  ARM: 7218/1: mmc: mmci: Provide option to configure bus signal direction
  ARM: 7217/1: mmc: mmci: Put power register deviations in variant data
  ARM: 7216/1: mmc: mmci: Do not release spinlock in request_end
  ARM: 7215/1: mmc: mmci: Increase max_segs from 16 to 128
0d19eac
popcornmix popcornmix referenced this issue from a commit April 01, 2012
blackfin: fix cmpxchg build fails from system.h fallout
Commit 3bed8d6 ("Disintegrate asm/system.h for Blackfin [ver #2]")
introduced arch/blackfin/include/asm/cmpxchg.h but has it also including
the asm-generic one which causes this:

  CC      arch/blackfin/kernel/asm-offsets.s
  In file included from arch/blackfin/include/asm/cmpxchg.h:125:0,
                 from arch/blackfin/include/asm/atomic.h:10,
                 from include/linux/atomic.h:4,
                 from include/linux/spinlock.h:384,
                 from include/linux/seqlock.h:29,
                 from include/linux/time.h:8,
                 from include/linux/timex.h:56,
                 from include/linux/sched.h:57,
                 from arch/blackfin/kernel/asm-offsets.c:10:
  include/asm-generic/cmpxchg.h:24:15: error: redefinition of '__xchg'
  arch/blackfin/include/asm/cmpxchg.h:82:29: note: previous definition of '__xchg' was here
  make[2]: *** [arch/blackfin/kernel/asm-offsets.s] Error 1

It really only needs two simple defines from asm-generic, so just use
those instead.

Cc: Bob Liu <lliubbo@gmail.com>
Cc: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1512cdc
popcornmix popcornmix referenced this issue from a commit April 06, 2012
djbw kobject: provide more diagnostic info for kobject_add_internal() fail…
…ures

1/ convert open-coded KERN_ERR+dump_stack() to WARN(), so that automated
   tools pick up this warning.

2/ include the 'child' and 'parent' kobject names.  This information was
   useful for tracking down the case where scsi invoked device_del() on a
   parent object and subsequently invoked device_add() on a child.  Now the
   warning looks like:

     kobject_add_internal failed for target8:0:16 (error: -2 parent: end_device-8:0:24)
     Pid: 2942, comm: scsi_scan_8 Not tainted 3.3.0-rc7-isci+ #2
     Call Trace:
      [<ffffffff8125e551>] kobject_add_internal+0x1c1/0x1f3
      [<ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
      [<ffffffff8125e659>] kobject_add_varg+0x41/0x50
      [<ffffffff8125e723>] kobject_add+0x64/0x66
      [<ffffffff8131124b>] device_add+0x12d/0x63a
      [<ffffffff8125e0ef>] ? kobject_put+0x4c/0x50
      [<ffffffff8132f370>] scsi_sysfs_add_sdev+0x4e/0x28a
      [<ffffffff8132dce3>] do_scan_async+0x9c/0x145

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
282029c
popcornmix popcornmix referenced this issue from a commit April 30, 2012
Roland Dreier IB/uverbs: Lock SRQ / CQ / PD objects in a consistent order
Since XRC support was added, the uverbs code has locked SRQ, CQ and PD
objects needed during QP and SRQ creation in different orders
depending on the the code path.  This leads to the (at least
theoretical) possibility of deadlock, and triggers the lockdep splat
below.

Fix this by making sure we always lock the SRQ first, then CQs and
finally the PD.

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.4.0-rc5+ #34 Not tainted
    -------------------------------------------------------
    ibv_srq_pingpon/2484 is trying to acquire lock:
     (SRQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    but task is already holding lock:
     (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #2 (CQ-uobj){+++++.}:
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00b16c3>] ib_uverbs_create_qp+0x180/0x684 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    -> #1 (PD-uobj){++++++}:
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00af8ad>] __uverbs_create_xsrq+0x96/0x386 [ib_uverbs]
           [<ffffffffa00b31b9>] ib_uverbs_detach_mcast+0x1cd/0x1e6 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    -> #0 (SRQ-uobj){+++++.}:
           [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
           [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
           [<ffffffff81384f28>] down_read+0x34/0x43
           [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
           [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
           [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
           [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
           [<ffffffff810fe47f>] vfs_write+0xa7/0xee
           [<ffffffff810fe65f>] sys_write+0x45/0x69
           [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

    other info that might help us debug this:

    Chain exists of:
      SRQ-uobj --> PD-uobj --> CQ-uobj

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(CQ-uobj);
                                   lock(PD-uobj);
                                   lock(CQ-uobj);
      lock(SRQ-uobj);

     *** DEADLOCK ***

    3 locks held by ibv_srq_pingpon/2484:
     #0:  (QP-uobj){+.+...}, at: [<ffffffffa00b162c>] ib_uverbs_create_qp+0xe9/0x684 [ib_uverbs]
     #1:  (PD-uobj){++++++}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
     #2:  (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]

    stack backtrace:
    Pid: 2484, comm: ibv_srq_pingpon Not tainted 3.4.0-rc5+ #34
    Call Trace:
     [<ffffffff8137eff0>] print_circular_bug+0x1f8/0x209
     [<ffffffff81070898>] __lock_acquire+0xa29/0xd06
     [<ffffffffa00af37c>] ? __idr_get_uobj+0x20/0x5e [ib_uverbs]
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffff81070eee>] ? lock_release+0x166/0x189
     [<ffffffff81384f28>] down_read+0x34/0x43
     [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs]
     [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs]
     [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs]
     [<ffffffff81070fec>] ? lock_acquire+0xdb/0xfe
     [<ffffffff81070c09>] ? lock_release_non_nested+0x94/0x213
     [<ffffffff810d470f>] ? might_fault+0x40/0x90
     [<ffffffff810d470f>] ? might_fault+0x40/0x90
     [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs]
     [<ffffffff810fe47f>] vfs_write+0xa7/0xee
     [<ffffffff810ff736>] ? fget_light+0x3b/0x99
     [<ffffffff810fe65f>] sys_write+0x45/0x69
     [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b

Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
5909ce5
popcornmix popcornmix referenced this issue from a commit April 25, 2012
Sean Hefty RDMA/cma: Fix lockdep false positive recursive locking
The following lockdep problem was reported by Or Gerlitz <ogerlitz@mellanox.com>:

    [ INFO: possible recursive locking detected ]
    3.3.0-32035-g1b2649e-dirty #4 Not tainted
    ---------------------------------------------
    kworker/5:1/418 is trying to acquire lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0138a41>] rdma_destroy_i    d+0x33/0x1f0 [rdma_cm]

    but task is already holding lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_ca    llback+0x24/0x45 [rdma_cm]

    other info that might help us debug this:
     Possible unsafe locking scenario:

           CPU0
           ----
      lock(&id_priv->handler_mutex);
      lock(&id_priv->handler_mutex);

     *** DEADLOCK ***

     May be due to missing lock nesting notation

    3 locks held by kworker/5:1/418:
     #0:  (ib_cm){.+.+.+}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a    6
     #1:  ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81042ac1>] process_on    e_work+0x210/0x4a6
     #2:  (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disab    le_callback+0x24/0x45 [rdma_cm]

    stack backtrace:
    Pid: 418, comm: kworker/5:1 Not tainted 3.3.0-32035-g1b2649e-dirty #4
    Call Trace:
     [<ffffffff8102b0fb>] ? console_unlock+0x1f4/0x204
     [<ffffffff81068771>] __lock_acquire+0x16b5/0x174e
     [<ffffffff8106461f>] ? save_trace+0x3f/0xb3
     [<ffffffff810688fa>] lock_acquire+0xf0/0x116
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81364351>] mutex_lock_nested+0x64/0x2ce
     [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff81065abc>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffffa0139c02>] cma_req_handler+0x418/0x644 [rdma_cm]
     [<ffffffffa012ee88>] cm_process_work+0x32/0x119 [ib_cm]
     [<ffffffffa0130299>] cm_req_handler+0x928/0x982 [ib_cm]
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffffa0130326>] cm_work_handler+0x33/0xfe5 [ib_cm]
     [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm]
     [<ffffffff81042b6e>] process_one_work+0x2bd/0x4a6
     [<ffffffff81042ac1>] ? process_one_work+0x210/0x4a6
     [<ffffffff813669f3>] ? _raw_spin_unlock_irq+0x2b/0x40
     [<ffffffff8104316e>] worker_thread+0x1d6/0x350
     [<ffffffff81042f98>] ? rescuer_thread+0x241/0x241
     [<ffffffff81046a32>] kthread+0x84/0x8c
     [<ffffffff8136e854>] kernel_thread_helper+0x4/0x10
     [<ffffffff81366d59>] ? retint_restore_args+0xe/0xe
     [<ffffffff810469ae>] ? __init_kthread_worker+0x56/0x56
     [<ffffffff8136e850>] ? gs_change+0xb/0xb

The actual locking is fine, since we're dealing with different locks,
but from the same lock class.  cma_disable_callback() acquires the
listening id mutex, whereas rdma_destroy_id() acquires the mutex for
the new connection id.  To fix this, delay the call to
rdma_destroy_id() until we've released the listening id mutex.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
b6cec8a
popcornmix popcornmix referenced this issue from a commit May 15, 2012
xfs: protect xfs_sync_worker with s_umount semaphore
xfs_sync_worker checks the MS_ACTIVE flag in s_flags to avoid doing
work during mount and unmount.  This flag can be cleared by unmount
after the xfs_sync_worker checks it but before the work is completed.
The has caused crashes in the completion handler for the dummy
transaction commited by xfs_sync_worker:

PID: 27544  TASK: ffff88013544e040  CPU: 3   COMMAND: "kworker/3:0"
 #0 [ffff88016fdff930] machine_kexec at ffffffff810244e9
 #1 [ffff88016fdff9a0] crash_kexec at ffffffff8108d053
 #2 [ffff88016fdffa70] oops_end at ffffffff813ad1b8
 #3 [ffff88016fdffaa0] no_context at ffffffff8102bd48
 #4 [ffff88016fdffaf0] __bad_area_nosemaphore at ffffffff8102c04d
 #5 [ffff88016fdffb40] bad_area_nosemaphore at ffffffff8102c12e
 #6 [ffff88016fdffb50] do_page_fault at ffffffff813afaee
 #7 [ffff88016fdffc60] page_fault at ffffffff813ac635
    [exception RIP: xlog_get_lowest_lsn+0x30]
    RIP: ffffffffa04a9910  RSP: ffff88016fdffd10  RFLAGS: 00010246
    RAX: ffffc90014e48000  RBX: ffff88014d879980  RCX: ffff88014d879980
    RDX: ffff8802214ee4c0  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88016fdffd10   R8: ffff88014d879a80   R9: 0000000000000000
    R10: 0000000000000001  R11: 0000000000000000  R12: ffff8802214ee400
    R13: ffff88014d879980  R14: 0000000000000000  R15: ffff88022fd96605
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88016fdffd18] xlog_state_do_callback at ffffffffa04aa186 [xfs]
 #9 [ffff88016fdffd98] xlog_state_done_syncing at ffffffffa04aa568 [xfs]

Protect xfs_sync_worker by using the s_umount semaphore at the read
level to provide exclusion with unmount while work is progressing.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
1307bbd
popcornmix popcornmix referenced this issue from a commit May 15, 2012
shirishpargaonkar cifs: Include backup intent search flags during searches {try #2)
As observed and suggested by Tushar Gosavi...

---------
readdir calls these function to send TRANS2_FIND_FIRST and
TRANS2_FIND_NEXT command to the server. The current cifs module is
not specifying CIFS_SEARCH_BACKUP_SEARCH flag while sending these
command when backupuid/backupgid is specified. This can be resolved
by specifying CIFS_SEARCH_BACKUP_SEARCH flag.
---------

Cc: <stable@kernel.org>
Reported-and-Tested-by: Tushar Gosavi <tugosavi@in.ibm.com>
Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2608bee
popcornmix popcornmix referenced this issue from a commit May 16, 2012
iwlwifi: update BT traffic load states correctly
When BT traffic load changes from its
previous state, a new LQ command needs to be
sent down to the firmware. This needs to
be done only once per change. The state
variable that keeps track of this change is
last_bt_traffic_load. However, it was not
being updated when the change had been
handled. Not updating this variable was
causing a flood of advanced BT config
commands to be sent to the firmware. Fix
this.

Cc: stable@vger.kernel.org #2.6.38+
Signed-off-by: Meenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
882dde8
popcornmix popcornmix referenced this issue from a commit May 16, 2012
iwlwifi: do not use shadow registers by default
Shadow registers in the device are meant to
allow the driver to update certain device
registers without needing to wake up all
components of the device. However, using
this feature in the device causes
communication between the driver and the
device to become unreliable, resulting in
host command timeouts.

Disable this feature by default till a fix is
available for the bug.

Cc: stable@vger.kernel.org #2.6.38+
Signed-off-by: Meenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
66a7707
popcornmix popcornmix referenced this issue from a commit May 28, 2012
taoma-tm ext4: protect group inode free counting with group lock
Now when we set the group inode free count, we don't have a proper
group lock so that multiple threads may decrease the inode free
count at the same time. And e2fsck will complain something like:

Free inodes count wrong for group #1 (1, counted=0).
Fix? no

Free inodes count wrong for group #2 (3, counted=0).
Fix? no

Directories count wrong for group #2 (780, counted=779).
Fix? no

Free inodes count wrong for group #3 (2272, counted=2273).
Fix? no

So this patch try to protect it with the ext4_lock_group.

btw, it is found by xfstests test case 269 and the volume is
mkfsed with the parameter
"-O ^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,ext_attr"
and I have run it 100 times and the error in e2fsck doesn't
show up again.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
6f2e9f0
popcornmix popcornmix referenced this issue from a commit May 29, 2012
Linus Torvalds Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull CIFS updates from Steve French.

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6: (29 commits)
  cifs: fix oops while traversing open file list (try #4)
  cifs: Fix comment as d_alloc_root() is replaced by d_make_root()
  CIFS: Introduce SMB2 mounts as vers=2.1
  CIFS: Introduce SMB2 Kconfig option
  CIFS: Move add/set_credits and get_credits_field to ops structure
  CIFS: Move protocol specific demultiplex thread calls to ops struct
  CIFS: Move protocol specific part from cifs_readv_receive to ops struct
  CIFS: Move header_size/max_header_size to ops structure
  CIFS: Move protocol specific part from SendReceive2 to ops struct
  cifs: Include backup intent search flags during searches {try #2)
  CIFS: Separate protocol specific part from setlk
  CIFS: Separate protocol specific part from getlk
  CIFS: Separate protocol specific lock type handling
  CIFS: Convert lock type to 32 bit variable
  CIFS: Move locks to cifsFileInfo structure
  cifs: convert send_nt_cancel into a version specific op
  cifs: add a smb_version_operations/values structures and a smb_version enum
  cifs: remove the vers= and version= synonyms for ver=
  cifs: add warning about change in default cache semantics in 3.7
  cifs: display cache= option in /proc/mounts
  ...
442a9ff
popcornmix popcornmix referenced this issue from a commit May 29, 2012
mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race …
…condition

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
26c1917
popcornmix popcornmix referenced this issue from a commit May 01, 2012
FRV: Shrink TIF_WORK_MASK [ver #2]
Shrink TIF_WORK_MASK so that it will fit in the 12-bit signed immediate
operand field of an ANDI instruction.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1e5ef91
popcornmix popcornmix referenced this issue from a commit May 01, 2012
FRV: Optimise the system call exit path in entry.S [ver #2]
Optimise the system call exit path in entry.S by packing some instructions.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
a2eddc7
popcornmix popcornmix referenced this issue from a commit June 01, 2012
Linus Torvalds Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…
…/git/viro/signal

Pull third pile of signal handling patches from Al Viro:
 "This time it's mostly helpers and conversions to them; there's a lot
  of stuff remaining in the tree, but that'll either go in -rc2
  (isolated bug fixes, ideally via arch maintainers' trees) or will sit
  there until the next cycle."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  x86: get rid of calling do_notify_resume() when returning to kernel mode
  blackfin: check __get_user() return value
  whack-a-mole with TIF_FREEZE
  FRV: Optimise the system call exit path in entry.S [ver #2]
  FRV: Shrink TIF_WORK_MASK [ver #2]
  FRV: Prevent syscall exit tracing and notify_resume at end of kernel exceptions
  new helper: signal_delivered()
  powerpc: get rid of restore_sigmask()
  most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set
  set_restore_sigmask() is never called without SIGPENDING (and never should be)
  TIF_RESTORE_SIGMASK can be set only when TIF_SIGPENDING is set
  don't call try_to_freeze() from do_signal()
  pull clearing RESTORE_SIGMASK into block_sigmask()
  sh64: failure to build sigframe != signal without handler
  openrisc: tracehook_signal_handler() is supposed to be called on success
  new helper: sigmask_to_save()
  new helper: restore_saved_sigmask()
  new helpers: {clear,test,test_and_clear}_restore_sigmask()
  HAVE_RESTORE_SIGMASK is defined on all architectures now
86c47b7
popcornmix popcornmix referenced this issue from a commit June 01, 2012
Stanislaw Gruszka rt2x00: use atomic variable for seqno
Remove spinlock as atomic_t can be used instead. Note we use only 16
lower bits, upper bits are changed but we impilcilty cast to u16.

This fix possible deadlock on IBSS mode reproted by lockdep:

=================================
[ INFO: inconsistent lock state ]
3.4.0-wl+ #4 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/u:2/30374 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&intf->seqlock)->rlock){+.?...}, at: [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
{IN-SOFTIRQ-W} state was registered at:
  [<c04978ab>] __lock_acquire+0x47b/0x1050
  [<c0498504>] lock_acquire+0x84/0xf0
  [<c0835733>] _raw_spin_lock+0x33/0x40
  [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
  [<f9979f2a>] rt2x00queue_write_tx_frame+0x1a/0x300 [rt2x00lib]
  [<f997834f>] rt2x00mac_tx+0x7f/0x380 [rt2x00lib]
  [<f98fe363>] __ieee80211_tx+0x1b3/0x300 [mac80211]
  [<f98ffdf5>] ieee80211_tx+0x105/0x130 [mac80211]
  [<f99000dd>] ieee80211_xmit+0xad/0x100 [mac80211]
  [<f9900519>] ieee80211_subif_start_xmit+0x2d9/0x930 [mac80211]
  [<c0782e87>] dev_hard_start_xmit+0x307/0x660
  [<c079bb71>] sch_direct_xmit+0xa1/0x1e0
  [<c0784bb3>] dev_queue_xmit+0x183/0x730
  [<c078c27a>] neigh_resolve_output+0xfa/0x1e0
  [<c07b436a>] ip_finish_output+0x24a/0x460
  [<c07b4897>] ip_output+0xb7/0x100
  [<c07b2d60>] ip_local_out+0x20/0x60
  [<c07e01ff>] igmpv3_sendpack+0x4f/0x60
  [<c07e108f>] igmp_ifc_timer_expire+0x29f/0x330
  [<c04520fc>] run_timer_softirq+0x15c/0x2f0
  [<c0449e3e>] __do_softirq+0xae/0x1e0
irq event stamp: 18380437
hardirqs last  enabled at (18380437): [<c0526027>] __slab_alloc.clone.3+0x67/0x5f0
hardirqs last disabled at (18380436): [<c0525ff3>] __slab_alloc.clone.3+0x33/0x5f0
softirqs last  enabled at (18377616): [<c0449eb3>] __do_softirq+0x123/0x1e0
softirqs last disabled at (18377611): [<c041278d>] do_softirq+0x9d/0xe0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&intf->seqlock)->rlock);
  <Interrupt>
    lock(&(&intf->seqlock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:2/30374:
 #0:  (wiphy_name(local->hw.wiphy)){++++.+}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #1:  ((&sdata->work)){+.+.+.}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #2:  (&ifibss->mtx){+.+.+.}, at: [<f98f005b>] ieee80211_ibss_work+0x1b/0x470 [mac80211]
 #3:  (&intf->beacon_skb_mutex){+.+...}, at: [<f997a644>] rt2x00queue_update_beacon+0x24/0x50 [rt2x00lib]

stack backtrace:
Pid: 30374, comm: kworker/u:2 Not tainted 3.4.0-wl+ #4
Call Trace:
 [<c04962a6>] print_usage_bug+0x1f6/0x220
 [<c0496a12>] mark_lock+0x2c2/0x300
 [<c0495ff0>] ? check_usage_forwards+0xc0/0xc0
 [<c04978ec>] __lock_acquire+0x4bc/0x1050
 [<c0527890>] ? __kmalloc_track_caller+0x1c0/0x1d0
 [<c0777fb6>] ? copy_skb_header+0x26/0x90
 [<c0498504>] lock_acquire+0x84/0xf0
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<c0835733>] _raw_spin_lock+0x33/0x40
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f997a5cf>] rt2x00queue_update_beacon_locked+0x5f/0xb0 [rt2x00lib]
 [<f997a64d>] rt2x00queue_update_beacon+0x2d/0x50 [rt2x00lib]
 [<f9977e3a>] rt2x00mac_bss_info_changed+0x1ca/0x200 [rt2x00lib]
 [<f9977c70>] ? rt2x00mac_remove_interface+0x70/0x70 [rt2x00lib]
 [<f98e4dd0>] ieee80211_bss_info_change_notify+0xe0/0x1d0 [mac80211]
 [<f98ef7b8>] __ieee80211_sta_join_ibss+0x3b8/0x610 [mac80211]
 [<c0496ab4>] ? mark_held_locks+0x64/0xc0
 [<c0440012>] ? virt_efi_query_capsule_caps+0x12/0x50
 [<f98efb09>] ieee80211_sta_join_ibss+0xf9/0x140 [mac80211]
 [<f98f0456>] ieee80211_ibss_work+0x416/0x470 [mac80211]
 [<c0496d8b>] ? trace_hardirqs_on+0xb/0x10
 [<c077683b>] ? skb_dequeue+0x4b/0x70
 [<f98f207f>] ieee80211_iface_work+0x13f/0x230 [mac80211]
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<c045d015>] process_one_work+0x185/0x3f0
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<f98f1f40>] ? ieee80211_teardown_sdata+0xa0/0xa0 [mac80211]
 [<c045ed86>] worker_thread+0x116/0x270
 [<c045ec70>] ? manage_workers+0x1e0/0x1e0
 [<c0462f64>] kthread+0x84/0x90
 [<c0462ee0>] ? __init_kthread_worker+0x60/0x60
 [<c083d382>] kernel_thread_helper+0x6/0x10

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
e5851da
popcornmix popcornmix referenced this issue from a commit June 08, 2012
Takashi Iwai ALSA: usb-audio: Fix substream assignments
In 3.5 kernel, the endpoint is assigned dynamically for the
substreams, but the PCM assignment still checks the presence of the
endpoint pointer.  This ended up in duplicated PCM substream creations
at probing time, resulting in kernel warnings like:

WARNING: at fs/proc/generic.c:586 proc_register+0x169/0x1a6()
Pid: 1152, comm: modprobe Not tainted 3.5.0-rc1-00110-g71fae7e #2
Call Trace:
 [<ffffffff8102a400>] warn_slowpath_common+0x83/0x9c
 [<ffffffff8102a4bc>] warn_slowpath_fmt+0x46/0x48
 [<ffffffff813829ad>] ? add_preempt_count+0x39/0x3b
 [<ffffffff811292f0>] proc_register+0x169/0x1a6
 [<ffffffff8112962e>] create_proc_entry+0x74/0x8c
 [<ffffffffa018eb63>] snd_info_register+0x3e/0xc3 [snd]
 [<ffffffffa01fde2e>] snd_pcm_new_stream+0xb1/0x404 [snd_pcm]
 [<ffffffffa024861f>] snd_usb_add_audio_stream+0xd2/0x230 [snd_usb_audio]
 [<ffffffffa0241d33>] ? snd_usb_parse_audio_format+0x252/0x34f [snd_usb_audio]
 [<ffffffff810d6b17>] ? kmem_cache_alloc_trace+0xab/0xbb
 [<ffffffffa0248c29>] snd_usb_parse_audio_interface+0x4ac/0x567 [snd_usb_audio]
 [<ffffffffa023f0ff>] snd_usb_create_stream+0xe9/0x125 [snd_usb_audio]
 [<ffffffffa023f9b1>] usb_audio_probe+0x62a/0x72c [snd_usb_audio]
 .....

This patch fixes the regression by checking the fixed endpoint number
for each substream instead of the endpoint pointer.

Reported-and-tested-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
8260ef0
popcornmix popcornmix referenced this issue from a commit May 08, 2012
edac: avoid mce decoding crash after edac driver unloaded
Some edac drivers register themselves as mce decoders via
notifier_chain. But in current notifier_chain implementation logic,
it doesn't accept same notifier registered twice. If so, it will be
wrong when adding/removing the element from the list. For example,
on one SandyBridge platform, remove module sb_edac and then trigger
one error, it will hit oops because it has no mce decoder registered
but related notifier_chain still points to an invalid callback
function. Here is an example:

Call Trace:
 [<ffffffff8150ef6a>] atomic_notifier_call_chain+0x1a/0x20
 [<ffffffff8102b936>] mce_log+0x46/0x180
 [<ffffffff8102eaea>] apei_mce_report_mem_error+0x4a/0x60
 [<ffffffff812e19d2>] ghes_do_proc+0x192/0x210
 [<ffffffff812e2066>] ghes_proc+0x46/0x70
 [<ffffffff812e20d8>] ghes_notify_sci+0x48/0x80
 [<ffffffff8150ef05>] notifier_call_chain+0x55/0x80
 [<ffffffff81076f1a>] __blocking_notifier_call_chain+0x5a/0x80
 [<ffffffff812aea11>] ? acpi_os_wait_events_complete+0x23/0x23
 [<ffffffff81076f56>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff812ddc4d>] acpi_hed_notify+0x19/0x1b
 [<ffffffff812b16bd>] acpi_device_notify+0x19/0x1b
 [<ffffffff812beb38>] acpi_ev_notify_dispatch+0x67/0x7f
 [<ffffffff812aea3a>] acpi_os_execute_deferred+0x29/0x36
 [<ffffffff81069dc2>] process_one_work+0x132/0x450
 [<ffffffff8106bbcb>] worker_thread+0x17b/0x3c0
 [<ffffffff8106ba50>] ? manage_workers+0x120/0x120
 [<ffffffff81070aee>] kthread+0x9e/0xb0
 [<ffffffff81514724>] kernel_thread_helper+0x4/0x10
 [<ffffffff81070a50>] ? kthread_freezable_should_stop+0x70/0x70
 [<ffffffff81514720>] ? gs_change+0x13/0x13
Code: f3 49 89 d4 45 85 ed 4d 89 c6 48 8b 0f 74 48 48 85 c9 75 17 eb 41
0f 1f 80 00 00 00 00 41 83 ed 01 4c 89 f9 74 22 4d 85 ff 74 1d <4c> 8b
79 08 4c 89 e2 48 89 de 48 89 cf ff 11 4d 85 f6 74 04 41
RIP  [<ffffffff8150eef6>] notifier_call_chain+0x46/0x80
 RSP <ffff88042868fb20>
CR2: ffffffffa01af838
---[ end trace 0100930068e73e6f ]---
BUG: unable to handle kernel paging request at fffffffffffffff8
IP: [<ffffffff810705b0>] kthread_data+0x10/0x20
PGD 1a0d067 PUD 1a0e067 PMD 0
Oops: 0000 [#2] SMP

Only i7core_edac and sb_edac have such issues because they have more
than one memory controller which means they have to register mce
decoder many times.

Cc: <stable@vger.kernel.org> # 3.2 and upper
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
e35fca4
popcornmix popcornmix referenced this issue from a commit June 12, 2012
elp cfg80211: fix potential deadlock in regulatory
reg_timeout_work() calls restore_regulatory_settings() which
takes cfg80211_mutex.

reg_set_request_processed() already holds cfg80211_mutex
before calling cancel_delayed_work_sync(reg_timeout),
so it might deadlock.

Call the async cancel_delayed_work instead, in order
to avoid the potential deadlock.

This is the relevant lockdep warning:

cfg80211: Calling CRDA for country: XX

======================================================
[ INFO: possible circular locking dependency detected ]
3.4.0-rc5-wl+ #26 Not tainted
-------------------------------------------------------
kworker/0:2/1391 is trying to acquire lock:
 (cfg80211_mutex){+.+.+.}, at: [<bf28ae00>] restore_regulatory_settings+0x34/0x418 [cfg80211]

but task is already holding lock:
 ((reg_timeout).work){+.+...}, at: [<c0059e94>] process_one_work+0x1f0/0x480

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 ((reg_timeout).work){+.+...}:
       [<c008fd44>] validate_chain+0xb94/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c005b600>] wait_on_work+0x4c/0x154
       [<c005c000>] __cancel_work_timer+0xd4/0x11c
       [<c005c064>] cancel_delayed_work_sync+0x1c/0x20
       [<bf28b274>] reg_set_request_processed+0x50/0x78 [cfg80211]
       [<bf28bd84>] set_regdom+0x550/0x600 [cfg80211]
       [<bf294cd8>] nl80211_set_reg+0x218/0x258 [cfg80211]
       [<c03c7738>] genl_rcv_msg+0x1a8/0x1e8
       [<c03c6a00>] netlink_rcv_skb+0x5c/0xc0
       [<c03c7584>] genl_rcv+0x28/0x34
       [<c03c6720>] netlink_unicast+0x15c/0x228
       [<c03c6c7c>] netlink_sendmsg+0x218/0x298
       [<c03933c8>] sock_sendmsg+0xa4/0xc0
       [<c039406c>] __sys_sendmsg+0x1e4/0x268
       [<c0394228>] sys_sendmsg+0x4c/0x70
       [<c0013840>] ret_fast_syscall+0x0/0x3c

-> #1 (reg_mutex){+.+.+.}:
       [<c008fd44>] validate_chain+0xb94/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c04734dc>] mutex_lock_nested+0x48/0x320
       [<bf28b2cc>] reg_todo+0x30/0x538 [cfg80211]
       [<c0059f44>] process_one_work+0x2a0/0x480
       [<c005a4b4>] worker_thread+0x1bc/0x2bc
       [<c0061148>] kthread+0x98/0xa4
       [<c0014af4>] kernel_thread_exit+0x0/0x8

-> #0 (cfg80211_mutex){+.+.+.}:
       [<c008ed58>] print_circular_bug+0x68/0x2cc
       [<c008fb28>] validate_chain+0x978/0x10f0
       [<c0090b68>] __lock_acquire+0x8c8/0x9b0
       [<c0090d40>] lock_acquire+0xf0/0x114
       [<c04734dc>] mutex_lock_nested+0x48/0x320
       [<bf28ae00>] restore_regulatory_settings+0x34/0x418 [cfg80211]
       [<bf28b200>] reg_timeout_work+0x1c/0x20 [cfg80211]
       [<c0059f44>] process_one_work+0x2a0/0x480
       [<c005a4b4>] worker_thread+0x1bc/0x2bc
       [<c0061148>] kthread+0x98/0xa4
       [<c0014af4>] kernel_thread_exit+0x0/0x8

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex --> (reg_timeout).work

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((reg_timeout).work);
                               lock(reg_mutex);
                               lock((reg_timeout).work);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

2 locks held by kworker/0:2/1391:
 #0:  (events){.+.+.+}, at: [<c0059e94>] process_one_work+0x1f0/0x480
 #1:  ((reg_timeout).work){+.+...}, at: [<c0059e94>] process_one_work+0x1f0/0x480

stack backtrace:
[<c001b928>] (unwind_backtrace+0x0/0x12c) from [<c0471d3c>] (dump_stack+0x20/0x24)
[<c0471d3c>] (dump_stack+0x20/0x24) from [<c008ef70>] (print_circular_bug+0x280/0x2cc)
[<c008ef70>] (print_circular_bug+0x280/0x2cc) from [<c008fb28>] (validate_chain+0x978/0x10f0)
[<c008fb28>] (validate_chain+0x978/0x10f0) from [<c0090b68>] (__lock_acquire+0x8c8/0x9b0)
[<c0090b68>] (__lock_acquire+0x8c8/0x9b0) from [<c0090d40>] (lock_acquire+0xf0/0x114)
[<c0090d40>] (lock_acquire+0xf0/0x114) from [<c04734dc>] (mutex_lock_nested+0x48/0x320)
[<c04734dc>] (mutex_lock_nested+0x48/0x320) from [<bf28ae00>] (restore_regulatory_settings+0x34/0x418 [cfg80211])
[<bf28ae00>] (restore_regulatory_settings+0x34/0x418 [cfg80211]) from [<bf28b200>] (reg_timeout_work+0x1c/0x20 [cfg80211])
[<bf28b200>] (reg_timeout_work+0x1c/0x20 [cfg80211]) from [<c0059f44>] (process_one_work+0x2a0/0x480)
[<c0059f44>] (process_one_work+0x2a0/0x480) from [<c005a4b4>] (worker_thread+0x1bc/0x2bc)
[<c005a4b4>] (worker_thread+0x1bc/0x2bc) from [<c0061148>] (kthread+0x98/0xa4)
[<c0061148>] (kthread+0x98/0xa4) from [<c0014af4>] (kernel_thread_exit+0x0/0x8)
cfg80211: Calling CRDA to update world regulatory domain
cfg80211: World regulatory domain updated:
cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)

Cc: stable@kernel.org
Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
fe20b39
popcornmix popcornmix referenced this issue from a commit June 06, 2012
x86/smp: Fix topology checks on AMD MCM CPUs
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical" nodes in the same physical socket which is reflected in the
CPU-to-node map on x86 with NUMA.

Which makes this check wrong on the above topologies so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  #6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  #7 #8 #9 #10 #11 Ok.
[    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
[    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
161270f
popcornmix popcornmix referenced this issue from a commit June 13, 2012
watchdog: Quiet down the boot messages
A bunch of bugzillas have complained how noisy the nmi_watchdog
is during boot-up especially with its expected failure cases
(like virt and bios resource contention).

This is my attempt to quiet them down and keep it less confusing
for the end user.  What I did is print the message for cpu0 and
save it for future comparisons.  If future cpus have an
identical message as cpu0, then don't print the redundant info.
However, if a future cpu has a different message, happily print
that loudly.

Before the change, you would see something like:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog enabled, takes one hw-pmu counter.
    Booting Node   0, Processors  #1
    NMI watchdog enabled, takes one hw-pmu counter.
     #2
    NMI watchdog enabled, takes one hw-pmu counter.
     #3 Ok.
    NMI watchdog enabled, takes one hw-pmu counter.
    Brought up 4 CPUs
    Total of 4 processors activated (22607.24 BogoMIPS).

After the change, it is simplified to:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
    Booting Node   0, Processors  #1 #2 #3 Ok.
    Brought up 4 CPUs

V2: little changes based on Joe Perches' feedback
V3: printk cleanup based on Ingo's feedback; checkpatch fix
V4: keep printk as one long line
V5: Ingo fix ups

Reported-and-tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: nzimmer@sgi.com
Cc: joe@perches.com
Link: http://lkml.kernel.org/r/1339594548-17227-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
a702704
popcornmix popcornmix referenced this issue from a commit June 21, 2012
Bjørn Mork net: qmi_wwan: fix Gobi device probing
Ignoring interfaces with additional descriptors is not a reliable
method for locating the correct interface on Gobi devices.  There
is at least one device where this method fails:
https://bbs.archlinux.org/viewtopic.php?id=143506

The result is that the AT command port (interface #2) is hidden
from qcserial, preventing traditional serial modem usage:

[   15.562552] qmi_wwan 4-1.6:1.0: cdc-wdm0: USB WDM device
[   15.562691] qmi_wwan 4-1.6:1.0: wwan0: register 'qmi_wwan' at usb-0000:00:1d.0-1.6, Qualcomm Gobi wwan/QMI device, 1e:df:3c:3a:4e:3b
[   15.563383] qmi_wwan: probe of 4-1.6:1.1 failed with error -22
[   15.564189] qmi_wwan 4-1.6:1.2: cdc-wdm1: USB WDM device
[   15.564302] qmi_wwan 4-1.6:1.2: wwan1: register 'qmi_wwan' at usb-0000:00:1d.0-1.6, Qualcomm Gobi wwan/QMI device, 1e:df:3c:3a:4e:3b
[   15.564328] qmi_wwan: probe of 4-1.6:1.3 failed with error -22
[   15.569376] qcserial 4-1.6:1.1: Qualcomm USB modem converter detected
[   15.569440] usb 4-1.6: Qualcomm USB modem converter now attached to ttyUSB0
[   15.570372] qcserial 4-1.6:1.3: Qualcomm USB modem converter detected
[   15.570430] usb 4-1.6: Qualcomm USB modem converter now attached to ttyUSB1

Use static interface numbers taken from the interface map in
qcserial for all Gobi devices instead:

	Gobi 1K USB layout:
	0: serial port (doesn't respond)
	1: serial port (doesn't respond)
	2: AT-capable modem port
	3: QMI/net

	Gobi 2K+ USB layout:
	0: QMI/net
	1: DM/DIAG (use libqcdm from ModemManager for communication)
	2: AT-capable modem port
	3: NMEA

This should be more reliable over all, and will also prevent the
noisy "probe failed" messages.  The whitelisting logic is expected
to be replaced by direct interface number matching in 3.6.

Reported-by: Heinrich Siebmanns (Harvey) <H.Siebmanns@t-online.de>
Cc: <stable@vger.kernel.org> # v3.4: 0000188 USB: qmi_wwan: Make forced int 4 whitelist generic
Cc: <stable@vger.kernel.org> # v3.4: f7142e6 USB: qmi_wwan: Add ZTE (Vodafone) K3520-Z
Cc: <stable@vger.kernel.org> # v3.4
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
b9f90eb
popcornmix popcornmix referenced this issue from a commit June 25, 2012
net: l2tp_eth: use LLTX to avoid LOCKDEP splats
Denys Fedoryshchenko reported a LOCKDEP issue with l2tp code.

[ 8683.927442] ======================================================
[ 8683.927555] [ INFO: possible circular locking dependency detected ]
[ 8683.927672] 3.4.1-build-0061 #14 Not tainted
[ 8683.927782] -------------------------------------------------------
[ 8683.927895] swapper/0/0 is trying to acquire lock:
[ 8683.928007]  (slock-AF_INET){+.-...}, at: [<e0fc73ec>]
l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]
[ 8683.928121] but task is already holding lock:
[ 8683.928121]  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>]
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] which lock already depends on the new lock.
[ 8683.928121]
[ 8683.928121]
[ 8683.928121] the existing dependency chain (in reverse order) is:
[ 8683.928121]
[ 8683.928121] -> #1 (_xmit_ETHER#2){+.-...}:
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<c0304e0c>] ip_send_reply+0xf2/0x1ce
[ 8683.928121]        [<c0317dbc>] tcp_v4_send_reset+0x153/0x16f
[ 8683.928121]        [<c0317f4a>] tcp_v4_do_rcv+0x172/0x194
[ 8683.928121]        [<c031929b>] tcp_v4_rcv+0x387/0x5a0
[ 8683.928121]        [<c03001d0>] ip_local_deliver_finish+0x13a/0x1e9
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c030075b>] ip_local_deliver+0x41/0x45
[ 8683.928121]        [<c03005dd>] ip_rcv_finish+0x31a/0x33c
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c0300960>] ip_rcv+0x201/0x23d
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02deae8>] netif_receive_skb+0x4e/0x7d
[ 8683.928121]        [<e08d5ef3>] rtl8139_poll+0x243/0x33d [8139too]
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] -> #0 (slock-AF_INET){+.-...}:
[ 8683.928121]        [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e
[l2tp_core]
[ 8683.928121]        [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f
[l2tp_eth]
[ 8683.928121]        [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]        [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]        [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]        [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]        [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] other info that might help us debug this:
[ 8683.928121]
[ 8683.928121]  Possible unsafe locking scenario:
[ 8683.928121]
[ 8683.928121]        CPU0                    CPU1
[ 8683.928121]        ----                    ----
[ 8683.928121]   lock(_xmit_ETHER#2);
[ 8683.928121]                                lock(slock-AF_INET);
[ 8683.928121]                                lock(_xmit_ETHER#2);
[ 8683.928121]   lock(slock-AF_INET);
[ 8683.928121]
[ 8683.928121]  *** DEADLOCK ***
[ 8683.928121]
[ 8683.928121] 3 locks held by swapper/0/0:
[ 8683.928121]  #0:  (rcu_read_lock){.+.+..}, at: [<c02dbc10>]
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #1:  (rcu_read_lock_bh){.+....}, at: [<c02dbc10>]
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #2:  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>]
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] stack backtrace:
[ 8683.928121] Pid: 0, comm: swapper/0 Not tainted 3.4.1-build-0061 #14
[ 8683.928121] Call Trace:
[ 8683.928121]  [<c034bdd2>] ? printk+0x18/0x1a
[ 8683.928121]  [<c0158904>] print_circular_bug+0x1ac/0x1b6
[ 8683.928121]  [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]  [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f [l2tp_eth]
[ 8683.928121]  [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]  [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]  [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]  [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]  [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]  [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]  [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]  [<c012b23a>] ? local_bh_enable+0xd/0xd
[ 8683.928121]  <IRQ>  [<c012b4d0>] ? irq_exit+0x41/0x91
[ 8683.928121]  [<c0103c6f>] ? do_IRQ+0x79/0x8d
[ 8683.928121]  [<c0157ea1>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 8683.928121]  [<c034ef6e>] ? common_interrupt+0x2e/0x34
[ 8683.928121]  [<c0108a33>] ? default_idle+0x23/0x38
[ 8683.928121]  [<c01091a8>] ? cpu_idle+0x55/0x6f
[ 8683.928121]  [<c033df25>] ? rest_init+0xa1/0xa7
[ 8683.928121]  [<c033de84>] ? __read_lock_failed+0x14/0x14
[ 8683.928121]  [<c0498745>] ? start_kernel+0x303/0x30a
[ 8683.928121]  [<c0498209>] ? repair_env_string+0x51/0x51
[ 8683.928121]  [<c04980a8>] ? i386_start_kernel+0xa8/0xaf

It appears that like most virtual devices, l2tp should be converted to
LLTX mode.

This patch takes care of statistics using atomic_long in both RX and TX
paths, and fix a bug in l2tp_eth_dev_recv(), which was caching skb->data
before a pskb_may_pull() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Cc: James Chapman <jchapman@katalix.com>
Cc: Hong zhi guo <honkiko@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a2842a1
popcornmix popcornmix referenced this issue from a commit July 02, 2012
Stephen Boyd regulator: Fix recursive mutex lockdep warning
A recursive lockdep warning occurs if you call
regulator_set_optimum_mode() on a regulator with a supply because
there is no nesting annotation for the rdev->mutex. To avoid this
warning, get the supply's load before locking the regulator's
mutex to avoid grabbing the same class of lock twice.

=============================================
[ INFO: possible recursive locking detected ]
3.4.0 #3257 Tainted: G        W
---------------------------------------------
swapper/0/1 is trying to acquire lock:
 (&rdev->mutex){+.+.+.}, at: [<c036e9e0>] regulator_get_voltage+0x18/0x38

but task is already holding lock:
 (&rdev->mutex){+.+.+.}, at: [<c036ef38>] regulator_set_optimum_mode+0x24/0x224

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&rdev->mutex);
  lock(&rdev->mutex);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by swapper/0/1:
 #0:  (&__lockdep_no_validate__){......}, at: [<c03dbb48>] __driver_attach+0x40/0x8c
 #1:  (&__lockdep_no_validate__){......}, at: [<c03dbb58>] __driver_attach+0x50/0x8c
 #2:  (&rdev->mutex){+.+.+.}, at: [<c036ef38>] regulator_set_optimum_mode+0x24/0x224

stack backtrace:
[<c001521c>] (unwind_backtrace+0x0/0x12c) from [<c00cc4d4>] (validate_chain+0x760/0x1080)
[<c00cc4d4>] (validate_chain+0x760/0x1080) from [<c00cd744>] (__lock_acquire+0x950/0xa10)
[<c00cd744>] (__lock_acquire+0x950/0xa10) from [<c00cd990>] (lock_acquire+0x18c/0x1e8)
[<c00cd990>] (lock_acquire+0x18c/0x1e8) from [<c080c248>] (mutex_lock_nested+0x68/0x3c4)
[<c080c248>] (mutex_lock_nested+0x68/0x3c4) from [<c036e9e0>] (regulator_get_voltage+0x18/0x38)
[<c036e9e0>] (regulator_get_voltage+0x18/0x38) from [<c036efb8>] (regulator_set_optimum_mode+0xa4/0x224)
...

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
d92d95b
popcornmix popcornmix referenced this issue from a commit July 03, 2012
Alexander Holler leds: heartbeat: fix bug on panic
With commit 49dca5a I introduced
a bug (visible if CONFIG_PROVE_RCU is enabled) which occures when a panic
has happened:

[ 1526.520230] ===============================
[ 1526.520230] [ INFO: suspicious RCU usage. ]
[ 1526.520230] 3.5.0-rc1+ #12 Not tainted
[ 1526.520230] -------------------------------
[ 1526.520230] /c/kernel-tests/mm/include/linux/rcupdate.h:436 Illegal context switch in RCU read-side critical section!
[ 1526.520230]
[ 1526.520230] other info that might help us debug this:
[ 1526.520230]
[ 1526.520230]
[ 1526.520230] rcu_scheduler_active = 1, debug_locks = 0
[ 1526.520230] 3 locks held by net.agent/3279:
[ 1526.520230]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff82f85962>] do_page_fault+0x193/0x390
[ 1526.520230]  #1:  (panic_lock){+.+...}, at: [<ffffffff82ed2830>] panic+0x37/0x1d3
[ 1526.520230]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff810b9b28>] rcu_lock_acquire+0x0/0x29
[ 1526.520230]
[ 1526.520230] stack backtrace:
[ 1526.520230] Pid: 3279, comm: net.agent Not tainted 3.5.0-rc1+ #12
[ 1526.520230] Call Trace:
[ 1526.520230]  [<ffffffff810e1570>] lockdep_rcu_suspicious+0x109/0x112
[ 1526.520230]  [<ffffffff810bfe3a>] rcu_preempt_sleep_check+0x45/0x47
[ 1526.520230]  [<ffffffff810bfe5a>] __might_sleep+0x1e/0x19a
[ 1526.520230]  [<ffffffff82f8010e>] down_write+0x26/0x81
[ 1526.520230]  [<ffffffff8276a966>] led_trigger_unregister+0x1f/0x9c
[ 1526.520230]  [<ffffffff8276def5>] heartbeat_reboot_notifier+0x15/0x19
[ 1526.520230]  [<ffffffff82f85bf5>] notifier_call_chain+0x96/0xcd
[ 1526.520230]  [<ffffffff82f85cba>] __atomic_notifier_call_chain+0x8e/0xff
[ 1526.520230]  [<ffffffff81094b7c>] ? kmsg_dump+0x37/0x1eb
[ 1526.520230]  [<ffffffff82f85d3f>] atomic_notifier_call_chain+0x14/0x16
[ 1526.520230]  [<ffffffff82ed28e1>] panic+0xe8/0x1d3
[ 1526.520230]  [<ffffffff811473e2>] out_of_memory+0x15d/0x1d3

So in case of a panic, now just turn of the LED. Other approaches like
scheduling a work to unregister the trigger aren't working because there
isn't much which still runs after a panic occured (except timers).

Signed-off-by: Alexander Holler <holler@ahsoftware.de>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
fb31fbe
popcornmix popcornmix referenced this issue from a commit June 13, 2012
[media] tvp5150: fix kernel crash if chip is unavailable
tvp5150 driver probe function doesn't check if the chip is present.
Thus the driver can be loaded without having a device.
This is dangerous and can cause kernel crash like this:

Kernel BUG at c03c0964 [verbose debug info unavailable]
Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
Modules linked in:
CPU: 0    Tainted: G        W     (3.4.0-cm-t3730+ #2)
PC is at media_entity_create_link+0xe4/0xf4
LR is at isp_register_entities+0x228/0x2f4
pc : [<c03c0964>]    lr : [<c03f3b30>]    psr: 60000013
sp : cf02de50  ip : 00000000  fp : c079405c
r10: 00000000  r9 : 00000000  r8 : cf33c800
r7 : c0794834  r6 : 00000000  r5 : 00000000  r4 : cf365b48
r3 : 00000000  r2 : cf365b48  r1 : 00000000  r0 : cf33c800
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5387d  Table: 80004019  DAC: 00000015
Process swapper (pid: 1, stack limit = 0xcf02c2f0)
Stack: (0xcf02de50 to 0xcf02e000)
de40:                                     cf360000 cf366f28 cf366218 c0794834
de60: cf365b48 00000000 cf33c800 c03f3b30 00000000 00000000 cf360890 cf3668a0
de80: 00000003 cf360000 c0785a58 00000000 cf360528 00000000 00000000 00003fff
dea0: cf360500 c03f4cdc c06cc1f4 cf360000 c0785a58 c0d27808 c07d55ec c0785a58
dec0: c031f0e0 c07d55ec c0776900 000000bb 00000000 c032040c c03203f4 c031ef0c
dee0: c0785a58 c07d55ec c0785a8c c031f0e0 c075e670 c031f0c8 cf02deb8 c0785a58
df00: c07d55ec c031f174 c07d55ec 00000000 cf02df18 c031d7a0 cf01d4a8 cf068b10
df20: 00000000 c07d55ec c07c74d0 cf34bcc0 00000000 c031ded8 c0672340 c054cd38
df40: 00000000 c07e68c0 c07d55ec 00000000 00000000 c075e670 c0776900 000000bb
df60: 00000000 c031f770 c07e68c0 00000007 c07e68c0 00000000 c075e670 c0008790
df80: 000000bb 00000006 00000006 c066e650 cf02dfa4 c07689b8 c07689b8 00000007
dfa0: c07e68c0 c073f2e8 c07689c0 000000bb 00000000 c073f2bc 00000006 00000006
dfc0: c073f2e8 00000000 c077649c c077649c c00150cc 00000013 00000000 00000000
dfe0: 00000000 c073f3cc cf02dfe8 00000000 c073f368 c00150cc 00000000 00000000
[<c03c0964>] (media_entity_create_link+0xe4/0xf4) from [<c03f3b30>] (isp_register_entities+0x228/0x2f4)
[<c03f3b30>] (isp_register_entities+0x228/0x2f4) from [<c03f4cdc>] (isp_probe+0x7ac/0x9b8)
[<c03f4cdc>] (isp_probe+0x7ac/0x9b8) from [<c032040c>] (platform_drv_probe+0x18/0x1c)
[<c032040c>] (platform_drv_probe+0x18/0x1c) from [<c031ef0c>] (really_probe+0x64/0x1d8)
[<c031ef0c>] (really_probe+0x64/0x1d8) from [<c031f0c8>] (driver_probe_device+0x48/0x60)
[<c031f0c8>] (driver_probe_device+0x48/0x60) from [<c031f174>] (__driver_attach+0x94/0x98)
[<c031f174>] (__driver_attach+0x94/0x98) from [<c031d7a0>] (bus_for_each_dev+0x54/0x80)
[<c031d7a0>] (bus_for_each_dev+0x54/0x80) from [<c031ded8>] (bus_add_driver+0xac/0x2a8)
[<c031ded8>] (bus_add_driver+0xac/0x2a8) from [<c031f770>] (driver_register+0x78/0x180)
[<c031f770>] (driver_register+0x78/0x180) from [<c0008790>] (do_one_initcall+0x34/0x184)
[<c0008790>] (do_one_initcall+0x34/0x184) from [<c073f2bc>] (do_basic_setup+0x9c/0xc8)
[<c073f2bc>] (do_basic_setup+0x9c/0xc8) from [<c073f3cc>] (kernel_init+0x64/0xec)
[<c073f3cc>] (kernel_init+0x64/0xec) from [<c00150cc>] (kernel_thread_exit+0x0/0x8)
Code: e1c812b6 e8bd87f0 e7f001f2 eafffffe (e7f001f2)

---[ end trace 3ed3c618b26ff3e8 ]---
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

This patch fixes the tvp5150_read() function to return an error in case
the I2C transaction fails.
tvp5150_probe() and other relevant driver callbacks changed to check the
status of the I2C read operations.
In case of a read error throw an error message with v4l2_err()
instead of v4l2_dbg().

[mchehab@redhat.com: Fix a small typo breaking compilation]
Signed-off-by: Dmitry Lifshitz <lifshitz@compulab.co.il>
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
8cd0d4c
popcornmix popcornmix referenced this issue from a commit July 06, 2012
Stephen Boyd ARM: 7457/1: smp: Fix suspicious RCU originating from cpu_die()
While running hotplug tests I ran into this RCU splat

===============================
[ INFO: suspicious RCU usage. ]
3.4.0 #3275 Tainted: G        W
-------------------------------
include/linux/rcupdate.h:729 rcu_read_lock() used illegally while idle!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
4 locks held by swapper/2/0:
 #0:  ((cpu_died).wait.lock){......}, at: [<c00ab128>] complete+0x1c/0x5c
 #1:  (&p->pi_lock){-.-.-.}, at: [<c00b275c>] try_to_wake_up+0x2c/0x388
 #2:  (&rq->lock){-.-.-.}, at: [<c00b2860>] try_to_wake_up+0x130/0x388
 #3:  (rcu_read_lock){.+.+..}, at: [<c00abe5c>] cpuacct_charge+0x28/0x1f4

stack backtrace:
[<c001521c>] (unwind_backtrace+0x0/0x12c) from [<c00abec8>] (cpuacct_charge+0x94/0x1f4)
[<c00abec8>] (cpuacct_charge+0x94/0x1f4) from [<c00b395c>] (update_curr+0x24c/0x2c8)
[<c00b395c>] (update_curr+0x24c/0x2c8) from [<c00b59c4>] (enqueue_task_fair+0x50/0x194)
[<c00b59c4>] (enqueue_task_fair+0x50/0x194) from [<c00afea4>] (enqueue_task+0x30/0x34)
[<c00afea4>] (enqueue_task+0x30/0x34) from [<c00b0908>] (ttwu_activate+0x14/0x38)
[<c00b0908>] (ttwu_activate+0x14/0x38) from [<c00b28a8>] (try_to_wake_up+0x178/0x388)
[<c00b28a8>] (try_to_wake_up+0x178/0x388) from [<c00a82a0>] (__wake_up_common+0x34/0x78)
[<c00a82a0>] (__wake_up_common+0x34/0x78) from [<c00ab154>] (complete+0x48/0x5c)
[<c00ab154>] (complete+0x48/0x5c) from [<c07db7cc>] (cpu_die+0x2c/0x58)
[<c07db7cc>] (cpu_die+0x2c/0x58) from [<c000f954>] (cpu_idle+0x64/0xfc)
[<c000f954>] (cpu_idle+0x64/0xfc) from [<80208160>] (0x80208160)

When a cpu is marked offline during its idle thread it calls
cpu_die() during an RCU idle period. cpu_die() calls complete()
to notify the killing process that the cpu has died. complete()
calls into the scheduler code and eventually grabs an RCU read
lock in cpuacct_charge().

Mark complete() as RCU_NONIDLE so that RCU pays attention to this
CPU for the duration of the complete() function even though it's
in idle.

Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ff081e0
popcornmix popcornmix referenced this issue from a commit July 11, 2012
cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Cc: <stable@vger.kernel.org>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
3cf003c
popcornmix popcornmix referenced this issue from a commit July 23, 2012
nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
5cf02d0
popcornmix popcornmix referenced this issue from a commit August 08, 2012
Stanislaw Gruszka sched: fix divide by zero at {thread_group,task}_times
On architectures where cputime_t is 64 bit type, is possible to trigger
divide by zero on do_div(temp, (__force u32) total) line, if total is a
non zero number but has lower 32 bit's zeroed. Removing casting is not
a good solution since some do_div() implementations do cast to u32
internally.

This problem can be triggered in practice on very long lived processes:

  PID: 2331   TASK: ffff880472814b00  CPU: 2   COMMAND: "oraagent.bin"
   #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b
   #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2
   #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00
   #3 [ffff880472a51cd0] die at ffffffff8100f26b
   #4 [ffff880472a51d00] do_trap at ffffffff814f03f4
   #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff
   #6 [ffff880472a51e00] divide_error at ffffffff8100be7b
      [exception RIP: thread_group_times+0x56]
      RIP: ffffffff81056a16  RSP: ffff880472a51eb8  RFLAGS: 00010046
      RAX: bc3572c9fe12d194  RBX: ffff880874150800  RCX: 0000000110266fad
      RDX: 0000000000000000  RSI: ffff880472a51eb8  RDI: 001038ae7d9633dc
      RBP: ffff880472a51ef8   R8: 00000000b10a3a64   R9: ffff880874150800
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: ffff880472a51f08
      R13: ffff880472a51f10  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d
   #8 [ffff880472a51f40] sys_times at ffffffff81088524
   #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2
      RIP: 0000003808caac3a  RSP: 00007fcba27ab6d8  RFLAGS: 00000202
      RAX: 0000000000000064  RBX: ffffffff8100b0f2  RCX: 0000000000000000
      RDX: 00007fcba27ab6e0  RSI: 000000000076d58e  RDI: 00007fcba27ab6e0
      RBP: 00007fcba27ab700   R8: 0000000000000020   R9: 000000000000091b
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: 00007fff9ca41940
      R13: 0000000000000000  R14: 00007fcba27ac9c0  R15: 00007fff9ca41940
      ORIG_RAX: 0000000000000064  CS: 0033  SS: 002b

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120808092714.GA3580@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
bea6832
popcornmix popcornmix referenced this issue from a commit August 20, 2012
tcp: fix possible socket refcount problem
Commit 6f458df (tcp: improve latencies of timer triggered events)
added bug leading to following trace :

[ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.131726]
[ 2866.132188] =========================
[ 2866.132281] [ BUG: held lock freed! ]
[ 2866.132281] 3.6.0-rc1+ #622 Not tainted
[ 2866.132281] -------------------------
[ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there!
[ 2866.132281]  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281] 4 locks held by kworker/0:1/652:
[ 2866.132281]  #0:  (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #1:  ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #2:  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281]  #3:  (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f
[ 2866.132281]
[ 2866.132281] stack backtrace:
[ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622
[ 2866.132281] Call Trace:
[ 2866.132281]  <IRQ>  [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159
[ 2866.132281]  [<ffffffff818a0839>] ? __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a
[ 2866.132281]  [<ffffffff818a0839>] __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff818a08c0>] sk_free+0x1c/0x1e
[ 2866.132281]  [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56
[ 2866.132281]  [<ffffffff81078082>] run_timer_softirq+0x218/0x35f
[ 2866.132281]  [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f
[ 2866.132281]  [<ffffffff810f5831>] ? rb_commit+0x58/0x85
[ 2866.132281]  [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148
[ 2866.132281]  [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9
[ 2866.132281]  [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e
[ 2866.132281]  [<ffffffff81a1227c>] call_softirq+0x1c/0x30
[ 2866.132281]  [<ffffffff81039f38>] do_softirq+0x4a/0xa6
[ 2866.132281]  [<ffffffff81070f2b>] irq_exit+0x51/0xad
[ 2866.132281]  [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4
[ 2866.132281]  [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f
[ 2866.132281]  <EOI>  [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1
[ 2866.132281]  [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56
[ 2866.132281]  [<ffffffff81078692>] mod_timer+0x178/0x1a9
[ 2866.132281]  [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26
[ 2866.132281]  [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4
[ 2866.132281]  [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70
[ 2866.132281]  [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4
[ 2866.132281]  [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1
[ 2866.132281]  [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a
[ 2866.132281]  [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6
[ 2866.132281]  [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5
[ 2866.132281]  [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4
[ 2866.132281]  [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5
[ 2866.132281]  [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f
[ 2866.132281]  [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43
[ 2866.132281]  [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80
[ 2866.132281]  [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0
[ 2866.132281]  [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61
[ 2866.132281]  [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1
[ 2866.132281]  [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff81999d92>] call_transmit+0x1c5/0x20e
[ 2866.132281]  [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34
[ 2866.132281]  [<ffffffff810835d6>] process_one_work+0x24d/0x47f
[ 2866.132281]  [<ffffffff81083567>] ? process_one_work+0x1de/0x47f
[ 2866.132281]  [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225
[ 2866.132281]  [<ffffffff81083a6d>] worker_thread+0x236/0x317
[ 2866.132281]  [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f
[ 2866.132281]  [<ffffffff8108b7b8>] kthread+0x9a/0xa2
[ 2866.132281]  [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10
[ 2866.132281]  [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13
[ 2866.132281]  [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a
[ 2866.132281]  [<ffffffff81a12180>] ? gs_change+0x13/0x13
[ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.309689] =============================================================================
[ 2866.310254] BUG TCP (Not tainted): Object already free
[ 2866.310254] -----------------------------------------------------------------------------
[ 2866.310254]

The bug comes from the fact that timer set in sk_reset_timer() can run
before we actually do the sock_hold(). socket refcount reaches zero and
we free the socket too soon.

timer handler is not allowed to reduce socket refcnt if socket is owned
by the user, or we need to change sk_reset_timer() implementation.

We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED
or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags

Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED
was used instead of TCP_DELACK_TIMER_DEFERRED.

For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED,
even if not fired from a timer.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
144d56e
popcornmix popcornmix referenced this issue from a commit July 30, 2012
fbcon: Fix bit_putcs() call to kmalloc(s, GFP_KERNEL)
Switch to kmalloc(,GFP_ATOMIC) in bit_putcs to fix below trace:

[    9.771812] BUG: sleeping function called from invalid context at /usr/src/linux-git/mm/slub.c:943
[    9.771814] in_atomic(): 1, irqs_disabled(): 1, pid: 1063, name: mount
[    9.771818] Pid: 1063, comm: mount Not tainted 3.5.0-jupiter-00003-g8d858b1-dirty #2
[    9.771819] Call Trace:
[    9.771838]  [<c104f79b>] __might_sleep+0xcb/0xe0
[    9.771844]  [<c10c00d4>] __kmalloc+0xb4/0x1c0
[    9.771851]  [<c1041d4a>] ? queue_work+0x1a/0x30
[    9.771854]  [<c1041dcf>] ? queue_delayed_work+0xf/0x30
[    9.771862]  [<c1205832>] ? bit_putcs+0xf2/0x3e0
[    9.771865]  [<c1041e01>] ? schedule_delayed_work+0x11/0x20
[    9.771868]  [<c1205832>] bit_putcs+0xf2/0x3e0
[    9.771875]  [<c12002b8>] ? get_color.clone.14+0x28/0x100
[    9.771878]  [<c1200d2f>] fbcon_putcs+0x11f/0x130
[    9.771882]  [<c1205740>] ? bit_clear+0xe0/0xe0
[    9.771885]  [<c1200f6d>] fbcon_redraw.clone.21+0x11d/0x160
[    9.771889]  [<c120383d>] fbcon_scroll+0x79d/0xe10
[    9.771892]  [<c12002b8>] ? get_color.clone.14+0x28/0x100
[    9.771897]  [<c124c0b4>] scrup+0x64/0xd0
[    9.771900]  [<c124c22b>] lf+0x2b/0x60
[    9.771903]  [<c124cc95>] vt_console_print+0x1d5/0x2f0
[    9.771907]  [<c124cac0>] ? register_vt_notifier+0x20/0x20
[    9.771913]  [<c102b335>] call_console_drivers.clone.5+0xa5/0xc0
[    9.771916]  [<c102c58e>] console_unlock+0x2fe/0x3c0
[    9.771920]  [<c102ca16>] vprintk_emit+0x2e6/0x300
[    9.771924]  [<c13f01ae>] printk+0x38/0x3a
[    9.771931]  [<c112e8fe>] reiserfs_remount+0x2ae/0x3e0
[    9.771934]  [<c112e650>] ? reiserfs_fill_super+0xb00/0xb00
[    9.771939]  [<c10ca0ab>] do_remount_sb+0xab/0x150
[    9.771943]  [<c1034476>] ? ns_capable+0x46/0x70
[    9.771948]  [<c10e059c>] do_mount+0x20c/0x6b0
[    9.771955]  [<c10a7044>] ? strndup_user+0x34/0x50
[    9.771958]  [<c10e0acc>] sys_mount+0x6c/0xa0
[    9.771964]  [<c13f2557>] sysenter_do_call+0x12/0x26

According to comment in bit_putcs() that kammloc() call only happens
when fbcon is drawing to a monochrome framebuffer (which is my case with
hid-picolcd).

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
72caa5f
popcornmix popcornmix referenced this issue from a commit July 17, 2012
mmc: mxs-mmc: fix deadlock caused by recursion loop
Release the lock before mmc_signal_sdio_irq is called by
mxs_mmc_enable_sdio_irq.

Backtrace:
[   65.470000] =============================================
[   65.470000] [ INFO: possible recursive locking detected ]
[   65.470000] 3.5.0-rc5 #2 Not tainted
[   65.470000] ---------------------------------------------
[   65.470000] ksdioirqd/mmc0/73 is trying to acquire lock:
[   65.470000]  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] but task is already holding lock:
[   65.470000]  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] other info that might help us debug this:
[   65.470000]  Possible unsafe locking scenario:
[   65.470000]
[   65.470000]        CPU0
[   65.470000]        ----
[   65.470000]   lock(&(&host->lock)->rlock#2);
[   65.470000]   lock(&(&host->lock)->rlock#2);
[   65.470000]
[   65.470000]  *** DEADLOCK ***
[   65.470000]
[   65.470000]  May be due to missing lock nesting notation
[   65.470000]
[   65.470000] 1 lock held by ksdioirqd/mmc0/73:
[   65.470000]  #0:  (&(&host->lock)->rlock#2){-.-...}, at: [<bf054120>] mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]
[   65.470000]
[   65.470000] stack backtrace:
[   65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98)
[   65.470000] [<c005ccb8>] (__lock_acquire+0x14f8/0x1b98) from [<c005d3f8>] (lock_acquire+0xa0/0x108)
[   65.470000] [<c005d3f8>] (lock_acquire+0xa0/0x108) from [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c)
[   65.470000] [<c02f671c>] (_raw_spin_lock_irqsave+0x48/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[   65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[   65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[   65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[   65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)
[   65.470000] BUG: spinlock lockup suspected on CPU#0, ksdioirqd/mmc0/73
[   65.470000]  lock: 0xc3358724, .magic: dead4ead, .owner: ksdioirqd/mmc0/73, .owner_cpu: 0
[   65.470000] [<c0014990>] (unwind_backtrace+0x0/0xf4) from [<c01b46b0>] (do_raw_spin_lock+0x100/0x144)
[   65.470000] [<c01b46b0>] (do_raw_spin_lock+0x100/0x144) from [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c)
[   65.470000] [<c02f6724>] (_raw_spin_lock_irqsave+0x50/0x5c) from [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc])
[   65.470000] [<bf054120>] (mxs_mmc_enable_sdio_irq+0x18/0xdc [mxs_mmc]) from [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc])
[   65.470000] [<bf0541d0>] (mxs_mmc_enable_sdio_irq+0xc8/0xdc [mxs_mmc]) from [<c0219b38>] (sdio_irq_thread+0x1bc/0x274)
[   65.470000] [<c0219b38>] (sdio_irq_thread+0x1bc/0x274) from [<c003c324>] (kthread+0x8c/0x98)
[   65.470000] [<c003c324>] (kthread+0x8c/0x98) from [<c00101ac>] (kernel_thread_exit+0x0/0x8)

Reported-by: Attila Kinali <attila@kinali.ch>
Signed-off-by: Lauri Hintsala <lauri.hintsala@bluegiga.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
fc108d2
popcornmix popcornmix referenced this issue from a commit September 04, 2012
l2tp: fix a lockdep splat
Fixes following lockdep splat :

[ 1614.734896] =============================================
[ 1614.734898] [ INFO: possible recursive locking detected ]
[ 1614.734901] 3.6.0-rc3+ #782 Not tainted
[ 1614.734903] ---------------------------------------------
[ 1614.734905] swapper/11/0 is trying to acquire lock:
[ 1614.734907]  (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.734920]
[ 1614.734920] but task is already holding lock:
[ 1614.734922]  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734932]
[ 1614.734932] other info that might help us debug this:
[ 1614.734935]  Possible unsafe locking scenario:
[ 1614.734935]
[ 1614.734937]        CPU0
[ 1614.734938]        ----
[ 1614.734940]   lock(slock-AF_INET);
[ 1614.734943]   lock(slock-AF_INET);
[ 1614.734946]
[ 1614.734946]  *** DEADLOCK ***
[ 1614.734946]
[ 1614.734949]  May be due to missing lock nesting notation
[ 1614.734949]
[ 1614.734952] 7 locks held by swapper/11/0:
[ 1614.734954]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00
[ 1614.734964]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0
[ 1614.734972]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230
[ 1614.734982]  #3:  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734989]  #4:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680
[ 1614.734997]  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890
[ 1614.735004]  #6:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00
[ 1614.735012]
[ 1614.735012] stack backtrace:
[ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ #782
[ 1614.735018] Call Trace:
[ 1614.735020]  <IRQ>  [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10
[ 1614.735033]  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
[ 1614.735037]  [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130
[ 1614.735042]  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
[ 1614.735047]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735051]  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
[ 1614.735060]  [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50
[ 1614.735065]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735069]  [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735075]  [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
[ 1614.735079]  [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70
[ 1614.735083]  [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70
[ 1614.735087]  [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00
[ 1614.735093]  [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290
[ 1614.735098]  [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00
[ 1614.735102]  [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70
[ 1614.735106]  [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0
[ 1614.735111]  [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280
[ 1614.735117]  [<ffffffff8160a7e2>] arp_xmit+0x22/0x60
[ 1614.735121]  [<ffffffff8160a863>] arp_send+0x43/0x50
[ 1614.735125]  [<ffffffff8160b82f>] arp_solicit+0x18f/0x450
[ 1614.735132]  [<ffffffff8159d9da>] neigh_probe+0x4a/0x70
[ 1614.735137]  [<ffffffff815a191a>] __neigh_event_send+0xea/0x300
[ 1614.735141]  [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260
[ 1614.735146]  [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890
[ 1614.735150]  [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890
[ 1614.735154]  [<ffffffff815dae79>] ip_output+0x59/0xf0
[ 1614.735158]  [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0
[ 1614.735162]  [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680
[ 1614.735165]  [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0
[ 1614.735172]  [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60
[ 1614.735177]  [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620
[ 1614.735181]  [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960
[ 1614.735185]  [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0
[ 1614.735189]  [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0
[ 1614.735194]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735199]  [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230
[ 1614.735203]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735208]  [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0
[ 1614.735213]  [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0
[ 1614.735217]  [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0
[ 1614.735221]  [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0
[ 1614.735225]  [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90
[ 1614.735229]  [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730
[ 1614.735233]  [<ffffffff815d425d>] ip_rcv+0x21d/0x300
[ 1614.735237]  [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00
[ 1614.735241]  [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00
[ 1614.735245]  [<ffffffff81593368>] process_backlog+0xb8/0x180
[ 1614.735249]  [<ffffffff81593cf9>] net_rx_action+0x159/0x330
[ 1614.735257]  [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0
[ 1614.735264]  [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30
[ 1614.735270]  [<ffffffff8175419c>] call_softirq+0x1c/0x30
[ 1614.735278]  [<ffffffff8100425d>] do_softirq+0x8d/0xc0
[ 1614.735282]  [<ffffffff8104983e>] irq_exit+0xae/0xe0
[ 1614.735287]  [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99
[ 1614.735291]  [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80
[ 1614.735293]  <EOI>  [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10
[ 1614.735306]  [<ffffffff81336f85>] ? intel_idle+0xf5/0x150
[ 1614.735310]  [<ffffffff81336f7e>] ? intel_idle+0xee/0x150
[ 1614.735317]  [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20
[ 1614.735321]  [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630
[ 1614.735327]  [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0
[ 1614.735333]  [<ffffffff8173762e>] start_secondary+0x220/0x222

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
37159ef
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi October 03, 2012
bonding: set qdisc_tx_busylock to avoid LOCKDEP splat
If a qdisc is installed on a bonding device, its possible to get
following lockdep splat under stress :

 =============================================
 [ INFO: possible recursive locking detected ]
 3.6.0+ #211 Not tainted
 ---------------------------------------------
 ping/4876 is trying to acquire lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 but task is already holding lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by ping/4876:
  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e5030>] raw_sendmsg+0x600/0xc30
  #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815ba4bd>] ip_finish_output+0x12d/0x870
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830
  #3:  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  #4:  (&bond->lock){++.?..}, at: [<ffffffffa02128c1>] bond_start_xmit+0x31/0x4b0 [bonding]
  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830

 stack backtrace:
 Pid: 4876, comm: ping Not tainted 3.6.0+ #211
 Call Trace:
  [<ffffffff810a0145>] __lock_acquire+0x715/0x1b80
  [<ffffffff810a256b>] ? mark_held_locks+0x9b/0x100
  [<ffffffff810a1bf2>] lock_acquire+0x92/0x1d0
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff81726b7c>] _raw_spin_lock+0x3c/0x50
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff8106264d>] ? rcu_read_lock_bh_held+0x5d/0x90
  [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffffa0212a6a>] bond_start_xmit+0x1da/0x4b0 [bonding]
  [<ffffffff815796d0>] dev_hard_start_xmit+0x240/0x6b0
  [<ffffffff81597c6e>] sch_direct_xmit+0xfe/0x2a0
  [<ffffffff8157a249>] dev_queue_xmit+0x199/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffff815ba96f>] ip_finish_output+0x5df/0x870
  [<ffffffff815ba4bd>] ? ip_finish_output+0x12d/0x870
  [<ffffffff815bb964>] ip_output+0x54/0xf0
  [<ffffffff815bad48>] ip_local_out+0x28/0x90
  [<ffffffff815bc444>] ip_send_skb+0x14/0x50
  [<ffffffff815bc4b2>] ip_push_pending_frames+0x32/0x40
  [<ffffffff815e536a>] raw_sendmsg+0x93a/0xc30
  [<ffffffff8128d570>] ? selinux_file_send_sigiotask+0x1f0/0x1f0
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6855>] inet_sendmsg+0x125/0x240
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8155cddb>] sock_sendmsg+0xab/0xe0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff8155d18c>] __sys_sendmsg+0x37c/0x390
  [<ffffffff81195b2a>] ? fsnotify+0x2ca/0x7e0
  [<ffffffff811958e8>] ? fsnotify+0x88/0x7e0
  [<ffffffff81361f36>] ? put_ldisc+0x56/0xd0
  [<ffffffff8116f98a>] ? fget_light+0x3da/0x510
  [<ffffffff8155f6c4>] sys_sendmsg+0x44/0x80
  [<ffffffff8172fc22>] system_call_fastpath+0x16/0x1b

Avoid this problem using a distinct lock_class_key for bonding
devices.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
49ee492
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi October 05, 2012
Dmitry Monakhov ext4: fix ext4_flush_completed_IO wait semantics
BUG #1) All places where we call ext4_flush_completed_IO are broken
    because buffered io and DIO/AIO goes through three stages
    1) submitted io,
    2) completed io (in i_completed_io_list) conversion pended
    3) finished  io (conversion done)
    And by calling ext4_flush_completed_IO we will flush only
    requests which were in (2) stage, which is wrong because:
     1) punch_hole and truncate _must_ wait for all outstanding unwritten io
      regardless to it's state.
     2) fsync and nolock_dio_read should also wait because there is
        a time window between end_page_writeback() and ext4_add_complete_io()
        As result integrity fsync is broken in case of buffered write
        to fallocated region:
        fsync                                      blkdev_completion
	 ->filemap_write_and_wait_range
                                                   ->ext4_end_bio
                                                     ->end_page_writeback
          <-- filemap_write_and_wait_range return
	 ->ext4_flush_completed_IO
   	 sees empty i_completed_io_list but pended
   	 conversion still exist
                                                     ->ext4_add_complete_io

BUG #2) Race window becomes wider due to the 'ext4: completed_io
locking cleanup V4' patch series

This patch make following changes:
1) ext4_flush_completed_io() now first try to flush completed io and when
   wait for any outstanding unwritten io via ext4_unwritten_wait()
2) Rename function to more appropriate name.
3) Assert that all callers of ext4_flush_unwritten_io should hold i_mutex to
   prevent endless wait

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
c278531
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi October 08, 2012
mm, slab: release slab_mutex earlier in kmem_cache_destroy()
Commit 1331e7a ("rcu: Remove _rcu_barrier() dependency on
__stop_machine()") introduced slab_mutex -> cpu_hotplug.lock dependency
through kmem_cache_destroy() -> rcu_barrier() -> _rcu_barrier() ->
get_online_cpus().

Lockdep thinks that this might actually result in ABBA deadlock,
and reports it as below:

=== [ cut here ] ===
 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.6.0-rc5-00004-g0d8ee37 #143 Not tainted
 -------------------------------------------------------
 kworker/u:2/40 is trying to acquire lock:
  (rcu_sched_state.barrier_mutex){+.+...}, at: [<ffffffff810f2126>] _rcu_barrier+0x26/0x1e0

 but task is already holding lock:
  (slab_mutex){+.+.+.}, at: [<ffffffff81176e15>] kmem_cache_destroy+0x45/0xe0

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (slab_mutex){+.+.+.}:
        [<ffffffff810ae1e2>] validate_chain+0x632/0x720
        [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
        [<ffffffff810ae921>] lock_acquire+0x121/0x190
        [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
        [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
        [<ffffffff81558cb5>] cpuup_callback+0x2f/0xbe
        [<ffffffff81564b83>] notifier_call_chain+0x93/0x140
        [<ffffffff81076f89>] __raw_notifier_call_chain+0x9/0x10
        [<ffffffff8155719d>] _cpu_up+0xba/0x14e
        [<ffffffff815572ed>] cpu_up+0xbc/0x117
        [<ffffffff81ae05e3>] smp_init+0x6b/0x9f
        [<ffffffff81ac47d6>] kernel_init+0x147/0x1dc
        [<ffffffff8156ab44>] kernel_thread_helper+0x4/0x10

 -> #1 (cpu_hotplug.lock){+.+.+.}:
        [<ffffffff810ae1e2>] validate_chain+0x632/0x720
        [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
        [<ffffffff810ae921>] lock_acquire+0x121/0x190
        [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
        [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
        [<ffffffff81049197>] get_online_cpus+0x37/0x50
        [<ffffffff810f21bb>] _rcu_barrier+0xbb/0x1e0
        [<ffffffff810f22f0>] rcu_barrier_sched+0x10/0x20
        [<ffffffff810f2309>] rcu_barrier+0x9/0x10
        [<ffffffff8118c129>] deactivate_locked_super+0x49/0x90
        [<ffffffff8118cc01>] deactivate_super+0x61/0x70
        [<ffffffff811aaaa7>] mntput_no_expire+0x127/0x180
        [<ffffffff811ab49e>] sys_umount+0x6e/0xd0
        [<ffffffff81569979>] system_call_fastpath+0x16/0x1b

 -> #0 (rcu_sched_state.barrier_mutex){+.+...}:
        [<ffffffff810adb4e>] check_prev_add+0x3de/0x440
        [<ffffffff810ae1e2>] validate_chain+0x632/0x720
        [<ffffffff810ae5d9>] __lock_acquire+0x309/0x530
        [<ffffffff810ae921>] lock_acquire+0x121/0x190
        [<ffffffff8155d4cc>] __mutex_lock_common+0x5c/0x450
        [<ffffffff8155d9ee>] mutex_lock_nested+0x3e/0x50
        [<ffffffff810f2126>] _rcu_barrier+0x26/0x1e0
        [<ffffffff810f22f0>] rcu_barrier_sched+0x10/0x20
        [<ffffffff810f2309>] rcu_barrier+0x9/0x10
        [<ffffffff81176ea1>] kmem_cache_destroy+0xd1/0xe0
        [<ffffffffa04c3154>] nf_conntrack_cleanup_net+0xe4/0x110 [nf_conntrack]
        [<ffffffffa04c31aa>] nf_conntrack_cleanup+0x2a/0x70 [nf_conntrack]
        [<ffffffffa04c42ce>] nf_conntrack_net_exit+0x5e/0x80 [nf_conntrack]
        [<ffffffff81454b79>] ops_exit_list+0x39/0x60
        [<ffffffff814551ab>] cleanup_net+0xfb/0x1b0
        [<ffffffff8106917b>] process_one_work+0x26b/0x4c0
        [<ffffffff81069f3e>] worker_thread+0x12e/0x320
        [<ffffffff8106f73e>] kthread+0x9e/0xb0
        [<ffffffff8156ab44>] kernel_thread_helper+0x4/0x10

 other info that might help us debug this:

 Chain exists of:
   rcu_sched_state.barrier_mutex --> cpu_hotplug.lock --> slab_mutex

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(slab_mutex);
                                lock(cpu_hotplug.lock);
                                lock(slab_mutex);
   lock(rcu_sched_state.barrier_mutex);

  *** DEADLOCK ***
=== [ cut here ] ===

This is actually a false positive. Lockdep has no way of knowing the fact
that the ABBA can actually never happen, because of special semantics of
cpu_hotplug.refcount and its handling in cpu_hotplug_begin(); the mutual
exclusion there is not achieved through mutex, but through
cpu_hotplug.refcount.

The "neither cpu_up() nor cpu_down() will proceed past cpu_hotplug_begin()
until everyone who called get_online_cpus() will call put_online_cpus()"
semantics is totally invisible to lockdep.

This patch therefore moves the unlock of slab_mutex so that rcu_barrier()
is being called with it unlocked. It has two advantages:

- it slightly reduces hold time of slab_mutex; as it's used to protect
  the cachep list, it's not necessary to hold it over kmem_cache_free()
  call any more
- it silences the lockdep false positive warning, as it avoids lockdep ever
  learning about slab_mutex -> cpu_hotplug.lock dependency

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
210ed9d
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi October 12, 2012
Linus Torvalds Merge tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/gi…
…t/tiwai/sound

Pull sound updates #2 from Takashi Iwai:
 "This update contains a few cleanup works, regression/stable fixes
  gathered since the last pull request.

   - Clean up with generic hd-audio jack handling code by David
     Henningsson
   - A few regression fixes for standardized HD-audio auto-parser
   - Misc clean-up and small fixes"

* tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - do not detect jack on internal speakers for Realtek
  ALSA: hda - Fix missing beep on ASUS X43U notebook
  ALSA: hda - Remove AZX_DCAPS_POSFIX_COMBO
  ALSA: hda - Warn an allocation for an uninitialized array
  ALSA: hda/cirrus - Add missing init/free of hda_gen_spec
  ALSA: hda - Fix memory leaks at error path in patch_cirrus.c
  ALSA: hda - Add missing hda_gen_spec to struct via_spec
  ALSA: hda - remove "Mic Jack Mode" for headset jacks (Latitude Exx30)
  ALSA: hda - make Cirrus codec use generic unsol event handler
  ALSA: hda - make VIA codec use generic unsol event handler
  ALSA: hda - Remove dead GPIO code for VIA codec
  ALSA: usb-audio: Add TASCAM US122 MKII playback
2fc07ef
popcornmix popcornmix referenced this issue from a commit November 14, 2012
ipv4/ip_vti.c: VTI fix post-decryption forwarding
[ Upstream commit b294200 ]

With the latest kernel there are two things that must be done post decryption
 so that the packet are forwarded.
 1. Remove the mark from the packet. This will cause the packet to not match
 the ipsec-policy again. However doing this causes the post-decryption check to
 fail also and the packet will get dropped. (cat /proc/net/xfrm_stat).
 2. Remove the sp association in the skbuff so that no policy check is done on
 the packet for VTI tunnels.

Due to #2 above we must now do a security-policy check in the vti rcv path
prior to resetting the mark in the skbuff.

Signed-off-by: Saurabh Mohan <saurabh.mohan@vyatta.com>
Reported-by: Ruben Herold <ruben@puettmann.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
b31143d
popcornmix popcornmix referenced this issue from a commit September 25, 2012
NFC: Use dynamic initialization for rwlocks
commit fe235b5 upstream.

If rwlock is dynamically allocated but statically initialized it is
missing proper lockdep annotation.

INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 3352, comm: neard Not tainted 3.5.0-999-nfc+ #2
Call Trace:
[<ffffffff810c8526>] __lock_acquire+0x8f6/0x1bf0
[<ffffffff81739045>] ? printk+0x4d/0x4f
[<ffffffff810c9eed>] lock_acquire+0x9d/0x220
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81746724>] _raw_read_lock+0x44/0x60
[<ffffffff81702bfe>] ? nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff81702bfe>] nfc_llcp_sock_from_sn+0x4e/0x160
[<ffffffff817034a7>] nfc_llcp_get_sdp_ssap+0xa7/0x1b0
[<ffffffff81706353>] llcp_sock_bind+0x173/0x210
[<ffffffff815d9c94>] sys_bind+0xe4/0x100
[<ffffffff8139209e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[<ffffffff8174ea69>] system_call_fastpath+0x16/0x1b

Signed-off-by: Szymon Janc <szymon.janc@tieto.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cc0e742
popcornmix popcornmix referenced this issue from a commit November 06, 2012
KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-…
…2012-4461)

commit 6d1068b upstream.

On hosts without the XSAVE support unprivileged local user can trigger
oops similar to the one below by setting X86_CR4_OSXSAVE bit in guest
cr4 register using KVM_SET_SREGS ioctl and later issuing KVM_RUN
ioctl.

invalid opcode: 0000 [#2] SMP
Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables
...
Pid: 24935, comm: zoog_kvm_monito Tainted: G      D      3.2.0-3-686-pae
EIP: 0060:[<f8b9550c>] EFLAGS: 00210246 CPU: 0
EIP is at kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm]
EAX: 00000001 EBX: 000f387e ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: ef5a0060 ESP: d7c63e70
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process zoog_kvm_monito (pid: 24935, ti=d7c62000 task=ed84a0c0
task.ti=d7c62000)
Stack:
 00000001 f70a1200 f8b940a9 ef5a0060 00000000 00200202 f8769009 00000000
 ef5a0060 000f387e eda5c020 8722f9c8 00015bae 00000000 ed84a0c0 ed84a0c0
 c12bf02d 0000ae80 ef7f8740 fffffffb f359b740 ef5a0060 f8b85dc1 0000ae80
Call Trace:
 [<f8b940a9>] ? kvm_arch_vcpu_ioctl_set_sregs+0x2fe/0x308 [kvm]
...
 [<c12bfb44>] ? syscall_call+0x7/0xb
Code: 89 e8 e8 14 ee ff ff ba 00 00 04 00 89 e8 e8 98 48 ff ff 85 c0 74
1e 83 7d 48 00 75 18 8b 85 08 07 00 00 31 c9 8b 95 0c 07 00 00 <0f> 01
d1 c7 45 48 01 00 00 00 c7 45 1c 01 00 00 00 0f ae f0 89
EIP: [<f8b9550c>] kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm] SS:ESP
0068:d7c63e70

QEMU first retrieves the supported features via KVM_GET_SUPPORTED_CPUID
and then sets them later. So guest's X86_FEATURE_XSAVE should be masked
out on hosts without X86_FEATURE_XSAVE, making kvm_set_cr4 with
X86_CR4_OSXSAVE fail. Userspaces that allow specifying guest cpuid with
X86_FEATURE_XSAVE even on hosts that do not support it, might be
susceptible to this attack from inside the guest as well.

Allow setting X86_CR4_OSXSAVE bit only if host has XSAVE support.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
e467125
Stephen Warren swarren referenced this issue from a commit December 04, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi March 16, 2011
drbd: Separate connection state changes from minor dev state changes #2
New function got_conn_RqSReply()

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
e4f78ed
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi November 05, 2012
cgroup: deactivate CSS's and mark cgroup dead before invoking ->pre_d…
…estroy()

Because ->pre_destroy() could fail and can't be called under
cgroup_mutex, cgroup destruction did something very ugly.

  1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.

  2. Release cgroup_mutex and call ->pre_destroy().

  3. Re-grab cgroup_mutex and verify it can still be destroyed; fail
     otherwise.

  4. Continue destroying.

In addition to being ugly, it has been always broken in various ways.
For example, memcg ->pre_destroy() expects the cgroup to be inactive
after it's done but tasks can be attached and detached between #2 and
#3 and the conditions that memcg verified in ->pre_destroy() might no
longer hold by the time control reaches #3.

Now that ->pre_destroy() is no longer allowed to fail.  We can switch
to the following.

  1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.

  2. Deactivate CSS's and mark the cgroup removed thus preventing any
     further operations which can invalidate the verification from #1.

  3. Release cgroup_mutex and call ->pre_destroy().

  4. Re-grab cgroup_mutex and continue destroying.

After this change, controllers can safely assume that ->pre_destroy()
will only be called only once for a given cgroup and, once
->pre_destroy() is called, the cgroup will stay dormant till it's
destroyed.

This removes the only reason ->pre_destroy() can fail - new task being
attached or child cgroup being created inbetween.  Error out path is
removed and ->pre_destroy() invocation is open coded in
cgroup_rmdir().

v2: cgroup_call_pre_destroy() removal moved to this patch per Michal.
    Commit message updated per Glauber.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>
1a90dd5
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 27, 2012
Jon Hunter ARM: OMAP3+: Implement timer workaround for errata i103 and i767
Errata Titles:
i103: Delay needed to read some GP timer, WD timer and sync timer
      registers after wakeup (OMAP3/4)
i767: Delay needed to read some GP timer registers after wakeup (OMAP5)

Description (i103/i767):
If a General Purpose Timer (GPTimer) is in posted mode
(TSICR [2].POSTED=1), due to internal resynchronizations, values read in
TCRR, TCAR1 and TCAR2 registers right after the timer interface clock
(L4) goes from stopped to active may not return the expected values. The
most common event leading to this situation occurs upon wake up from
idle.

GPTimer non-posted synchronization mode is not impacted by this
limitation.

Workarounds:
1). Disable posted mode
2). Use static dependency between timer clock domain and MPUSS clock
    domain
3). Use no-idle mode when the timer is active

Workarounds #2 and #3 are not pratical from a power standpoint and so
workaround #1 has been implemented. Disabling posted mode adds some CPU
overhead for configuring and reading the timers as the CPU has to wait
for accesses to be re-synchronised within the timer. However, disabling
posted mode guarantees correct operation.

Please note that it is safe to use posted mode for timers if the counter
(TCRR) and capture (TCARx) registers will never be read. An example of
this is the clock-event system timer. This is used by the kernel to
schedule events however, the timers counter is never read and capture
registers are not used. Given that the kernel configures this timer
often yet never reads the counter register it is safe to enable posted
mode in this case. Hence, for the timer used for kernel clock-events,
posted mode is enabled by overriding the errata for devices that are
impacted by this defect.

For drivers using the timers that do not read the counter or capture
registers and wish to use posted mode, can override the errata and
enable posted mode by making the following function calls.

	__omap_dm_timer_override_errata(timer, OMAP_TIMER_ERRATA_I103_I767);
	__omap_dm_timer_enable_posted(timer);

Both dmtimers and watchdogs are impacted by this defect this patch only
implements the workaround for the dmtimer. Currently the watchdog driver
does not read the counter register and so no workaround is necessary.

Posted mode will be disabled for all OMAP2+ devices (including AM33xx)
using a GP timer as a clock-source timer to guarantee correct operation.
This is not necessary for OMAP24xx devices but the default clock-source
timer for OMAP24xx devices is the 32k-sync timer and not the GP timer
and so should not have any impact. This should be re-visited for future
devices if this errata is fixed.

Confirmed with Vaibhav Hiremath that this bug also impacts AM33xx
devices.

Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
bfd6d02
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi November 06, 2012
KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-…
…2012-4461)

On hosts without the XSAVE support unprivileged local user can trigger
oops similar to the one below by setting X86_CR4_OSXSAVE bit in guest
cr4 register using KVM_SET_SREGS ioctl and later issuing KVM_RUN
ioctl.

invalid opcode: 0000 [#2] SMP
Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables
...
Pid: 24935, comm: zoog_kvm_monito Tainted: G      D      3.2.0-3-686-pae
EIP: 0060:[<f8b9550c>] EFLAGS: 00210246 CPU: 0
EIP is at kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm]
EAX: 00000001 EBX: 000f387e ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: 00000000 EBP: ef5a0060 ESP: d7c63e70
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process zoog_kvm_monito (pid: 24935, ti=d7c62000 task=ed84a0c0
task.ti=d7c62000)
Stack:
 00000001 f70a1200 f8b940a9 ef5a0060 00000000 00200202 f8769009 00000000
 ef5a0060 000f387e eda5c020 8722f9c8 00015bae 00000000 ed84a0c0 ed84a0c0
 c12bf02d 0000ae80 ef7f8740 fffffffb f359b740 ef5a0060 f8b85dc1 0000ae80
Call Trace:
 [<f8b940a9>] ? kvm_arch_vcpu_ioctl_set_sregs+0x2fe/0x308 [kvm]
...
 [<c12bfb44>] ? syscall_call+0x7/0xb
Code: 89 e8 e8 14 ee ff ff ba 00 00 04 00 89 e8 e8 98 48 ff ff 85 c0 74
1e 83 7d 48 00 75 18 8b 85 08 07 00 00 31 c9 8b 95 0c 07 00 00 <0f> 01
d1 c7 45 48 01 00 00 00 c7 45 1c 01 00 00 00 0f ae f0 89
EIP: [<f8b9550c>] kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm] SS:ESP
0068:d7c63e70

QEMU first retrieves the supported features via KVM_GET_SUPPORTED_CPUID
and then sets them later. So guest's X86_FEATURE_XSAVE should be masked
out on hosts without X86_FEATURE_XSAVE, making kvm_set_cr4 with
X86_CR4_OSXSAVE fail. Userspaces that allow specifying guest cpuid with
X86_FEATURE_XSAVE even on hosts that do not support it, might be
susceptible to this attack from inside the guest as well.

Allow setting X86_CR4_OSXSAVE bit only if host has XSAVE support.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
6d1068b
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi October 09, 2012
mpe powerpc/xmon: Remove unused xmon_expect() & xmon_read_poll()
It looks like xmon_expect() was used for doing xmon over a modem (!?),
that code was dropped in 2005 in commit 51d3082 "Unify udbg (#2)".

Once xmon_expect() is gone xmon_read_poll() is unused, drop it too.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
eb1c2ab
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi October 09, 2012
mpe powerpc/xmon: Remove empty xmon_map_scc()
This has been empty since 2005, commit 51d3082 "Unify udbg (#2)".

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
08702c7
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi November 14, 2012
ipv4/ip_vti.c: VTI fix post-decryption forwarding
With the latest kernel there are two things that must be done post decryption
 so that the packet are forwarded.
 1. Remove the mark from the packet. This will cause the packet to not match
 the ipsec-policy again. However doing this causes the post-decryption check to
 fail also and the packet will get dropped. (cat /proc/net/xfrm_stat).
 2. Remove the sp association in the skbuff so that no policy check is done on
 the packet for VTI tunnels.

Due to #2 above we must now do a security-policy check in the vti rcv path
prior to resetting the mark in the skbuff.

Signed-off-by: Saurabh Mohan <saurabh.mohan@vyatta.com>
Reported-by: Ruben Herold <ruben@puettmann.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
b294200
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi November 07, 2012
Paul E. McKenney sched: Mark RCU reader in sched_show_task()
When sched_show_task() is invoked from try_to_freeze_tasks(), there is
no RCU read-side critical section, resulting in the following splat:

[  125.780730] ===============================
[  125.780766] [ INFO: suspicious RCU usage. ]
[  125.780804] 3.7.0-rc3+ #988 Not tainted
[  125.780838] -------------------------------
[  125.780875] /home/rafael/src/linux/kernel/sched/core.c:4497 suspicious rcu_dereference_check() usage!
[  125.780946]
[  125.780946] other info that might help us debug this:
[  125.780946]
[  125.781031]
[  125.781031] rcu_scheduler_active = 1, debug_locks = 0
[  125.781087] 4 locks held by s2ram/4211:
[  125.781120]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e2acf>] sysfs_write_file+0x3f/0x160
[  125.781233]  #1:  (s_active#94){.+.+.+}, at: [<ffffffff811e2b58>] sysfs_write_file+0xc8/0x160
[  125.781339]  #2:  (pm_mutex){+.+.+.}, at: [<ffffffff81090a81>] pm_suspend+0x81/0x230
[  125.781439]  #3:  (tasklist_lock){.?.?..}, at: [<ffffffff8108feed>] try_to_freeze_tasks+0x2cd/0x3f0
[  125.781543]
[  125.781543] stack backtrace:
[  125.781584] Pid: 4211, comm: s2ram Not tainted 3.7.0-rc3+ #988
[  125.781632] Call Trace:
[  125.781662]  [<ffffffff810a3c73>] lockdep_rcu_suspicious+0x103/0x140
[  125.781719]  [<ffffffff8107cf21>] sched_show_task+0x121/0x180
[  125.781770]  [<ffffffff8108ffb4>] try_to_freeze_tasks+0x394/0x3f0
[  125.781823]  [<ffffffff810903b5>] freeze_kernel_threads+0x25/0x80
[  125.781876]  [<ffffffff81090b65>] pm_suspend+0x165/0x230
[  125.781924]  [<ffffffff8108fa29>] state_store+0x99/0x100
[  125.781975]  [<ffffffff812f5867>] kobj_attr_store+0x17/0x20
[  125.782038]  [<ffffffff811e2b71>] sysfs_write_file+0xe1/0x160
[  125.782091]  [<ffffffff811667a6>] vfs_write+0xc6/0x180
[  125.782138]  [<ffffffff81166ada>] sys_write+0x5a/0xa0
[  125.782185]  [<ffffffff812ff6ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  125.782242]  [<ffffffff81669dd2>] system_call_fastpath+0x16/0x1b

This commit therefore adds the needed RCU read-side critical section.

Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
4e79752
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi September 18, 2012
Olaf Hering ata_piix: reenable MS Virtual PC guests
An earlier commit cd00608 ("ata_piix:
defer disks to the Hyper-V drivers by default") broke MS Virtual PC
guests. Hyper-V guests and Virtual PC guests have nearly identical DMI
info. As a result the driver does currently ignore the emulated hardware
in Virtual PC guests and defers the handling to hv_blkvsc. Since Virtual
PC does not offer paravirtualized drivers no disks will be found in the
guest.

One difference in the DMI info is the product version. This patch adds a
match for MS Virtual PC 2007 and "unignores" the emulated hardware.

This was reported for openSuSE 12.1 in bugzilla:
https://bugzilla.novell.com/show_bug.cgi?id=737532

Here is a detailed list of DMI info from example guests:

hwinfo --bios:

virtual pc guest:

  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "VS2005R2"
    Serial: "3178-9905-1533-4840-9282-0569-59"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "5.0"
    Serial: "3178-9905-1533-4840-9282-0569-59"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "5.0"
    Serial: "3178-9905-1533-4840-9282-0569-59"
    Asset Tag: "7188-3705-6309-9738-9645-0364-00"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)

win2k8 guest:

  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
    Asset Tag: "7076-9522-6699-1042-9501-1785-77"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)

win2k12 guest:

  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
    Asset Tag: "8374-0485-4557-6331-0620-5845-25"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
d990434
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi November 28, 2012
workqueue: exit rescuer_thread() as TASK_RUNNING
A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
412d32e
popcornmix popcornmix referenced this issue from a commit November 28, 2012
workqueue: exit rescuer_thread() as TASK_RUNNING
commit 412d32e upstream.

A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2ccea42
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi December 08, 2012
davem330 Merge branch 'tipc_net-next_v2' of git://git.kernel.org/pub/scm/linux…
…/kernel/git/paulg/linux

Paul Gortmaker says:

====================
Changes since v1:
	-get rid of essentially unused variable spotted by
	 Neil Horman (patch #2)

	-drop patch #3; defer it for 3.9 content, so Neil,
	 Jon and Ying can discuss its specifics at their
	 leisure while net-next is closed.  (It had no
	 direct dependencies to the rest of the series, and
	 was just an optimization)

	-fix indentation of accept() code directly in place
	 vs. forking it out to a separate function (was patch
	 #10, now patch #9).

Rebuilt and re-ran tests just to ensure nothing odd happened.

Original v1 text follows, updated pull information follows that.

           ---------

Here is another batch of TIPC changes.  The most interesting
thing is probably the non-blocking socket connect - I'm told
there were several users looking forward to seeing this.

Also there were some resource limitation changes that had
the right intent back in 2005, but were now apparently causing
needless limitations to people's real use cases; those have
been relaxed/removed.

There is a lockdep splat fix, but no need for a stable backport,
since it is virtually impossible to trigger in mainline; you
have to essentially modify code to force the probabilities
in your favour to see it.

The rest can largely be categorized as general cleanup of things
seen in the process of getting the above changes done.

Tested between 64 and 32 bit nodes with the test suite.  I've
also compile tested all the individual commits on the chain.

I'd originally figured on this queue not being ready for 3.8, but
the extended stabilization window of 3.7 has changed that.  On
the other hand, this can still be 3.9 material, if that simply
works better for folks - no problem for me to defer it to 2013.
If anyone spots any problems then I'll definitely defer it,
rather than rush a last minute respin.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>
ba50166
Stephen Warren swarren referenced this issue from a commit December 17, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi December 18, 2012
Linus Torvalds Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel…
…/git/s390/linux

Pull s390 update #2 from Martin Schwidefsky:
 "The main patch is the function measurement blocks extension for PCI to
  do performance statistics and help with debugging.  The other patch is
  a small cleanup in ccwdev.h."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/ccwdev: Include asm/schid.h.
  s390/pci: performance statistics and debug infrastructure
7b07786
Stephen Warren swarren referenced this issue from a commit December 23, 2012
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi December 20, 2012
memcg: don't register hotcpu notifier from ->css_alloc()
Commit 648bb56 ("cgroup: lock cgroup_mutex in cgroup_init_subsys()")
made cgroup_init_subsys() grab cgroup_mutex before invoking
->css_alloc() for the root css.  Because memcg registers hotcpu notifier
from ->css_alloc() for the root css, this introduced circular locking
dependency between cgroup_mutex and cpu hotplug.

Fix it by moving hotcpu notifier registration to a subsys initcall.

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  3.7.0-rc4-work+ #42 Not tainted
  -------------------------------------------------------
  bash/645 is trying to acquire lock:
   (cgroup_mutex){+.+.+.}, at: [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20

  but task is already holding lock:
   (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

 -> #1 (cpu_hotplug.lock){+.+.+.}:
         lock_acquire+0x97/0x1e0
         mutex_lock_nested+0x61/0x3b0
         get_online_cpus+0x3c/0x60
         rebuild_sched_domains_locked+0x1b/0x70
         cpuset_write_resmask+0x298/0x2c0
         cgroup_file_write+0x1ef/0x300
         vfs_write+0xa8/0x160
         sys_write+0x52/0xa0
         system_call_fastpath+0x16/0x1b

 -> #0 (cgroup_mutex){+.+.+.}:
         __lock_acquire+0x14ce/0x1d20
         lock_acquire+0x97/0x1e0
         mutex_lock_nested+0x61/0x3b0
         cgroup_lock+0x17/0x20
         cpuset_handle_hotplug+0x1b/0x560
         cpuset_update_active_cpus+0xe/0x10
         cpuset_cpu_inactive+0x47/0x50
         notifier_call_chain+0x66/0x150
         __raw_notifier_call_chain+0xe/0x10
         __cpu_notify+0x20/0x40
         _cpu_down+0x7e/0x2f0
         cpu_down+0x36/0x50
         store_online+0x5d/0xe0
         dev_attr_store+0x18/0x30
         sysfs_write_file+0xe0/0x150
         vfs_write+0xa8/0x160
         sys_write+0x52/0xa0
         system_call_fastpath+0x16/0x1b
  other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(cpu_hotplug.lock);
                                 lock(cgroup_mutex);
                                 lock(cpu_hotplug.lock);
    lock(cgroup_mutex);

   *** DEADLOCK ***

  5 locks held by bash/645:
   #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff8123bab8>] sysfs_write_file+0x48/0x150
   #1:  (s_active#42){.+.+.+}, at: [<ffffffff8123bb38>] sysfs_write_file+0xc8/0x150
   #2:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81079277>] cpu_hotplug_driver_lock+0x1
+7/0x20
   #3:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81093157>] cpu_maps_update_begin+0x17/0x20
   #4:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60

  stack backtrace:
  Pid: 645, comm: bash Not tainted 3.7.0-rc4-work+ #42
  Call Trace:
   print_circular_bug+0x28e/0x29f
   __lock_acquire+0x14ce/0x1d20
   lock_acquire+0x97/0x1e0
   mutex_lock_nested+0x61/0x3b0
   cgroup_lock+0x17/0x20
   cpuset_handle_hotplug+0x1b/0x560
   cpuset_update_active_cpus+0xe/0x10
   cpuset_cpu_inactive+0x47/0x50
   notifier_call_chain+0x66/0x150
   __raw_notifier_call_chain+0xe/0x10
   __cpu_notify+0x20/0x40
   _cpu_down+0x7e/0x2f0
   cpu_down+0x36/0x50
   store_online+0x5d/0xe0
   dev_attr_store+0x18/0x30
   sysfs_write_file+0xe0/0x150
   vfs_write+0xa8/0x160
   sys_write+0x52/0xa0
   system_call_fastpath+0x16/0x1b

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
154b454
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi December 21, 2012
ipv4: arp: fix a lockdep splat in arp_solicit()
Yan Burman reported following lockdep warning :

=============================================
[ INFO: possible recursive locking detected ]
3.7.0+ #24 Not tainted
---------------------------------------------
swapper/1/0 is trying to acquire lock:
  (&n->lock){++--..}, at: [<ffffffff8139f56e>] __neigh_event_send
+0x2e/0x2f0

but task is already holding lock:
  (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&n->lock);
   lock(&n->lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

4 locks held by swapper/1/0:
  #0:  (((&n->timer))){+.-...}, at: [<ffffffff8104b350>]
call_timer_fn+0x0/0x1c0
  #1:  (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit
+0x1d4/0x280
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81395400>]
dev_queue_xmit+0x0/0x5d0
  #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff813cb41e>]
ip_finish_output+0x13e/0x640

stack backtrace:
Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
Call Trace:
  <IRQ>  [<ffffffff8108c7ac>] validate_chain+0xdcc/0x11f0
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff81120565>] ? kmem_cache_free+0xe5/0x1c0
  [<ffffffff8108d570>] __lock_acquire+0x440/0xc30
  [<ffffffff813c3570>] ? inet_getpeer+0x40/0x600
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8108ddf5>] lock_acquire+0x95/0x140
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff81448d4b>] _raw_write_lock_bh+0x3b/0x50
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0
  [<ffffffff8139f99b>] neigh_resolve_output+0x16b/0x270
  [<ffffffff813cb62d>] ip_finish_output+0x34d/0x640
  [<ffffffff813cb41e>] ? ip_finish_output+0x13e/0x640
  [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
  [<ffffffff813cb9a0>] ip_output+0x80/0xf0
  [<ffffffff813ca368>] ip_local_out+0x28/0x80
  [<ffffffffa046f25a>] vxlan_xmit+0x66a/0xbec [vxlan]
  [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
  [<ffffffff81394a50>] ? skb_gso_segment+0x2b0/0x2b0
  [<ffffffff81449355>] ? _raw_spin_unlock_irqrestore+0x65/0x80
  [<ffffffff81394c57>] ? dev_queue_xmit_nit+0x207/0x270
  [<ffffffff813950c8>] dev_hard_start_xmit+0x298/0x5d0
  [<ffffffff813956f3>] dev_queue_xmit+0x2f3/0x5d0
  [<ffffffff81395400>] ? dev_hard_start_xmit+0x5d0/0x5d0
  [<ffffffff813f5788>] arp_xmit+0x58/0x60
  [<ffffffff813f59db>] arp_send+0x3b/0x40
  [<ffffffff813f6424>] arp_solicit+0x204/0x280
  [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
  [<ffffffff8139f515>] neigh_probe+0x45/0x70
  [<ffffffff813a1c10>] neigh_timer_handler+0x1a0/0x2a0
  [<ffffffff8104b3cf>] call_timer_fn+0x7f/0x1c0
  [<ffffffff8104b350>] ? detach_if_pending+0x120/0x120
  [<ffffffff8104b748>] run_timer_softirq+0x238/0x2b0
  [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
  [<ffffffff81043e51>] __do_softirq+0x101/0x280
  [<ffffffff814518cc>] call_softirq+0x1c/0x30
  [<ffffffff81003b65>] do_softirq+0x85/0xc0
  [<ffffffff81043a7e>] irq_exit+0x9e/0xc0
  [<ffffffff810264f8>] smp_apic_timer_interrupt+0x68/0xa0
  [<ffffffff8145122f>] apic_timer_interrupt+0x6f/0x80
  <EOI>  [<ffffffff8100a054>] ? mwait_idle+0xa4/0x1c0
  [<ffffffff8100a04b>] ? mwait_idle+0x9b/0x1c0
  [<ffffffff8100a6a9>] cpu_idle+0x89/0xe0
  [<ffffffff81441127>] start_secondary+0x1b2/0x1b6

Bug is from arp_solicit(), releasing the neigh lock after arp_send()
In case of vxlan, we eventually need to write lock a neigh lock later.

Its a false positive, but we can get rid of it without lockdep
annotations.

We can instead use neigh_ha_snapshot() helper.

Reported-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9650388
Olipro Olipro referenced this issue from a commit in Olipro/linux-RPi August 21, 2012
Pavel Emelyanov packet: Protect packet sk list with mutex (v2)
Change since v1:

* Fixed inuse counters access spotted by Eric

In patch eea68e2 (packet: Report socket mclist info via diag module) I've
introduced a "scheduling in atomic" problem in packet diag module -- the
socket list is traversed under rcu_read_lock() while performed under it sk
mclist access requires rtnl lock (i.e. -- mutex) to be taken.

[152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002
[152363.820573] 4 locks held by crtools/12517:
[152363.820581]  #0:  (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e
[152363.820613]  #1:  (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a
[152363.820644]  #2:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab
[152363.820693]  #3:  (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af

Similar thing was then re-introduced by further packet diag patches (fanount
mutex and pgvec mutex for rings) :(

Apart from being terribly sorry for the above, I propose to change the packet
sk list protection from spinlock to mutex. This lock currently protects two
modifications:

* sklist
* prot inuse counters

The sklist modifications can be just reprotected with mutex since they already
occur in a sleeping context. The inuse counters modifications are trickier -- the
__this_cpu_-s are used inside, thus requiring the caller to handle the potential
issues with contexts himself. Since packet sockets' counters are modified in two
places only (packet_create and packet_release) we only need to protect the context
from being preempted. BH disabling is not required in this case.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0fa7fa9
Olipro Olipro referenced this issue from a commit in Olipro/linux-RPi August 15, 2012
ARM: 7493/1: use generic unaligned.h
This moves ARM over to the asm-generic/unaligned.h header. This has the
benefit of better code generated especially for ARMv7 on gcc 4.7+
compilers.

As Arnd Bergmann, points out: The asm-generic version uses the "struct"
version for native-endian unaligned access and the "byteshift" version
for the opposite endianess. The current ARM version however uses the
"byteshift" implementation for both.

Thanks to Nicolas Pitre for the excellent analysis:

Test case:

int foo (int *x) { return get_unaligned(x); }
long long bar (long long *x) { return get_unaligned(x); }

With the current ARM version:

foo:
	ldrb	r3, [r0, #2]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
	ldrb	r1, [r0, #1]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
	ldrb	r2, [r0, #0]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
	mov	r3, r3, asl #16	@ tmp154, MEM[(const u8 *)x_1(D) + 2B],
	ldrb	r0, [r0, #3]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
	orr	r3, r3, r1, asl #8	@, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B],
	orr	r3, r3, r2	@ tmp157, tmp155, MEM[(const u8 *)x_1(D)]
	orr	r0, r3, r0, asl #24	@,, tmp157, MEM[(const u8 *)x_1(D) + 3B],
	bx	lr	@

bar:
	stmfd	sp!, {r4, r5, r6, r7}	@,
	mov	r2, #0	@ tmp184,
	ldrb	r5, [r0, #6]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B]
	ldrb	r4, [r0, #5]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B]
	ldrb	ip, [r0, #2]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
	ldrb	r1, [r0, #4]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B]
	mov	r5, r5, asl #16	@ tmp175, MEM[(const u8 *)x_1(D) + 6B],
	ldrb	r7, [r0, #1]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
	orr	r5, r5, r4, asl #8	@, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B],
	ldrb	r6, [r0, #7]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B]
	orr	r5, r5, r1	@ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B]
	ldrb	r4, [r0, #0]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
	mov	ip, ip, asl #16	@ tmp188, MEM[(const u8 *)x_1(D) + 2B],
	ldrb	r1, [r0, #3]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
	orr	ip, ip, r7, asl #8	@, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B],
	orr	r3, r5, r6, asl #24	@,, tmp178, MEM[(const u8 *)x_1(D) + 7B],
	orr	ip, ip, r4	@ tmp191, tmp189, MEM[(const u8 *)x_1(D)]
	orr	ip, ip, r1, asl #24	@, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B],
	mov	r1, r3	@,
	orr	r0, r2, ip	@ tmp171, tmp184, tmp194
	ldmfd	sp!, {r4, r5, r6, r7}
	bx	lr

In both cases the code is slightly suboptimal.  One may wonder why
wasting r2 with the constant 0 in the second case for example.  And all
the mov's could be folded in subsequent orr's, etc.

Now with the asm-generic version:

foo:
	ldr	r0, [r0, #0]	@ unaligned	@,* x
	bx	lr	@

bar:
	mov	r3, r0	@ x, x
	ldr	r0, [r0, #0]	@ unaligned	@,* x
	ldr	r1, [r3, #4]	@ unaligned	@,
	bx	lr	@

This is way better of course, but only because this was compiled for
ARMv7. In this case the compiler knows that the hardware can do
unaligned word access.  This isn't that obvious for foo(), but if we
remove the get_unaligned() from bar as follows:

long long bar (long long *x) {return *x; }

then the resulting code is:

bar:
	ldmia	r0, {r0, r1}	@ x,,
	bx	lr	@

So this proves that the presumed aligned vs unaligned cases does have
influence on the instructions the compiler may use and that the above
unaligned code results are not just an accident.

Still... this isn't fully conclusive without at least looking at the
resulting assembly fron a pre ARMv6 compilation.  Let's see with an
ARMv5 target:

foo:
	ldrb	r3, [r0, #0]	@ zero_extendqisi2	@ tmp139,* x
	ldrb	r1, [r0, #1]	@ zero_extendqisi2	@ tmp140,
	ldrb	r2, [r0, #2]	@ zero_extendqisi2	@ tmp143,
	ldrb	r0, [r0, #3]	@ zero_extendqisi2	@ tmp146,
	orr	r3, r3, r1, asl #8	@, tmp142, tmp139, tmp140,
	orr	r3, r3, r2, asl #16	@, tmp145, tmp142, tmp143,
	orr	r0, r3, r0, asl #24	@,, tmp145, tmp146,
	bx	lr	@

bar:
	stmfd	sp!, {r4, r5, r6, r7}	@,
	ldrb	r2, [r0, #0]	@ zero_extendqisi2	@ tmp139,* x
	ldrb	r7, [r0, #1]	@ zero_extendqisi2	@ tmp140,
	ldrb	r3, [r0, #4]	@ zero_extendqisi2	@ tmp149,
	ldrb	r6, [r0, #5]	@ zero_extendqisi2	@ tmp150,
	ldrb	r5, [r0, #2]	@ zero_extendqisi2	@ tmp143,
	ldrb	r4, [r0, #6]	@ zero_extendqisi2	@ tmp153,
	ldrb	r1, [r0, #7]	@ zero_extendqisi2	@ tmp156,
	ldrb	ip, [r0, #3]	@ zero_extendqisi2	@ tmp146,
	orr	r2, r2, r7, asl #8	@, tmp142, tmp139, tmp140,
	orr	r3, r3, r6, asl #8	@, tmp152, tmp149, tmp150,
	orr	r2, r2, r5, asl #16	@, tmp145, tmp142, tmp143,
	orr	r3, r3, r4, asl #16	@, tmp155, tmp152, tmp153,
	orr	r0, r2, ip, asl #24	@,, tmp145, tmp146,
	orr	r1, r3, r1, asl #24	@,, tmp155, tmp156,
	ldmfd	sp!, {r4, r5, r6, r7}
	bx	lr

Compared to the initial results, this is really nicely optimized and I
couldn't do much better if I were to hand code it myself.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
d25c881
Olipro Olipro referenced this issue from a commit in Olipro/linux-RPi September 05, 2012
net: qdisc busylock needs lockdep annotations
It seems we need to provide ability for stacked devices
to use specific lock_class_key for sch->busylock

We could instead default l2tpeth tx_queue_len to 0 (no qdisc), but
a user might use a qdisc anyway.

(So same fixes are probably needed on non LLTX stacked drivers)

Noticed while stressing L2TPV3 setup :

======================================================
 [ INFO: possible circular locking dependency detected ]
 3.6.0-rc3+ #788 Not tainted
 -------------------------------------------------------
 netperf/4660 is trying to acquire lock:
  (l2tpsock){+.-...}, at: [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]

 but task is already holding lock:
  (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&(&sch->busylock)->rlock){+.-...}:
        [<ffffffff810a5df0>] lock_acquire+0x90/0x200
        [<ffffffff817499fc>] _raw_spin_lock_irqsave+0x4c/0x60
        [<ffffffff81074872>] __wake_up+0x32/0x70
        [<ffffffff8136d39e>] tty_wakeup+0x3e/0x80
        [<ffffffff81378fb3>] pty_write+0x73/0x80
        [<ffffffff8136cb4c>] tty_put_char+0x3c/0x40
        [<ffffffff813722b2>] process_echoes+0x142/0x330
        [<ffffffff813742ab>] n_tty_receive_buf+0x8fb/0x1230
        [<ffffffff813777b2>] flush_to_ldisc+0x142/0x1c0
        [<ffffffff81062818>] process_one_work+0x198/0x760
        [<ffffffff81063236>] worker_thread+0x186/0x4b0
        [<ffffffff810694d3>] kthread+0x93/0xa0
        [<ffffffff81753e24>] kernel_thread_helper+0x4/0x10

 -> #0 (l2tpsock){+.-...}:
        [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10
        [<ffffffff810a5df0>] lock_acquire+0x90/0x200
        [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50
        [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
        [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
        [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70
        [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290
        [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00
        [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890
        [<ffffffff815db019>] ip_output+0x59/0xf0
        [<ffffffff815da36d>] ip_local_out+0x2d/0xa0
        [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680
        [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60
        [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30
        [<ffffffff815f5300>] tcp_push_one+0x30/0x40
        [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040
        [<ffffffff81614495>] inet_sendmsg+0x125/0x230
        [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0
        [<ffffffff81579ece>] sys_sendto+0xfe/0x130
        [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b
  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&(&sch->busylock)->rlock);
                                lock(l2tpsock);
                                lock(&(&sch->busylock)->rlock);
   lock(l2tpsock);

  *** DEADLOCK ***

 5 locks held by netperf/4660:
  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e581c>] tcp_sendmsg+0x2c/0x1040
  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da3e0>] ip_queue_xmit+0x0/0x680
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9ac5>] ip_finish_output+0x135/0x890
  #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595820>] dev_queue_xmit+0x0/0xe00
  #4:  (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00

 stack backtrace:
 Pid: 4660, comm: netperf Not tainted 3.6.0-rc3+ #788
 Call Trace:
  [<ffffffff8173dbf8>] print_circular_bug+0x1fb/0x20c
  [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10
  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
  [<ffffffff810a3f44>] ? __lock_acquire+0x2e4/0x1b10
  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
  [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
  [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50
  [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
  [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
  [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
  [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70
  [<ffffffff81594e0e>] ? dev_hard_start_xmit+0x5e/0xa70
  [<ffffffff81595961>] ? dev_queue_xmit+0x141/0xe00
  [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290
  [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00
  [<ffffffff81595820>] ? dev_hard_start_xmit+0xa70/0xa70
  [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890
  [<ffffffff815d9ac5>] ? ip_finish_output+0x135/0x890
  [<ffffffff815db019>] ip_output+0x59/0xf0
  [<ffffffff815da36d>] ip_local_out+0x2d/0xa0
  [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680
  [<ffffffff815da3e0>] ? ip_local_out+0xa0/0xa0
  [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60
  [<ffffffff815fa25e>] ? tcp_md5_do_lookup+0x18e/0x1a0
  [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30
  [<ffffffff815f5300>] tcp_push_one+0x30/0x40
  [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040
  [<ffffffff81614495>] inet_sendmsg+0x125/0x230
  [<ffffffff81614370>] ? inet_create+0x6b0/0x6b0
  [<ffffffff8157e6e2>] ? sock_update_classid+0xc2/0x3b0
  [<ffffffff8157e750>] ? sock_update_classid+0x130/0x3b0
  [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0
  [<ffffffff81162579>] ? fget_light+0x3f9/0x4f0
  [<ffffffff81579ece>] sys_sendto+0xfe/0x130
  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
  [<ffffffff8174a0b0>] ? _raw_spin_unlock_irq+0x30/0x50
  [<ffffffff810757e3>] ? finish_task_switch+0x83/0xf0
  [<ffffffff810757a6>] ? finish_task_switch+0x46/0xf0
  [<ffffffff81752cb7>] ? sysret_check+0x1b/0x56
  [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
23d3b8b
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 11, 2013
mm: mmap: annotate vm_lock_anon_vma locking properly for lockdep
Commit 5a50508 ("mm/rmap: Convert the struct anon_vma::mutex to an
rwsem") turned anon_vma mutex to rwsem.

However, the properly annotated nested locking in mm_take_all_locks()
has been converted from

	mutex_lock_nest_lock(&anon_vma->root->mutex, &mm->mmap_sem);

to

	down_write(&anon_vma->root->rwsem);

which is incomplete, and causes the false positive report from lockdep
below.

Annotate the fact that mmap_sem is used as an outter lock to serialize
taking of all the anon_vma rwsems at once no matter the order, using the
down_write_nest_lock() primitive.

This patch fixes this lockdep report:

 =============================================
 [ INFO: possible recursive locking detected ]
 3.8.0-rc2-00036-g5f73896 #171 Not tainted
 ---------------------------------------------
 qemu-kvm/2315 is trying to acquire lock:
  (&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0

 but task is already holding lock:
  (&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&anon_vma->rwsem);
   lock(&anon_vma->rwsem);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 4 locks held by qemu-kvm/2315:
  #0:  (&mm->mmap_sem){++++++}, at: do_mmu_notifier_register+0xfc/0x170
  #1:  (mm_all_locks_mutex){+.+...}, at: mm_take_all_locks+0x36/0x1b0
  #2:  (&mapping->i_mmap_mutex){+.+...}, at: mm_take_all_locks+0xc9/0x1b0
  #3:  (&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0

 stack backtrace:
 Pid: 2315, comm: qemu-kvm Not tainted 3.8.0-rc2-00036-g5f73896 #171
 Call Trace:
   print_deadlock_bug+0xf2/0x100
   validate_chain+0x4f6/0x720
   __lock_acquire+0x359/0x580
   lock_acquire+0x121/0x190
   down_write+0x3f/0x70
   mm_take_all_locks+0x149/0x1b0
   do_mmu_notifier_register+0x68/0x170
   mmu_notifier_register+0xe/0x10
   kvm_create_vm+0x22b/0x330 [kvm]
   kvm_dev_ioctl+0xf8/0x1a0 [kvm]
   do_vfs_ioctl+0x9d/0x350
   sys_ioctl+0x91/0xb0
   system_call_fastpath+0x16/0x1b

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mel Gorman <mel@csn.ul.ie>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
572043c
Stephen Warren swarren referenced this issue from a commit January 14, 2013
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi December 25, 2012
tty: don't deadlock while flushing workqueue
Since commit 89c8d91 ("tty: localise the lock") I see a dead lock
in one of my dummy_hcd + g_nokia test cases. The first run was usually
okay, the second often resulted in a splat by lockdep and the third was
usually a dead lock.
Lockdep complained about tty->hangup_work and tty->legacy_mutex taken
both ways:
| ======================================================
| [ INFO: possible circular locking dependency detected ]
| 3.7.0-rc6+ #204 Not tainted
| -------------------------------------------------------
| kworker/2:1/35 is trying to acquire lock:
|  (&tty->legacy_mutex){+.+.+.}, at: [<c14051e6>] tty_lock_nested+0x36/0x80
|
| but task is already holding lock:
|  ((&tty->hangup_work)){+.+...}, at: [<c104f6e4>] process_one_work+0x124/0x5e0
|
| which lock already depends on the new lock.
|
| the existing dependency chain (in reverse order) is:
|
| -> #2 ((&tty->hangup_work)){+.+...}:
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c104d82d>] flush_work+0x3d/0x240
|        [<c12e6986>] tty_ldisc_flush_works+0x16/0x30
|        [<c12e7861>] tty_ldisc_release+0x21/0x70
|        [<c12e0dfc>] tty_release+0x35c/0x470
|        [<c1105e28>] __fput+0xd8/0x270
|        [<c1105fcd>] ____fput+0xd/0x10
|        [<c1051dd9>] task_work_run+0xb9/0xf0
|        [<c1002a51>] do_notify_resume+0x51/0x80
|        [<c140550a>] work_notifysig+0x35/0x3b
|
| -> #1 (&tty->legacy_mutex/1){+.+...}:
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c140276c>] mutex_lock_nested+0x6c/0x2f0
|        [<c14051e6>] tty_lock_nested+0x36/0x80
|        [<c1405279>] tty_lock_pair+0x29/0x70
|        [<c12e0bb8>] tty_release+0x118/0x470
|        [<c1105e28>] __fput+0xd8/0x270
|        [<c1105fcd>] ____fput+0xd/0x10
|        [<c1051dd9>] task_work_run+0xb9/0xf0
|        [<c1002a51>] do_notify_resume+0x51/0x80
|        [<c140550a>] work_notifysig+0x35/0x3b
|
| -> #0 (&tty->legacy_mutex){+.+.+.}:
|        [<c107f3c9>] __lock_acquire+0x1189/0x16a0
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c140276c>] mutex_lock_nested+0x6c/0x2f0
|        [<c14051e6>] tty_lock_nested+0x36/0x80
|        [<c140523f>] tty_lock+0xf/0x20
|        [<c12df8e4>] __tty_hangup+0x54/0x410
|        [<c12dfcb2>] do_tty_hangup+0x12/0x20
|        [<c104f763>] process_one_work+0x1a3/0x5e0
|        [<c104fec9>] worker_thread+0x119/0x3a0
|        [<c1055084>] kthread+0x94/0xa0
|        [<c140ca37>] ret_from_kernel_thread+0x1b/0x28
|
|other info that might help us debug this:
|
|Chain exists of:
|  &tty->legacy_mutex --> &tty->legacy_mutex/1 --> (&tty->hangup_work)
|
| Possible unsafe locking scenario:
|
|       CPU0                    CPU1
|       ----                    ----
|  lock((&tty->hangup_work));
|                               lock(&tty->legacy_mutex/1);
|                               lock((&tty->hangup_work));
|  lock(&tty->legacy_mutex);
|
| *** DEADLOCK ***

Before the path mentioned tty_ldisc_release() look like this:

|	tty_ldisc_halt(tty);
|	tty_ldisc_flush_works(tty);
|	tty_lock();

As it can be seen, it first flushes the workqueue and then grabs the
tty_lock. Now we grab the lock first:

|	tty_lock_pair(tty, o_tty);
|	tty_ldisc_halt(tty);
|	tty_ldisc_flush_works(tty);

so lockdep's complaint seems valid.

The earlier version of this patch took the ldisc_mutex since the other
user of tty_ldisc_flush_works() (tty_set_ldisc()) did this.
Peter Hurley then said that it is should not be requried. Since it
wasn't done earlier, I dropped this part.
The code under tty_ldisc_kill() was executed earlier with the tty lock
taken so it is taken again.

I was able to reproduce the deadlock on v3.8-rc1, this patch fixes the
problem in my testcase. I didn't notice any problems so far.

Cc: Alan Cox <alan@linux.intel.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
852e4a8
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 22, 2013
davem330 Merge branch 'legacy-isa-delete' of git://git.kernel.org/pub/scm/linu…
…x/kernel/git/paulg/linux

Paul Gortmaker says:

====================
The Ethernet-HowTo was maintained for roughly 10 years, from 1993 to 2003.
Fortunately sane hardware probing and auto detection (via PCI and ISA/PnP)
largely made the document a relic of the past, hence it being abandoned
a decade ago.

However, there is one last useful thing that we can extract from the
effort made in maintaining that document.  We can use it to guide us
with respect to what rare, experimental and/or super ancient 10Mbit
ISA drivers don't make sense to maintain in-tree anymore.

Nobody will argue that ISA is obsolete.  Availability went away at about
the time Pentium3 motherboards moved from 500MHz Slot1/SECC processors
to the green 500MHz Socket 370 Pentium3 chips, at the turn of the century.

In theory, it is possible that someone could still be running one of these
12+ year old P3 machines and want 3.9+ bleeding edge kernels (but unlikely).
In light of the above (remote) possibility, we can defer the removal of some
ISA network drivers that were highly popular and well tested.  Typically
that means the stuff more from the mid to late '90s, some with ISA PnP
support, like the 3c509, the wd/SMC 8390 based stuff, PCnet/lance etc.

But a lot of other drivers, typically from the early 1990s were for rare
hardware, and experimental (to the point of requiring a cron job that would
do a test ping, and then ifconfig down/up and/or a rmmod/insmod!).  And
some of these drivers (znet, and lp486e to name two) are physically tied
to platforms with on motherboard ethernet -- of 486 machines that date
from the early 1990s and can only have single digit amounts of memory.

What I'd like to achieve here with this series, is to get rid of those old
drivers that are no longer being used.  In an earlier discussion where
I'd proposed deleting a single driver, Alan suggested we instead dump
all the historical stuff in one go, to make it "...immediately obvious
where the break point is..."[1] and that it was "perfectly reasonable it
(and a pile of other ISA cards) ought to be shown the door"[2].  So that
is the goal here - make a clear line in the sand where the really ancient
stuff finally gets kicked to the curb.

Two old parallel port drivers are considered for removal here as well,
since in early 386/486 ISA machines, the parallel port was typically found
with the UARTS on the multi-I/O ISA controller card.  These drivers also date
from the early 1990's; parallel ports are no longer found on modern boards,
and their performance was not even capable of 10% of 10Mbit bandwidth.

Allow me a preemptive justification against the inevitable comments from
well meaning bystanders who suggest "why not just leave all this alone?".
Dead drivers cost us all if they are left in tree.  If you think that
is false, then please first consider:

-every time you type "git status", you are checking to see if modifications
 have been made by you to all that dead code.

-every time you type "git grep <regex>" you are searching through files
 which contain that dead code that simply does not interest you.

-every time you build a "allyesconfig" and an "allmodconfig" (don't tell
 me you skip this step before submitting your changes to a maintainer),
 you waste CPU cycles building this dead code.

-every time there is a tree wide API change, or cleanup, or file relocation,
 we pay the cost of updating dead code, or moving dead code.

-daily regression tests (take linux-next as the most transparent
 example) spend time building (and possibly running) this dead code.

-hard working people who regularly run auditing tools looking for lurking
 bugs (sparse/coverity/smatch/coccinelle) are wasting time checking for,
 and fixing bugs in this dead code.

This last one is key.  Please take a look at the git history for the
files that are proposed for removal here.  Look at the git history for
any one of them ("git whatchanged --follow drivers/net/.../driver.c")
Mentally sort the changes into two bins -- (1) the robotic tree-wide
changes, and (2) the "look I found a real run-time bug while using this"
category.  You will see that category #2 is essentially empty.

Further to that, realize that drivers don't simply disappear.  We are
not operating in the binary-only distribution space like other OS.  All
these drivers remain in the git history forever.  If a person is an
enthusiast for extreme legacy hardware, they are probably already
customizing their kernel source and building it themselves to support
such systems.  Also keep in mind that they could still build the 3.8
kernel exactly as-is, and run it (or a 3.8.x stable variant of it) for
several more years if they were really determined to cling to these old
experimental ISA drivers for some reason.

In summary, I hope that folks can be pragmatic about this, and not
get swept up in nostalgia.  Ask yourself whether it is realistic to
expect a person would have a genuine use case where they would
need to build a 3.9+ modern kernel and install it on some legacy hardware
that has no option but to absolutely _require_ one of the drivers
that are deleted here.

The following series was created with --irreversible-delete for
ease of review (it skips showing the content of files that are
deleted); however the complete patches can be pulled as per below.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
930d52c
Stephen Warren swarren referenced this issue from a commit January 28, 2013
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 22, 2013
soreuseport: TCP/IPv4 implementation
Allow multiple listener sockets to bind to the same port.

Motivation for soresuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have it's own listener socket.  This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets.  We have seen the  disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reusport the distribution is
uniform.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
da5e363
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 22, 2013
soreuseport: TCP/IPv6 implementation
Motivation for soreuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have it's own listener socket.  This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets.  We have seen the  disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reusport the distribution is
uniform.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5ba2495
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 23, 2013
davem330 Merge branch 'soreuseport'
Tom Herbert says:

====================
This series implements so_reuseport (SO_REUSEPORT socket option) for
TCP and UDP.  For TCP, so_reuseport allows multiple listener sockets
to be bound to the same port.  In the case of UDP, so_reuseport allows
multiple sockets to bind to the same port.  To prevent port hijacking
all sockets bound to the same port using so_reuseport must have the
same uid.  Received packets are distributed to multiple sockets bound
to the same port using a 4-tuple hash.

The motivating case for so_resuseport in TCP would be something like
a web server binding to port 80 running with multiple threads, where
each thread might have it's own listener socket.  This could be done
as an alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets.  We have seen the  disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reusport the distribution is
uniform.

The TCP implementation has a problem in that the request sockets for a
listener are attached to a listener socket.  If a SYN is received, a
listener socket is chosen and request structure is created (SYN-RECV
state).  If the subsequent ack in 3WHS does not match the same port
by so_reusport, the connection state is not found (reset) and the
request structure is orphaned.  This scenario would occur when the
number of listener sockets bound to a port changes (new ones are
added, or old ones closed).  We are looking for a solution to this,
maybe allow multiple sockets to share the same request table...

The motivating case for so_reuseport in UDP would be something like a
DNS server.  An alternative would be to recv on the same socket from
multiple threads.  As in the case of TCP, the load across these threads
tends to be disproportionate and we also see a lot of contection on
the socket lock.  Note that SO_REUSEADDR already allows multiple UDP
sockets to bind to the same port, however there is no provision to
prevent hijacking and nothing to distribute packets across all the
sockets sharing the same bound port.  This patch does not change the
semantics of SO_REUSEADDR, but provides usable functionality of it
for unicast.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
c617f39
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 08, 2013
efi, x86: Pass a proper identity mapping in efi_call_phys_prelog
Update efi_call_phys_prelog to install an identity mapping of all available
memory.  This corrects a bug on very large systems with more then 512 GB in
which bios would not be able to access addresses above not in the mapping.

The result is a crash that looks much like this.

BUG: unable to handle kernel paging request at 000000effd870020
IP: [<0000000078bce331>] 0x78bce330
PGD 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU 0
Pid: 0, comm: swapper/0 Tainted: G        W    3.8.0-rc1-next-20121224-medusa_ntz+ #2 Intel Corp. Stoutland Platform
RIP: 0010:[<0000000078bce331>]  [<0000000078bce331>] 0x78bce330
RSP: 0000:ffffffff81601d28  EFLAGS: 00010006
RAX: 0000000078b80e18 RBX: 0000000000000004 RCX: 0000000000000004
RDX: 0000000078bcf958 RSI: 0000000000002400 RDI: 8000000000000000
RBP: 0000000078bcf760 R08: 000000effd870000 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000000000c3 R12: 0000000000000030
R13: 000000effd870000 R14: 0000000000000000 R15: ffff88effd870000
FS:  0000000000000000(0000) GS:ffff88effe400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000effd870020 CR3: 000000000160c000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper/0 (pid: 0, threadinfo ffffffff81600000, task ffffffff81614400)
Stack:
 0000000078b80d18 0000000000000004 0000000078bced7b ffff880078b81fff
 0000000000000000 0000000000000082 0000000078bce3a8 0000000000002400
 0000000060000202 0000000078b80da0 0000000078bce45d ffffffff8107cb5a
Call Trace:
 [<ffffffff8107cb5a>] ? on_each_cpu+0x77/0x83
 [<ffffffff8102f4eb>] ? change_page_attr_set_clr+0x32f/0x3ed
 [<ffffffff81035946>] ? efi_call4+0x46/0x80
 [<ffffffff816c5abb>] ? efi_enter_virtual_mode+0x1f5/0x305
 [<ffffffff816aeb24>] ? start_kernel+0x34a/0x3d2
 [<ffffffff816ae5ed>] ? repair_env_string+0x60/0x60
 [<ffffffff816ae2be>] ? x86_64_start_reservations+0xba/0xc1
 [<ffffffff816ae120>] ? early_idt_handlers+0x120/0x120
 [<ffffffff816ae419>] ? x86_64_start_kernel+0x154/0x163
Code:  Bad RIP value.
RIP  [<0000000078bce331>] 0x78bce330
 RSP <ffffffff81601d28>
CR2: 000000effd870020
---[ end trace ead828934fef5eab ]---

Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
b8f2c21
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi November 23, 2012
[SCSI] qla4xxx: Fix MBOX intr switching from polling to intr mode for…
… ISP83XX

Issue:
Mailbox command timed out after switching from polling mode to interrupt mode.

Events:-
 1. Mailbox interrupts are disabled
 2. FW generates AEN and at same time driver enables Mailbox Interrupt
 3. Driver issues new mailbox to Firmware

In above case driver will not get AEN interrupts generated by FW in step #2 as
FW generated this AEN when interrupts are disabled. During the same time
driver enabled the mailbox interrupt, so driver will not poll for interrupt.
Driver will never process AENs generated in step #2 and issues new mailbox to
FW, but now FW is not able to post mailbox completion as AENs generated before
are not processed by driver.

Fix:
Enable Mailbox / AEN interrupts before initializing FW in case of ISP83XX.
This will make sure we process all Mailbox and AENs in interrupt mode.

Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
5c19b92
Stephen Warren swarren referenced this issue from a commit February 08, 2013
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 29, 2013
Andre Guedes Bluetooth: Reduce critical section in sco_conn_ready
This patch reduces the critical section protected by sco_conn_lock in
sco_conn_ready function. The lock is acquired only when it is really
needed.

This patch fixes the following lockdep warning which is generated
when the host terminates a SCO connection.

Today, this warning is a false positive. There is no way those
two threads reported by lockdep are running at the same time since
hdev->workqueue (where rx_work is queued) is single-thread. However,
if somehow this behavior is changed in future, we will have a
potential deadlock.

======================================================
[ INFO: possible circular locking dependency detected ]
3.8.0-rc1+ #7 Not tainted
-------------------------------------------------------
kworker/u:1H/1018 is trying to acquire lock:
 (&(&conn->lock)->rlock){+.+...}, at: [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]

but task is already holding lock:
 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}:
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa003436e>] sco_connect_cfm+0xbe/0x350 [bluetooth]
       [<ffffffffa0015d6c>] hci_event_packet+0xd3c/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

-> #0 (&(&conn->lock)->rlock){+.+...}:
       [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
       [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
       [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
       [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
       [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
                               lock(&(&conn->lock)->rlock);
                               lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
  lock(&(&conn->lock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:1H/1018:
 #0:  (hdev->name#2){.+.+.+}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #1:  ((&hdev->rx_work)){+.+.+.}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #2:  (&hdev->lock){+.+.+.}, at: [<ffffffffa000fbe9>] hci_disconn_complete_evt.isra.54+0x59/0x3c0 [bluetooth]
 #3:  (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

stack backtrace:
Pid: 1018, comm: kworker/u:1H Not tainted 3.8.0-rc1+ #7
Call Trace:
 [<ffffffff813e92f9>] print_circular_bug+0x1fb/0x20c
 [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
 [<ffffffff81083011>] lock_acquire+0xb1/0xe0
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
 [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
 [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
 [<ffffffffa000fbd0>] ? hci_disconn_complete_evt.isra.54+0x40/0x3c0 [bluetooth]
 [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
 [<ffffffff81202e90>] ? __dynamic_pr_debug+0x80/0x90
 [<ffffffff8133ff7d>] ? kfree_skb+0x2d/0x40
 [<ffffffffa0021644>] ? hci_send_to_monitor+0x1a4/0x1c0 [bluetooth]
 [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104fdc1>] ? worker_thread+0x51/0x3e0
 [<ffffffffa0004450>] ? hci_tx_work+0x800/0x800 [bluetooth]
 [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
 [<ffffffff8104fd70>] ? busy_worker_rebind_fn+0x100/0x100
 [<ffffffff81056021>] kthread+0xd1/0xe0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0
 [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
4052808
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi February 02, 2013
davem330 Merge branch 'delete-wanrouter' of git://git.kernel.org/pub/scm/linux…
…/kernel/git/paulg/linux

Paul Gortmaker says:

====================
The removal of wanrouter code was originally listed in the (now
gone) feature removal file since May 2012, and an RFC of the
deletion was posted[1] in late 2012.  The overall concept was given
an OK, but defconfig contamination, build failures, etc. meant that
it didn't quite make it into mainline for 3.8.

Since that time, Dan discovered (via code audit) a runtime bug that
proves nobody has been using this for over four years[2].  With that
new information, I think it makes sense for someone to follow through
on Joe's original RFC and get this done for the 3.9 release.

In addition to resolving the build failures of the RFC by keeping
stub headers, this also splits the change into two parts, just like
the token ring removal did.  Part #1 decouples the mainline kernel
from the expired subsystem, and part #2 does the large scale
deletion of the subsystem content.

The advantage of the above, is that a "git blame" will never lead
you to a 4000+ line deletion commit.  The large scale deletion will
never show up in a "git blame" and hence the same advantages that we
get from the "--irreversible-delete" in the review stage of "git
format-patch" are also embedded into the git history itself.  This
may seem like a moot point to some, but for those who spend a
considerable amount of time data mining in the git history, this is
probably worth doing.

I have done build tests of all[mod/yes]config for both the stage 1
(Makefile and Kconfig) and stage 2 (full driver delete) as a sanity
check, and the issues with the previously posted RFC should be gone.

Speaking of "--irreversible-delete" -- these patches were created
with that option, so if you want to use them locally, you are going
to have to pull (location below) the content instead of doing a
"git am" of the mailed out content.

[1] http://patchwork.ozlabs.org/patch/198794/
[2] http://www.spinics.net/lists/netdev/msg218670.html
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
33397a7
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi January 30, 2013
peterhurley pty: Fix BUG()s when ptmx_open() errors out
If pmtx_open() fails to get a slave inode or fails the pty_open(),
the tty is released as part of the error cleanup. As evidenced by the
first BUG stacktrace below, pty_close() assumes that the linked pty has
a valid, initialized inode* stored in driver_data.

Also, as evidenced by the second BUG stacktrace below, pty_unix98_shutdown()
assumes that the master pty's driver_data has been initialized.

1) Fix the invalid assumption in pty_close().
2) Initialize driver_data immediately so proper devpts fs cleanup occurs.

Fixes this BUG:

[  815.868844] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  815.869018] IP: [<ffffffff81207bcc>] devpts_pty_kill+0x1c/0xa0
[  815.869190] PGD 7c775067 PUD 79deb067 PMD 0
[  815.869315] Oops: 0000 [#1] PREEMPT SMP
[  815.869443] Modules linked in: kvm_intel kvm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi microcode snd_rawmidi psmouse serio_raw snd_seq_midi_event snd_seq snd_timer$
[  815.870025] CPU 0
[  815.870143] Pid: 27819, comm: stress_test_tty Tainted: G        W    3.8.0-next-20130125+ttypatch-2-xeon #2 Bochs Bochs
[  815.870386] RIP: 0010:[<ffffffff81207bcc>]  [<ffffffff81207bcc>] devpts_pty_kill+0x1c/0xa0
[  815.870540] RSP: 0018:ffff88007d3e1ac8  EFLAGS: 00010282
[  815.870661] RAX: ffff880079c20800 RBX: 0000000000000000 RCX: 0000000000000000
[  815.870804] RDX: ffff880079c209a8 RSI: 0000000000000286 RDI: 0000000000000000
[  815.870933] RBP: ffff88007d3e1ae8 R08: 0000000000000000 R09: 0000000000000000
[  815.871078] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88007bfb7e00
[  815.871209] R13: 0000000000000005 R14: ffff880079c20c00 R15: ffff880079c20c00
[  815.871343] FS:  00007f2e86206700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[  815.871495] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  815.871617] CR2: 0000000000000028 CR3: 000000007ae56000 CR4: 00000000000006f0
[  815.871752] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  815.871902] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  815.872012] Process stress_test_tty (pid: 27819, threadinfo ffff88007d3e0000, task ffff88007c874530)
[  815.872012] Stack:
[  815.872012]  ffff88007bfb7e00 ffff880079c20c00 ffff88007bfb7e00 0000000000000005
[  815.872012]  ffff88007d3e1b08 ffffffff81417be7 ffff88007caa9bd8 ffff880079c20800
[  815.872012]  ffff88007d3e1bc8 ffffffff8140e5f8 0000000000000000 0000000000000000
[  815.872012] Call Trace:
[  815.872012]  [<ffffffff81417be7>] pty_close+0x157/0x170
[  815.872012]  [<ffffffff8140e5f8>] tty_release+0x138/0x580
[  815.872012]  [<ffffffff816d29f3>] ? _raw_spin_lock+0x23/0x30
[  815.872012]  [<ffffffff816d267a>] ? _raw_spin_unlock+0x1a/0x40
[  815.872012]  [<ffffffff816d0178>] ? __mutex_unlock_slowpath+0x48/0x60
[  815.872012]  [<ffffffff81417dff>] ptmx_open+0x11f/0x180
[  815.872012]  [<ffffffff8119394b>] chrdev_open+0x9b/0x1c0
[  815.872012]  [<ffffffff8118d643>] do_dentry_open+0x203/0x290
[  815.872012]  [<ffffffff811938b0>] ? cdev_put+0x30/0x30
[  815.872012]  [<ffffffff8118d705>] finish_open+0x35/0x50
[  815.872012]  [<ffffffff8119dcce>] do_last+0x6fe/0xe90
[  815.872012]  [<ffffffff8119a7af>] ? link_path_walk+0x7f/0x880
[  815.872012]  [<ffffffff810909d5>] ? cpuacct_charge+0x75/0x80
[  815.872012]  [<ffffffff8119e51c>] path_openat+0xbc/0x4e0
[  815.872012]  [<ffffffff816d0fd0>] ? __schedule+0x400/0x7f0
[  815.872012]  [<ffffffff8140e956>] ? tty_release+0x496/0x580
[  815.872012]  [<ffffffff8119ec11>] do_filp_open+0x41/0xa0
[  815.872012]  [<ffffffff816d267a>] ? _raw_spin_unlock+0x1a/0x40
[  815.872012]  [<ffffffff811abe39>] ? __alloc_fd+0xe9/0x140
[  815.872012]  [<ffffffff8118ea44>] do_sys_open+0xf4/0x1e0
[  815.872012]  [<ffffffff8118eb51>] sys_open+0x21/0x30
[  815.872012]  [<ffffffff816da499>] system_call_fastpath+0x16/0x1b
[  815.872012] Code: 0f 1f 80 00 00 00 00 45 31 e4 eb d7 0f 0b 90 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 48 89 fb 4c 89 65 f0 4c 89 6d f8 <48> 8b 47 28 48 81 78 58 d1 1c 0$
[  815.872012] RIP  [<ffffffff81207bcc>] devpts_pty_kill+0x1c/0xa0
[  815.872012]  RSP <ffff88007d3e1ac8>
[  815.872012] CR2: 0000000000000028
[  815.897036] ---[ end trace eadf50b7f34e47d5 ]---

Fixes this BUG also:

[  608.366836] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  608.366948] IP: [<ffffffff812078d8>] devpts_kill_index+0x18/0x70
[  608.367050] PGD 7c75b067 PUD 7b919067 PMD 0
[  608.367135] Oops: 0000 [#1] PREEMPT SMP
[  608.367201] Modules linked in: kvm_intel kvm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event microcode snd_seq psmouse snd_timer snd_seq_device serio_raw snd mac_hid soundcore snd_page_alloc rfcomm virtio_balloon parport_pc bnep bluetooth ppdev i2c_piix4 lp parport floppy
[  608.367617] CPU 2
[  608.367669] Pid: 1918, comm: stress_test_tty Tainted: G        W    3.8.0-next-20130125+ttypatch-2-xeon #2 Bochs Bochs
[  608.367796] RIP: 0010:[<ffffffff812078d8>]  [<ffffffff812078d8>] devpts_kill_index+0x18/0x70
[  608.367885] RSP: 0018:ffff88007ae41a88  EFLAGS: 00010286
[  608.367951] RAX: ffffffff81417e80 RBX: ffff880036472400 RCX: 0000000180400028
[  608.368010] RDX: ffff880036470004 RSI: 0000000000000004 RDI: 0000000000000000
[  608.368010] RBP: ffff88007ae41a98 R08: 0000000000000000 R09: 0000000000000001
[  608.368010] R10: ffffea0001f22e40 R11: ffffffff814151d5 R12: 0000000000000004
[  608.368010] R13: ffff880036470000 R14: 0000000000000004 R15: ffff880036472400
[  608.368010] FS:  00007ff7a5268700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[  608.368010] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  608.368010] CR2: 0000000000000028 CR3: 000000007a0fd000 CR4: 00000000000006e0
[  608.368010] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  608.368010] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  608.368010] Process stress_test_tty (pid: 1918, threadinfo ffff88007ae40000, task ffff88003688dc40)
[  608.368010] Stack:
[  608.368010]  ffff880036472400 0000000000000001 ffff88007ae41aa8 ffffffff81417e98
[  608.368010]  ffff88007ae41ac8 ffffffff8140c42b ffff88007ac73100 ffff88007ac73100
[  608.368010]  ffff88007ae41b98 ffffffff8140ead5 ffff88007ae41b38 ffff88007ca40e40
[  608.368010] Call Trace:
[  608.368010]  [<ffffffff81417e98>] pty_unix98_shutdown+0x18/0x20
[  608.368010]  [<ffffffff8140c42b>] release_tty+0x3b/0xe0
[  608.368010]  [<ffffffff8140ead5>] __tty_release+0x575/0x5d0
[  608.368010]  [<ffffffff816d2c63>] ? _raw_spin_lock+0x23/0x30
[  608.368010]  [<ffffffff816d28ea>] ? _raw_spin_unlock+0x1a/0x40
[  608.368010]  [<ffffffff816d03e8>] ? __mutex_unlock_slowpath+0x48/0x60
[  608.368010]  [<ffffffff8140ef79>] tty_open+0x449/0x5f0
[  608.368010]  [<ffffffff8119394b>] chrdev_open+0x9b/0x1c0
[  608.368010]  [<ffffffff8118d643>] do_dentry_open+0x203/0x290
[  608.368010]  [<ffffffff811938b0>] ? cdev_put+0x30/0x30
[  608.368010]  [<ffffffff8118d705>] finish_open+0x35/0x50
[  608.368010]  [<ffffffff8119dcce>] do_last+0x6fe/0xe90
[  608.368010]  [<ffffffff8119a7af>] ? link_path_walk+0x7f/0x880
[  608.368010]  [<ffffffff8119e51c>] path_openat+0xbc/0x4e0
[  608.368010]  [<ffffffff8119ec11>] do_filp_open+0x41/0xa0
[  608.368010]  [<ffffffff816d28ea>] ? _raw_spin_unlock+0x1a/0x40
[  608.368010]  [<ffffffff811abe39>] ? __alloc_fd+0xe9/0x140
[  608.368010]  [<ffffffff8118ea44>] do_sys_open+0xf4/0x1e0
[  608.368010]  [<ffffffff816d2c63>] ? _raw_spin_lock+0x23/0x30
[  608.368010]  [<ffffffff8118eb51>] sys_open+0x21/0x30
[  608.368010]  [<ffffffff816da719>] system_call_fastpath+0x16/0x1b
[  608.368010] Code: ec 48 83 c4 10 5b 41 5c 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 10 4c 89 65 f8 41 89 f4 48 89 5d f0 <48> 8b 47 28 48 81 78 58 d1 1c 00 00 74 0b 48 8b 05 4b 66 cf 00
[  608.368010] RIP  [<ffffffff812078d8>] devpts_kill_index+0x18/0x70
[  608.368010]  RSP <ffff88007ae41a88>
[  608.368010] CR2: 0000000000000028
[  608.394153] ---[ end trace afe83b0fb5fbda93 ]---

Reported-by: Ilya Zykov <ilya@ilyx.ru>
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7acf6cd
Stephen Warren swarren referenced this issue from a commit February 08, 2013
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit February 08, 2013
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit February 08, 2013
Commit has since been removed from the repository and is no longer available.
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi February 05, 2013
David Ahern perf evlist: Make event_copy local to mmaps
I am getting segfaults *after* the time sorting of perf samples where
the event type is off the charts:

(gdb) bt
\#0  0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225
\#1  0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520,
    file_offset=0) at util/session.c:884
\#2  0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555
\#3  0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645

This is bizarre because the event has already been processed once --
before it was added to the samples queue -- and the event was found to
be sane at that time.

There seem to be 2 causes:

1. perf_evlist__mmap_read updates the read location even though there
are outstanding references to events sitting in the mmap buffers via the
ordered samples queue.

2. There is a single evlist->event_copy for all evlist entries.
event_copy is used to handle an event wrapping at the mmap buffer
boundary.

This patch addresses the second problem - making event_copy local to
each perf_mmap. With this change my highly repeatable use case no longer
fails.

The first problem is much more complicated and will be the subject of a
future patch.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1360098762-61827-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
0479b8b
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi February 05, 2013
leds-lp55xx: add new common driver for lp5521/5523
 This patch supports basic common driver code for LP5521, LP5523/55231 devices.

 ( Driver Structure Data )

 lp55xx_led and lp55xx_chip
 In lp55xx common driver, two different data structure is used.
 o lp55xx_led
   control multi output LED channels such as led current, channel index.
 o lp55xx_chip
   general chip control such like the I2C and platform data.

 For example, LP5521 has maximum 3 LED channels.
 LP5523/55231 has 9 output channels.

 lp55xx_chip for LP5521 ... lp55xx_led #1
                            lp55xx_led #2
                            lp55xx_led #3

 lp55xx_chip for LP5523 ... lp55xx_led