smsc95xx patch unnecessary #3

Closed
Alan-Cox opened this Issue Jan 25, 2012 · 8 comments

Comments

Projects
None yet
6 participants
Contributor

Alan-Cox commented Jan 25, 2012

The Linux kernel supports setting the hardware address by ioctl

ifconfig eth0 down
ifconfig eth0 hw aa:bb:cc:dd:ee:ff
ifconfig eth0 up

So this patch can be dropped and has no chance of going upstream

Collaborator

popcornmix commented Jan 25, 2012

OTP bits are connected to the GPU. The ARM currently can't see the OTP bits.
The serial number is stored in the OTP bits. The MAC adress is the OUI + serial number.

The command line is a convenient way of passing information from GPU to ARM.

I agree this is better done from userspace.

I'll try and get an ARM driver in that exposes the serial/OTP bits, so this is no longer necessary in smsc95xx driver.

asb commented Jan 25, 2012

Alan: how about having smsc95xx use system_serial_low and system_serial_high to generate the MAC?

huceke commented Jan 25, 2012

Setting the mac adress makes perfect sense. Take the szenario where you have a nfs root without an initial ramdisk.

Contributor

Alan-Cox commented Jan 25, 2012

huceke: so use an initrd (you need to anyway in reality to do the dhcp and the like properly given the usb ethernet will appear at some point during the boot as its asynchronously discovered

asb: if the bits are exposed so the kernel can get at them that sounds ideal - the driver can then just pick them up if its a PI (or better yet if the device usb id is unique to the PI). The problem though if you don't have a unique id is you need to be careful of the case where someone plugs the same device on a dongle into a PI..

Contributor

Alan-Cox commented Jan 25, 2012

(as a PS - the mac= is broken for this case too...)

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@bergwolf bergwolf + Trond Myklebust pnfsblock: acquire im_lock in _preload_range
When calling _add_entry, we should take the im_lock to protect
agains other modifiers.

Cc: <stable@vger.kernel.org> #3.1+
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
39e567a

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@bergwolf bergwolf + Trond Myklebust pnfsblock: don't spinlock when freeing block_dev
bl_free_block_dev() may sleep. We can not call it with spinlock held.
Besides, there is no need to take bm_lock as we are last user freeing bm_devlist.

Cc: <stable@vger.kernel.org> #3.1+
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
93a3844

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@bergwolf bergwolf + Trond Myklebust pnfsblock: limit bio page count
One bio can have at most BIO_MAX_PAGES pages. We should limit it bec otherwise
bio_alloc will fail when there are many pages in one read/write_pagelist.

Cc: <stable@vger.kernel.org> #3.1+
Signed-off-by: Peng Tao <peng_tao@emc.com>
Signed-off-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
74a6eeb

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@tmlind Vaibhav Hiremath + tmlind ARM: OMAP2+: timer: Fix crash due to wrong arg to __omap_dm_timer_rea…
…d_counter

Commit 2f0778a (ARM: 7205/2: sched_clock: allow sched_clock to be
selected at runtime) had a typo for the case when CONFIG_OMAP_32K_TIMER
is not set.

In dmtimer_read_sched_clock(), wrong argument was getting passed to
__omap_dm_timer_read_counter() function call; instead of "&clksrc",
we were passing "clksrc.io_base", which results into kernel crash.

To reproduce kernel crash, just disable the CONFIG_OMAP_32K_TIMER config
option (and DEBUG_LL) and build/boot the kernel.
This will use dmtimer as a kernel clocksource and lead to kernel
crash during boot  -

[    0.000000] OMAP clocksource: GPTIMER2 at 26000000 Hz
[    0.000000] sched_clock: 32 bits at 26MHz, resolution 38ns, wraps every
165191ms
[    0.000000] Unable to handle kernel paging request at virtual address
00030ef1
[    0.000000] pgd = c0004000
[    0.000000] [00030ef1] *pgd=00000000
[    0.000000] Internal error: Oops: 5 [#1] SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0    Not tainted  (3.3.0-rc1-11574-g0c76665-dirty #3)
[    0.000000] PC is at dmtimer_read_sched_clock+0x18/0x4c
[    0.000000] LR is at update_sched_clock+0x10/0x84
[    0.000000] pc : [<c00243b8>]    lr : [<c0018684>]    psr: 200001d3
[    0.000000] sp : c0641f38  ip : c0641e18  fp : 0000000a
[    0.000000] r10: 151c3303  r9 : 00000026  r8 : 76276259
[    0.000000] r7 : 00028547  r6 : c065ac80  r5 : 431bde82  r4 : c0655968
[    0.000000] r3 : 00030ef1  r2 : fb032000  r1 : 00000028  r0 : 00000001

Signed-off-by: Vaibhav Hiremath <hvaibhav@ti.com>
[tony@atomide.com: updated comments]
Signed-off-by: Tony Lindgren <tony@atomide.com>
dbc3982

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@prasad-joshi prasad-joshi logfs: Propagate page parameter to __logfs_write_inode
During GC LogFS has to rewrite each valid block to a separate segment.
Rewrite operation reads data from an old segment and writes it to a
newly allocated segment. Since every write operation changes data
block pointers maintained in inode, inode should also be rewritten.

In GC path to avoid AB-BA deadlock LogFS marks a page with
PG_pre_locked in addition to locking the page (PG_locked). The page
lock is ignored iff the page is pre-locked.

LogFS uses a special file called segment file. The segment file
maintains an 8 bytes entry for every segment. It keeps track of erase
count, level etc. for every segment.

Bad things happen with a segment belonging to the segment file is GCed

 ------------[ cut here ]------------
kernel BUG at /home/prasad/logfs/readwrite.c:297!
invalid opcode: 0000 [#1] SMP
Modules linked in: logfs joydev usbhid hid psmouse e1000 i2c_piix4
		serio_raw [last unloaded: logfs]
Pid: 20161, comm: mount Not tainted 3.1.0-rc3+ #3 innotek GmbH
		VirtualBox
EIP: 0060:[<f809132a>] EFLAGS: 00010292 CPU: 0
EIP is at logfs_lock_write_page+0x6a/0x70 [logfs]
EAX: 00000027 EBX: f73f5b20 ECX: c16007c8 EDX: 00000094
ESI: 00000000 EDI: e59be6e4 EBP: c7337b28 ESP: c7337b18
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process mount (pid: 20161, ti=c7336000 task=eb323f70 task.ti=c7336000)
Stack:
f8099a3d c7337b24 f73f5b20 00001002 c7337b50 f8091f6d f8099a4d f80994e4
00000003 00000000 c7337b68 00000000 c67e4400 00001000 c7337b80 f80935e5
00000000 00000000 00000000 00000000 e1fcf000 0000000f e59be618 c70bf900
Call Trace:
[<f8091f6d>] logfs_get_write_page.clone.16+0xdd/0x100 [logfs]
[<f80935e5>] logfs_mod_segment_entry+0x55/0x110 [logfs]
[<f809460d>] logfs_get_segment_entry+0x1d/0x20 [logfs]
[<f8091060>] ? logfs_cleanup_journal+0x50/0x50 [logfs]
[<f809521b>] ostore_get_erase_count+0x1b/0x40 [logfs]
[<f80965b8>] logfs_open_area+0xc8/0x150 [logfs]
[<c141a7ec>] ? kmemleak_alloc+0x2c/0x60
[<f809668e>] __logfs_segment_write.clone.16+0x4e/0x1b0 [logfs]
[<c10dd563>] ? mempool_kmalloc+0x13/0x20
[<c10dd563>] ? mempool_kmalloc+0x13/0x20
[<f809696f>] logfs_segment_write+0x17f/0x1d0 [logfs]
[<f8092e8c>] logfs_write_i0+0x11c/0x180 [logfs]
[<f8092f35>] logfs_write_direct+0x45/0x90 [logfs]
[<f80934cd>] __logfs_write_buf+0xbd/0xf0 [logfs]
[<c102900e>] ? kmap_atomic_prot+0x4e/0xe0
[<f809424b>] logfs_write_buf+0x3b/0x60 [logfs]
[<f80947a9>] __logfs_write_inode+0xa9/0x110 [logfs]
[<f8094cb0>] logfs_rewrite_block+0xc0/0x110 [logfs]
[<f8095300>] ? get_mapping_page+0x10/0x60 [logfs]
[<f8095aa0>] ? logfs_load_object_aliases+0x2e0/0x2f0 [logfs]
[<f808e57d>] logfs_gc_segment+0x2ad/0x310 [logfs]
[<f808e62a>] __logfs_gc_once+0x4a/0x80 [logfs]
[<f808ed43>] logfs_gc_pass+0x683/0x6a0 [logfs]
[<f8097a89>] logfs_mount+0x5a9/0x680 [logfs]
[<c1126b21>] mount_fs+0x21/0xd0
[<c10f6f6f>] ? __alloc_percpu+0xf/0x20
[<c113da41>] ? alloc_vfsmnt+0xb1/0x130
[<c113db4b>] vfs_kern_mount+0x4b/0xa0
[<c113e06e>] do_kern_mount+0x3e/0xe0
[<c113f60d>] do_mount+0x34d/0x670
[<c10f2749>] ? strndup_user+0x49/0x70
[<c113fcab>] sys_mount+0x6b/0xa0
[<c142d87c>] syscall_call+0x7/0xb
Code: f8 e8 8b 93 39 c9 8b 45 f8 3e 0f ba 28 00 19 d2 85 d2 74 ca eb d0 0f 0b 8d 45 fc 89 44 24 04 c7 04 24 3d 9a 09 f8 e8 09 92 39 c9 <0f> 0b 8d 74 26 00 55 89 e5 3e 8d 74 26 00 8b 10 80 e6 01 74 09
EIP: [<f809132a>] logfs_lock_write_page+0x6a/0x70 [logfs] SS:ESP 0068:c7337b18
---[ end trace 96e67d5b3aa3d6ca ]---

The patch passes locked page to __logfs_write_inode. It calls function
logfs_get_wblocks() to pre-lock the page. This ensures any further
attempts to lock the page are ignored (esp from get_erase_count).

Acked-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
0bd9038

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@ming1 ming1 + Jiri Kosina HID: usbhid: fix dead lock between open and disconect
There is no reason to hold hiddev->existancelock before
calling usb_deregister_dev, so move it out of the lock.

The patch fixes the lockdep warning below.

[ 5733.386271] ======================================================
[ 5733.386274] [ INFO: possible circular locking dependency detected ]
[ 5733.386278] 3.2.0-custom-next-20120111+ #1 Not tainted
[ 5733.386281] -------------------------------------------------------
[ 5733.386284] khubd/186 is trying to acquire lock:
[ 5733.386288]  (minor_rwsem){++++.+}, at: [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386311]
[ 5733.386312] but task is already holding lock:
[ 5733.386315]  (&hiddev->existancelock){+.+...}, at: [<ffffffffa0094d17>] hiddev_disconnect+0x26/0x87 [usbhid]
[ 5733.386328]
[ 5733.386329] which lock already depends on the new lock.
[ 5733.386330]
[ 5733.386333]
[ 5733.386334] the existing dependency chain (in reverse order) is:
[ 5733.386336]
[ 5733.386337] -> #1 (&hiddev->existancelock){+.+...}:
[ 5733.386346]        [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386357]        [<ffffffff813df961>] __mutex_lock_common+0x60/0x465
[ 5733.386366]        [<ffffffff813dfe4d>] mutex_lock_nested+0x36/0x3b
[ 5733.386371]        [<ffffffffa0094ad6>] hiddev_open+0x113/0x193 [usbhid]
[ 5733.386378]        [<ffffffffa0011971>] usb_open+0x66/0xc2 [usbcore]
[ 5733.386390]        [<ffffffff8111a8b5>] chrdev_open+0x12b/0x154
[ 5733.386402]        [<ffffffff811159a8>] __dentry_open.isra.16+0x20b/0x355
[ 5733.386408]        [<ffffffff811165dc>] nameidata_to_filp+0x43/0x4a
[ 5733.386413]        [<ffffffff81122ed5>] do_last+0x536/0x570
[ 5733.386419]        [<ffffffff8112300b>] path_openat+0xce/0x301
[ 5733.386423]        [<ffffffff81123327>] do_filp_open+0x33/0x81
[ 5733.386427]        [<ffffffff8111664d>] do_sys_open+0x6a/0xfc
[ 5733.386431]        [<ffffffff811166fb>] sys_open+0x1c/0x1e
[ 5733.386434]        [<ffffffff813e7c79>] system_call_fastpath+0x16/0x1b
[ 5733.386441]
[ 5733.386441] -> #0 (minor_rwsem){++++.+}:
[ 5733.386448]        [<ffffffff8108255d>] __lock_acquire+0xa80/0xd74
[ 5733.386454]        [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386458]        [<ffffffff813e01f5>] down_write+0x44/0x77
[ 5733.386464]        [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386475]        [<ffffffffa0094d2d>] hiddev_disconnect+0x3c/0x87 [usbhid]
[ 5733.386483]        [<ffffffff8132df51>] hid_disconnect+0x3f/0x54
[ 5733.386491]        [<ffffffff8132dfb4>] hid_device_remove+0x4e/0x7a
[ 5733.386496]        [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386502]        [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386507]        [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386512]        [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386519]        [<ffffffff8132def3>] hid_destroy_device+0x1e/0x3d
[ 5733.386525]        [<ffffffffa00916b0>] usbhid_disconnect+0x36/0x42 [usbhid]
[ 5733.386530]        [<ffffffffa000fb60>] usb_unbind_interface+0x57/0x11f [usbcore]
[ 5733.386542]        [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386547]        [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386552]        [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386557]        [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386562]        [<ffffffffa000de61>] usb_disable_device+0xa8/0x1d8 [usbcore]
[ 5733.386573]        [<ffffffffa0006bd2>] usb_disconnect+0xab/0x11f [usbcore]
[ 5733.386583]        [<ffffffffa0008aa0>] hub_thread+0x73b/0x1157 [usbcore]
[ 5733.386593]        [<ffffffff8105dc0f>] kthread+0x95/0x9d
[ 5733.386601]        [<ffffffff813e90b4>] kernel_thread_helper+0x4/0x10
[ 5733.386607]
[ 5733.386608] other info that might help us debug this:
[ 5733.386609]
[ 5733.386612]  Possible unsafe locking scenario:
[ 5733.386613]
[ 5733.386615]        CPU0                    CPU1
[ 5733.386618]        ----                    ----
[ 5733.386620]   lock(&hiddev->existancelock);
[ 5733.386625]                                lock(minor_rwsem);
[ 5733.386630]                                lock(&hiddev->existancelock);
[ 5733.386635]   lock(minor_rwsem);
[ 5733.386639]
[ 5733.386640]  *** DEADLOCK ***
[ 5733.386641]
[ 5733.386644] 6 locks held by khubd/186:
[ 5733.386646]  #0:  (&__lockdep_no_validate__){......}, at: [<ffffffffa00084af>] hub_thread+0x14a/0x1157 [usbcore]
[ 5733.386661]  #1:  (&__lockdep_no_validate__){......}, at: [<ffffffffa0006b77>] usb_disconnect+0x50/0x11f [usbcore]
[ 5733.386677]  #2:  (hcd->bandwidth_mutex){+.+.+.}, at: [<ffffffffa0006bc8>] usb_disconnect+0xa1/0x11f [usbcore]
[ 5733.386693]  #3:  (&__lockdep_no_validate__){......}, at: [<ffffffff812c09bb>] device_release_driver+0x18/0x2d
[ 5733.386704]  #4:  (&__lockdep_no_validate__){......}, at: [<ffffffff812c09bb>] device_release_driver+0x18/0x2d
[ 5733.386714]  #5:  (&hiddev->existancelock){+.+...}, at: [<ffffffffa0094d17>] hiddev_disconnect+0x26/0x87 [usbhid]
[ 5733.386727]
[ 5733.386727] stack backtrace:
[ 5733.386731] Pid: 186, comm: khubd Not tainted 3.2.0-custom-next-20120111+ #1
[ 5733.386734] Call Trace:
[ 5733.386741]  [<ffffffff81062881>] ? up+0x34/0x3b
[ 5733.386747]  [<ffffffff813d9ef3>] print_circular_bug+0x1f8/0x209
[ 5733.386752]  [<ffffffff8108255d>] __lock_acquire+0xa80/0xd74
[ 5733.386756]  [<ffffffff810808b4>] ? trace_hardirqs_on_caller+0x15d/0x1a3
[ 5733.386763]  [<ffffffff81043a3f>] ? vprintk+0x3f4/0x419
[ 5733.386774]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386779]  [<ffffffff81082d26>] lock_acquire+0xcb/0x10e
[ 5733.386789]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386797]  [<ffffffff813e01f5>] down_write+0x44/0x77
[ 5733.386807]  [<ffffffffa0011a04>] ? usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386818]  [<ffffffffa0011a04>] usb_deregister_dev+0x37/0x9e [usbcore]
[ 5733.386825]  [<ffffffffa0094d2d>] hiddev_disconnect+0x3c/0x87 [usbhid]
[ 5733.386830]  [<ffffffff8132df51>] hid_disconnect+0x3f/0x54
[ 5733.386834]  [<ffffffff8132dfb4>] hid_device_remove+0x4e/0x7a
[ 5733.386839]  [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386844]  [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386848]  [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386854]  [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386859]  [<ffffffff8132def3>] hid_destroy_device+0x1e/0x3d
[ 5733.386865]  [<ffffffffa00916b0>] usbhid_disconnect+0x36/0x42 [usbhid]
[ 5733.386876]  [<ffffffffa000fb60>] usb_unbind_interface+0x57/0x11f [usbcore]
[ 5733.386882]  [<ffffffff812c0957>] __device_release_driver+0x81/0xcd
[ 5733.386886]  [<ffffffff812c09c3>] device_release_driver+0x20/0x2d
[ 5733.386890]  [<ffffffff812c0564>] bus_remove_device+0x114/0x128
[ 5733.386895]  [<ffffffff812bdd6f>] device_del+0x131/0x183
[ 5733.386905]  [<ffffffffa000de61>] usb_disable_device+0xa8/0x1d8 [usbcore]
[ 5733.386916]  [<ffffffffa0006bd2>] usb_disconnect+0xab/0x11f [usbcore]
[ 5733.386921]  [<ffffffff813dff82>] ? __mutex_unlock_slowpath+0x130/0x141
[ 5733.386929]  [<ffffffffa0008aa0>] hub_thread+0x73b/0x1157 [usbcore]
[ 5733.386935]  [<ffffffff8106a51d>] ? finish_task_switch+0x78/0x150
[ 5733.386941]  [<ffffffff8105e396>] ? __init_waitqueue_head+0x4c/0x4c
[ 5733.386950]  [<ffffffffa0008365>] ? usb_remote_wakeup+0x56/0x56 [usbcore]
[ 5733.386955]  [<ffffffff8105dc0f>] kthread+0x95/0x9d
[ 5733.386961]  [<ffffffff813e90b4>] kernel_thread_helper+0x4/0x10
[ 5733.386966]  [<ffffffff813e24b8>] ? retint_restore_args+0x13/0x13
[ 5733.386970]  [<ffffffff8105db7a>] ? __init_kthread_worker+0x55/0x55
[ 5733.386974]  [<ffffffff813e90b0>] ? gs_change+0x13/0x13

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ba18311

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@torvalds Mel Gorman + torvalds mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff00  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0bf380b

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@yizou @Jkirsher yizou + Jkirsher ixgbe: do not update real num queues when netdev is going away
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when
upper layer driver, e.g., FCoE protocol stack is monitoring the netdev
event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove
extra queues allocated for FCoE, the associated txq sysfs kobjects are already
removed, and trying to update the real num queues would cause something like
below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 #7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9d837ea

@richo richo pushed a commit to richo/linux that referenced this issue Mar 6, 2012

@tavip @jhedberg tavip + jhedberg Bluetooth: silence lockdep warning
Since bluetooth uses multiple protocols types, to avoid lockdep
warnings, we need to use different lockdep classes (one for each
protocol type).

This is already done in bt_sock_create but it misses a couple of cases
when new connections are created. This patch corrects that to fix the
following warning:

<4>[ 1864.732366] =======================================================
<4>[ 1864.733030] [ INFO: possible circular locking dependency detected ]
<4>[ 1864.733544] 3.0.16-mid3-00007-gc9a0f62 #3
<4>[ 1864.733883] -------------------------------------------------------
<4>[ 1864.734408] t.android.btclc/4204 is trying to acquire lock:
<4>[ 1864.734869]  (rfcomm_mutex){+.+.+.}, at: [<c14970ea>] rfcomm_dlc_close+0x15/0x30
<4>[ 1864.735541]
<4>[ 1864.735549] but task is already holding lock:
<4>[ 1864.736045]  (sk_lock-AF_BLUETOOTH){+.+.+.}, at: [<c1498bf7>] lock_sock+0xa/0xc
<4>[ 1864.736732]
<4>[ 1864.736740] which lock already depends on the new lock.
<4>[ 1864.736750]
<4>[ 1864.737428]
<4>[ 1864.737437] the existing dependency chain (in reverse order) is:
<4>[ 1864.738016]
<4>[ 1864.738023] -> #1 (sk_lock-AF_BLUETOOTH){+.+.+.}:
<4>[ 1864.738549]        [<c1062273>] lock_acquire+0x104/0x140
<4>[ 1864.738977]        [<c13d35c1>] lock_sock_nested+0x58/0x68
<4>[ 1864.739411]        [<c1493c33>] l2cap_sock_sendmsg+0x3e/0x76
<4>[ 1864.739858]        [<c13d06c3>] __sock_sendmsg+0x50/0x59
<4>[ 1864.740279]        [<c13d0ea2>] sock_sendmsg+0x94/0xa8
<4>[ 1864.740687]        [<c13d0ede>] kernel_sendmsg+0x28/0x37
<4>[ 1864.741106]        [<c14969ca>] rfcomm_send_frame+0x30/0x38
<4>[ 1864.741542]        [<c1496a2a>] rfcomm_send_ua+0x58/0x5a
<4>[ 1864.741959]        [<c1498447>] rfcomm_run+0x441/0xb52
<4>[ 1864.742365]        [<c104f095>] kthread+0x63/0x68
<4>[ 1864.742742]        [<c14d5182>] kernel_thread_helper+0x6/0xd
<4>[ 1864.743187]
<4>[ 1864.743193] -> #0 (rfcomm_mutex){+.+.+.}:
<4>[ 1864.743667]        [<c1061ada>] __lock_acquire+0x988/0xc00
<4>[ 1864.744100]        [<c1062273>] lock_acquire+0x104/0x140
<4>[ 1864.744519]        [<c14d2c70>] __mutex_lock_common+0x3b/0x33f
<4>[ 1864.744975]        [<c14d303e>] mutex_lock_nested+0x2d/0x36
<4>[ 1864.745412]        [<c14970ea>] rfcomm_dlc_close+0x15/0x30
<4>[ 1864.745842]        [<c14990d9>] __rfcomm_sock_close+0x5f/0x6b
<4>[ 1864.746288]        [<c1499114>] rfcomm_sock_shutdown+0x2f/0x62
<4>[ 1864.746737]        [<c13d275d>] sys_socketcall+0x1db/0x422
<4>[ 1864.747165]        [<c14d42f0>] syscall_call+0x7/0xb

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
b5a30dd

@bootc bootc pushed a commit to bootc/linux-rpi-orig that referenced this issue May 8, 2012

@gregkh Mel Gorman + gregkh mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff00  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9da11af
Contributor

lp0 commented May 13, 2012

The information for the device is almost completely blank:

Bus 001 Device 003: ID 0424:ec00 Standard Microsystems Corp. 
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass         0 
  bDeviceProtocol         1 
  bMaxPacketSize0        64
  idVendor           0x0424 Standard Microsystems Corp.
  idProduct          0xec00 
  bcdDevice            2.00
  iManufacturer           0 
  iProduct                0 
  iSerial                 0 
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength           39
    bNumInterfaces          1
    bConfigurationValue     1
    iConfiguration          0 
    bmAttributes         0xe0
      Self Powered
      Remote Wakeup
    MaxPower                2mA
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           3
      bInterfaceClass       255 Vendor Specific Class
      bInterfaceSubClass      0 
      bInterfaceProtocol    255 
      iInterface              0 
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0200  1x 512 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval               4
Device Qualifier (for other device speed):
  bLength                10
  bDescriptorType         6
  bcdUSB               2.00
  bDeviceClass          255 Vendor Specific Class
  bDeviceSubClass         0 
  bDeviceProtocol         1 
  bMaxPacketSize0        64
  bNumConfigurations      1
Device Status:     0x0001
  Self Powered

If some part of it could be filled in with something unique to the model (e.g. idVendor/idProduct or "Raspberry Pi" in the iProduct) then the driver could safely use the system serial to calculate the MAC.

@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012

@bebarino @davem330 bebarino + davem330 ks8851: Fix mutex deadlock in ks8851_net_stop()
There is a potential deadlock scenario when the ks8851 driver
is removed. The interrupt handler schedules a workqueue which
acquires a mutex that ks8851_net_stop() also acquires before
flushing the workqueue. Previously lockdep wouldn't be able
to find this problem but now that it has the support we can
trigger this lockdep warning by rmmoding the driver after
an ifconfig up.

Fix the possible deadlock by disabling the interrupts in
the chip and then release the lock across the workqueue
flushing. The mutex is only there to proect the registers
anyway so this should be ok.

=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.21-00021-g8b33780-dirty #2911
-------------------------------------------------------
rmmod/125 is trying to acquire lock:
 ((&ks->irq_work)){+.+...}, at: [<c019e0b8>] flush_work+0x0/0xac

but task is already holding lock:
 (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&ks->lock){+.+...}:
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c083dbec>] mutex_lock_nested+0x68/0x3dc
       [<bf00bd48>] ks8851_irq_work+0x24/0x46c [ks8851]
       [<c019c580>] process_one_work+0x2d8/0x518
       [<c019cb98>] worker_thread+0x220/0x3a0
       [<c01a2ad4>] kthread+0x88/0x94
       [<c0107008>] kernel_thread_exit+0x0/0x8

-> #0 ((&ks->irq_work)){+.+...}:
       [<c01b7984>] validate_chain+0x914/0x1018
       [<c01b89c8>] __lock_acquire+0x940/0x9f8
       [<c01b9058>] lock_acquire+0x10c/0x130
       [<c019e104>] flush_work+0x4c/0xac
       [<bf00b858>] ks8851_net_stop+0x6c/0x138 [ks8851]
       [<c06b209c>] __dev_close_many+0x98/0xcc
       [<c06b2174>] dev_close_many+0x68/0xd0
       [<c06b22ec>] rollback_registered_many+0xcc/0x2b8
       [<c06b2554>] rollback_registered+0x28/0x34
       [<c06b25b8>] unregister_netdevice_queue+0x58/0x7c
       [<c06b25f4>] unregister_netdev+0x18/0x20
       [<bf00c1f4>] ks8851_remove+0x64/0xb4 [ks8851]
       [<c049ddf0>] spi_drv_remove+0x18/0x1c
       [<c0468e98>] __device_release_driver+0x7c/0xbc
       [<c0468f64>] driver_detach+0x8c/0xb4
       [<c0467f00>] bus_remove_driver+0xb8/0xe8
       [<c01c1d20>] sys_delete_module+0x1e8/0x27c
       [<c0105ec0>] ret_fast_syscall+0x0/0x3c

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ks->lock);
                               lock((&ks->irq_work));
                               lock(&ks->lock);
  lock((&ks->irq_work));

 *** DEADLOCK ***

4 locks held by rmmod/125:
 #0:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f44>] driver_detach+0x6c/0xb4
 #1:  (&__lockdep_no_validate__){+.+.+.}, at: [<c0468f50>] driver_detach+0x78/0xb4
 #2:  (rtnl_mutex){+.+.+.}, at: [<c06b25e8>] unregister_netdev+0xc/0x20
 #3:  (&ks->lock){+.+...}, at: [<bf00b850>] ks8851_net_stop+0x64/0x138 [ks8851]

Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
c5a9993

@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012

@mvpublicworks @linvjw mvpublicworks + linvjw iwlwifi: use correct released ucode version
Report correctly the latest released version
of the iwlwifi firmware for all
iwlwifi-supported devices.

Cc: stable@vger.kernel.org #3.3+
Signed-off-by: Meenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
78cbcf2

@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012

@linvjw Wey-Yi Guy + linvjw iwlwifi: use 6000G2B for 6030 device series
"iwlwifi: use correct released ucode version" change
the ucode api ok from 6000G2 to 6000G2B, but it shall belong
to 6030 device series, not the 6005 device series. Fix it

Cc: stable@vger.kernel.org #3.3+
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
1ed2ec3

@erique erique pushed a commit to erique/rpi_linux that referenced this issue Jul 16, 2012

@gregkh Andrea Arcangeli + gregkh mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race …
…condition

commit 26c1917 upstream.

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2d363e9

@popcornmix popcornmix pushed a commit to popcornmix/linux that referenced this issue Aug 16, 2012

@jtlayton @bwhacks jtlayton + bwhacks cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
commit 3cf003c upstream.

Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
bb761c9

@popcornmix popcornmix pushed a commit to popcornmix/linux that referenced this issue Aug 16, 2012

@jtlayton @bwhacks jtlayton + bwhacks nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
1c88c58

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 20, 2012

Ben Myers xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
4026c9f

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 20, 2012

@htejun Lai Jiangshan + htejun workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()
busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed
(CPU is down again).  This used to be okay because the flag wasn't
used for anything else.

However, after 25511a4 "workqueue: reimplement CPU online rebinding
to handle idle workers", WORKER_REBIND is also used to command idle
workers to rebind.  If not cleared, the worker may confuse the next
CPU_UP cycle by having REBIND spuriously set or oops / get stuck by
prematurely calling idle_worker_rebind().

  WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x5
 00()
  Hardware name: Bochs
  Modules linked in: test_wq(O-)
  Pid: 33, comm: kworker/1:1 Tainted: G           O 3.6.0-rc1-work+ #3
  Call Trace:
   [<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0
   [<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20
   [<ffffffff810b3f1d>] worker_thread+0x4cd/0x500
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  ---[ end trace e977cf20f4661968 ]---
  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500
  PGD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: test_wq(O-)
  CPU 0
  Pid: 33, comm: kworker/1:1 Tainted: G        W  O 3.6.0-rc1-work+ #3 Bochs Bochs
  RIP: 0010:[<ffffffff810b3db0>]  [<ffffffff810b3db0>] worker_thread+0x360/0x500
  RSP: 0018:ffff88001e1c9de0  EFLAGS: 00010086
  RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
  RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001
  R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580
  R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900
  FS:  0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900)
  Stack:
   ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010
   ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900
   ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340
  Call Trace:
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b
  RIP  [<ffffffff810b3db0>] worker_thread+0x360/0x500
   RSP <ffff88001e1c9de0>
  CR2: 0000000000000000

There was no reason to keep WORKER_REBIND on failure in the first
place - WORKER_UNBOUND is guaranteed to be set in such cases
preventing incorrectly activating concurrency management.  Always
clear WORKER_REBIND.

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
960bd11

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012

Ben Myers xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
0ba6e53

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012

@mcgrof @linvjw mcgrof + linvjw cfg80211: fix possible circular lock on reg_regdb_search()
When call_crda() is called we kick off a witch hunt search
for the same regulatory domain on our internal regulatory
database and that work gets kicked off on a workqueue, this
is done while the cfg80211_mutex is held. If that workqueue
kicks off it will first lock reg_regdb_search_mutex and
later cfg80211_mutex but to ensure two CPUs will not contend
against cfg80211_mutex the right thing to do is to have the
reg_regdb_search() wait until the cfg80211_mutex is let go.

The lockdep report is pasted below.

cfg80211: Calling CRDA to update world regulatory domain

======================================================
[ INFO: possible circular locking dependency detected ]
3.3.8 #3 Tainted: G           O
-------------------------------------------------------
kworker/0:1/235 is trying to acquire lock:
 (cfg80211_mutex){+.+...}, at: [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

but task is already holding lock:
 (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (reg_regdb_search_mutex){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<81645778>] is_world_regdom+0x9f8/0xc74 [cfg80211]

-> #1 (reg_mutex#2){+.+...}:
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<8164539c>] is_world_regdom+0x61c/0xc74 [cfg80211]

-> #0 (cfg80211_mutex){+.+...}:
       [<800a77b8>] __lock_acquire+0x10d4/0x17bc
       [<800a8384>] lock_acquire+0x60/0x88
       [<802950a8>] mutex_lock_nested+0x54/0x31c
       [<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

other info that might help us debug this:

Chain exists of:
  cfg80211_mutex --> reg_mutex#2 --> reg_regdb_search_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(reg_regdb_search_mutex);
                               lock(reg_mutex#2);
                               lock(reg_regdb_search_mutex);
  lock(cfg80211_mutex);

 *** DEADLOCK ***

3 locks held by kworker/0:1/235:
 #0:  (events){.+.+..}, at: [<80089a00>] process_one_work+0x230/0x460
 #1:  (reg_regdb_work){+.+...}, at: [<80089a00>] process_one_work+0x230/0x460
 #2:  (reg_regdb_search_mutex){+.+...}, at: [<81646828>] set_regdom+0x710/0x808 [cfg80211]

stack backtrace:
Call Trace:
[<80290fd4>] dump_stack+0x8/0x34
[<80291bc4>] print_circular_bug+0x2ac/0x2d8
[<800a77b8>] __lock_acquire+0x10d4/0x17bc
[<800a8384>] lock_acquire+0x60/0x88
[<802950a8>] mutex_lock_nested+0x54/0x31c
[<816468a4>] set_regdom+0x78c/0x808 [cfg80211]

Reported-by: Felix Fietkau <nbd@openwrt.org>
Tested-by: Felix Fietkau <nbd@openwrt.org>
Cc: stable@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
a85d0d7

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012

Steffen Maier + James Bottomley [SCSI] zfcp: Adapt to new FC_PORTSPEED semantics
Commit a9277e7
"[SCSI] scsi_transport_fc: Getting FC Port Speed in sync with FC-GS"
changed the semantics of FC_PORTSPEED defines to
FDMI port attributes of FC-HBA/SM-HBA
which is different from the previous bit reversed
Report Port Speed Capabilities (RPSC) ELS of FC-GS/FC-LS.

Zfcp showed "10 Gbit" instead of "4 Gbit" for supported_speeds.
It now uses explicit bit conversion as the other LLDs already
do, in order to be independent of the kernel bit semantics.
See also http://marc.info/?l=linux-scsi&m=134452926830730&w=2

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #3.4+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d220197

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Sep 29, 2012

Steffen Maier + James Bottomley [SCSI] zfcp: Bounds checking for deferred error trace
The pl vector has scount elements, i.e. pl[scount-1] is the last valid
element. For maximum sized requests, payload->counter == scount after
the last loop iteration. Therefore, do bounds checking first (with
boolean shortcut) to not access the invalid element pl[scount].

Do not trust the maximum sbale->scount value from the HBA
but ensure we won't access the pl vector out of our allocated bounds.
While at it, clean up scoping and prevent unnecessary memset.

Minor fix for 86a9668
"[SCSI] zfcp: support for hardware data router"

Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> #3.2+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
01e6052

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 3, 2012

@dwmw2 Andreas Bießmann + dwmw2 mtd: omap2: fix omap_nand_remove segfault
Do not kfree() the mtd_info; it is handled in the mtd subsystem and
already freed by nand_release(). Instead kfree() the struct
omap_nand_info allocated in omap_nand_probe which was not freed before.

This patch fixes following error when unloading the omap2 module:

---8<---
~ $ rmmod omap2
------------[ cut here ]------------
kernel BUG at mm/slab.c:3126!
Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
Modules linked in: omap2(-)
CPU: 0    Not tainted  (3.6.0-rc3-00230-g155e36d-dirty #3)
PC is at cache_free_debugcheck+0x2d4/0x36c
LR is at kfree+0xc8/0x2ac
pc : [<c01125a0>]    lr : [<c0112efc>]    psr: 200d0193
sp : c521fe08  ip : c0e8ef90  fp : c521fe5c
r10: bf0001fc  r9 : c521e000  r8 : c0d99c8c
r7 : c661ebc0  r6 : c065d5a4  r5 : c65c4060  r4 : c78005c0
r3 : 00000000  r2 : 00001000  r1 : c65c4000  r0 : 00000001
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 86694019  DAC: 00000015
Process rmmod (pid: 549, stack limit = 0xc521e2f0)
Stack: (0xc521fe08 to 0xc5220000)
fe00:                   c008a874 c00bf44c c515c6d0 200d0193 c65c4860 c515c240
fe20: c521fe3c c521fe30 c008a9c0 c008a854 c521fe5c c65c4860 c78005c0 bf0001fc
fe40: c780ff40 a00d0113 c521e000 00000000 c521fe84 c521fe60 c0112efc c01122d8
fe60: c65c4860 c0673778 c06737ac 00000000 00070013 00000000 c521fe9c c521fe88
fe80: bf0001fc c0112e40 c0673778 bf001ca8 c521feac c521fea0 c02ca11c bf0001ac
fea0: c521fec4 c521feb0 c02c82c4 c02ca100 c0673778 bf001ca8 c521fee4 c521fec8
fec0: c02c8dd8 c02c8250 00000000 bf001ca8 bf001ca8 c0804ee0 c521ff04 c521fee8
fee0: c02c804c c02c8d20 bf001924 00000000 bf001ca8 c521e000 c521ff1c c521ff08
ff00: c02c950c c02c7fbc bf001d48 00000000 c521ff2c c521ff20 c02ca3a4 c02c94b8
ff20: c521ff3c c521ff30 bf001938 c02ca394 c521ffa4 c521ff40 c009beb4 bf001930
ff40: c521ff6c 70616d6f b6fe0032 c0014f84 70616d6f b6fe0032 00000081 60070010
ff60: c521ff84 c521ff70 c008e1f4 c00bf328 0001a004 70616d6f c521ff94 0021ff88
ff80: c008e368 0001a004 70616d6f b6fe0032 00000081 c0015028 00000000 c521ffa8
ffa0: c0014dc0 c009bcd0 0001a004 70616d6f bec2ab38 00000880 bec2ab38 00000880
ffc0: 0001a004 70616d6f b6fe0032 00000081 00000319 00000000 b6fe1000 00000000
ffe0: bec2ab30 bec2ab20 00019f00 b6f539c0 60070010 bec2ab38 aaaaaaaa aaaaaaaa
Backtrace:
[<c01122cc>] (cache_free_debugcheck+0x0/0x36c) from [<c0112efc>] (kfree+0xc8/0x2ac)
[<c0112e34>] (kfree+0x0/0x2ac) from [<bf0001fc>] (omap_nand_remove+0x5c/0x64 [omap2])
[<bf0001a0>] (omap_nand_remove+0x0/0x64 [omap2]) from [<c02ca11c>] (platform_drv_remove+0x28/0x2c)
 r5:bf001ca8 r4:c0673778
[<c02ca0f4>] (platform_drv_remove+0x0/0x2c) from [<c02c82c4>] (__device_release_driver+0x80/0xdc)
[<c02c8244>] (__device_release_driver+0x0/0xdc) from [<c02c8dd8>] (driver_detach+0xc4/0xc8)
 r5:bf001ca8 r4:c0673778
[<c02c8d14>] (driver_detach+0x0/0xc8) from [<c02c804c>] (bus_remove_driver+0x9c/0x104)
 r6:c0804ee0 r5:bf001ca8 r4:bf001ca8 r3:00000000
[<c02c7fb0>] (bus_remove_driver+0x0/0x104) from [<c02c950c>] (driver_unregister+0x60/0x80)
 r6:c521e000 r5:bf001ca8 r4:00000000 r3:bf001924
[<c02c94ac>] (driver_unregister+0x0/0x80) from [<c02ca3a4>] (platform_driver_unregister+0x1c/0x20)
 r5:00000000 r4:bf001d48
[<c02ca388>] (platform_driver_unregister+0x0/0x20) from [<bf001938>] (omap_nand_driver_exit+0x14/0x1c [omap2])
[<bf001924>] (omap_nand_driver_exit+0x0/0x1c [omap2]) from [<c009beb4>] (sys_delete_module+0x1f0/0x2ec)
[<c009bcc4>] (sys_delete_module+0x0/0x2ec) from [<c0014dc0>] (ret_fast_syscall+0x0/0x48)
 r8:c0015028 r7:00000081 r6:b6fe0032 r5:70616d6f r4:0001a004
Code: e1a00005 eb0d9172 e7f001f2 e7f001f2 (e7f001f2)
---[ end trace 6a30b24d8c0cc2ee ]---
Segmentation fault
--->8---

This error was introduced in 67ce04b which
was the first commit of this driver.

Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
Cc: stable@vger.kernel.org
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
7d9b110

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@tavip @jhedberg tavip + jhedberg Bluetooth: silence lockdep warning
Since bluetooth uses multiple protocols types, to avoid lockdep
warnings, we need to use different lockdep classes (one for each
protocol type).

This is already done in bt_sock_create but it misses a couple of cases
when new connections are created. This patch corrects that to fix the
following warning:

<4>[ 1864.732366] =======================================================
<4>[ 1864.733030] [ INFO: possible circular locking dependency detected ]
<4>[ 1864.733544] 3.0.16-mid3-00007-gc9a0f62 #3
<4>[ 1864.733883] -------------------------------------------------------
<4>[ 1864.734408] t.android.btclc/4204 is trying to acquire lock:
<4>[ 1864.734869]  (rfcomm_mutex){+.+.+.}, at: [<c14970ea>] rfcomm_dlc_close+0x15/0x30
<4>[ 1864.735541]
<4>[ 1864.735549] but task is already holding lock:
<4>[ 1864.736045]  (sk_lock-AF_BLUETOOTH){+.+.+.}, at: [<c1498bf7>] lock_sock+0xa/0xc
<4>[ 1864.736732]
<4>[ 1864.736740] which lock already depends on the new lock.
<4>[ 1864.736750]
<4>[ 1864.737428]
<4>[ 1864.737437] the existing dependency chain (in reverse order) is:
<4>[ 1864.738016]
<4>[ 1864.738023] -> #1 (sk_lock-AF_BLUETOOTH){+.+.+.}:
<4>[ 1864.738549]        [<c1062273>] lock_acquire+0x104/0x140
<4>[ 1864.738977]        [<c13d35c1>] lock_sock_nested+0x58/0x68
<4>[ 1864.739411]        [<c1493c33>] l2cap_sock_sendmsg+0x3e/0x76
<4>[ 1864.739858]        [<c13d06c3>] __sock_sendmsg+0x50/0x59
<4>[ 1864.740279]        [<c13d0ea2>] sock_sendmsg+0x94/0xa8
<4>[ 1864.740687]        [<c13d0ede>] kernel_sendmsg+0x28/0x37
<4>[ 1864.741106]        [<c14969ca>] rfcomm_send_frame+0x30/0x38
<4>[ 1864.741542]        [<c1496a2a>] rfcomm_send_ua+0x58/0x5a
<4>[ 1864.741959]        [<c1498447>] rfcomm_run+0x441/0xb52
<4>[ 1864.742365]        [<c104f095>] kthread+0x63/0x68
<4>[ 1864.742742]        [<c14d5182>] kernel_thread_helper+0x6/0xd
<4>[ 1864.743187]
<4>[ 1864.743193] -> #0 (rfcomm_mutex){+.+.+.}:
<4>[ 1864.743667]        [<c1061ada>] __lock_acquire+0x988/0xc00
<4>[ 1864.744100]        [<c1062273>] lock_acquire+0x104/0x140
<4>[ 1864.744519]        [<c14d2c70>] __mutex_lock_common+0x3b/0x33f
<4>[ 1864.744975]        [<c14d303e>] mutex_lock_nested+0x2d/0x36
<4>[ 1864.745412]        [<c14970ea>] rfcomm_dlc_close+0x15/0x30
<4>[ 1864.745842]        [<c14990d9>] __rfcomm_sock_close+0x5f/0x6b
<4>[ 1864.746288]        [<c1499114>] rfcomm_sock_shutdown+0x2f/0x62
<4>[ 1864.746737]        [<c13d275d>] sys_socketcall+0x1db/0x422
<4>[ 1864.747165]        [<c14d42f0>] syscall_call+0x7/0xb

Signed-off-by: Octavian Purdila <octavian.purdila@intel.com>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
d22015a

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@jbarnes993 Yinghai Lu + jbarnes993 PCI: make sriov work with hotplug remove
When hot removing a pci express module that has a pcie switch and supports
SRIOV, we got:

[ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
[ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
[ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
[ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
[ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
[ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
[ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain🚌device=0000:b0:00
[ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
[ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain🚌dev = 0000:b0:00
[ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
[ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
[ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
[ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
[ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
[ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 5926.127189] IP: [<ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
[ 5926.133377] PGD 0
[ 5926.135402] Oops: 0000 [#1] SMP
[ 5926.138659] CPU 2
[ 5926.140499] Modules linked in:
...
[ 5926.143754]
[ 5926.275823] Call Trace:
[ 5926.278267]  [<ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
[ 5926.284180]  [<ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
[ 5926.290264]  [<ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
[ 5926.296866]  [<ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
[ 5926.303123]  [<ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
[ 5926.309206]  [<ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
...

 +-[0000:80]-+-00.0-[81-8f]--
 |           +-01.0-[90-9f]--
 |           +-02.0-[a0-af]--
 |           +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
 |           |                               |            +-00.1 Device
 |           |                               |            +-00.2 Device
 |           |                               |            \-00.3 Device
 |           |                               \-03.0-[b3]--+-00.0 Device
 |           |                                            +-00.1 Device
 |           |                                            +-00.2 Device
 |           |                                            \-00.3 Device

root complex: 80:02.2
pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
end devices  are b2:00.0 and b3.00.0.
VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3

Root cause: when doing pci_stop_bus_device() with phys fn, it will stop
virt fn and remove the fn, so
	list_for_each_safe(l, n, &bus->devices)
will have problem to refer freed n that is pointed to vf entry.

Solution is just replacing list_for_each_safe() with
list_for_each_prev_safe().  This will make sure we can get valid n pointer
to PF instead of the freed VF pointer (because newly added devices are
inserted to the bus->devices list tail).

During reviewing the patch, Bjorn said:
|   The PCI hot-remove path calls pci_stop_bus_devices() via
|   pci_remove_bus_device().
|
|   pci_stop_bus_devices() traverses the bus->devices list (point A below),
|   stopping each device in turn, which calls the driver remove() method.  When
|   the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
|   also uses pci_remove_bus_device() to remove the VF devices from the
|   bus->devices list (point B).
|
|       pci_remove_bus_device
|         pci_stop_bus_device
|           pci_stop_bus_devices(subordinate)
|             list_for_each(bus->devices)             <-- A
|               pci_stop_bus_device(PF)
|                 ...
|                   driver->remove
|                     pci_disable_sriov
|                       ...
|                         pci_remove_bus_device(VF)
|                             <remove from bus_list>  <-- B
|
|   At B, we're changing the same list we're iterating through at A, so when
|   the driver remove() method returns, the pci_stop_bus_devices() iterator has
|   a pointer to a list entry that has already been freed.

Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141
				 https://lkml.org/lkml/2012/1/23/360

-v5: According to Linus to make remove more robust, Change to
     list_for_each_prev_safe instead. That is more reasonable, because
     those devices are added to tail of the list before.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
ac205b7

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

Luck, Tony + H. Peter Anvin x86: Remove some noise from boot log when starting cpus
Printing the "start_ip" for every secondary cpu is very noisy on a large
system - and doesn't add any value. Drop this message.

Console log before:
Booting Node   0, Processors  #1
smpboot cpu 1: start_ip = 96000
 #2
smpboot cpu 2: start_ip = 96000
 #3
smpboot cpu 3: start_ip = 96000
 #4
smpboot cpu 4: start_ip = 96000
       ...
 #31
smpboot cpu 31: start_ip = 96000
Brought up 32 CPUs

Console log after:
Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
Booting Node   0, Processors  #16 #17 #18 #19 #20 #21 #22 #23 Ok.
Booting Node   1, Processors  #24 #25 #26 #27 #28 #29 #30 #31
Brought up 32 CPUs

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
140f190

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@paulgortmaker paulgortmaker device.h: audit and cleanup users in main include dir
The <linux/device.h> header includes a lot of stuff, and
it in turn gets a lot of use just for the basic "struct device"
which appears so often.

Clean up the users as follows:

1) For those headers only needing "struct device" as a pointer
in fcn args, replace the include with exactly that.

2) For headers not really using anything from device.h, simply
delete the include altogether.

3) For headers relying on getting device.h implicitly before
being included themselves, now explicitly include device.h

4) For files in which doing #1 or #2 uncovers an implicit
dependency on some other header, fix by explicitly adding
the required header(s).

Any C files that were implicitly relying on device.h to be
present have already been dealt with in advance.

Total removals from #1 and #2: 51.  Total additions coming
from #3: 9.  Total other implicit dependencies from #4: 7.

As of 3.3-rc1, there were 110, so a net removal of 42 gives
about a 38% reduction in device.h presence in include/*

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
313162d

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@torvalds torvalds Merge branch 'pcmcia' of git://git.linaro.org/people/rmk/linux-arm
Pull #3 ARM updates from Russell King:
 "This adds gpio support to soc_common, allowing an amount of code to be
  deleted from each PCMCIA socket driver for the PXA/SA11x0 SoCs."

* 'pcmcia' of git://git.linaro.org/people/rmk/linux-arm:
  PCMCIA: sa1111: rename sa1111 socket drivers to have sa1111_ prefix.
  PCMCIA: make lubbock socket driver part of sa1111_cs
  PCMCIA: add Kconfig control for building sa11xx_base.c
  PCMCIA: sa1111: jornada720: no need to disable IRQs around sa1111_set_io
  PCMCIA: sa1111: pass along sa1111_pcmcia_configure_socket() failure code
  PCMCIA: soc_common: remove explicit wrprot initialization in socket drivers
  PCMCIA: soc_common: remove soc_pcmcia_*_irqs functions
  PCMCIA: sa11x0: h3600: convert to use new irq/gpio management
  PCMCIA: sa11x0: simpad: convert to use new irq/gpio management
  PCMCIA: sa11x0: shannon: convert to use new irq/gpio management
  PCMCIA: sa11x0: nanoengine: convert reset handling to use GPIO subsystem
  PCMCIA: sa11x0: nanoengine: convert to use new irq/gpio management
  PCMCIA: sa11x0: cerf: convert reset handling to use GPIO subsystem
  PCMCIA: sa11x0: cerf: convert to use new irq/gpio management
  PCMCIA: sa11x0: assabet: convert to use new irq/gpio management
  PCMCIA: sa1111: use new per-socket irq/gpio infrastructure
  PCMCIA: pxa: convert PXA socket drivers to use new irq/gpio management
  PCMCIA: soc_common: add GPIO support for card status signals
  PCMCIA: soc_common: move common initialization into soc_common
24613ff

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@gxt @dhowells gxt + dhowells Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by …
…gxt]

Disintegrate asm/system.h for Unicore32. (Compilation successful)
The implementation details are not changed, but only splitted.
BTW, some codestyles are adjusted.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
8978bfd

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@torvalds torvalds Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel…
….org/pub/scm/linux/kernel/git/dhowells/linux-asm_system

Pull "Disintegrate and delete asm/system.h" from David Howells:
 "Here are a bunch of patches to disintegrate asm/system.h into a set of
  separate bits to relieve the problem of circular inclusion
  dependencies.

  I've built all the working defconfigs from all the arches that I can
  and made sure that they don't break.

  The reason for these patches is that I recently encountered a circular
  dependency problem that came about when I produced some patches to
  optimise get_order() by rewriting it to use ilog2().

  This uses bitops - and on the SH arch asm/bitops.h drags in
  asm-generic/get_order.h by a circuituous route involving asm/system.h.

  The main difficulty seems to be asm/system.h.  It holds a number of
  low level bits with no/few dependencies that are commonly used (eg.
  memory barriers) and a number of bits with more dependencies that
  aren't used in many places (eg.  switch_to()).

  These patches break asm/system.h up into the following core pieces:

    (1) asm/barrier.h

        Move memory barriers here.  This already done for MIPS and Alpha.

    (2) asm/switch_to.h

        Move switch_to() and related stuff here.

    (3) asm/exec.h

        Move arch_align_stack() here.  Other process execution related bits
        could perhaps go here from asm/processor.h.

    (4) asm/cmpxchg.h

        Move xchg() and cmpxchg() here as they're full word atomic ops and
        frequently used by atomic_xchg() and atomic_cmpxchg().

    (5) asm/bug.h

        Move die() and related bits.

    (6) asm/auxvec.h

        Move AT_VECTOR_SIZE_ARCH here.

  Other arch headers are created as needed on a per-arch basis."

Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that.  We'll find out anything that got broken and fix it..

* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
  Delete all instances of asm/system.h
  Remove all #inclusions of asm/system.h
  Add #includes needed to permit the removal of asm/system.h
  Move all declarations of free_initmem() to linux/mm.h
  Disintegrate asm/system.h for OpenRISC
  Split arch_align_stack() out from asm-generic/system.h
  Split the switch_to() wrapper out of asm-generic/system.h
  Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
  Create asm-generic/barrier.h
  Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
  Disintegrate asm/system.h for Xtensa
  Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
  Disintegrate asm/system.h for Tile
  Disintegrate asm/system.h for Sparc
  Disintegrate asm/system.h for SH
  Disintegrate asm/system.h for Score
  Disintegrate asm/system.h for S390
  Disintegrate asm/system.h for PowerPC
  Disintegrate asm/system.h for PA-RISC
  Disintegrate asm/system.h for MN10300
  ...
0195c00

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@gregkh Bruno Prémont + gregkh sysfs: Prevent crash on unset sysfs group attributes
Do not let the kernel crash when a device is registered with
sysfs while group attributes are not set (aka NULL).

Warn about the offender with some information about the offending
device.

This would warn instead of trying NULL pointer deref like:
 BUG: unable to handle kernel NULL pointer dereference at (null)
 IP: [<ffffffff81152673>] internal_create_group+0x83/0x1a0
 PGD 0
 Oops: 0000 [#1] SMP
 CPU 0
 Modules linked in:

 Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc1-x86_64 #3 HP ProLiant DL360 G4
 RIP: 0010:[<ffffffff81152673>]  [<ffffffff81152673>] internal_create_group+0x83/0x1a0
 RSP: 0018:ffff88019485fd70  EFLAGS: 00010202
 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000000001
 RDX: ffff880192e99908 RSI: ffff880192e99630 RDI: ffffffff81a26c60
 RBP: ffff88019485fdc0 R08: 0000000000000000 R09: 0000000000000000
 R10: ffff880192e99908 R11: 0000000000000000 R12: ffffffff81a16a00
 R13: ffff880192e99908 R14: ffffffff81a16900 R15: 0000000000000000
 FS:  0000000000000000(0000) GS:ffff88019bc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 0000000001a0c000 CR4: 00000000000007f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process swapper/0 (pid: 1, threadinfo ffff88019485e000, task ffff880194878000)
 Stack:
  ffff88019485fdd0 ffff880192da9d60 0000000000000000 ffff880192e99908
  ffff880192e995d8 0000000000000001 ffffffff81a16a00 ffff880192da9d60
  0000000000000000 0000000000000000 ffff88019485fdd0 ffffffff811527be
 Call Trace:
  [<ffffffff811527be>] sysfs_create_group+0xe/0x10
  [<ffffffff81376ca6>] device_add_groups+0x46/0x80
  [<ffffffff81377d3d>] device_add+0x46d/0x6a0
  ...

Signed-off-by: Bruno Prémont <bonbons@linux-vserver.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5631f2c

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@mvpublicworks @linvjw mvpublicworks + linvjw iwlwifi: use correct released ucode version
Report correctly the latest released version
of the iwlwifi firmware for all
iwlwifi-supported devices.

Cc: stable@vger.kernel.org #3.3+
Signed-off-by: Meenakshi Venkataraman <meenakshi.venkataraman@intel.com>
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
e377a4f

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@linvjw Wey-Yi Guy + linvjw iwlwifi: use 6000G2B for 6030 device series
"iwlwifi: use correct released ucode version" change
the ucode api ok from 6000G2 to 6000G2B, but it shall belong
to 6030 device series, not the 6005 device series. Fix it

Cc: stable@vger.kernel.org #3.3+
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
35e7ada

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

Ben Myers xfs: protect xfs_sync_worker with s_umount semaphore
xfs_sync_worker checks the MS_ACTIVE flag in s_flags to avoid doing
work during mount and unmount.  This flag can be cleared by unmount
after the xfs_sync_worker checks it but before the work is completed.
The has caused crashes in the completion handler for the dummy
transaction commited by xfs_sync_worker:

PID: 27544  TASK: ffff88013544e040  CPU: 3   COMMAND: "kworker/3:0"
 #0 [ffff88016fdff930] machine_kexec at ffffffff810244e9
 #1 [ffff88016fdff9a0] crash_kexec at ffffffff8108d053
 #2 [ffff88016fdffa70] oops_end at ffffffff813ad1b8
 #3 [ffff88016fdffaa0] no_context at ffffffff8102bd48
 #4 [ffff88016fdffaf0] __bad_area_nosemaphore at ffffffff8102c04d
 #5 [ffff88016fdffb40] bad_area_nosemaphore at ffffffff8102c12e
 #6 [ffff88016fdffb50] do_page_fault at ffffffff813afaee
 #7 [ffff88016fdffc60] page_fault at ffffffff813ac635
    [exception RIP: xlog_get_lowest_lsn+0x30]
    RIP: ffffffffa04a9910  RSP: ffff88016fdffd10  RFLAGS: 00010246
    RAX: ffffc90014e48000  RBX: ffff88014d879980  RCX: ffff88014d879980
    RDX: ffff8802214ee4c0  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88016fdffd10   R8: ffff88014d879a80   R9: 0000000000000000
    R10: 0000000000000001  R11: 0000000000000000  R12: ffff8802214ee400
    R13: ffff88014d879980  R14: 0000000000000000  R15: ffff88022fd96605
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88016fdffd18] xlog_state_do_callback at ffffffffa04aa186 [xfs]
 #9 [ffff88016fdffd98] xlog_state_done_syncing at ffffffffa04aa568 [xfs]

Protect xfs_sync_worker by using the s_umount semaphore at the read
level to provide exclusion with unmount while work is progressing.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
1307bbd

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@taoma-tm @tytso taoma-tm + tytso ext4: protect group inode free counting with group lock
Now when we set the group inode free count, we don't have a proper
group lock so that multiple threads may decrease the inode free
count at the same time. And e2fsck will complain something like:

Free inodes count wrong for group #1 (1, counted=0).
Fix? no

Free inodes count wrong for group #2 (3, counted=0).
Fix? no

Directories count wrong for group #2 (780, counted=779).
Fix? no

Free inodes count wrong for group #3 (2272, counted=2273).
Fix? no

So this patch try to protect it with the ext4_lock_group.

btw, it is found by xfstests test case 269 and the volume is
mkfsed with the parameter
"-O ^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,ext_attr"
and I have run it 100 times and the error in e2fsck doesn't
show up again.

Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
6f2e9f0

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@torvalds Andrea Arcangeli + torvalds mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race …
…condition

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 <sys_mincore+548>:   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 <sys_mincore+564>:   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e <sys_mincore+574>:   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
26c1917

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@sgruszka @linvjw sgruszka + linvjw rt2x00: use atomic variable for seqno
Remove spinlock as atomic_t can be used instead. Note we use only 16
lower bits, upper bits are changed but we impilcilty cast to u16.

This fix possible deadlock on IBSS mode reproted by lockdep:

=================================
[ INFO: inconsistent lock state ]
3.4.0-wl+ #4 Not tainted
---------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
kworker/u:2/30374 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&(&intf->seqlock)->rlock){+.?...}, at: [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
{IN-SOFTIRQ-W} state was registered at:
  [<c04978ab>] __lock_acquire+0x47b/0x1050
  [<c0498504>] lock_acquire+0x84/0xf0
  [<c0835733>] _raw_spin_lock+0x33/0x40
  [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
  [<f9979f2a>] rt2x00queue_write_tx_frame+0x1a/0x300 [rt2x00lib]
  [<f997834f>] rt2x00mac_tx+0x7f/0x380 [rt2x00lib]
  [<f98fe363>] __ieee80211_tx+0x1b3/0x300 [mac80211]
  [<f98ffdf5>] ieee80211_tx+0x105/0x130 [mac80211]
  [<f99000dd>] ieee80211_xmit+0xad/0x100 [mac80211]
  [<f9900519>] ieee80211_subif_start_xmit+0x2d9/0x930 [mac80211]
  [<c0782e87>] dev_hard_start_xmit+0x307/0x660
  [<c079bb71>] sch_direct_xmit+0xa1/0x1e0
  [<c0784bb3>] dev_queue_xmit+0x183/0x730
  [<c078c27a>] neigh_resolve_output+0xfa/0x1e0
  [<c07b436a>] ip_finish_output+0x24a/0x460
  [<c07b4897>] ip_output+0xb7/0x100
  [<c07b2d60>] ip_local_out+0x20/0x60
  [<c07e01ff>] igmpv3_sendpack+0x4f/0x60
  [<c07e108f>] igmp_ifc_timer_expire+0x29f/0x330
  [<c04520fc>] run_timer_softirq+0x15c/0x2f0
  [<c0449e3e>] __do_softirq+0xae/0x1e0
irq event stamp: 18380437
hardirqs last  enabled at (18380437): [<c0526027>] __slab_alloc.clone.3+0x67/0x5f0
hardirqs last disabled at (18380436): [<c0525ff3>] __slab_alloc.clone.3+0x33/0x5f0
softirqs last  enabled at (18377616): [<c0449eb3>] __do_softirq+0x123/0x1e0
softirqs last disabled at (18377611): [<c041278d>] do_softirq+0x9d/0xe0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&intf->seqlock)->rlock);
  <Interrupt>
    lock(&(&intf->seqlock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:2/30374:
 #0:  (wiphy_name(local->hw.wiphy)){++++.+}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #1:  ((&sdata->work)){+.+.+.}, at: [<c045cf99>] process_one_work+0x109/0x3f0
 #2:  (&ifibss->mtx){+.+.+.}, at: [<f98f005b>] ieee80211_ibss_work+0x1b/0x470 [mac80211]
 #3:  (&intf->beacon_skb_mutex){+.+...}, at: [<f997a644>] rt2x00queue_update_beacon+0x24/0x50 [rt2x00lib]

stack backtrace:
Pid: 30374, comm: kworker/u:2 Not tainted 3.4.0-wl+ #4
Call Trace:
 [<c04962a6>] print_usage_bug+0x1f6/0x220
 [<c0496a12>] mark_lock+0x2c2/0x300
 [<c0495ff0>] ? check_usage_forwards+0xc0/0xc0
 [<c04978ec>] __lock_acquire+0x4bc/0x1050
 [<c0527890>] ? __kmalloc_track_caller+0x1c0/0x1d0
 [<c0777fb6>] ? copy_skb_header+0x26/0x90
 [<c0498504>] lock_acquire+0x84/0xf0
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<c0835733>] _raw_spin_lock+0x33/0x40
 [<f9979a20>] ? rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f9979a20>] rt2x00queue_create_tx_descriptor+0x380/0x490 [rt2x00lib]
 [<f997a5cf>] rt2x00queue_update_beacon_locked+0x5f/0xb0 [rt2x00lib]
 [<f997a64d>] rt2x00queue_update_beacon+0x2d/0x50 [rt2x00lib]
 [<f9977e3a>] rt2x00mac_bss_info_changed+0x1ca/0x200 [rt2x00lib]
 [<f9977c70>] ? rt2x00mac_remove_interface+0x70/0x70 [rt2x00lib]
 [<f98e4dd0>] ieee80211_bss_info_change_notify+0xe0/0x1d0 [mac80211]
 [<f98ef7b8>] __ieee80211_sta_join_ibss+0x3b8/0x610 [mac80211]
 [<c0496ab4>] ? mark_held_locks+0x64/0xc0
 [<c0440012>] ? virt_efi_query_capsule_caps+0x12/0x50
 [<f98efb09>] ieee80211_sta_join_ibss+0xf9/0x140 [mac80211]
 [<f98f0456>] ieee80211_ibss_work+0x416/0x470 [mac80211]
 [<c0496d8b>] ? trace_hardirqs_on+0xb/0x10
 [<c077683b>] ? skb_dequeue+0x4b/0x70
 [<f98f207f>] ieee80211_iface_work+0x13f/0x230 [mac80211]
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<c045d015>] process_one_work+0x185/0x3f0
 [<c045cf99>] ? process_one_work+0x109/0x3f0
 [<f98f1f40>] ? ieee80211_teardown_sdata+0xa0/0xa0 [mac80211]
 [<c045ed86>] worker_thread+0x116/0x270
 [<c045ec70>] ? manage_workers+0x1e0/0x1e0
 [<c0462f64>] kthread+0x84/0x90
 [<c0462ee0>] ? __init_kthread_worker+0x60/0x60
 [<c083d382>] kernel_thread_helper+0x6/0x10

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Acked-by: Helmut Schaa <helmut.schaa@googlemail.com>
Acked-by: Gertjan van Wingerde <gwingerde@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
e5851da

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@elp @linvjw elp + linvjw mac80211: fail authentication when AP denied authentication
ieee80211_rx_mgmt_auth() doesn't handle denied authentication
properly - it authenticates the station and waits for association
(for 5 seconds) instead of failing the authentication.

Fix it by destroying auth_data and bailing out instead.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Cc: stable@vger.kernel.org #3.4
Signed-off-by: John W. Linville <linville@tuxdriver.com>
dac211e

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

Borislav Petkov + Ingo Molnar x86/smp: Fix topology checks on AMD MCM CPUs
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical" nodes in the same physical socket which is reflected in the
CPU-to-node map on x86 with NUMA.

Which makes this check wrong on the above topologies so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  #6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  #7 #8 #9 #10 #11 Ok.
[    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
[    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
161270f

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@dzickusrh dzickusrh + Ingo Molnar watchdog: Quiet down the boot messages
A bunch of bugzillas have complained how noisy the nmi_watchdog
is during boot-up especially with its expected failure cases
(like virt and bios resource contention).

This is my attempt to quiet them down and keep it less confusing
for the end user.  What I did is print the message for cpu0 and
save it for future comparisons.  If future cpus have an
identical message as cpu0, then don't print the redundant info.
However, if a future cpu has a different message, happily print
that loudly.

Before the change, you would see something like:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog enabled, takes one hw-pmu counter.
    Booting Node   0, Processors  #1
    NMI watchdog enabled, takes one hw-pmu counter.
     #2
    NMI watchdog enabled, takes one hw-pmu counter.
     #3 Ok.
    NMI watchdog enabled, takes one hw-pmu counter.
    Brought up 4 CPUs
    Total of 4 processors activated (22607.24 BogoMIPS).

After the change, it is simplified to:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
    Booting Node   0, Processors  #1 #2 #3 Ok.
    Brought up 4 CPUs

V2: little changes based on Joe Perches' feedback
V3: printk cleanup based on Ingo's feedback; checkpatch fix
V4: keep printk as one long line
V5: Ingo fix ups

Reported-and-tested-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: nzimmer@sgi.com
Cc: joe@perches.com
Link: http://lkml.kernel.org/r/1339594548-17227-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
a702704

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

Pablo Neira Ayuso netfilter: nf_ct_ecache: fix crash with multiple containers, one shut…
…ting down

Hans reports that he's still hitting:

BUG: unable to handle kernel NULL pointer dereference at 000000000000027c
IP: [<ffffffff813615db>] netlink_has_listeners+0xb/0x60
PGD 0
Oops: 0000 [#3] PREEMPT SMP
CPU 0

It happens when adding a number of containers with do:

nfct_query(h, NFCT_Q_CREATE, ct);

and most likely one namespace shuts down.

this problem was supposed to be fixed by:
70e9942 netfilter: nf_conntrack: make event callback registration per-netns

Still, it was missing one rcu_access_pointer to check if the callback
is set or not.

Reported-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
6bd0405

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@bebarino bebarino + Russell King ARM: 7457/1: smp: Fix suspicious RCU originating from cpu_die()
While running hotplug tests I ran into this RCU splat

===============================
[ INFO: suspicious RCU usage. ]
3.4.0 #3275 Tainted: G        W
-------------------------------
include/linux/rcupdate.h:729 rcu_read_lock() used illegally while idle!

other info that might help us debug this:

RCU used illegally from idle CPU!
rcu_scheduler_active = 1, debug_locks = 0
RCU used illegally from extended quiescent state!
4 locks held by swapper/2/0:
 #0:  ((cpu_died).wait.lock){......}, at: [<c00ab128>] complete+0x1c/0x5c
 #1:  (&p->pi_lock){-.-.-.}, at: [<c00b275c>] try_to_wake_up+0x2c/0x388
 #2:  (&rq->lock){-.-.-.}, at: [<c00b2860>] try_to_wake_up+0x130/0x388
 #3:  (rcu_read_lock){.+.+..}, at: [<c00abe5c>] cpuacct_charge+0x28/0x1f4

stack backtrace:
[<c001521c>] (unwind_backtrace+0x0/0x12c) from [<c00abec8>] (cpuacct_charge+0x94/0x1f4)
[<c00abec8>] (cpuacct_charge+0x94/0x1f4) from [<c00b395c>] (update_curr+0x24c/0x2c8)
[<c00b395c>] (update_curr+0x24c/0x2c8) from [<c00b59c4>] (enqueue_task_fair+0x50/0x194)
[<c00b59c4>] (enqueue_task_fair+0x50/0x194) from [<c00afea4>] (enqueue_task+0x30/0x34)
[<c00afea4>] (enqueue_task+0x30/0x34) from [<c00b0908>] (ttwu_activate+0x14/0x38)
[<c00b0908>] (ttwu_activate+0x14/0x38) from [<c00b28a8>] (try_to_wake_up+0x178/0x388)
[<c00b28a8>] (try_to_wake_up+0x178/0x388) from [<c00a82a0>] (__wake_up_common+0x34/0x78)
[<c00a82a0>] (__wake_up_common+0x34/0x78) from [<c00ab154>] (complete+0x48/0x5c)
[<c00ab154>] (complete+0x48/0x5c) from [<c07db7cc>] (cpu_die+0x2c/0x58)
[<c07db7cc>] (cpu_die+0x2c/0x58) from [<c000f954>] (cpu_idle+0x64/0xfc)
[<c000f954>] (cpu_idle+0x64/0xfc) from [<80208160>] (0x80208160)

When a cpu is marked offline during its idle thread it calls
cpu_die() during an RCU idle period. cpu_die() calls complete()
to notify the killing process that the cpu has died. complete()
calls into the scheduler code and eventually grabs an RCU read
lock in cpuacct_charge().

Mark complete() as RCU_NONIDLE so that RCU pays attention to this
CPU for the duration of the complete() function even though it's
in idle.

Suggested-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
ff081e0

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@jtlayton @smfrench jtlayton + smfrench cifs: when CONFIG_HIGHMEM is set, serialize the read/write kmaps
Jian found that when he ran fsx on a 32 bit arch with a large wsize the
process and one of the bdi writeback kthreads would sometimes deadlock
with a stack trace like this:

crash> bt
PID: 2789   TASK: f02edaa0  CPU: 3   COMMAND: "fsx"
 #0 [eed63cbc] schedule at c083c5b3
 #1 [eed63d80] kmap_high at c0500ec8
 #2 [eed63db0] cifs_async_writev at f7fabcd7 [cifs]
 #3 [eed63df0] cifs_writepages at f7fb7f5c [cifs]
 #4 [eed63e50] do_writepages at c04f3e32
 #5 [eed63e54] __filemap_fdatawrite_range at c04e152a
 #6 [eed63ea4] filemap_fdatawrite at c04e1b3e
 #7 [eed63eb4] cifs_file_aio_write at f7fa111a [cifs]
 #8 [eed63ecc] do_sync_write at c052d202
 #9 [eed63f74] vfs_write at c052d4ee
#10 [eed63f94] sys_write at c052df4c
#11 [eed63fb0] ia32_sysenter_target at c0409a98
    EAX: 00000004  EBX: 00000003  ECX: abd73b73  EDX: 012a65c6
    DS:  007b      ESI: 012a65c6  ES:  007b      EDI: 00000000
    SS:  007b      ESP: bf8db178  EBP: bf8db1f8  GS:  0033
    CS:  0073      EIP: 40000424  ERR: 00000004  EFLAGS: 00000246

Each task would kmap part of its address array before getting stuck, but
not enough to actually issue the write.

This patch fixes this by serializing the marshal_iov operations for
async reads and writes. The idea here is to ensure that cifs
aggressively tries to populate a request before attempting to fulfill
another one. As soon as all of the pages are kmapped for a request, then
we can unlock and allow another one to proceed.

There's no need to do this serialization on non-CONFIG_HIGHMEM arches
however, so optimize all of this out when CONFIG_HIGHMEM isn't set.

Cc: <stable@vger.kernel.org>
Reported-by: Jian Li <jiali@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
3cf003c

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@pcmoore @davem330 pcmoore + davem330 cipso: don't follow a NULL pointer when setsockopt() is called
As reported by Alan Cox, and verified by Lin Ming, when a user
attempts to add a CIPSO option to a socket using the CIPSO_V4_TAG_LOCAL
tag the kernel dies a terrible death when it attempts to follow a NULL
pointer (the skb argument to cipso_v4_validate() is NULL when called via
the setsockopt() syscall).

This patch fixes this by first checking to ensure that the skb is
non-NULL before using it to find the incoming network interface.  In
the unlikely case where the skb is NULL and the user attempts to add
a CIPSO option with the _TAG_LOCAL tag we return an error as this is
not something we want to allow.

A simple reproducer, kindly supplied by Lin Ming, although you must
have the CIPSO DOI #3 configure on the system first or you will be
caught early in cipso_v4_validate():

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <linux/ip.h>
	#include <linux/in.h>
	#include <string.h>

	struct local_tag {
		char type;
		char length;
		char info[4];
	};

	struct cipso {
		char type;
		char length;
		char doi[4];
		struct local_tag local;
	};

	int main(int argc, char **argv)
	{
		int sockfd;
		struct cipso cipso = {
			.type = IPOPT_CIPSO,
			.length = sizeof(struct cipso),
			.local = {
				.type = 128,
				.length = sizeof(struct local_tag),
			},
		};

		memset(cipso.doi, 0, 4);
		cipso.doi[3] = 3;

		sockfd = socket(AF_INET, SOCK_DGRAM, 0);
		#define SOL_IP 0
		setsockopt(sockfd, SOL_IP, IP_OPTIONS,
			&cipso, sizeof(struct cipso));

		return 0;
	}

CC: Lin Ming <mlin@ss.pku.edu.cn>
Reported-by: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
89d7ae3

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@jtlayton jtlayton + Trond Myklebust nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
5cf02d0

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@davem330 Eric Dumazet + davem330 net: tcp: ipv6_mapped needs sk_rx_dst_set method
commit 5d299f3 (net: ipv6: fix TCP early demux) added a
regression for ipv6_mapped case.

[   67.422369] SELinux: initialized (dev autofs, type autofs), uses
genfs_contexts
[   67.449678] SELinux: initialized (dev autofs, type autofs), uses
genfs_contexts
[   92.631060] BUG: unable to handle kernel NULL pointer dereference at
(null)
[   92.631435] IP: [<          (null)>]           (null)
[   92.631645] PGD 0
[   92.631846] Oops: 0010 [#1] SMP
[   92.632095] Modules linked in: autofs4 sunrpc ipv6 dm_mirror
dm_region_hash dm_log dm_multipath dm_mod video sbs sbshc battery ac lp
parport sg snd_hda_intel snd_hda_codec snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device pcspkr snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer serio_raw button floppy snd i2c_i801 i2c_core soundcore
snd_page_alloc shpchp ide_cd_mod cdrom microcode ehci_hcd ohci_hcd
uhci_hcd
[   92.634294] CPU 0
[   92.634294] Pid: 4469, comm: sendmail Not tainted 3.6.0-rc1 #3
[   92.634294] RIP: 0010:[<0000000000000000>]  [<          (null)>]
(null)
[   92.634294] RSP: 0018:ffff880245fc7cb0  EFLAGS: 00010282
[   92.634294] RAX: ffffffffa01985f0 RBX: ffff88024827ad00 RCX:
0000000000000000
[   92.634294] RDX: 0000000000000218 RSI: ffff880254735380 RDI:
ffff88024827ad00
[   92.634294] RBP: ffff880245fc7cc8 R08: 0000000000000001 R09:
0000000000000000
[   92.634294] R10: 0000000000000000 R11: ffff880245fc7bf8 R12:
ffff880254735380
[   92.634294] R13: ffff880254735380 R14: 0000000000000000 R15:
7fffffffffff0218
[   92.634294] FS:  00007f4516ccd6f0(0000) GS:ffff880256600000(0000)
knlGS:0000000000000000
[   92.634294] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   92.634294] CR2: 0000000000000000 CR3: 0000000245ed1000 CR4:
00000000000007f0
[   92.634294] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[   92.634294] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[   92.634294] Process sendmail (pid: 4469, threadinfo ffff880245fc6000,
task ffff880254b8cac0)
[   92.634294] Stack:
[   92.634294]  ffffffff813837a7 ffff88024827ad00 ffff880254b6b0e8
ffff880245fc7d68
[   92.634294]  ffffffff81385083 00000000001d2680 ffff8802547353a8
ffff880245fc7d18
[   92.634294]  ffffffff8105903a ffff88024827ad60 0000000000000002
00000000000000ff
[   92.634294] Call Trace:
[   92.634294]  [<ffffffff813837a7>] ? tcp_finish_connect+0x2c/0xfa
[   92.634294]  [<ffffffff81385083>] tcp_rcv_state_process+0x2b6/0x9c6
[   92.634294]  [<ffffffff8105903a>] ? sched_clock_cpu+0xc3/0xd1
[   92.634294]  [<ffffffff81059073>] ? local_clock+0x2b/0x3c
[   92.634294]  [<ffffffff8138caf3>] tcp_v4_do_rcv+0x63a/0x670
[   92.634294]  [<ffffffff8133278e>] release_sock+0x128/0x1bd
[   92.634294]  [<ffffffff8139f060>] __inet_stream_connect+0x1b1/0x352
[   92.634294]  [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f
[   92.634294]  [<ffffffff8104b333>] ? wake_up_bit+0x25/0x25
[   92.634294]  [<ffffffff813325f5>] ? lock_sock_nested+0x74/0x7f
[   92.634294]  [<ffffffff8139f223>] ? inet_stream_connect+0x22/0x4b
[   92.634294]  [<ffffffff8139f234>] inet_stream_connect+0x33/0x4b
[   92.634294]  [<ffffffff8132e8cf>] sys_connect+0x78/0x9e
[   92.634294]  [<ffffffff813fd407>] ? sysret_check+0x1b/0x56
[   92.634294]  [<ffffffff81088503>] ? __audit_syscall_entry+0x195/0x1c8
[   92.634294]  [<ffffffff811cc26e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[   92.634294]  [<ffffffff813fd3e2>] system_call_fastpath+0x16/0x1b
[   92.634294] Code:  Bad RIP value.
[   92.634294] RIP  [<          (null)>]           (null)
[   92.634294]  RSP <ffff880245fc7cb0>
[   92.634294] CR2: 0000000000000000
[   92.648982] ---[ end trace 24e2bed94314c8d9 ]---
[   92.649146] Kernel panic - not syncing: Fatal exception in interrupt

Fix this using inet_sk_rx_dst_set(), and export this function in case
IPv6 is modular.

Reported-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
63d02d1

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@sgruszka sgruszka + Thomas Gleixner sched: fix divide by zero at {thread_group,task}_times
On architectures where cputime_t is 64 bit type, is possible to trigger
divide by zero on do_div(temp, (__force u32) total) line, if total is a
non zero number but has lower 32 bit's zeroed. Removing casting is not
a good solution since some do_div() implementations do cast to u32
internally.

This problem can be triggered in practice on very long lived processes:

  PID: 2331   TASK: ffff880472814b00  CPU: 2   COMMAND: "oraagent.bin"
   #0 [ffff880472a51b70] machine_kexec at ffffffff8103214b
   #1 [ffff880472a51bd0] crash_kexec at ffffffff810b91c2
   #2 [ffff880472a51ca0] oops_end at ffffffff814f0b00
   #3 [ffff880472a51cd0] die at ffffffff8100f26b
   #4 [ffff880472a51d00] do_trap at ffffffff814f03f4
   #5 [ffff880472a51d60] do_divide_error at ffffffff8100cfff
   #6 [ffff880472a51e00] divide_error at ffffffff8100be7b
      [exception RIP: thread_group_times+0x56]
      RIP: ffffffff81056a16  RSP: ffff880472a51eb8  RFLAGS: 00010046
      RAX: bc3572c9fe12d194  RBX: ffff880874150800  RCX: 0000000110266fad
      RDX: 0000000000000000  RSI: ffff880472a51eb8  RDI: 001038ae7d9633dc
      RBP: ffff880472a51ef8   R8: 00000000b10a3a64   R9: ffff880874150800
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: ffff880472a51f08
      R13: ffff880472a51f10  R14: 0000000000000000  R15: 0000000000000007
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffff880472a51f00] do_sys_times at ffffffff8108845d
   #8 [ffff880472a51f40] sys_times at ffffffff81088524
   #9 [ffff880472a51f80] system_call_fastpath at ffffffff8100b0f2
      RIP: 0000003808caac3a  RSP: 00007fcba27ab6d8  RFLAGS: 00000202
      RAX: 0000000000000064  RBX: ffffffff8100b0f2  RCX: 0000000000000000
      RDX: 00007fcba27ab6e0  RSI: 000000000076d58e  RDI: 00007fcba27ab6e0
      RBP: 00007fcba27ab700   R8: 0000000000000020   R9: 000000000000091b
      R10: 00007fcba27ab680  R11: 0000000000000202  R12: 00007fff9ca41940
      R13: 0000000000000000  R14: 00007fcba27ac9c0  R15: 00007fff9ca41940
      ORIG_RAX: 0000000000000064  CS: 0033  SS: 002b

Cc: stable@vger.kernel.org
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120808092714.GA3580@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
bea6832

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@bmork @gregkh bmork + gregkh USB: option: add ZTE K5006-Z
The ZTE (Vodafone) K5006-Z use the following
interface layout:

00 DIAG
01 secondary
02 modem
03 networkcard
04 storage

Ignoring interface #3 which is handled by the qmi_wwan
driver.

Cc: Thomas Schäfer <tschaefer@t-online.de>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
f1b5c99

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@davem330 Eric Dumazet + davem330 tcp: fix possible socket refcount problem
Commit 6f458df (tcp: improve latencies of timer triggered events)
added bug leading to following trace :

[ 2866.131281] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.131726]
[ 2866.132188] =========================
[ 2866.132281] [ BUG: held lock freed! ]
[ 2866.132281] 3.6.0-rc1+ #622 Not tainted
[ 2866.132281] -------------------------
[ 2866.132281] kworker/0:1/652 is freeing memory ffff880019ec0000-ffff880019ec0a1f, with a lock still held there!
[ 2866.132281]  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281] 4 locks held by kworker/0:1/652:
[ 2866.132281]  #0:  (rpciod){.+.+.+}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #1:  ((&task->u.tk_work)){+.+.+.}, at: [<ffffffff81083567>] process_one_work+0x1de/0x47f
[ 2866.132281]  #2:  (sk_lock-AF_INET-RPC){+.+...}, at: [<ffffffff81903619>] tcp_sendmsg+0x29/0xcc6
[ 2866.132281]  #3:  (&icsk->icsk_retransmit_timer){+.-...}, at: [<ffffffff81078017>] run_timer_softirq+0x1ad/0x35f
[ 2866.132281]
[ 2866.132281] stack backtrace:
[ 2866.132281] Pid: 652, comm: kworker/0:1 Not tainted 3.6.0-rc1+ #622
[ 2866.132281] Call Trace:
[ 2866.132281]  <IRQ>  [<ffffffff810bc527>] debug_check_no_locks_freed+0x112/0x159
[ 2866.132281]  [<ffffffff818a0839>] ? __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff811549fa>] kmem_cache_free+0x6b/0x13a
[ 2866.132281]  [<ffffffff818a0839>] __sk_free+0xfd/0x114
[ 2866.132281]  [<ffffffff818a08c0>] sk_free+0x1c/0x1e
[ 2866.132281]  [<ffffffff81911e1c>] tcp_write_timer+0x51/0x56
[ 2866.132281]  [<ffffffff81078082>] run_timer_softirq+0x218/0x35f
[ 2866.132281]  [<ffffffff81078017>] ? run_timer_softirq+0x1ad/0x35f
[ 2866.132281]  [<ffffffff810f5831>] ? rb_commit+0x58/0x85
[ 2866.132281]  [<ffffffff81911dcb>] ? tcp_write_timer_handler+0x148/0x148
[ 2866.132281]  [<ffffffff81070bd6>] __do_softirq+0xcb/0x1f9
[ 2866.132281]  [<ffffffff81a0a00c>] ? _raw_spin_unlock+0x29/0x2e
[ 2866.132281]  [<ffffffff81a1227c>] call_softirq+0x1c/0x30
[ 2866.132281]  [<ffffffff81039f38>] do_softirq+0x4a/0xa6
[ 2866.132281]  [<ffffffff81070f2b>] irq_exit+0x51/0xad
[ 2866.132281]  [<ffffffff81a129cd>] do_IRQ+0x9d/0xb4
[ 2866.132281]  [<ffffffff81a0a3ef>] common_interrupt+0x6f/0x6f
[ 2866.132281]  <EOI>  [<ffffffff8109d006>] ? sched_clock_cpu+0x58/0xd1
[ 2866.132281]  [<ffffffff81a0a172>] ? _raw_spin_unlock_irqrestore+0x4c/0x56
[ 2866.132281]  [<ffffffff81078692>] mod_timer+0x178/0x1a9
[ 2866.132281]  [<ffffffff818a00aa>] sk_reset_timer+0x19/0x26
[ 2866.132281]  [<ffffffff8190b2cc>] tcp_rearm_rto+0x99/0xa4
[ 2866.132281]  [<ffffffff8190dfba>] tcp_event_new_data_sent+0x6e/0x70
[ 2866.132281]  [<ffffffff8190f7ea>] tcp_write_xmit+0x7de/0x8e4
[ 2866.132281]  [<ffffffff818a565d>] ? __alloc_skb+0xa0/0x1a1
[ 2866.132281]  [<ffffffff8190f952>] __tcp_push_pending_frames+0x2e/0x8a
[ 2866.132281]  [<ffffffff81904122>] tcp_sendmsg+0xb32/0xcc6
[ 2866.132281]  [<ffffffff819229c2>] inet_sendmsg+0xaa/0xd5
[ 2866.132281]  [<ffffffff81922918>] ? inet_autobind+0x5f/0x5f
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189adab>] sock_sendmsg+0xa3/0xc4
[ 2866.132281]  [<ffffffff810f5de6>] ? rb_reserve_next_event+0x26f/0x2d5
[ 2866.132281]  [<ffffffff8103e6a9>] ? native_sched_clock+0x29/0x6f
[ 2866.132281]  [<ffffffff8103e6f8>] ? sched_clock+0x9/0xd
[ 2866.132281]  [<ffffffff810ee7f1>] ? trace_clock_local+0x9/0xb
[ 2866.132281]  [<ffffffff8189ae03>] kernel_sendmsg+0x37/0x43
[ 2866.132281]  [<ffffffff8199ce49>] xs_send_kvec+0x77/0x80
[ 2866.132281]  [<ffffffff8199cec1>] xs_sendpages+0x6f/0x1a0
[ 2866.132281]  [<ffffffff8107826d>] ? try_to_del_timer_sync+0x55/0x61
[ 2866.132281]  [<ffffffff8199d0d2>] xs_tcp_send_request+0x55/0xf1
[ 2866.132281]  [<ffffffff8199bb90>] xprt_transmit+0x89/0x1db
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff81999d92>] call_transmit+0x1c5/0x20e
[ 2866.132281]  [<ffffffff819a0d55>] __rpc_execute+0x6f/0x225
[ 2866.132281]  [<ffffffff81999bcd>] ? call_connect+0x3c/0x3c
[ 2866.132281]  [<ffffffff819a0f33>] rpc_async_schedule+0x28/0x34
[ 2866.132281]  [<ffffffff810835d6>] process_one_work+0x24d/0x47f
[ 2866.132281]  [<ffffffff81083567>] ? process_one_work+0x1de/0x47f
[ 2866.132281]  [<ffffffff819a0f0b>] ? __rpc_execute+0x225/0x225
[ 2866.132281]  [<ffffffff81083a6d>] worker_thread+0x236/0x317
[ 2866.132281]  [<ffffffff81083837>] ? process_scheduled_works+0x2f/0x2f
[ 2866.132281]  [<ffffffff8108b7b8>] kthread+0x9a/0xa2
[ 2866.132281]  [<ffffffff81a12184>] kernel_thread_helper+0x4/0x10
[ 2866.132281]  [<ffffffff81a0a4b0>] ? retint_restore_args+0x13/0x13
[ 2866.132281]  [<ffffffff8108b71e>] ? __init_kthread_worker+0x5a/0x5a
[ 2866.132281]  [<ffffffff81a12180>] ? gs_change+0x13/0x13
[ 2866.308506] IPv4: Attempt to release TCP socket in state 1 ffff880019ec0000
[ 2866.309689] =============================================================================
[ 2866.310254] BUG TCP (Not tainted): Object already free
[ 2866.310254] -----------------------------------------------------------------------------
[ 2866.310254]

The bug comes from the fact that timer set in sk_reset_timer() can run
before we actually do the sock_hold(). socket refcount reaches zero and
we free the socket too soon.

timer handler is not allowed to reduce socket refcnt if socket is owned
by the user, or we need to change sk_reset_timer() implementation.

We should take a reference on the socket in case TCP_DELACK_TIMER_DEFERRED
or TCP_DELACK_TIMER_DEFERRED bit are set in tsq_flags

Also fix a typo in tcp_delack_timer(), where TCP_WRITE_TIMER_DEFERRED
was used instead of TCP_DELACK_TIMER_DEFERRED.

For consistency, use same socket refcount change for TCP_MTU_REDUCED_DEFERRED,
even if not fired from a timer.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
144d56e

@popcornmix popcornmix pushed a commit that referenced this issue Oct 13, 2012

@davem330 Eric Dumazet + davem330 l2tp: fix a lockdep splat
Fixes following lockdep splat :

[ 1614.734896] =============================================
[ 1614.734898] [ INFO: possible recursive locking detected ]
[ 1614.734901] 3.6.0-rc3+ #782 Not tainted
[ 1614.734903] ---------------------------------------------
[ 1614.734905] swapper/11/0 is trying to acquire lock:
[ 1614.734907]  (slock-AF_INET){+.-...}, at: [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.734920]
[ 1614.734920] but task is already holding lock:
[ 1614.734922]  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734932]
[ 1614.734932] other info that might help us debug this:
[ 1614.734935]  Possible unsafe locking scenario:
[ 1614.734935]
[ 1614.734937]        CPU0
[ 1614.734938]        ----
[ 1614.734940]   lock(slock-AF_INET);
[ 1614.734943]   lock(slock-AF_INET);
[ 1614.734946]
[ 1614.734946]  *** DEADLOCK ***
[ 1614.734946]
[ 1614.734949]  May be due to missing lock nesting notation
[ 1614.734949]
[ 1614.734952] 7 locks held by swapper/11/0:
[ 1614.734954]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff81592801>] __netif_receive_skb+0x251/0xd00
[ 1614.734964]  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815d319c>] ip_local_deliver_finish+0x4c/0x4e0
[ 1614.734972]  #2:  (rcu_read_lock){.+.+..}, at: [<ffffffff8160d116>] icmp_socket_deliver+0x46/0x230
[ 1614.734982]  #3:  (slock-AF_INET){+.-...}, at: [<ffffffff815fce23>] tcp_v4_err+0x163/0x6b0
[ 1614.734989]  #4:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da240>] ip_queue_xmit+0x0/0x680
[ 1614.734997]  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9925>] ip_finish_output+0x135/0x890
[ 1614.735004]  #6:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595680>] dev_queue_xmit+0x0/0xe00
[ 1614.735012]
[ 1614.735012] stack backtrace:
[ 1614.735016] Pid: 0, comm: swapper/11 Not tainted 3.6.0-rc3+ #782
[ 1614.735018] Call Trace:
[ 1614.735020]  <IRQ>  [<ffffffff810a50ac>] __lock_acquire+0x144c/0x1b10
[ 1614.735033]  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
[ 1614.735037]  [<ffffffff810a6762>] ? mark_held_locks+0x82/0x130
[ 1614.735042]  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
[ 1614.735047]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735051]  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
[ 1614.735060]  [<ffffffff81749b31>] _raw_spin_lock+0x41/0x50
[ 1614.735065]  [<ffffffffa0209d72>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735069]  [<ffffffffa0209d72>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
[ 1614.735075]  [<ffffffffa014f7f2>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
[ 1614.735079]  [<ffffffff81595112>] dev_hard_start_xmit+0x502/0xa70
[ 1614.735083]  [<ffffffff81594c6e>] ? dev_hard_start_xmit+0x5e/0xa70
[ 1614.735087]  [<ffffffff815957c1>] ? dev_queue_xmit+0x141/0xe00
[ 1614.735093]  [<ffffffff815b622e>] sch_direct_xmit+0xfe/0x290
[ 1614.735098]  [<ffffffff81595865>] dev_queue_xmit+0x1e5/0xe00
[ 1614.735102]  [<ffffffff81595680>] ? dev_hard_start_xmit+0xa70/0xa70
[ 1614.735106]  [<ffffffff815b4daa>] ? eth_header+0x3a/0xf0
[ 1614.735111]  [<ffffffff8161d33e>] ? fib_get_table+0x2e/0x280
[ 1614.735117]  [<ffffffff8160a7e2>] arp_xmit+0x22/0x60
[ 1614.735121]  [<ffffffff8160a863>] arp_send+0x43/0x50
[ 1614.735125]  [<ffffffff8160b82f>] arp_solicit+0x18f/0x450
[ 1614.735132]  [<ffffffff8159d9da>] neigh_probe+0x4a/0x70
[ 1614.735137]  [<ffffffff815a191a>] __neigh_event_send+0xea/0x300
[ 1614.735141]  [<ffffffff815a1c93>] neigh_resolve_output+0x163/0x260
[ 1614.735146]  [<ffffffff815d9cf5>] ip_finish_output+0x505/0x890
[ 1614.735150]  [<ffffffff815d9925>] ? ip_finish_output+0x135/0x890
[ 1614.735154]  [<ffffffff815dae79>] ip_output+0x59/0xf0
[ 1614.735158]  [<ffffffff815da1cd>] ip_local_out+0x2d/0xa0
[ 1614.735162]  [<ffffffff815da403>] ip_queue_xmit+0x1c3/0x680
[ 1614.735165]  [<ffffffff815da240>] ? ip_local_out+0xa0/0xa0
[ 1614.735172]  [<ffffffff815f4402>] tcp_transmit_skb+0x402/0xa60
[ 1614.735177]  [<ffffffff815f5a11>] tcp_retransmit_skb+0x1a1/0x620
[ 1614.735181]  [<ffffffff815f7e93>] tcp_retransmit_timer+0x393/0x960
[ 1614.735185]  [<ffffffff815fce23>] ? tcp_v4_err+0x163/0x6b0
[ 1614.735189]  [<ffffffff815fd317>] tcp_v4_err+0x657/0x6b0
[ 1614.735194]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735199]  [<ffffffff8160d19e>] icmp_socket_deliver+0xce/0x230
[ 1614.735203]  [<ffffffff8160d116>] ? icmp_socket_deliver+0x46/0x230
[ 1614.735208]  [<ffffffff8160d464>] icmp_unreach+0xe4/0x2c0
[ 1614.735213]  [<ffffffff8160e520>] icmp_rcv+0x350/0x4a0
[ 1614.735217]  [<ffffffff815d3285>] ip_local_deliver_finish+0x135/0x4e0
[ 1614.735221]  [<ffffffff815d319c>] ? ip_local_deliver_finish+0x4c/0x4e0
[ 1614.735225]  [<ffffffff815d3ffa>] ip_local_deliver+0x4a/0x90
[ 1614.735229]  [<ffffffff815d37b7>] ip_rcv_finish+0x187/0x730
[ 1614.735233]  [<ffffffff815d425d>] ip_rcv+0x21d/0x300
[ 1614.735237]  [<ffffffff81592a1b>] __netif_receive_skb+0x46b/0xd00
[ 1614.735241]  [<ffffffff81592801>] ? __netif_receive_skb+0x251/0xd00
[ 1614.735245]  [<ffffffff81593368>] process_backlog+0xb8/0x180
[ 1614.735249]  [<ffffffff81593cf9>] net_rx_action+0x159/0x330
[ 1614.735257]  [<ffffffff810491f0>] __do_softirq+0xd0/0x3e0
[ 1614.735264]  [<ffffffff8109ed24>] ? tick_program_event+0x24/0x30
[ 1614.735270]  [<ffffffff8175419c>] call_softirq+0x1c/0x30
[ 1614.735278]  [<ffffffff8100425d>] do_softirq+0x8d/0xc0
[ 1614.735282]  [<ffffffff8104983e>] irq_exit+0xae/0xe0
[ 1614.735287]  [<ffffffff8175494e>] smp_apic_timer_interrupt+0x6e/0x99
[ 1614.735291]  [<ffffffff81753a1c>] apic_timer_interrupt+0x6c/0x80
[ 1614.735293]  <EOI>  [<ffffffff810a14ad>] ? trace_hardirqs_off+0xd/0x10
[ 1614.735306]  [<ffffffff81336f85>] ? intel_idle+0xf5/0x150
[ 1614.735310]  [<ffffffff81336f7e>] ? intel_idle+0xee/0x150
[ 1614.735317]  [<ffffffff814e6ea9>] cpuidle_enter+0x19/0x20
[ 1614.735321]  [<ffffffff814e7538>] cpuidle_idle_call+0xa8/0x630
[ 1614.735327]  [<ffffffff8100c1ba>] cpu_idle+0x8a/0xe0
[ 1614.735333]  [<ffffffff8173762e>] start_secondary+0x220/0x222

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
37159ef

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 26, 2012

@davem330 Eric Dumazet + davem330 bonding: set qdisc_tx_busylock to avoid LOCKDEP splat
If a qdisc is installed on a bonding device, its possible to get
following lockdep splat under stress :

 =============================================
 [ INFO: possible recursive locking detected ]
 3.6.0+ #211 Not tainted
 ---------------------------------------------
 ping/4876 is trying to acquire lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 but task is already holding lock:
  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
   lock(dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 6 locks held by ping/4876:
  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e5030>] raw_sendmsg+0x600/0xc30
  #1:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815ba4bd>] ip_finish_output+0x12d/0x870
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830
  #3:  (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+.-...}, at: [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  #4:  (&bond->lock){++.?..}, at: [<ffffffffa02128c1>] bond_start_xmit+0x31/0x4b0 [bonding]
  #5:  (rcu_read_lock_bh){.+....}, at: [<ffffffff8157a0b0>] dev_queue_xmit+0x0/0x830

 stack backtrace:
 Pid: 4876, comm: ping Not tainted 3.6.0+ #211
 Call Trace:
  [<ffffffff810a0145>] __lock_acquire+0x715/0x1b80
  [<ffffffff810a256b>] ? mark_held_locks+0x9b/0x100
  [<ffffffff810a1bf2>] lock_acquire+0x92/0x1d0
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff81726b7c>] _raw_spin_lock+0x3c/0x50
  [<ffffffff8157a191>] ? dev_queue_xmit+0xe1/0x830
  [<ffffffff8106264d>] ? rcu_read_lock_bh_held+0x5d/0x90
  [<ffffffff8157a191>] dev_queue_xmit+0xe1/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffffa0212a6a>] bond_start_xmit+0x1da/0x4b0 [bonding]
  [<ffffffff815796d0>] dev_hard_start_xmit+0x240/0x6b0
  [<ffffffff81597c6e>] sch_direct_xmit+0xfe/0x2a0
  [<ffffffff8157a249>] dev_queue_xmit+0x199/0x830
  [<ffffffff8157a0b0>] ? netdev_pick_tx+0x570/0x570
  [<ffffffff815ba96f>] ip_finish_output+0x5df/0x870
  [<ffffffff815ba4bd>] ? ip_finish_output+0x12d/0x870
  [<ffffffff815bb964>] ip_output+0x54/0xf0
  [<ffffffff815bad48>] ip_local_out+0x28/0x90
  [<ffffffff815bc444>] ip_send_skb+0x14/0x50
  [<ffffffff815bc4b2>] ip_push_pending_frames+0x32/0x40
  [<ffffffff815e536a>] raw_sendmsg+0x93a/0xc30
  [<ffffffff8128d570>] ? selinux_file_send_sigiotask+0x1f0/0x1f0
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8109ddb4>] ? __lock_is_held+0x54/0x80
  [<ffffffff815f6855>] inet_sendmsg+0x125/0x240
  [<ffffffff815f6730>] ? inet_recvmsg+0x220/0x220
  [<ffffffff8155cddb>] sock_sendmsg+0xab/0xe0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff810a1650>] ? lock_release_non_nested+0xa0/0x2e0
  [<ffffffff8155d18c>] __sys_sendmsg+0x37c/0x390
  [<ffffffff81195b2a>] ? fsnotify+0x2ca/0x7e0
  [<ffffffff811958e8>] ? fsnotify+0x88/0x7e0
  [<ffffffff81361f36>] ? put_ldisc+0x56/0xd0
  [<ffffffff8116f98a>] ? fget_light+0x3da/0x510
  [<ffffffff8155f6c4>] sys_sendmsg+0x44/0x80
  [<ffffffff8172fc22>] system_call_fastpath+0x16/0x1b

Avoid this problem using a distinct lock_class_key for bonding
devices.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
49ee492

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 26, 2012

@lenb Srivatsa S. Bhat + lenb ACPI idle, CPU hotplug: Fix NULL pointer dereference during hotplug
On a KVM guest, when a CPU is taken offline and brought back online, we hit
the following NULL pointer dereference:

[   45.400843] Unregister pv shared memory for cpu 1
[   45.412331] smpboot: CPU 1 is now offline
[   45.529894] SMP alternatives: lockdep: fixing up alternatives
[   45.533472] smpboot: Booting Node 0 Processor 1 APIC 0x1
[   45.411526] kvm-clock: cpu 1, msr 0:7d14601, secondary cpu clock
[   45.571370] KVM setup async PF for cpu 1
[   45.572331] kvm-stealtime: cpu 1, msr 7d0e040
[   45.575031] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   45.576017] IP: [<ffffffff81519f98>] cpuidle_disable_device+0x18/0x80
[   45.576017] PGD 5dfb067 PUD 5da8067 PMD 0
[   45.576017] Oops: 0000 [#1] SMP
[   45.576017] Modules linked in:
[   45.576017] CPU 0
[   45.576017] Pid: 607, comm: stress_cpu_hotp Not tainted 3.6.0-padata-tp-debug #3 Bochs Bochs
[   45.576017] RIP: 0010:[<ffffffff81519f98>]  [<ffffffff81519f98>] cpuidle_disable_device+0x18/0x80
[   45.576017] RSP: 0018:ffff880005d93ce8  EFLAGS: 00010286
[   45.576017] RAX: ffff880005d93fd8 RBX: 0000000000000000 RCX: 0000000000000006
[   45.576017] RDX: 0000000000000006 RSI: 2222222222222222 RDI: 0000000000000000
[   45.576017] RBP: ffff880005d93cf8 R08: 2222222222222222 R09: 2222222222222222
[   45.576017] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[   45.576017] R13: 0000000000000000 R14: ffffffff81c8cca0 R15: 0000000000000001
[   45.576017] FS:  00007f91936ae700(0000) GS:ffff880007c00000(0000) knlGS:0000000000000000
[   45.576017] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   45.576017] CR2: 0000000000000000 CR3: 0000000005db3000 CR4: 00000000000006f0
[   45.576017] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   45.576017] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   45.576017] Process stress_cpu_hotp (pid: 607, threadinfo ffff880005d92000, task ffff8800066bbf40)
[   45.576017] Stack:
[   45.576017]  ffff880007a96400 0000000000000000 ffff880005d93d28 ffffffff813ac689
[   45.576017]  ffff880007a96400 ffff880007a96400 0000000000000002 ffffffff81cd8d01
[   45.576017]  ffff880005d93d58 ffffffff813aa498 0000000000000001 00000000ffffffdd
[   45.576017] Call Trace:
[   45.576017]  [<ffffffff813ac689>] acpi_processor_hotplug+0x55/0x97
[   45.576017]  [<ffffffff813aa498>] acpi_cpu_soft_notify+0x93/0xce
[   45.576017]  [<ffffffff816ae47d>] notifier_call_chain+0x5d/0x110
[   45.576017]  [<ffffffff8109730e>] __raw_notifier_call_chain+0xe/0x10
[   45.576017]  [<ffffffff81069050>] __cpu_notify+0x20/0x40
[   45.576017]  [<ffffffff81069085>] cpu_notify+0x15/0x20
[   45.576017]  [<ffffffff816978f1>] _cpu_up+0xee/0x137
[   45.576017]  [<ffffffff81697983>] cpu_up+0x49/0x59
[   45.576017]  [<ffffffff8168758d>] store_online+0x9d/0xe0
[   45.576017]  [<ffffffff8140a9f8>] dev_attr_store+0x18/0x30
[   45.576017]  [<ffffffff812322c0>] sysfs_write_file+0xe0/0x150
[   45.576017]  [<ffffffff811b389c>] vfs_write+0xac/0x180
[   45.576017]  [<ffffffff811b3be2>] sys_write+0x52/0xa0
[   45.576017]  [<ffffffff816b31e9>] system_call_fastpath+0x16/0x1b
[   45.576017] Code: 48 c7 c7 40 e5 ca 81 e8 07 d0 18 00 5d c3 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 10 48 89 5d f0 4c 89 65 f8 48 89 fb <f6> 07 02 75 13 48 8b 5d f0 4c 8b 65 f8 c9 c3 66 0f 1f 84 00 00
[   45.576017] RIP  [<ffffffff81519f98>] cpuidle_disable_device+0x18/0x80
[   45.576017]  RSP <ffff880005d93ce8>
[   45.576017] CR2: 0000000000000000
[   45.656079] ---[ end trace 433d6c9ac0b02cef ]---

Analysis:
Commit 3d339dc (cpuidle / ACPI : move cpuidle_device field out of the
acpi_processor_power structure()) made the allocation of the dev structure
(struct cpuidle) of a CPU dynamic, whereas previously it was statically
allocated. And this dynamic allocation occurs in acpi_processor_power_init()
if pr->flags.power evaluates to non-zero.

On KVM guests, pr->flags.power evaluates to zero, hence dev is never
allocated. This causes the NULL pointer (dev) dereference in
cpuidle_disable_device() during a subsequent CPU online operation. Fix this
by ensuring that dev is non-NULL before dereferencing.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Len Brown <len.brown@intel.com>
cf31cd1

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Oct 26, 2012

@PeterHueweIFX PeterHueweIFX + Kent Yoder tpm: Propagate error from tpm_transmit to fix a timeout hang
tpm_write calls tpm_transmit without checking the return value and
assigns the return value unconditionally to chip->pending_data, even if
it's an error value.
This causes three bugs.

So if we write to /dev/tpm0 with a tpm_param_size bigger than
TPM_BUFSIZE=0x1000 (e.g. 0x100a)
and a bufsize also bigger than TPM_BUFSIZE (e.g. 0x100a)
tpm_transmit returns -E2BIG which is assigned to chip->pending_data as
-7, but tpm_write returns that TPM_BUFSIZE bytes have been successfully
been written to the TPM, altough this is not true (bug #1).

As we did write more than than TPM_BUFSIZE bytes but tpm_write reports
that only TPM_BUFSIZE bytes have been written the vfs tries to write
the remaining bytes (in this case 10 bytes) to the tpm device driver via
tpm_write which then blocks at

 /* cannot perform a write until the read has cleared
 either via tpm_read or a user_read_timer timeout */
 while (atomic_read(&chip->data_pending) != 0)
	 msleep(TPM_TIMEOUT);

for 60 seconds, since data_pending is -7 and nobody is able to
read it (since tpm_read luckily checks if data_pending is greater than
0) (#bug 2).

After that the remaining bytes are written to the TPM which are
interpreted by the tpm as a normal command. (bug #3)
So if the last bytes of the command stream happen to be a e.g.
tpm_force_clear this gets accidentally sent to the TPM.

This patch fixes all three bugs, by propagating the error code of
tpm_write and returning -E2BIG if the input buffer is too big,
since the response from the tpm for a truncated value is bogus anyway.
Moreover it returns -EBUSY to userspace if there is a response ready to be
read.

Signed-off-by: Peter Huewe <peter.huewe@infineon.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
abce9ac

@popcornmix popcornmix pushed a commit that referenced this issue Nov 14, 2012

@gregkh Andreas Bießmann + gregkh mtd: omap2: fix omap_nand_remove segfault
commit 7d9b110 upstream.

Do not kfree() the mtd_info; it is handled in the mtd subsystem and
already freed by nand_release(). Instead kfree() the struct
omap_nand_info allocated in omap_nand_probe which was not freed before.

This patch fixes following error when unloading the omap2 module:

---8<---
~ $ rmmod omap2
------------[ cut here ]------------
kernel BUG at mm/slab.c:3126!
Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
Modules linked in: omap2(-)
CPU: 0    Not tainted  (3.6.0-rc3-00230-g155e36d-dirty #3)
PC is at cache_free_debugcheck+0x2d4/0x36c
LR is at kfree+0xc8/0x2ac
pc : [<c01125a0>]    lr : [<c0112efc>]    psr: 200d0193
sp : c521fe08  ip : c0e8ef90  fp : c521fe5c
r10: bf0001fc  r9 : c521e000  r8 : c0d99c8c
r7 : c661ebc0  r6 : c065d5a4  r5 : c65c4060  r4 : c78005c0
r3 : 00000000  r2 : 00001000  r1 : c65c4000  r0 : 00000001
Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 10c5387d  Table: 86694019  DAC: 00000015
Process rmmod (pid: 549, stack limit = 0xc521e2f0)
Stack: (0xc521fe08 to 0xc5220000)
fe00:                   c008a874 c00bf44c c515c6d0 200d0193 c65c4860 c515c240
fe20: c521fe3c c521fe30 c008a9c0 c008a854 c521fe5c c65c4860 c78005c0 bf0001fc
fe40: c780ff40 a00d0113 c521e000 00000000 c521fe84 c521fe60 c0112efc c01122d8
fe60: c65c4860 c0673778 c06737ac 00000000 00070013 00000000 c521fe9c c521fe88
fe80: bf0001fc c0112e40 c0673778 bf001ca8 c521feac c521fea0 c02ca11c bf0001ac
fea0: c521fec4 c521feb0 c02c82c4 c02ca100 c0673778 bf001ca8 c521fee4 c521fec8
fec0: c02c8dd8 c02c8250 00000000 bf001ca8 bf001ca8 c0804ee0 c521ff04 c521fee8
fee0: c02c804c c02c8d20 bf001924 00000000 bf001ca8 c521e000 c521ff1c c521ff08
ff00: c02c950c c02c7fbc bf001d48 00000000 c521ff2c c521ff20 c02ca3a4 c02c94b8
ff20: c521ff3c c521ff30 bf001938 c02ca394 c521ffa4 c521ff40 c009beb4 bf001930
ff40: c521ff6c 70616d6f b6fe0032 c0014f84 70616d6f b6fe0032 00000081 60070010
ff60: c521ff84 c521ff70 c008e1f4 c00bf328 0001a004 70616d6f c521ff94 0021ff88
ff80: c008e368 0001a004 70616d6f b6fe0032 00000081 c0015028 00000000 c521ffa8
ffa0: c0014dc0 c009bcd0 0001a004 70616d6f bec2ab38 00000880 bec2ab38 00000880
ffc0: 0001a004 70616d6f b6fe0032 00000081 00000319 00000000 b6fe1000 00000000
ffe0: bec2ab30 bec2ab20 00019f00 b6f539c0 60070010 bec2ab38 aaaaaaaa aaaaaaaa
Backtrace:
[<c01122cc>] (cache_free_debugcheck+0x0/0x36c) from [<c0112efc>] (kfree+0xc8/0x2ac)
[<c0112e34>] (kfree+0x0/0x2ac) from [<bf0001fc>] (omap_nand_remove+0x5c/0x64 [omap2])
[<bf0001a0>] (omap_nand_remove+0x0/0x64 [omap2]) from [<c02ca11c>] (platform_drv_remove+0x28/0x2c)
 r5:bf001ca8 r4:c0673778
[<c02ca0f4>] (platform_drv_remove+0x0/0x2c) from [<c02c82c4>] (__device_release_driver+0x80/0xdc)
[<c02c8244>] (__device_release_driver+0x0/0xdc) from [<c02c8dd8>] (driver_detach+0xc4/0xc8)
 r5:bf001ca8 r4:c0673778
[<c02c8d14>] (driver_detach+0x0/0xc8) from [<c02c804c>] (bus_remove_driver+0x9c/0x104)
 r6:c0804ee0 r5:bf001ca8 r4:bf001ca8 r3:00000000
[<c02c7fb0>] (bus_remove_driver+0x0/0x104) from [<c02c950c>] (driver_unregister+0x60/0x80)
 r6:c521e000 r5:bf001ca8 r4:00000000 r3:bf001924
[<c02c94ac>] (driver_unregister+0x0/0x80) from [<c02ca3a4>] (platform_driver_unregister+0x1c/0x20)
 r5:00000000 r4:bf001d48
[<c02ca388>] (platform_driver_unregister+0x0/0x20) from [<bf001938>] (omap_nand_driver_exit+0x14/0x1c [omap2])
[<bf001924>] (omap_nand_driver_exit+0x0/0x1c [omap2]) from [<c009beb4>] (sys_delete_module+0x1f0/0x2ec)
[<c009bcc4>] (sys_delete_module+0x0/0x2ec) from [<c0014dc0>] (ret_fast_syscall+0x0/0x48)
 r8:c0015028 r7:00000081 r6:b6fe0032 r5:70616d6f r4:0001a004
Code: e1a00005 eb0d9172 e7f001f2 e7f001f2 (e7f001f2)
---[ end trace 6a30b24d8c0cc2ee ]---
Segmentation fault
--->8---

This error was introduced in 67ce04b which
was the first commit of this driver.

Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ee9be6a

@popcornmix popcornmix pushed a commit that referenced this issue Nov 14, 2012

@PeterHueweIFX @gregkh PeterHueweIFX + gregkh tpm: Propagate error from tpm_transmit to fix a timeout hang
commit abce9ac upstream.

tpm_write calls tpm_transmit without checking the return value and
assigns the return value unconditionally to chip->pending_data, even if
it's an error value.
This causes three bugs.

So if we write to /dev/tpm0 with a tpm_param_size bigger than
TPM_BUFSIZE=0x1000 (e.g. 0x100a)
and a bufsize also bigger than TPM_BUFSIZE (e.g. 0x100a)
tpm_transmit returns -E2BIG which is assigned to chip->pending_data as
-7, but tpm_write returns that TPM_BUFSIZE bytes have been successfully
been written to the TPM, altough this is not true (bug #1).

As we did write more than than TPM_BUFSIZE bytes but tpm_write reports
that only TPM_BUFSIZE bytes have been written the vfs tries to write
the remaining bytes (in this case 10 bytes) to the tpm device driver via
tpm_write which then blocks at

 /* cannot perform a write until the read has cleared
 either via tpm_read or a user_read_timer timeout */
 while (atomic_read(&chip->data_pending) != 0)
	 msleep(TPM_TIMEOUT);

for 60 seconds, since data_pending is -7 and nobody is able to
read it (since tpm_read luckily checks if data_pending is greater than
0) (#bug 2).

After that the remaining bytes are written to the TPM which are
interpreted by the tpm as a normal command. (bug #3)
So if the last bytes of the command stream happen to be a e.g.
tpm_force_clear this gets accidentally sent to the TPM.

This patch fixes all three bugs, by propagating the error code of
tpm_write and returning -E2BIG if the input buffer is too big,
since the response from the tpm for a truncated value is bogus anyway.
Moreover it returns -EBUSY to userspace if there is a response ready to be
read.

Signed-off-by: Peter Huewe <peter.huewe@infineon.com>
Signed-off-by: Kent Yoder <key@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
561948d

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@mfleming mfleming + Ingo Molnar x86/efi: Fix oops caused by incorrect set_memory_uc() usage
Calling __pa() with an ioremap'd address is invalid. If we
encounter an efi_memory_desc_t without EFI_MEMORY_WB set in
->attribute we currently call set_memory_uc(), which in turn
calls __pa() on a potentially ioremap'd address.

On CONFIG_X86_32 this results in the following oops:

  BUG: unable to handle kernel paging request at f7f22280
  IP: [<c10257b9>] reserve_ram_pages_type+0x89/0x210
  *pdpt = 0000000001978001 *pde = 0000000001ffb067 *pte = 0000000000000000
  Oops: 0000 [#1] PREEMPT SMP
  Modules linked in:

  Pid: 0, comm: swapper Not tainted 3.0.0-acpi-efi-0805 #3
   EIP: 0060:[<c10257b9>] EFLAGS: 00010202 CPU: 0
   EIP is at reserve_ram_pages_type+0x89/0x210
   EAX: 0070e280 EBX: 38714000 ECX: f7814000 EDX: 00000000
   ESI: 00000000 EDI: 38715000 EBP: c189fef0 ESP: c189fea8
   DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
  Process swapper (pid: 0, ti=c189e000 task=c18bbe60 task.ti=c189e000)
  Stack:
   80000200 ff108000 00000000 c189ff00 00038714 00000000 00000000 c189fed0
   c104f8ca 00038714 00000000 00038715 00000000 00000000 00038715 00000000
   00000010 38715000 c189ff48 c1025aff 38715000 00000000 00000010 00000000
  Call Trace:
   [<c104f8ca>] ? page_is_ram+0x1a/0x40
   [<c1025aff>] reserve_memtype+0xdf/0x2f0
   [<c1024dc9>] set_memory_uc+0x49/0xa0
   [<c19334d0>] efi_enter_virtual_mode+0x1c2/0x3aa
   [<c19216d4>] start_kernel+0x291/0x2f2
   [<c19211c7>] ? loglevel+0x1b/0x1b
   [<c19210bf>] i386_start_kernel+0xbf/0xc8

The only time we can call set_memory_uc() for a memory region is
when it is part of the direct kernel mapping. For the case where
we ioremap a memory region we must leave it alone.

This patch reimplements the fix from e8c7106 ("x86, efi:
Calling __pa() with an ioremap()ed address is invalid") which
was reverted in e1ad783 because it caused a regression on
some MacBooks (they hung at boot). The regression was caused
because the commit only marked EFI_RUNTIME_SERVICES_DATA as
E820_RESERVED_EFI, when it should have marked all regions that
have the EFI_MEMORY_RUNTIME attribute.

Despite first impressions, it's not possible to use
ioremap_cache() to map all cached memory regions on
CONFIG_X86_64 because of the way that the memory map might be
configured as detailed in the following bug report,

	https://bugzilla.redhat.com/show_bug.cgi?id=748516

e.g. some of the EFI memory regions *need* to be mapped as part
of the direct kernel mapping.

Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Huang Ying <huang.ying.caritas@gmail.com>
Cc: Keith Packard <keithp@keithp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1350649546-23541-1-git-send-email-matt@console-pimps.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
3e8fa26

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@htejun htejun cgroup: deactivate CSS's and mark cgroup dead before invoking ->pre_d…
…estroy()

Because ->pre_destroy() could fail and can't be called under
cgroup_mutex, cgroup destruction did something very ugly.

  1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.

  2. Release cgroup_mutex and call ->pre_destroy().

  3. Re-grab cgroup_mutex and verify it can still be destroyed; fail
     otherwise.

  4. Continue destroying.

In addition to being ugly, it has been always broken in various ways.
For example, memcg ->pre_destroy() expects the cgroup to be inactive
after it's done but tasks can be attached and detached between #2 and
#3 and the conditions that memcg verified in ->pre_destroy() might no
longer hold by the time control reaches #3.

Now that ->pre_destroy() is no longer allowed to fail.  We can switch
to the following.

  1. Grab cgroup_mutex and verify it can be destroyed; fail otherwise.

  2. Deactivate CSS's and mark the cgroup removed thus preventing any
     further operations which can invalidate the verification from #1.

  3. Release cgroup_mutex and call ->pre_destroy().

  4. Re-grab cgroup_mutex and continue destroying.

After this change, controllers can safely assume that ->pre_destroy()
will only be called only once for a given cgroup and, once
->pre_destroy() is called, the cgroup will stay dormant till it's
destroyed.

This removes the only reason ->pre_destroy() can fail - new task being
attached or child cgroup being created inbetween.  Error out path is
removed and ->pre_destroy() invocation is open coded in
cgroup_rmdir().

v2: cgroup_call_pre_destroy() removal moved to this patch per Michal.
    Commit message updated per Glauber.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: Glauber Costa <glommer@parallels.com>
1a90dd5

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

Philipp Reisner drbd: Take a reference on tconn when finding a tconn by name
Rule #3 of kref.txt

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
0ace9df

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@jonhunter jonhunter ARM: OMAP3+: Implement timer workaround for errata i103 and i767
Errata Titles:
i103: Delay needed to read some GP timer, WD timer and sync timer
      registers after wakeup (OMAP3/4)
i767: Delay needed to read some GP timer registers after wakeup (OMAP5)

Description (i103/i767):
If a General Purpose Timer (GPTimer) is in posted mode
(TSICR [2].POSTED=1), due to internal resynchronizations, values read in
TCRR, TCAR1 and TCAR2 registers right after the timer interface clock
(L4) goes from stopped to active may not return the expected values. The
most common event leading to this situation occurs upon wake up from
idle.

GPTimer non-posted synchronization mode is not impacted by this
limitation.

Workarounds:
1). Disable posted mode
2). Use static dependency between timer clock domain and MPUSS clock
    domain
3). Use no-idle mode when the timer is active

Workarounds #2 and #3 are not pratical from a power standpoint and so
workaround #1 has been implemented. Disabling posted mode adds some CPU
overhead for configuring and reading the timers as the CPU has to wait
for accesses to be re-synchronised within the timer. However, disabling
posted mode guarantees correct operation.

Please note that it is safe to use posted mode for timers if the counter
(TCRR) and capture (TCARx) registers will never be read. An example of
this is the clock-event system timer. This is used by the kernel to
schedule events however, the timers counter is never read and capture
registers are not used. Given that the kernel configures this timer
often yet never reads the counter register it is safe to enable posted
mode in this case. Hence, for the timer used for kernel clock-events,
posted mode is enabled by overriding the errata for devices that are
impacted by this defect.

For drivers using the timers that do not read the counter or capture
registers and wish to use posted mode, can override the errata and
enable posted mode by making the following function calls.

	__omap_dm_timer_override_errata(timer, OMAP_TIMER_ERRATA_I103_I767);
	__omap_dm_timer_enable_posted(timer);

Both dmtimers and watchdogs are impacted by this defect this patch only
implements the workaround for the dmtimer. Currently the watchdog driver
does not read the counter register and so no workaround is necessary.

Posted mode will be disabled for all OMAP2+ devices (including AM33xx)
using a GP timer as a clock-source timer to guarantee correct operation.
This is not necessary for OMAP24xx devices but the default clock-source
timer for OMAP24xx devices is the 32k-sync timer and not the GP timer
and so should not have any impact. This should be re-visited for future
devices if this errata is fixed.

Confirmed with Vaibhav Hiremath that this bug also impacts AM33xx
devices.

Signed-off-by: Jon Hunter <jon-hunter@ti.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
bfd6d02

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@paulmck paulmck sched: Mark RCU reader in sched_show_task()
When sched_show_task() is invoked from try_to_freeze_tasks(), there is
no RCU read-side critical section, resulting in the following splat:

[  125.780730] ===============================
[  125.780766] [ INFO: suspicious RCU usage. ]
[  125.780804] 3.7.0-rc3+ #988 Not tainted
[  125.780838] -------------------------------
[  125.780875] /home/rafael/src/linux/kernel/sched/core.c:4497 suspicious rcu_dereference_check() usage!
[  125.780946]
[  125.780946] other info that might help us debug this:
[  125.780946]
[  125.781031]
[  125.781031] rcu_scheduler_active = 1, debug_locks = 0
[  125.781087] 4 locks held by s2ram/4211:
[  125.781120]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811e2acf>] sysfs_write_file+0x3f/0x160
[  125.781233]  #1:  (s_active#94){.+.+.+}, at: [<ffffffff811e2b58>] sysfs_write_file+0xc8/0x160
[  125.781339]  #2:  (pm_mutex){+.+.+.}, at: [<ffffffff81090a81>] pm_suspend+0x81/0x230
[  125.781439]  #3:  (tasklist_lock){.?.?..}, at: [<ffffffff8108feed>] try_to_freeze_tasks+0x2cd/0x3f0
[  125.781543]
[  125.781543] stack backtrace:
[  125.781584] Pid: 4211, comm: s2ram Not tainted 3.7.0-rc3+ #988
[  125.781632] Call Trace:
[  125.781662]  [<ffffffff810a3c73>] lockdep_rcu_suspicious+0x103/0x140
[  125.781719]  [<ffffffff8107cf21>] sched_show_task+0x121/0x180
[  125.781770]  [<ffffffff8108ffb4>] try_to_freeze_tasks+0x394/0x3f0
[  125.781823]  [<ffffffff810903b5>] freeze_kernel_threads+0x25/0x80
[  125.781876]  [<ffffffff81090b65>] pm_suspend+0x165/0x230
[  125.781924]  [<ffffffff8108fa29>] state_store+0x99/0x100
[  125.781975]  [<ffffffff812f5867>] kobj_attr_store+0x17/0x20
[  125.782038]  [<ffffffff811e2b71>] sysfs_write_file+0xe1/0x160
[  125.782091]  [<ffffffff811667a6>] vfs_write+0xc6/0x180
[  125.782138]  [<ffffffff81166ada>] sys_write+0x5a/0xa0
[  125.782185]  [<ffffffff812ff6ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  125.782242]  [<ffffffff81669dd2>] system_call_fastpath+0x16/0x1b

This commit therefore adds the needed RCU read-side critical section.

Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
4e79752

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@htejun htejun cgroup: remove incorrect dget/dput() pair in cgroup_create_dir()
cgroup_create_dir() does weird dancing with dentry refcnt.  On
success, it gets and then puts it achieving nothing.  On failure, it
puts but there isn't no matching get anywhere leading to the following
oops if cgroup_create_file() fails for whatever reason.

  ------------[ cut here ]------------
  kernel BUG at /work/os/work/fs/dcache.c:552!
  invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in:
  CPU 2
  Pid: 697, comm: mkdir Not tainted 3.7.0-rc4-work+ #3 Bochs Bochs
  RIP: 0010:[<ffffffff811d9c0c>]  [<ffffffff811d9c0c>] dput+0x1dc/0x1e0
  RSP: 0018:ffff88001a3ebef8  EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff88000e5b1ef8 RCX: 0000000000000403
  RDX: 0000000000000303 RSI: 2000000000000000 RDI: ffff88000e5b1f58
  RBP: ffff88001a3ebf18 R08: ffffffff82c76960 R09: 0000000000000001
  R10: ffff880015022080 R11: ffd9bed70f48a041 R12: 00000000ffffffea
  R13: 0000000000000001 R14: ffff88000e5b1f58 R15: 00007fff57656d60
  FS:  00007ff05fcb3800(0000) GS:ffff88001fd00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000004046f0 CR3: 000000001315f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process mkdir (pid: 697, threadinfo ffff88001a3ea000, task ffff880015022080)
  Stack:
   ffff88001a3ebf48 00000000ffffffea 0000000000000001 0000000000000000
   ffff88001a3ebf38 ffffffff811cc889 0000000000000001 ffff88000e5b1ef8
   ffff88001a3ebf68 ffffffff811d1fc9 ffff8800198d7f18 ffff880019106ef8
  Call Trace:
   [<ffffffff811cc889>] done_path_create+0x19/0x50
   [<ffffffff811d1fc9>] sys_mkdirat+0x59/0x80
   [<ffffffff811d2009>] sys_mkdir+0x19/0x20
   [<ffffffff81be1e02>] system_call_fastpath+0x16/0x1b
  Code: 00 48 8d 90 18 01 00 00 48 89 93 c0 00 00 00 4c 89 a0 18 01 00 00 48 8b 83 a0 00 00 00 83 80 28 01 00 00 01 e8 e6 6f a0 00 eb 92 <0f> 0b 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 49 89 fe 41
  RIP  [<ffffffff811d9c0c>] dput+0x1dc/0x1e0
   RSP <ffff88001a3ebef8>
  ---[ end trace 1277bcfd9561ddb0 ]---

Fix it by dropping the unnecessary dget/dput() pair.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Li Zefan <lizefan@huawei.com>
Cc: stable@vger.kernel.org
1754316

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@olafhering olafhering + Jeff Garzik ata_piix: reenable MS Virtual PC guests
An earlier commit cd00608 ("ata_piix:
defer disks to the Hyper-V drivers by default") broke MS Virtual PC
guests. Hyper-V guests and Virtual PC guests have nearly identical DMI
info. As a result the driver does currently ignore the emulated hardware
in Virtual PC guests and defers the handling to hv_blkvsc. Since Virtual
PC does not offer paravirtualized drivers no disks will be found in the
guest.

One difference in the DMI info is the product version. This patch adds a
match for MS Virtual PC 2007 and "unignores" the emulated hardware.

This was reported for openSuSE 12.1 in bugzilla:
https://bugzilla.novell.com/show_bug.cgi?id=737532

Here is a detailed list of DMI info from example guests:

hwinfo --bios:

virtual pc guest:

  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "VS2005R2"
    Serial: "3178-9905-1533-4840-9282-0569-59"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "5.0"
    Serial: "3178-9905-1533-4840-9282-0569-59"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "5.0"
    Serial: "3178-9905-1533-4840-9282-0569-59"
    Asset Tag: "7188-3705-6309-9738-9645-0364-00"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)

win2k8 guest:

  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "7.0"
    Serial: "9106-3420-9819-5495-1514-2075-48"
    Asset Tag: "7076-9522-6699-1042-9501-1785-77"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)

win2k12 guest:

  System Info: #1
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
    UUID: undefined, but settable
    Wake-up: 0x06 (Power Switch)
  Board Info: #2
    Manufacturer: "Microsoft Corporation"
    Product: "Virtual Machine"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
  Chassis Info: #3
    Manufacturer: "Microsoft Corporation"
    Version: "7.0"
    Serial: "8179-1954-0187-0085-3868-2270-14"
    Asset Tag: "8374-0485-4557-6331-0620-5845-25"
    Type: 0x03 (Desktop)
    Bootup State: 0x03 (Safe)
    Power Supply State: 0x03 (Safe)
    Thermal State: 0x01 (Other)
    Security Status: 0x01 (Other)

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
d990434

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@sashalevin sashalevin + James Bottomley [SCSI] prevent stack buffer overflow in host_reset
store_host_reset() has tried to re-invent the wheel to compare sysfs strings.
Unfortunately it did so poorly and never bothered to check the input from
userspace before overwriting stack with it, so something simple as:

echo "WoopsieWoopsie" >
/sys/devices/pseudo_0/adapter0/host0/scsi_host/host0/host_reset

would result in:

[  316.310101] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81f5bac7
[  316.310101]
[  316.320051] Pid: 6655, comm: sh Tainted: G        W    3.7.0-rc5-next-20121114-sasha-00016-g5c9d68d-dirty #129
[  316.320051] Call Trace:
[  316.340058] pps pps0: PPS event at 1352918752.620355751
[  316.340062] pps pps0: capture assert seq #303
[  316.320051]  [<ffffffff83b3856b>] panic+0xcd/0x1f4
[  316.320051]  [<ffffffff81f5bac7>] ? store_host_reset+0xd7/0x100
[  316.320051]  [<ffffffff8110b996>] __stack_chk_fail+0x16/0x20
[  316.320051]  [<ffffffff81f5bac7>] store_host_reset+0xd7/0x100
[  316.320051]  [<ffffffff81e55bb3>] dev_attr_store+0x13/0x30
[  316.320051]  [<ffffffff812f7db1>] sysfs_write_file+0x101/0x170
[  316.320051]  [<ffffffff8127acc8>] vfs_write+0xb8/0x180
[  316.320051]  [<ffffffff8127ae80>] sys_write+0x50/0xa0
[  316.320051]  [<ffffffff83c03418>] tracesys+0xe1/0xe6

Fix this by uninventing whatever was going on there and just use sysfs_streq.

Bug introduced by 2944369 ("[SCSI] scsi: Added support for adapter and
firmware reset").

[jejb: added necessary const to prevent compile warnings]
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: <stable@vger.kernel.org> #3.2+
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
072f19b

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

Armen Baloyan + James Bottomley [SCSI] qla2xxx: Properly set result field of bsg_job reply structure …
…for success and failure.

FC transport on receiving bsg_job submission failure, calls bsg_job->job_done()
and sets the bsg_job->reply->result the returned value. In contrast, when the
success code (0) is returned fc transport doesn't call bsg_job->job_done() and
doesn't populate bsg_job->reply->result.

Signed-off-by: Steve Hodgson <steve@purestorage.com>
Signed-off-by: Armen Baloyan <armen.baloyan@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: <stable@vger.kernel.org> #3.7
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
63ea923

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

Giridhar Malavali + James Bottomley [SCSI] qla2xxx: Change in setting UNLOADING flag and FC vports logout…
… sequence while unloading qla2xxx driver.

Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: <stable@vger.kernel.org> #3.7
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
220d36b

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

Steve Hodgson + James Bottomley [SCSI] qla2xxx: Free rsp_data even on error in qla2x00_process_loopba…
…ck()

Signed-off-by: Steve Hodgson <steve@purestorage.com>
Signed-off-by: Armen Baloyan <armen.baloyan@qlogic.com>
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Cc: <stable@vger.kernel.org> #3.7
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
9bceab4

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 5, 2012

@htejun Mike Galbraith + htejun workqueue: exit rescuer_thread() as TASK_RUNNING
A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
412d32e

@popcornmix popcornmix pushed a commit that referenced this issue Dec 11, 2012

@gregkh Mike Galbraith + gregkh workqueue: exit rescuer_thread() as TASK_RUNNING
commit 412d32e upstream.

A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2ccea42

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 18, 2012

@ralfbaechle David Daney + ralfbaechle MIPS: Avoid mcheck by flushing page range in huge_ptep_set_access_fla…
…gs()

Problem:

1) Huge page mapping of anonymous memory is initially invalid.  Will be
   faulted in by copy-on-write mechanism.

2) Userspace attempts store at the end of the huge mapping.

3) TLB Refill exception handler fill TLB with a normal (4K sized)
   invalid page at the end of the huge mapping virtual address range.

4) Userspace restarted, and re-attempts the store at the end of the
   huge mapping.

5) Page from #3 is invalid, we get a fault and go to the hugepage
   fault handler.  This tries to map a huge page and calls
   huge_ptep_set_access_flags() to install the mapping.

6) We just call the generic ptep_set_access_flags() to set up the page
   tables, but the flush there assumes a normal (4K sized) page and
   only tries to flush the first part of the huge page virtual address
   out of the TLB, since the existing entry from step #3 doesn't
   conflict, nothing is flushed.

7) We attempt to load the mapping into the TLB, but because it
   conflicts with the entry from step #3, we get a Machine Check
   exception.

The fix: Flush the entire rage covered by the huge page in
huge_ptep_set_access_flags(), and remove the optimization in
local_flush_tlb_range() so that the flush actually does the correct
thing.

Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: Hillf Danton <dhillf@gmail.com>
Patchwork: https://patchwork.linux-mips.org/patch/4661/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
(cherry picked from commit dd617f258cc39d36be26afee9912624a2d23112c)
ac53c4f

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 18, 2012

@davem330 davem330 Merge branch 'tipc_net-next_v2' of git://git.kernel.org/pub/scm/linux…
…/kernel/git/paulg/linux

Paul Gortmaker says:

====================
Changes since v1:
	-get rid of essentially unused variable spotted by
	 Neil Horman (patch #2)

	-drop patch #3; defer it for 3.9 content, so Neil,
	 Jon and Ying can discuss its specifics at their
	 leisure while net-next is closed.  (It had no
	 direct dependencies to the rest of the series, and
	 was just an optimization)

	-fix indentation of accept() code directly in place
	 vs. forking it out to a separate function (was patch
	 #10, now patch #9).

Rebuilt and re-ran tests just to ensure nothing odd happened.

Original v1 text follows, updated pull information follows that.

           ---------

Here is another batch of TIPC changes.  The most interesting
thing is probably the non-blocking socket connect - I'm told
there were several users looking forward to seeing this.

Also there were some resource limitation changes that had
the right intent back in 2005, but were now apparently causing
needless limitations to people's real use cases; those have
been relaxed/removed.

There is a lockdep splat fix, but no need for a stable backport,
since it is virtually impossible to trigger in mainline; you
have to essentially modify code to force the probabilities
in your favour to see it.

The rest can largely be categorized as general cleanup of things
seen in the process of getting the above changes done.

Tested between 64 and 32 bit nodes with the test suite.  I've
also compile tested all the individual commits on the chain.

I'd originally figured on this queue not being ready for 3.8, but
the extended stabilization window of 3.7 has changed that.  On
the other hand, this can still be 3.9 material, if that simply
works better for folks - no problem for me to defer it to 2013.
If anyone spots any problems then I'll definitely defer it,
rather than rush a last minute respin.
===================

Signed-off-by: David S. Miller <davem@davemloft.net>
ba50166

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 18, 2012

@yizou @nablio3000 yizou + nablio3000 target/tcm_fc: fix the lockdep warning due to inconsistent lock state
The lockdep warning below is in theory correct but it will be in really weird
rare situation that ends up that deadlock since the tcm fc session is hashed
based the rport id. Nonetheless, the complaining below is about rcu callback
that does the transport_deregister_session() is happening in softirq, where
transport_register_session() that happens earlier is not. This triggers the
lockdep warning below. So, just fix this to make lockdep happy by disabling
the soft irq before calling transport_register_session() in ft_prli.

BTW, this was found in FCoE VN2VN over two VMs, couple of create and destroy
would get this triggered.

v1: was enforcing register to be in softirq context which was not righ. See,
http://www.spinics.net/lists/target-devel/msg03614.html

v2: following comments from Roland&Nick (thanks), it seems we don't have to
do transport_deregister_session() in rcu callback, so move it into ft_sess_free()
but still do kfree() of the corresponding ft_sess struct in rcu callback to
make sure the ft_sess is not freed till the rcu callback.

...
[ 1328.370592] scsi2 : FCoE Driver
[ 1328.383429] fcoe: No FDMI support.
[ 1328.384509] host2: libfc: Link up on port (000000)
[ 1328.934229] host2: Assigned Port ID 00a292
[ 1357.232132] host2: rport 00a393: Remove port
[ 1357.232568] host2: rport 00a393: Port sending LOGO from Ready state
[ 1357.233692] host2: rport 00a393: Delete port
[ 1357.234472] host2: rport 00a393: work event 3
[ 1357.234969] host2: rport 00a393: callback ev 3
[ 1357.235979] host2: rport 00a393: Received a LOGO response closed
[ 1357.236706] host2: rport 00a393: work delete
[ 1357.237481]
[ 1357.237631] =================================
[ 1357.238064] [ INFO: inconsistent lock state ]
[ 1357.238450] 3.7.0-rc7-yikvm+ #3 Tainted: G           O
[ 1357.238450] ---------------------------------
[ 1357.238450] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 1357.238450] ksoftirqd/0/3 [HC0[0]:SC1[1]:HE0:SE0] takes:
[ 1357.238450]  (&(&se_tpg->session_lock)->rlock){+.?...}, at: [<ffffffffa01eacd4>] transport_deregister_session+0x41/0x148 [target_core_mod]
[ 1357.238450] {SOFTIRQ-ON-W} state was registered at:
[ 1357.238450]   [<ffffffff810834f5>] mark_held_locks+0x6d/0x95
[ 1357.238450]   [<ffffffff8108364a>] trace_hardirqs_on_caller+0x12d/0x197
[ 1357.238450]   [<ffffffff810836c1>] trace_hardirqs_on+0xd/0xf
[ 1357.238450]   [<ffffffff8149caba>] _raw_spin_unlock_irq+0x2d/0x45
[ 1357.238450]   [<ffffffffa01e8d10>] __transport_register_session+0xb8/0x122 [target_core_mod]
[ 1357.238450]   [<ffffffffa01e8dbe>] transport_register_session+0x44/0x5a [target_core_mod]
[ 1357.238450]   [<ffffffffa018e32c>] ft_prli+0x1e3/0x275 [tcm_fc]
[ 1357.238450]   [<ffffffffa0160e8d>] fc_rport_recv_req+0x95e/0xdc5 [libfc]
[ 1357.238450]   [<ffffffffa015be88>] fc_lport_recv_els_req+0xc4/0xd5 [libfc]
[ 1357.238450]   [<ffffffffa015c778>] fc_lport_recv_req+0x12f/0x18f [libfc]
[ 1357.238450]   [<ffffffffa015a6d7>] fc_exch_recv+0x8ba/0x981 [libfc]
[ 1357.238450]   [<ffffffffa0176d7a>] fcoe_percpu_receive_thread+0x47a/0x4e2 [fcoe]
[ 1357.238450]   [<ffffffff810549f1>] kthread+0xb1/0xb9
[ 1357.238450]   [<ffffffff814a40ec>] ret_from_fork+0x7c/0xb0
[ 1357.238450] irq event stamp: 275411
[ 1357.238450] hardirqs last  enabled at (275410): [<ffffffff810bb6a0>] rcu_process_callbacks+0x229/0x42a
[ 1357.238450] hardirqs last disabled at (275411): [<ffffffff8149c2f7>] _raw_spin_lock_irqsave+0x22/0x8e
[ 1357.238450] softirqs last  enabled at (275394): [<ffffffff8103d669>] __do_softirq+0x246/0x26f
[ 1357.238450] softirqs last disabled at (275399): [<ffffffff8103d6bb>] run_ksoftirqd+0x29/0x62
[ 1357.238450]
[ 1357.238450] other info that might help us debug this:
[ 1357.238450]  Possible unsafe locking scenario:
[ 1357.238450]
[ 1357.238450]        CPU0
[ 1357.238450]        ----
[ 1357.238450]   lock(&(&se_tpg->session_lock)->rlock);
[ 1357.238450]   <Interrupt>
[ 1357.238450]     lock(&(&se_tpg->session_lock)->rlock);
[ 1357.238450]
[ 1357.238450]  *** DEADLOCK ***
[ 1357.238450]
[ 1357.238450] no locks held by ksoftirqd/0/3.
[ 1357.238450]
[ 1357.238450] stack backtrace:
[ 1357.238450] Pid: 3, comm: ksoftirqd/0 Tainted: G           O 3.7.0-rc7-yikvm+ #3
[ 1357.238450] Call Trace:
[ 1357.238450]  [<ffffffff8149399a>] print_usage_bug+0x1f5/0x206
[ 1357.238450]  [<ffffffff8100da59>] ? save_stack_trace+0x2c/0x49
[ 1357.238450]  [<ffffffff81082aae>] ? print_irq_inversion_bug.part.14+0x1ae/0x1ae
[ 1357.238450]  [<ffffffff81083336>] mark_lock+0x106/0x258
[ 1357.238450]  [<ffffffff81084e34>] __lock_acquire+0x2e7/0xe53
[ 1357.238450]  [<ffffffff8102903d>] ? pvclock_clocksource_read+0x48/0xb4
[ 1357.238450]  [<ffffffff810ba6a3>] ? rcu_process_gp_end+0xc0/0xc9
[ 1357.238450]  [<ffffffffa01eacd4>] ? transport_deregister_session+0x41/0x148 [target_core_mod]
[ 1357.238450]  [<ffffffff81085ef1>] lock_acquire+0x119/0x143
[ 1357.238450]  [<ffffffffa01eacd4>] ? transport_deregister_session+0x41/0x148 [target_core_mod]
[ 1357.238450]  [<ffffffff8149c329>] _raw_spin_lock_irqsave+0x54/0x8e
[ 1357.238450]  [<ffffffffa01eacd4>] ? transport_deregister_session+0x41/0x148 [target_core_mod]
[ 1357.238450]  [<ffffffffa01eacd4>] transport_deregister_session+0x41/0x148 [target_core_mod]
[ 1357.238450]  [<ffffffff810bb6a0>] ? rcu_process_callbacks+0x229/0x42a
[ 1357.238450]  [<ffffffffa018ddc5>] ft_sess_rcu_free+0x17/0x24 [tcm_fc]
[ 1357.238450]  [<ffffffffa018ddae>] ? ft_sess_free+0x1b/0x1b [tcm_fc]
[ 1357.238450]  [<ffffffff810bb6d7>] rcu_process_callbacks+0x260/0x42a
[ 1357.238450]  [<ffffffff8103d55d>] __do_softirq+0x13a/0x26f
[ 1357.238450]  [<ffffffff8149b34e>] ? __schedule+0x65f/0x68e
[ 1357.238450]  [<ffffffff8103d6bb>] run_ksoftirqd+0x29/0x62
[ 1357.238450]  [<ffffffff8105c83c>] smpboot_thread_fn+0x1a5/0x1aa
[ 1357.238450]  [<ffffffff8105c697>] ? smpboot_unregister_percpu_thread+0x47/0x47
[ 1357.238450]  [<ffffffff810549f1>] kthread+0xb1/0xb9
[ 1357.238450]  [<ffffffff8149b49d>] ? wait_for_common+0xbb/0x10a
[ 1357.238450]  [<ffffffff81054940>] ? __init_kthread_worker+0x59/0x59
[ 1357.238450]  [<ffffffff814a40ec>] ret_from_fork+0x7c/0xb0
[ 1357.238450]  [<ffffffff81054940>] ? __init_kthread_worker+0x59/0x59
[ 1417.440099]  rport-2:0-0: blocked FC remote port time out: removing rport

Signed-off-by: Yi Zou <yi.zou@intel.com>
Cc: Open-FCoE <devel@open-fcoe.org>
Cc: Nicholas A. Bellinger <nab@risingtidesystems.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
9f4ad44

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 24, 2012

@htejun @torvalds htejun + torvalds memcg: don't register hotcpu notifier from ->css_alloc()
Commit 648bb56 ("cgroup: lock cgroup_mutex in cgroup_init_subsys()")
made cgroup_init_subsys() grab cgroup_mutex before invoking
->css_alloc() for the root css.  Because memcg registers hotcpu notifier
from ->css_alloc() for the root css, this introduced circular locking
dependency between cgroup_mutex and cpu hotplug.

Fix it by moving hotcpu notifier registration to a subsys initcall.

  ======================================================
  [ INFO: possible circular locking dependency detected ]
  3.7.0-rc4-work+ #42 Not tainted
  -------------------------------------------------------
  bash/645 is trying to acquire lock:
   (cgroup_mutex){+.+.+.}, at: [<ffffffff8110c5b7>] cgroup_lock+0x17/0x20

  but task is already holding lock:
   (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

 -> #1 (cpu_hotplug.lock){+.+.+.}:
         lock_acquire+0x97/0x1e0
         mutex_lock_nested+0x61/0x3b0
         get_online_cpus+0x3c/0x60
         rebuild_sched_domains_locked+0x1b/0x70
         cpuset_write_resmask+0x298/0x2c0
         cgroup_file_write+0x1ef/0x300
         vfs_write+0xa8/0x160
         sys_write+0x52/0xa0
         system_call_fastpath+0x16/0x1b

 -> #0 (cgroup_mutex){+.+.+.}:
         __lock_acquire+0x14ce/0x1d20
         lock_acquire+0x97/0x1e0
         mutex_lock_nested+0x61/0x3b0
         cgroup_lock+0x17/0x20
         cpuset_handle_hotplug+0x1b/0x560
         cpuset_update_active_cpus+0xe/0x10
         cpuset_cpu_inactive+0x47/0x50
         notifier_call_chain+0x66/0x150
         __raw_notifier_call_chain+0xe/0x10
         __cpu_notify+0x20/0x40
         _cpu_down+0x7e/0x2f0
         cpu_down+0x36/0x50
         store_online+0x5d/0xe0
         dev_attr_store+0x18/0x30
         sysfs_write_file+0xe0/0x150
         vfs_write+0xa8/0x160
         sys_write+0x52/0xa0
         system_call_fastpath+0x16/0x1b
  other info that might help us debug this:

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(cpu_hotplug.lock);
                                 lock(cgroup_mutex);
                                 lock(cpu_hotplug.lock);
    lock(cgroup_mutex);

   *** DEADLOCK ***

  5 locks held by bash/645:
   #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff8123bab8>] sysfs_write_file+0x48/0x150
   #1:  (s_active#42){.+.+.+}, at: [<ffffffff8123bb38>] sysfs_write_file+0xc8/0x150
   #2:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81079277>] cpu_hotplug_driver_lock+0x1
+7/0x20
   #3:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81093157>] cpu_maps_update_begin+0x17/0x20
   #4:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff8109300f>] cpu_hotplug_begin+0x2f/0x60

  stack backtrace:
  Pid: 645, comm: bash Not tainted 3.7.0-rc4-work+ #42
  Call Trace:
   print_circular_bug+0x28e/0x29f
   __lock_acquire+0x14ce/0x1d20
   lock_acquire+0x97/0x1e0
   mutex_lock_nested+0x61/0x3b0
   cgroup_lock+0x17/0x20
   cpuset_handle_hotplug+0x1b/0x560
   cpuset_update_active_cpus+0xe/0x10
   cpuset_cpu_inactive+0x47/0x50
   notifier_call_chain+0x66/0x150
   __raw_notifier_call_chain+0xe/0x10
   __cpu_notify+0x20/0x40
   _cpu_down+0x7e/0x2f0
   cpu_down+0x36/0x50
   store_online+0x5d/0xe0
   dev_attr_store+0x18/0x30
   sysfs_write_file+0xe0/0x150
   vfs_write+0xa8/0x160
   sys_write+0x52/0xa0
   system_call_fastpath+0x16/0x1b

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
154b454

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Dec 24, 2012

@davem330 Eric Dumazet + davem330 ipv4: arp: fix a lockdep splat in arp_solicit()
Yan Burman reported following lockdep warning :

=============================================
[ INFO: possible recursive locking detected ]
3.7.0+ #24 Not tainted
---------------------------------------------
swapper/1/0 is trying to acquire lock:
  (&n->lock){++--..}, at: [<ffffffff8139f56e>] __neigh_event_send
+0x2e/0x2f0

but task is already holding lock:
  (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit+0x1d4/0x280

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&n->lock);
   lock(&n->lock);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

4 locks held by swapper/1/0:
  #0:  (((&n->timer))){+.-...}, at: [<ffffffff8104b350>]
call_timer_fn+0x0/0x1c0
  #1:  (&n->lock){++--..}, at: [<ffffffff813f63f4>] arp_solicit
+0x1d4/0x280
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81395400>]
dev_queue_xmit+0x0/0x5d0
  #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff813cb41e>]
ip_finish_output+0x13e/0x640

stack backtrace:
Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
Call Trace:
  <IRQ>  [<ffffffff8108c7ac>] validate_chain+0xdcc/0x11f0
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff81120565>] ? kmem_cache_free+0xe5/0x1c0
  [<ffffffff8108d570>] __lock_acquire+0x440/0xc30
  [<ffffffff813c3570>] ? inet_getpeer+0x40/0x600
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8108ddf5>] lock_acquire+0x95/0x140
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8108d570>] ? __lock_acquire+0x440/0xc30
  [<ffffffff81448d4b>] _raw_write_lock_bh+0x3b/0x50
  [<ffffffff8139f56e>] ? __neigh_event_send+0x2e/0x2f0
  [<ffffffff8139f56e>] __neigh_event_send+0x2e/0x2f0
  [<ffffffff8139f99b>] neigh_resolve_output+0x16b/0x270
  [<ffffffff813cb62d>] ip_finish_output+0x34d/0x640
  [<ffffffff813cb41e>] ? ip_finish_output+0x13e/0x640
  [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
  [<ffffffff813cb9a0>] ip_output+0x80/0xf0
  [<ffffffff813ca368>] ip_local_out+0x28/0x80
  [<ffffffffa046f25a>] vxlan_xmit+0x66a/0xbec [vxlan]
  [<ffffffffa046f146>] ? vxlan_xmit+0x556/0xbec [vxlan]
  [<ffffffff81394a50>] ? skb_gso_segment+0x2b0/0x2b0
  [<ffffffff81449355>] ? _raw_spin_unlock_irqrestore+0x65/0x80
  [<ffffffff81394c57>] ? dev_queue_xmit_nit+0x207/0x270
  [<ffffffff813950c8>] dev_hard_start_xmit+0x298/0x5d0
  [<ffffffff813956f3>] dev_queue_xmit+0x2f3/0x5d0
  [<ffffffff81395400>] ? dev_hard_start_xmit+0x5d0/0x5d0
  [<ffffffff813f5788>] arp_xmit+0x58/0x60
  [<ffffffff813f59db>] arp_send+0x3b/0x40
  [<ffffffff813f6424>] arp_solicit+0x204/0x280
  [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
  [<ffffffff8139f515>] neigh_probe+0x45/0x70
  [<ffffffff813a1c10>] neigh_timer_handler+0x1a0/0x2a0
  [<ffffffff8104b3cf>] call_timer_fn+0x7f/0x1c0
  [<ffffffff8104b350>] ? detach_if_pending+0x120/0x120
  [<ffffffff8104b748>] run_timer_softirq+0x238/0x2b0
  [<ffffffff813a1a70>] ? neigh_add+0x310/0x310
  [<ffffffff81043e51>] __do_softirq+0x101/0x280
  [<ffffffff814518cc>] call_softirq+0x1c/0x30
  [<ffffffff81003b65>] do_softirq+0x85/0xc0
  [<ffffffff81043a7e>] irq_exit+0x9e/0xc0
  [<ffffffff810264f8>] smp_apic_timer_interrupt+0x68/0xa0
  [<ffffffff8145122f>] apic_timer_interrupt+0x6f/0x80
  <EOI>  [<ffffffff8100a054>] ? mwait_idle+0xa4/0x1c0
  [<ffffffff8100a04b>] ? mwait_idle+0x9b/0x1c0
  [<ffffffff8100a6a9>] cpu_idle+0x89/0xe0
  [<ffffffff81441127>] start_secondary+0x1b2/0x1b6

Bug is from arp_solicit(), releasing the neigh lock after arp_send()
In case of vxlan, we eventually need to write lock a neigh lock later.

Its a false positive, but we can get rid of it without lockdep
annotations.

We can instead use neigh_ha_snapshot() helper.

Reported-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
9650388

@Olipro Olipro pushed a commit to Olipro/linux-RPi that referenced this issue Dec 26, 2012

@xemul @davem330 xemul + davem330 packet: Protect packet sk list with mutex (v2)
Change since v1:

* Fixed inuse counters access spotted by Eric

In patch eea68e2 (packet: Report socket mclist info via diag module) I've
introduced a "scheduling in atomic" problem in packet diag module -- the
socket list is traversed under rcu_read_lock() while performed under it sk
mclist access requires rtnl lock (i.e. -- mutex) to be taken.

[152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002
[152363.820573] 4 locks held by crtools/12517:
[152363.820581]  #0:  (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e
[152363.820613]  #1:  (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a
[152363.820644]  #2:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab
[152363.820693]  #3:  (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af

Similar thing was then re-introduced by further packet diag patches (fanount
mutex and pgvec mutex for rings) :(

Apart from being terribly sorry for the above, I propose to change the packet
sk list protection from spinlock to mutex. This lock currently protects two
modifications:

* sklist
* prot inuse counters

The sklist modifications can be just reprotected with mutex since they already
occur in a sleeping context. The inuse counters modifications are trickier -- the
__this_cpu_-s are used inside, thus requiring the caller to handle the potential
issues with contexts himself. Since packet sockets' counters are modified in two
places only (packet_create and packet_release) we only need to protect the context
from being preempted. BH disabling is not required in this case.

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
0fa7fa9

@Olipro Olipro pushed a commit to Olipro/linux-RPi that referenced this issue Dec 26, 2012

Rob Herring + Russell King ARM: 7493/1: use generic unaligned.h
This moves ARM over to the asm-generic/unaligned.h header. This has the
benefit of better code generated especially for ARMv7 on gcc 4.7+
compilers.

As Arnd Bergmann, points out: The asm-generic version uses the "struct"
version for native-endian unaligned access and the "byteshift" version
for the opposite endianess. The current ARM version however uses the
"byteshift" implementation for both.

Thanks to Nicolas Pitre for the excellent analysis:

Test case:

int foo (int *x) { return get_unaligned(x); }
long long bar (long long *x) { return get_unaligned(x); }

With the current ARM version:

foo:
	ldrb	r3, [r0, #2]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
	ldrb	r1, [r0, #1]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
	ldrb	r2, [r0, #0]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
	mov	r3, r3, asl #16	@ tmp154, MEM[(const u8 *)x_1(D) + 2B],
	ldrb	r0, [r0, #3]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
	orr	r3, r3, r1, asl #8	@, tmp155, tmp154, MEM[(const u8 *)x_1(D) + 1B],
	orr	r3, r3, r2	@ tmp157, tmp155, MEM[(const u8 *)x_1(D)]
	orr	r0, r3, r0, asl #24	@,, tmp157, MEM[(const u8 *)x_1(D) + 3B],
	bx	lr	@

bar:
	stmfd	sp!, {r4, r5, r6, r7}	@,
	mov	r2, #0	@ tmp184,
	ldrb	r5, [r0, #6]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 6B], MEM[(const u8 *)x_1(D) + 6B]
	ldrb	r4, [r0, #5]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 5B], MEM[(const u8 *)x_1(D) + 5B]
	ldrb	ip, [r0, #2]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 2B], MEM[(const u8 *)x_1(D) + 2B]
	ldrb	r1, [r0, #4]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 4B], MEM[(const u8 *)x_1(D) + 4B]
	mov	r5, r5, asl #16	@ tmp175, MEM[(const u8 *)x_1(D) + 6B],
	ldrb	r7, [r0, #1]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 1B], MEM[(const u8 *)x_1(D) + 1B]
	orr	r5, r5, r4, asl #8	@, tmp176, tmp175, MEM[(const u8 *)x_1(D) + 5B],
	ldrb	r6, [r0, #7]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 7B], MEM[(const u8 *)x_1(D) + 7B]
	orr	r5, r5, r1	@ tmp178, tmp176, MEM[(const u8 *)x_1(D) + 4B]
	ldrb	r4, [r0, #0]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D)], MEM[(const u8 *)x_1(D)]
	mov	ip, ip, asl #16	@ tmp188, MEM[(const u8 *)x_1(D) + 2B],
	ldrb	r1, [r0, #3]	@ zero_extendqisi2	@ MEM[(const u8 *)x_1(D) + 3B], MEM[(const u8 *)x_1(D) + 3B]
	orr	ip, ip, r7, asl #8	@, tmp189, tmp188, MEM[(const u8 *)x_1(D) + 1B],
	orr	r3, r5, r6, asl #24	@,, tmp178, MEM[(const u8 *)x_1(D) + 7B],
	orr	ip, ip, r4	@ tmp191, tmp189, MEM[(const u8 *)x_1(D)]
	orr	ip, ip, r1, asl #24	@, tmp194, tmp191, MEM[(const u8 *)x_1(D) + 3B],
	mov	r1, r3	@,
	orr	r0, r2, ip	@ tmp171, tmp184, tmp194
	ldmfd	sp!, {r4, r5, r6, r7}
	bx	lr

In both cases the code is slightly suboptimal.  One may wonder why
wasting r2 with the constant 0 in the second case for example.  And all
the mov's could be folded in subsequent orr's, etc.

Now with the asm-generic version:

foo:
	ldr	r0, [r0, #0]	@ unaligned	@,* x
	bx	lr	@

bar:
	mov	r3, r0	@ x, x
	ldr	r0, [r0, #0]	@ unaligned	@,* x
	ldr	r1, [r3, #4]	@ unaligned	@,
	bx	lr	@

This is way better of course, but only because this was compiled for
ARMv7. In this case the compiler knows that the hardware can do
unaligned word access.  This isn't that obvious for foo(), but if we
remove the get_unaligned() from bar as follows:

long long bar (long long *x) {return *x; }

then the resulting code is:

bar:
	ldmia	r0, {r0, r1}	@ x,,
	bx	lr	@

So this proves that the presumed aligned vs unaligned cases does have
influence on the instructions the compiler may use and that the above
unaligned code results are not just an accident.

Still... this isn't fully conclusive without at least looking at the
resulting assembly fron a pre ARMv6 compilation.  Let's see with an
ARMv5 target:

foo:
	ldrb	r3, [r0, #0]	@ zero_extendqisi2	@ tmp139,* x
	ldrb	r1, [r0, #1]	@ zero_extendqisi2	@ tmp140,
	ldrb	r2, [r0, #2]	@ zero_extendqisi2	@ tmp143,
	ldrb	r0, [r0, #3]	@ zero_extendqisi2	@ tmp146,
	orr	r3, r3, r1, asl #8	@, tmp142, tmp139, tmp140,
	orr	r3, r3, r2, asl #16	@, tmp145, tmp142, tmp143,
	orr	r0, r3, r0, asl #24	@,, tmp145, tmp146,
	bx	lr	@

bar:
	stmfd	sp!, {r4, r5, r6, r7}	@,
	ldrb	r2, [r0, #0]	@ zero_extendqisi2	@ tmp139,* x
	ldrb	r7, [r0, #1]	@ zero_extendqisi2	@ tmp140,
	ldrb	r3, [r0, #4]	@ zero_extendqisi2	@ tmp149,
	ldrb	r6, [r0, #5]	@ zero_extendqisi2	@ tmp150,
	ldrb	r5, [r0, #2]	@ zero_extendqisi2	@ tmp143,
	ldrb	r4, [r0, #6]	@ zero_extendqisi2	@ tmp153,
	ldrb	r1, [r0, #7]	@ zero_extendqisi2	@ tmp156,
	ldrb	ip, [r0, #3]	@ zero_extendqisi2	@ tmp146,
	orr	r2, r2, r7, asl #8	@, tmp142, tmp139, tmp140,
	orr	r3, r3, r6, asl #8	@, tmp152, tmp149, tmp150,
	orr	r2, r2, r5, asl #16	@, tmp145, tmp142, tmp143,
	orr	r3, r3, r4, asl #16	@, tmp155, tmp152, tmp153,
	orr	r0, r2, ip, asl #24	@,, tmp145, tmp146,
	orr	r1, r3, r1, asl #24	@,, tmp155, tmp156,
	ldmfd	sp!, {r4, r5, r6, r7}
	bx	lr

Compared to the initial results, this is really nicely optimized and I
couldn't do much better if I were to hand code it myself.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
d25c881

@Olipro Olipro pushed a commit to Olipro/linux-RPi that referenced this issue Dec 26, 2012

@davem330 Eric Dumazet + davem330 net: qdisc busylock needs lockdep annotations
It seems we need to provide ability for stacked devices
to use specific lock_class_key for sch->busylock

We could instead default l2tpeth tx_queue_len to 0 (no qdisc), but
a user might use a qdisc anyway.

(So same fixes are probably needed on non LLTX stacked drivers)

Noticed while stressing L2TPV3 setup :

======================================================
 [ INFO: possible circular locking dependency detected ]
 3.6.0-rc3+ #788 Not tainted
 -------------------------------------------------------
 netperf/4660 is trying to acquire lock:
  (l2tpsock){+.-...}, at: [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]

 but task is already holding lock:
  (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #1 (&(&sch->busylock)->rlock){+.-...}:
        [<ffffffff810a5df0>] lock_acquire+0x90/0x200
        [<ffffffff817499fc>] _raw_spin_lock_irqsave+0x4c/0x60
        [<ffffffff81074872>] __wake_up+0x32/0x70
        [<ffffffff8136d39e>] tty_wakeup+0x3e/0x80
        [<ffffffff81378fb3>] pty_write+0x73/0x80
        [<ffffffff8136cb4c>] tty_put_char+0x3c/0x40
        [<ffffffff813722b2>] process_echoes+0x142/0x330
        [<ffffffff813742ab>] n_tty_receive_buf+0x8fb/0x1230
        [<ffffffff813777b2>] flush_to_ldisc+0x142/0x1c0
        [<ffffffff81062818>] process_one_work+0x198/0x760
        [<ffffffff81063236>] worker_thread+0x186/0x4b0
        [<ffffffff810694d3>] kthread+0x93/0xa0
        [<ffffffff81753e24>] kernel_thread_helper+0x4/0x10

 -> #0 (l2tpsock){+.-...}:
        [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10
        [<ffffffff810a5df0>] lock_acquire+0x90/0x200
        [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50
        [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
        [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
        [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70
        [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290
        [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00
        [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890
        [<ffffffff815db019>] ip_output+0x59/0xf0
        [<ffffffff815da36d>] ip_local_out+0x2d/0xa0
        [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680
        [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60
        [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30
        [<ffffffff815f5300>] tcp_push_one+0x30/0x40
        [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040
        [<ffffffff81614495>] inet_sendmsg+0x125/0x230
        [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0
        [<ffffffff81579ece>] sys_sendto+0xfe/0x130
        [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b
  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&(&sch->busylock)->rlock);
                                lock(l2tpsock);
                                lock(&(&sch->busylock)->rlock);
   lock(l2tpsock);

  *** DEADLOCK ***

 5 locks held by netperf/4660:
  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff815e581c>] tcp_sendmsg+0x2c/0x1040
  #1:  (rcu_read_lock){.+.+..}, at: [<ffffffff815da3e0>] ip_queue_xmit+0x0/0x680
  #2:  (rcu_read_lock_bh){.+....}, at: [<ffffffff815d9ac5>] ip_finish_output+0x135/0x890
  #3:  (rcu_read_lock_bh){.+....}, at: [<ffffffff81595820>] dev_queue_xmit+0x0/0xe00
  #4:  (&(&sch->busylock)->rlock){+.-...}, at: [<ffffffff81596595>] dev_queue_xmit+0xd75/0xe00

 stack backtrace:
 Pid: 4660, comm: netperf Not tainted 3.6.0-rc3+ #788
 Call Trace:
  [<ffffffff8173dbf8>] print_circular_bug+0x1fb/0x20c
  [<ffffffff810a5288>] __lock_acquire+0x1628/0x1b10
  [<ffffffff810a334b>] ? check_usage+0x9b/0x4d0
  [<ffffffff810a3f44>] ? __lock_acquire+0x2e4/0x1b10
  [<ffffffff810a5df0>] lock_acquire+0x90/0x200
  [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
  [<ffffffff817498c1>] _raw_spin_lock+0x41/0x50
  [<ffffffffa0208db2>] ? l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
  [<ffffffffa0208db2>] l2tp_xmit_skb+0x172/0xa50 [l2tp_core]
  [<ffffffffa021a802>] l2tp_eth_dev_xmit+0x32/0x60 [l2tp_eth]
  [<ffffffff815952b2>] dev_hard_start_xmit+0x502/0xa70
  [<ffffffff81594e0e>] ? dev_hard_start_xmit+0x5e/0xa70
  [<ffffffff81595961>] ? dev_queue_xmit+0x141/0xe00
  [<ffffffff815b63ce>] sch_direct_xmit+0xfe/0x290
  [<ffffffff81595a05>] dev_queue_xmit+0x1e5/0xe00
  [<ffffffff81595820>] ? dev_hard_start_xmit+0xa70/0xa70
  [<ffffffff815d9d60>] ip_finish_output+0x3d0/0x890
  [<ffffffff815d9ac5>] ? ip_finish_output+0x135/0x890
  [<ffffffff815db019>] ip_output+0x59/0xf0
  [<ffffffff815da36d>] ip_local_out+0x2d/0xa0
  [<ffffffff815da5a3>] ip_queue_xmit+0x1c3/0x680
  [<ffffffff815da3e0>] ? ip_local_out+0xa0/0xa0
  [<ffffffff815f4192>] tcp_transmit_skb+0x402/0xa60
  [<ffffffff815fa25e>] ? tcp_md5_do_lookup+0x18e/0x1a0
  [<ffffffff815f4a94>] tcp_write_xmit+0x1f4/0xa30
  [<ffffffff815f5300>] tcp_push_one+0x30/0x40
  [<ffffffff815e6672>] tcp_sendmsg+0xe82/0x1040
  [<ffffffff81614495>] inet_sendmsg+0x125/0x230
  [<ffffffff81614370>] ? inet_create+0x6b0/0x6b0
  [<ffffffff8157e6e2>] ? sock_update_classid+0xc2/0x3b0
  [<ffffffff8157e750>] ? sock_update_classid+0x130/0x3b0
  [<ffffffff81576cdc>] sock_sendmsg+0xdc/0xf0
  [<ffffffff81162579>] ? fget_light+0x3f9/0x4f0
  [<ffffffff81579ece>] sys_sendto+0xfe/0x130
  [<ffffffff810a69ad>] ? trace_hardirqs_on+0xd/0x10
  [<ffffffff8174a0b0>] ? _raw_spin_unlock_irq+0x30/0x50
  [<ffffffff810757e3>] ? finish_task_switch+0x83/0xf0
  [<ffffffff810757a6>] ? finish_task_switch+0x46/0xf0
  [<ffffffff81752cb7>] ? sysret_check+0x1b/0x56
  [<ffffffff81752c92>] system_call_fastpath+0x16/0x1b

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
23d3b8b

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Jan 15, 2013

@torvalds Jiri Kosina + torvalds mm: mmap: annotate vm_lock_anon_vma locking properly for lockdep
Commit 5a50508 ("mm/rmap: Convert the struct anon_vma::mutex to an
rwsem") turned anon_vma mutex to rwsem.

However, the properly annotated nested locking in mm_take_all_locks()
has been converted from

	mutex_lock_nest_lock(&anon_vma->root->mutex, &mm->mmap_sem);

to

	down_write(&anon_vma->root->rwsem);

which is incomplete, and causes the false positive report from lockdep
below.

Annotate the fact that mmap_sem is used as an outter lock to serialize
taking of all the anon_vma rwsems at once no matter the order, using the
down_write_nest_lock() primitive.

This patch fixes this lockdep report:

 =============================================
 [ INFO: possible recursive locking detected ]
 3.8.0-rc2-00036-g5f73896 #171 Not tainted
 ---------------------------------------------
 qemu-kvm/2315 is trying to acquire lock:
  (&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0

 but task is already holding lock:
  (&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&anon_vma->rwsem);
   lock(&anon_vma->rwsem);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 4 locks held by qemu-kvm/2315:
  #0:  (&mm->mmap_sem){++++++}, at: do_mmu_notifier_register+0xfc/0x170
  #1:  (mm_all_locks_mutex){+.+...}, at: mm_take_all_locks+0x36/0x1b0
  #2:  (&mapping->i_mmap_mutex){+.+...}, at: mm_take_all_locks+0xc9/0x1b0
  #3:  (&anon_vma->rwsem){+.+...}, at: mm_take_all_locks+0x149/0x1b0

 stack backtrace:
 Pid: 2315, comm: qemu-kvm Not tainted 3.8.0-rc2-00036-g5f73896 #171
 Call Trace:
   print_deadlock_bug+0xf2/0x100
   validate_chain+0x4f6/0x720
   __lock_acquire+0x359/0x580
   lock_acquire+0x121/0x190
   down_write+0x3f/0x70
   mm_take_all_locks+0x149/0x1b0
   do_mmu_notifier_register+0x68/0x170
   mmu_notifier_register+0xe/0x10
   kvm_create_vm+0x22b/0x330 [kvm]
   kvm_dev_ioctl+0xf8/0x1a0 [kvm]
   do_vfs_ioctl+0x9d/0x350
   sys_ioctl+0x91/0xb0
   system_call_fastpath+0x16/0x1b

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mel Gorman <mel@csn.ul.ie>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
572043c

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Jan 29, 2013

@davem330 davem330 Merge branch 'usb_cdc_fixes'
Bjørn Mork says:

====================
The 2 first patches in this series are required to make the Sierra
Wireless MC7710 card work in MBIM mode.  They may also be
required for other Qualcomm firmware based MBIM devices.

Patch #1 was previously posted as a standalone patch.  This version
is a replacement, removing a theoretical NULL pointer exception.

Patch #3 fixes a bug I introduced in v3.7
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
f91f334

@swarren swarren pushed a commit to swarren/linux-rpi that referenced this issue Feb 9, 2013

Andre Guedes + Gustavo Padovan Bluetooth: Reduce critical section in sco_conn_ready
This patch reduces the critical section protected by sco_conn_lock in
sco_conn_ready function. The lock is acquired only when it is really
needed.

This patch fixes the following lockdep warning which is generated
when the host terminates a SCO connection.

Today, this warning is a false positive. There is no way those
two threads reported by lockdep are running at the same time since
hdev->workqueue (where rx_work is queued) is single-thread. However,
if somehow this behavior is changed in future, we will have a
potential deadlock.

======================================================
[ INFO: possible circular locking dependency detected ]
3.8.0-rc1+ #7 Not tainted
-------------------------------------------------------
kworker/u:1H/1018 is trying to acquire lock:
 (&(&conn->lock)->rlock){+.+...}, at: [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]

but task is already holding lock:
 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}:
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa003436e>] sco_connect_cfm+0xbe/0x350 [bluetooth]
       [<ffffffffa0015d6c>] hci_event_packet+0xd3c/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

-> #0 (&(&conn->lock)->rlock){+.+...}:
       [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
       [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
       [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
       [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
       [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
                               lock(&(&conn->lock)->rlock);
                               lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
  lock(&(&conn->lock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:1H/1018:
 #0:  (hdev->name#2){.+.+.+}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #1:  ((&hdev->rx_work)){+.+.+.}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #2:  (&hdev->lock){+.+.+.}, at: [<ffffffffa000fbe9>] hci_disconn_complete_evt.isra.54+0x59/0x3c0 [bluetooth]
 #3:  (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

stack backtrace:
Pid: 1018, comm: kworker/u:1H Not tainted 3.8.0-rc1+ #7
Call Trace:
 [<ffffffff813e92f9>] print_circular_bug+0x1fb/0x20c
 [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
 [<ffffffff81083011>] lock_acquire+0xb1/0xe0
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
 [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
 [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
 [<ffffffffa000fbd0>] ? hci_disconn_complete_evt.isra.54+0x40/0x3c0 [bluetooth]
 [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
 [<ffffffff81202e90>] ? __dynamic_pr_debug+0x80/0x90
 [<ffffffff8133ff7d>] ? kfree_skb+0x2d/0x40
 [<ffffffffa0021644>] ? hci_send_to_monitor+0x1a4/0x1c0 [bluetooth]
 [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104fdc1>] ? worker_thread+0x51/0x3e0
 [<ffffffffa0004450>] ? hci_tx_work+0x800/0x800 [bluetooth]
 [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
 [<ffffffff8104fd70>] ? busy_worker_rebind_fn+0x100/0x100
 [<ffffffff81056021>] kthread+0xd1/0xe0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0
 [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0

Signed-off-by: Andre Guedes <andre.guedes@openbossa.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
4052808

@popcornmix popcornmix pushed a commit that referenced this issue May 9, 2016

Marek Vasut + Jonathan Cameron iio:adc:at91-sama5d2: Repair crash on module removal
The driver never calls platform_set_drvdata() , so platform_get_drvdata()
in .remove returns NULL and thus $indio_dev variable in .remove is NULL.
Then it's only a matter of dereferencing the indio_dev variable to make
the kernel blow as seen below. This patch adds the platform_set_drvdata()
call to fix the problem.

root@armhf:~# rmmod at91-sama5d2_adc

Unable to handle kernel NULL pointer dereference at virtual address 000001d4
pgd = dd57c000
[000001d4] *pgd=00000000
Internal error: Oops: 5 [#1] ARM
Modules linked in: at91_sama5d2_adc(-)
CPU: 0 PID: 1334 Comm: rmmod Not tainted 4.6.0-rc3-next-20160418+ #3
Hardware name: Atmel SAMA5
task: dd4fcc40 ti: de910000 task.ti: de910000
PC is at mutex_lock+0x4/0x24
LR is at iio_device_unregister+0x14/0x6c
pc : [<c05f4624>]    lr : [<c0471f74>]    psr: a00d0013
               sp : de911f00  ip : 00000000  fp : be898bd8
r10: 00000000  r9 : de910000  r8 : c0107724
r7 : 00000081  r6 : bf001048  r5 : 000001d4  r4 : 00000000
r3 : bf000000  r2 : 00000000  r1 : 00000004  r0 : 000001d4
Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c53c7d  Table: 3d57c059  DAC: 00000051
Process rmmod (pid: 1334, stack limit = 0xde910208)
Stack: (0xde911f00 to 0xde912000)
1f00: bf000000 00000000 df5c7e10 bf000010 bf000000 df5c7e10 df5c7e10 c0351ca8
1f20: c0351c84 df5c7e10 bf001048 c0350734 bf001048 df5c7e10 df5c7e44 c035087c
1f40: bf001048 7f62dd4c 00000800 c034fb30 bf0010c0 c0158ee8 de910000 31397461
1f60: 6d61735f 32643561 6364615f 00000000 de911f90 de910000 de910000 00000000
1f80: de911fb0 10c53c7d de911f9c c05f33d8 de911fa0 00910000 be898ecb 7f62dd10
1fa0: 00000000 c0107560 be898ecb 7f62dd10 7f62dd4c 00000800 6f844800 6f844800
1fc0: be898ecb 7f62dd10 00000000 00000081 00000000 7f62dd10 be898bd8 be898bd8
1fe0: b6eedab1 be898b6c 7f61056b b6eedab6 000d0030 7f62dd4c 00000000 00000000
[<c05f4624>] (mutex_lock) from [<c0471f74>] (iio_device_unregister+0x14/0x6c)
[<c0471f74>] (iio_device_unregister) from [<bf000010>] (at91_adc_remove+0x10/0x3c [at91_sama5d2_adc])
[<bf000010>] (at91_adc_remove [at91_sama5d2_adc]) from [<c0351ca8>] (platform_drv_remove+0x24/0x3c)
[<c0351ca8>] (platform_drv_remove) from [<c0350734>] (__device_release_driver+0x84/0x110)
[<c0350734>] (__device_release_driver) from [<c035087c>] (driver_detach+0x8c/0x90)
[<c035087c>] (driver_detach) from [<c034fb30>] (bus_remove_driver+0x4c/0xa0)
[<c034fb30>] (bus_remove_driver) from [<c0158ee8>] (SyS_delete_module+0x110/0x1d0)
[<c0158ee8>] (SyS_delete_module) from [<c0107560>] (ret_fast_syscall+0x0/0x3c)
Code: e3520001 1affffd5 eafffff4 f5d0f000 (e1902f9f)
---[ end trace 86914d7ad3696fca ]---

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Ludovic Desroches <ludovic.desroches@atmel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
8e6cb47

@davet321 davet321 pushed a commit to davet321/rpi-linux that referenced this issue May 16, 2016

@mchehab mchehab Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeue…
…ing"

This patch causes a Kernel panic when called on a DVB driver.

This was also reported by David R <david@unsolicited.net>:

May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport
May  7 14:47:35 server kernel: [  501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
May  7 14:47:35 server kernel: [  501.250794] Stack:
May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
May  7 14:47:35 server kernel: [  501.251165] Call Trace:
May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---

This reverts commit 2c1f695.

Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: stable@vger.kernel.org
Fixes: 2c1f695 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
93f0750

@popcornmix popcornmix pushed a commit that referenced this issue May 19, 2016

@mchehab @gregkh mchehab + gregkh Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeue…
…ing"

commit 93f0750 upstream.

This patch causes a Kernel panic when called on a DVB driver.

This was also reported by David R <david@unsolicited.net>:

May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport
May  7 14:47:35 server kernel: [  501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
May  7 14:47:35 server kernel: [  501.250794] Stack:
May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
May  7 14:47:35 server kernel: [  501.251165] Call Trace:
May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---

This reverts commit 2c1f695.

Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Fixes: 2c1f695 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
1943bd0

@popcornmix popcornmix pushed a commit that referenced this issue May 19, 2016

@fdmanana @gregkh fdmanana + gregkh Btrfs: fix deadlock between direct IO reads and buffered writes
commit ade7702 upstream.

While running a test with a mix of buffered IO and direct IO against
the same files I hit a deadlock reported by the following trace:

[11642.140352] INFO: task kworker/u32:3:15282 blocked for more than 120 seconds.
[11642.142452]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.143982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.146332] kworker/u32:3   D ffff880230ef7988 [11642.147737] systemd-journald[571]: Sent WATCHDOG=1 notification.
[11642.149771]     0 15282      2 0x00000000
[11642.151205] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[11642.154074]  ffff880230ef7988 0000000000000246 0000000000014ec0 ffff88023ec94ec0
[11642.156722]  ffff880233fe8f80 ffff880230ef8000 ffff88023ec94ec0 7fffffffffffffff
[11642.159205]  0000000000000002 ffffffff8147b7f9 ffff880230ef79a0 ffffffff8147b541
[11642.161403] Call Trace:
[11642.162129]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.163396]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.164871]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.167020]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.167931]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.182320]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.183762]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.185308]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.186782]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.188217]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.189626]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.190803]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.192158]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.193379]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.194831]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.197068]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.199188]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.200723]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.202465]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.203836]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.205624]  [<ffffffff811198c9>] __filemap_fdatawrite_range+0x5a/0x61
[11642.207057]  [<ffffffff81119946>] filemap_fdatawrite_range+0x13/0x15
[11642.208529]  [<ffffffffa044f87e>] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs]
[11642.210375]  [<ffffffffa0462613>] ? btrfs_scrubparity_helper+0x140/0x33a [btrfs]
[11642.212132]  [<ffffffffa044f974>] btrfs_run_ordered_extent_work+0x25/0x34 [btrfs]
[11642.213837]  [<ffffffffa046262f>] btrfs_scrubparity_helper+0x15c/0x33a [btrfs]
[11642.215457]  [<ffffffffa046293b>] btrfs_flush_delalloc_helper+0xe/0x10 [btrfs]
[11642.217095]  [<ffffffff8106483e>] process_one_work+0x256/0x48b
[11642.218324]  [<ffffffff81064f20>] worker_thread+0x1f5/0x2a7
[11642.219466]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.220801]  [<ffffffff8106a500>] kthread+0xd4/0xdc
[11642.222032]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.223190]  [<ffffffff8147fdef>] ret_from_fork+0x3f/0x70
[11642.224394]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.226295] 2 locks held by kworker/u32:3/15282:
[11642.227273]  #0:  ("%s-%s""btrfs", name){++++.+}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.229412]  #1:  ((&work->normal_work)){+.+.+.}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.231414] INFO: task kworker/u32:8:15289 blocked for more than 120 seconds.
[11642.232872]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.234109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.235776] kworker/u32:8   D ffff88020de5f848     0 15289      2 0x00000000
[11642.237412] Workqueue: writeback wb_workfn (flush-btrfs-481)
[11642.238670]  ffff88020de5f848 0000000000000246 0000000000014ec0 ffff88023ed54ec0
[11642.240475]  ffff88021b1ece40 ffff88020de60000 ffff88023ed54ec0 7fffffffffffffff
[11642.242154]  0000000000000002 ffffffff8147b7f9 ffff88020de5f860 ffffffff8147b541
[11642.243715] Call Trace:
[11642.244390]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.245432]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.246392]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.247479]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.248551]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.249968]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.251043]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.252202]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.253210]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.254307]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.256118]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.257131]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.258200]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.259168]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.260516]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.261841]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.263531]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.264747]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.266148]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.267264]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.268280]  [<ffffffff81192a2b>] __writeback_single_inode+0xda/0x5ba
[11642.269407]  [<ffffffff811939f0>] writeback_sb_inodes+0x27b/0x43d
[11642.270476]  [<ffffffff81193c28>] __writeback_inodes_wb+0x76/0xae
[11642.271547]  [<ffffffff81193ea6>] wb_writeback+0x19e/0x41c
[11642.272588]  [<ffffffff81194821>] wb_workfn+0x201/0x341
[11642.273523]  [<ffffffff81194821>] ? wb_workfn+0x201/0x341
[11642.274479]  [<ffffffff8106483e>] process_one_work+0x256/0x48b
[11642.275497]  [<ffffffff81064f20>] worker_thread+0x1f5/0x2a7
[11642.276518]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.277520]  [<ffffffff81064d2b>] ? rescuer_thread+0x289/0x289
[11642.278517]  [<ffffffff8106a500>] kthread+0xd4/0xdc
[11642.279371]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.280468]  [<ffffffff8147fdef>] ret_from_fork+0x3f/0x70
[11642.281607]  [<ffffffff8106a42c>] ? kthread_parkme+0x24/0x24
[11642.282604] 3 locks held by kworker/u32:8/15289:
[11642.283423]  #0:  ("writeback"){++++.+}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.285629]  #1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff8106474d>] process_one_work+0x165/0x48b
[11642.287538]  #2:  (&type->s_umount_key#37){+++++.}, at: [<ffffffff81171217>] trylock_super+0x1b/0x4b
[11642.289423] INFO: task fdm-stress:26848 blocked for more than 120 seconds.
[11642.290547]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.291453] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.292864] fdm-stress      D ffff88022c107c20     0 26848  26591 0x00000000
[11642.294118]  ffff88022c107c20 000000038108affa 0000000000014ec0 ffff88023ed54ec0
[11642.295602]  ffff88013ab1ca40 ffff88022c108000 ffff8800b2fc19d0 00000000000e0fff
[11642.297098]  ffff8800b2fc19b0 ffff88022c107c88 ffff88022c107c38 ffffffff8147b541
[11642.298433] Call Trace:
[11642.298896]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.299738]  [<ffffffffa045225d>] lock_extent_bits+0xfe/0x1a3 [btrfs]
[11642.300833]  [<ffffffff81082eef>] ? add_wait_queue_exclusive+0x44/0x44
[11642.301943]  [<ffffffffa0447516>] lock_and_cleanup_extent_if_need+0x68/0x18e [btrfs]
[11642.303270]  [<ffffffffa04485ba>] __btrfs_buffered_write+0x238/0x4c1 [btrfs]
[11642.304552]  [<ffffffffa044b50a>] ? btrfs_file_write_iter+0x17c/0x408 [btrfs]
[11642.305782]  [<ffffffffa044b682>] btrfs_file_write_iter+0x2f4/0x408 [btrfs]
[11642.306878]  [<ffffffff8116e298>] __vfs_write+0x7c/0xa5
[11642.307729]  [<ffffffff8116e7d1>] vfs_write+0x9d/0xe8
[11642.308602]  [<ffffffff8116efbb>] SyS_write+0x50/0x7e
[11642.309410]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.310403] 3 locks held by fdm-stress/26848:
[11642.311108]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffffff811877e8>] __fdget_pos+0x3a/0x40
[11642.312578]  #1:  (sb_writers#11){.+.+.+}, at: [<ffffffff811706ee>] __sb_start_write+0x5f/0xb0
[11642.314170]  #2:  (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [<ffffffffa044b401>] btrfs_file_write_iter+0x73/0x408 [btrfs]
[11642.316796] INFO: task fdm-stress:26849 blocked for more than 120 seconds.
[11642.317842]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.318691] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.319959] fdm-stress      D ffff8801964ffa68     0 26849  26591 0x00000000
[11642.321312]  ffff8801964ffa68 00ff8801e9975f80 0000000000014ec0 ffff88023ed94ec0
[11642.322555]  ffff8800b00b4840 ffff880196500000 ffff8801e9975f20 0000000000000002
[11642.323715]  ffff8801e9975f18 ffff8800b00b4840 ffff8801964ffa80 ffffffff8147b541
[11642.325096] Call Trace:
[11642.325532]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.326303]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.327180]  [<ffffffff8108ae40>] ? mark_held_locks+0x5e/0x74
[11642.328114]  [<ffffffff8147f30e>] ? _raw_spin_unlock_irq+0x2c/0x4a
[11642.329051]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.330053]  [<ffffffff8147bceb>] __wait_for_common+0x109/0x147
[11642.330952]  [<ffffffff8147bceb>] ? __wait_for_common+0x109/0x147
[11642.331869]  [<ffffffff8147e7bb>] ? usleep_range+0x4a/0x4a
[11642.332925]  [<ffffffff81074075>] ? wake_up_q+0x47/0x47
[11642.333736]  [<ffffffff8147bd4d>] wait_for_completion+0x24/0x26
[11642.334672]  [<ffffffffa044f5ce>] btrfs_wait_ordered_extents+0x1c8/0x217 [btrfs]
[11642.335858]  [<ffffffffa0465b5a>] btrfs_mksubvol+0x224/0x45d [btrfs]
[11642.336854]  [<ffffffff81082eef>] ? add_wait_queue_exclusive+0x44/0x44
[11642.337820]  [<ffffffffa0465edb>] btrfs_ioctl_snap_create_transid+0x148/0x17a [btrfs]
[11642.339026]  [<ffffffffa046603b>] btrfs_ioctl_snap_create_v2+0xc7/0x110 [btrfs]
[11642.340214]  [<ffffffffa0468582>] btrfs_ioctl+0x590/0x27bd [btrfs]
[11642.341123]  [<ffffffff8147dc00>] ? mutex_unlock+0xe/0x10
[11642.341934]  [<ffffffffa00fa6e9>] ? ext4_file_write_iter+0x2a3/0x36f [ext4]
[11642.342936]  [<ffffffff8108895d>] ? __lock_is_held+0x3c/0x57
[11642.343772]  [<ffffffff81186a1d>] ? rcu_read_unlock+0x3e/0x5d
[11642.344673]  [<ffffffff8117dc95>] do_vfs_ioctl+0x458/0x4dc
[11642.346024]  [<ffffffff81186bbe>] ? __fget_light+0x62/0x71
[11642.346873]  [<ffffffff8117dd70>] SyS_ioctl+0x57/0x79
[11642.347720]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.350222] 4 locks held by fdm-stress/26849:
[11642.350898]  #0:  (sb_writers#11){.+.+.+}, at: [<ffffffff811706ee>] __sb_start_write+0x5f/0xb0
[11642.352375]  #1:  (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [<ffffffffa0465981>] btrfs_mksubvol+0x4b/0x45d [btrfs]
[11642.354072]  #2:  (&fs_info->subvol_sem){++++..}, at: [<ffffffffa0465a2a>] btrfs_mksubvol+0xf4/0x45d [btrfs]
[11642.355647]  #3:  (&root->ordered_extent_mutex){+.+...}, at: [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.357516] INFO: task fdm-stress:26850 blocked for more than 120 seconds.
[11642.358508]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.359376] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.368625] fdm-stress      D ffff88021f167688     0 26850  26591 0x00000000
[11642.369716]  ffff88021f167688 0000000000000001 0000000000014ec0 ffff88023edd4ec0
[11642.370950]  ffff880128a98680 ffff88021f168000 ffff88023edd4ec0 7fffffffffffffff
[11642.372210]  0000000000000002 ffffffff8147b7f9 ffff88021f1676a0 ffffffff8147b541
[11642.373430] Call Trace:
[11642.373853]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.374623]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.375948]  [<ffffffff8147e7fe>] schedule_timeout+0x43/0x109
[11642.376862]  [<ffffffff8147b7f9>] ? bit_wait+0x2f/0x2f
[11642.377637]  [<ffffffff8108afd1>] ? trace_hardirqs_on_caller+0x17b/0x197
[11642.378610]  [<ffffffff8108affa>] ? trace_hardirqs_on+0xd/0xf
[11642.379457]  [<ffffffff810b079b>] ? timekeeping_get_ns+0xe/0x33
[11642.380366]  [<ffffffff810b0f61>] ? ktime_get+0x41/0x52
[11642.381353]  [<ffffffff8147ac08>] io_schedule_timeout+0xa0/0x102
[11642.382255]  [<ffffffff8147ac08>] ? io_schedule_timeout+0xa0/0x102
[11642.383162]  [<ffffffff8147b814>] bit_wait_io+0x1b/0x39
[11642.383945]  [<ffffffff8147bb21>] __wait_on_bit_lock+0x4c/0x90
[11642.384875]  [<ffffffff8111829f>] __lock_page+0x66/0x68
[11642.385749]  [<ffffffff81082f29>] ? autoremove_wake_function+0x3a/0x3a
[11642.386721]  [<ffffffffa0450ddd>] lock_page+0x31/0x34 [btrfs]
[11642.387596]  [<ffffffffa0454e3b>] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs]
[11642.389030]  [<ffffffffa0455373>] extent_writepages+0x4b/0x5c [btrfs]
[11642.389973]  [<ffffffff810a25ad>] ? rcu_read_lock_sched_held+0x61/0x69
[11642.390939]  [<ffffffffa043c913>] ? btrfs_writepage_start_hook+0xce/0xce [btrfs]
[11642.392271]  [<ffffffffa0451c32>] ? __clear_extent_bit+0x26e/0x2c0 [btrfs]
[11642.393305]  [<ffffffffa043aa82>] btrfs_writepages+0x28/0x2a [btrfs]
[11642.394239]  [<ffffffff811236bc>] do_writepages+0x23/0x2c
[11642.395045]  [<ffffffff811198c9>] __filemap_fdatawrite_range+0x5a/0x61
[11642.395991]  [<ffffffff81119946>] filemap_fdatawrite_range+0x13/0x15
[11642.397144]  [<ffffffffa044f87e>] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs]
[11642.398392]  [<ffffffffa0452094>] ? clear_extent_bit+0x17/0x19 [btrfs]
[11642.399363]  [<ffffffffa0445945>] btrfs_get_blocks_direct+0x12b/0x61c [btrfs]
[11642.400445]  [<ffffffff8119f7a1>] ? dio_bio_add_page+0x3d/0x54
[11642.401309]  [<ffffffff8119fa93>] ? submit_page_section+0x7b/0x111
[11642.402213]  [<ffffffff811a0258>] do_blockdev_direct_IO+0x685/0xc24
[11642.403139]  [<ffffffffa044581a>] ? btrfs_page_exists_in_range+0x1a1/0x1a1 [btrfs]
[11642.404360]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.406187]  [<ffffffff811a0828>] __blockdev_direct_IO+0x31/0x33
[11642.407070]  [<ffffffff811a0828>] ? __blockdev_direct_IO+0x31/0x33
[11642.407990]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.409192]  [<ffffffffa043b4ca>] btrfs_direct_IO+0x1c7/0x27e [btrfs]
[11642.410146]  [<ffffffffa043d267>] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs]
[11642.411291]  [<ffffffff81119a2c>] generic_file_read_iter+0x89/0x4e1
[11642.412263]  [<ffffffff8108ac05>] ? mark_lock+0x24/0x201
[11642.413057]  [<ffffffff8116e1f8>] __vfs_read+0x79/0x9d
[11642.413897]  [<ffffffff8116e6f1>] vfs_read+0x8f/0xd2
[11642.414708]  [<ffffffff8116ef3d>] SyS_read+0x50/0x7e
[11642.415573]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.416572] 1 lock held by fdm-stress/26850:
[11642.417345]  #0:  (&f->f_pos_lock){+.+.+.}, at: [<ffffffff811877e8>] __fdget_pos+0x3a/0x40
[11642.418703] INFO: task fdm-stress:26851 blocked for more than 120 seconds.
[11642.419698]       Not tainted 4.4.0-rc6-btrfs-next-21+ #1
[11642.420612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11642.421807] fdm-stress      D ffff880196483d28     0 26851  26591 0x00000000
[11642.422878]  ffff880196483d28 00ff8801c8f60740 0000000000014ec0 ffff88023ed94ec0
[11642.424149]  ffff8801c8f60740 ffff880196484000 0000000000000246 ffff8801c8f60740
[11642.425374]  ffff8801bb711840 ffff8801bb711878 ffff880196483d40 ffffffff8147b541
[11642.426591] Call Trace:
[11642.427013]  [<ffffffff8147b541>] schedule+0x82/0x9a
[11642.427856]  [<ffffffff8147b6d5>] schedule_preempt_disabled+0x18/0x24
[11642.428852]  [<ffffffff8147c23a>] mutex_lock_nested+0x1d7/0x3b4
[11642.429743]  [<ffffffffa044f456>] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.430911]  [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.432102]  [<ffffffffa044f674>] ? btrfs_wait_ordered_roots+0x57/0x191 [btrfs]
[11642.433259]  [<ffffffffa044f456>] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs]
[11642.434431]  [<ffffffffa044f6ea>] btrfs_wait_ordered_roots+0xcd/0x191 [btrfs]
[11642.436079]  [<ffffffffa0410cab>] btrfs_sync_fs+0xe0/0x1ad [btrfs]
[11642.437009]  [<ffffffff81197900>] ? SyS_tee+0x23c/0x23c
[11642.437860]  [<ffffffff81197920>] sync_fs_one_sb+0x20/0x22
[11642.438723]  [<ffffffff81171435>] iterate_supers+0x75/0xc2
[11642.439597]  [<ffffffff81197d00>] sys_sync+0x52/0x80
[11642.440454]  [<ffffffff8147fa97>] entry_SYSCALL_64_fastpath+0x12/0x6b
[11642.441533] 3 locks held by fdm-stress/26851:
[11642.442370]  #0:  (&type->s_umount_key#37){+++++.}, at: [<ffffffff8117141f>] iterate_supers+0x5f/0xc2
[11642.444043]  #1:  (&fs_info->ordered_operations_mutex){+.+...}, at: [<ffffffffa044f661>] btrfs_wait_ordered_roots+0x44/0x191 [btrfs]
[11642.446010]  #2:  (&root->ordered_extent_mutex){+.+...}, at: [<ffffffffa044f456>] btrfs_wait_ordered_extents+0x50/0x217 [btrfs]

This happened because under specific timings the path for direct IO reads
can deadlock with concurrent buffered writes. The diagram below shows how
this happens for an example file that has the following layout:

     [  extent A  ]  [  extent B  ]  [ ....
     0K              4K              8K

     CPU 1                                               CPU 2                             CPU 3

DIO read against range
 [0K, 8K[ starts

btrfs_direct_IO()
  --> calls btrfs_get_blocks_direct()
      which finds the extent map for the
      extent A and leaves the range
      [0K, 4K[ locked in the inode's
      io tree

                                                   buffered write against
                                                   range [4K, 8K[ starts

                                                   __btrfs_buffered_write()
                                                     --> dirties page at 4K

                                                                                     a user space
                                                                                     task calls sync
                                                                                     for e.g or
                                                                                     writepages() is
                                                                                     invoked by mm

                                                                                     writepages()
                                                                                       run_delalloc_range()
                                                                                         cow_file_range()
                                                                                           --> ordered extent X
                                                                                               for the buffered
                                                                                               write is created
                                                                                               and
                                                                                               writeback starts

  --> calls btrfs_get_blocks_direct()
      again, without submitting first
      a bio for reading extent A, and
      finds the extent map for extent B

  --> calls lock_extent_direct()

      --> locks range [4K, 8K[
      --> finds ordered extent X
          covering range [4K, 8K[
      --> unlocks range [4K, 8K[

                                                  buffered write against
                                                  range [0K, 8K[ starts

                                                  __btrfs_buffered_write()
                                                    prepare_pages()
                                                      --> locks pages with
                                                          offsets 0 and 4K
                                                    lock_and_cleanup_extent_if_need()
                                                      --> blocks attempting to
                                                          lock range [0K, 8K[ in
                                                          the inode's io tree,
                                                          because the range [0, 4K[
                                                          is already locked by the
                                                          direct IO task at CPU 1

      --> calls
          btrfs_start_ordered_extent(oe X)

          btrfs_start_ordered_extent(oe X)

            --> At this point writeback for ordered
                extent X has not finished yet

            filemap_fdatawrite_range()
              btrfs_writepages()
                extent_writepages()
                  extent_write_cache_pages()
                    --> finds page with offset 0
                        with the writeback tag
                        (and not dirty)
                    --> tries to lock it
                         --> deadlock, task at CPU 2
                             has the page locked and
                             is blocked on the io range
                             [0, 4K[ that was locked
                             earlier by this task

So fix this by falling back to a buffered read in the direct IO read path
when an ordered extent for a buffered write is found.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8eb64c9

@popcornmix popcornmix pushed a commit that referenced this issue May 19, 2016

@mchehab @gregkh mchehab + gregkh Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeue…
…ing"

commit 93f0750 upstream.

This patch causes a Kernel panic when called on a DVB driver.

This was also reported by David R <david@unsolicited.net>:

May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport
May  7 14:47:35 server kernel: [  501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
May  7 14:47:35 server kernel: [  501.250794] Stack:
May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
May  7 14:47:35 server kernel: [  501.251165] Call Trace:
May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---

This reverts commit 2c1f695.

Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Fixes: 2c1f695 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9df2dc6

@popcornmix popcornmix added a commit that referenced this issue May 31, 2016

@mchehab @popcornmix mchehab + popcornmix Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeue…
…ing"

commit 93f0750 upstream.

This patch causes a Kernel panic when called on a DVB driver.

This was also reported by David R <david@unsolicited.net>:

May  7 14:47:35 server kernel: [  501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
May  7 14:47:35 server kernel: [  501.247239] IP: [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.247354] PGD cae6f067 PUD ca99c067 PMD 0
May  7 14:47:35 server kernel: [  501.247426] Oops: 0000 [#1] SMP
May  7 14:47:35 server kernel: [  501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport
May  7 14:47:35 server kernel: [  501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3
May  7 14:47:35 server kernel: [  501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211    07/08/2009
May  7 14:47:35 server kernel: [  501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000
May  7 14:47:35 server kernel: [  501.249861] RIP: 0010:[<ffffffffa0222c71>]  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.250002] RSP: 0018:ffff8801e07a3de8  EFLAGS: 00010086
May  7 14:47:35 server kernel: [  501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283
May  7 14:47:35 server kernel: [  501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014
May  7 14:47:35 server kernel: [  501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000
May  7 14:47:35 server kernel: [  501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8
May  7 14:47:35 server kernel: [  501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828
May  7 14:47:35 server kernel: [  501.250528] FS:  00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40
May  7 14:47:35 server kernel: [  501.250631] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
May  7 14:47:35 server kernel: [  501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0
May  7 14:47:35 server kernel: [  501.250794] Stack:
May  7 14:47:35 server kernel: [  501.250822]  ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb
May  7 14:47:35 server kernel: [  501.250937]  0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000
May  7 14:47:35 server kernel: [  501.251051]  ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38
May  7 14:47:35 server kernel: [  501.251165] Call Trace:
May  7 14:47:35 server kernel: [  501.251200]  [<ffffffffa0222cfd>] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.251306]  [<ffffffffa0236beb>] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251398]  [<ffffffff8108d8e0>] ? prepare_to_wait_event+0x100/0x100
May  7 14:47:35 server kernel: [  501.251482]  [<ffffffffa023855b>] vb2_thread+0x1cb/0x220 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251569]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.251662]  [<ffffffffa0238390>] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core]
May  7 14:47:35 server kernel: [  501.255982]  [<ffffffff8106f984>] kthread+0xc4/0xe0
May  7 14:47:35 server kernel: [  501.260292]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.264615]  [<ffffffff81697a5f>] ret_from_fork+0x3f/0x70
May  7 14:47:35 server kernel: [  501.268962]  [<ffffffff8106f8c0>] ? kthread_park+0x50/0x50
May  7 14:47:35 server kernel: [  501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e
May  7 14:47:35 server kernel: [  501.282146] RIP  [<ffffffffa0222c71>] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2]
May  7 14:47:35 server kernel: [  501.286391]  RSP <ffff8801e07a3de8>
May  7 14:47:35 server kernel: [  501.290619] CR2: 0000000000000004
May  7 14:47:35 server kernel: [  501.294786] ---[ end trace b2b354153ccad110 ]---

This reverts commit 2c1f695.

Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Fixes: 2c1f695 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing")
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cce1987

@popcornmix popcornmix pushed a commit that referenced this issue Jun 2, 2016

@rogerq @gregkh rogerq + gregkh mfd: omap-usb-tll: Fix scheduling while atomic BUG
commit b49b927 upstream.

We shouldn't be calling clk_prepare_enable()/clk_prepare_disable()
in an atomic context.

Fixes the following issue:

[    5.830970] ehci-omap: OMAP-EHCI Host Controller driver
[    5.830974] driver_register 'ehci-omap'
[    5.895849] driver_register 'wl1271_sdio'
[    5.896870] BUG: scheduling while atomic: udevd/994/0x00000002
[    5.896876] 4 locks held by udevd/994:
[    5.896904]  #0:  (&dev->mutex){......}, at: [<c049597c>] __driver_attach+0x60/0xac
[    5.896923]  #1:  (&dev->mutex){......}, at: [<c049598c>] __driver_attach+0x70/0xac
[    5.896946]  #2:  (tll_lock){+.+...}, at: [<c04c2630>] omap_tll_enable+0x2c/0xd0
[    5.896966]  #3:  (prepare_lock){+.+...}, at: [<c05ce9c8>] clk_prepare_lock+0x48/0xe0
[    5.897042] Modules linked in: wlcore_sdio(+) ehci_omap(+) dwc3_omap snd_soc_ts3a225e leds_is31fl319x bq27xxx_battery_i2c tsc2007 bq27xxx_battery bq2429x_charger ina2xx tca8418_keypad as5013 leds_tca6507 twl6040_vibra gpio_twl6040 bmp085_i2c(+) palmas_gpadc usb3503 palmas_pwrbutton bmg160_i2c(+) bmp085 bma150(+) bmg160_core bmp280 input_polldev snd_soc_omap_mcbsp snd_soc_omap_mcpdm snd_soc_omap snd_pcm_dmaengine
[    5.897048] Preemption disabled at:[<  (null)>]   (null)
[    5.897051]
[    5.897059] CPU: 0 PID: 994 Comm: udevd Not tainted 4.6.0-rc5-letux+ #233
[    5.897062] Hardware name: Generic OMAP5 (Flattened Device Tree)
[    5.897076] [<c010e714>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
[    5.897087] [<c010af34>] (show_stack) from [<c040aa7c>] (dump_stack+0x88/0xc0)
[    5.897099] [<c040aa7c>] (dump_stack) from [<c020c558>] (__schedule_bug+0xac/0xd0)
[    5.897111] [<c020c558>] (__schedule_bug) from [<c06f3d44>] (__schedule+0x88/0x7e4)
[    5.897120] [<c06f3d44>] (__schedule) from [<c06f46d8>] (schedule+0x9c/0xc0)
[    5.897129] [<c06f46d8>] (schedule) from [<c06f4904>] (schedule_preempt_disabled+0x14/0x20)
[    5.897140] [<c06f4904>] (schedule_preempt_disabled) from [<c06f64e4>] (mutex_lock_nested+0x258/0x43c)
[    5.897150] [<c06f64e4>] (mutex_lock_nested) from [<c05ce9c8>] (clk_prepare_lock+0x48/0xe0)
[    5.897160] [<c05ce9c8>] (clk_prepare_lock) from [<c05d0e7c>] (clk_prepare+0x10/0x28)
[    5.897169] [<c05d0e7c>] (clk_prepare) from [<c04c2668>] (omap_tll_enable+0x64/0xd0)
[    5.897180] [<c04c2668>] (omap_tll_enable) from [<c04c1728>] (usbhs_runtime_resume+0x18/0x17c)
[    5.897192] [<c04c1728>] (usbhs_runtime_resume) from [<c049d404>] (pm_generic_runtime_resume+0x2c/0x40)
[    5.897202] [<c049d404>] (pm_generic_runtime_resume) from [<c049f180>] (__rpm_callback+0x38/0x68)
[    5.897210] [<c049f180>] (__rpm_callback) from [<c049f220>] (rpm_callback+0x70/0x88)
[    5.897218] [<c049f220>] (rpm_callback) from [<c04a0a00>] (rpm_resume+0x4ec/0x7ec)
[    5.897227] [<c04a0a00>] (rpm_resume) from [<c04a0f48>] (__pm_runtime_resume+0x4c/0x64)
[    5.897236] [<c04a0f48>] (__pm_runtime_resume) from [<c04958dc>] (driver_probe_device+0x30/0x70)
[    5.897246] [<c04958dc>] (driver_probe_device) from [<c04959a4>] (__driver_attach+0x88/0xac)
[    5.897256] [<c04959a4>] (__driver_attach) from [<c04940f8>] (bus_for_each_dev+0x50/0x84)
[    5.897267] [<c04940f8>] (bus_for_each_dev) from [<c0494e40>] (bus_add_driver+0xcc/0x1e4)
[    5.897276] [<c0494e40>] (bus_add_driver) from [<c0496914>] (driver_register+0xac/0xf4)
[    5.897286] [<c0496914>] (driver_register) from [<c01018e0>] (do_one_initcall+0x100/0x1b8)
[    5.897296] [<c01018e0>] (do_one_initcall) from [<c01c7a54>] (do_init_module+0x58/0x1c0)
[    5.897304] [<c01c7a54>] (do_init_module) from [<c01c8a3c>] (SyS_finit_module+0x88/0x90)
[    5.897313] [<c01c8a3c>] (SyS_finit_module) from [<c0107120>] (ret_fast_syscall+0x0/0x1c)
[    5.912697] ------------[ cut here ]------------
[    5.912711] WARNING: CPU: 0 PID: 994 at kernel/sched/core.c:2996 _raw_spin_unlock+0x28/0x58
[    5.912717] DEBUG_LOCKS_WARN_ON(val > preempt_count())

Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
c521599

@davet321 davet321 pushed a commit to davet321/rpi-linux that referenced this issue Jun 2, 2016

@rogerq @gregkh rogerq + gregkh mfd: omap-usb-tll: Fix scheduling while atomic BUG
commit b49b927 upstream.

We shouldn't be calling clk_prepare_enable()/clk_prepare_disable()
in an atomic context.

Fixes the following issue:

[    5.830970] ehci-omap: OMAP-EHCI Host Controller driver
[    5.830974] driver_register 'ehci-omap'
[    5.895849] driver_register 'wl1271_sdio'
[    5.896870] BUG: scheduling while atomic: udevd/994/0x00000002
[    5.896876] 4 locks held by udevd/994:
[    5.896904]  #0:  (&dev->mutex){......}, at: [<c049597c>] __driver_attach+0x60/0xac
[    5.896923]  #1:  (&dev->mutex){......}, at: [<c049598c>] __driver_attach+0x70/0xac
[    5.896946]  #2:  (tll_lock){+.+...}, at: [<c04c2630>] omap_tll_enable+0x2c/0xd0
[    5.896966]  #3:  (prepare_lock){+.+...}, at: [<c05ce9c8>] clk_prepare_lock+0x48/0xe0
[    5.897042] Modules linked in: wlcore_sdio(+) ehci_omap(+) dwc3_omap snd_soc_ts3a225e leds_is31fl319x bq27xxx_battery_i2c tsc2007 bq27xxx_battery bq2429x_charger ina2xx tca8418_keypad as5013 leds_tca6507 twl6040_vibra gpio_twl6040 bmp085_i2c(+) palmas_gpadc usb3503 palmas_pwrbutton bmg160_i2c(+) bmp085 bma150(+) bmg160_core bmp280 input_polldev snd_soc_omap_mcbsp snd_soc_omap_mcpdm snd_soc_omap snd_pcm_dmaengine
[    5.897048] Preemption disabled at:[<  (null)>]   (null)
[    5.897051]
[    5.897059] CPU: 0 PID: 994 Comm: udevd Not tainted 4.6.0-rc5-letux+ #233
[    5.897062] Hardware name: Generic OMAP5 (Flattened Device Tree)
[    5.897076] [<c010e714>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
[    5.897087] [<c010af34>] (show_stack) from [<c040aa7c>] (dump_stack+0x88/0xc0)
[    5.897099] [<c040aa7c>] (dump_stack) from [<c020c558>] (__schedule_bug+0xac/0xd0)
[    5.897111] [<c020c558>] (__schedule_bug) from [<c06f3d44>] (__schedule+0x88/0x7e4)
[    5.897120] [<c06f3d44>] (__schedule) from [<c06f46d8>] (schedule+0x9c/0xc0)
[    5.897129] [<c06f46d8>] (schedule) from [<c06f4904>] (schedule_preempt_disabled+0x14/0x20)
[    5.897140] [<c06f4904>] (schedule_preempt_disabled) from [<c06f64e4>] (mutex_lock_nested+0x258/0x43c)
[    5.897150] [<c06f64e4>] (mutex_lock_nested) from [<c05ce9c8>] (clk_prepare_lock+0x48/0xe0)
[    5.897160] [<c05ce9c8>] (clk_prepare_lock) from [<c05d0e7c>] (clk_prepare+0x10/0x28)
[    5.897169] [<c05d0e7c>] (clk_prepare) from [<c04c2668>] (omap_tll_enable+0x64/0xd0)
[    5.897180] [<c04c2668>] (omap_tll_enable) from [<c04c1728>] (usbhs_runtime_resume+0x18/0x17c)
[    5.897192] [<c04c1728>] (usbhs_runtime_resume) from [<c049d404>] (pm_generic_runtime_resume+0x2c/0x40)
[    5.897202] [<c049d404>] (pm_generic_runtime_resume) from [<c049f180>] (__rpm_callback+0x38/0x68)
[    5.897210] [<c049f180>] (__rpm_callback) from [<c049f220>] (rpm_callback+0x70/0x88)
[    5.897218] [<c049f220>] (rpm_callback) from [<c04a0a00>] (rpm_resume+0x4ec/0x7ec)
[    5.897227] [<c04a0a00>] (rpm_resume) from [<c04a0f48>] (__pm_runtime_resume+0x4c/0x64)
[    5.897236] [<c04a0f48>] (__pm_runtime_resume) from [<c04958dc>] (driver_probe_device+0x30/0x70)
[    5.897246] [<c04958dc>] (driver_probe_device) from [<c04959a4>] (__driver_attach+0x88/0xac)
[    5.897256] [<c04959a4>] (__driver_attach) from [<c04940f8>] (bus_for_each_dev+0x50/0x84)
[    5.897267] [<c04940f8>] (bus_for_each_dev) from [<c0494e40>] (bus_add_driver+0xcc/0x1e4)
[    5.897276] [<c0494e40>] (bus_add_driver) from [<c0496914>] (driver_register+0xac/0xf4)
[    5.897286] [<c0496914>] (driver_register) from [<c01018e0>] (do_one_initcall+0x100/0x1b8)
[    5.897296] [<c01018e0>] (do_one_initcall) from [<c01c7a54>] (do_init_module+0x58/0x1c0)
[    5.897304] [<c01c7a54>] (do_init_module) from [<c01c8a3c>] (SyS_finit_module+0x88/0x90)
[    5.897313] [<c01c8a3c>] (SyS_finit_module) from [<c0107120>] (ret_fast_syscall+0x0/0x1c)
[    5.912697] ------------[ cut here ]------------
[    5.912711] WARNING: CPU: 0 PID: 994 at kernel/sched/core.c:2996 _raw_spin_unlock+0x28/0x58
[    5.912717] DEBUG_LOCKS_WARN_ON(val > preempt_count())

Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
8936cc8

@popcornmix popcornmix pushed a commit that referenced this issue Jun 3, 2016

@rogerq @gregkh rogerq + gregkh mfd: omap-usb-tll: Fix scheduling while atomic BUG
commit b49b927 upstream.

We shouldn't be calling clk_prepare_enable()/clk_prepare_disable()
in an atomic context.

Fixes the following issue:

[    5.830970] ehci-omap: OMAP-EHCI Host Controller driver
[    5.830974] driver_register 'ehci-omap'
[    5.895849] driver_register 'wl1271_sdio'
[    5.896870] BUG: scheduling while atomic: udevd/994/0x00000002
[    5.896876] 4 locks held by udevd/994:
[    5.896904]  #0:  (&dev->mutex){......}, at: [<c049597c>] __driver_attach+0x60/0xac
[    5.896923]  #1:  (&dev->mutex){......}, at: [<c049598c>] __driver_attach+0x70/0xac
[    5.896946]  #2:  (tll_lock){+.+...}, at: [<c04c2630>] omap_tll_enable+0x2c/0xd0
[    5.896966]  #3:  (prepare_lock){+.+...}, at: [<c05ce9c8>] clk_prepare_lock+0x48/0xe0
[    5.897042] Modules linked in: wlcore_sdio(+) ehci_omap(+) dwc3_omap snd_soc_ts3a225e leds_is31fl319x bq27xxx_battery_i2c tsc2007 bq27xxx_battery bq2429x_charger ina2xx tca8418_keypad as5013 leds_tca6507 twl6040_vibra gpio_twl6040 bmp085_i2c(+) palmas_gpadc usb3503 palmas_pwrbutton bmg160_i2c(+) bmp085 bma150(+) bmg160_core bmp280 input_polldev snd_soc_omap_mcbsp snd_soc_omap_mcpdm snd_soc_omap snd_pcm_dmaengine
[    5.897048] Preemption disabled at:[<  (null)>]   (null)
[    5.897051]
[    5.897059] CPU: 0 PID: 994 Comm: udevd Not tainted 4.6.0-rc5-letux+ #233
[    5.897062] Hardware name: Generic OMAP5 (Flattened Device Tree)
[    5.897076] [<c010e714>] (unwind_backtrace) from [<c010af34>] (show_stack+0x10/0x14)
[    5.897087] [<c010af34>] (show_stack) from [<c040aa7c>] (dump_stack+0x88/0xc0)
[    5.897099] [<c040aa7c>] (dump_stack) from [<c020c558>] (__schedule_bug+0xac/0xd0)
[    5.897111] [<c020c558>] (__schedule_bug) from [<c06f3d44>] (__schedule+0x88/0x7e4)
[    5.897120] [<c06f3d44>] (__schedule) from [<c06f46d8>] (schedule+0x9c/0xc0)
[    5.897129] [<c06f46d8>] (schedule) from [<c06f4904>] (schedule_preempt_disabled+0x14/0x20)
[    5.897140] [<c06f4904>] (schedule_preempt_disabled) from [<c06f64e4>] (mutex_lock_nested+0x258/0x43c)
[    5.897150] [<c06f64e4>] (mutex_lock_nested) from [<c05ce9c8>] (clk_prepare_lock+0x48/0xe0)
[    5.897160] [<c05ce9c8>] (clk_prepare_lock) from [<c05d0e7c>] (clk_prepare+0x10/0x28)
[    5.897169] [<c05d0e7c>] (clk_prepare) from [<c04c2668>] (omap_tll_enable+0x64/0xd0)
[    5.897180] [<c04c2668>] (omap_tll_enable) from [<c04c1728>] (usbhs_runtime_resume+0x18/0x17c)
[    5.897192] [<c04c1728>] (usbhs_runtime_resume) from [<c049d404>] (pm_generic_runtime_resume+0x2c/0x40)
[    5.897202] [<c049d404>] (pm_generic_runtime_resume) from [<c049f180>] (__rpm_callback+0x38/0x68)
[    5.897210] [<c049f180>] (__rpm_callback) from [<c049f220>] (rpm_callback+0x70/0x88)
[    5.897218] [<c049f220>] (rpm_callback) from [<c04a0a00>] (rpm_resume+0x4ec/0x7ec)
[    5.897227] [<c04a0a00>] (rpm_resume) from [<c04a0f48>] (__pm_runtime_resume+0x4c/0x64)
[    5.897236] [<c04a0f48>] (__pm_runtime_resume) from [<c04958dc>] (driver_probe_device+0x30/0x70)
[    5.897246] [<c04958dc>] (driver_probe_device) from [<c04959a4>] (__driver_attach+0x88/0xac)
[    5.897256] [<c04959a4>] (__driver_attach) from [<c04940f8>] (bus_for_each_dev+0x50/0x84)
[    5.897267] [<c04940f8>] (bus_for_each_dev) from [<c0494e40>] (bus_add_driver+0xcc/0x1e4)
[    5.897276] [<c0494e40>] (bus_add_driver) from [<c0496914>] (driver_register+0xac/0xf4)
[    5.897286] [<c0496914>] (driver_register) from [<c01018e0>] (do_one_initcall+0x100/0x1b8)
[    5.897296] [<c01018e0>] (do_one_initcall) from [<c01c7a54>] (do_init_module+0x58/0x1c0)
[    5.897304] [<c01c7a54>] (do_init_module) from [<c01c8a3c>] (SyS_finit_module+0x88/0x90)
[    5.897313] [<c01c8a3c>] (SyS_finit_module) from [<c0107120>] (ret_fast_syscall+0x0/0x1c)
[    5.912697] ------------[ cut here ]------------
[    5.912711] WARNING: CPU: 0 PID: 994 at kernel/sched/core.c:2996 _raw_spin_unlock+0x28/0x58
[    5.912717] DEBUG_LOCKS_WARN_ON(val > preempt_count())

Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
0f25db7

@popcornmix popcornmix pushed a commit that referenced this issue Jun 6, 2016

@htejun @torvalds htejun + torvalds memcg: add RCU locking around css_for_each_descendant_pre() in memcg_…
…offline_kmem()

memcg_offline_kmem() may be called from memcg_free_kmem() after a css
init failure.  memcg_free_kmem() is a ->css_free callback which is
called without cgroup_mutex and memcg_offline_kmem() ends up using
css_for_each_descendant_pre() without any locking.  Fix it by adding rcu
read locking around it.

    mkdir: cannot create directory `65530': No space left on device
    ===============================
    [ INFO: suspicious RCU usage. ]
    4.6.0-work+ #321 Not tainted
    -------------------------------
    kernel/cgroup.c:4008 cgroup_mutex or RCU read lock required!
     [  527.243970] other info that might help us debug this:
     [  527.244715]
    rcu_scheduler_active = 1, debug_locks = 0
    2 locks held by kworker/0:5/1664:
     #0:  ("cgroup_destroy"){.+.+..}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0
     #1:  ((&css->destroy_work)#3){+.+...}, at: [<ffffffff81060ab5>] process_one_work+0x165/0x4a0
     [  527.248098] stack backtrace:
    CPU: 0 PID: 1664 Comm: kworker/0:5 Not tainted 4.6.0-work+ #321
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
    Workqueue: cgroup_destroy css_free_work_fn
    Call Trace:
      dump_stack+0x68/0xa1
      lockdep_rcu_suspicious+0xd7/0x110
      css_next_descendant_pre+0x7d/0xb0
      memcg_offline_kmem.part.44+0x4a/0xc0
      mem_cgroup_css_free+0x1ec/0x200
      css_free_work_fn+0x49/0x5e0
      process_one_work+0x1c5/0x4a0
      worker_thread+0x49/0x490
      kthread+0xea/0x100
      ret_from_fork+0x1f/0x40

Link: http://lkml.kernel.org/r/20160526203018.GG23194@mtj.duckdns.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>	[4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3a06bb7

jsarenik commented Jun 8, 2016

#3 testing if pre-formatted text makes a link to issue

jsarenik commented Jun 8, 2016

@popcornmix @davet321 @ollllivier @giraldeau @ffeldbauer @syntheticpp and others...

Please see https://help.github.com/articles/basic-writing-and-formatting-syntax/#quoting-code and use quotes to mark everything (at least what could contain #3) - i.e. three back-ticks on a line by themselves before and after the quoted text.

I understand this is an extra effort when the console output is just pasted. On the other hand it allows some processing (which GitHub does in order to cross-reference issues for example).

popcornmix closed this Jun 8, 2016

@popcornmix popcornmix pushed a commit that referenced this issue Jun 8, 2016

@asj @gregkh asj + gregkh btrfs: fix lock dep warning, move scratch dev out of device_list_mute…
…x and uuid_mutex

commit 779bf3f upstream.

When the replace target fails, the target device will be taken
out of fs device list, scratch + update_dev_time and freed. However
we could do the scratch  + update_dev_time and free part after the
device has been taken out of device list, so that we don't have to
hold the device_list_mutex and uuid_mutex locks.

Reported issue:

[ 5375.718845] ======================================================
[ 5375.718846] [ INFO: possible circular locking dependency detected ]
[ 5375.718849] 4.4.5-scst31x-debug-11+ #40 Not tainted
[ 5375.718849] -------------------------------------------------------
[ 5375.718851] btrfs-health/4662 is trying to acquire lock:
[ 5375.718861]  (sb_writers){.+.+.+}, at: [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
[ 5375.718862]
[ 5375.718862] but task is already holding lock:
[ 5375.718907]  (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
[ 5375.718907]
[ 5375.718907] which lock already depends on the new lock.
[ 5375.718907]
[ 5375.718908]
[ 5375.718908] the existing dependency chain (in reverse order) is:
[ 5375.718911]
[ 5375.718911] -> #3 (&fs_devs->device_list_mutex){+.+.+.}:
[ 5375.718917]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.718921]        [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
[ 5375.718940]        [<ffffffffa0219bf6>] btrfs_show_devname+0x36/0x210 [btrfs]
[ 5375.718945]        [<ffffffff81267079>] show_vfsmnt+0x49/0x150
[ 5375.718948]        [<ffffffff81240b07>] m_show+0x17/0x20
[ 5375.718951]        [<ffffffff81246868>] seq_read+0x2d8/0x3b0
[ 5375.718955]        [<ffffffff8121df28>] __vfs_read+0x28/0xd0
[ 5375.718959]        [<ffffffff8121e806>] vfs_read+0x86/0x130
[ 5375.718962]        [<ffffffff8121f4c9>] SyS_read+0x49/0xa0
[ 5375.718966]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 5375.718968]
[ 5375.718968] -> #2 (namespace_sem){+++++.}:
[ 5375.718971]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.718974]        [<ffffffff81635199>] down_write+0x49/0x80
[ 5375.718977]        [<ffffffff81243593>] lock_mount+0x43/0x1c0
[ 5375.718979]        [<ffffffff81243c13>] do_add_mount+0x23/0xd0
[ 5375.718982]        [<ffffffff81244afb>] do_mount+0x27b/0xe30
[ 5375.718985]        [<ffffffff812459dc>] SyS_mount+0x8c/0xd0
[ 5375.718988]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 5375.718991]
[ 5375.718991] -> #1 (&sb->s_type->i_mutex_key#5){+.+.+.}:
[ 5375.718994]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.718996]        [<ffffffff81633949>] mutex_lock_nested+0x69/0x3c0
[ 5375.719001]        [<ffffffff8122d608>] path_openat+0x468/0x1360
[ 5375.719004]        [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
[ 5375.719007]        [<ffffffff8121da7b>] do_sys_open+0x12b/0x210
[ 5375.719010]        [<ffffffff8121db7e>] SyS_open+0x1e/0x20
[ 5375.719013]        [<ffffffff81637976>] entry_SYSCALL_64_fastpath+0x16/0x7a
[ 5375.719015]
[ 5375.719015] -> #0 (sb_writers){.+.+.+}:
[ 5375.719018]        [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
[ 5375.719021]        [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.719026]        [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
[ 5375.719028]        [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
[ 5375.719031]        [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
[ 5375.719035]        [<ffffffff8122ded2>] path_openat+0xd32/0x1360
[ 5375.719037]        [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
[ 5375.719040]        [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
[ 5375.719043]        [<ffffffff8121d923>] filp_open+0x33/0x60
[ 5375.719073]        [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
[ 5375.719099]        [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
[ 5375.719123]        [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
[ 5375.719150]        [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
[ 5375.719175]        [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
[ 5375.719199]        [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
[ 5375.719222]        [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
[ 5375.719225]        [<ffffffff810a70df>] kthread+0xef/0x110
[ 5375.719229]        [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
[ 5375.719230]
[ 5375.719230] other info that might help us debug this:
[ 5375.719230]
[ 5375.719233] Chain exists of:
[ 5375.719233]   sb_writers --> namespace_sem --> &fs_devs->device_list_mutex
[ 5375.719233]
[ 5375.719234]  Possible unsafe locking scenario:
[ 5375.719234]
[ 5375.719234]        CPU0                    CPU1
[ 5375.719235]        ----                    ----
[ 5375.719236]   lock(&fs_devs->device_list_mutex);
[ 5375.719238]                                lock(namespace_sem);
[ 5375.719239]                                lock(&fs_devs->device_list_mutex);
[ 5375.719241]   lock(sb_writers);
[ 5375.719241]
[ 5375.719241]  *** DEADLOCK ***
[ 5375.719241]
[ 5375.719243] 4 locks held by btrfs-health/4662:
[ 5375.719266]  #0:  (&fs_info->health_mutex){+.+.+.}, at: [<ffffffffa0246303>] health_kthread+0x63/0x490 [btrfs]
[ 5375.719293]  #1:  (&fs_info->dev_replace.lock_finishing_cancel_unmount){+.+.+.}, at: [<ffffffffa02c6611>] btrfs_dev_replace_finishing+0x41/0x990 [btrfs]
[ 5375.719319]  #2:  (uuid_mutex){+.+.+.}, at: [<ffffffffa0282620>] btrfs_destroy_dev_replace_tgtdev+0x20/0x150 [btrfs]
[ 5375.719343]  #3:  (&fs_devs->device_list_mutex){+.+.+.}, at: [<ffffffffa028263c>] btrfs_destroy_dev_replace_tgtdev+0x3c/0x150 [btrfs]
[ 5375.719343]
[ 5375.719343] stack backtrace:
[ 5375.719347] CPU: 2 PID: 4662 Comm: btrfs-health Not tainted 4.4.5-scst31x-debug-11+ #40
[ 5375.719348] Hardware name: Supermicro SYS-6018R-WTRT/X10DRW-iT, BIOS 1.0c 01/07/2015
[ 5375.719352]  0000000000000000 ffff880856f73880 ffffffff813529e3 ffffffff826182a0
[ 5375.719354]  ffffffff8260c090 ffff880856f738c0 ffffffff810d667c ffff880856f73930
[ 5375.719357]  ffff880861f32b40 ffff880861f32b68 0000000000000003 0000000000000004
[ 5375.719357] Call Trace:
[ 5375.719363]  [<ffffffff813529e3>] dump_stack+0x85/0xc2
[ 5375.719366]  [<ffffffff810d667c>] print_circular_bug+0x1ec/0x260
[ 5375.719369]  [<ffffffff810d97ca>] __lock_acquire+0x17ba/0x1ae0
[ 5375.719373]  [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 5375.719376]  [<ffffffff810da4be>] lock_acquire+0xce/0x1e0
[ 5375.719378]  [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
[ 5375.719383]  [<ffffffff810d3bef>] percpu_down_read+0x4f/0xa0
[ 5375.719385]  [<ffffffff812214f7>] ? __sb_start_write+0xb7/0xf0
[ 5375.719387]  [<ffffffff812214f7>] __sb_start_write+0xb7/0xf0
[ 5375.719389]  [<ffffffff81242eb4>] mnt_want_write+0x24/0x50
[ 5375.719393]  [<ffffffff8122ded2>] path_openat+0xd32/0x1360
[ 5375.719415]  [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
[ 5375.719418]  [<ffffffff810f606d>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 5375.719420]  [<ffffffff8122f86e>] do_filp_open+0x7e/0xe0
[ 5375.719423]  [<ffffffff810f615d>] ? rcu_read_lock_sched_held+0x6d/0x80
[ 5375.719426]  [<ffffffff81201a9b>] ? kmem_cache_alloc+0x26b/0x5d0
[ 5375.719430]  [<ffffffff8122e7d4>] ? getname_kernel+0x34/0x120
[ 5375.719433]  [<ffffffff8121d8a4>] file_open_name+0xe4/0x130
[ 5375.719436]  [<ffffffff8121d923>] filp_open+0x33/0x60
[ 5375.719462]  [<ffffffffa02776a6>] update_dev_time+0x16/0x40 [btrfs]
[ 5375.719485]  [<ffffffffa02825be>] btrfs_scratch_superblocks+0x4e/0x90 [btrfs]
[ 5375.719506]  [<ffffffffa0282665>] btrfs_destroy_dev_replace_tgtdev+0x65/0x150 [btrfs]
[ 5375.719530]  [<ffffffffa02c6c80>] btrfs_dev_replace_finishing+0x6b0/0x990 [btrfs]
[ 5375.719554]  [<ffffffffa02c6b23>] ? btrfs_dev_replace_finishing+0x553/0x990 [btrfs]
[ 5375.719576]  [<ffffffffa02c729e>] btrfs_dev_replace_start+0x33e/0x540 [btrfs]
[ 5375.719598]  [<ffffffffa02c7f58>] btrfs_auto_replace_start+0xf8/0x140 [btrfs]
[ 5375.719621]  [<ffffffffa02464e6>] health_kthread+0x246/0x490 [btrfs]
[ 5375.719641]  [<ffffffffa02463d8>] ? health_kthread+0x138/0x490 [btrfs]
[ 5375.719661]  [<ffffffffa02462a0>] ? btrfs_congested_fn+0x180/0x180 [btrfs]
[ 5375.719663]  [<ffffffff810a70df>] kthread+0xef/0x110
[ 5375.719666]  [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
[ 5375.719669]  [<ffffffff81637d2f>] ret_from_fork+0x3f/0x70
[ 5375.719672]  [<ffffffff810a6ff0>] ? kthread_create_on_node+0x200/0x200
[ 5375.719697] ------------[ cut here ]------------

Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
72de21b

@popcornmix popcornmix pushed a commit that referenced this issue Jun 21, 2016

@anderco @jnikula anderco + jnikula drm/i915: Fix NULL pointer deference when out of PLLs in IVB
In commit f9476a6 ("drm/i915: Refactor platform specifics out of
intel_get_shared_dpll()"), the ibx_get_dpll() function lacked an error
check, that can lead to a NULL pointer dereference when trying to enable
three pipes.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
IP: [<ffffffffa0482275>] intel_reference_shared_dpll+0x15/0x100 [i915]
PGD cec87067 PUD d30ce067 PMD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: snd_hda_intel i915 drm_kms_helper drm intel_gtt sch_fq_codel cfg80211 binfmt_misc i2c_algo_bit cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea intel_rapl iosf_mbi x86_pkg_temp_thermal coretemp agpgart kvm_intel snd_hda_codec_hdmi kvm iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic irqbypass aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd psmouse pcspkr snd_hda_codec i2c_i801 snd_hwdep snd_hda_core snd_pcm snd_timer lpc_ich mfd_core snd soundcore wmi evdev tpm_tis tpm [last unloaded: drm]
CPU: 3 PID: 5810 Comm: kms_flip Tainted: G     U  W       4.6.0-test+ #3
Hardware name:                  /DZ77BH-55K, BIOS BHZ7710H.86A.0100.2013.0517.0942 05/17/2013
task: ffff8800d3908040 ti: ffff8801166c8000 task.ti: ffff8801166c8000
RIP: 0010:[<ffffffffa0482275>]  [<ffffffffa0482275>] intel_reference_shared_dpll+0x15/0x100 [i915]
RSP: 0018:ffff8801166cba60  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
RDX: 0000000000000001 RSI: ffff8800d07f1bf8 RDI: 0000000000000000
RBP: ffff8801166cba88 R08: 0000000000000002 R09: ffff8800d32e5698
R10: 0000000000000001 R11: ffff8800cc89ac88 R12: ffff8800d07f1bf8
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f4c3fc8d8c0(0000) GS:ffff88011bcc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000068 CR3: 00000000d3b4c000 CR4: 00000000001406e0
Stack:
 0000000000000000