Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Ramdisk files for emergency kernel #14

Closed
Hexxeh opened this Issue · 2 comments

2 participants

Liam McLoughlin popcornmix
Liam McLoughlin

Can the FS files for the emergency kernel be made public? They're referred to as ../target_fs in the emergency kernel config, but are nowhere to be found in the repo.

I'm trying to build something that runs entirely out of a ramdisk, and I'm having trouble getting /anything/ running in a ramdisk, so it'd be nice to be able to use the emergency one as a reference.

Liam McLoughlin

Excellent, thanks!

Liam McLoughlin Hexxeh closed this
Chris Boot bootc referenced this issue from a commit in bootc/linux-rpi-orig
Mel Gorman mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

commit 0bf380b upstream.

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9da11af
Peter Huewe PeterHuewe referenced this issue from a commit in PeterHuewe/linux
Mel Gorman mm: compaction: check pfn_valid when entering a new MAX_ORDER_NR_PAGE…
…S block during isolation for migration

When isolating for migration, migration starts at the start of a zone
which is not necessarily pageblock aligned.  Further, it stops isolating
when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
not aligned.  This allows isolate_migratepages() to call pfn_to_page() on
an invalid PFN which can result in a crash.  This was originally reported
against a 3.0-based kernel with the following trace in a crash dump.

PID: 9902   TASK: d47aecd0  CPU: 0   COMMAND: "memcg_process_s"
 #0 [d72d3ad0] crash_kexec at c028cfdb
 #1 [d72d3b24] oops_end at c05c5322
 #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
 #3 [d72d3bec] bad_area at c0227fb6
 #4 [d72d3c00] do_page_fault at c05c72ec
 #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000  EBX: 000c0000  ECX: 00000001  EDX: 00000807  EBP: 000c0000
    DS:  007b      ESI: 00000001  ES:  007b      EDI: f3000a80  GS:  6f50
    CS:  0060      EIP: c030b15a  ERR: ffffffff  EFLAGS: 00010002
 #6 [d72d3cb4] isolate_migratepages at c030b15a
 #7 [d72d3d14] zone_watermark_ok at c02d26cb
 #8 [d72d3d2c] compact_zone at c030b8de
 #9 [d72d3d68] compact_zone_order at c030bba1
#10 [d72d3db4] try_to_compact_pages at c030bc84
#11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
#12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
#13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
#14 [d72d3eb8] alloc_pages_vma at c030a845
#15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
#16 [d72d3f00] handle_mm_fault at c02f36c6
#17 [d72d3f30] do_page_fault at c05c70ed
#18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000  EBX: 00000001  ECX: 00001600  EDX: 00000431
    DS:  007b      ESI: 08048950  ES:  007b      EDI: bfaa3788
    SS:  007b      ESP: bfaa36e0  EBP: bfaa3828  GS:  6f50
    CS:  0073      EIP: 080487c8  ERR: ffffffff  EFLAGS: 00010202

It was also reported by Herbert van den Bergh against 3.1-based kernel
with the following snippet from the console log.

BUG: unable to handle kernel paging request at 01c00008
IP: [<c0522399>] isolate_migratepages+0x119/0x390
*pdpt = 000000002f7ce001 *pde = 0000000000000000

It is expected that it also affects 3.2.x and current mainline.

The problem is that pfn_valid is only called on the first PFN being
checked and that PFN is not necessarily aligned.  Lets say we have a case
like this

H = MAX_ORDER_NR_PAGES boundary
| = pageblock boundary
m = cc->migrate_pfn
f = cc->free_pfn
o = memory hole

H------|------H------|----m-Hoooooo|ooooooH-f----|------H

The migrate_pfn is just below a memory hole and the free scanner is beyond
the hole.  When isolate_migratepages started, it scans from migrate_pfn to
migrate_pfn+pageblock_nr_pages which is now in a memory hole.  It checks
pfn_valid() on the first PFN but then scans into the hole where there are
not necessarily valid struct pages.

This patch ensures that isolate_migratepages calls pfn_valid when
necessary.

Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
0bf380b
Peter Huewe PeterHuewe referenced this issue from a commit in PeterHuewe/linux
Joe Perches checkpatch: Warn on code with 6+ tab indentation
Overly indented code should be refactored.

Suggest refactoring excessive indentation of of
if/else/for/do/while/switch statements.

For example:

$ cat t.c
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{

	if (1)
		if (2)
			if (3)
				if (4)
					if (5)
						if (6)
							if (7)
								if (8)
									;
	return 0;
}

$ ./scripts/checkpatch.pl -f t.c
WARNING: Too many leading tabs - consider code refactoring
#12: FILE: t.c:12:
+						if (6)

WARNING: Too many leading tabs - consider code refactoring
#13: FILE: t.c:13:
+							if (7)

WARNING: Too many leading tabs - consider code refactoring
#14: FILE: t.c:14:
+								if (8)

total: 0 errors, 3 warnings, 17 lines checked

t.c has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8eef05d
Peter Huewe PeterHuewe referenced this issue from a commit in PeterHuewe/linux
yizou yizou ixgbe: do not update real num queues when netdev is going away
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when
upper layer driver, e.g., FCoE protocol stack is monitoring the netdev
event of NETDEV_UNREGISTER and calls back to LLD ndo_fcoe_disable() to remove
extra queues allocated for FCoE, the associated txq sysfs kobjects are already
removed, and trying to update the real num queues would cause something like
below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 #7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
9d837ea
popcornmix popcornmix referenced this issue from a commit in popcornmix/linux
Jeff Layton nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
1c88c58
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Ben Myers xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
4026c9f
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Ben Myers xfs: stop the sync worker before xfs_unmountfs
Cancel work of the xfs_sync_worker before teardown of the log in
xfs_unmountfs.  This prevents occasional crashes on unmount like so:

PID: 21602  TASK: ee9df060  CPU: 0   COMMAND: "kworker/0:3"
 #0 [c5377d28] crash_kexec at c0292c94
 #1 [c5377d80] oops_end at c07090c2
 #2 [c5377d98] no_context at c06f614e
 #3 [c5377dbc] __bad_area_nosemaphore at c06f6281
 #4 [c5377df4] bad_area_nosemaphore at c06f629b
 #5 [c5377e00] do_page_fault at c070b0cb
 #6 [c5377e7c] error_code (via page_fault) at c070892c
    EAX: f300c6a8  EBX: f300c6a8  ECX: 000000c0  EDX: 000000c0  EBP: c5377ed0
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000001  GS:  ffffad20
    CS:  0060      EIP: c0481ad0  ERR: ffffffff  EFLAGS: 00010246
 #7 [c5377eb0] atomic64_read_cx8 at c0481ad0
 #8 [c5377ebc] xlog_assign_tail_lsn_locked at f7cc7c6e [xfs]
 #9 [c5377ed4] xfs_trans_ail_delete_bulk at f7ccd520 [xfs]
#10 [c5377f0c] xfs_buf_iodone at f7ccb602 [xfs]
#11 [c5377f24] xfs_buf_do_callbacks at f7cca524 [xfs]
#12 [c5377f30] xfs_buf_iodone_callbacks at f7cca5da [xfs]
#13 [c5377f4c] xfs_buf_iodone_work at f7c718d0 [xfs]
#14 [c5377f58] process_one_work at c024ee4c
#15 [c5377f98] worker_thread at c024f43d
#16 [c5377fbc] kthread at c025326b
#17 [c5377fe8] kernel_thread_helper at c070e834

PID: 26653  TASK: e79143b0  CPU: 3   COMMAND: "umount"
 #0 [cde0fda0] __schedule at c0706595
 #1 [cde0fe28] schedule at c0706b89
 #2 [cde0fe30] schedule_timeout at c0705600
 #3 [cde0fe94] __down_common at c0706098
 #4 [cde0fec8] __down at c0706122
 #5 [cde0fed0] down at c025936f
 #6 [cde0fee0] xfs_buf_lock at f7c7131d [xfs]
 #7 [cde0ff00] xfs_freesb at f7cc2236 [xfs]
 #8 [cde0ff10] xfs_fs_put_super at f7c80f21 [xfs]
 #9 [cde0ff1c] generic_shutdown_super at c0333d7a
#10 [cde0ff38] kill_block_super at c0333e0f
#11 [cde0ff48] deactivate_locked_super at c0334218
#12 [cde0ff58] deactivate_super at c033495d
#13 [cde0ff68] mntput_no_expire at c034bc13
#14 [cde0ff7c] sys_umount at c034cc69
#15 [cde0ffa0] sys_oldumount at c034ccd4
#16 [cde0ffb0] system_call at c0707e66

commit 11159a0 added this to xfs_log_unmount and needs to be cleaned up
at a later date.

Signed-off-by: Ben Myers <bpm@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
0ba6e53
popcornmix popcornmix referenced this issue from a commit
Luck, Tony x86: Remove some noise from boot log when starting cpus
Printing the "start_ip" for every secondary cpu is very noisy on a large
system - and doesn't add any value. Drop this message.

Console log before:
Booting Node   0, Processors  #1
smpboot cpu 1: start_ip = 96000
 #2
smpboot cpu 2: start_ip = 96000
 #3
smpboot cpu 3: start_ip = 96000
 #4
smpboot cpu 4: start_ip = 96000
       ...
 #31
smpboot cpu 31: start_ip = 96000
Brought up 32 CPUs

Console log after:
Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
Booting Node   0, Processors  #16 #17 #18 #19 #20 #21 #22 #23 Ok.
Booting Node   1, Processors  #24 #25 #26 #27 #28 #29 #30 #31
Brought up 32 CPUs

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
140f190
popcornmix popcornmix referenced this issue from a commit
Borislav Petkov x86/smp: Fix topology checks on AMD MCM CPUs
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical" nodes in the same physical socket which is reflected in the
CPU-to-node map on x86 with NUMA.

Which makes this check wrong on the above topologies so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  #6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  #7 #8 #9 #10 #11 Ok.
[    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
[    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
161270f
popcornmix popcornmix referenced this issue from a commit
Eric Dumazet net: l2tp_eth: use LLTX to avoid LOCKDEP splats
Denys Fedoryshchenko reported a LOCKDEP issue with l2tp code.

[ 8683.927442] ======================================================
[ 8683.927555] [ INFO: possible circular locking dependency detected ]
[ 8683.927672] 3.4.1-build-0061 #14 Not tainted
[ 8683.927782] -------------------------------------------------------
[ 8683.927895] swapper/0/0 is trying to acquire lock:
[ 8683.928007]  (slock-AF_INET){+.-...}, at: [<e0fc73ec>]
l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]
[ 8683.928121] but task is already holding lock:
[ 8683.928121]  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>]
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] which lock already depends on the new lock.
[ 8683.928121]
[ 8683.928121]
[ 8683.928121] the existing dependency chain (in reverse order) is:
[ 8683.928121]
[ 8683.928121] -> #1 (_xmit_ETHER#2){+.-...}:
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<c0304e0c>] ip_send_reply+0xf2/0x1ce
[ 8683.928121]        [<c0317dbc>] tcp_v4_send_reset+0x153/0x16f
[ 8683.928121]        [<c0317f4a>] tcp_v4_do_rcv+0x172/0x194
[ 8683.928121]        [<c031929b>] tcp_v4_rcv+0x387/0x5a0
[ 8683.928121]        [<c03001d0>] ip_local_deliver_finish+0x13a/0x1e9
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c030075b>] ip_local_deliver+0x41/0x45
[ 8683.928121]        [<c03005dd>] ip_rcv_finish+0x31a/0x33c
[ 8683.928121]        [<c0300645>] NF_HOOK.clone.11+0x46/0x4d
[ 8683.928121]        [<c0300960>] ip_rcv+0x201/0x23d
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02deae8>] netif_receive_skb+0x4e/0x7d
[ 8683.928121]        [<e08d5ef3>] rtl8139_poll+0x243/0x33d [8139too]
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] -> #0 (slock-AF_INET){+.-...}:
[ 8683.928121]        [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]        [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]        [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]        [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e
[l2tp_core]
[ 8683.928121]        [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f
[l2tp_eth]
[ 8683.928121]        [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]        [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]        [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]        [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]        [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]        [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]        [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]        [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]        [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]        [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]        [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]
[ 8683.928121] other info that might help us debug this:
[ 8683.928121]
[ 8683.928121]  Possible unsafe locking scenario:
[ 8683.928121]
[ 8683.928121]        CPU0                    CPU1
[ 8683.928121]        ----                    ----
[ 8683.928121]   lock(_xmit_ETHER#2);
[ 8683.928121]                                lock(slock-AF_INET);
[ 8683.928121]                                lock(_xmit_ETHER#2);
[ 8683.928121]   lock(slock-AF_INET);
[ 8683.928121]
[ 8683.928121]  *** DEADLOCK ***
[ 8683.928121]
[ 8683.928121] 3 locks held by swapper/0/0:
[ 8683.928121]  #0:  (rcu_read_lock){.+.+..}, at: [<c02dbc10>]
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #1:  (rcu_read_lock_bh){.+....}, at: [<c02dbc10>]
rcu_lock_acquire+0x0/0x30
[ 8683.928121]  #2:  (_xmit_ETHER#2){+.-...}, at: [<c02f062d>]
sch_direct_xmit+0x36/0x119
[ 8683.928121]
[ 8683.928121] stack backtrace:
[ 8683.928121] Pid: 0, comm: swapper/0 Not tainted 3.4.1-build-0061 #14
[ 8683.928121] Call Trace:
[ 8683.928121]  [<c034bdd2>] ? printk+0x18/0x1a
[ 8683.928121]  [<c0158904>] print_circular_bug+0x1ac/0x1b6
[ 8683.928121]  [<c0159f1b>] __lock_acquire+0x9a3/0xc27
[ 8683.928121]  [<c015a561>] lock_acquire+0x71/0x85
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<c034da2d>] _raw_spin_lock+0x33/0x40
[ 8683.928121]  [<e0fc73ec>] ? l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fc73ec>] l2tp_xmit_skb+0x173/0x47e [l2tp_core]
[ 8683.928121]  [<e0fe31fb>] l2tp_eth_dev_xmit+0x1a/0x2f [l2tp_eth]
[ 8683.928121]  [<c02e01e7>] dev_hard_start_xmit+0x333/0x3f2
[ 8683.928121]  [<c02f064c>] sch_direct_xmit+0x55/0x119
[ 8683.928121]  [<c02e0528>] dev_queue_xmit+0x282/0x418
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f524>] arp_xmit+0x22/0x24
[ 8683.928121]  [<c02e02a6>] ? dev_hard_start_xmit+0x3f2/0x3f2
[ 8683.928121]  [<c031f567>] arp_send+0x41/0x48
[ 8683.928121]  [<c031fa7d>] arp_process+0x289/0x491
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c031f4fb>] NF_HOOK.clone.19+0x45/0x4c
[ 8683.928121]  [<c031f7a0>] arp_rcv+0xb1/0xc3
[ 8683.928121]  [<c031f7f4>] ? __neigh_lookup.clone.20+0x42/0x42
[ 8683.928121]  [<c02de91b>] __netif_receive_skb+0x329/0x378
[ 8683.928121]  [<c02de9d3>] process_backlog+0x69/0x130
[ 8683.928121]  [<c02df103>] net_rx_action+0x90/0x15d
[ 8683.928121]  [<c012b2b5>] __do_softirq+0x7b/0x118
[ 8683.928121]  [<c012b23a>] ? local_bh_enable+0xd/0xd
[ 8683.928121]  <IRQ>  [<c012b4d0>] ? irq_exit+0x41/0x91
[ 8683.928121]  [<c0103c6f>] ? do_IRQ+0x79/0x8d
[ 8683.928121]  [<c0157ea1>] ? trace_hardirqs_off_caller+0x2e/0x86
[ 8683.928121]  [<c034ef6e>] ? common_interrupt+0x2e/0x34
[ 8683.928121]  [<c0108a33>] ? default_idle+0x23/0x38
[ 8683.928121]  [<c01091a8>] ? cpu_idle+0x55/0x6f
[ 8683.928121]  [<c033df25>] ? rest_init+0xa1/0xa7
[ 8683.928121]  [<c033de84>] ? __read_lock_failed+0x14/0x14
[ 8683.928121]  [<c0498745>] ? start_kernel+0x303/0x30a
[ 8683.928121]  [<c0498209>] ? repair_env_string+0x51/0x51
[ 8683.928121]  [<c04980a8>] ? i386_start_kernel+0xa8/0xaf

It appears that like most virtual devices, l2tp should be converted to
LLTX mode.

This patch takes care of statistics using atomic_long in both RX and TX
paths, and fix a bug in l2tp_eth_dev_recv(), which was caching skb->data
before a pskb_may_pull() call.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Cc: James Chapman <jchapman@katalix.com>
Cc: Hong zhi guo <honkiko@gmail.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
a2842a1
popcornmix popcornmix referenced this issue from a commit
Fengguang Wu ALSA: es1688 - freeup resources on init failure
This will fix the following oops:

[    6.169981] genirq: Flags mismatch irq 5. 00000000 (ES1688) vs. 00000000 (ES1688)
[    6.170851] Pid: 1, comm: swapper Not tainted 3.5.0-00004-gceee0e9 #14
[    6.170851] Call Trace:
[    6.170851]  [<c1062237>] ? __setup_irq+0x3c7/0x420
[    6.170851]  [<c1062486>] ? request_threaded_irq+0x76/0x140
[    6.170851]  [<c1290220>] ? snd_es1688_ioctl+0x10/0x10
[    6.170851]  [<c10624c2>] ? request_threaded_irq+0xb2/0x140
[    6.170851]  [<c1291196>] ? snd_es1688_create+0x96/0x330
[    6.170851]  [<c138365d>] ? snd_gusextreme_probe+0x18d/0x5a2
[    6.170851]  [<c11c9d80>] ? __driver_attach+0x80/0x80
[    6.170851]  [<c10db22f>] ? sysfs_create_link+0xf/0x20
[    6.170851]  [<c11c9d80>] ? __driver_attach+0x80/0x80
[    6.170851]  [<c11d1502>] ? isa_bus_probe+0x12/0x20
[    6.170851]  [<c11c9b95>] ? driver_probe_device+0x55/0x1c0
[    6.170851]  [<c13ae04f>] ? _raw_spin_unlock+0xf/0x30
[    6.170851]  [<c13705ea>] ? klist_next+0x6a/0xe0
[    6.170851]  [<c11d15c1>] ? isa_bus_match+0x21/0x40
[    6.170851]  [<c11c8a24>] ? bus_for_each_drv+0x34/0x70
[    6.170851]  [<c11c9e4b>] ? device_attach+0x7b/0x90
[    6.170851]  [<c11c9d80>] ? __driver_attach+0x80/0x80
[    6.170851]  [<c11c8bff>] ? bus_probe_device+0x5f/0x80
[    6.170851]  [<c11c7493>] ? device_add+0x573/0x620
[    6.170851]  [<c1042820>] ? complete_all+0x40/0x60
[    6.170851]  [<c13ae08a>] ? _raw_spin_unlock_irqrestore+0x1a/0x30
[    6.170851]  [<c11d16c6>] ? isa_register_driver+0xb6/0x150
[    6.170851]  [<c15c9002>] ? alsa_card_gusmax_init+0xf/0xf
[    6.170851]  [<c15a99bc>] ? do_one_initcall+0x7f/0x12b
[    6.170851]  [<c15a9b7a>] ? kernel_init+0x112/0x1a9
[    6.170851]  [<c15a9423>] ? do_early_param+0x77/0x77
[    6.170851]  [<c15a9a68>] ? do_one_initcall+0x12b/0x12b
[    6.170851]  [<c13aefbe>] ? kernel_thread_helper+0x6/0xd
[    6.190170] es1688: can't grab IRQ 5
[    6.190613] genirq: Flags mismatch irq 5. 00000000 (ES1688) vs. 00000000 (ES1688)
[    6.191566] Pid: 1, comm: swapper Not tainted 3.5.0-00004-gceee0e9 #14
[    6.192394] Call Trace:
[    6.192685]  [<c1062237>] ? __setup_irq+0x3c7/0x420
[    6.193342]  [<c1062486>] ? request_threaded_irq+0x76/0x140
[    6.194081]  [<c1290220>] ? snd_es1688_ioctl+0x10/0x10
[    6.194607]  [<c10624c2>] ? request_threaded_irq+0xb2/0x140
[    6.194607]  [<c1291196>] ? snd_es1688_create+0x96/0x330
[    6.194607]  [<c138365d>] ? snd_gusextreme_probe+0x18d/0x5a2
[    6.194607]  [<c11c9d80>] ? __driver_attach+0x80/0x80
[    6.194607]  [<c10db22f>] ? sysfs_create_link+0xf/0x20
[    6.194607]  [<c11c9d80>] ? __driver_attach+0x80/0x80
[    6.194607]  [<c11d1502>] ? isa_bus_probe+0x12/0x20
[    6.194607]  [<c11c9b95>] ? driver_probe_device+0x55/0x1c0
[    6.194607]  [<c13ae04f>] ? _raw_spin_unlock+0xf/0x30
[    6.194607]  [<c13705ea>] ? klist_next+0x6a/0xe0
[    6.194607]  [<c11d15c1>] ? isa_bus_match+0x21/0x40
[    6.194607]  [<c11c8a24>] ? bus_for_each_drv+0x34/0x70
[    6.194607]  [<c11c9e4b>] ? device_attach+0x7b/0x90
[    6.194607]  [<c11c9d80>] ? __driver_attach+0x80/0x80
[    6.194607]  [<c11c8bff>] ? bus_probe_device+0x5f/0x80
[    6.194607]  [<c11c7493>] ? device_add+0x573/0x620
[    6.194607]  [<c1042820>] ? complete_all+0x40/0x60
[    6.194607]  [<c13ae08a>] ? _raw_spin_unlock_irqrestore+0x1a/0x30
[    6.194607]  [<c11d16c6>] ? isa_register_driver+0xb6/0x150
[    6.194607]  [<c15c9002>] ? alsa_card_gusmax_init+0xf/0xf
[    6.194607]  [<c15a99bc>] ? do_one_initcall+0x7f/0x12b
[    6.194607]  [<c15a9b7a>] ? kernel_init+0x112/0x1a9
[    6.194607]  [<c15a9423>] ? do_early_param+0x77/0x77
[    6.194607]  [<c15a9a68>] ? do_one_initcall+0x12b/0x12b
[    6.194607]  [<c13aefbe>] ? kernel_thread_helper+0x6/0xd
[    6.210779] es1688: can't grab IRQ 5
[    6.211305] gusextreme: probe of gusextreme.0 failed with error -16

Signed-off-by: Daniel Mack <zonque@gmail.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
e5b3542
popcornmix popcornmix referenced this issue from a commit
Jeff Layton nfs: skip commit in releasepage if we're freeing memory for fs-relate…
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
5cf02d0
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Eric Dumazet tuntap: dont use skb after netif_rx_ni(skb)
On Wed, 2012-12-12 at 23:16 -0500, Dave Jones wrote:
> Since todays net merge, I see this when I start openvpn..
>
> general protection fault: 0000 [#1] PREEMPT SMP
> Modules linked in: ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables xfs iTCO_wdt iTCO_vendor_support snd_emu10k1 snd_util_mem snd_ac97_codec coretemp ac97_bus microcode snd_hwdep snd_seq pcspkr snd_pcm snd_page_alloc snd_timer lpc_ich i2c_i801 snd_rawmidi mfd_core snd_seq_device snd e1000e soundcore emu10k1_gp gameport i82975x_edac edac_core vhost_net tun macvtap macvlan kvm_intel kvm binfmt_misc nfsd auth_rpcgss nfs_acl lockd sunrpc btrfs libcrc32c zlib_deflate firewire_ohci sata_sil firewire_core crc_itu_t radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core floppy
> CPU 0
> Pid: 1381, comm: openvpn Not tainted 3.7.0+ #14                  /D975XBX
> RIP: 0010:[<ffffffff815b54a4>]  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
> RSP: 0018:ffff88007d0d9c48  EFLAGS: 00010206
> RAX: 000000000000055d RBX: 6b6b6b6b6b6b6b4b RCX: 1471030a0180040a
> RDX: 0000000000000005 RSI: 00000000ffffffe0 RDI: ffff8800ba83fa80
> RBP: ffff88007d0d9cb8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000101 R12: ffff8800ba83fa80
> R13: 0000000000000008 R14: ffff88007d0d9cc8 R15: ffff8800ba83fa80
> FS:  00007f6637104800(0000) GS:ffff8800bf600000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f563f5b01c4 CR3: 000000007d140000 CR4: 00000000000007f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process openvpn (pid: 1381, threadinfo ffff88007d0d8000, task ffff8800a540cd60)
> Stack:
>  ffff8800ba83fa80 0000000000000296 0000000000000000 0000000000000000
>  ffff88007d0d9cc8 ffffffff815bcff4 ffff88007d0d9ce8 ffffffff815b1831
>  ffff88007d0d9ca8 00000000703f6364 ffff8800ba83fa80 0000000000000000
> Call Trace:
>  [<ffffffff815bcff4>] ? netif_rx+0x114/0x4c0
>  [<ffffffff815b1831>] ? skb_copy_datagram_from_iovec+0x61/0x290
>  [<ffffffff815b672a>] __skb_get_rxhash+0x1a/0xd0
>  [<ffffffffa03b9538>] tun_get_user+0x418/0x810 [tun]
>  [<ffffffff8135f468>] ? delay_tsc+0x98/0xf0
>  [<ffffffff8109605c>] ? __rcu_read_unlock+0x5c/0xa0
>  [<ffffffffa03b9a41>] tun_chr_aio_write+0x81/0xb0 [tun]
>  [<ffffffff81145011>] ? __buffer_unlock_commit+0x41/0x50
>  [<ffffffff811db917>] do_sync_write+0xa7/0xe0
>  [<ffffffff811dc01f>] vfs_write+0xaf/0x190
>  [<ffffffff811dc375>] sys_write+0x55/0xa0
>  [<ffffffff81705540>] tracesys+0xdd/0xe2
> Code: 41 8b 44 24 68 41 2b 44 24 6c 01 de 29 f0 83 f8 03 0f 8e a0 00 00 00 48 63 de 49 03 9c 24 e0 00 00 00 48 85 db 0f 84 72 fe ff ff <8b> 03 41 89 46 08 b8 01 00 00 00 e9 43 fd ff ff 0f 1f 40 00 48
> RIP  [<ffffffff815b54a4>] skb_flow_dissect+0x314/0x3e0
>  RSP <ffff88007d0d9c48>
> ---[ end trace 6d42c834c72c002e ]---
>
>
> Faulting instruction is
>
>    0:	8b 03                	mov    (%rbx),%eax
>
> rbx is slab poison (-20) so this looks like a use-after-free here...
>
>                         flow->ports = *ports;
>  314:   8b 03                   mov    (%rbx),%eax
>  316:   41 89 46 08             mov    %eax,0x8(%r14)
>
> in the inlined skb_header_pointer in skb_flow_dissect
>
> 	Dave
>

commit 96442e4 (tuntap: choose the txq based on rxq) added
a use after free.

Cache rxhash in a temp variable before calling netif_rx_ni()

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Wang <jasowang@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4997442
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Cong Wang bridge: fix some kernel warning in multicast timer
Several people reported the warning: "kernel BUG at kernel/timer.c:729!"
and the stack trace is:

	#7 [ffff880214d25c10] mod_timer+501 at ffffffff8106d905
	#8 [ffff880214d25c50] br_multicast_del_pg.isra.20+261 at ffffffffa0731d25 [bridge]
	#9 [ffff880214d25c80] br_multicast_disable_port+88 at ffffffffa0732948 [bridge]
	#10 [ffff880214d25cb0] br_stp_disable_port+154 at ffffffffa072bcca [bridge]
	#11 [ffff880214d25ce8] br_device_event+520 at ffffffffa072a4e8 [bridge]
	#12 [ffff880214d25d18] notifier_call_chain+76 at ffffffff8164aafc
	#13 [ffff880214d25d50] raw_notifier_call_chain+22 at ffffffff810858f6
	#14 [ffff880214d25d60] call_netdevice_notifiers+45 at ffffffff81536aad
	#15 [ffff880214d25d80] dev_close_many+183 at ffffffff81536d17
	#16 [ffff880214d25dc0] rollback_registered_many+168 at ffffffff81537f68
	#17 [ffff880214d25de8] rollback_registered+49 at ffffffff81538101
	#18 [ffff880214d25e10] unregister_netdevice_queue+72 at ffffffff815390d8
	#19 [ffff880214d25e30] __tun_detach+272 at ffffffffa074c2f0 [tun]
	#20 [ffff880214d25e88] tun_chr_close+45 at ffffffffa074c4bd [tun]
	#21 [ffff880214d25ea8] __fput+225 at ffffffff8119b1f1
	#22 [ffff880214d25ef0] ____fput+14 at ffffffff8119b3fe
	#23 [ffff880214d25f00] task_work_run+159 at ffffffff8107cf7f
	#24 [ffff880214d25f30] do_notify_resume+97 at ffffffff810139e1
	#25 [ffff880214d25f50] int_signal+18 at ffffffff8164f292

this is due to I forgot to check if mp->timer is armed in
br_multicast_del_pg(). This bug is introduced by
commit 9f00b2e (bridge: only expire the mdb entry
when query is received).

Same for __br_mdb_del().

Tested-by: poma <pomidorabelisima@gmail.com>
Reported-by: LiYonghua <809674045@qq.com>
Reported-by: Robert Hancock <hancockrwd@gmail.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
c7e8e8a
popcornmix popcornmix referenced this issue from a commit
Harshula Jayasuriya nfsd: nfsd_open: when dentry_open returns an error do not propagate a…
…s struct file

commit e4daf1f upstream.

The following call chain:
------------------------------------------------------------
nfs4_get_vfs_file
- nfsd_open
  - dentry_open
    - do_dentry_open
      - __get_file_write_access
        - get_write_access
          - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
------------------------------------------------------------

can result in the following state:
------------------------------------------------------------
struct nfs4_file {
...
  fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0},
  fi_access = {{
      counter = 0x1
    }, {
      counter = 0x0
    }},
...
------------------------------------------------------------

1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NULL, hence nfsd_open() is called where we get status set to an error
and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach
nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented.

2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but
nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented.
Thus we leave a landmine in the form of the nfs4_file data structure in
an incorrect state.

3) Eventually, when __nfs4_file_put_access() is called it finds
fi_access[O_WRONLY] being non-zero, it decrements it and calls
nfs4_file_put_fd() which tries to fput -ETXTBSY.
------------------------------------------------------------
...
     [exception RIP: fput+0x9]
     RIP: ffffffff81177fa9  RSP: ffff88062e365c90  RFLAGS: 00010282
     RAX: ffff880c2b3d99cc  RBX: ffff880c2b3d9978  RCX: 0000000000000002
     RDX: dead000000100101  RSI: 0000000000000001  RDI: ffffffffffffffe6
     RBP: ffff88062e365c90   R8: ffff88041fe797d8   R9: ffff88062e365d58
     R10: 0000000000000008  R11: 0000000000000000  R12: 0000000000000001
     R13: 0000000000000007  R14: 0000000000000000  R15: 0000000000000000
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd]
 #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd]
 #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd]
 #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd]
 #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd]
 #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd]
 #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd]
 #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc]
 #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc]
 #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd]
 #19 [ffff88062e365ee8] kthread at ffffffff81090886
 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a
------------------------------------------------------------

Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
ce4eb5e
popcornmix popcornmix referenced this issue from a commit
Harshula Jayasuriya nfsd: nfsd_open: when dentry_open returns an error do not propagate a…
…s struct file

The following call chain:
------------------------------------------------------------
nfs4_get_vfs_file
- nfsd_open
  - dentry_open
    - do_dentry_open
      - __get_file_write_access
        - get_write_access
          - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
------------------------------------------------------------

can result in the following state:
------------------------------------------------------------
struct nfs4_file {
...
  fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0},
  fi_access = {{
      counter = 0x1
    }, {
      counter = 0x0
    }},
...
------------------------------------------------------------

1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NULL, hence nfsd_open() is called where we get status set to an error
and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach
nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented.

2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but
nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented.
Thus we leave a landmine in the form of the nfs4_file data structure in
an incorrect state.

3) Eventually, when __nfs4_file_put_access() is called it finds
fi_access[O_WRONLY] being non-zero, it decrements it and calls
nfs4_file_put_fd() which tries to fput -ETXTBSY.
------------------------------------------------------------
...
     [exception RIP: fput+0x9]
     RIP: ffffffff81177fa9  RSP: ffff88062e365c90  RFLAGS: 00010282
     RAX: ffff880c2b3d99cc  RBX: ffff880c2b3d9978  RCX: 0000000000000002
     RDX: dead000000100101  RSI: 0000000000000001  RDI: ffffffffffffffe6
     RBP: ffff88062e365c90   R8: ffff88041fe797d8   R9: ffff88062e365d58
     R10: 0000000000000008  R11: 0000000000000000  R12: 0000000000000001
     R13: 0000000000000007  R14: 0000000000000000  R15: 0000000000000000
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd]
 #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd]
 #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd]
 #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd]
 #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd]
 #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd]
 #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd]
 #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc]
 #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc]
 #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd]
 #19 [ffff88062e365ee8] kthread at ffffffff81090886
 #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a
------------------------------------------------------------

Cc: stable@vger.kernel.org
Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
e4daf1f
popcornmix popcornmix referenced this issue from a commit
Libin x86/smpboot: Fix announce_cpu() to printk() the last "OK" properly
When booting secondary CPUs, announce_cpu() is called to show which cpu has
been brought up. For example:

[    0.402751] smpboot: Booting Node   0, Processors  #1 #2 #3 #4 #5 OK
[    0.525667] smpboot: Booting Node   1, Processors  #6 #7 #8 #9 #10 #11 OK
[    0.755592] smpboot: Booting Node   0, Processors  #12 #13 #14 #15 #16 #17 OK
[    0.890495] smpboot: Booting Node   1, Processors  #18 #19 #20 #21 #22 #23

But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum
possible cpu id. It should use the maximum present cpu id in case not all
CPUs booted up.

Signed-off-by: Libin <huawei.libin@huawei.com>
Cc: <guohanjun@huawei.com>
Cc: <wangyijing@huawei.com>
Cc: <fenghua.yu@intel.com>
Cc: <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1378378676-18276-1-git-send-email-huawei.libin@huawei.com
[ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
5223948
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Borislav Petkov x86: Improve the printout of the SMP bootup CPU table
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      #1  #2  #3  #4  #5  #6  #7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  #8  #9 #10 #11 #12 #13 #14 #15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: #16 #17 #18 #19 #20 #21 #22 #23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: #24 #25 #26 #27 #28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    #1 #2 #3 #4 #5 #6 #7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
646e29a
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Borislav Petkov x86/boot: Further compress CPUs bootup message
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
a17bce4
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Mikulas Patocka blk-core: Fix memory corruption if blkcg_init_queue fails
If blkcg_init_queue fails, blk_alloc_queue_node doesn't call bdi_destroy
to clean up structures allocated by the backing dev.

------------[ cut here ]------------
WARNING: at lib/debugobjects.c:260 debug_print_object+0x85/0xa0()
ODEBUG: free active (active state 0) object type: percpu_counter hint:           (null)
Modules linked in: dm_loop dm_mod ip6table_filter ip6_tables uvesafb cfbcopyarea cfbimgblt cfbfillrect fbcon font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor fb fbdev ipt_MASQUERADE iptable_nat nf_nat_ipv4 msr nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc tun ipv6 cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand cpufreq_conservative spadfs fuse hid_generic usbhid hid raid0 md_mod dmi_sysfs nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lm85 hwmon_vid snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd_usbmidi_lib snd_rawmidi snd soundcore acpi_cpufreq freq_table mperf sata_svw serverworks kvm_amd ide_core ehci_pci ohci_hcd libata ehci_hcd kvm usbcore tg3 usb_common libphy k10temp pcspkr ptp i2c_piix4 i2c_core evdev microcode hwmon rtc_cmos pps_core e100 skge floppy mii processor button unix
CPU: 0 PID: 2739 Comm: lvchange Tainted: G        W
3.10.15-devel #14
Hardware name: empty empty/S3992-E, BIOS 'V1.06   ' 06/09/2009
 0000000000000009 ffff88023c3c1ae8 ffffffff813c8fd4 ffff88023c3c1b20
 ffffffff810399eb ffff88043d35cd58 ffffffff81651940 ffff88023c3c1bf8
 ffffffff82479d90 0000000000000005 ffff88023c3c1b80 ffffffff81039a67
Call Trace:
 [<ffffffff813c8fd4>] dump_stack+0x19/0x1b
 [<ffffffff810399eb>] warn_slowpath_common+0x6b/0xa0
 [<ffffffff81039a67>] warn_slowpath_fmt+0x47/0x50
 [<ffffffff8122aaaf>] ? debug_check_no_obj_freed+0xcf/0x250
 [<ffffffff81229a15>] debug_print_object+0x85/0xa0
 [<ffffffff8122abe3>] debug_check_no_obj_freed+0x203/0x250
 [<ffffffff8113c4ac>] kmem_cache_free+0x20c/0x3a0
 [<ffffffff811f6709>] blk_alloc_queue_node+0x2a9/0x2c0
 [<ffffffff811f672e>] blk_alloc_queue+0xe/0x10
 [<ffffffffa04c0093>] dm_create+0x1a3/0x530 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6c07>] dev_create+0x57/0x2b0 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6528>] ctl_ioctl+0x268/0x500 [dm_mod]
 [<ffffffff81097662>] ? get_lock_stats+0x22/0x70
 [<ffffffffa04c67ce>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
 [<ffffffff81161aad>] do_vfs_ioctl+0x2ed/0x520
 [<ffffffff8116cfc7>] ? fget_light+0x377/0x4e0
 [<ffffffff81161d2b>] SyS_ioctl+0x4b/0x90
 [<ffffffff813cff16>] system_call_fastpath+0x1a/0x1f
---[ end trace 4b5ff0d55673d986 ]---
------------[ cut here ]------------

This fix should be backported to stable kernels starting with 2.6.37. Note
that in the kernels prior to 3.5 the affected code is different, but the
bug is still there - bdi_init is called and bdi_destroy isn't.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org	# 2.6.37+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
fff4996
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Linus Torvalds torvalds Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/lin…
…ux/kernel/git/tip/tip

Pull x86 boot changes from Ingo Molnar:
 "Two changes that prettify and compactify the SMP bootup output from:

     smpboot: Booting Node   0, Processors  #1 #2 #3 OK
     smpboot: Booting Node   1, Processors  #4 #5 #6 #7 OK
     smpboot: Booting Node   2, Processors  #8 #9 #10 #11 OK
     smpboot: Booting Node   3, Processors  #12 #13 #14 #15 OK
     Brought up 16 CPUs

  to something like:

     x86: Booting SMP configuration:
     .... node  #0, CPUs:        #1  #2  #3
     .... node  #1, CPUs:    #4  #5  #6  #7
     .... node  #2, CPUs:    #8  #9 #10 #11
     .... node  #3, CPUs:   #12 #13 #14 #15
     x86: Booted up 4 nodes, 16 CPUs"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Further compress CPUs bootup message
  x86: Improve the printout of the SMP bootup CPU table
014d595
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Horia Geantă horiag crypto: ccm - Fix handling of zero plaintext when computing mac
There are cases when cryptlen can be zero in crypto_ccm_auth():
-encryptiom: input scatterlist length is zero (no plaintext)
-decryption: input scatterlist contains only the mac
plus the condition of having different source and destination buffers
(or else scatterlist length = max(plaintext_len, ciphertext_len)).

These are not handled correctly, leading to crashes like:

root@p4080ds:~/crypto# insmod tcrypt.ko mode=45
------------[ cut here ]------------
kernel BUG at crypto/scatterwalk.c:37!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=8 P4080 DS
Modules linked in: tcrypt(+) crc32c xts xcbc vmac pcbc ecb gcm ghash_generic gf128mul ccm ctr seqiv
CPU: 3 PID: 1082 Comm: cryptomgr_test Not tainted 3.11.0 #14
task: ee12c5b0 ti: eecd0000 task.ti: eecd0000
NIP: c0204d98 LR: f9225848 CTR: c0204d80
REGS: eecd1b70 TRAP: 0700   Not tainted  (3.11.0)
MSR: 00029002 <CE,EE,ME>  CR: 22044022  XER: 20000000

GPR00: f9225c94 eecd1c20 ee12c5b0 eecd1c28 ee879400 ee879400 00000000 ee607464
GPR08: 00000001 00000001 00000000 006b0000 c0204d80 00000000 00000002 c0698e20
GPR16: ee987000 ee895000 fffffff4 ee879500 00000100 eecd1d58 00000001 00000000
GPR24: ee879400 00000020 00000000 00000000 ee5b2800 ee607430 00000004 ee607460
NIP [c0204d98] scatterwalk_start+0x18/0x30
LR [f9225848] get_data_to_compute+0x28/0x2f0 [ccm]
Call Trace:
[eecd1c20] [f9225974] get_data_to_compute+0x154/0x2f0 [ccm] (unreliable)
[eecd1c70] [f9225c94] crypto_ccm_auth+0x184/0x1d0 [ccm]
[eecd1cb0] [f9225d40] crypto_ccm_encrypt+0x60/0x2d0 [ccm]
[eecd1cf0] [c020d77c] __test_aead+0x3ec/0xe20
[eecd1e20] [c020f35c] test_aead+0x6c/0xe0
[eecd1e40] [c020f420] alg_test_aead+0x50/0xd0
[eecd1e60] [c020e5e4] alg_test+0x114/0x2e0
[eecd1ee0] [c020bd1c] cryptomgr_test+0x4c/0x60
[eecd1ef0] [c0047058] kthread+0xa8/0xb0
[eecd1f40] [c000eb0c] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
0f080000 81290024 552807fe 0f080000 5529003a 4bffffb4 90830000 39400000
39000001 8124000c 2f890000 7d28579e <0f090000> 81240008 91230004 4e800020
---[ end trace 6d652dfcd1be37bd ]---

Cc: <stable@vger.kernel.org>
Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
5638cab
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
fabioestevam fabioestevam ARM: 7907/1: lib: delay-loop: Add align directive to fix BogoMIPS cal…
…culation

Currently mx53 (CortexA8) running at 1GHz reports:
Calibrating delay loop... 663.55 BogoMIPS (lpj=3317760)

Tom Evans verified that alignments of 0x0 and 0x8 run the two instructions of __loop_delay in one clock cycle (1 clock/loop), while alignments of 0x4 and 0xc take 3 clocks to run the loop twice. (1.5 clock/loop)

The original object code looks like this:

00000010 <__loop_const_udelay>:
  10:	e3e01000 	mvn	r1, #0
  14:	e51f201c 	ldr	r2, [pc, #-28]	; 0 <__loop_udelay-0x8>
  18:	e5922000 	ldr	r2, [r2]
  1c:	e0800921 	add	r0, r0, r1, lsr #18
  20:	e1a00720 	lsr	r0, r0, #14
  24:	e0822b21 	add	r2, r2, r1, lsr #22
  28:	e1a02522 	lsr	r2, r2, #10
  2c:	e0000092 	mul	r0, r2, r0
  30:	e0800d21 	add	r0, r0, r1, lsr #26
  34:	e1b00320 	lsrs	r0, r0, #6
  38:	01a0f00e 	moveq	pc, lr

0000003c <__loop_delay>:
  3c:	e2500001 	subs	r0, r0, #1
  40:	8afffffe 	bhi	3c <__loop_delay>
  44:	e1a0f00e 	mov	pc, lr

After adding the 'align 3' directive to __loop_delay (align to 8 bytes):

00000010 <__loop_const_udelay>:
  10:	e3e01000 	mvn	r1, #0
  14:	e51f201c 	ldr	r2, [pc, #-28]	; 0 <__loop_udelay-0x8>
  18:	e5922000 	ldr	r2, [r2]
  1c:	e0800921 	add	r0, r0, r1, lsr #18
  20:	e1a00720 	lsr	r0, r0, #14
  24:	e0822b21 	add	r2, r2, r1, lsr #22
  28:	e1a02522 	lsr	r2, r2, #10
  2c:	e0000092 	mul	r0, r2, r0
  30:	e0800d21 	add	r0, r0, r1, lsr #26
  34:	e1b00320 	lsrs	r0, r0, #6
  38:	01a0f00e 	moveq	pc, lr
  3c:	e320f000 	nop	{0}

00000040 <__loop_delay>:
  40:	e2500001 	subs	r0, r0, #1
  44:	8afffffe 	bhi	40 <__loop_delay>
  48:	e1a0f00e 	mov	pc, lr
  4c:	e320f000 	nop	{0}

, which now reports:
Calibrating delay loop... 996.14 BogoMIPS (lpj=4980736)

Some more test results:

On mx31 (ARM1136) running at 532 MHz, before the patch:
Calibrating delay loop... 351.43 BogoMIPS (lpj=1757184)

On mx31 (ARM1136) running at 532 MHz after the patch:
Calibrating delay loop... 528.79 BogoMIPS (lpj=2643968)

Also tested on mx6 (CortexA9) and on mx27 (ARM926), which shows the same
BogoMIPS value before and after this patch.

Reported-by: Tom Evans <tom_usenet@optusnet.com.au>
Suggested-by: Tom Evans <tom_usenet@optusnet.com.au>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
11d4bb1
popcornmix popcornmix referenced this issue from a commit
Mikulas Patocka blk-core: Fix memory corruption if blkcg_init_queue fails
commit fff4996 upstream.

If blkcg_init_queue fails, blk_alloc_queue_node doesn't call bdi_destroy
to clean up structures allocated by the backing dev.

------------[ cut here ]------------
WARNING: at lib/debugobjects.c:260 debug_print_object+0x85/0xa0()
ODEBUG: free active (active state 0) object type: percpu_counter hint:           (null)
Modules linked in: dm_loop dm_mod ip6table_filter ip6_tables uvesafb cfbcopyarea cfbimgblt cfbfillrect fbcon font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor fb fbdev ipt_MASQUERADE iptable_nat nf_nat_ipv4 msr nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc tun ipv6 cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand cpufreq_conservative spadfs fuse hid_generic usbhid hid raid0 md_mod dmi_sysfs nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lm85 hwmon_vid snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd_usbmidi_lib snd_rawmidi snd soundcore acpi_cpufreq freq_table mperf sata_svw serverworks kvm_amd ide_core ehci_pci ohci_hcd libata ehci_hcd kvm usbcore tg3 usb_common libphy k10temp pcspkr ptp i2c_piix4 i2c_core evdev microcode hwmon rtc_cmos pps_core e100 skge floppy mii processor button unix
CPU: 0 PID: 2739 Comm: lvchange Tainted: G        W
3.10.15-devel #14
Hardware name: empty empty/S3992-E, BIOS 'V1.06   ' 06/09/2009
 0000000000000009 ffff88023c3c1ae8 ffffffff813c8fd4 ffff88023c3c1b20
 ffffffff810399eb ffff88043d35cd58 ffffffff81651940 ffff88023c3c1bf8
 ffffffff82479d90 0000000000000005 ffff88023c3c1b80 ffffffff81039a67
Call Trace:
 [<ffffffff813c8fd4>] dump_stack+0x19/0x1b
 [<ffffffff810399eb>] warn_slowpath_common+0x6b/0xa0
 [<ffffffff81039a67>] warn_slowpath_fmt+0x47/0x50
 [<ffffffff8122aaaf>] ? debug_check_no_obj_freed+0xcf/0x250
 [<ffffffff81229a15>] debug_print_object+0x85/0xa0
 [<ffffffff8122abe3>] debug_check_no_obj_freed+0x203/0x250
 [<ffffffff8113c4ac>] kmem_cache_free+0x20c/0x3a0
 [<ffffffff811f6709>] blk_alloc_queue_node+0x2a9/0x2c0
 [<ffffffff811f672e>] blk_alloc_queue+0xe/0x10
 [<ffffffffa04c0093>] dm_create+0x1a3/0x530 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6c07>] dev_create+0x57/0x2b0 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6528>] ctl_ioctl+0x268/0x500 [dm_mod]
 [<ffffffff81097662>] ? get_lock_stats+0x22/0x70
 [<ffffffffa04c67ce>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
 [<ffffffff81161aad>] do_vfs_ioctl+0x2ed/0x520
 [<ffffffff8116cfc7>] ? fget_light+0x377/0x4e0
 [<ffffffff81161d2b>] SyS_ioctl+0x4b/0x90
 [<ffffffff813cff16>] system_call_fastpath+0x1a/0x1f
---[ end trace 4b5ff0d55673d986 ]---
------------[ cut here ]------------

This fix should be backported to stable kernels starting with 2.6.37. Note
that in the kernels prior to 3.5 the affected code is different, but the
bug is still there - bdi_init is called and bdi_destroy isn't.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
f5360a4
popcornmix popcornmix referenced this issue from a commit
Mikulas Patocka blk-core: Fix memory corruption if blkcg_init_queue fails
commit fff4996 upstream.

If blkcg_init_queue fails, blk_alloc_queue_node doesn't call bdi_destroy
to clean up structures allocated by the backing dev.

------------[ cut here ]------------
WARNING: at lib/debugobjects.c:260 debug_print_object+0x85/0xa0()
ODEBUG: free active (active state 0) object type: percpu_counter hint:           (null)
Modules linked in: dm_loop dm_mod ip6table_filter ip6_tables uvesafb cfbcopyarea cfbimgblt cfbfillrect fbcon font bitblit fbcon_rotate fbcon_cw fbcon_ud fbcon_ccw softcursor fb fbdev ipt_MASQUERADE iptable_nat nf_nat_ipv4 msr nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc tun ipv6 cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand cpufreq_conservative spadfs fuse hid_generic usbhid hid raid0 md_mod dmi_sysfs nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack lm85 hwmon_vid snd_usb_audio snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd_usbmidi_lib snd_rawmidi snd soundcore acpi_cpufreq freq_table mperf sata_svw serverworks kvm_amd ide_core ehci_pci ohci_hcd libata ehci_hcd kvm usbcore tg3 usb_common libphy k10temp pcspkr ptp i2c_piix4 i2c_core evdev microcode hwmon rtc_cmos pps_core e100 skge floppy mii processor button unix
CPU: 0 PID: 2739 Comm: lvchange Tainted: G        W
3.10.15-devel #14
Hardware name: empty empty/S3992-E, BIOS 'V1.06   ' 06/09/2009
 0000000000000009 ffff88023c3c1ae8 ffffffff813c8fd4 ffff88023c3c1b20
 ffffffff810399eb ffff88043d35cd58 ffffffff81651940 ffff88023c3c1bf8
 ffffffff82479d90 0000000000000005 ffff88023c3c1b80 ffffffff81039a67
Call Trace:
 [<ffffffff813c8fd4>] dump_stack+0x19/0x1b
 [<ffffffff810399eb>] warn_slowpath_common+0x6b/0xa0
 [<ffffffff81039a67>] warn_slowpath_fmt+0x47/0x50
 [<ffffffff8122aaaf>] ? debug_check_no_obj_freed+0xcf/0x250
 [<ffffffff81229a15>] debug_print_object+0x85/0xa0
 [<ffffffff8122abe3>] debug_check_no_obj_freed+0x203/0x250
 [<ffffffff8113c4ac>] kmem_cache_free+0x20c/0x3a0
 [<ffffffff811f6709>] blk_alloc_queue_node+0x2a9/0x2c0
 [<ffffffff811f672e>] blk_alloc_queue+0xe/0x10
 [<ffffffffa04c0093>] dm_create+0x1a3/0x530 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6c07>] dev_create+0x57/0x2b0 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6bb0>] ? list_version_get_info+0xe0/0xe0 [dm_mod]
 [<ffffffffa04c6528>] ctl_ioctl+0x268/0x500 [dm_mod]
 [<ffffffff81097662>] ? get_lock_stats+0x22/0x70
 [<ffffffffa04c67ce>] dm_ctl_ioctl+0xe/0x20 [dm_mod]
 [<ffffffff81161aad>] do_vfs_ioctl+0x2ed/0x520
 [<ffffffff8116cfc7>] ? fget_light+0x377/0x4e0
 [<ffffffff81161d2b>] SyS_ioctl+0x4b/0x90
 [<ffffffff813cff16>] system_call_fastpath+0x1a/0x1f
---[ end trace 4b5ff0d55673d986 ]---
------------[ cut here ]------------

This fix should be backported to stable kernels starting with 2.6.37. Note
that in the kernels prior to 3.5 the affected code is different, but the
bug is still there - bdi_init is called and bdi_destroy isn't.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
d8db1a5
popcornmix popcornmix referenced this issue from a commit
Horia Geantă horiag crypto: ccm - Fix handling of zero plaintext when computing mac
commit 5638cab upstream.

There are cases when cryptlen can be zero in crypto_ccm_auth():
-encryptiom: input scatterlist length is zero (no plaintext)
-decryption: input scatterlist contains only the mac
plus the condition of having different source and destination buffers
(or else scatterlist length = max(plaintext_len, ciphertext_len)).

These are not handled correctly, leading to crashes like:

root@p4080ds:~/crypto# insmod tcrypt.ko mode=45
------------[ cut here ]------------
kernel BUG at crypto/scatterwalk.c:37!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=8 P4080 DS
Modules linked in: tcrypt(+) crc32c xts xcbc vmac pcbc ecb gcm ghash_generic gf128mul ccm ctr seqiv
CPU: 3 PID: 1082 Comm: cryptomgr_test Not tainted 3.11.0 #14
task: ee12c5b0 ti: eecd0000 task.ti: eecd0000
NIP: c0204d98 LR: f9225848 CTR: c0204d80
REGS: eecd1b70 TRAP: 0700   Not tainted  (3.11.0)
MSR: 00029002 <CE,EE,ME>  CR: 22044022  XER: 20000000

GPR00: f9225c94 eecd1c20 ee12c5b0 eecd1c28 ee879400 ee879400 00000000 ee607464
GPR08: 00000001 00000001 00000000 006b0000 c0204d80 00000000 00000002 c0698e20
GPR16: ee987000 ee895000 fffffff4 ee879500 00000100 eecd1d58 00000001 00000000
GPR24: ee879400 00000020 00000000 00000000 ee5b2800 ee607430 00000004 ee607460
NIP [c0204d98] scatterwalk_start+0x18/0x30
LR [f9225848] get_data_to_compute+0x28/0x2f0 [ccm]
Call Trace:
[eecd1c20] [f9225974] get_data_to_compute+0x154/0x2f0 [ccm] (unreliable)
[eecd1c70] [f9225c94] crypto_ccm_auth+0x184/0x1d0 [ccm]
[eecd1cb0] [f9225d40] crypto_ccm_encrypt+0x60/0x2d0 [ccm]
[eecd1cf0] [c020d77c] __test_aead+0x3ec/0xe20
[eecd1e20] [c020f35c] test_aead+0x6c/0xe0
[eecd1e40] [c020f420] alg_test_aead+0x50/0xd0
[eecd1e60] [c020e5e4] alg_test+0x114/0x2e0
[eecd1ee0] [c020bd1c] cryptomgr_test+0x4c/0x60
[eecd1ef0] [c0047058] kthread+0xa8/0xb0
[eecd1f40] [c000eb0c] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
0f080000 81290024 552807fe 0f080000 5529003a 4bffffb4 90830000 39400000
39000001 8124000c 2f890000 7d28579e <0f090000> 81240008 91230004 4e800020
---[ end trace 6d652dfcd1be37bd ]---

Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bf2e523
davet321 davet321 referenced this issue from a commit in davet321/rpi-linux
Horia Geantă horiag crypto: ccm - Fix handling of zero plaintext when computing mac
commit 5638cab upstream.

There are cases when cryptlen can be zero in crypto_ccm_auth():
-encryptiom: input scatterlist length is zero (no plaintext)
-decryption: input scatterlist contains only the mac
plus the condition of having different source and destination buffers
(or else scatterlist length = max(plaintext_len, ciphertext_len)).

These are not handled correctly, leading to crashes like:

root@p4080ds:~/crypto# insmod tcrypt.ko mode=45
------------[ cut here ]------------
kernel BUG at crypto/scatterwalk.c:37!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=8 P4080 DS
Modules linked in: tcrypt(+) crc32c xts xcbc vmac pcbc ecb gcm ghash_generic gf128mul ccm ctr seqiv
CPU: 3 PID: 1082 Comm: cryptomgr_test Not tainted 3.11.0 #14
task: ee12c5b0 ti: eecd0000 task.ti: eecd0000
NIP: c0204d98 LR: f9225848 CTR: c0204d80
REGS: eecd1b70 TRAP: 0700   Not tainted  (3.11.0)
MSR: 00029002 <CE,EE,ME>  CR: 22044022  XER: 20000000

GPR00: f9225c94 eecd1c20 ee12c5b0 eecd1c28 ee879400 ee879400 00000000 ee607464
GPR08: 00000001 00000001 00000000 006b0000 c0204d80 00000000 00000002 c0698e20
GPR16: ee987000 ee895000 fffffff4 ee879500 00000100 eecd1d58 00000001 00000000
GPR24: ee879400 00000020 00000000 00000000 ee5b2800 ee607430 00000004 ee607460
NIP [c0204d98] scatterwalk_start+0x18/0x30
LR [f9225848] get_data_to_compute+0x28/0x2f0 [ccm]
Call Trace:
[eecd1c20] [f9225974] get_data_to_compute+0x154/0x2f0 [ccm] (unreliable)
[eecd1c70] [f9225c94] crypto_ccm_auth+0x184/0x1d0 [ccm]
[eecd1cb0] [f9225d40] crypto_ccm_encrypt+0x60/0x2d0 [ccm]
[eecd1cf0] [c020d77c] __test_aead+0x3ec/0xe20
[eecd1e20] [c020f35c] test_aead+0x6c/0xe0
[eecd1e40] [c020f420] alg_test_aead+0x50/0xd0
[eecd1e60] [c020e5e4] alg_test+0x114/0x2e0
[eecd1ee0] [c020bd1c] cryptomgr_test+0x4c/0x60
[eecd1ef0] [c0047058] kthread+0xa8/0xb0
[eecd1f40] [c000eb0c] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
0f080000 81290024 552807fe 0f080000 5529003a 4bffffb4 90830000 39400000
39000001 8124000c 2f890000 7d28579e <0f090000> 81240008 91230004 4e800020
---[ end trace 6d652dfcd1be37bd ]---

Cc: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
9a63ac8
popcornmix popcornmix referenced this issue from a commit
hschaa hschaa netfilter: nf_ct_timestamp: Fix BUG_ON after netns deletion
When having nf_conntrack_timestamp enabled deleting a netns
can lead to the following BUG being triggered:

[63836.660000] Kernel bug detected[#1]:
[63836.660000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.18 #14
[63836.660000] task: 802d9420 ti: 802d2000 task.ti: 802d2000
[63836.660000] $ 0   : 00000000 00000000 00000000 00000000
[63836.660000] $ 4   : 00000001 00000004 00000020 00000020
[63836.660000] $ 8   : 00000000 80064910 00000000 00000000
[63836.660000] $12   : 0bff0002 00000001 00000000 0a0a0abe
[63836.660000] $16   : 802e70a0 85f29d80 00000000 00000004
[63836.660000] $20   : 85fb62a 00000002 802d3bc0 85fb62a
[63836.660000] $24   : 00000000 87138110
[63836.660000] $28   : 802d2000 802d3b40 00000014 871327cc
[63836.660000] Hi    : 000005ff
[63836.660000] Lo    : f2edd000
[63836.660000] epc   : 87138794 __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack]
[63836.660000]     Not tainted
[63836.660000] ra    : 871327cc nf_conntrack_in+0x31c/0x7b8 [nf_conntrack]
[63836.660000] Status: 1100d403 KERNEL EXL IE
[63836.660000] Cause : 00800034
[63836.660000] PrId  : 0001974c (MIPS 74Kc)
[63836.660000] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype xt_owner xt_nat xt_multiport xt_mark xh
[63836.660000] Process swapper (pid: 0, threadinfo=802d2000, task=802d9420, tls=00000000)
[63836.660000] Stack : 802e70a0 871323d4 00000005 87080234 802e70a0 86d2a840 00000000 00000000
[63836.660000] Call Trace:
[63836.660000] [<87138794>] __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack]
[63836.660000] [<871327cc>] nf_conntrack_in+0x31c/0x7b8 [nf_conntrack]
[63836.660000] [<801ff63c>] nf_iterate+0x90/0xec
[63836.660000] [<801ff730>] nf_hook_slow+0x98/0x164
[63836.660000] [<80205968>] ip_rcv+0x3e8/0x40c
[63836.660000] [<801d9754>] __netif_receive_skb_core+0x624/0x6a4
[63836.660000] [<801da124>] process_backlog+0xa4/0x16c
[63836.660000] [<801d9bb4>] net_rx_action+0x10c/0x1e0
[63836.660000] [<8007c5a4>] __do_softirq+0xd0/0x1bc
[63836.660000] [<8007c730>] do_softirq+0x48/0x68
[63836.660000] [<8007c964>] irq_exit+0x54/0x70
[63836.660000] [<80060830>] ret_from_irq+0x0/0x4
[63836.660000] [<8006a9f8>] r4k_wait_irqoff+0x18/0x1c
[63836.660000] [<8009cfb8>] cpu_startup_entry+0xa4/0x104
[63836.660000] [<802eb918>] start_kernel+0x394/0x3ac
[63836.660000]
[63836.660000]
Code: 00821021  8c420000  2c440001 <00040336> 90440011  92350010  90560010  2485ffff  02a5a821
[63837.040000] ---[ end trace ebf660c3ce3b55e7 ]---
[63837.050000] Kernel panic - not syncing: Fatal exception in interrupt
[63837.050000] Rebooting in 3 seconds..

Fix this by not unregistering the conntrack extension in the per-netns
cleanup code.

This bug was introduced in (73f4001 netfilter: nf_ct_tstamp: move
initialization out of pernet_operations).

Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
443d20f
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Prarit Bhargava x86: Add check for number of available vectors before CPU down
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791

When a cpu is downed on a system, the irqs on the cpu are assigned to
other cpus.  It is possible, however, that when a cpu is downed there
aren't enough free vectors on the remaining cpus to account for the
vectors from the cpu that is being downed.

This results in an interesting "overflow" condition where irqs are
"assigned" to a CPU but are not handled.

For example, when downing cpus on a 1-64 logical processor system:

<snip>
[  232.021745] smpboot: CPU 61 is now offline
[  238.480275] smpboot: CPU 62 is now offline
[  245.991080] ------------[ cut here ]------------
[  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
[  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
[  246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
[  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
[  246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[  246.057371]  0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
[  246.065728]  ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
[  246.074073]  0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
[  246.082430] Call Trace:
[  246.085174]  <IRQ>  [<ffffffff8164fbf6>] dump_stack+0x46/0x58
[  246.091633]  [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
[  246.098352]  [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
[  246.104786]  [<ffffffff815710d6>] dev_watchdog+0x246/0x250
[  246.110923]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[  246.119097]  [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
[  246.125224]  [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
[  246.132137]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[  246.140308]  [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
[  246.146933]  [<ffffffff81059a80>] __do_softirq+0xe0/0x220
[  246.152976]  [<ffffffff8165fedc>] call_softirq+0x1c/0x30
[  246.158920]  [<ffffffff810045f5>] do_softirq+0x55/0x90
[  246.164670]  [<ffffffff81059d35>] irq_exit+0xa5/0xb0
[  246.170227]  [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
[  246.177324]  [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
[  246.184041]  <EOI>  [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
[  246.191559]  [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
[  246.198374]  [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
[  246.204900]  [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
[  246.210846]  [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
[  246.217371]  [<ffffffff81646b47>] rest_init+0x77/0x80
[  246.223028]  [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
[  246.229165]  [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
[  246.235787]  [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
[  246.242990]  [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
[  246.249610] ---[ end trace fb74fdef54d79039 ]---
[  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
[  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
Last login: Mon Nov 11 08:35:14 from 10.18.17.119
[root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX

(last lines keep repeating.  ixgbe driver is dead until module reload.)

If the downed cpu has more vectors than are free on the remaining cpus on the
system, it is possible that some vectors are "orphaned" even though they are
assigned to a cpu.  In this case, since the ixgbe driver had a watchdog, the
watchdog fired and notified that something was wrong.

This patch adds a function, check_vectors(), to compare the number of vectors
on the CPU going down and compares it to the number of vectors available on
the system.  If there aren't enough vectors for the CPU to go down, an
error is returned and propogated back to userspace.

v2: Do not need to look at percpu irqs
v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
    Priority Mode
v4: Additional changes suggested by Gong Chen.
v5/v6/v7/v8: Updated comment text

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com
Reviewed-by: Gong Chen <gong.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Janet Morgan <janet.morgan@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ruiv Wang <ruiv.wang@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
da6139e
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Azat Khuzhin azat ext4: initialize multi-block allocator before checking block descriptors
With EXT4FS_DEBUG ext4_count_free_clusters() will call
ext4_read_block_bitmap() without s_group_info initialized, so we need to
initialize multi-block allocator before.

And we can't initialize multi-block allocator without group descriptors,
since it use them.
Also we need to install s_op before initializing multi-block allocator,
because in ext4_mb_init_backend() new inode is created.

Here is bt:
(gdb) bt
 #0  ext4_get_group_info (group=0, sb=0xffff880079a10000) at ext4.h:2430
 #1  ext4_validate_block_bitmap (sb=sb@entry=0xffff880079a10000, desc=desc@entry=0xffff880056510000, block_group=block_group@entry=0,
     bh=bh@entry=0xffff88007bf2b2d8) at balloc.c:358
 #2  0xffffffff81232202 in ext4_wait_block_bitmap (sb=sb@entry=0xffff880079a10000, block_group=block_group@entry=0,
     bh=bh@entry=0xffff88007bf2b2d8) at balloc.c:476
 #3  0xffffffff81232eaf in ext4_read_block_bitmap (sb=sb@entry=0xffff880079a10000, block_group=block_group@entry=0) at balloc.c:489
 #4  0xffffffff81232fc0 in ext4_count_free_clusters (sb=sb@entry=0xffff880079a10000) at balloc.c:665
 #5  0xffffffff81259ffa in ext4_check_descriptors (first_not_zeroed=<synthetic pointer>, sb=0xffff880079a10000) at super.c:2143
 #6  ext4_fill_super (sb=sb@entry=0xffff880079a10000, data=<optimized out>, data@entry=0x0 <irq_stack_union>, silent=silent@entry=0)
     at super.c:3851
 #7  0xffffffff811b8340 in mount_bdev (fs_type=<optimized out>, flags=0, dev_name=<optimized out>, data=0x0 <irq_stack_union>,
     fill_super=fill_super@entry=0xffffffff812589c0 <ext4_fill_super>) at super.c:987
 #8  0xffffffff8124ec35 in ext4_mount (fs_type=<optimized out>, flags=<optimized out>, dev_name=<optimized out>, data=<optimized out>)
     at super.c:5365
 #9  0xffffffff811b8cf9 in mount_fs (type=type@entry=0xffffffff81c71840 <ext4_fs_type>, flags=flags@entry=0,
     name=name@entry=0xffff880077a80c70 "/dev/loop4", data=data@entry=0x0 <irq_stack_union>) at super.c:1090
 #10 0xffffffff811d2ff3 in vfs_kern_mount (type=type@entry=0xffffffff81c71840 <ext4_fs_type>, flags=0,
     name=name@entry=0xffff880077a80c70 "/dev/loop4", data=data@entry=0x0 <irq_stack_union>) at namespace.c:813
 #11 0xffffffff811d55de in do_new_mount (data=0x0 <irq_stack_union>, name=0xffff880077a80c70 "/dev/loop4", mnt_flags=32,
     flags=<optimized out>, fstype=0xffff880077a80ca0 "ext4-insane", path=0xffff88007a5b1ed0) at namespace.c:2068
 #12 do_mount (dev_name=0xffff880077a80c70 "/dev/loop4", dir_name=<optimized out>, type_page=0xffff880077a80ca0 "ext4-insane",
     flags=<optimized out>, flags@entry=3236757504, data_page=0x0 <irq_stack_union>) at namespace.c:2392
 #13 0xffffffff811d6183 in SYSC_mount (data=0x0 <irq_stack_union>, flags=3236757504, type=<optimized out>, dir_name=<optimized out>,
     dev_name=0x7ffad9649c20 "/dev/loop4") at namespace.c:2586
 #14 SyS_mount (dev_name=140715365800992, dir_name=<optimized out>, type=<optimized out>, flags=3236757504, data=0) at namespace.c:2559

Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
52d249e
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Juri Lelli jlelli sched/core: Make dl_b->lock IRQ safe
Fix this lockdep warning:

[   44.804600] =========================================================
[   44.805746] [ INFO: possible irq lock inversion dependency detected ]
[   44.805746] 3.14.0-rc2-test+ #14 Not tainted
[   44.805746] ---------------------------------------------------------
[   44.805746] bash/3674 just changed the state of lock:
[   44.805746]  (&dl_b->lock){+.....}, at: [<ffffffff8106ad15>] sched_rt_handler+0x132/0x248
[   44.805746] but this lock was taken by another, HARDIRQ-safe lock in the past:
[   44.805746]  (&rq->lock){-.-.-.}

and interrupts could create inverse lock ordering between them.

[   44.805746]
[   44.805746] other info that might help us debug this:
[   44.805746]  Possible interrupt unsafe locking scenario:
[   44.805746]
[   44.805746]        CPU0                    CPU1
[   44.805746]        ----                    ----
[   44.805746]   lock(&dl_b->lock);
[   44.805746]                                local_irq_disable();
[   44.805746]                                lock(&rq->lock);
[   44.805746]                                lock(&dl_b->lock);
[   44.805746]   <Interrupt>
[   44.805746]     lock(&rq->lock);

by making dl_b->lock acquiring always IRQ safe.

Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Juri Lelli <juri.lelli@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1392107067-19907-3-git-send-email-juri.lelli@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
4951634
popcornmix popcornmix referenced this issue from a commit
mlombard mlombard [SCSI] st: fix corruption of the st_modedef structures in st_set_opti…
…ons()

When copying the st_modedef structures the devs pointers must be preserved
in the same way as with the cdevs pointers.

This fixes bug 70271: https://bugzilla.kernel.org/show_bug.cgi?id=70271

[  135.037052] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098
[  135.045048] IP: [<ffffffff812af6a1>] kernfs_find_ns+0x21/0x150
[  135.050999] PGD 220623067 PUD 222171067 PMD 0
[  135.055593] Oops: 0000 [#1] SMP
[  135.058938] Modules linked in: bnx2fc cnic uio fcoe libfcoe libfc 8021q mrp scsi_transport_fc garp scsi_tgt stp llc binfmt_misc dm_round_robin dm_multipath uinput iTCO_wdt iTCO_vendor_support microcode sg pcspkr serio_raw osst st(-) i2c_i801 lpc_ich mfd_core e1000e ptp pps_core ipmi_si ipmi_msghandler video tpm_infineon ext4(F) jbd2(F) mbcache(F) sd_mod(F) crc_t10dif(F) crct10dif_common(F) sr_mod(F) cdrom(F) pata_acpi(F) ata_generic(F) ata_piix(F) libata(F) mpt2sas(F) scsi_transport_sas(F) raid_class(F) ast(F) ttm(F) drm_kms_helper(F) drm(F) i2c_algo_bit(F) sysimgblt(F) sysfillrect(F) i2c_core(F) syscopyarea(F) dm_mirror(F) dm_region_hash(F) dm_log(F) dm_mod(F)
[  135.119686] CPU: 2 PID: 2028 Comm: rmmod Tainted: GF            3.14.0-rc1-linux-mainline+ #14
[  135.128453] Hardware name: wortmann To be filled by O.E.M./P8B-M Series, BIOS 6103 12/06/2012
[  135.137127] task: ffff880001de29d0 ti: ffff8802206e4000 task.ti: ffff8802206e4000
[  135.144742] RIP: 0010:[<ffffffff812af6a1>]  [<ffffffff812af6a1>] kernfs_find_ns+0x21/0x150
[  135.153148] RSP: 0018:ffff8802206e5c98  EFLAGS: 00010282
[  135.158562] RAX: ffff880001de29d0 RBX: 0000000000000000 RCX: 0000000000000006
[  135.165814] RDX: 0000000000000000 RSI: ffffffff817627e0 RDI: 0000000000000000
[  135.173040] RBP: ffff8802206e5cc8 R08: 0000000000000000 R09: 0000000000000001
[  135.180303] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff817627e0
[  135.187554] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
[  135.194774] FS:  00007f817c720700(0000) GS:ffff880227200000(0000) knlGS:0000000000000000
[  135.202995] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  135.208878] CR2: 0000000000000098 CR3: 00000002219b0000 CR4: 00000000000407e0
[  135.216139] Stack:
[  135.218185]  ffffffff81af63a0 0000000000000000 ffffffff817627e0 0000000000000000
[  135.225783]  0000000000000000 0000000000000001 ffff8802206e5cf8 ffffffff812af8de
[  135.233347]  ffff880226801900 ffffffff81b43320 0000000000000000 ffff880221a7c1c0
[  135.240972] Call Trace:
[  135.243463]  [<ffffffff812af8de>] kernfs_find_and_get_ns+0x3e/0x70
[  135.249743]  [<ffffffff812ae27d>] sysfs_unmerge_group+0x1d/0x60
[  135.255716]  [<ffffffff81464da9>] pm_qos_sysfs_remove_latency+0x19/0x20
[  135.262430]  [<ffffffff81466a91>] dev_pm_qos_constraints_destroy+0x31/0x1e0
[  135.269500]  [<ffffffff81464de6>] dpm_sysfs_remove+0x16/0x50
[  135.275263]  [<ffffffff8145c077>] device_del+0x47/0x1e0
[  135.280554]  [<ffffffff8145c232>] device_unregister+0x22/0x60
[  135.286406]  [<ffffffffa02e23bd>] remove_cdevs+0x4d/0x90 [st]
[  135.292247]  [<ffffffffa02e78ff>] st_remove+0x3f/0xb0 [st]
[  135.297851]  [<ffffffff8145f39f>] __device_release_driver+0x7f/0xf0
[  135.304237]  [<ffffffff8145f4e8>] driver_detach+0xd8/0xe0
[  135.309722]  [<ffffffff8145e0fc>] bus_remove_driver+0x5c/0xd0
[  135.315553]  [<ffffffff81460170>] driver_unregister+0x30/0x70
[  135.321366]  [<ffffffffa02e97f4>] exit_st+0x5c/0x868 [st]
[  135.326861]  [<ffffffff8111b31a>] SyS_delete_module+0x19a/0x1f0
[  135.332891]  [<ffffffff810e336d>] ? trace_hardirqs_on+0xd/0x10
[  135.338811]  [<ffffffff81141974>] ? __audit_syscall_entry+0x94/0x100
[  135.345282]  [<ffffffff8135b1fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  135.351806]  [<ffffffff816e8de9>] system_call_fastpath+0x16/0x1b
[  135.357859] Code: ff eb e3 0f 1f 80 00 00 00 00 55 48 89 e5 48 83 ec 30 48 89 5d d8 4c 89 65 e0 4c 89 6d e8 4c 89 75 f0 4c 89 7d f8 66 66 66 66 90 <44> 0f b7 bf 98 00 00 00 8b 05 71 6d 87 00 48 89 fb 49 89 f4 49
[  135.378282] RIP  [<ffffffff812af6a1>] kernfs_find_ns+0x21/0x150
[  135.384355]  RSP <ffff8802206e5c98>
[  135.387881] CR2: 0000000000000098
[  135.391298] ---[ end trace 1968409221ddb3c8 ]---

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
d6216c4
popcornmix popcornmix referenced this issue from a commit
Wengang Wang ocfs2: refcount: take rw_lock in ocfs2_reflink
This patch tries to fix this crash:

 #5 [ffff88003c1cd690] do_invalid_op at ffffffff810166d5
 #6 [ffff88003c1cd730] invalid_op at ffffffff8159b2de
    [exception RIP: ocfs2_direct_IO_get_blocks+359]
    RIP: ffffffffa05dfa27  RSP: ffff88003c1cd7e8  RFLAGS: 00010202
    RAX: 0000000000000000  RBX: ffff88003c1cdaa8  RCX: 0000000000000000
    RDX: 000000000000000c  RSI: ffff880027a95000  RDI: ffff88003c79b540
    RBP: ffff88003c1cd858   R8: 0000000000000000   R9: ffffffff815f6ba0
    R10: 00000000000001c9  R11: 00000000000001c9  R12: ffff88002d271500
    R13: 0000000000000001  R14: 0000000000000000  R15: 0000000000001000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff88003c1cd860] do_direct_IO at ffffffff811cd31b
 #8 [ffff88003c1cd950] direct_IO_iovec at ffffffff811cde9c
 #9 [ffff88003c1cd9b0] do_blockdev_direct_IO at ffffffff811ce764
#10 [ffff88003c1cdb80] __blockdev_direct_IO at ffffffff811ce7cc
#11 [ffff88003c1cdbb0] ocfs2_direct_IO at ffffffffa05df756 [ocfs2]
#12 [ffff88003c1cdbe0] generic_file_direct_write_iter at ffffffff8112f935
#13 [ffff88003c1cdc40] ocfs2_file_write_iter at ffffffffa0600ccc [ocfs2]
#14 [ffff88003c1cdd50] do_aio_write at ffffffff8119126c
#15 [ffff88003c1cddc0] aio_rw_vect_retry at ffffffff811d9bb4
#16 [ffff88003c1cddf0] aio_run_iocb at ffffffff811db880
#17 [ffff88003c1cde30] io_submit_one at ffffffff811dc238
#18 [ffff88003c1cde80] do_io_submit at ffffffff811dc437
#19 [ffff88003c1cdf70] sys_io_submit at ffffffff811dc530
#20 [ffff88003c1cdf80] system_call_fastpath at ffffffff8159a159

It crashes at
        BUG_ON(create && (ext_flags & OCFS2_EXT_REFCOUNTED));
in ocfs2_direct_IO_get_blocks.

ocfs2_direct_IO_get_blocks is expecting the OCFS2_EXT_REFCOUNTED be removed in
ocfs2_prepare_inode_for_write() if it was there. But no cluster lock is taken
during the time before (or inside) ocfs2_prepare_inode_for_write() and after
ocfs2_direct_IO_get_blocks().

It can happen in this case:

Node A(which crashes)				Node B
------------------------                 ---------------------------
ocfs2_file_aio_write
  ocfs2_prepare_inode_for_write
    ocfs2_inode_lock
    ...
    ocfs2_inode_unlock
  #no refcount found
....					ocfs2_reflink
                                          ocfs2_inode_lock
                                          ...
                                          ocfs2_inode_unlock
                                          #now, refcount flag set on extent

                                        ...
                                        flush change to disk

ocfs2_direct_IO_get_blocks
  ocfs2_get_clusters
    #extent map miss
    #buffer_head miss
    read extents from disk
  found refcount flag on extent
  crash..

Fix:
Take rw_lock in ocfs2_reflink path

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8a8ad1c
popcornmix popcornmix referenced this issue from a commit
Prarit Bhargava x86: Add check for number of available vectors before CPU down
commit da6139e upstream.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791

When a cpu is downed on a system, the irqs on the cpu are assigned to
other cpus.  It is possible, however, that when a cpu is downed there
aren't enough free vectors on the remaining cpus to account for the
vectors from the cpu that is being downed.

This results in an interesting "overflow" condition where irqs are
"assigned" to a CPU but are not handled.

For example, when downing cpus on a 1-64 logical processor system:

<snip>
[  232.021745] smpboot: CPU 61 is now offline
[  238.480275] smpboot: CPU 62 is now offline
[  245.991080] ------------[ cut here ]------------
[  245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250()
[  246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out
[  246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas
[  246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14
[  246.044451] Hardware name: Intel Corporation S4600LH ........../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
[  246.057371]  0000000000000009 ffff88081fa03d40 ffffffff8164fbf6 ffff88081fa0ee48
[  246.065728]  ffff88081fa03d90 ffff88081fa03d80 ffffffff81054ecc ffff88081fa13040
[  246.074073]  0000000000000000 ffff88200cce0000 0000000000000040 0000000000000000
[  246.082430] Call Trace:
[  246.085174]  <IRQ>  [<ffffffff8164fbf6>] dump_stack+0x46/0x58
[  246.091633]  [<ffffffff81054ecc>] warn_slowpath_common+0x8c/0xc0
[  246.098352]  [<ffffffff81054fb6>] warn_slowpath_fmt+0x46/0x50
[  246.104786]  [<ffffffff815710d6>] dev_watchdog+0x246/0x250
[  246.110923]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[  246.119097]  [<ffffffff8106092a>] call_timer_fn+0x3a/0x110
[  246.125224]  [<ffffffff8106280f>] ? update_process_times+0x6f/0x80
[  246.132137]  [<ffffffff81570e90>] ? dev_deactivate_queue.constprop.31+0x80/0x80
[  246.140308]  [<ffffffff81061db0>] run_timer_softirq+0x1f0/0x2a0
[  246.146933]  [<ffffffff81059a80>] __do_softirq+0xe0/0x220
[  246.152976]  [<ffffffff8165fedc>] call_softirq+0x1c/0x30
[  246.158920]  [<ffffffff810045f5>] do_softirq+0x55/0x90
[  246.164670]  [<ffffffff81059d35>] irq_exit+0xa5/0xb0
[  246.170227]  [<ffffffff8166062a>] smp_apic_timer_interrupt+0x4a/0x60
[  246.177324]  [<ffffffff8165f40a>] apic_timer_interrupt+0x6a/0x70
[  246.184041]  <EOI>  [<ffffffff81505a1b>] ? cpuidle_enter_state+0x5b/0xe0
[  246.191559]  [<ffffffff81505a17>] ? cpuidle_enter_state+0x57/0xe0
[  246.198374]  [<ffffffff81505b5d>] cpuidle_idle_call+0xbd/0x200
[  246.204900]  [<ffffffff8100b7ae>] arch_cpu_idle+0xe/0x30
[  246.210846]  [<ffffffff810a47b0>] cpu_startup_entry+0xd0/0x250
[  246.217371]  [<ffffffff81646b47>] rest_init+0x77/0x80
[  246.223028]  [<ffffffff81d09e8e>] start_kernel+0x3ee/0x3fb
[  246.229165]  [<ffffffff81d0989f>] ? repair_env_string+0x5e/0x5e
[  246.235787]  [<ffffffff81d095a5>] x86_64_start_reservations+0x2a/0x2c
[  246.242990]  [<ffffffff81d0969f>] x86_64_start_kernel+0xf8/0xfc
[  246.249610] ---[ end trace fb74fdef54d79039 ]---
[  246.254807] ixgbe 0000:c2:00.0 p786p1: initiating reset due to tx timeout
[  246.262489] ixgbe 0000:c2:00.0 p786p1: Reset adapter
Last login: Mon Nov 11 08:35:14 from 10.18.17.119
[root@(none) ~]# [  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
[  246.792676] ixgbe 0000:c2:00.0 p786p1: detected SFP+: 5
[  249.231598] ixgbe 0000:c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX

(last lines keep repeating.  ixgbe driver is dead until module reload.)

If the downed cpu has more vectors than are free on the remaining cpus on the
system, it is possible that some vectors are "orphaned" even though they are
assigned to a cpu.  In this case, since the ixgbe driver had a watchdog, the
watchdog fired and notified that something was wrong.

This patch adds a function, check_vectors(), to compare the number of vectors
on the CPU going down and compares it to the number of vectors available on
the system.  If there aren't enough vectors for the CPU to go down, an
error is returned and propogated back to userspace.

v2: Do not need to look at percpu irqs
v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest
    Priority Mode
v4: Additional changes suggested by Gong Chen.
v5/v6/v7/v8: Updated comment text

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-prarit@redhat.com
Reviewed-by: Gong Chen <gong.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Seiji Aguchi <seiji.aguchi@hds.com>
Cc: Yang Zhang <yang.z.zhang@Intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Janet Morgan <janet.morgan@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ruiv Wang <ruiv.wang@gmail.com>
Cc: Gong Chen <gong.chen@linux.intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
26a01cb
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Dave Hansen x86, sched: Add new topology for multi-NUMA-node CPUs
I'm getting the spew below when booting with Haswell (Xeon
E5-2699 v3) CPUs and the "Cluster-on-Die" (CoD) feature enabled
in the BIOS.  It seems similar to the issue that some folks from
AMD ran in to on their systems and addressed in this commit:

  161270f ("x86/smp: Fix topology checks on AMD MCM CPUs")

Both these Intel and AMD systems break an assumption which is
being enforced by topology_sane(): a socket may not contain more
than one NUMA node.

AMD special-cased their system by looking for a cpuid flag.  The
Intel mode is dependent on BIOS options and I do not know of a
way which it is enumerated other than the tables being parsed
during the CPU bringup process.  In other words, we have to trust
the ACPI tables <shudder>.

This detects the situation where a NUMA node occurs at a place in
the middle of the "CPU" sched domains.  It replaces the default
topology with one that relies on the NUMA information from the
firmware (SRAT table) for all levels of sched domains above the
hyperthreads.

This also fixes a sysfs bug.  We used to freak out when we saw
the "mc" group cross a node boundary, so we stopped building the
MC group.  MC gets exported as the 'core_siblings_list' in
/sys/devices/system/cpu/cpu*/topology/ and this caused CPUs with
the same 'physical_package_id' to not be listed together in
'core_siblings_list'.  This violates a statement from
Documentation/ABI/testing/sysfs-devices-system-cpu:

	core_siblings: internal kernel map of cpu#'s hardware threads
	within the same physical_package_id.

	core_siblings_list: human-readable list of the logical CPU
	numbers within the same physical_package_id as cpu#.

The sysfs effects here cause an issue with the hwloc tool where
it gets confused and thinks there are more sockets than are
physically present.

Before this patch, there are two packages:

# cd /sys/devices/system/cpu/
# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1

But 4 _sets_ of core siblings:

# cat cpu*/topology/core_siblings_list | sort | uniq -c
      9 0-8
      9 18-26
      9 27-35
      9 9-17

After this set, there are only 2 sets of core siblings, which
is what we expect for a 2-socket system.

# cat cpu*/topology/physical_package_id | sort | uniq -c
     18 0
     18 1
# cat cpu*/topology/core_siblings_list | sort | uniq -c
     18 0-17
     18 18-35

Example spew:
...
	NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
	 #2  #3  #4  #5  #6  #7  #8
	.... node  #1, CPUs:    #9
	------------[ cut here ]------------
	WARNING: CPU: 9 PID: 0 at /home/ak/hle/linux-hle-2.6/arch/x86/kernel/smpboot.c:306 topology_sane.isra.2+0x74/0x90()
	sched: CPU #9's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
	Modules linked in:
	CPU: 9 PID: 0 Comm: swapper/9 Not tainted 3.17.0-rc1-00293-g8e01c4d-dirty #631
	Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRNDSDP1.86B.0036.R05.1407140519 07/14/2014
	0000000000000009 ffff88046ddabe00 ffffffff8172e485 ffff88046ddabe48
	ffff88046ddabe38 ffffffff8109691d 000000000000b001 0000000000000009
	ffff88086fc12580 000000000000b020 0000000000000009 ffff88046ddabe98
	Call Trace:
	[<ffffffff8172e485>] dump_stack+0x45/0x56
	[<ffffffff8109691d>] warn_slowpath_common+0x7d/0xa0
	[<ffffffff8109698c>] warn_slowpath_fmt+0x4c/0x50
	[<ffffffff81074f94>] topology_sane.isra.2+0x74/0x90
	[<ffffffff8107530e>] set_cpu_sibling_map+0x31e/0x4f0
	[<ffffffff8107568d>] start_secondary+0x1ad/0x240
	---[ end trace 3fe5f587a9fcde61 ]---
	#10 #11 #12 #13 #14 #15 #16 #17
	.... node  #2, CPUs:   #18 #19 #20 #21 #22 #23 #24 #25 #26
	.... node  #3, CPUs:   #27 #28 #29 #30 #31 #32 #33 #34 #35

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
[ Added LLC domain and s/match_mc/match_die/ ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: brice.goglin@gmail.com
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/20140918193334.C065EBCE@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
cebf15e
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
Junxiao Bi ocfs2: fix deadlock due to wrong locking order
For commit ocfs2 journal, ocfs2 journal thread will acquire the mutex
osb->journal->j_trans_barrier and wake up jbd2 commit thread, then it
will wait until jbd2 commit thread done. In order journal mode, jbd2
needs flushing dirty data pages first, and this needs get page lock.
So osb->journal->j_trans_barrier should be got before page lock.

But ocfs2_write_zero_page() and ocfs2_write_begin_inline() obey this
locking order, and this will cause deadlock and hung the whole cluster.

One deadlock catched is the following:

PID: 13449  TASK: ffff8802e2f08180  CPU: 31  COMMAND: "oracle"
 #0 [ffff8802ee3f79b0] __schedule at ffffffff8150a524
 #1 [ffff8802ee3f7a58] schedule at ffffffff8150acbf
 #2 [ffff8802ee3f7a68] rwsem_down_failed_common at ffffffff8150cb85
 #3 [ffff8802ee3f7ad8] rwsem_down_read_failed at ffffffff8150cc55
 #4 [ffff8802ee3f7ae8] call_rwsem_down_read_failed at ffffffff812617a4
 #5 [ffff8802ee3f7b50] ocfs2_start_trans at ffffffffa0498919 [ocfs2]
 #6 [ffff8802ee3f7ba0] ocfs2_zero_start_ordered_transaction at ffffffffa048b2b8 [ocfs2]
 #7 [ffff8802ee3f7bf0] ocfs2_write_zero_page at ffffffffa048e9bd [ocfs2]
 #8 [ffff8802ee3f7c80] ocfs2_zero_extend_range at ffffffffa048ec83 [ocfs2]
 #9 [ffff8802ee3f7ce0] ocfs2_zero_extend at ffffffffa048edfd [ocfs2]
 #10 [ffff8802ee3f7d50] ocfs2_extend_file at ffffffffa049079e [ocfs2]
 #11 [ffff8802ee3f7da0] ocfs2_setattr at ffffffffa04910ed [ocfs2]
 #12 [ffff8802ee3f7e70] notify_change at ffffffff81187d29
 #13 [ffff8802ee3f7ee0] do_truncate at ffffffff8116bbc1
 #14 [ffff8802ee3f7f50] sys_ftruncate at ffffffff8116bcbd
 #15 [ffff8802ee3f7f80] system_call_fastpath at ffffffff81515142
    RIP: 00007f8de750c6f7  RSP: 00007fffe786e478  RFLAGS: 00000206
    RAX: 000000000000004d  RBX: ffffffff81515142  RCX: 0000000000000000
    RDX: 0000000000000200  RSI: 0000000000028400  RDI: 000000000000000d
    RBP: 00007fffe786e040   R8: 0000000000000000   R9: 000000000000000d
    R10: 0000000000000000  R11: 0000000000000206  R12: 000000000000000d
    R13: 00007fffe786e710  R14: 00007f8de70f8340  R15: 0000000000028400
    ORIG_RAX: 000000000000004d  CS: 0033  SS: 002b

crash64> bt
PID: 7610   TASK: ffff88100fd56140  CPU: 1   COMMAND: "ocfs2cmt"
 #0 [ffff88100f4d1c50] __schedule at ffffffff8150a524
 #1 [ffff88100f4d1cf8] schedule at ffffffff8150acbf
 #2 [ffff88100f4d1d08] jbd2_log_wait_commit at ffffffffa01274fd [jbd2]
 #3 [ffff88100f4d1d98] jbd2_journal_flush at ffffffffa01280b4 [jbd2]
 #4 [ffff88100f4d1dd8] ocfs2_commit_cache at ffffffffa0499b14 [ocfs2]
 #5 [ffff88100f4d1e38] ocfs2_commit_thread at ffffffffa0499d38 [ocfs2]
 #6 [ffff88100f4d1ee8] kthread at ffffffff81090db6
 #7 [ffff88100f4d1f48] kernel_thread_helper at ffffffff81516284

crash64> bt
PID: 7609   TASK: ffff88100f2d4480  CPU: 0   COMMAND: "jbd2/dm-20-86"
 #0 [ffff88100def3920] __schedule at ffffffff8150a524
 #1 [ffff88100def39c8] schedule at ffffffff8150acbf
 #2 [ffff88100def39d8] io_schedule at ffffffff8150ad6c
 #3 [ffff88100def39f8] sleep_on_page at ffffffff8111069e
 #4 [ffff88100def3a08] __wait_on_bit_lock at ffffffff8150b30a
 #5 [ffff88100def3a58] __lock_page at ffffffff81110687
 #6 [ffff88100def3ab8] write_cache_pages at ffffffff8111b752
 #7 [ffff88100def3be8] generic_writepages at ffffffff8111b901
 #8 [ffff88100def3c48] journal_submit_data_buffers at ffffffffa0120f67 [jbd2]
 #9 [ffff88100def3cf8] jbd2_journal_commit_transaction at ffffffffa0121372[jbd2]
 #10 [ffff88100def3e68] kjournald2 at ffffffffa0127a86 [jbd2]
 #11 [ffff88100def3ee8] kthread at ffffffff81090db6
 #12 [ffff88100def3f48] kernel_thread_helper at ffffffff81516284

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Alex Chen <alex.chen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
f775da2
Stephen Warren swarren referenced this issue from a commit in swarren/linux-rpi
pranith pranith powerpc: Wire up sys_bpf() syscall
This patch wires up the new syscall sys_bpf() on powerpc.

Passes the tests in samples/bpf:

    #0 add+sub+mul OK
    #1 unreachable OK
    #2 unreachable2 OK
    #3 out of range jump OK
    #4 out of range jump2 OK
    #5 test1 ld_imm64 OK
    #6 test2 ld_imm64 OK
    #7 test3 ld_imm64 OK
    #8 test4 ld_imm64 OK
    #9 test5 ld_imm64 OK
    #10 no bpf_exit OK
    #11 loop (back-edge) OK
    #12 loop2 (back-edge) OK
    #13 conditional loop OK
    #14 read uninitialized register OK
    #15 read invalid register OK
    #16 program doesn't init R0 before exit OK
    #17 stack out of bounds OK
    #18 invalid call insn1 OK
    #19 invalid call insn2 OK
    #20 invalid function call OK
    #21 uninitialized stack1 OK
    #22 uninitialized stack2 OK
    #23 check valid spill/fill OK
    #24 check corrupted spill/fill OK
    #25 invalid src register in STX OK
    #26 invalid dst register in STX OK
    #27 invalid dst register in ST OK
    #28 invalid src register in LDX OK
    #29 invalid dst register in LDX OK
    #30 junk insn OK
    #31 junk insn2 OK
    #32 junk insn3 OK
    #33 junk insn4 OK
    #34 junk insn5 OK
    #35 misaligned read from stack OK
    #36 invalid map_fd for function call OK
    #37 don't check return value before access OK
    #38 access memory with incorrect alignment OK
    #39 sometimes access memory with incorrect alignment OK
    #40 jump test 1 OK
    #41 jump test 2 OK
    #42 jump test 3 OK
    #43 jump test 4 OK

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
[mpe: test using samples/bpf]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
fcbb539
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.