
Update Documentation/ABI/obsolete/proc-pid-oom_adj #19

Closed
wants to merge 1 commit

Conversation

vargheseapm

No description provided.

vargheseapm closed this Jun 1, 2012
xXorAa pushed a commit to xXorAa/linux that referenced this pull request Jun 1, 2012
Printing the "start_ip" for every secondary cpu is very noisy on a large
system - and doesn't add any value. Drop this message.

Console log before:
Booting Node   0, Processors  #1
smpboot cpu 1: start_ip = 96000
 #2
smpboot cpu 2: start_ip = 96000
 #3
smpboot cpu 3: start_ip = 96000
 #4
smpboot cpu 4: start_ip = 96000
       ...
 #31
smpboot cpu 31: start_ip = 96000
Brought up 32 CPUs

Console log after:
Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 Ok.
Booting Node   1, Processors  #8 #9 #10 #11 #12 #13 #14 #15 Ok.
Booting Node   0, Processors  #16 #17 #18 #19 #20 #21 #22 #23 Ok.
Booting Node   1, Processors  #24 #25 #26 #27 #28 #29 #30 #31
Brought up 32 CPUs

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/4f452eb42507460426@agluck-desktop.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
dl9pf pushed a commit to dl9pf/linux that referenced this pull request Jun 2, 2012
If the netdev is already in NETREG_UNREGISTERING/_UNREGISTERED state, do not
update the real num tx queues. netdev_queue_update_kobjects() is already
called via remove_queue_kobjects() at NETREG_UNREGISTERING time. So, when an
upper-layer driver, e.g. the FCoE protocol stack, is monitoring the
NETDEV_UNREGISTER netdev event and calls back into the LLD's ndo_fcoe_disable()
to remove the extra queues allocated for FCoE, the associated txq sysfs kobjects
are already removed, and trying to update the real number of queues causes
something like the oops below:

...
PID: 25138  TASK: ffff88021e64c440  CPU: 3   COMMAND: "kworker/3:3"
 #0 [ffff88021f007760] machine_kexec at ffffffff810226d9
 #1 [ffff88021f0077d0] crash_kexec at ffffffff81089d2d
 #2 [ffff88021f0078a0] oops_end at ffffffff813bca78
 #3 [ffff88021f0078d0] no_context at ffffffff81029e72
 #4 [ffff88021f007920] __bad_area_nosemaphore at ffffffff8102a155
 #5 [ffff88021f0079f0] bad_area_nosemaphore at ffffffff8102a23e
 #6 [ffff88021f007a00] do_page_fault at ffffffff813bf32e
 #7 [ffff88021f007b10] page_fault at ffffffff813bc045
    [exception RIP: sysfs_find_dirent+17]
    RIP: ffffffff81178611  RSP: ffff88021f007bc0  RFLAGS: 00010246
    RAX: ffff88021e64c440  RBX: ffffffff8156cc63  RCX: 0000000000000004
    RDX: ffffffff8156cc63  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88021f007be0   R8: 0000000000000004   R9: 0000000000000008
    R10: ffffffff816fed00  R11: 0000000000000004  R12: 0000000000000000
    R13: ffffffff8156cc63  R14: 0000000000000000  R15: ffff8802222a0000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffff88021f007be8] sysfs_get_dirent at ffffffff81178c07
 #9 [ffff88021f007c18] sysfs_remove_group at ffffffff8117ac27
#10 [ffff88021f007c48] netdev_queue_update_kobjects at ffffffff813178f9
#11 [ffff88021f007c88] netif_set_real_num_tx_queues at ffffffff81303e38
#12 [ffff88021f007cc8] ixgbe_set_num_queues at ffffffffa0249763 [ixgbe]
#13 [ffff88021f007cf8] ixgbe_init_interrupt_scheme at ffffffffa024ea89 [ixgbe]
#14 [ffff88021f007d48] ixgbe_fcoe_disable at ffffffffa0267113 [ixgbe]
#15 [ffff88021f007d68] vlan_dev_fcoe_disable at ffffffffa014fef5 [8021q]
#16 [ffff88021f007d78] fcoe_interface_cleanup at ffffffffa02b7dfd [fcoe]
#17 [ffff88021f007df8] fcoe_destroy_work at ffffffffa02b7f08 [fcoe]
#18 [ffff88021f007e18] process_one_work at ffffffff8105d7ca
#19 [ffff88021f007e68] worker_thread at ffffffff81060513
#20 [ffff88021f007ee8] kthread at ffffffff810648b6
#21 [ffff88021f007f48] kernel_thread_helper at ffffffff813c40f4

Signed-off-by: Yi Zou <yi.zou@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
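
The changelog above only describes the guard in prose; a minimal sketch of what
it could look like in netif_set_real_num_tx_queues() (simplified, not the
literal patch) is:

    int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
    {
            int rc;

            if (txq < 1 || txq > dev->num_tx_queues)
                    return -EINVAL;

            /* Sketch: once unregistration has started, the txq sysfs kobjects
             * are already gone, so leave the queue bookkeeping alone. */
            if (dev->reg_state == NETREG_UNREGISTERING ||
                dev->reg_state == NETREG_UNREGISTERED)
                    return 0;

            rc = netdev_queue_update_kobjects(dev, dev->real_num_tx_queues, txq);
            if (rc)
                    return rc;

            dev->real_num_tx_queues = txq;
            return 0;
    }
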
torvalds pushed a commit that referenced this pull request Jun 16, 2012
The warning below triggers on AMD MCM packages because physical package
IDs on the cores of a _physical_ socket are the same. I.e., this field
says which CPUs belong to the same physical package.

However, the same two CPUs belong to two different internal, i.e.
"logical", nodes in the same physical socket, which is reflected in the
CPU-to-node map on x86 with NUMA.

This makes the check wrong on such topologies, so circumvent it.

[    0.444413] Booting Node   0, Processors  #1 #2 #3 #4 #5 Ok.
[    0.461388] ------------[ cut here ]------------
[    0.465997] WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()
[    0.473960] Hardware name: Dinar
[    0.477170] sched: CPU #6's mc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
[    0.486860] Booting Node   1, Processors  #6
[    0.491104] Modules linked in:
[    0.494141] Pid: 0, comm: swapper/6 Not tainted 3.4.0+ #1
[    0.499510] Call Trace:
[    0.501946]  [<ffffffff8144bf92>] ? topology_sane.clone.1+0x6e/0x81
[    0.508185]  [<ffffffff8102f1fc>] warn_slowpath_common+0x85/0x9d
[    0.514163]  [<ffffffff8102f2b7>] warn_slowpath_fmt+0x46/0x48
[    0.519881]  [<ffffffff8144bf92>] topology_sane.clone.1+0x6e/0x81
[    0.525943]  [<ffffffff8144c234>] set_cpu_sibling_map+0x251/0x371
[    0.532004]  [<ffffffff8144c4ee>] start_secondary+0x19a/0x218
[    0.537729] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.628197]  #7 #8 #9 #10 #11 Ok.
[    0.807108] Booting Node   3, Processors  #12 #13 #14 #15 #16 #17 Ok.
[    0.897587] Booting Node   2, Processors  #18 #19 #20 #21 #22 #23 Ok.
[    0.917443] Brought up 24 CPUs

We ran a topology sanity check test we have here on it and
it all looks ok... hopefully :).

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120529135442.GE29157@aftab.osrc.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
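
A hedged sketch of the kind of exception the changelog describes, assuming the
AMD multi-node (MCM) parts are detected via the X86_FEATURE_AMD_DCM flag (the
actual patch may structure this differently):

    static bool match_mc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
    {
            if (c->phys_proc_id == o->phys_proc_id) {
                    /* AMD MCM: two internal nodes share one physical package,
                     * so the node comparison is expected to differ here. */
                    if (cpu_has(c, X86_FEATURE_AMD_DCM))
                            return true;

                    return topology_sane(c, o, "mc");
            }
            return false;
    }
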
paulburton pushed a commit to skirata/linux that referenced this pull request Jun 25, 2012
commit 97f7f81 upstream.

If oprofile uses the NMI timer interrupt, there is a crash while
unloading the module. The bug can be triggered with oprofile built as a
module and the kernel parameter nolapic set. This patch fixes that.

oprofile: using NMI timer interrupt.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
PGD 42dbca067 PUD 41da6a067 PMD 0
Oops: 0002 [#1] PREEMPT SMP
CPU 5
Modules linked in: oprofile(-) [last unloaded: oprofile]

Pid: 2518, comm: modprobe Not tainted 3.1.0-rc7-00019-gb2fb49d #19 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
RSP: 0018:ffff88041ef71e98  EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffffffa0017100 RCX: dead000000200200
RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
RBP: ffff88041ef71ea8 R08: 0000000000000001 R09: 0000000000000082
R10: 0000000000000000 R11: ffff88041ef71de8 R12: 0000000000000080
R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
FS:  00007fc902f20700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000041cdb6000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 2518, threadinfo ffff88041ef70000, task ffff88041d348040)
Stack:
 ffff88041ef71eb8 ffffffffa0017790 ffff88041ef71eb8 ffffffffa0013532
 ffff88041ef71ec8 ffffffffa00132d6 ffff88041ef71ed8 ffffffffa00159b2
 ffff88041ef71f78 ffffffff81073115 656c69666f72706f 0000000000610200
Call Trace:
 [<ffffffffa0013532>] op_nmi_exit+0x15/0x17 [oprofile]
 [<ffffffffa00132d6>] oprofile_arch_exit+0xe/0x10 [oprofile]
 [<ffffffffa00159b2>] oprofile_exit+0x1e/0x20 [oprofile]
 [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
 [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
 89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
 RSP <ffff88041ef71e98>
CR2: 0000000000000008
---[ end trace 43a541a52956b7b0 ]---

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
vineetgarc referenced this pull request in foss-for-synopsys-dwc-arc-processors/linux Jun 26, 2012
commit 6b16351acbd415e66ba16bf7d473ece1574cf0bc
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jun 24 12:53:04 2012 -0700

    Linux 3.5-rc4

commit 02b7d83436ae4b1d86a5df03e72c1c69af7e239d
Author: Anatol Pomozov <anatol.pomozov@gmail.com>
Date:   Sat Jun 23 15:54:34 2012 -0700

    Fix typo in printed messages

    Coult -> Could

    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 104452f052dfcf62dbf0c4110c9234a3285f59bf
Merge: 08d49c4 081f323
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jun 24 11:02:09 2012 -0700

    Merge git://git.kernel.org/pub/scm/virt/kvm/kvm

    Pull KVM fixes from Avi Kivity:
     "Fixing a scheduling-while-atomic bug in the ppc code, and a bug which
      allowed pci bridges to be assigned to guests."

    * git://git.kernel.org/pub/scm/virt/kvm/kvm:
      KVM: PPC: Book3S HV: Drop locks around call to kvmppc_pin_guest_page
      KVM: Fix PCI header check on device assignment

commit 08d49c46cf61f707f3f44228b362947bb57343e7
Merge: a4a20fd 2e51fd3
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jun 24 11:00:07 2012 -0700

    Merge tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

    Pull InfiniBand/RDMA fixes from Roland Dreier:
     - Fixes to new ocrdma driver
     - Typo in test in CMA

    * tag 'rdma-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
      RDMA/cma: QP type check on received REQs should be AND not OR
      RDMA/ocrdma: Fix off by one in ocrdma_query_gid()
      RDMA/ocrdma: Fixed RQ error CQE polling
      RDMA/ocrdma: Correct queue SGE calculation
      RDMA/ocrdma: Correct reported max queue sizes
      RDMA/ocrdma: Fixed GID table for vlan and events

commit a4a20fd981b2e6419556ca474b3b9689d42e5233
Merge: 2ecedc4 0fa1f06
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jun 24 10:57:59 2012 -0700

    Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

    Pull ARM SoC fixes from Olof Johansson:
     "Nothing very controversial in here.  Most of the fixes are for OMAP
      this time around, with some orion/kirkwood and a tegra patch mixed in."

    * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
      ARM: Orion: Fix Virtual/Physical mixup with watchdog
      ARM: Kirkwood: clk_register_gate_fn: add fn assignment
      ARM: Orion5x - Restore parts of io.h, with rework
      ARM: OMAP4: hwmod data: Force HDMI in no-idle while enabled
      ARM: OMAP2+: mux: fix sparse warning
      ARM: OMAP2+: CM: increase the module disable timeout
      ARM: OMAP4: clock data: add clockdomains for clocks used as main clocks
      ARM: OMAP4: hwmod data: fix 32k sync timer idle modes
      ARM: OMAP4+: hwmod: fix issue causing IPs not going back to Smart-Standby
      ARM: OMAP: Fix Beagleboard DVI reset gpio
      arm/dts: OMAP2: Fix interrupt controller binding
      ARM: OMAP2: Fix tusb6010 GPIO interrupt for n8x0
      ARM: OMAP2+: Fix MUSB ifdefs for platform init code
      ARM: tegra: make tegra_cpu_reset_handler_enable() __init
      ARM: OMAP: PM: Lock clocks list while generating summary
      ARM: iconnect: Remove include of removed linux/spi/orion_spi.h

commit 2ecedc478e7a20597d95f48144cbe5ee568c0f1b
Merge: 002b758 59bbe27
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jun 24 10:57:15 2012 -0700

    Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

    Pull drm fixes from Dave Airlie:
     "Nothing major in here, one radeon SI fix for tiling, and one uninit
      var fix, two minor header file fixes."

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
      drm: drop comment about this header being autogenerated.
      drm/edid: don't return stack garbage from supports_rb
      vga_switcheroo: Add include guard
      drm/radeon: SI tiling fixes for display

commit 2e51fd3c13e330d9364a211ddcd3e38771eeb4b4
Merge: 4dd81e8 7b33dc2
Author: Roland Dreier <roland@purestorage.com>
Date:   Sun Jun 24 04:59:59 2012 -0700

    Merge branches 'cma' and 'ocrdma' into for-linus

commit 0fa1f0609a0c1fe8b2be3c0089a2cb48f7fda521
Author: Andrew Lunn <andrew@lunn.ch>
Date:   Fri Jun 22 08:54:02 2012 +0200

    ARM: Orion: Fix Virtual/Physical mixup with watchdog

    The orion watchdog expects to be passed the physical address of
    the hardware, and will ioremap() it to get the virtual address it will
    use as the base address for the hardware. However, when creating the
    platform resource record, a virtual address was being used.

    Add the necessary #define's so we can pass the physical address as
    expected.

    Tested on Kirkwood and Orion5x.

    Cc: stable <stable@vger.kernel.org>
    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: Olof Johansson <olof@lixom.net>
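
    A minimal sketch of the idea (symbol names here are assumed for
    illustration; the actual patch adds its own physical-base #define's):

        /* Pass the physical base; the watchdog driver ioremap()s it itself. */
        static struct resource orion_wdt_resource = {
                .start  = ORION_WDT_PHYS_BASE,          /* assumed name */
                .end    = ORION_WDT_PHYS_BASE + 0x0c,
                .flags  = IORESOURCE_MEM,
        };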

commit 5fb2ce119c113e5c987fa81ed89e73b2653e28e4
Author: Marc Kleine-Budde <mkl@blackshift.org>
Date:   Fri Jun 22 08:54:01 2012 +0200

    ARM: Kirkwood: clk_register_gate_fn: add fn assignment

    In commit:
        98d9986 ARM: Kirkwood: Replace clock gating
    the kirkwood clock gating has been reworked. A custom variant of
    clock gating, that calls a custom function before gating the clock
    off, has been introduced. However in clk_register_gate_fn() this
    custom function "fn" is never assigned.

    This patch adds the missing fn assignment.

    Cc: stable <stable@vger.kernel.org>
    Signed-off-by: Marc Kleine-Budde <mkl@blackshift.org>
    Tested-by: Andrew Lunn <andrew@lunn.ch>
    Acked-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit b5e12229a4850ae9b19ee5252508749da8844b3c
Author: Andrew Lunn <andrew@lunn.ch>
Date:   Fri Jun 22 20:57:57 2012 +0200

    ARM: Orion5x - Restore parts of io.h, with rework

    Commit 4d5fc58dbe34b78157c05b319669bb3e064ba8bd (ARM: remove bunch of
    now unused mach/io.h files) removed the orion5x io.h. Unfortunately,
    this is still needed for the definition of IO_SPACE_LIMIT which
    overrides the default 64K. All Orion based systems have 1Mbyte of IO
    space per PCI[e] bus, and try to request_resource() this size. Orion5x
    has two such PCI buses.

    It is likely that the original, removed version, was broken. This
    version might be less broken. However, it has not been tested on
    hardware with a PCI card, let alone hardware with a PCI card with IO
    capabilities.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Acked-by: Rob Herring <rob.herring@calxeda.com>
    Signed-off-by: Olof Johansson <olof@lixom.net>

commit a34a3b7264fdb40c8d4be8bebb38fd56dc48d162
Merge: e23d709 dc57aef
Author: Olof Johansson <olof@lixom.net>
Date:   Sat Jun 23 16:16:29 2012 -0700

    Merge tag 'omap-fixes-a-for-3.5rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending into fixes

    From Paul Walmsley (as per Tony Lindgren's request):
     "Some uncontroversial OMAP clock, hwmod, and compiler warning fixes for 3.5-rc"

    * tag 'omap-fixes-a-for-3.5rc' of git://git.kernel.org/pub/scm/linux/kernel/git/pjw/omap-pending:
      ARM: OMAP4: hwmod data: Force HDMI in no-idle while enabled
      ARM: OMAP2+: mux: fix sparse warning
      ARM: OMAP2+: CM: increase the module disable timeout
      ARM: OMAP4: clock data: add clockdomains for clocks used as main clocks
      ARM: OMAP4: hwmod data: fix 32k sync timer idle modes
      ARM: OMAP4+: hwmod: fix issue causing IPs not going back to Smart-Standby
      ARM: OMAP: PM: Lock clocks list while generating summary

commit e23d7096f9633d37aa35dffab9b0bd594ed64533
Merge: 6355f25 aef2b89
Author: Olof Johansson <olof@lixom.net>
Date:   Sat Jun 23 16:11:50 2012 -0700

    Merge tag 'omap-fixes-for-v3.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes

    From Tony Lindgren:
    "Here are a few fixes with the biggest one being fix for Beagle DVI
     reset. All of them are regression fixes, except for the missing omap2
     interrupt controller binding that somehow got missed earlier."

    * tag 'omap-fixes-for-v3.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
      ARM: OMAP: Fix Beagleboard DVI reset gpio
      arm/dts: OMAP2: Fix interrupt controller binding
      ARM: OMAP2: Fix tusb6010 GPIO interrupt for n8x0
      ARM: OMAP2+: Fix MUSB ifdefs for platform init code

commit 002b758b6dc4d840e662f25625f696d7b43d48f4
Merge: 369c4f5 642c0db
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Fri Jun 22 17:47:08 2012 -0700

    Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

    Pull Ceph fixes from Sage Weil:
     "There are a couple of fixes from Yan for bad pointer dereferences in
      the messenger code and when fiddling with page->private after page
      migration, a fix from Alex for a use-after-free in the osd client
      code, and a couple fixes for the message refcounting and shutdown
      ordering."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
      libceph: flush msgr queue during mon_client shutdown
      rbd: Clear ceph_msg->bio_iter for retransmitted message
      libceph: use con get/put ops from osd_client
      libceph: osd_client: don't drop reply reference too early
      ceph: check PG_Private flag before accessing page->private

commit 369c4f542fd5e197ace5f9fdd33c558fb2358480
Merge: a116371 f7bdf03
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Fri Jun 22 11:07:55 2012 -0700

    Merge tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs

    Pull XFS fixes from Ben Myers:
     - Fix stale data exposure with unwritten extents
     - Fix a warning in xfs_alloc_vextent with ODEBUG
     - Fix overallocation and alignment of pages for xfs_bufs
     - Fix a cursor leak
     - Fix a log hang
     - Fix a crash related to xfs_sync_worker
     - Rename xfs log structure from struct log to struct xlog so we can use
       crash dumps effectively

    * tag 'for-linus-Jun-21-2012' of git://oss.sgi.com/xfs/xfs:
      xfs: rename log structure to xlog
      xfs: shutdown xfs_sync_worker before the log
      xfs: Fix overallocation in xfs_buf_allocate_memory()
      xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near
      xfs: check for stale inode before acquiring iflock on push
      xfs: fix debug_object WARN at xfs_alloc_vextent()
      xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)

commit a11637194adc8bf2c2022ac89314dbdd1fcd1778
Merge: 636040b 6921a57
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Fri Jun 22 10:58:57 2012 -0700

    Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

    Pull perf updates from Ingo Molnar.

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      ftrace: Make all inline tags also include notrace
      perf: Use css_tryget() to avoid propping up css refcount
      perf tools: Fix synthesizing tracepoint names from the perf.data headers
      perf stat: Fix default output file
      perf tools: Fix endianity swapping for adds_features bitmask

commit 59bbe27ba0f6bae1d85f1521e43181d98ee9c5ab
Author: Dave Airlie <airlied@redhat.com>
Date:   Fri Jun 22 11:04:55 2012 +0100

    drm: drop comment about this header being autogenerated.

    This comment is well out of date.

    Signed-off-by: Dave Airlie <airlied@redhat.com>

commit dc57aef503859dbf724f6126c58b2e1672e215f3
Author: Ricardo Neri <ricardo.neri@ti.com>
Date:   Thu Jun 21 10:08:53 2012 +0200

    ARM: OMAP4: hwmod data: Force HDMI in no-idle while enabled

    As per the OMAP4 documentation, audio over HDMI must be transmitted in
    no-idle mode. This patch adds the HWMOD_SWSUP_SIDLE so that omap_hwmod uses
    no-idle/force-idle settings instead of smart-idle mode.

    This is required as the DSS interface clock is used as functional clock
    for the HDMI wrapper audio FIFO. If no-idle mode is not used, audio could
    be choppy, have bad quality or not be audible at all.

    Signed-off-by: Ricardo Neri <ricardo.neri@ti.com>
    [b-cousson@ti.com: Update the subject and align the .flags
    location with the script template]
    Signed-off-by: Benoit Cousson <b-cousson@ti.com>
    Signed-off-by: Paul Walmsley <paul@pwsan.com>

commit 65e25976b75c4996c5650afadc4180db09ec7475
Author: Paul Walmsley <paul@pwsan.com>
Date:   Sun Jun 17 12:00:58 2012 -0600

    ARM: OMAP2+: mux: fix sparse warning

    Commit bbd707acee279a61177a604822db92e8164d00db ("ARM: omap2: use
    machine specific hook for late init") resulted in the addition of this
    sparse warning:

    arch/arm/mach-omap2/mux.c:791:12: warning: symbol 'omap_mux_late_init' was not declared. Should it be static?

    Fix by including the header file containing the prototype.

    Signed-off-by: Paul Walmsley <paul@pwsan.com>
    Cc: Shawn Guo <shawn.guo@linaro.org>
    Cc: Tony Lindgren <tony@atomide.com>

commit b8f15b7e1dd3638d35cfff85571df1d7ea96e35e
Author: Paul Walmsley <paul@pwsan.com>
Date:   Sun Jun 17 11:57:53 2012 -0600

    ARM: OMAP2+: CM: increase the module disable timeout

    Increase the timeout for disabling an IP block to five milliseconds.
    This is to handle the usb_host_fs idle latency, which takes almost
    four milliseconds after a host controller reset.

    This is the second of two patches needed to resolve the following
    boot warning:

    omap_hwmod: usb_host_fs: _wait_target_disable failed

    Thanks to Sergei Shtylyov <sshtylyov@mvista.com> for finding
    an unrelated hunk in a previous version of this patch.

    Signed-off-by: Paul Walmsley <paul@pwsan.com>
    Cc: Sergei Shtylyov <sshtylyov@mvista.com>
    Cc: Tero Kristo <t-kristo@ti.com>

commit 9a47d32d5c4b91a4ce4c459f3b7b0290185e7578
Author: Paul Walmsley <paul@pwsan.com>
Date:   Sun Jun 17 11:57:52 2012 -0600

    ARM: OMAP4: clock data: add clockdomains for clocks used as main clocks

    Until the OMAP4 code is converted to disable the use of the clock
    framework-based clockdomain enable/disable sequence, any clock used as
    a hwmod main_clk must have a clockdomain associated with it.  This
    patch populates some clock structure clockdomain names to resolve the
    following warnings during kernel init:

    omap_hwmod: dpll_mpu_m2_ck: missing clockdomain for dpll_mpu_m2_ck.
    omap_hwmod: trace_clk_div_ck: missing clockdomain for trace_clk_div_ck.
    omap_hwmod: l3_div_ck: missing clockdomain for l3_div_ck.
    omap_hwmod: ddrphy_ck: missing clockdomain for ddrphy_ck.

    Signed-off-by: Paul Walmsley <paul@pwsan.com>
    Cc: Rajendra Nayak <rnayak@ti.com>
    Cc: Benoît Cousson <b-cousson@ti.com>

commit 252a4c5443ec4d0bd310ae5b0d1b7f9c5b1e7f46
Author: Paul Walmsley <paul@pwsan.com>
Date:   Sun Jun 17 11:57:51 2012 -0600

    ARM: OMAP4: hwmod data: fix 32k sync timer idle modes

    The 32k sync timer IP block target idle modes in the hwmod data are
    incorrect.  The IP block does not support any smart-idle modes.
    Update the data to reflect the correct modes.

    This problem was initially identified and a diff fragment posted to
    the lists by Benoît Cousson <b-cousson@ti.com>.  A patch description
    bug in the first version was also identified by Benoît.

    Signed-off-by: Paul Walmsley <paul@pwsan.com>
    Cc: Benoît Cousson <b-cousson@ti.com>
    Cc: Tero Kristo <t-kristo@ti.com>

commit 561038f0a8aa1de272a2ac5dad24cc8246d9f496
Author: Djamil Elaidi <d-elaidi@ti.com>
Date:   Sun Jun 17 11:57:51 2012 -0600

    ARM: OMAP4+: hwmod: fix issue causing IPs not going back to Smart-Standby

    If an IP is configured in Smart-Standby-Wakeup, then when the wakeup feature is
    disabled the IP will not go back to Smart-Standby, but will remain in Smart-Standby-Wakeup.

    Signed-off-by: Djamil Elaidi <d-elaidi@ti.com>
    Signed-off-by: Paul Walmsley <paul@pwsan.com>

commit 636040b4eddf6152b5d0b2d574663809f898953b
Merge: 8874e81 b102743
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Jun 21 16:05:43 2012 -0700

    Merge tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

    Pull NFS client bugfixes from Trond Myklebust:
     - Fix a write hang due to an uninitialised variable when
       !defined(CONFIG_NFS_V4)
     - Address upcall races in the legacy NFSv4 idmapper
     - Remove an O_DIRECT refcounting issue
     - Fix a pNFS refcounting bug when the file layout metadata server is
       also acting as a data server
     - Fix a pNFS module loading race.

    * tag 'nfs-for-3.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
      NFS: Force the legacy idmapper to be single threaded
      NFS: Initialise commit_info.rpc_out when !defined(CONFIG_NFS_V4)
      NFS: Fix a refcounting issue in O_DIRECT
      NFSv4.1: Fix a race in set_pnfs_layoutdriver
      NFSv4.1: Fix umount when filelayout DS is also the MDS

commit 8874e812feb4926f4a51a82c4fca75c7daa05fc5
Merge: 7b83778 cb77fcd
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Jun 21 13:41:07 2012 -0700

    Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

    Pull btrfs fixes from Chris Mason:
     "This is a small pull with btrfs fixes.  The biggest of the bunch is
      another fix for the new backref walking code.

      We're still hammering out one btrfs dio vs buffered reads problem, but
      that one will have to wait for the next rc."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
      Btrfs: delay iput with async extents
      Btrfs: add a missing spin_lock
      Btrfs: don't assume to be on the correct extent in add_all_parents
      Btrfs: introduce btrfs_next_old_item

commit 7b8377862bd816a5e8ceb5c713d88bf82555c8d4
Merge: 7940b2a 2355375
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Thu Jun 21 13:40:40 2012 -0700

    Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

    Pull hwmon fixes from Guenter Roeck:
     "Two minor fixes in emc2103 and applesmc drivers."

    * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
      hwmon: (emc2103) Fix use of an uninitilized variable in error case
      hwmon: (applesmc) Limit key length in warning messages

commit f7bdf03a99efc083608cd9c0c3e03abff311c79e
Author: Mark Tinguely <tinguely@sgi.com>
Date:   Thu Jun 14 09:22:15 2012 -0500

    xfs: rename log structure to xlog

    Rename the XFS log structure to xlog to help crash distinguish it from the
    other logs in Linux.

    Signed-off-by: Mark Tinguely <tinguely@sgi.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Ben Myers <bpm@sgi.com>

commit 8866fc6fa55e31b2bce931b7963ff16641b39dc7
Author: Ben Myers <bpm@sgi.com>
Date:   Fri May 25 15:45:36 2012 -0500

    xfs: shutdown xfs_sync_worker before the log

    Revert commit 1307bbd, which uses the s_umount semaphore to provide
    exclusion between xfs_sync_worker and unmount, in favor of shutting down
    the sync worker before freeing the log in xfs_log_unmount.  This is a
    cleaner way of resolving the race between xfs_sync_worker and unmount
    than using s_umount.

    Signed-off-by: Ben Myers <bpm@sgi.com>
    Reviewed-by: Mark Tinguely <tinguely@sgi.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>

commit 59c84ed0ddc11f1823b4a33ace4fbcc948261bb2
Author: Jan Kara <jack@suse.cz>
Date:   Wed Jun 6 00:32:26 2012 +0200

    xfs: Fix overallocation in xfs_buf_allocate_memory()

    Commit de1cbee, which removed b_file_offset in favor of b_bn, introduced a bug
    causing xfs_buf_allocate_memory() to overestimate the number of necessary
    pages. The problem is that xfs_buf_alloc() sets b_bn to -1 and thus effectively
    every buffer is straddling a page boundary which causes
    xfs_buf_allocate_memory() to allocate two pages and use vmalloc() for access
    which is unnecessary.

    Dave says xfs_buf_alloc() doesn't need to set b_bn to -1 anymore since the
    buffer is inserted into the cache only after being fully initialized now.
    So just make xfs_buf_alloc() fill in proper block number from the beginning.

    CC: David Chinner <dchinner@redhat.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Ben Myers <bpm@sgi.com>

commit 76d095388b040229ea1aad7dea45be0cfa20f589
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Jun 12 14:20:26 2012 +1000

    xfs: fix allocbt cursor leak in xfs_alloc_ag_vextent_near

    When we fail to find a matching extent near the requested extent
    specification during a left-right distance search in
    xfs_alloc_ag_vextent_near, we fail to free the original cursor that
    we used to look up the XFS_BTNUM_CNT tree and hence leak it.

    Reported-by: Chris J Arges <chris.j.arges@canonical.com>
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Ben Myers <bpm@sgi.com>

commit 9a3a5dab63461b84213052888bf38a962b22d035
Author: Brian Foster <bfoster@redhat.com>
Date:   Mon Jun 11 10:39:43 2012 -0400

    xfs: check for stale inode before acquiring iflock on push

    An inode in the AIL can be flush locked and marked stale if
    a cluster free transaction occurs at the right time. The
    inode item is then marked as flushing, which causes xfsaild
    to spin and leaves the filesystem stalled. This is
    reproduced by running xfstests 273 in a loop for an
    extended period of time.

    Check for stale inodes before the flush lock. This marks
    the inode as pinned, leads to a log flush and allows the
    filesystem to proceed.

    Signed-off-by: Brian Foster <bfoster@redhat.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Mark Tinguely <tinguely@sgi.com>
    Signed-off-by: Ben Myers <bpm@sgi.com>
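
    A rough sketch of the reordering (helper and flag names as used in XFS;
    the PINNED return code is inferred from the changelog, so treat this as
    an illustration rather than the literal patch):

        /* In the AIL push path: detect a stale inode *before* taking the
         * flush lock, and report it as pinned so a log flush is forced. */
        if (xfs_iflags_test(ip, XFS_ISTALE))
                return XFS_ITEM_PINNED;

        if (!xfs_iflock_nowait(ip))
                return XFS_ITEM_LOCKED;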

commit cb77fcd88569cd2b7b25ecd4086ea932a53be9b3
Author: Josef Bacik <josef@redhat.com>
Date:   Fri Jun 15 12:19:48 2012 -0600

    Btrfs: delay iput with async extents

    There is some concern that these iput()'s could be the final iputs and could
    induce lockups on people waiting on writeback.  This would happen in the
    rare case that we don't create ordered extents because of an error, but it
    is theoretically possible and we already have a mechanism to deal with this
    so just make them delayed iputs to negate any worry.

    Signed-off-by: Josef Bacik <josef@redhat.com>
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>

commit e18fca734278784bd6591de63ca148cc27344ca9
Author: Josef Bacik <josef@redhat.com>
Date:   Mon Jun 18 07:23:18 2012 -0600

    Btrfs: add a missing spin_lock

    When fixing up the locking in the delayed ref destruction work I accidentally
    broke the locking myself ;(.  Add back a spin_lock that should be there and
    we are now all set.  Thanks,

    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Josef Bacik <josef@redhat.com>
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>

commit 69bca40d41c613927b150c5392505f1894fe3010
Author: Alexander Block <ablock84@googlemail.com>
Date:   Tue Jun 19 07:42:26 2012 -0600

    Btrfs: don't assume to be on the correct extent in add_all_parents

    add_all_parents assumed that the path is already at a correct extent data
    item, which may not be true in the case of data extents that were partly
    rewritten and split.

    We need to check if we're on a matching extent for every item and only
    for the ones after the first. The loop is changed to do this now.

    This patch also fixes a bug introduced with commit 3b127fd8 "Btrfs:
    remove obsolete btrfs_next_leaf call from __resolve_indirect_ref".
    The removal of next_leaf sometimes resulted in slot==nritems when
    the above described case happens, and thus in invalid values
    (e.g. wanted_objectid) in add_all_parents (leading to missed backrefs
    or even crashes).

    Signed-off-by: Alexander Block <ablock84@googlemail.com>
    Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>

commit 1c8f52a5e9539600543347bcdefafa1854e07986
Author: Alexander Block <ablock84@googlemail.com>
Date:   Tue Jun 19 07:42:25 2012 -0600

    Btrfs: introduce btrfs_next_old_item

    We introduce btrfs_next_old_item that uses btrfs_next_old_leaf instead
    of btrfs_next_leaf.

    btrfs_next_item is also changed to simply call btrfs_next_old_item with
    time_seq being 0.

    Signed-off-by: Alexander Block <ablock84@googlemail.com>
    Signed-off-by: Chris Mason <chris.mason@fusionio.com>
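
    A sketch of the resulting helpers, close to what the description implies
    (btrfs_next_old_leaf() is the time_seq-aware leaf walker mentioned above):

        static inline int btrfs_next_old_item(struct btrfs_root *root,
                                              struct btrfs_path *p, u64 time_seq)
        {
                ++p->slots[0];
                if (p->slots[0] >= btrfs_header_nritems(p->nodes[0]))
                        return btrfs_next_old_leaf(root, p, time_seq);
                return 0;
        }

        static inline int btrfs_next_item(struct btrfs_root *root,
                                          struct btrfs_path *p)
        {
                return btrfs_next_old_item(root, p, 0);
        }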

commit b196a4980ff7bb54db478e2a408dc8b12be15304
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue Jun 19 11:33:06 2012 +0200

    drm/edid: don't return stack garbage from supports_rb

    We need to initialize this to false, because the is_rb callback only
    ever sets it to true.

    Noticed while reading through the code.

    Cc: stable@vger.kernel.org
    Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Reviewed-by: Adam Jackson <ajax@redhat.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
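
    A sketch of the fix, initialising the result before the detailed-block walk
    (function names follow drm_edid.c; treat as illustrative):

        static bool drm_monitor_supports_rb(struct edid *edid)
        {
                if (edid->revision >= 4) {
                        bool ret = false;  /* was uninitialised stack garbage */

                        drm_for_each_detailed_block((u8 *)edid, is_rb, &ret);
                        return ret;
                }

                return false;
        }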

commit d3decf3a0c1d28c80c76be170373f4c7a7217f09
Author: Ozan Çağlayan <ozancag@gmail.com>
Date:   Thu Jun 14 15:02:35 2012 +0300

    vga_switcheroo: Add include guard

    Guard vga_switcheroo.h against multiple inclusion.

    Signed-off-by: Ozan Çağlayan <ozancag@gmail.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>
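
    The change is the standard include-guard pattern (guard macro name assumed):

        #ifndef _LINUX_VGA_SWITCHEROO_H_
        #define _LINUX_VGA_SWITCHEROO_H_

        /* ... existing vga_switcheroo declarations ... */

        #endif /* _LINUX_VGA_SWITCHEROO_H_ */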

commit 7940b2adb474608318e51294d4e9fa0eee1bef61
Merge: 2ce5682 fdec53d
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 22:12:52 2012 -0700

    Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma

    Pull slave-dmaengine fixes from Vinod Koul:
     "A few fixes in pl330 and imx-sdma drivers."

    * 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
      DMA: PL330: Fix racy mutex unlock
      DMA: PL330: Add missing static storage class specifier
      dma: imx-sdma: buf_tail should be initialize in prepare function
      dmaengine: pl330: dont complete descriptor for cyclic dma

commit 2ce5682947872061148b0e5ed2212e03d0d8bc8b
Merge: c4c0e9e 8e3bbf4
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 22:11:04 2012 -0700

    Merge branch 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

    Pull two cgroup fixes from Tejun Heo:
     "This containes two patches fixing a refcnt race bug during css_put().
      Decrementing and checking the value weren't atomic and two tasks could
      think that they both pushed the counter to zero."

    * 'for-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
      cgroups: Account for CSS_DEACT_BIAS in __css_put
      cgroup: make sure that decisions in __css_put are atomic

commit c4c0e9e544a0eb640798cc66e68f394fa4a561bf
Author: David Rientjes <rientjes@google.com>
Date:   Wed Jun 20 18:00:12 2012 -0700

    mm, mempolicy: fix mbind() to do synchronous migration

    If the range passed to mbind() is not allocated on nodes set in the
    nodemask, it migrates the pages to respect the constraint.

    The final formal parameter of migrate_pages() is a mode of type enum migrate_mode,
    not a boolean.  do_mbind() is currently passing "true", which is the
    equivalent of MIGRATE_SYNC_LIGHT.  This should instead be MIGRATE_SYNC
    for synchronous page migration.

    Signed-off-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
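
    In do_mbind() the migration call would then pass the enum value rather than
    a bool; a rough sketch (argument list per the 3.5-era migrate_pages(), so
    treat it as an approximation):

        nr_failed = migrate_pages(&pagelist, new_vma_page,
                                  (unsigned long)vma,
                                  false, MIGRATE_SYNC);  /* was: true */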

commit 0e3534c0417fef6e6535db8867a04798f2024e3f
Author: Randy Dunlap <rdunlap@xenotime.net>
Date:   Wed Jun 20 18:30:30 2012 -0700

    media: pms.c needs linux/slab.h

    drivers/media/video/pms.c uses kzalloc() and kfree() so it should
    include <linux/slab.h> to fix build errors and a warning.

      drivers/media/video/pms.c:1047:2: error: implicit declaration of function 'kzalloc'
      drivers/media/video/pms.c:1047:6: warning: assignment makes pointer from integer without a cast
      drivers/media/video/pms.c:1116:2: error: implicit declaration of function 'kfree'

    Found in mmotm but applies to mainline.

    Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
    Cc: Hans Verkuil <hverkuil@xs4all.nl>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
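
    The fix is the missing include:

        #include <linux/slab.h>         /* for kzalloc() and kfree() */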

commit bc259adc9b76f625fff0423df3ffb80a03802927
Merge: fe80352 3026b0e
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 15:15:03 2012 -0700

    Merge tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

    Pull staging tree fixes from Greg Kroah-Hartman:
     "Here are a number of small fixes for the drivers/staging tree, as well
      as iio and pstore drivers (which came from the staging tree in the
      3.5-rc1 merge).  All of these are tiny, but resolve issues that people
      have been reporting.

      There's also a documentation update to reflect what the iio drivers
      really are doing, which is good to get straightened out.

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'staging-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
      staging: r8712u: Add new USB IDs
      staging: gdm72xx: Release netlink socket properly
      iio: drop wrong reference from Kconfig
      pstore/inode: Make pstore_fill_super() static
      pstore/ram: Should zap persistent zone on unlink
      pstore/ram_core: Factor persistent_ram_zap() out of post_init()
      pstore/ram_core: Do not reset restored zone's position and size
      pstore/ram: Should update old dmesg buffer before reading
      staging:iio:ad7298: Fix linker error due to missing IIO kfifo buffer
      Revert "staging: usbip: bugfix for stack corruption on 64-bit architectures"
      staging: usbip: bugfix for stack corruption on 64-bit architectures
      staging/comedi: fix build for USB not enabled
      staging: omapdrm: fix crash when freeing bad fb
      staging:iio:ad7606: Re-add missing scale attribute
      iio: Fix potential use after free
      staging:iio: remove num_interrupt_lines from documentation
      iio: documentation: Add out_altvoltage and friends

commit fe80352460971de12519bf46ed5ec4235350bcd7
Merge: f8fc0c9 96c9f05
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 15:14:28 2012 -0700

    Merge tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

    Pull driver core and printk fixes from Greg Kroah-Hartman:
     "Here are some fixes for 3.5-rc4 that resolve the kmsg problems that
      people have reported showing up after the printk and kmsg changes went
      into 3.5-rc1.  There are also a smattering of other tiny fixes for the
      extcon and hyper-v drivers that people have reported.

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'driver-core-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
      extcon: max8997: Add missing kfree for info->edev in max8997_muic_remove()
      extcon: Set platform drvdata in gpio_extcon_probe() and fix irq leak
      extcon: Fix wrong index in max8997_extcon_cable[]
      kmsg - kmsg_dump() fix CONFIG_PRINTK=n compilation
      printk: return -EINVAL if the message len is bigger than the buf size
      printk: use mutex lock to stop syslog_seq from going wild
      kmsg - kmsg_dump() use iterator to receive log buffer content
      vme: change maintainer e-mail address
      Extcon: Don't try to create duplicate link names
      driver core: fixup reversed deferred probe order
      printk: Fix alignment of buf causing crash on ARM EABI
      Tools: hv: verify origin of netlink connector message

commit f8fc0c9a5f7f4f5a3d2e7dd58147e30053cc5dd8
Merge: a1821f7 49fbd3f
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 15:13:56 2012 -0700

    Merge tag 'char-misc-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

    Pull misc tree updates from Greg Kroah-Hartman:
     "Here are some drivers/misc bugfixes (really just drivers/misc/mei/
      fixes) for a few problems that have been reported.

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'char-misc-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
      misc: mei: set WDIOF_ALARMONLY on mei watchdog
      misc: mei: Disable MSI when IRQ registration fails
      misc: mei: fix stalled read
      misc: mei: unregister misc device in pci_remove function
      misc: mei: set IRQF_ONESHOT for msi request_threaded_irq

commit a1821f774d4600727edf71005f259a9fdb73981e
Merge: a2a2609 78d80c5
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 15:13:13 2012 -0700

    Merge tag 'tty-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

    Pull serial driver fixes from Greg Kroah-Hartman:
     "Here are 3 patches resolving a boot regression (the mop500 fix), a
      build warning fix, and a kernel-doc fix.  All tiny, but should go into
      the final 3.5 release.

      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'tty-3.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
      serial/amba-pl011: move custom pin control to driver
      serial: fix serial_txx9.c build warning/typo
      serial: fix kernel-doc warnings in 8250.c

commit a2a2609c97c1e21996b9d87d10d2c9ff07277524
Merge: a4d7a12 48c3b58
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 14:41:57 2012 -0700

    Merge branch 'akpm' (Andrew's patch-bomb)

    * emailed from Andrew Morton <akpm@linux-foundation.org>: (21 patches)
      mm/memblock: fix overlapping allocation when doubling reserved array
      c/r: prctl: Move PR_GET_TID_ADDRESS to a proper place
      pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper
      pidns: guarantee that the pidns init will be the last pidns process reaped
      fault-inject: avoid call to random32() if fault injection is disabled
      Viresh has moved
      get_maintainer: Fix --help warning
      mm/memory.c: fix kernel-doc warnings
      mm: fix kernel-doc warnings
      mm: correctly synchronize rss-counters at exit/exec
      mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range
      h8300: use the declarations provided by <asm/sections.h>
      h8300: fix use of extinct _sbss and _ebss
      xtensa: use the declarations provided by <asm/sections.h>
      xtensa: use "test -e" instead of bashism "test -a"
      xtensa: replace xtensa-specific _f{data,text} by _s{data,text}
      memcg: fix use_hierarchy css_is_ancestor oops regression
      mm, oom: fix and cleanup oom score calculations
      nilfs2: ensure proper cache clearing for gc-inodes
      thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE
      ...

commit 48c3b583bbddad2220ca4c22319ca5d1f78b2090
Author: Greg Pearson <greg.pearson@hp.com>
Date:   Wed Jun 20 12:53:05 2012 -0700

    mm/memblock: fix overlapping allocation when doubling reserved array

    __alloc_memory_core_early() asks memblock for a range of memory, then tries
    to reserve it.  If the reserved region array lacks space for the new
    range, memblock_double_array() is called to allocate more space for the
    array.  If memblock is used to allocate memory for the new array it can
    end up using a range that overlaps with the range originally allocated in
    __alloc_memory_core_early(), leading to possible data corruption.

    With this patch memblock_double_array() now calls memblock_find_in_range()
    with a narrowed candidate range (in cases where the reserved.regions array
    is being doubled) so any memory allocated will not overlap with the
    original range that was being reserved.  The range is narrowed by passing
    in the starting address and size of the previously allocated range.  Then
    the range above the ending address is searched and if a candidate is not
    found, the range below the starting address is searched.

    Signed-off-by: Greg Pearson <greg.pearson@hp.com>
    Signed-off-by: Yinghai Lu <yinghai@kernel.org>
    Acked-by: Tejun Heo <tj@kernel.org>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
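
    A sketch of the narrowed search inside memblock_double_array(), with
    variable names assumed for illustration (search above the in-flight
    allocation first, then below it):

        /* new_area_start/new_area_size describe the range the caller is in
         * the middle of reserving; avoid handing back anything overlapping it. */
        addr = memblock_find_in_range(new_area_start + new_area_size,
                                      memblock.current_limit,
                                      new_alloc_size, PAGE_SIZE);
        if (!addr && new_area_size)
                addr = memblock_find_in_range(0,
                                min(new_area_start, memblock.current_limit),
                                new_alloc_size, PAGE_SIZE);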

commit 5702c5eeab959e86ee2d9b4fe7f2d87e65b25d46
Author: Cyrill Gorcunov <gorcunov@openvz.org>
Date:   Wed Jun 20 12:53:04 2012 -0700

    c/r: prctl: Move PR_GET_TID_ADDRESS to a proper place

    During merging of the PR_GET_TID_ADDRESS patch the code was misplaced (it
    happened to appear under PR_MCE_KILL); as a result no one can use this option.

    Fix it by moving code snippet to a proper place.

    Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Acked-by: Kees Cook <keescook@chromium.org>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: Pavel Emelyanov <xemul@parallels.com>
    Cc: Andrey Vagin <avagin@openvz.org>
    Cc: Serge Hallyn <serge.hallyn@canonical.com>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
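
    The moved hunk is essentially one case in sys_prctl()'s top-level switch
    (sketch; prctl_get_tid_address() is the helper added by the original patch):

        case PR_GET_TID_ADDRESS:
                error = prctl_get_tid_address(me, (int __user **)arg2);
                break;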

commit 50d75f8daead8a1f850c40a3b6c6575ab19b48cf
Author: Oleg Nesterov <oleg@redhat.com>
Date:   Wed Jun 20 12:53:04 2012 -0700

    pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper

    find_new_reaper() changes pid_ns->child_reaper, see add0d4df ("pid_ns:
    zap_pid_ns_processes: fix the ->child_reaper changing").

    The original reason has gone away after the previous patch: the ->children
    list must be empty after zap_pid_ns_processes().

    However now we can not switch to init_pid_ns.child_reaper.
    __unhash_process() relies on the "->child_reaper == parent" check, but
    this check does not work if the last exiting task is also the child
    reaper.

    As Eric suggested, we can change __unhash_process() to use the parent's
    pid_ns and remove this code.

    Also, with this change we can move detach_pid(PIDTYPE_PID) back, where it
    was before the previous fix.

    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Louis Rilling <louis.rilling@kerlabs.com>
    Cc: Mike Galbraith <efault@gmx.de>
    Acked-by: Pavel Emelyanov <xemul@parallels.com>
    Tested-by: Andrew Wagin <avagin@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 6347e90091041e34bea625370794c92f4ce71228
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Wed Jun 20 12:53:03 2012 -0700

    pidns: guarantee that the pidns init will be the last pidns process reaped

    Today we have a twofold bug.  Sometimes release_task on pid == 1 in a pid
    namespace can run before other processes in the pid namespace have had
    release_task called, with the result that pid_ns_release_proc can be
    called before the last proc_flush_task() is done using upid->ns->proc_mnt,
    resulting in the use of a stale pointer.  This same set of circumstances
    can lead to waitpid(...) returning for a process started with
    clone(CLONE_NEWPID) before every process in the pid namespace has
    actually exited.

    To fix this, modify zap_pid_ns_processes to wait until all other processes in
    the pid namespace have exited, even EXIT_DEAD zombies.

    The delay_group_leader and related tests ensure that the thread group
    leader will be the last thread of a process group to be reaped, or to
    become EXIT_DEAD and self reap.  With the change to zap_pid_ns_processes
    we get the guarantee that pid == 1 in a pid namespace will be the last
    task that release_task is called on.

    With pid == 1 being the last task to pass through release_task
    pid_ns_release_proc can no longer be called too early nor can wait return
    before all of the EXIT_DEAD tasks in a pid namespace have exited.

    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Oleg Nesterov <oleg@redhat.com>
    Cc: Louis Rilling <louis.rilling@kerlabs.com>
    Cc: Mike Galbraith <efault@gmx.de>
    Acked-by: Pavel Emelyanov <xemul@parallels.com>
    Tested-by: Andrew Wagin <avagin@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit f39cdaebb89dc3e6dd4f3e75b6d4e87ef12190af
Author: Anton Blanchard <anton@samba.org>
Date:   Wed Jun 20 12:53:03 2012 -0700

    fault-inject: avoid call to random32() if fault injection is disabled

    After enabling CONFIG_FAILSLAB I noticed random32 in profiles even if slub
    fault injection wasn't enabled at runtime.

    should_fail forces a comparison against random32() even if probability is
    0:

            if (attr->probability <= random32() % 100)
                    return false;

    Add a check up front for probability == 0 and avoid all of the more
    complicated checks.

    Signed-off-by: Anton Blanchard <anton@samba.org>
    Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
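
    The up-front check amounts to something like this at the top of
    should_fail() (sketch; the middle of the function is elided):

        bool should_fail(struct fault_attr *attr, ssize_t size)
        {
                /* Fast path: injection disabled, never touch random32(). */
                if (attr->probability == 0)
                        return false;

                /* ... existing task-filter, interval and size checks ... */

                if (attr->probability <= random32() % 100)
                        return false;

                return true;
        }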

commit 10d8935f46e5028847b179757ecbf9238b13d129
Author: Viresh Kumar <viresh.linux@gmail.com>
Date:   Wed Jun 20 12:53:02 2012 -0700

    Viresh has moved

    The viresh.kumar@st.com email id doesn't exist anymore as I have left the
    company.  Replace ST's id with viresh.linux@gmail.com.

    It also updates the .mailmap file to fix the address for 'git shortlog'.

    Signed-off-by: Viresh Kumar <viresh.linux@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 7dea26813507bfa3d261a81f70494336c3b28293
Author: Joe Perches <joe@perches.com>
Date:   Wed Jun 20 12:53:02 2012 -0700

    get_maintainer: Fix --help warning

    Using --help emits a concatenation error.  Fix it.

    Signed-off-by: Joe Perches <joe@perches.com>
    Reported-by: Paul Bolle <pebolle@tiscali.nl>
    Tested-by: Paul Bolle <pebolle@tiscali.nl>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit eb4546bbbdb160aff084d50511165f385756af18
Author: Randy Dunlap <rdunlap@xenotime.net>
Date:   Wed Jun 20 12:53:02 2012 -0700

    mm/memory.c: fix kernel-doc warnings

    Fix kernel-doc warnings in mm/memory.c:

      Warning(mm/memory.c:1377): No description found for parameter 'start'
      Warning(mm/memory.c:1377): Excess function parameter 'address' description in 'zap_page_range'

    Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit dad7557eb705688040aac134efa5418b66d5ed92
Author: Wanpeng Li <liwp@linux.vnet.ibm.com>
Date:   Wed Jun 20 12:53:01 2012 -0700

    mm: fix kernel-doc warnings

    Fix kernel-doc warnings such as

      Warning(../mm/page_cgroup.c:432): No description found for parameter 'id'
      Warning(../mm/page_cgroup.c:432): Excess function parameter 'mem' description in 'swap_cgroup_record'

    Signed-off-by: Wanpeng Li <liwp@linux.vnet.ibm.com>
    Cc: Randy Dunlap <randy.dunlap@oracle.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 4fe7efdbdfb1c7e7a7f31decfd831c0f31d37091
Author: Konstantin Khlebnikov <khlebnikov@openvz.org>
Date:   Wed Jun 20 12:53:01 2012 -0700

    mm: correctly synchronize rss-counters at exit/exec

    do_exit() and exec_mmap() call sync_mm_rss() before mm_release() does
    put_user(clear_child_tid) which can update task->rss_stat and thus make
    mm->rss_stat inconsistent.  This triggers the "BUG:" printk in check_mm().

    Let's fix this bug in the safest way, and optimize/cleanup this later.

    Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
    Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Cc: Oleg Nesterov <oleg@redhat.com>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit e0897d75f0b22e8c3a7287a48548c5686ef73447
Author: David Rientjes <rientjes@google.com>
Date:   Wed Jun 20 12:53:00 2012 -0700

    mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range

    Andrea asked for addr, end, vma->vm_start, and vma->vm_end to be emitted
    when !rwsem_is_locked(&tlb->mm->mmap_sem).  Otherwise, debugging the
    underlying issue is more difficult.

    Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: David Rientjes <rientjes@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
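
    The added diagnostic is essentially (sketch):

        if (!rwsem_is_locked(&tlb->mm->mmap_sem)) {
                pr_err("%s: mmap_sem is unlocked! addr=0x%lx end=0x%lx "
                       "vma->vm_start=0x%lx vma->vm_end=0x%lx\n",
                       __func__, addr, end, vma->vm_start, vma->vm_end);
                BUG();
        }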

commit 436814e61f5c526ed123853a9bf63fb2ff4ff94b
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Wed Jun 20 12:53:00 2012 -0700

    h8300: use the declarations provided by <asm/sections.h>

    Cleanups:
      - Include <asm/sections.h>,
      - Remove the (different) extern declarations,
      - Remove the no longer needed address-of ('&') operators,
      - Remove the superfluous casts, use proper printk formatting instead.

    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit ffb20313c01addc04212577af5f9b1399156a5bd
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Wed Jun 20 12:53:00 2012 -0700

    h8300: fix use of extinct _sbss and _ebss

    Nowadays it should use __bss_start and __bss_stop

    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit f022d0fa183c4def30ddd74f2baebd72c4254fcf
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Wed Jun 20 12:52:59 2012 -0700

    xtensa: use the declarations provided by <asm/sections.h>

    Cleanups:
      - Include <asm/sections.h>,
      - Remove the (different) extern declarations,
      - Remove the no longer needed address-of ('&') operators,
      - Use %p to format pointer differences.

    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Chris Zankel <chris@zankel.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 0eff08b5d1bf4b30f1a549a58c85806748256b18
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Wed Jun 20 12:52:59 2012 -0700

    xtensa: use "test -e" instead of bashism "test -a"

    On Ubuntu, /bin/sh is a symlink to dash, which does not support "test -a".
    This causes messages like

        test: 1: -a: unexpected operator
        test: 1: -a: unexpected operator

    and link failures like

        (.init.text+0x132): undefined reference to `platform_init'

    due to the appropriate platform code not being compiled.

    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Chris Zankel <chris@zankel.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 5e7b6ed8e9bf3c8e3bb579fd0aec64f6526f8c81
Author: Geert Uytterhoeven <geert@linux-m68k.org>
Date:   Wed Jun 20 12:52:58 2012 -0700

    xtensa: replace xtensa-specific _f{data,text} by _s{data,text}

    commit a2d063ac216c161 ("extable, core_kernel_data(): Make sure all archs
    define _sdata") missed xtensa.  Xtensa does have a start of data marker,
    but calls it _fdata, causing

        kernel/built-in.o:(.text+0x964): undefined reference to `_sdata'

    _stext was already defined, but it was duplicated by _ftext.

    Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Chris Zankel <chris@zankel.net>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 3a981f482cc29f7d0aeab509e51ea15519a6e961
Author: Hugh Dickins <hughd@google.com>
Date:   Wed Jun 20 12:52:58 2012 -0700

    memcg: fix use_hierarchy css_is_ancestor oops regression

    If use_hierarchy is set, reclaim testing soon oopses in css_is_ancestor()
    called from __mem_cgroup_same_or_subtree() called from page_referenced():
    when processes are exiting, it's easy for mm_match_cgroup() to pass along
    a NULL memcg coming from a NULL mm->owner.

    Check for that in __mem_cgroup_same_or_subtree().  Return true or false?
    False because we cannot know if it was in the hierarchy, but also false
    because it's better not to count a reference from an exiting process.
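
    As a rough, self-contained sketch of that reasoning (hypothetical code, not
    the actual __mem_cgroup_same_or_subtree() change), a NULL group is simply
    reported as not being in the hierarchy:

        #include <stdbool.h>
        #include <stdio.h>

        struct mem_cgroup { struct mem_cgroup *parent; };

        static bool same_or_subtree(struct mem_cgroup *root,
                                    struct mem_cgroup *memcg)
        {
                if (!memcg)
                        return false;  /* exiting task: don't count the reference */
                for (; memcg; memcg = memcg->parent)
                        if (memcg == root)
                                return true;
                return false;
        }

        int main(void)
        {
                struct mem_cgroup root = { 0 }, child = { &root };

                printf("%d %d %d\n",
                       same_or_subtree(&root, &child),  /* 1 */
                       same_or_subtree(&root, &root),   /* 1 */
                       same_or_subtree(&root, NULL));   /* 0: exiting-task case */
                return 0;
        }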

    Signed-off-by: Hugh Dickins <hughd@google.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
    Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 61eafb00d55dfbccdfce543c6b60e369ff4f8f18
Author: David Rientjes <rientjes@google.com>
Date:   Wed Jun 20 12:52:58 2012 -0700

    mm, oom: fix and cleanup oom score calculations

    The divide in p->signal->oom_score_adj * totalpages / 1000 within
    oom_badness() was causing an overflow of the signed long data type.

    This adds both the root bias and p->signal->oom_score_adj before doing the
    normalization which fixes the issue and also cleans up the calculation.
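
    To make the overflow concrete, a small stand-alone illustration (the figures
    are chosen for illustration and assume a 32-bit kernel, where long is 32
    bits wide): the maximum oom_score_adj of 1000 multiplied by roughly four
    million pages already exceeds what a 32-bit signed long can hold.

        #include <stdio.h>
        #include <stdint.h>

        int main(void)
        {
                long long totalpages = 4LL * 1024 * 1024; /* ~16 GiB of 4 KiB pages */
                long long adj = 1000;                     /* maximum oom_score_adj  */
                long long product = adj * totalpages;     /* the intermediate value */

                printf("adj * totalpages = %lld (32-bit LONG_MAX is %d)\n",
                       product, INT32_MAX);
                return 0;
        }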

    Tested-by: Dave Jones <davej@redhat.com>
    Signed-off-by: David Rientjes <rientjes@google.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit fbb24a3a915f105016f1c828476be11aceac8504
Author: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Date:   Wed Jun 20 12:52:57 2012 -0700

    nilfs2: ensure proper cache clearing for gc-inodes

    A gc-inode is a pseudo inode used to buffer the blocks to be moved by
    garbage collection.

    Block caches of gc-inodes must be cleared every time a garbage collection
    function (nilfs_clean_segments) completes.  Otherwise, stale blocks
    buffered in the caches may be wrongly reused in successive calls of the GC
    function.

    For user files, this is not a problem because their gc-inodes are
    distinguished by a checkpoint number as well as an inode number.  They
    never buffer different blocks if either an inode number, a checkpoint
    number, or a block offset differs.

    However, gc-inodes of the sufile, cpfile and DAT file can store different data
    for the same block offset.  Thus, the nilfs_clean_segments function can
    move an incorrect block for these meta-data files if an old block is cached.
    I found that this really does cause meta-data corruption in nilfs.

    This fixes the issue by ensuring that the caches of gc-inodes are cleared,
    and resolves reported GC problems including checkpoint file corruption,
    b-tree corruption, and the following warning during GC.

      nilfs_palloc_freev: entry number 307234 already freed.
      ...

    Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: <stable@vger.kernel.org>	[2.6.37+]
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit e4eed03fd06578571c01d4f1478c874bb432c815
Author: Andrea Arcangeli <aarcange@redhat.com>
Date:   Wed Jun 20 12:52:57 2012 -0700

    thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE

    In the x86 32bit PAE CONFIG_TRANSPARENT_HUGEPAGE=y case while holding the
    mmap_sem for reading, cmpxchg8b cannot be used to read pmd contents under
    Xen.

    So instead of dealing only with "consistent" pmdvals in
    pmd_none_or_trans_huge_or_clear_bad() (which would be conceptually
    simpler) we let pmd_none_or_trans_huge_or_clear_bad() deal with pmdvals
    where the low 32bit and high 32bit could be inconsistent (to avoid having
    to use cmpxchg8b).

    The only guarantee we get from pmd_read_atomic is that if the low part of
    the pmd was found null, the high part will be null too (so the pmd will be
    considered unstable).  And if the low part of the pmd is found "stable"
    later, then it means the whole pmd was read atomically (because after a
    pmd is stable, neither MADV_DONTNEED nor page faults can alter it anymore,
    and we read the high part after the low part).

    In the 32bit PAE x86 case, it is enough to read the low part of the pmdval
    atomically to declare the pmd "stable", and that is true both with and
    without THP; furthermore, in the THP case we also have a barrier() that
    prevents any inconsistent pmdvals from being cached by a later re-read of
    the *pmd.
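
    A rough user-space sketch of the read ordering this relies on (an
    illustration of the idea only, not the kernel's pmd_read_atomic(); the
    struct and values below are made up):

        #include <stdint.h>
        #include <stdio.h>

        /* The two 32-bit halves of a 64-bit pmd as they sit in memory. */
        struct pmd_halves {
                uint32_t low;
                uint32_t high;
        };

        static uint64_t pmd_read_sketch(const struct pmd_halves *pmd)
        {
                uint64_t ret = pmd->low;      /* read the low half first */

                /* A null low half is treated as a none/unstable pmd and the high
                 * half is never read.  Only a non-null (stable) low half makes it
                 * safe to pick up the high half; the real code also needs a read
                 * barrier between the two loads. */
                if (ret)
                        ret |= (uint64_t)pmd->high << 32;

                return ret;
        }

        int main(void)
        {
                struct pmd_halves none = { 0, 0 }, mapped = { 0x67, 0x1 };

                printf("%llx %llx\n",
                       (unsigned long long)pmd_read_sketch(&none),
                       (unsigned long long)pmd_read_sketch(&mapped));
                return 0;
        }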

    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Jonathan Nieder <jrnieder@gmail.com>
    Cc: Ulrich Obergfell <uobergfe@redhat.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Larry Woodman <lwoodman@redhat.com>
    Cc: Petr Matousek <pmatouse@redhat.com>
    Cc: Rik van Riel <riel@redhat.com>
    Cc: Jan Beulich <jbeulich@suse.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
    Tested-by: Andrew Jones <drjones@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit abca7c4965845924f65d40e0aa1092bdd895e314
Author: Pravin B Shelar <pshelar@nicira.com>
Date:   Wed Jun 20 12:52:56 2012 -0700

    mm: fix slab->page _count corruption when using slub

    On arches that do not support this_cpu_cmpxchg_double(), slab_lock is used
    to do an atomic cmpxchg() on a double word which contains page->_count.  The
    page count can be changed from get_page() or put_page() without taking
    slab_lock.  That corrupts the page counter.

    Fix it by moving page->_count out of the cmpxchg_double data, so that slub
    does not change it while updating slub meta-data in struct page.
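
    The shape of the fix can be pictured with two hypothetical struct layouts
    (illustrative only, with simplified stand-in fields rather than the real
    struct page members): the counter that is updated locklessly is moved out
    of the double word that the locked emulation rewrites as a whole.

        #include <stdio.h>

        struct page_before {
                struct {                /* rewritten as one unit under slab_lock */
                        void *freelist;
                        unsigned int counters;
                        int _count;     /* but get_page()/put_page() update this
                                           without the lock -> lost updates      */
                } cmpxchg_region;
        };

        struct page_after {
                struct {                /* rewritten as one unit under slab_lock */
                        void *freelist;
                        unsigned int counters;
                } cmpxchg_region;
                int _count;             /* outside the region: lockless updates
                                           are no longer overwritten             */
        };

        int main(void)
        {
                printf("region rewritten atomically: %zu bytes before, %zu after\n",
                       sizeof(((struct page_before *)0)->cmpxchg_region),
                       sizeof(((struct page_after *)0)->cmpxchg_region));
                return 0;
        }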

    [akpm@linux-foundation.org: use standard comment layout, tweak comment text]
    Reported-by: Amey Bhide <abhide@nicira.com>
    Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
    Acked-by: Christoph Lameter <cl@linux.com>
    Cc: Pekka Enberg <penberg@cs.helsinki.fi>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 3b876c8f2a361ceeed3fed894980c69066f903a0
Author: Jeff Liu <jeff.liu@oracle.com>
Date:   Thu Jun 7 15:44:32 2012 +0800

    xfs: fix debug_object WARN at xfs_alloc_vextent()

    Fengguang reports:

    [  780.529603] XFS (vdd): Ending clean mount
    [  781.454590] ODEBUG: object is on stack, but not annotated
    [  781.455433] ------------[ cut here ]------------
    [  781.455433] WARNING: at /c/kernel-tests/sound/lib/debugobjects.c:301 __debug_object_init+0x173/0x1f1()
    [  781.455433] Hardware name: Bochs
    [  781.455433] Modules linked in:
    [  781.455433] Pid: 26910, comm: kworker/0:2 Not tainted 3.4.0+ #51
    [  781.455433] Call Trace:
    [  781.455433]  [<ffffffff8106bc84>] warn_slowpath_common+0x83/0x9b
    [  781.455433]  [<ffffffff8106bcb6>] warn_slowpath_null+0x1a/0x1c
    [  781.455433]  [<ffffffff814919a5>] __debug_object_init+0x173/0x1f1
    [  781.455433]  [<ffffffff81491c65>] debug_object_init+0x14/0x16
    [  781.455433]  [<ffffffff8108842a>] __init_work+0x20/0x22
    [  781.455433]  [<ffffffff8134ea56>] xfs_alloc_vextent+0x6c/0xd5

    Use INIT_WORK_ONSTACK in xfs_alloc_vextent instead of INIT_WORK.

    Reported-by: Wu Fengguang <wfg@linux.intel.com>
    Signed-off-by: Jie Liu <jeff.liu@oracle.com>
    Signed-off-by: Ben Myers <bpm@sgi.com>

commit 66f9311381b4772003d595fb6c518f1647450db0
Author: Alain Renaud <arenaud@sgi.com>
Date:   Fri Jun 8 15:34:46 2012 -0400

    xfs: xfs_vm_writepage clear iomap_valid when !buffer_uptodate (REV2)

    On filesystems with a block size smaller than PAGE_SIZE we currently have
    a problem with unwritten extents.  If we have a multi-block page for
    which an unwritten extent has been allocated, and only some of the
    buffers have been written to, and they are not contiguous, we can expose
    stale data from disk in the blocks between the writes after extent
    conversion.

    Example of a page with unwritten and real data.
    buffer  content
    0       empty  b_state = 0
    1       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
    2       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
    3       empty  b_state = 0
    4       empty  b_state = 0
    5       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
    6       DATA   b_state = 0x1023 Uptodate,Dirty,Mapped,Unwritten
    7       empty  b_state = 0

    Buffers 1, 2, 5, and 6 have been written to, leaving 0, 3, 4, and 7
    empty.  Currently buffers 1, 2, 5, and 6 are added to a single ioend,
    and when IO has completed, extent conversion creates a real extent from
    block 1 through block 6, leaving 0 and 7 unwritten.  However buffers 3
    and 4 were not written to disk, so stale data is exposed from those
    blocks on a subsequent read.

    Fix this by setting iomap_valid = 0 when we find a buffer that is not
    Uptodate.  This ensures that buffers 5 and 6 are not added to the same
    ioend as buffers 1 and 2.  Later these blocks will be converted into two
    separate real extents, leaving the blocks in between unwritten.

    Signed-off-by: Alain Renaud <arenaud@sgi.com>
    Reviewed-by: Dave Chinner <dchinner@redhat.com>
    Signed-off-by: Ben Myers <bpm@sgi.com>

commit b7019b2f31fe7bec9f6f5dc1bf54cb0e0d73e047
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Thu Jun 14 15:58:25 2012 -0400

    drm/radeon: SI tiling fixes for display

    - Use the correct union for getting the tiling info
    - Properly init the PIPE_CONFIG field for SI

    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
    Signed-off-by: Dave Airlie <airlied@redhat.com>

commit b1027439dff844675f6c0df97a1b1d190791a699
Author: Bryan Schumaker <bjschuma@netapp.com>
Date:   Wed Jun 20 14:35:28 2012 -0400

    NFS: Force the legacy idmapper to be single threaded

    It was initially coded under the assumption that there would only be one
    request at a time, so use a lock to enforce this requirement.

    Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
    CC: stable@vger.kernel.org [3.4+]
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

commit a4d7a122385e27bdd91101635c704327d7c0d87f
Merge: 61fcbc8 10aa5a3
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 09:42:09 2012 -0700

    Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm

    Pull ARM fixes from Russell King:
     "This includes three MMCI changes - one to fix up the wrong version of
      the DT support patch which was merged, and two to make deferred
      probing work.  It also includes a fix to the OMAP SPI driver which is
      causing a boot time warning.

      The remainder are very minor ARM fixes."

    * 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
      SPI: fix over-eager devm_xxx() conversion
      ARM: 7427/1: mmc: mmci: Defer probe() in case of yet uninitialized GPIOs
      ARM: 7426/1: mmc: mmci: Remove wrong error handling of gpio 0
      ARM: 7425/1: extable: ensure fixup entries are 4-byte aligned
      ARM: 7421/1: bpf_jit: BPF_S_ANC_ALU_XOR_X support
      ARM: 7423/1: kprobes: run t32_simulate_ldr_literal() without insn slot
      ARM: 7422/1: mmc: mmci: Allocate platform memory during Device Tree boot

commit 61fcbc8dfe40d6fb5e59ab31dfcef67d6019f1a5
Merge: f40759e daf7317
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jun 20 09:41:21 2012 -0700

    Merge tag 'pinctrl-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

    Pull two pinctrl fixes from Linus Walleij:
     - Fixed a 2-line compile error for MXS
     - A pure documentation fix for Nomadik

    * tag 'pinctrl-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
      pinctrl/nomadik: document Alt-C glitch
      pinctrl: mxs: Use kfree to fix build error

commit aef2b89662b8a7506846d0dc0df672d196ddf8d0
Author: Russ Dill <Russ.Dill@ti.com>
Date:   Wed May 9 15:15:03 2012 -0700

    ARM: OMAP: Fix Beagleboard DVI reset gpio

    Commit e813a55eb9c9bc6c8039fb16332cf43402125b30 ("OMAP: board-files:
    remove custom PD GPIO handling for DVI output") moved TFP410 chip's
    powerdown-gpio handling from the board files to the tfp410 driver. One
    gpio_request_one(powerdown-gpio, ...) was mistakenly left unremoved in
    the Beagle board file. This causes the tfp410 driver to fail to request
    the gpio on Beagle, so the driver fails and the DVI output

    This patch removes several boot errors from board-omap3beagle.c:

     - gpio_request: gpio--22 (DVI reset) status -22
     - Unable to get DVI reset GPIO

    There is a combination of leftover code and revision confusion.
    Additionally, xM support is currently a hack.

    For original Beagleboard this removes the double initialization of GPIO
    170, properly configures it as an output, and wraps the initialization
    in an if block so that xM does not attempt to request it.

    For Beagleboard xM it removes the reference to GPIO 129, which was part
    of the rev A1 and A2 designs but never functioned. It then properly assigns
    beagle_dvi_device.reset_gpio in beagle_twl_gpio_setup and removes the
    hack of initializing it high. Additionally, it uses
    gpio_set_value_cansleep since this GPIO is connected through i2c.

    Unfortunately, there is no way to tell the difference between xM A2 and
    A3. However, since GPIO 129 does not function on rev A1 and A2, and the
    TWL GPIO used on A3 and beyond is not used on those revisions, there are
    no problems created by this fix.

    Tested on Beagleboard-xM Rev C1 and Beagleboard Rev B4.

    Signed-off-by: Russ Dill <Russ.Dill@ti.com>
    Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 95dca12d6bf2dd5e7720506b8f9786318899b8d6
Author: Jon Hunter <jon-hunter@ti.com>
Date:   Tue Jun 12 19:40:46 2012 -0500

    arm/dts: OMAP2: Fix interrupt controller binding

    When booting with device-tree on an OMAP2420H4, the kernel hangs when
    initialising the interrupts and the following kernel dump is seen ...

    [    0.000000] ------------[ cut here ]------------
    [    0.000000] WARNING: at arch/arm/mach-omap2/irq.c:271 omap_intc_of_init+0x50/0xb4()
    [    0.000000] unable to get intc registers
    [    0.000000] Modules linked in:
    [    0.000000] [<c001befc>] (unwind_backtrace+0x0/0xf4) from [<c0040c34>] (warn_slowpath_common+0x4c/0x64)
    [    0.000000] [<c0040c34>] (warn_slowpath_common+0x4c/0x64) from [<c0040ce0>] (warn_slowpath_fmt+0x30/0x40)
    [    0.000000] [<c0040ce0>] (warn_slowpath_fmt+0x30/0x40) from [<c066b8a4>] (omap_intc_of_init+0x50/0xb4)
    [    0.000000] [<c066b8a4>] (omap_intc_of_init+0x50/0xb4) from [<c0688b70>] (of_irq_init+0x144/0x288)
    [    0.000000] [<c0688b70>] (of_irq_init+0x144/0x288) from [<c0663294>] (init_IRQ+0x14/0x1c)
    [    0.000000] [<c0663294>] (init_IRQ+0x14/0x1c) from [<c06607fc>] (start_kernel+0x198/0x304)
    [    0.000000] [<c06607fc>] (start_kernel+0x198/0x304) from [<80008044>] (0x80008044)
    [    0.000000] ---[ end trace 1b75b31a2719ed1c ]---
    [    0.000000] of_irq_init: children remain, but no parents

    The OMAP2 interrupt controller binding is missing the number of interrupts and
    the interrupt controller register address. Adding these fixes the problem.

    Signed-off-by: Jon Hunter <jon-hunter@ti.com>
    Signed-off-by: Tony Lindgren <tony@atomide.com>

commit 3d09b33fecf204561e5c7126648ec05c756c631c
Author: Tony Lindgren <tony@atomide.com>
Date:   Wed Jun 20 07:18:…
RobertCNelson pushed a commit to RobertCNelson/linux that referenced this pull request Jun 29, 2012
commit 97f7f81 upstream.

If oprofile uses the nmi timer interrupt there is a crash while
unloading the module. The bug can be triggered with oprofile built as a
module and the kernel parameter nolapic set. This patch fixes this.

oprofile: using NMI timer interrupt.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
PGD 42dbca067 PUD 41da6a067 PMD 0
Oops: 0002 [#1] PREEMPT SMP
CPU 5
Modules linked in: oprofile(-) [last unloaded: oprofile]

Pid: 2518, comm: modprobe Not tainted 3.1.0-rc7-00019-gb2fb49d torvalds#19 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
RSP: 0018:ffff88041ef71e98  EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffffffa0017100 RCX: dead000000200200
RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
RBP: ffff88041ef71ea8 R08: 0000000000000001 R09: 0000000000000082
R10: 0000000000000000 R11: ffff88041ef71de8 R12: 0000000000000080
R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
FS:  00007fc902f20700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000041cdb6000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 2518, threadinfo ffff88041ef70000, task ffff88041d348040)
Stack:
 ffff88041ef71eb8 ffffffffa0017790 ffff88041ef71eb8 ffffffffa0013532
 ffff88041ef71ec8 ffffffffa00132d6 ffff88041ef71ed8 ffffffffa00159b2
 ffff88041ef71f78 ffffffff81073115 656c69666f72706f 0000000000610200
Call Trace:
 [<ffffffffa0013532>] op_nmi_exit+0x15/0x17 [oprofile]
 [<ffffffffa00132d6>] oprofile_arch_exit+0xe/0x10 [oprofile]
 [<ffffffffa00159b2>] oprofile_exit+0x1e/0x20 [oprofile]
 [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
 [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
 89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
 RSP <ffff88041ef71e98>
CR2: 0000000000000008
---[ end trace 43a541a52956b7b0 ]---

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
msiniavine pushed a commit to msiniavine/linux that referenced this pull request Jul 25, 2012
BugLink: http://bugs.launchpad.net/bugs/902317

commit 97f7f81 upstream.

If oprofile uses the nmi timer interrupt there is a crash while
unloading the module. The bug can be triggered with oprofile built as a
module and the kernel parameter nolapic set. This patch fixes this.

oprofile: using NMI timer interrupt.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
PGD 42dbca067 PUD 41da6a067 PMD 0
Oops: 0002 [#1] PREEMPT SMP
CPU 5
Modules linked in: oprofile(-) [last unloaded: oprofile]

Pid: 2518, comm: modprobe Not tainted 3.1.0-rc7-00019-gb2fb49d torvalds#19 Advanced Micro Device Anaheim/Anaheim
RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
RSP: 0018:ffff88041ef71e98  EFLAGS: 00010296
RAX: 0000000000000000 RBX: ffffffffa0017100 RCX: dead000000200200
RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
RBP: ffff88041ef71ea8 R08: 0000000000000001 R09: 0000000000000082
R10: 0000000000000000 R11: ffff88041ef71de8 R12: 0000000000000080
R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
FS:  00007fc902f20700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000008 CR3: 000000041cdb6000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process modprobe (pid: 2518, threadinfo ffff88041ef70000, task ffff88041d348040)
Stack:
 ffff88041ef71eb8 ffffffffa0017790 ffff88041ef71eb8 ffffffffa0013532
 ffff88041ef71ec8 ffffffffa00132d6 ffff88041ef71ed8 ffffffffa00159b2
 ffff88041ef71f78 ffffffff81073115 656c69666f72706f 0000000000610200
Call Trace:
 [<ffffffffa0013532>] op_nmi_exit+0x15/0x17 [oprofile]
 [<ffffffffa00132d6>] oprofile_arch_exit+0xe/0x10 [oprofile]
 [<ffffffffa00159b2>] oprofile_exit+0x1e/0x20 [oprofile]
 [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
 [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
 89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
 RSP <ffff88041ef71e98>
CR2: 0000000000000008
---[ end trace 43a541a52956b7b0 ]---

Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
vineetgarc referenced this pull request in foss-for-synopsys-dwc-arc-processors/linux Jul 27, 2012
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
torvalds pushed a commit that referenced this pull request Aug 1, 2012
…d reasons

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.
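
A loose user-space sketch of that idea (an analogy only, not the kernel code;
the names below are made up): mark the task before entering the allocation
path, and have the reclaim-side callback skip blocking work when it sees the
mark.

    #include <stdio.h>
    #include <stdbool.h>

    static _Thread_local bool in_fs_transaction;   /* PF_FSTRANS stand-in */

    static bool release_page(void)
    {
            if (in_fs_transaction)
                    return false;   /* don't issue a blocking commit from here */
            return true;            /* otherwise a blocking commit is fine     */
    }

    static void setup_socket(void)
    {
            in_fs_transaction = true;    /* set before allocating */
            /* the allocation may recurse into reclaim -> release_page() */
            printf("release_page() during allocation: %d\n", release_page());
            in_fs_transaction = false;   /* restore afterwards */
    }

    int main(void)
    {
            setup_socket();
            printf("release_page() outside allocation: %d\n", release_page());
            return 0;
    }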

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@vger.kernel.org
baerwolf pushed a commit to baerwolf/linux-stephan that referenced this pull request Aug 11, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
torvalds pushed a commit that referenced this pull request Aug 22, 2012
Add 6 new devices and one modified device, based on
information from laptop vendor Windows drivers.

Sony provides a driver with two new devices using
a Gobi 2k+ layout (1199:68a5 and 1199:68a9).  The
Sony driver also adds a non-standard QMI/net
interface to the already supported 1199:9011
Gobi device. We do not know whether this is an
alternate interface number or an additional
interface which might be present, but that doesn't
really matter.

Lenovo provides a driver supporting 4 new devices:
 - MC7770 (1199:901b) with standard Gobi 2k+ layout
 - MC7700 (0f3d:68a2) with layout similar to MC7710
 - MC7750 (114f:68a2) with layout similar to MC7710
 - EM7700 (1199:901c) with layout similar to MC7710

Note regarding the three devices similar to MC7710:

The Windows drivers only support interface #8 on these
devices.  The MC7710 can support QMI/net functions on
interface #19 and #20 as well, and this driver is
verified to work on interface #19 (a firmware bug is
suspected to prevent #20 from working).

We do not enable these additional interfaces until they
either show up in a Windows driver or are verified to
work in some other way.  Therefore limiting the new
devices to interface #8 for now.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Aug 22, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
RobertCNelson pushed a commit to RobertCNelson/linux that referenced this pull request Aug 30, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton at redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust at netapp.com>
Signed-off-by: Ben Hutchings <ben at decadent.org.uk>
Quarx2k pushed a commit to Quarx2k/linux-allwinner that referenced this pull request Sep 9, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     torvalds#6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     torvalds#7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     torvalds#8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     torvalds#9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Sep 11, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 2, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
heftig referenced this pull request in zen-kernel/zen-kernel Oct 2, 2012
commit 9b469a6 upstream.

Add 6 new devices and one modified device, based on
information from laptop vendor Windows drivers.

Sony provides a driver with two new devices using
a Gobi 2k+ layout (1199:68a5 and 1199:68a9).  The
Sony driver also adds a non-standard QMI/net
interface to the already supported 1199:9011
Gobi device. We do not know whether this is an
alternate interface number or an additional
interface which might be present, but that doesn't
really matter.

Lenovo provides a driver supporting 4 new devices:
 - MC7770 (1199:901b) with standard Gobi 2k+ layout
 - MC7700 (0f3d:68a2) with layout similar to MC7710
 - MC7750 (114f:68a2) with layout similar to MC7710
 - EM7700 (1199:901c) with layout similar to MC7710

Note regarding the three devices similar to MC7710:

The Windows drivers only support interface #8 on these
devices.  The MC7710 can support QMI/net functions on
interface #19 and #20 as well, and this driver is
verified to work on interface #19 (a firmware bug is
suspected to prevent #20 from working).

We do not enable these additional interfaces until they
either show up in a Windows driver or are verified to
work in some other way.  Therefore limiting the new
devices to interface #8 for now.

[bmork: backported to 3.4: use driver whitelisting]
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 4, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
damentz referenced this pull request in zen-kernel/zen-kernel Oct 5, 2012
commit 9b469a6 upstream.

Add 6 new devices and one modified device, based on
information from laptop vendor Windows drivers.

Sony provides a driver with two new devices using
a Gobi 2k+ layout (1199:68a5 and 1199:68a9).  The
Sony driver also adds a non-standard QMI/net
interface to the already supported 1199:9011
Gobi device. We do not know whether this is an
alternate interface number or an additional
interface which might be present, but that doesn't
really matter.

Lenovo provides a driver supporting 4 new devices:
 - MC7770 (1199:901b) with standard Gobi 2k+ layout
 - MC7700 (0f3d:68a2) with layout similar to MC7710
 - MC7750 (114f:68a2) with layout similar to MC7710
 - EM7700 (1199:901c) with layout similar to MC7710

Note regarding the three devices similar to MC7710:

The Windows drivers only support interface #8 on these
devices.  The MC7710 can support QMI/net functions on
interface #19 and #20 as well, and this driver is
verified to work on interface #19 (a firmware bug is
suspected to prevent #20 from working).

We do not enable these additional interfaces until they
either show up in a Windows driver or are verified to
work in some other way.  Therefore limiting the new
devices to interface #8 for now.

[bmork: backported to 3.4: use driver whitelisting]
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 17, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Oct 31, 2012
…d reasons

commit 5cf02d0 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    torvalds#10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    torvalds#11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    torvalds#12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    torvalds#13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    torvalds#14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    torvalds#15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    torvalds#16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    torvalds#17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    torvalds#18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    torvalds#19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    torvalds#20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    torvalds#21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    torvalds#22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    torvalds#23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    torvalds#24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    torvalds#25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
vineetgarc referenced this pull request in foss-for-synopsys-dwc-arc-processors/linux Oct 31, 2012
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Nov 14, 2012
kees pushed a commit to kees/linux that referenced this pull request Nov 16, 2012
BugLink: http://bugs.launchpad.net/bugs/1035435
koenkooi pushed a commit to koenkooi/linux that referenced this pull request Nov 21, 2012
vineetgarc referenced this pull request in foss-for-synopsys-dwc-arc-processors/linux Dec 31, 2012
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
torvalds pushed a commit that referenced this pull request Jul 10, 2013
When exercising error injection on an IBM pseries machine, I hit the
following warning:

[  251.450043] RTAS: event: 89, Type: Platform Error, Severity: 2
[  253.549822] cxgb3 0006:01:00.0: enabling device (0140 -> 0142)
[  253.713560] cxgb3 0006:01:00.0: adapter recovering, PEX ERR 0x100
[  254.895437] RTNL: assertion failed at net/core/dev.c (2031)
[  254.895467] CPU: 6 PID: 5449 Comm: eehd Tainted: G        W    3.10.0-rc7-00157-gea461ab #19
[  254.895474] Call Trace:
[  254.895483] [c000000fac56f7d0] [c000000000014dcc] .show_stack+0x7c/0x1f0 (unreliable)
[  254.895493] [c000000fac56f8a0] [c0000000007ba318] .dump_stack+0x28/0x3c
[  254.895500] [c000000fac56f910] [c0000000006c0384] .netif_set_real_num_tx_queues+0x224/0x230
[  254.895515] [c000000fac56f9b0] [d00000000ef35510] .cxgb_open+0x80/0x3f0 [cxgb3]
[  254.895525] [c000000fac56fa50] [d00000000ef35914] .t3_resume_ports+0x94/0x100 [cxgb3]
[  254.895533] [c000000fac56fae0] [c00000000005fc8c] .eeh_report_resume+0x8c/0xd0
[  254.895539] [c000000fac56fb60] [c00000000005e9fc] .eeh_pe_dev_traverse+0x9c/0x190
[  254.895545] [c000000fac56fc10] [c000000000060000] .eeh_handle_event+0x110/0x330
[  254.895551] [c000000fac56fca0] [c000000000060350] .eeh_event_handler+0x130/0x1a0
[  254.895558] [c000000fac56fd30] [c0000000000ad758] .kthread+0xe8/0xf0
[  254.895566] [c000000fac56fe30] [c00000000000a05c] .ret_from_kernel_thread+0x5c/0x80

It appears that t3_resume_ports() is called with the rtnl_lock held from
the fatal error task but not from the PCI error callbacks. This fixes it.
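
As a rough, hedged illustration of the bug class (plain userspace C, not the
cxgb3/EEH code): the callee asserts that a lock is held, so the resume path
has to take that lock before calling it, which is what wrapping
t3_resume_ports() in rtnl_lock()/rtnl_unlock() does.

```
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t rtnl = PTHREAD_MUTEX_INITIALIZER;
static int rtnl_held;		/* toy stand-in for ASSERT_RTNL() state */

/* Callee insists that the caller holds the lock, as
 * netif_set_real_num_tx_queues() does with the RTNL. */
static void set_real_num_tx_queues(void)
{
	if (!rtnl_held)
		fprintf(stderr, "RTNL: assertion failed\n");
}

/* Resume path: take the lock around the call, as the fix does. */
static void resume_ports(void)
{
	pthread_mutex_lock(&rtnl);
	rtnl_held = 1;
	set_real_num_tx_queues();
	rtnl_held = 0;
	pthread_mutex_unlock(&rtnl);
}

int main(void)
{
	resume_ports();
	puts("ports resumed with the lock held");
	return 0;
}
```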

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 18, 2024
The code in ocfs2_dio_end_io_write() estimates the number of necessary
transaction credits using ocfs2_calc_extend_credits().  This, however, does
not take into account that the IO could be arbitrarily large and can
contain an arbitrary number of extents.

Extent tree manipulations often extend the current transaction, but not
in all cases.  For example, if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction, eventually exhausting all the transaction credits if
the IO contains many single block extents.  Once that happens, a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue, make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().
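
As a hedged, simplified illustration of that rule (plain C, not the
ocfs2/jbd2 code): guarantee the per-insert credit budget before every insert
instead of relying on the initial estimate.

```
#include <stdio.h>

#define CREDITS_PER_INSERT 2

struct txn {
	int credits;	/* toy stand-in for a jbd2 handle's buffer credits */
};

/* Extend the running transaction if it can no longer cover one insert. */
static void ensure_credits(struct txn *t, int needed)
{
	if (t->credits < needed)
		t->credits += needed;	/* stands in for extending the handle */
}

int main(void)
{
	struct txn t = { .credits = 8 };	/* small initial estimate */
	int extents = 1000;			/* arbitrarily many single-block extents */

	for (int i = 0; i < extents; i++) {
		ensure_credits(&t, CREDITS_PER_INSERT);	/* the essence of the fix */
		t.credits -= CREDITS_PER_INSERT;	/* credits spent marking the extent written */
	}
	printf("credits never went negative: %d remaining\n", t.credits);
	return 0;
}
```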

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Jun 19, 2024
If hung_task_panic is enabled, ignore the value of hung_task_warnings and
display the information about the hung tasks.

In some cases, hung_task_panic might not be set up initially; after
several hung tasks occur, the hung_task_warnings count reaches zero.  If
hung_task_panic is enabled later, no helpful hung task info is shown in
dmesg, only messages like:

Kernel panic - not syncing: hung_task: blocked tasks
CPU: 3 PID: 58 Comm: khungtaskd Not tainted 6.10.0-rc3 #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
Call Trace:
 <TASK>
 panic+0x2f3/0x320
 watchdog+0x2dd/0x510
 ? __pfx_watchdog+0x10/0x10
 kthread+0xe0/0x110
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2f/0x40
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
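
A minimal sketch of the behavioural change (plain C, not the kernel's
hung_task code; the function and variable names here are made up for
illustration):

```
#include <stdbool.h>
#include <stdio.h>

/* If the panic knob is set, always report the hung task, even when the
 * warning budget is already exhausted; otherwise respect the budget. */
static bool should_report(bool panic_on_hung_task, int warnings_left)
{
	if (panic_on_hung_task)
		return true;
	return warnings_left > 0;
}

int main(void)
{
	/* Old problem case: budget exhausted, panic enabled later -> no info. */
	printf("panic enabled, budget 0 -> report: %d\n", should_report(true, 0));
	printf("panic disabled, budget 0 -> report: %d\n", should_report(false, 0));
	return 0;
}
```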

Link: https://lkml.kernel.org/r/20240613033159.3446265-1-leonylgao@gmail.com
Signed-off-by: Yongliang Gao <leonylgao@tencent.com>
Reviewed-by: Huang Cun <cunhuang@tencent.com>
Cc: Joel Granados <j.granados@samsung.com>
Cc: John Siddle <jsiddle@redhat.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 20, 2024
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 20, 2024
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 22, 2024
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jun 22, 2024
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Jun 24, 2024
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 25, 2024
ioworker0 pushed a commit to ioworker0/linux that referenced this pull request Jun 25, 2024
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Jun 25, 2024
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jul 2, 2024
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jul 2, 2024
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Jul 5, 2024
mj22226 pushed a commit to mj22226/linux that referenced this pull request Jul 5, 2024
tombriden pushed a commit to tombriden/linux that referenced this pull request Jul 5, 2024
staging-kernelci-org pushed a commit to kernelci/linux that referenced this pull request Jul 16, 2024
panantoni01 pushed a commit to panantoni01/linux that referenced this pull request Jul 17, 2024
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 6, 2024
The deadlock can happen if we try to use printk(), such as a call to
SCHED_WARN_ON(), while rq->__lock is held. printk() will try to
print the message to the console, and the console driver can call
queue_work_on(), which will try to obtain rq->__lock again.

This means that any WARN in a kernel function that holds
rq->__lock, such as schedule(), sched_ttwu_pending(), etc., can cause a
deadlock.

Following is the call trace of the deadlock case that I encountered:

  PID: 0      TASK: ff36bfda010c8000  CPU: 156  COMMAND: "swapper/156"
   #0 crash_nmi_callback+30
   #1 nmi_handle+85
   #2 default_do_nmi+66
   #3 exc_nmi+291
   #4 end_repeat_nmi+22
      [exception RIP: native_queued_spin_lock_slowpath+96]
   #5 native_queued_spin_lock_slowpath+96
   #6 _raw_spin_lock+30
   #7 ttwu_queue+111
   #8 try_to_wake_up+375
   #9 __queue_work+462
  #10 queue_work_on+32
  #11 soft_cursor+420
  #12 bit_cursor+898
  #13 hide_cursor+39
  #14 vt_console_print+995
  #15 call_console_drivers.constprop.0+204
  #16 console_unlock+374
  #17 vprintk_emit+280
  #18 printk+88
  #19 __warn_printk+71
  #20 enqueue_task_fair+1779
  #21 activate_task+102
  #22 ttwu_do_activate+155
  #23 sched_ttwu_pending+177
  #24 flush_smp_call_function_from_idle+42
  #25 do_idle+161
  #26 cpu_startup_entry+25
  #27 secondary_startup_64_no_verify+194

Fix this by using __printk_safe_enter()/__printk_safe_exit() in
rq_pin_lock()/rq_unpin_lock(). printk() will then defer printing the
buffered messages to the console.
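
A minimal sketch of what that looks like (the pin/unpin bodies are simplified
and omit the clock-update debug handling the real helpers carry;
__printk_safe_enter()/__printk_safe_exit() are the existing printk primitives):

```c
/*
 * Sketch only: keep printk() in its deferred ("safe") mode for as long as
 * the rq lock is pinned, so a WARN from scheduler code only fills the
 * per-CPU printk buffer instead of reaching the console/workqueue path
 * while rq->__lock is held.
 */
static inline void rq_pin_lock(struct rq *rq, struct rq_flags *rf)
{
	__printk_safe_enter();
	rf->cookie = lockdep_pin_lock(__rq_lockp(rq));
}

static inline void rq_unpin_lock(struct rq *rq, struct rq_flags *rf)
{
	lockdep_unpin_lock(__rq_lockp(rq), rf->cookie);
	__printk_safe_exit();
}
```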

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Bin Lai <laib2@chinatelecom.cn>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 7, 2024
iter_finish_branch_entry() doesn't put the branch_info from/to map
elements, creating memory leaks. This can be seen with:

```
$ perf record -e cycles -b perf test -w noploop
$ perf report -D
...
Direct leak of 984344 byte(s) in 123043 object(s) allocated from:
    #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x564d3400d10b in map__get util/map.h:186
    #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981
    #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151
    #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898
    #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238
    #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334
    #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655
    #8 0x564d3403ba52 in do_flush util/ordered-events.c:245
    #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324
    #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708
    #11 0x564d34032480 in perf_session__process_event util/session.c:1877
    #12 0x564d340336ad in reader__read_event util/session.c:2399
    #13 0x564d34033fdc in reader__process_events util/session.c:2448
    #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495
    #15 0x564d34033fdc in perf_session__process_events util/session.c:2661
    #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065
    #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805
    #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350
    #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403
    #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447
    #21 0x564d33cdd827 in main tools/perf/perf.c:561
...
```

Clearing up the map_symbols properly creates maps reference count
issues so resolve those. Resolving this issue doesn't improve peak
heap consumption for the test above.
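
A rough sketch of the reworked finish hook (the branch_info/map_symbol field
layout and the 'total' bookkeeping are assumptions based on the stack above,
not the exact patch):

```c
/*
 * Illustrative sketch: drop the map references taken when the branch
 * stack was resolved (ip__resolve_ams() -> map__get()) before freeing
 * the branch_info array, instead of leaking them.
 */
static int iter_finish_branch_entry(struct hist_entry_iter *iter,
				    struct addr_location *al __maybe_unused)
{
	struct branch_info *bi = iter->priv;
	int i;

	for (i = 0; i < iter->total; i++) {
		map__put(bi[i].from.ms.map);
		bi[i].from.ms.map = NULL;
		map__put(bi[i].to.ms.map);
		bi[i].to.ms.map = NULL;
	}

	zfree(&iter->priv);
	iter->he = NULL;
	return 0;	/* error handling elided in this sketch */
}
```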

Signed-off-by: Ian Rogers <irogers@google.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 7, 2024
iter_finish_branch_entry() doesn't put the branch_info from/to map
elements creating memory leaks. This can be seen with:

```
$ perf record -e cycles -b perf test -w noploop
$ perf report -D
...
Direct leak of 984344 byte(s) in 123043 object(s) allocated from:
    #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x564d3400d10b in map__get util/map.h:186
    #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981
    #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151
    #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898
    #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238
    #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334
    #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655
    #8 0x564d3403ba52 in do_flush util/ordered-events.c:245
    #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324
    #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708
    #11 0x564d34032480 in perf_session__process_event util/session.c:1877
    #12 0x564d340336ad in reader__read_event util/session.c:2399
    #13 0x564d34033fdc in reader__process_events util/session.c:2448
    #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495
    #15 0x564d34033fdc in perf_session__process_events util/session.c:2661
    #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065
    #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805
    #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350
    #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403
    #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447
    #21 0x564d33cdd827 in main tools/perf/perf.c:561
...
```

Clearing up the map_symbols properly creates maps reference count
issues so resolve those. Resolving this issue doesn't improve peak
heap consumption for the test above.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240807065136.1039977-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Aug 8, 2024
iter_finish_branch_entry() doesn't put the branch_info from/to map
elements creating memory leaks. This can be seen with:

```
$ perf record -e cycles -b perf test -w noploop
$ perf report -D
...
Direct leak of 984344 byte(s) in 123043 object(s) allocated from:
    #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x564d3400d10b in map__get util/map.h:186
    #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981
    #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151
    #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898
    #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238
    #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334
    #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655
    #8 0x564d3403ba52 in do_flush util/ordered-events.c:245
    #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324
    #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708
    #11 0x564d34032480 in perf_session__process_event util/session.c:1877
    #12 0x564d340336ad in reader__read_event util/session.c:2399
    #13 0x564d34033fdc in reader__process_events util/session.c:2448
    #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495
    #15 0x564d34033fdc in perf_session__process_events util/session.c:2661
    #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065
    #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805
    #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350
    #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403
    #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447
    #21 0x564d33cdd827 in main tools/perf/perf.c:561
...
```

Clearing up the map_symbols properly creates maps reference count
issues so resolve those. Resolving this issue doesn't improve peak
heap consumption for the test above.

Committer testing:

  $ sudo dnf install libasan
  $ make -k CORESIGHT=1 EXTRA_CFLAGS="-fsanitize=address" CC=clang O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240807065136.1039977-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
jhautbois pushed a commit to YoseliSAS/linux that referenced this pull request Aug 21, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Sep 4, 2024
commit be346c1 upstream.

The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
riccardv pushed a commit to riccardv/linux that referenced this pull request Oct 14, 2024
This is against the 4.16 kernel, likely applies to later kernels
as well.  Firmware is beta ath10k-ct firmware for 9984 NIC.  The
patch is not firmware or chipset specific.

When firmware crashes, packets can still be sent from the
mac80211 stack, and that can cause crashes in the ath10k
tx path.

After adding this patch, I saw cases where the tx path was
called in state ATH10K_STATE_RESTARTED.

I have not tested the tx_64 path, but assume it has similar
issues, so the same patch was added to it.
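
Conceptually the guard is just a state check near the top of the tx
functions; a hedged sketch (the helper name and error code are mine, not the
driver's):

```c
/* Sketch: only ATH10K_STATE_ON means the tx path may safely touch
 * firmware/DMA state; after a crash ar->state moves through
 * ATH10K_STATE_RESTARTING/RESTARTED until recovery completes, and
 * mac80211 can still hand us frames during that window. */
static bool ath10k_tx_state_ok(struct ath10k *ar)
{
	return ar->state == ATH10K_STATE_ON;
}
```

ath10k_htt_tx_32()/ath10k_htt_tx_64() would then bail out early (for example
with -ENETDOWN) when this returns false.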

Here is an example of a crash without the patch:

The line that crashes decodes as:

(gdb) l *(ath10k_htt_tx_32+0x18ba)
0x74a9a is in ath10k_htt_tx_32 (/home/greearb/git/linux-4.16.dev.y/drivers/net/wireless/ath/ath10k/htt_tx.c:1257).
1252				       sizeof(struct htt_msdu_ext_desc));
1253				frags = (struct htt_data_tx_desc_frag *)
1254					&ext_desc_t[msdu_id].frags;
1255				ext_desc = &ext_desc_t[msdu_id];
1256				frags[0].tword_addr.paddr_lo =
1257					__cpu_to_le32(skb_cb->paddr);
1258				frags[0].tword_addr.paddr_hi = 0;
1259				frags[0].tword_addr.len_16 = __cpu_to_le16(msdu->len);
1260
1261				frags_paddr =  htt->frag_desc.paddr +

ath10k_pci 0000:04:00.0: ATH10K_END
ath10k_pci 0000:04:00.0: firmware crashed! (guid 033040e0-a2e0-499c-b2a2-3e06832c649e)
ath10k_pci 0000:04:00.0: firmware register dump:
ath10k_pci 0000:04:00.0: [00]: 0x0000000A 0x00000000 0x00000000 0x00000000
... [snipped rest of crash dump for brevity] ...

ath10k_pci 0000:04:00.0: wmi unified ready event not received
ath10k_pci 0000:04:00.0: Could not init core: -110
==================================================================
BUG: KASAN: null-ptr-deref in ath10k_htt_tx_32+0x18ba/0x2b00 [ath10k_core]
Write of size 64 at addr 0000000000000000 by task kworker/u8:2/5115
==================================================================
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
IP: memset_erms+0x9/0x10
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP KASAN PTI
Modules linked in: ath10k_pci ath10k_core rpcsec_gss_krb5 nfsv4 nfs fscache nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 libcrc32c vrf 8021q garp mrp stp llc fuse macvlan pktgen lm78 hwmon_vid iTCO_wdt iTCO_vendor_support coretemp intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel ath snd_hda_codec_hdmi kvm snd_hda_intel irqbypass mac80211 snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device cfg80211 snd_pcm i2c_i801 snd_timer snd shpchp soundcore mei_wdt intel_pch_thermal acpi_pad nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel serio_raw e1000e i915 igb hwmon dca i2c_algo_bit drm_kms_helper drm i2c_core video ipv6 crc_ccitt [last unloaded: ath10k_core]
CPU: 0 PID: 5115 Comm: kworker/u8:2 Tainted: G    B   W        4.16.15+ #19
Hardware name: _ _/, BIOS 5.11 08/26/2016
Workqueue: phy3 ieee80211_beacon_connection_loss_work [mac80211]
RIP: 0010:memset_erms+0x9/0x10
RSP: 0018:ffff8801489af828 EFLAGS: 00010292
RAX: ffff880146920000 RBX: ffff88014a08d040 RCX: 0000000000000040
RDX: 0000000000000040 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff8801121f4fa8 R08: 0000000000000000 R09: 0000000000000000
R10: ffff8801489af630 R11: 1ffff10029135e88 R12: ffff8801121f44a0
R13: 0000000000000000 R14: 0000000000000000 R15: 00000000ffec0000
FS:  0000000000000000(0000) GS:ffff88014de00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000003a14005 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 ath10k_htt_tx_32+0x18ba/0x2b00 [ath10k_core]
 ? ath10k_htt_tx_free_msdu_id+0xc0/0xc0 [ath10k_core]
 ? invoke_tx_handlers_late+0x2340/0x2340 [mac80211]
 ath10k_mac_tx+0xd7c/0x1680 [ath10k_core]
 ath10k_mac_tx_push_txq+0x1a2/0x3e0 [ath10k_core]
 ath10k_mac_op_wake_tx_queue+0x2fa/0x4e0 [ath10k_core]
 ieee80211_queue_skb+0x7cf/0xfa0 [mac80211]
 ieee80211_tx+0x259/0x330 [mac80211]
 ? ieee80211_tx_prepare_skb+0x3f0/0x3f0 [mac80211]
 ? ieee80211_xmit+0x26b/0x520 [mac80211]
 __ieee80211_tx_skb_tid_band+0x1e6/0x290 [mac80211]
 ieee80211_send_nullfunc+0x223/0x3f0 [mac80211]
 ieee80211_mgd_probe_ap_send+0x1af/0x4b0 [mac80211]
 ieee80211_mgd_probe_ap.part.22+0x28d/0x380 [mac80211]
 process_one_work+0x5f7/0x14d0
 ? pwq_dec_nr_in_flight+0x2b0/0x2b0
 ? _raw_spin_unlock_irq+0x24/0x40
 worker_thread+0xdc/0x12d0
 ? rescuer_thread+0x12b0/0x12b0
 kthread+0x2cf/0x3c0
 ? kthread_delayed_work_timer_fn+0x1e0/0x1e0
 ret_from_fork+0x24/0x30
ce 48 b8 01 01 01 01 01
RIP: memset_erms+0x9/0x10 RSP: ffff8801489af828
CR2: 0000000000000000
---[ end trace 5e24737a5c492997 ]---
Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Ben Greear <greearb@candelatech.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 4, 2024
A forced umount (umount -f) will attempt to kill all rpc_tasks, even though
the umount operation may ultimately fail if some files remain open.
Consequently, if a process is in the middle of opening a file, it can
potentially send two rpc_tasks to the NFS server.

                   NFS CLIENT
thread1                             thread2
open("file")
...
nfs4_do_open
 _nfs4_do_open
  _nfs4_open_and_get_state
   _nfs4_proc_open
    nfs4_run_open_task
     /* rpc_task1 */
     rpc_run_task
     rpc_wait_for_completion_task

                                    umount -f
                                    nfs_umount_begin
                                     rpc_killall_tasks
                                      rpc_signal_task
     rpc_task1 been wakeup
     and return -512
 _nfs4_do_open // while loop
    ...
    nfs4_run_open_task
     /* rpc_task2 */
     rpc_run_task
     rpc_wait_for_completion_task

While processing an open request, nfsd will first attempt to find or
allocate an nfs4_openowner. If it finds an nfs4_openowner that is not
marked as NFS4_OO_CONFIRMED, this nfs4_openowner will be released. Since
two rpc_tasks can attempt to open the same file simultaneously from the
client to the server, and because two instances of nfsd can run
concurrently, this situation can lead to lots of memory leaks.
Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will be
triggered.

                    NFS SERVER
nfsd1                  nfsd2       echo 0 > /proc/fs/nfsd/threads

nfsd4_open
 nfsd4_process_open1
  find_or_alloc_open_stateowner
   // alloc oo1, stateid1
                       nfsd4_open
                        nfsd4_process_open1
                        find_or_alloc_open_stateowner
                        // find oo1, without NFS4_OO_CONFIRMED
                         release_openowner
                          unhash_openowner_locked
                          list_del_init(&oo->oo_perclient)
                          // cannot find this oo
                          // from client, LEAK!!!
                         alloc_stateowner // alloc oo2

 nfsd4_process_open2
  init_open_stateid
  // associate oo1
  // with stateid1, stateid1 LEAK!!!
  nfs4_get_vfs_file
  // alloc nfsd_file1 and nfsd_file_mark1
  // all LEAK!!!

                         nfsd4_process_open2
                         ...

                                    write_threads
                                     ...
                                     nfsd_destroy_serv
                                      nfsd_shutdown_net
                                       nfs4_state_shutdown_net
                                        nfs4_state_destroy_net
                                         destroy_client
                                          __destroy_client
                                          // won't find oo1!!!
                                     nfsd_shutdown_generic
                                      nfsd_file_cache_shutdown
                                       kmem_cache_destroy
                                       for nfsd_file_slab
                                       and nfsd_file_mark_slab
                                       // bark since nfsd_file1
                                       // and nfsd_file_mark1
                                       // still alive

=======================================================================
BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
__kmem_cache_shutdown()
-----------------------------------------------------------------------

Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.1-2.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1ae/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Disabling lock debugging due to kernel taint
Object 0xff11000110e2ac38 @offset=3128
Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3
pid=800
 nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
 nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
 nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
 nfsd4_process_open2+0x713/0x1070 [nfsd]
 nfsd4_open+0x74b/0x8b0 [nfsd]
 nfsd4_proc_compound+0x70b/0xc20 [nfsd]
 nfsd_dispatch+0x1b4/0x3a0 [nfsd]
 svc_process_common+0x5b8/0xc50 [sunrpc]
 svc_process+0x2ab/0x3b0 [sunrpc]
 svc_handle_xprt+0x681/0xa20 [sunrpc]
 nfsd+0x183/0x220 [nfsd]
 kthread+0x199/0x1e0
 ret_from_fork+0x31/0x60
 ret_from_fork_asm+0x1a/0x30

Add nfs4_openowner_unhashed to help detect an unhashed nfs4_openowner, and
bail out of the nfsd4_open process to fix this problem.
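
A sketch of what the helper and its use can look like (the exact call sites
in the open path are an assumption on my part):

```c
/*
 * Sketch: an openowner that lost the race in
 * find_or_alloc_open_stateowner() and was unhashed from its client
 * (oo_perclient emptied) must not be used to install a new stateid,
 * otherwise it and everything hung off it becomes unreachable and leaks.
 */
static bool nfs4_openowner_unhashed(struct nfs4_openowner *oo)
{
	return list_empty(&oo->oo_perclient);
}
```

nfsd4_process_open2() would then check this (under cl_lock) before
installing the stateid and fail the open with a retryable error such as
nfserr_jukebox, so the client simply resends the OPEN.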

Cc: stable@vger.kernel.org # 2.6
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 5, 2024
A forced umount (umount -f) will attempt to kill all rpc_tasks, even though
the umount operation may ultimately fail if some files remain open.
Consequently, if a process is in the middle of opening a file, it can
potentially send two rpc_tasks to the NFS server.

                   NFS CLIENT
thread1                             thread2
open("file")
...
nfs4_do_open
 _nfs4_do_open
  _nfs4_open_and_get_state
   _nfs4_proc_open
    nfs4_run_open_task
     /* rpc_task1 */
     rpc_run_task
     rpc_wait_for_completion_task

                                    umount -f
                                    nfs_umount_begin
                                     rpc_killall_tasks
                                      rpc_signal_task
     rpc_task1 been wakeup
     and return -512
 _nfs4_do_open // while loop
    ...
    nfs4_run_open_task
     /* rpc_task2 */
     rpc_run_task
     rpc_wait_for_completion_task

While processing an open request, nfsd will first attempt to find or
allocate an nfs4_openowner. If it finds an nfs4_openowner that is not
marked as NFS4_OO_CONFIRMED, this nfs4_openowner will be released. Since
two rpc_tasks can attempt to open the same file simultaneously from the
client to the server, and because two instances of nfsd can run
concurrently, this situation can lead to lots of memory leaks.
Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will be
triggered.

                    NFS SERVER
nfsd1                  nfsd2       echo 0 > /proc/fs/nfsd/threads

nfsd4_open
 nfsd4_process_open1
  find_or_alloc_open_stateowner
   // alloc oo1, stateid1
                       nfsd4_open
                        nfsd4_process_open1
                        find_or_alloc_open_stateowner
                        // find oo1, without NFS4_OO_CONFIRMED
                         release_openowner
                          unhash_openowner_locked
                          list_del_init(&oo->oo_perclient)
                          // cannot find this oo
                          // from client, LEAK!!!
                         alloc_stateowner // alloc oo2

 nfsd4_process_open2
  init_open_stateid
  // associate oo1
  // with stateid1, stateid1 LEAK!!!
  nfs4_get_vfs_file
  // alloc nfsd_file1 and nfsd_file_mark1
  // all LEAK!!!

                         nfsd4_process_open2
                         ...

                                    write_threads
                                     ...
                                     nfsd_destroy_serv
                                      nfsd_shutdown_net
                                       nfs4_state_shutdown_net
                                        nfs4_state_destroy_net
                                         destroy_client
                                          __destroy_client
                                          // won't find oo1!!!
                                     nfsd_shutdown_generic
                                      nfsd_file_cache_shutdown
                                       kmem_cache_destroy
                                       for nfsd_file_slab
                                       and nfsd_file_mark_slab
                                       // bark since nfsd_file1
                                       // and nfsd_file_mark1
                                       // still alive

=======================================================================
BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
__kmem_cache_shutdown()
-----------------------------------------------------------------------

Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.1-2.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1ae/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Disabling lock debugging due to kernel taint
Object 0xff11000110e2ac38 @offset=3128
Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3
pid=800
 nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
 nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
 nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
 nfsd4_process_open2+0x713/0x1070 [nfsd]
 nfsd4_open+0x74b/0x8b0 [nfsd]
 nfsd4_proc_compound+0x70b/0xc20 [nfsd]
 nfsd_dispatch+0x1b4/0x3a0 [nfsd]
 svc_process_common+0x5b8/0xc50 [sunrpc]
 svc_process+0x2ab/0x3b0 [sunrpc]
 svc_handle_xprt+0x681/0xa20 [sunrpc]
 nfsd+0x183/0x220 [nfsd]
 kthread+0x199/0x1e0
 ret_from_fork+0x31/0x60
 ret_from_fork_asm+0x1a/0x30

Add nfs4_openowner_unhashed to help detect an unhashed nfs4_openowner, and
bail out of the nfsd4_open process to fix this problem.

Cc: stable@vger.kernel.org # 2.6
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
ycsin pushed a commit to ycsin/linux that referenced this pull request Nov 6, 2024
…itmem() on RV32 platform.

1. Symptom:
	[	44.486537] Unable to handle kernel paging request at virtual address c0800000
	[	44.509980] Oops [#1]
	[	44.516975] Modules linked in:
	[	44.526260] CPU: 0 PID: 1 Comm: swapper Not tainted 6.1.27-05153-g45f6a9286550-dirty #19
	[	44.550422] Hardware name: andestech,a45 (DT)
	[	44.563473] epc : __memset+0x58/0xf4
	[	44.574353]	ra : free_reserved_area+0xb0/0x1a4
	[	44.588144] epc : c05d4ca0 ra : c011f32c sp : c2c61f00
	[	44.603536]	gp : c28a57c8 tp : c2c98000 t0 : c0800000
	[	44.618916]	t1 : 07901b48 t2 : 0000000f s0 : c2c61f50
	[	44.634308]	s1 : 00000001 a0 : c0800000 a1 : cccccccc
	[	44.649696]	a2 : 00001000 a3 : c0801000 a4 : 00000000
	[	44.665085]	a5 : 02000000 a6 : c0800fff a7 : 00000c08
	[	44.680467]	s2 : 000000cc s3 : ffffffff s4 : 00000000
	[	44.695846]	s5 : c28a66cc s6 : c1eba000 s7 : c212582
	[	44.711225]	s8 : c0800000 s9 : c212583c s10: c28a6648
	[	44.726623]	s11: fe03c7c0 t3 : acf917bf t4 : e0000000
	[	44.742009]	t5 : c2ca0011 t6 : c2ca0016
	[	44.753789] status: 00000120 badaddr: c0800000 cause: 0000000f
	[	44.771234] [<c05d4ca0>] __memset+0x58/0xf4
	[	44.783895] [<c0003e54>] free_initmem+0x80/0x88
	[	44.797599] [<c05dcd5c>] kernel_init+0x3c/0x124
	[	44.811391] [<c0003428>] ret_from_exception+0x0/0x16

2. To reproduce the problem:
	a. Use the RV32 toolchain to build the system.
	b. Build in the SPI module and mtdpart module in the kernel
		Example: Enable the following configuration
		- CONFIG_SPI
		- CONFIG_MTD and CONFIG_MTD_SPI_NOR
	c. Enable the "Make kernel text and rodata read-only" option by using the
	   following kernel config.
		- CONFIG_STRICT_KERNEL_RWX

3. Root cause:
	This problem occurs when the virtual address of the kernel paging request
	is mapped to a megapage on the RV32 platform.
	During system startup, free_initmem() calls set_kernel_memory() to
	change the memory attributes of the init section from RO to RW. It
	then calls free_initmem_default() to set the memory to
	POISON_FREE_INITMEM. If the system runs modprobe at boot time, it
	will trigger a fork/exec to create a new mm for the new process. If
	the modprobe was called before free_initmem(), it will cause a kernel
	oops because the memory attributes of the current mm were not changed
	by set_kernel_memory(). This is because set_kernel_memory() changes
	the memory attributes of init_mm, but the pgd (satp) currently in use
	belongs to another process's mm, whose memory attributes do not change.
	Thus, it causes a kernel oops because the memory region still has a
	non-writable attribute.

4. The solution.
	A similar problem occurred on ARM platforms and was fixed in
	08925c2 (ARM: 8464/1: Update all mm structures with section
	adjustments). This patch uses a similar approach to fix the
	problem on RV32 by synchronizing the memory attributes
	of the init section for all mm structures, roughly as sketched below.
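
A hedged sketch of that idea, assuming the init section is covered by kernel
PGD entries that can simply be mirrored into every existing mm after the
attribute change (names and locking are illustrative, and a TLB flush would
still be required afterwards):

```c
/*
 * Sketch only: after set_kernel_memory() has updated init_mm for the
 * init section, copy the covering page-directory entries into every
 * other mm, so a process forked before free_initmem() (e.g. an early
 * modprobe) sees the new permissions too.
 */
static void sync_init_section_to_all_mm(unsigned long start, unsigned long end)
{
	unsigned long first = pgd_index(start);
	unsigned long last = pgd_index(end - 1);
	struct task_struct *p;

	read_lock(&tasklist_lock);
	for_each_process(p) {
		struct mm_struct *mm = p->mm;

		if (!mm || mm == &init_mm)
			continue;
		memcpy(mm->pgd + first, init_mm.pgd + first,
		       (last - first + 1) * sizeof(pgd_t));
	}
	read_unlock(&tasklist_lock);
}
```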

Reviewed-on: https://gitea.andestech.com/RD-SW/linux/pulls/60

Signed-off-by: CL Wang <cl634@andestech.com>
intel-lab-lkp pushed a commit to intel-lab-lkp/linux that referenced this pull request Nov 11, 2024
A forced umount (umount -f) will attempt to kill all rpc_tasks, even though
the umount operation may ultimately fail if some files remain open.
Consequently, if a process is in the middle of opening a file, it can
potentially send two rpc_tasks to the NFS server.

                   NFS CLIENT
thread1                             thread2
open("file")
...
nfs4_do_open
 _nfs4_do_open
  _nfs4_open_and_get_state
   _nfs4_proc_open
    nfs4_run_open_task
     /* rpc_task1 */
     rpc_run_task
     rpc_wait_for_completion_task

                                    umount -f
                                    nfs_umount_begin
                                     rpc_killall_tasks
                                      rpc_signal_task
     rpc_task1 been wakeup
     and return -512
 _nfs4_do_open // while loop
    ...
    nfs4_run_open_task
     /* rpc_task2 */
     rpc_run_task
     rpc_wait_for_completion_task

While processing an open request, nfsd will first attempt to find or
allocate an nfs4_openowner. If it finds an nfs4_openowner that is not
marked as NFS4_OO_CONFIRMED, this nfs4_openowner will be released. Since
two rpc_tasks can attempt to open the same file simultaneously from the
client to the server, and because two instances of nfsd can run
concurrently, this situation can lead to lots of memory leaks.
Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will be
triggered.

                    NFS SERVER
nfsd1                  nfsd2       echo 0 > /proc/fs/nfsd/threads

nfsd4_open
 nfsd4_process_open1
  find_or_alloc_open_stateowner
   // alloc oo1, stateid1
                       nfsd4_open
                        nfsd4_process_open1
                        find_or_alloc_open_stateowner
                        // find oo1, without NFS4_OO_CONFIRMED
                         release_openowner
                          unhash_openowner_locked
                          list_del_init(&oo->oo_perclient)
                          // cannot find this oo
                          // from client, LEAK!!!
                         alloc_stateowner // alloc oo2

 nfsd4_process_open2
  init_open_stateid
  // associate oo1
  // with stateid1, stateid1 LEAK!!!
  nfs4_get_vfs_file
  // alloc nfsd_file1 and nfsd_file_mark1
  // all LEAK!!!

                         nfsd4_process_open2
                         ...

                                    write_threads
                                     ...
                                     nfsd_destroy_serv
                                      nfsd_shutdown_net
                                       nfs4_state_shutdown_net
                                        nfs4_state_destroy_net
                                         destroy_client
                                          __destroy_client
                                          // won't find oo1!!!
                                     nfsd_shutdown_generic
                                      nfsd_file_cache_shutdown
                                       kmem_cache_destroy
                                       for nfsd_file_slab
                                       and nfsd_file_mark_slab
                                       // bark since nfsd_file1
                                       // and nfsd_file_mark1
                                       // still alive

=======================================================================
BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
__kmem_cache_shutdown()
-----------------------------------------------------------------------

Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.1-2.fc37 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 slab_err+0xb0/0xf0
 __kmem_cache_shutdown+0x15c/0x310
 kmem_cache_destroy+0x66/0x160
 nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
 nfsd_destroy_serv+0x251/0x2a0 [nfsd]
 nfsd_svc+0x125/0x1e0 [nfsd]
 write_threads+0x16a/0x2a0 [nfsd]
 nfsctl_transaction_write+0x74/0xa0 [nfsd]
 vfs_write+0x1ae/0x6d0
 ksys_write+0xc1/0x160
 do_syscall_64+0x5f/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Disabling lock debugging due to kernel taint
Object 0xff11000110e2ac38 @offset=3128
Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3
pid=800
 nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
 nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
 nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
 nfsd4_process_open2+0x713/0x1070 [nfsd]
 nfsd4_open+0x74b/0x8b0 [nfsd]
 nfsd4_proc_compound+0x70b/0xc20 [nfsd]
 nfsd_dispatch+0x1b4/0x3a0 [nfsd]
 svc_process_common+0x5b8/0xc50 [sunrpc]
 svc_process+0x2ab/0x3b0 [sunrpc]
 svc_handle_xprt+0x681/0xa20 [sunrpc]
 nfsd+0x183/0x220 [nfsd]
 kthread+0x199/0x1e0
 ret_from_fork+0x31/0x60
 ret_from_fork_asm+0x1a/0x30

Add nfs4_openowner_unhashed to help detect an unhashed nfs4_openowner, and
bail out of the nfsd4_open process to fix this problem.

Cc: stable@vger.kernel.org
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Yang Erkun <yangerkun@huawei.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>