Commits on Oct 12, 2012
  1. @gregkh

    Linux 3.4.14

    gregkh authored
  2. @gregkh

    sched: Fix migration thread runtime bogosity

    Mike Galbraith authored gregkh committed
    commit 8f61896 upstream.
    
    Make stop scheduler class do the same accounting as other classes.
    
    Migration threads can be caught in the act while doing exec balancing,
    leading to the below due to use of unmaintained ->se.exec_start.  The
    load that triggered this particular instance was an apparently out of
    control heavily threaded application that does system monitoring in
    what equated to an exec bomb, with one of the VERY frequently migrated
    tasks being ps.
    
    %CPU   PID USER     CMD
    99.3    45 root     [migration/10]
    97.7    53 root     [migration/12]
    97.0    57 root     [migration/13]
    90.1    49 root     [migration/11]
    89.6    65 root     [migration/15]
    88.7    17 root     [migration/3]
    80.4    37 root     [migration/8]
    78.1    41 root     [migration/9]
    44.2    13 root     [migration/2]
    
    Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/1344051854.6739.19.camel@marge.simpson.net
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. @gregkh

    udf: fix return value on error path in udf_load_logicalvol

    Nikola Pajkovsky authored gregkh committed
    commit 68766a2 upstream.
    
    In case we detect a problem and bail out, we fail to set "ret" to a
    nonzero value, and udf_load_logicalvol will mistakenly report success.
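
The failure mode described above is common in functions that share a single exit label. A minimal userspace sketch, assuming hypothetical names (`check_block`, `load_volume_buggy`, `load_volume_fixed`) and an assumed error code; it is not the UDF code itself:

```c
#include <errno.h>

/* Hypothetical validity check standing in for the real on-disk checks. */
static int check_block(int block_ok)
{
	return block_ok;
}

/* Buggy shape: bails out on a bad block but leaves ret at 0. */
int load_volume_buggy(int block_ok)
{
	int ret = 0;

	if (!check_block(block_ok))
		goto out;	/* ret is still 0: the caller sees success */
	/* ... normal processing ... */
out:
	return ret;
}

/* Fixed shape: set ret before jumping to the exit label. */
int load_volume_fixed(int block_ok)
{
	int ret = 0;

	if (!check_block(block_ok)) {
		ret = -EIO;	/* assumed error code, for illustration only */
		goto out;
	}
	/* ... normal processing ... */
out:
	return ret;
}
```

With a bad block, the buggy variant returns 0 while the fixed variant returns a negative errno value.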
    
    Signed-off-by: Nikola Pajkovsky <npajkovs@redhat.com>
    Signed-off-by: Jan Kara <jack@suse.cz>
    Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. @freddy77 @gregkh

    Convert properly UTF-8 to UTF-16

    freddy77 authored gregkh committed
    commit fd3ba42 upstream.
    
    wchar_t is currently 16 bits wide, so converting UTF-8 encoded characters
    outside plane 0 (>= 0x10000) to wchar_t (that is, calling char2uni) leads
    to an -EINVAL return. This patch detects UTF-8 in cifs_strtoUTF16 and adds
    special code that calls utf8s_to_utf16s.
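
To see why a single 16-bit unit is not enough, the following sketch encodes one code point as UTF-16. It is an illustration of the encoding only; `encode_utf16` is a hypothetical helper, not the kernel's utf8s_to_utf16s.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode one Unicode code point as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or 0 if the
 * code point cannot be encoded (a lone surrogate or above U+10FFFF).
 */
size_t encode_utf16(uint32_t cp, uint16_t out[2])
{
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	if (cp < 0x10000) {
		/* Plane 0: fits in one unit. */
		out[0] = (uint16_t)cp;
		return 1;
	}
	if (cp > 0x10FFFF)
		return 0;
	/* Outside plane 0: split the 20-bit offset into a surrogate pair. */
	cp -= 0x10000;
	out[0] = (uint16_t)(0xD800 | (cp >> 10));
	out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));
	return 2;
}
```

For example, U+1F600 needs two units (0xD83D, 0xDE00), which is exactly the case a 16-bit wchar_t cannot represent on its own.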
    
    Signed-off-by: Frediano Ziglio <frediano.ziglio@citrix.com>
    Acked-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: Steve French <smfrench@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. @gregkh

    cifs: reinstate the forcegid option

    Jeff Layton authored gregkh committed
    commit 72bd481 upstream.
    
    Apparently this was lost when we converted to the standard option
    parser in 8830d7e.
    
    Reported-by: Gregory Lee Bartholomew <gregory.lee.bartholomew@gmail.com>
    Cc: Sachin Prabhu <sprabhu@redhat.com>
    Signed-off-by: Jeff Layton <jlayton@redhat.com>
    Signed-off-by: Steve French <smfrench@gmail.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  6. @computersforpeace @gregkh

    JFFS2: don't fail on bitflips in OOB

    computersforpeace authored gregkh committed
    commit 74d83be upstream.
    
    JFFS2 was designed without thought for OOB bitflips, it seems, but they
    can occur and will be reported to JFFS2 via mtd_read_oob()[1]. We don't
    want to fail on these transactions, since the data was corrected.
    
    [1] Few drivers report bitflips for OOB-only transactions. With such
        drivers, this patch should have no effect.
    
    Signed-off-by: Brian Norris <computersforpeace@gmail.com>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. @lyakh @gregkh

    mmc: sh-mmcif: avoid oops on spurious interrupts

    lyakh authored gregkh committed
    commit 8464dd5 upstream.
    
    On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
    interrupts without any active request. To prevent the Oops that results
    in such cases, don't dereference the mmc request pointer until we are
    sure that we are indeed processing such a request.
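
The guard described above can be sketched as follows. The `struct host` and `struct request` types and the `irq_handler` name are hypothetical simplifications, not the sh-mmcif driver's own definitions.

```c
#include <stddef.h>

struct request { int opcode; };		/* hypothetical request */

struct host {				/* hypothetical host state */
	struct request *mrq;		/* NULL when no request is active */
	int handled;
};

/* Returns 1 if a request was processed, 0 if the interrupt was spurious. */
int irq_handler(struct host *host)
{
	/* Check for an active request before touching its fields. */
	if (!host->mrq)
		return 0;	/* spurious: nothing to dereference */

	host->handled = host->mrq->opcode;
	return 1;
}
```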
    
    Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
    Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
    Signed-off-by: Chris Ball <cjb@laptop.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. @VaibhavBedia-xx @gregkh

    mmc: omap_hsmmc: Pass on the suspend failure to the PM core

    VaibhavBedia-xx authored gregkh committed
    commit c4c8eeb upstream.
    
    In some cases mmc_suspend_host() is not able to claim the
    host and proceed with the suspend process. The core returns
    -EBUSY to the host controller driver. Unfortunately, the
    host controller driver does not pass on this information
    to the PM core and hence the system suspend process continues.
    
    	ret = mmc_suspend_host(host->mmc);
    	if (ret) {
    		host->suspended = 0;
    		if (host->pdata->resume) {
    			ret = host->pdata->resume(dev, host->slot_id);
    
    The return status from mmc_suspend_host() is overwritten by return
    status from host->pdata->resume. So the original return status is lost.
    
    In these cases the MMC core gets to an unexpected state
    during resume and multiple issues related to MMC crop up.
    1. Host controller driver starts accessing the device registers
    before the clocks are enabled which leads to a prefetch abort.
    2. A file copy thread which was launched before suspend gets
    stuck due to the host not being reclaimed during resume.
    
    To avoid such problems pass on the -EBUSY status to the PM core
    from the host controller driver. With this change, MMC core
    suspend might still fail but it does not end up making the
    system unusable. Suspend gets aborted and the user can try
    suspending the system again.
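
A minimal sketch of the overwritten-status problem, assuming toy stand-ins (`core_suspend`, `platform_resume`) for mmc_suspend_host() and host->pdata->resume(); the real driver's control flow is more involved:

```c
#include <errno.h>

/* Hypothetical stand-ins for mmc_suspend_host() and pdata->resume(). */
static int core_suspend(int busy) { return busy ? -EBUSY : 0; }
static int platform_resume(void) { return 0; }

/* Buggy shape: the resume status overwrites the suspend error. */
int suspend_buggy(int busy)
{
	int ret = core_suspend(busy);

	if (ret)
		ret = platform_resume();	/* -EBUSY is lost here */
	return ret;
}

/* Fixed shape: keep the original error and report it to the caller. */
int suspend_fixed(int busy)
{
	int ret = core_suspend(busy);

	if (ret) {
		(void)platform_resume();	/* undo; status not propagated */
		return ret;			/* caller sees -EBUSY */
	}
	return 0;
}
```

In the buggy variant a busy host is reported as a successful suspend; in the fixed variant the caller receives -EBUSY and can abort.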
    
    Signed-off-by: Vaibhav Bedia <vaibhav.bedia@ti.com>
    Signed-off-by: Hebbar, Gururaja <gururaja.hebbar@ti.com>
    Acked-by: Venkatraman S <svenkatr@ti.com>
    Signed-off-by: Chris Ball <cjb@laptop.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. @gregkh

    mtd: omap2: fix module loading

    Andreas Bießmann authored gregkh committed
    commit 4d3d688 upstream.
    
    Unloading the omap2 nand driver failed to release the memory region, so
    the region could not be requested again when the driver was loaded
    later on.
    
    This patch fixes the following error when loading the omap2 module after unloading:
    ---8<---
    ~ $ rmmod omap2
    ~ $ modprobe omap2
    [   37.420928] omap2-nand: probe of omap2-nand.0 failed with error -16
    ~ $
    --->8---
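
The -16 in the log is -EBUSY. A toy model of the probe/remove pairing, assuming hypothetical helpers (`request_region_toy`, `release_region_toy`) rather than the kernel's resource API:

```c
#include <errno.h>
#include <stdbool.h>

static bool region_claimed;	/* toy model of one memory region */

static int request_region_toy(void)
{
	if (region_claimed)
		return -EBUSY;	/* -16 on Linux */
	region_claimed = true;
	return 0;
}

static void release_region_toy(void)
{
	region_claimed = false;
}

/* probe() claims the region. */
int driver_probe(void)
{
	return request_region_toy();
}

/* remove() must undo what probe() claimed; 'release' models the fix. */
void driver_remove(bool release)
{
	if (release)
		release_region_toy();
}
```

If remove() skips the release, the next probe() fails with -EBUSY, which mirrors the rmmod/modprobe sequence above.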
    
    This error was introduced in 67ce04b which
    was the first commit of this driver.
    
    Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  10. @gregkh

    mtd: omap2: fix omap_nand_remove segfault

    Andreas Bießmann authored gregkh committed
    commit 7d9b110 upstream.
    
    Do not kfree() the mtd_info; it is handled in the mtd subsystem and
    already freed by nand_release(). Instead kfree() the struct
    omap_nand_info allocated in omap_nand_probe which was not freed before.
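
The distinction between an embedded member and the allocation that contains it can be sketched in userspace. The struct and function names below are hypothetical, and free() stands in for kfree():

```c
#include <stddef.h>
#include <stdlib.h>

struct mtd_toy { int size; };		/* stands in for struct mtd_info */

struct nand_info_toy {			/* stands in for the driver's private struct */
	int irq;
	struct mtd_toy mtd;		/* embedded, not separately allocated */
};

/* Recover the enclosing allocation from a pointer to its embedded member. */
static struct nand_info_toy *to_info(struct mtd_toy *mtd)
{
	return (struct nand_info_toy *)((char *)mtd -
					offsetof(struct nand_info_toy, mtd));
}

/* Allocate the private struct and hand out a pointer to the member. */
struct mtd_toy *toy_probe(void)
{
	struct nand_info_toy *info = calloc(1, sizeof(*info));

	return info ? &info->mtd : NULL;
}

/* Free the allocation itself, never the embedded member's address. */
void toy_remove(struct mtd_toy *mtd)
{
	free(to_info(mtd));
}
```

Passing `&info->mtd` to the allocator's free routine hands it an address it never returned, which is the kind of error the slab debug check in the trace below catches.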
    
    This patch fixes the following error when unloading the omap2 module:
    
    ---8<---
    ~ $ rmmod omap2
    ------------[ cut here ]------------
    kernel BUG at mm/slab.c:3126!
    Internal error: Oops - BUG: 0 [#1] PREEMPT ARM
    Modules linked in: omap2(-)
    CPU: 0    Not tainted  (3.6.0-rc3-00230-g155e36d-dirty #3)
    PC is at cache_free_debugcheck+0x2d4/0x36c
    LR is at kfree+0xc8/0x2ac
    pc : [<c01125a0>]    lr : [<c0112efc>]    psr: 200d0193
    sp : c521fe08  ip : c0e8ef90  fp : c521fe5c
    r10: bf0001fc  r9 : c521e000  r8 : c0d99c8c
    r7 : c661ebc0  r6 : c065d5a4  r5 : c65c4060  r4 : c78005c0
    r3 : 00000000  r2 : 00001000  r1 : c65c4000  r0 : 00000001
    Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
    Control: 10c5387d  Table: 86694019  DAC: 00000015
    Process rmmod (pid: 549, stack limit = 0xc521e2f0)
    Stack: (0xc521fe08 to 0xc5220000)
    fe00:                   c008a874 c00bf44c c515c6d0 200d0193 c65c4860 c515c240
    fe20: c521fe3c c521fe30 c008a9c0 c008a854 c521fe5c c65c4860 c78005c0 bf0001fc
    fe40: c780ff40 a00d0113 c521e000 00000000 c521fe84 c521fe60 c0112efc c01122d8
    fe60: c65c4860 c0673778 c06737ac 00000000 00070013 00000000 c521fe9c c521fe88
    fe80: bf0001fc c0112e40 c0673778 bf001ca8 c521feac c521fea0 c02ca11c bf0001ac
    fea0: c521fec4 c521feb0 c02c82c4 c02ca100 c0673778 bf001ca8 c521fee4 c521fec8
    fec0: c02c8dd8 c02c8250 00000000 bf001ca8 bf001ca8 c0804ee0 c521ff04 c521fee8
    fee0: c02c804c c02c8d20 bf001924 00000000 bf001ca8 c521e000 c521ff1c c521ff08
    ff00: c02c950c c02c7fbc bf001d48 00000000 c521ff2c c521ff20 c02ca3a4 c02c94b8
    ff20: c521ff3c c521ff30 bf001938 c02ca394 c521ffa4 c521ff40 c009beb4 bf001930
    ff40: c521ff6c 70616d6f b6fe0032 c0014f84 70616d6f b6fe0032 00000081 60070010
    ff60: c521ff84 c521ff70 c008e1f4 c00bf328 0001a004 70616d6f c521ff94 0021ff88
    ff80: c008e368 0001a004 70616d6f b6fe0032 00000081 c0015028 00000000 c521ffa8
    ffa0: c0014dc0 c009bcd0 0001a004 70616d6f bec2ab38 00000880 bec2ab38 00000880
    ffc0: 0001a004 70616d6f b6fe0032 00000081 00000319 00000000 b6fe1000 00000000
    ffe0: bec2ab30 bec2ab20 00019f00 b6f539c0 60070010 bec2ab38 aaaaaaaa aaaaaaaa
    Backtrace:
    [<c01122cc>] (cache_free_debugcheck+0x0/0x36c) from [<c0112efc>] (kfree+0xc8/0x2ac)
    [<c0112e34>] (kfree+0x0/0x2ac) from [<bf0001fc>] (omap_nand_remove+0x5c/0x64 [omap2])
    [<bf0001a0>] (omap_nand_remove+0x0/0x64 [omap2]) from [<c02ca11c>] (platform_drv_remove+0x28/0x2c)
     r5:bf001ca8 r4:c0673778
    [<c02ca0f4>] (platform_drv_remove+0x0/0x2c) from [<c02c82c4>] (__device_release_driver+0x80/0xdc)
    [<c02c8244>] (__device_release_driver+0x0/0xdc) from [<c02c8dd8>] (driver_detach+0xc4/0xc8)
     r5:bf001ca8 r4:c0673778
    [<c02c8d14>] (driver_detach+0x0/0xc8) from [<c02c804c>] (bus_remove_driver+0x9c/0x104)
     r6:c0804ee0 r5:bf001ca8 r4:bf001ca8 r3:00000000
    [<c02c7fb0>] (bus_remove_driver+0x0/0x104) from [<c02c950c>] (driver_unregister+0x60/0x80)
     r6:c521e000 r5:bf001ca8 r4:00000000 r3:bf001924
    [<c02c94ac>] (driver_unregister+0x0/0x80) from [<c02ca3a4>] (platform_driver_unregister+0x1c/0x20)
     r5:00000000 r4:bf001d48
    [<c02ca388>] (platform_driver_unregister+0x0/0x20) from [<bf001938>] (omap_nand_driver_exit+0x14/0x1c [omap2])
    [<bf001924>] (omap_nand_driver_exit+0x0/0x1c [omap2]) from [<c009beb4>] (sys_delete_module+0x1f0/0x2ec)
    [<c009bcc4>] (sys_delete_module+0x0/0x2ec) from [<c0014dc0>] (ret_fast_syscall+0x0/0x48)
     r8:c0015028 r7:00000081 r6:b6fe0032 r5:70616d6f r4:0001a004
    Code: e1a00005 eb0d9172 e7f001f2 e7f001f2 (e7f001f2)
    ---[ end trace 6a30b24d8c0cc2ee ]---
    Segmentation fault
    --->8---
    
    This error was introduced in 67ce04b which
    was the first commit of this driver.
    
    Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  11. @shmull @gregkh

    mtd: nand: Use the mirror BBT descriptor when reading its version

    shmull authored gregkh committed
    commit 7bb9c75 upstream.
    
    The code responsible for reading the version of the mirror bbt was
    incorrectly using the descriptor of the main bbt.
    
    Pass the mirror bbt descriptor to 'scan_read_raw' when reading the
    version of the mirror bbt.
    
    Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
    Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. @rgenoud @gregkh

    mtd: nandsim: bugfix: fail if overridesize is too big

    rgenoud authored gregkh committed
    commit bb0a13a upstream.
    
    If override size is too big, the module was actually loaded instead of
    failing, because retval was not set.
    
    This led to memory corruption through use of the freed nandsim and
    nand_chip structs.
    
    Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. @shcgit @gregkh

    mtd: autcpu12-nvram: Fix compile breakage

    shcgit authored gregkh committed
    commit d1f55c6 upstream.
    
    Update driver autcpu12-nvram.c so it compiles; map_read32/map_write32
    no longer exist in the kernel so the driver is totally broken.
    Additionally, map_info name passed to simple_map_init is incorrect.
    
    Signed-off-by: Alexander Shiyan <shc_work@mail.ru>
    Acked-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  14. @zyzii @gregkh

    mtd: mtdpart: break it as soon as we parse out the partitions

    zyzii authored gregkh committed
    commit c51803d upstream.
    
    We may cause a memory leak when the @types has more than one parser.
    
    Take the `default_mtd_part_types` for example. The default_mtd_part_types has
    two parsers now: `cmdlinepart` and `ofpart`.
    
    Assume the following case:
    The kernel command line sets the partitions like:
    	#gpmi-nand:20m(boot),20m(kernel),1g(rootfs),-(user)
    But the devicetree file (such as arch/arm/boot/dts/imx28-evk.dts) also sets
    the same partitions as the kernel command line does.
    
    In the current code, the partitions parsed out by `ofpart` overwrite
    the @pparts already set by the `cmdlinepart` parser, and the partitions
    parsed out by `cmdlinepart` are lost.
    A memory leak occurs.
    
    So we should break out of the loop as soon as we parse out the partitions.
    In effect, this patch establishes a priority order between the parsers:
    if one parser has already parsed out the partitions successfully,
    there is no need to try another parser.
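
The first-success-wins loop can be sketched as below. The parser type, the sample parsers, and `parse_partitions` are hypothetical; the real code iterates over named parsers and allocates partition arrays.

```c
#include <stddef.h>

/* Hypothetical parser: returns the number of partitions found (0 if none). */
typedef int (*part_parser)(void);

static int calls_b;	/* counts calls to the lowest-priority sample parser */

int parser_none(void) { return 0; }
int parser_a(void) { return 4; }
int parser_b(void) { calls_b++; return 4; }

/*
 * Try parsers in priority order and stop at the first one that finds
 * partitions, so a later parser cannot overwrite (and leak) its result.
 * Returns the partition count and stores the winning index in *which.
 */
int parse_partitions(const part_parser *parsers, size_t n, size_t *which)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		ret = parsers[i]();
		if (ret > 0) {
			*which = i;
			break;
		}
	}
	return ret;
}
```

With the parsers ordered `parser_none`, `parser_a`, `parser_b`, the loop stops at `parser_a` and `parser_b` is never called.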
    
    Signed-off-by: Huang Shijie <shijie8@gmail.com>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
    Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  15. @gregkh

    CPU hotplug, cpusets, suspend: Don't modify cpusets during suspend/resume

    Srivatsa S. Bhat authored gregkh committed
    commit d35be8b upstream.
    
    In the event of CPU hotplug, the kernel modifies the cpusets' cpus_allowed
    masks as and when necessary to ensure that the tasks belonging to the cpusets
    have some place (online CPUs) to run on. And regular CPU hotplug is
    destructive in the sense that the kernel doesn't remember the original cpuset
    configurations set by the user, across hotplug operations.
    
    However, suspend/resume (which uses CPU hotplug) is a special case in which
    the kernel has the responsibility to restore the system (during resume), to
    exactly the same state it was in before suspend.
    
    In order to achieve that, do the following:
    
    1. Don't modify cpusets during suspend/resume. At all.
       In particular, don't move the tasks from one cpuset to another, and
       don't modify any cpuset's cpus_allowed mask. So, simply ignore cpusets
       during the CPU hotplug operations that are carried out in the
       suspend/resume path.
    
    2. However, cpusets and sched domains are related. We just want to avoid
       altering cpusets alone. So, to keep the sched domains updated, build
       a single sched domain (containing all active cpus) during each of the
       CPU hotplug operations carried out in s/r path, effectively ignoring
       the cpusets' cpus_allowed masks.
    
       (Since userspace is frozen while doing all this, it will go unnoticed.)
    
    3. During the last CPU online operation during resume, build the sched
       domains by looking up the (unaltered) cpusets' cpus_allowed masks.
       That will bring back the system to the same original state as it was in
       before suspend.
    
    Ultimately, this will not only solve the cpuset problem related to
    suspend/resume (i.e., restore the cpusets to exactly what they were
    before suspend, by not touching them at all) but also speed up
    suspend/resume because we avoid running cpuset update code for every
    CPU being offlined/onlined.
    
    Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Link: http://lkml.kernel.org/r/20120524141611.3692.20155.stgit@srivatsabhat.in.ibm.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  16. @gregkh

    efi: initialize efi.runtime_version to make query_variable_info/update_capsule workable

    Seiji Aguchi authored gregkh committed
    commit d6cf86d upstream.
    
    A value of efi.runtime_version is checked before calling
    update_capsule()/query_variable_info() as follows.
    But it isn't initialized anywhere.
    
    <snip>
    static efi_status_t virt_efi_query_variable_info(u32 attr,
                                                     u64 *storage_space,
                                                     u64 *remaining_space,
                                                     u64 *max_variable_size)
    {
            if (efi.runtime_version < EFI_2_00_SYSTEM_TABLE_REVISION)
                    return EFI_UNSUPPORTED;
    <snip>
    
    This patch initializes a value of efi.runtime_version at boot time.
    
    Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
    Acked-by: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: Matt Fleming <matt.fleming@intel.com>
    Signed-off-by: Ivan Hu <ivan.hu@canonical.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  17. @gregkh

    efi: Build EFI stub with EFI-appropriate options

    Matthew Garrett authored gregkh committed
    commit 9dead5b upstream.
    
    We can't assume the presence of the red zone while we're still in a boot
    services environment, so we should build with -fno-red-zone to avoid
    problems. Change the size of wchar at the same time to make string handling
    simpler.
    
    Signed-off-by: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: Matt Fleming <matt.fleming@intel.com>
    Acked-by: Josh Boyer <jwboyer@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  18. @gregkh

    mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()

    Mel Gorman authored gregkh committed
    commit 00442ad upstream.
    
    Commit cc9a6c8 ("cpuset: mm: reduce large amounts of memory barrier
    related damage v3") introduced a potential memory corruption.
    shmem_alloc_page() uses a pseudo vma and it has one significant unique
    combination, vma->vm_ops=NULL and vma->policy->flags & MPOL_F_SHARED.
    
    get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL
    and mpol_cond_put() DOES decrease a policy ref when a policy has
    MPOL_F_SHARED.  Therefore, when a cpuset update race occurs,
    alloc_pages_vma() falls in 'goto retry_cpuset' path, decrements the
    reference count and frees the policy prematurely.
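
The imbalance can be modelled with a toy reference counter. This is a deliberate simplification with hypothetical names: `toy_get` models get_vma_policy() taking a reference only when the vma has vm_ops, and `toy_cond_put` models mpol_cond_put().

```c
#include <stdbool.h>

struct policy_toy {
	int refcnt;
	bool shared;	/* models MPOL_F_SHARED */
};

/* Models get_vma_policy(): takes a reference only when has_ops is true. */
void toy_get(struct policy_toy *p, bool has_ops)
{
	if (has_ops)
		p->refcnt++;
}

/* Models mpol_cond_put(): drops a reference whenever the policy is shared. */
void toy_cond_put(struct policy_toy *p)
{
	if (p->shared)
		p->refcnt--;
}
```

For a shared policy on a pseudo vma without vm_ops, each get/put pair drops one reference that was never taken, so a policy that starts with a count of 1 reaches 0 while its owner still holds it.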
    
    Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Reviewed-by: Christoph Lameter <cl@linux.com>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  19. @kosaki @gregkh

    mempolicy: fix refcount leak in mpol_set_shared_policy()

    kosaki authored gregkh committed
    commit 63f74ca upstream.
    
    When shared_policy_replace() fails to allocate, new->policy is not freed
    correctly by mpol_set_shared_policy().  The problem is that the shared
    mempolicy code directly calls kmem_cache_free() in multiple places where
    it is easy to make a mistake.
    
    This patch creates an sp_free wrapper function and uses it. The bug was
    introduced pre-git age (IOW, before 2.6.12-rc2).
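
The idea of a single teardown helper can be sketched in userspace. All names here are hypothetical, and calloc()/free() stand in for the slab caches; the counter exists only to make leaks visible.

```c
#include <stdlib.h>

static int live_objects;	/* allocation counter, for the demonstration only */

struct pol_toy { int mode; };
struct sp_node_toy { struct pol_toy *policy; };

static void *toy_alloc(size_t size)
{
	void *p = calloc(1, size);

	if (p)
		live_objects++;
	return p;
}

static void toy_free(void *p)
{
	if (p) {
		live_objects--;
		free(p);
	}
}

/* Single teardown helper: releases the node and the policy it owns. */
void sp_free_toy(struct sp_node_toy *n)
{
	if (!n)
		return;
	toy_free(n->policy);
	toy_free(n);
}

/* Allocate a node together with its policy; clean up fully on failure. */
struct sp_node_toy *sp_alloc_toy(void)
{
	struct sp_node_toy *n = toy_alloc(sizeof(*n));

	if (!n)
		return NULL;
	n->policy = toy_alloc(sizeof(*n->policy));
	if (!n->policy) {
		sp_free_toy(n);
		return NULL;
	}
	return n;
}

int toy_live_objects(void)
{
	return live_objects;
}
```

Routing every release through one helper means no call site can free the node and forget the policy.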
    
    [mgorman@suse.de: Edited changelog]
    Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Reviewed-by: Christoph Lameter <cl@linux.com>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  20. @gregkh

    mempolicy: fix a race in shared_policy_replace()

    Mel Gorman authored gregkh committed
    commit b22d127 upstream.
    
    shared_policy_replace() use of sp_alloc() is unsafe.  1) sp_node cannot
    be dereferenced if sp->lock is not held and 2) another thread can modify
    sp_node between spin_unlock for allocating a new sp node and next
    spin_lock.  The bug was introduced before 2.6.12-rc2.
    
    Kosaki's original patch for this problem was to allocate an sp node and
    policy within shared_policy_replace and initialise it when the lock is
    reacquired.  I was not keen on this approach because it partially
    duplicates sp_alloc().  As the paths where sp->lock is taken are not that
    performance critical this patch converts sp->lock to sp->mutex so it can
    sleep when calling sp_alloc().
    
    [kosaki.motohiro@jp.fujitsu.com: Original patch]
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Reviewed-by: Christoph Lameter <cl@linux.com>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  21. @kosaki @gregkh

    mempolicy: remove mempolicy sharing

    kosaki authored gregkh committed
    commit 869833f upstream.
    
    Dave Jones' system call fuzz testing tool "trinity" triggered the
    following bug with slab debugging enabled:
    
        =============================================================================
        BUG numa_policy (Not tainted): Poison overwritten
        -----------------------------------------------------------------------------
    
        INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b
        INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154
         __slab_alloc+0x3d3/0x445
         kmem_cache_alloc+0x29d/0x2b0
         mpol_new+0xa3/0x140
         sys_mbind+0x142/0x620
         system_call_fastpath+0x16/0x1b
    
        INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154
         __slab_free+0x2e/0x1de
         kmem_cache_free+0x25a/0x260
         __mpol_put+0x27/0x30
         remove_vma+0x68/0x90
         exit_mmap+0x118/0x140
         mmput+0x73/0x110
         exit_mm+0x108/0x130
         do_exit+0x162/0xb90
         do_group_exit+0x4f/0xc0
         sys_exit_group+0x17/0x20
         system_call_fastpath+0x16/0x1b
    
        INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x          (null) flags=0x20000000004080
        INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0
    
    The problem is that the structure is being prematurely freed due to a
    reference count imbalance. In the following case mbind(addr, len) should
    replace the memory policies of both vma1 and vma2, so they will
    come to share the same mempolicy, and the new mempolicy will have the
    MPOL_F_SHARED flag.
    
      +-------------------+-------------------+
      |     vma1          |     vma2(shmem)   |
      +-------------------+-------------------+
      |                                       |
     addr                                 addr+len
    
    alloc_pages_vma() uses the get_vma_policy() and mpol_cond_put() pair for
    maintaining the mempolicy reference count.  The current rule is that
    get_vma_policy() only increments the refcount for a shmem VMA and
    mpol_cond_put() only decrements the refcount if the policy has
    MPOL_F_SHARED.
    
    In the above case, vma1 is not a shmem vma and vma->policy has MPOL_F_SHARED!
    The reference count will be decreased even though it was not increased
    whenever alloc_page_vma() is called.  This has been broken since commit
    [52cd3b0: mempolicy: rework mempolicy Reference Counting] in 2008.
    
    There is another serious bug with the sharing of memory policies.
    Currently, the mempolicy rebind logic (called from cpuset rebinding)
    ignores the refcount of the mempolicy and overrides it forcibly.  Thus, any
    mempolicy sharing may cause mempolicy corruption.  The bug was
    introduced by commit [68860ec: cpusets: automatic numa mempolicy
    rebinding].
    
    Ideally, the shared policy handling would be rewritten to either
    properly handle COW of the policy structures or at least reference count
    MPOL_F_SHARED based exclusively on information within the policy.
    However, this patch takes the easier approach of disabling any policy
    sharing between VMAs.  Each new range allocated with sp_alloc will
    allocate a new policy, set the reference count to 1 and drop the
    reference count of the old policy.  This increases the memory footprint
    but is not expected to be a major problem as mbind() is unlikely to be
    used for fine-grained ranges.  It is also inefficient because it means
    we allocate a new policy even in cases where mbind_range() could use the
    new_policy passed to it.  However, it is more straight-forward and the
    change should be invisible to the user.
    
    [mgorman@suse.de: Edited changelog]
    Reported-by: Dave Jones <davej@redhat.com>
    Cc: Christoph Lameter <cl@linux.com>
    Reviewed-by: Christoph Lameter <cl@linux.com>
    Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  22. @kosaki @gregkh

    revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages"

    kosaki authored gregkh committed
    commit 8d34694 upstream.
    
    Commit 05f144a ("mm: mempolicy: Let vma_merge and vma_split handle
    vma->vm_policy linkages") removed vma->vm_policy updates code but it is
    the purpose of mbind_range().  Now, mbind_range() is virtually a no-op
    and while it does not allow memory corruption it is not the right fix.
    This patch is a revert.
    
    [mgorman@suse.de: Edited changelog]
    Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  23. @gregkh

    r8169: 8168c and later require bit 0x20 to be set in Config2 for PME signaling.

    Francois Romieu authored gregkh committed
    commit d387b42 upstream.
    
    The new 84xx stopped flying below the radars.
    
    Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
    Cc: Hayes Wang <hayeswang@realtek.com>
    Acked-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  24. @gregkh

    r8169: Config1 is read-only on 8168c and later.

    Francois Romieu authored gregkh committed
    commit 851e602 upstream.
    
    Suggested by Hayes.
    
    Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
    Cc: Hayes Wang <hayeswang@realtek.com>
    Acked-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  25. @gregkh

    rcu: Fix day-one dyntick-idle stall-warning bug

    Paul E. McKenney authored gregkh committed
    commit a10d206 upstream.
    
    Each grace period is supposed to have at least one callback waiting
    for that grace period to complete.  However, if CONFIG_NO_HZ=n, an
    extra callback-free grace period is no big problem -- it will chew up
    a tiny bit of CPU time, but it will complete normally.  In contrast,
    CONFIG_NO_HZ=y kernels have the potential for all the CPUs to go to
    sleep indefinitely, in turn indefinitely delaying completion of the
    callback-free grace period.  Given that nothing is waiting on this grace
    period, this is also not a problem.
    
    That is, unless RCU CPU stall warnings are also enabled, as they are
    in recent kernels.  In this case, if a CPU wakes up after at least one
    minute of inactivity, an RCU CPU stall warning will result.  The reason
    that no one noticed until quite recently is that most systems have enough
    OS noise that they will never remain absolutely idle for a full minute.
    But there are some embedded systems with cut-down userspace configurations
    that consistently get into this situation.
    
    All this begs the question of exactly how a callback-free grace period
    gets started in the first place.  This can happen due to the fact that
    CPUs do not necessarily agree on which grace period is in progress.
    If a CPU still believes that the grace period that just completed is
    still ongoing, it will believe that it has callbacks that need to wait for
    another grace period, never mind the fact that the grace period that they
    were waiting for just completed.  This CPU can therefore erroneously
    decide to start a new grace period.  Note that this can happen in
    TREE_RCU and TREE_PREEMPT_RCU even on a single-CPU system:  Deadlock
    considerations mean that the CPU that detected the end of the grace
    period is not necessarily officially informed of this fact for some time.
    
    Once this CPU notices that the earlier grace period completed, it will
    invoke its callbacks.  It then won't have any callbacks left.  If no
    other CPU has any callbacks, we now have a callback-free grace period.
    
    This commit therefore makes CPUs check more carefully before starting a
    new grace period.  This new check relies on an array of tail pointers
    into each CPU's list of callbacks.  If the CPU is up to date on which
    grace periods have completed, it checks to see if any callbacks follow
    the RCU_DONE_TAIL segment, otherwise it checks to see if any callbacks
    follow the RCU_WAIT_TAIL segment.  The reason that this works is that
    the RCU_WAIT_TAIL segment will be promoted to the RCU_DONE_TAIL segment
    as soon as the CPU is officially notified that the old grace period
    has ended.
    
    This change is to cpu_needs_another_gp(), which is called in a number
    of places.  The only one that really matters is in rcu_start_gp(), where
    the root rcu_node structure's ->lock is held, which prevents any
    other CPU from starting or completing a grace period, so that the
    comparison that determines whether the CPU is missing the completion
    of a grace period is stable.
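
The check described above can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel's code: every `model_*` name, the struct layouts, and the "naive" check used for contrast are hypothetical, and only the RCU_DONE_TAIL/RCU_WAIT_TAIL segment idea comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Segment indices; the names mirror the RCU_*_TAIL constants in the text. */
enum { MODEL_DONE_TAIL, MODEL_WAIT_TAIL, MODEL_NEXT_READY_TAIL,
       MODEL_NEXT_TAIL, MODEL_NR_TAILS };

struct model_cb { struct model_cb *next; };

/* Global view: a grace period is in progress while gpnum != completed. */
struct model_state { unsigned long gpnum, completed; };

/* Per-CPU view: completed may lag behind the global value. */
struct model_cpu {
	unsigned long completed;
	struct model_cb *list;
	struct model_cb **tail[MODEL_NR_TAILS]; /* end of each segment */
};

static void model_cpu_init(struct model_cpu *c)
{
	c->completed = 0;
	c->list = NULL;
	for (int i = 0; i < MODEL_NR_TAILS; i++)
		c->tail[i] = &c->list;
}

/* Append cb to segment seg; all later segments must be empty. */
static void model_enqueue(struct model_cpu *c, struct model_cb *cb, int seg)
{
	assert(*c->tail[MODEL_NEXT_TAIL] == NULL);
	assert(c->tail[seg] == c->tail[MODEL_NEXT_TAIL]);
	cb->next = NULL;
	*c->tail[seg] = cb;
	for (int i = seg; i < MODEL_NR_TAILS; i++)
		c->tail[i] = &cb->next;
}

/* The CPU is officially told that the old grace period ended:
 * the WAIT segment is promoted to DONE, and so on down the list. */
static void model_note_gp_end(const struct model_state *s, struct model_cpu *c)
{
	c->tail[MODEL_DONE_TAIL] = c->tail[MODEL_WAIT_TAIL];
	c->tail[MODEL_WAIT_TAIL] = c->tail[MODEL_NEXT_READY_TAIL];
	c->tail[MODEL_NEXT_READY_TAIL] = c->tail[MODEL_NEXT_TAIL];
	c->completed = s->completed;
}

static bool model_gp_in_progress(const struct model_state *s)
{
	return s->gpnum != s->completed;
}

/* Naive check (assumed for contrast): always look past the DONE segment. */
static bool model_naive_check(const struct model_state *s,
			      const struct model_cpu *c)
{
	return *c->tail[MODEL_DONE_TAIL] != NULL && !model_gp_in_progress(s);
}

/* Careful check: a CPU that has not yet noticed the completion looks
 * past the WAIT segment, because that segment is about to become DONE. */
static bool model_careful_check(const struct model_state *s,
				const struct model_cpu *c)
{
	int seg = (s->completed == c->completed) ? MODEL_DONE_TAIL
						 : MODEL_WAIT_TAIL;

	return *c->tail[seg] != NULL && !model_gp_in_progress(s);
}
```

In this model, a CPU whose only callback sits in the WAIT segment and whose `completed` value lags the global one asks for a new grace period under the naive check but not under the careful one. The model does no locking; in the kernel the comparison is stable only because the root rcu_node structure's ->lock is held, as noted above.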
    
    Reported-by: Becky Bruce <bgillbruce@gmail.com>
    Reported-by: Subodh Nijsure <snijsure@grid-net.com>
    Reported-by: Paul Walmsley <paul@pwsan.com>
    Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Tested-by: Paul Walmsley <paul@pwsan.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  26. @fweisbec @gregkh

    score: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit 0ee23fd upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in score's idle loop.
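
The placement of the two calls can be illustrated with a user-space sketch. It is a model under stated assumptions, not any architecture's real idle loop: the `model_*` functions are hypothetical stand-ins for rcu_idle_enter(), rcu_idle_exit(), a reschedule check, and the low-power wait, and the asserts only encode the pairing rule described above.

```c
#include <assert.h>
#include <stdbool.h>

/* True while the model RCU still expects read-side critical sections. */
static bool model_rcu_watching = true;
static int model_low_power_calls;
static int model_polls_left;

/* Stand-in for rcu_idle_enter(): the CPU stops using RCU from here on. */
static void model_rcu_idle_enter(void)
{
	assert(model_rcu_watching);
	model_rcu_watching = false;
}

/* Stand-in for rcu_idle_exit(): RCU read-side sections are legal again. */
static void model_rcu_idle_exit(void)
{
	assert(!model_rcu_watching);
	model_rcu_watching = true;
}

/* Stand-in for a reschedule check: reports work after a few polls. */
static bool model_need_resched(void)
{
	return model_polls_left-- <= 0;
}

/* Stand-in for the low-power wait; must run only inside the idle window. */
static void model_low_power_wait(void)
{
	assert(!model_rcu_watching);
	model_low_power_calls++;
}

/* One pass of the idle loop: the enter/exit pair brackets exactly the
 * part that puts the CPU in low-power mode. */
static void model_idle_pass(int polls)
{
	model_polls_left = polls;
	model_rcu_idle_enter();
	while (!model_need_resched())
		model_low_power_wait();
	model_rcu_idle_exit();
	/* In a kernel, the scheduler would be invoked at this point. */
}
```

Any tracing or other code that uses RCU read-side critical sections would have to sit outside the enter/exit window in this model, which is the constraint the paragraph above describes.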
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Chen Liqin <liqin.chen@sunplusct.com>
    Cc: Lennox Wu <lennox.wu@gmail.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  27. @fweisbec @gregkh

    m32r: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit 48ae077 upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the m32r's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Hirokazu Takata <takata@linux-m32r.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  28. @fweisbec @gregkh

    cris: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit c633f9e upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the Cris's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Mikael Starvik <starvik@axis.com>
    Cc: Jesper Nilsson <jesper.nilsson@axis.com>
    Cc: Cris <linux-cris-kernel@axis.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  29. @fweisbec @gregkh

    alpha: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit 4c94cad upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the Alpha's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Tested-by: Michael Cree <mcree@orcon.net.nz>
    Cc: Richard Henderson <rth@twiddle.net>
    Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
    Cc: Matt Turner <mattst88@gmail.com>
    Cc: alpha <linux-alpha@vger.kernel.org>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  30. @fweisbec @gregkh

    m68k: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit 5b57ba3 upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the m68k's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
    Cc: m68k <linux-m68k@lists.linux-m68k.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  31. @fweisbec @gregkh

    mn10300: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit 5b0753a upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the mn10300's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: David Howells <dhowells@redhat.com>
    Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Acked-by: David Howells <dhowells@redhat.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  32. @fweisbec @gregkh

    frv: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit 41d8fe5 upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the Frv's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: David Howells <dhowells@redhat.com>
    Acked-by: David Howells <dhowells@redhat.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  33. @fweisbec @gregkh

    xtensa: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit 11ad47a upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the xtensa's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Chris Zankel <chris@zankel.net>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  34. @fweisbec @gregkh

    parisc: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit fbe7521 upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the parisc's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: James E.J. Bottomley <jejb@parisc-linux.org>
    Cc: Helge Deller <deller@gmx.de>
    Cc: Parisc <linux-parisc@vger.kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  35. @fweisbec @gregkh

    h8300: Add missing RCU idle APIs on idle loop

    fweisbec authored gregkh committed
    commit b2fe143 upstream.
    
    In the old days, the whole idle task was considered
    an RCU quiescent state. But as RCU became more and
    more successful over time, some RCU read-side critical
    sections have been added even in the code of some
    architectures' idle tasks, for tracing for example.
    
    So nowadays, rcu_idle_enter() and rcu_idle_exit() must
    be called by the architecture to tell RCU about the part
    of the idle loop that doesn't make use of RCU read-side
    critical sections, typically the part that puts the CPU
    in low-power mode.
    
    This is necessary for RCU to find the quiescent states in
    idle in order to complete grace periods.
    
    Add this missing pair of calls in the h8300's idle loop.
    
    Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
    Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Reviewed-by: Josh Triplett <josh@joshtriplett.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>