Commits on Jul 13, 2011
  1. Linux 2.6.32.43

    gregkh committed Jul 13, 2011
  2. mm: prevent concurrent unmap_mapping_range() on the same inode

    commit 2aa1589 upstream.
    
    Michael Leun reported that running parallel opens on a fuse filesystem
    can trigger a "kernel BUG at mm/truncate.c:475".
    
    Gurudas Pai reported the same bug on NFS.
    
    The reason is, unmap_mapping_range() is not prepared for more than
    one concurrent invocation per inode.  For example:
    
      thread1: going through a big range, stops in the middle of a vma and
         stores the restart address in vm_truncate_count.
    
      thread2: comes in with a small (e.g. single page) unmap request on
         the same vma, somewhere before restart_address, finds that the
         vma was already unmapped up to the restart address and happily
         returns without doing anything.
    
    Another scenario would be two big unmap requests, both having to
    restart the unmapping and each one setting vm_truncate_count to its
    own value.  This could go on forever without any of them being able to
    finish.
    
    Truncate and hole punching already serialize with i_mutex.  Other
    callers of unmap_mapping_range() do not, and it's difficult to get
    i_mutex protection for all callers.  In particular ->d_revalidate(),
    which calls invalidate_inode_pages2_range() in fuse, may be called
    with or without i_mutex.
    
    This patch adds a new mutex to 'struct address_space' to prevent
    running multiple concurrent unmap_mapping_range() on the same mapping.
    
    [ We'll hopefully get rid of all this with the upcoming mm
      preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
      lockbreak" patch in particular.  But that is for 2.6.39 ]
    
    Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
    Reported-by: Michael Leun <lkml20101129@newton.leun.net>
    Reported-by: Gurudas Pai <gurudas.pai@oracle.com>
    Tested-by: Gurudas Pai <gurudas.pai@oracle.com>
    Acked-by: Hugh Dickins <hughd@google.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Miklos Szeredi committed with gregkh Feb 23, 2011
  3. udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packet

    [ Upstream commit 9cfaa8d ]
    
    Consider this scenario: when the first received UDP packet is larger
    than the receive buffer, the MSG_TRUNC bit is set in msg->msg_flags.
    If a checksum error then occurs on a blocking socket, the code jumps
    back to try_again to receive the next packet.  If that next packet is
    smaller than the receive buffer, MSG_TRUNC should not be set, but
    because the bit was never cleared in msg->msg_flags before receiving
    the next packet, it is still set, which is wrong.
    
    Fix this problem by clearing MSG_TRUNC flag when starting over for a
    new packet.
    
    Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
    Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Xufeng Zhang committed with gregkh Jun 21, 2011
  4. ipv6/udp: Use the correct variable to determine non-blocking condition

    [ Upstream commit 32c9025 ]
    
    The udpv6_recvmsg() function does not use the correct variable to
    determine whether the socket is in non-blocking operation.  This leads
    to unexpected behavior when a UDP checksum error occurs.
    
    Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is
    called by sock_common_recvmsg(), MSG_DONTWAIT bit of flags variable in
    udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call:
    
        err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
                       flags & ~MSG_DONTWAIT, &addr_len);
    
    i.e. with udpv6_recvmsg() getting these values:
    
    	int noblock = flags & MSG_DONTWAIT
    	int flags = flags & ~MSG_DONTWAIT
    
    So, when udp checksum error occurs, the execution will go to
    csum_copy_err, and then the problem happens:
    
        csum_copy_err:
                ...............
                if (flags & MSG_DONTWAIT)
                        return -EAGAIN;
                goto try_again;
                ...............
    
    But it will always go to try_again as MSG_DONTWAIT has been cleared
    from flags at call time -- only noblock contains the original value
    of MSG_DONTWAIT, so the test should be:
    
                if (noblock)
                        return -EAGAIN;
    
    This is also consistent with what the ipv4/udp code does.
    
    Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
    Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Xufeng Zhang committed with gregkh Jun 21, 2011
  5. net/ipv4: Check for mistakenly passed in non-IPv4 address

    [ Upstream commit d0733d2 ]
    
    Check against mistakenly passing in IPv6 addresses (which would result
    in an INADDR_ANY bind) or similar incompatible sockaddrs.
    
    Signed-off-by: Marcus Meissner <meissner@suse.de>
    Cc: Reinhard Max <max@suse.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    msmeissn committed with gregkh Jun 2, 2011
  6. af_packet: prevent information leak

    [ Upstream commit 13fcb7b ]
    
    In 2.6.27, commit 393e52e (packet: deliver VLAN TCI to userspace)
    added a small information leak.
    
    Add a padding field and make sure it is zeroed before the copy to user.
    
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    CC: Patrick McHardy <kaber@trash.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Eric Dumazet committed with gregkh Jun 7, 2011
  7. net: filter: Use WARN_RATELIMIT

    [ Upstream commit 6c4a5cb ]
    
    A mis-configured filter can spam the logs with lots of stack traces.
    
    Rate-limit the warnings and add printout of the bogus filter information.
    
    Original-patch-by: Ben Greear <greearb@candelatech.com>
    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    JoePerches committed with gregkh May 21, 2011
  8. bug.h: Add WARN_RATELIMIT

    [ Upstream commit b3eec79 ]
    
    Add a generic mechanism to ratelimit WARN(foo, fmt, ...) messages
    using a hidden per call site static struct ratelimit_state.
    
    Also add an __WARN_RATELIMIT variant to be able to use a specific
    struct ratelimit_state.
    
    Signed-off-by: Joe Perches <joe@perches.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    JoePerches committed with gregkh May 21, 2011
  9. PM / Hibernate: Fix free_unnecessary_pages()

    commit 4d4cf23 upstream.
    
    There is a bug in free_unnecessary_pages() that causes it to
    attempt to free too many pages in some cases, which triggers the
    BUG_ON() in memory_bm_clear_bit() for copy_bm.  Namely, if
    count_data_pages() is initially greater than alloc_normal, we get
    to_free_normal equal to 0 and "save" greater than 0.  In that case,
    if the sum of "save" and count_highmem_pages() is greater than
    alloc_highmem, we subtract a positive number from to_free_normal.
    Hence, since to_free_normal was 0 before the subtraction and is
    an unsigned int, the result is converted to a huge positive number
    that is used as the number of pages to free.
    
    Fix this bug by checking if to_free_normal is actually greater
    than or equal to the number we're going to subtract from it.
    
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Reported-and-tested-by: Matthew Garrett <mjg@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    rjwysocki committed with gregkh Jul 6, 2011
  10. PM / Hibernate: Avoid hitting OOM during preallocation of memory

    commit 6715045 upstream.
    
    There is a problem in hibernate_preallocate_memory() that it calls
    preallocate_image_memory() with an argument that may be greater than
    the total number of available non-highmem memory pages.  If that's
    the case, the OOM condition is guaranteed to trigger, which in turn
    can cause significant slowdown to occur during hibernation.
    
    To avoid that, make preallocate_image_memory() adjust its argument
    before calling preallocate_image_pages(), so that the total number of
    saveable non-highmem pages left is not less than the minimum size of
    a hibernation image.  Change hibernate_preallocate_memory() to try to
    allocate from highmem if the number of pages allocated by
    preallocate_image_memory() is too low.
    
    Modify free_unnecessary_pages() to take all possible memory
    allocation patterns into account.
    
    Reported-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Tested-by: M. Vefa Bicakci <bicave@superonline.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    rjwysocki committed with gregkh Sep 11, 2010
  11. inet_diag: fix inet_diag_bc_audit()

    [ Upstream commit eeb1497 ]
    
    A malicious user or buggy application can inject code and trigger an
    infinite loop in inet_diag_bc_audit().
    
    Also make sure each instruction is aligned on a 4-byte boundary, to
    avoid unaligned accesses.
    
    Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
    Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Eric Dumazet committed with gregkh Jun 17, 2011
  12. netlink: Make nlmsg_find_attr take a const nlmsghdr*.

    commit 6b8c92b upstream.
    
    This will let us use it on a nlmsghdr stored inside a netlink_callback.
    
    Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    nelhage committed with gregkh Nov 3, 2010
  13. um: os-linux/mem.c needs sys/stat.h

    commit fb967ec upstream.
    
    The os-linux/mem.c file calls the fchmod() function, which is declared in sys/stat.h
    header file, so include it.  Fixes build breakage under FC13.
    
    Signed-off-by: Liu Aleaxander <Aleaxander@gmail.com>
    Acked-by: Boaz Harrosh <bharrosh@panasas.com>
    Cc: Jeff Dike <jdike@addtoit.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Aleaxander committed with gregkh Jun 29, 2010
  14. uml: fix CONFIG_STATIC_LINK=y build failure with newer glibc

    commit aa5fb4d upstream.
    
    With glibc 2.11 or later that was built with --enable-multi-arch, the UML
    link fails with undefined references to __rel_iplt_start and similar
    symbols.  In recent binutils, the default linker script defines these
    symbols (see ld --verbose).  Fix the UML linker scripts to match the new
    defaults for these sections.
    
    Signed-off-by: Roland McGrath <roland@redhat.com>
    Cc: Jeff Dike <jdike@addtoit.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Roland McGrath committed with gregkh Oct 26, 2010
  15. USB: don't let the hub driver prevent system sleep

    commit cbb3300 upstream.
    
    This patch (as1465) continues implementation of the policy that errors
    during suspend or hibernation should not prevent the system from going
    to sleep.
    
    In this case, failure to turn on the Suspend feature for a hub port
    shouldn't be reported as an error.  There are situations where this
    does actually occur (such as when the device plugged into that port
    was disconnected in the recent past), and it turns out to be harmless.
    There's no reason for it to prevent a system sleep.
    
    Also, don't allow the hub driver to fail a system suspend if the
    downstream ports aren't all suspended.  This is also harmless (and
    should never happen, given the change mentioned above); printing a
    warning message in the kernel log is all we really need to do.
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Alan Stern committed with gregkh Jun 15, 2011
  16. USB: don't let errors prevent system sleep

    commit 0af212b upstream.
    
    This patch (as1464) implements the recommended policy that most errors
    during suspend or hibernation should not prevent the system from going
    to sleep.  In particular, failure to suspend a USB driver or a USB
    device should not prevent the sleep from succeeding:
    
    Failure to suspend a device won't matter, because the device will
    automatically go into suspend mode when the USB bus stops carrying
    packets.  (This might be less true for USB-3.0 devices, but let's not
    worry about them now.)
    
    Failure of a driver to suspend might lead to trouble later on when the
    system wakes up, but it isn't sufficient reason to prevent the system
    from going to sleep.
    
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Alan Stern committed with gregkh Jun 15, 2011
  17. taskstats: don't allow duplicate entries in listener mode

    commit 26c4cae upstream.
    
    Currently a single process may register exit handlers unlimited times.
    It may lead to a bloated listeners chain and very slow process
    terminations.
    
    E.g. after 10KK sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs, ~300 MB of
    kernel memory is stolen for the handlers chain and "time id" shows 2-7
    seconds instead of the normal 0.003.  This makes it possible to exhaust
    all kernel memory and to eat much CPU time by triggering numerous exits
    on a single CPU.
    
    The patch limits the number of times a single process may register
    itself on a single CPU to one.
    
    One little issue is left unfixed: as taskstats_exit() is called before
    exit_files() in do_exit(), the orphaned listener entry (if it was not
    explicitly deregistered) is kept until some process next exits and it
    is implicitly deregistered in send_cpu_listeners().  So, if a process
    that registered itself as a listener exits and the next spawned process
    gets the same pid, the new process would inherit the taskstats
    attributes.
    
    Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
    Cc: Balbir Singh <bsingharora@gmail.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Vasiliy Kulikov committed with gregkh Jun 27, 2011
  18. 6pack,mkiss: fix lock inconsistency

    commit 6e4e2f8 upstream.
    
    Lockdep found a locking inconsistency in the mkiss_close function:
    
    > kernel: [ INFO: inconsistent lock state ]
    > kernel: 2.6.39.1 #3
    > kernel: ---------------------------------
    > kernel: inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
    > kernel: ax25ipd/2813 [HC0[0]:SC0[0]:HE1:SE1] takes:
    > kernel: (disc_data_lock){+++?.-}, at: [<ffffffffa018552b>] mkiss_close+0x1b/0x90 [mkiss]
    > kernel: {IN-SOFTIRQ-R} state was registered at:
    
    The message hints that disc_data_lock is taken for reading in softirq
    context, while mkiss_close takes it for writing without disabling
    softirqs, which can in rare circumstances lead to a deadlock.
    The same problem is present in the 6pack driver; this patch fixes both
    by using write_lock_bh instead of write_lock.
    
    Reported-by: Bernard F6BVP <f6bvp@free.fr>
    Tested-by: Bernard F6BVP <f6bvp@free.fr>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Acked-by: Ralf Baechle <ralf@linux-mips.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    arndb committed with gregkh Jul 2, 2011
  19. SUNRPC: Ensure the RPC client only quits on fatal signals

    commit 5afa913 upstream.
    
    Fix a couple of instances where we were exiting the RPC client on
    arbitrary signals. We should only do so on fatal signals.
    
    Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Trond Myklebust committed with gregkh Jun 17, 2011
  20. md: avoid endless recovery loop when waiting for fail device to complete.

    commit 4274215 upstream.
    
    If a device fails in a way that causes pending requests to take a while
    to complete, md will not be able to immediately remove it from the
    array in remove_and_add_spares.
    It will then incorrectly look like a spare device and md will try to
    recover it even though it has failed.
    This leads to a recovery process starting and instantly aborting over
    and over again.
    
    We should check if the device is faulty before considering it to be a
    spare.  This will avoid trying to start a recovery that cannot
    proceed.
    
    This bug was introduced in 2.6.26, so this patch is suitable for any
    kernel since then.
    
    Reported-by: Jim Paradis <james.paradis@stratus.com>
    Signed-off-by: NeilBrown <neilb@suse.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    neilbrown committed with gregkh Jun 28, 2011
  21. i2c-taos-evm: Fix log messages

    commit 9b640f2 upstream.
    
    * Print all error and information messages even when debugging is
      disabled.
    * Don't use adapter device to log messages before it is ready.
    
    Signed-off-by: Jean Delvare <khali@linux-fr.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Jean Delvare committed with gregkh Jun 29, 2011
  22. cfq-iosched: fix a rcu warning

    commit 3181faa upstream.
    
    I got an RCU warning at boot: ioc->ioc_data is rcu_dereferenced
    without holding rcu_read_lock.
    
    Signed-off-by: Shaohua Li <shaohua.li@intel.com>
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Shaohua Li committed with gregkh Jun 27, 2011
  23. cfq-iosched: fix locking around ioc->ioc_data assignment

    commit ab4bd22 upstream.
    
    Since we are modifying this RCU pointer, we need to hold
    the lock protecting it around it.
    
    This fixes a potential reuse and double free of a cfq
    io_context structure. The bug has been in CFQ for a long
    time, it hit very few people but those it did hit seemed
    to see it a lot.
    
    Tracked in RH bugzilla here:
    
    https://bugzilla.redhat.com/show_bug.cgi?id=577968
    
    Credit goes to Paul Bolle for figuring out that the issue
    was around the one-hit ioc->ioc_data cache. Thanks to his
    hard work the issue is now fixed.
    
    Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Jens Axboe committed with gregkh Jun 5, 2011
  24. debugobjects: Fix boot crash when kmemleak and debugobjects enabled

    commit 161b6ae upstream.
    
    Order of initialization looks like this:
    ...
    debugobjects
    kmemleak
    ...(lots of other subsystems)...
    workqueues (through early initcall)
    ...
    
    debugobjects uses schedule_work for batch freeing of its data and
    kmemleak heavily uses debugobjects, so when an object is freed before
    workqueues are initialized, the kernel crashes:
    
    BUG: unable to handle kernel NULL pointer dereference at           (null)
    IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
     [<ffffffff81085910>] queue_work_on+0x16/0x1d
     [<ffffffff81085abc>] queue_work+0x29/0x55
     [<ffffffff81085afb>] schedule_work+0x13/0x15
     [<ffffffff81242de1>] free_object+0x90/0x95
     [<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
     [<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
     [<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
     [<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
     [<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
     [<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
    ...
    
    because system_wq is NULL.
    
    Fix it by checking whether the workqueues subsystem was initialized
    before using it.
    
    Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Dipankar Sarma <dipankar@in.ibm.com>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Link: http://lkml.kernel.org/r/20110528112342.GA3068@joi.lan
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    mslusarz committed with gregkh May 28, 2011
  25. watchdog: mtx1-wdt: request gpio before using it

    commit 9b19d40 upstream.
    
    Otherwise, the gpiolib autorequest feature will produce a WARN_ON():
    
    WARNING: at drivers/gpio/gpiolib.c:101 0x8020ec6c()
    autorequest GPIO-215
    [...]
    
    Signed-off-by: Florian Fainelli <florian@openwrt.org>
    Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    ffainelli committed with gregkh Jun 15, 2011
  26. uvcvideo: Remove buffers from the queues when freeing

    commit 8ca2c80 upstream.
    
    When freeing memory for the video buffers also remove them from the
    irq & main queues.
    
    This fixes an oops when doing the following:
    
    open ("/dev/video", ..)
    VIDIOC_REQBUFS
    VIDIOC_QBUF
    VIDIOC_REQBUFS
    close ()
    
    The second VIDIOC_REQBUFS causes the list entries of the buffers to be
    cleared while they still hang around on the main and irq queues.
    
    Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
    Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
    Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    sjoerd-ccu committed with gregkh May 24, 2011
  27. mm: fix negative commitlimit when gigantic hugepages are allocated

    commit b0320c7 upstream.
    
    When 1GB hugepages are allocated on a system, free(1) reports less
    available memory than what really is installed in the box.  Also, if the
    total size of hugepages allocated on a system is over half of the total
    memory size, CommitLimit becomes a negative number.
    
    The problem is that gigantic hugepages (order > MAX_ORDER) can only be
    allocated at boot with bootmem, thus their frames are not accounted to
    'totalram_pages'.  However, they are accounted to hugetlb_total_pages().
    
    What happens to turn CommitLimit into a negative number is this
    calculation, in fs/proc/meminfo.c:
    
            allowed = ((totalram_pages - hugetlb_total_pages())
                    * sysctl_overcommit_ratio / 100) + total_swap_pages;
    
    A similar calculation occurs in __vm_enough_memory() in mm/mmap.c.
    
    Also, every vm statistic which depends on 'totalram_pages' will render
    confusing values, as if system were 'missing' some part of its memory.
    
    Impact of this bug:
    
    When gigantic hugepages are allocated and sysctl_overcommit_memory ==
    OVERCOMMIT_NEVER, __vm_enough_memory() goes through the mentioned
    'allowed' calculation and might end up mistakenly returning -ENOMEM,
    thus forcing the system to start reclaiming pages earlier than usual,
    which could be detrimental to overall system performance, depending on
    the workload.
    
    Besides the aforementioned scenario, I can only think of this causing
    annoyances with memory reports from /proc/meminfo and free(1).
    
    [akpm@linux-foundation.org: standardize comment layout]
    Reported-by: Russ Anderson <rja@sgi.com>
    Signed-off-by: Rafael Aquini <aquini@linux.com>
    Acked-by: Russ Anderson <rja@sgi.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Cc: Christoph Lameter <cl@linux.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    aquini committed with gregkh Jun 15, 2011
  28. ath5k: fix memory leak when fewer than N_PD_CURVES are in use

    commit a0b8de3 upstream.
    
    We would free the proper number of curves, but in the wrong
    slots, due to a missing level of indirection through
    the pdgain_idx table.
    
    It's simpler just to try to free all four slots, so do that.
    
    Signed-off-by: Bob Copeland <me@bobcopeland.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Eugene A. Shatokhin committed with gregkh Jun 29, 2011
  29. PM: Free memory bitmaps if opening /dev/snapshot fails

    commit 8440f4b upstream.
    
    When opening /dev/snapshot device, snapshot_open() creates memory
    bitmaps which are freed in snapshot_release(). But if any of the
    callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open()
    fails, snapshot_release() is never called and bitmaps are not freed.
    Next attempt to open /dev/snapshot then triggers BUG_ON() check in
    create_basic_memory_bitmaps(). This happens e.g. when vmwatchdog module
    is active on s390x.
    
    Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    mkubecek committed with gregkh Jun 18, 2011
  30. xhci: Reject double add of active endpoints.

    commit fa75ac3 upstream.
    
    While trying to switch a UAS device from the BOT configuration to the UAS
    configuration via the bConfigurationValue file, Tanya ran into an issue in
    the USB core.  usb_disable_device() sets entries in udev->ep_in and
    udev->ep_out to NULL, but doesn't call into the xHCI bandwidth management
    functions to remove the BOT configuration endpoints from the xHCI host's
    internal structures.
    
    The USB core would then attempt to add endpoints for the UAS
    configuration, and some of the endpoints had the same address as endpoints
    in the BOT configuration.  The xHCI driver blindly added the endpoints
    again, but the xHCI host controller rejected the Configure Endpoint
    command because active endpoints were added without being dropped.
    
    Make the xHCI driver reject calls to xhci_add_endpoint() that attempt to
    add active endpoints without first calling xhci_drop_endpoint().
    
    This should be backported to kernels as old as 2.6.31.
    
    Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
    Reported-by: Tanya Brokhman <tlinder@codeaurora.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Sarah Sharp committed with gregkh Jun 6, 2011
  31. TTY: ldisc, do not close until there are readers

    commit 92f6fa0 upstream.
    
    We restored tty_ldisc_wait_idle in 100eeae (TTY: restore
    tty_ldisc_wait_idle). We used it in the ldisc changing path to fix the
    case where there are tasks in n_tty_read waiting for data and somebody
    tries to change ldisc.
    
    Similar to the case above, there may be also tasks waiting in
    n_tty_read while hangup is performed. As 65b7704 (tty-ldisc: turn
    ldisc user count into a proper refcount) removed the wait-until-idle
    from all paths, hangup path won't wait for them to disappear either
    now. So add it back even to the hangup path.
    
    There is a difference: we need an uninterruptible sleep, as there is
    obviously a HUP signal pending. So tty_ldisc_wait_idle now sleeps
    without possibility to be interrupted. This is what original
    tty_ldisc_wait_idle did. After the wait idle reintroduction
    (100eeae), we have had interruptible sleeps for the ldisc changing
    path. But as there is a 5s timeout anyway, we don't allow it to be
    interrupted from now on. It's not worth the added complexity of
    deciding what kind of sleep we want.
    
    Before 65b7704 tty_ldisc_release was called also from
    tty_ldisc_release. It is called from tty_release, so I don't think we
    need to restore that one.
    
    This is nicely reproducible after constifying the timing when
    drivers/tty/n_tty.c is patched as follows ("TTY: ntty, add one more
    sanity check" patch is needed to actually see it explode):
    %% -1548,6 +1549,7 @@ static int n_tty_open(struct tty_struct *tty)
    
            /* These are ugly. Currently a malloc failure here can panic */
            if (!tty->read_buf) {
    +               msleep(100);
                    tty->read_buf = kzalloc(N_TTY_BUF_SIZE, GFP_KERNEL);
                    if (!tty->read_buf)
                            return -ENOMEM;
    %% -1785,6 +1788,7 @@ do_it_again:
                                    break;
                            }
                            timeout = schedule_timeout(timeout);
    +                       msleep(20);
                            continue;
                    }
                    __set_current_state(TASK_RUNNING);
    ===== With a process: =====
        while (1) {
            int fd = open(argv[1], O_RDWR);
            read(fd, buf, sizeof(buf));
            close(fd);
        }
    ===== and its child: =====
            setsid();
            while (1) {
                    int fd = open(tty, O_RDWR|O_NOCTTY);
                    ioctl(fd, TIOCSCTTY, 1);
                    vhangup();
                    close(fd);
                    usleep(100 * (10 + random() % 1000));
            }
    ===== EOF =====
    
    References: https://bugzilla.novell.com/show_bug.cgi?id=693374
    References: https://bugzilla.novell.com/show_bug.cgi?id=694509
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Jiri Slaby committed with gregkh Jun 5, 2011
  32. clocksource: Make watchdog robust vs. interruption

    commit b519951 upstream.
    
    The clocksource watchdog code is interruptible and it has been
    observed that this can trigger false positives which disable the TSC.
    
    The reason is that an interrupt storm or a long running interrupt
    handler between the read of the watchdog source and the read of the
    TSC brings the two far enough apart that the delta is larger than the
    unstable threshold. Move both reads into a short interrupt disabled
    region to avoid that.
    
    Reported-and-tested-by: Vernon Mauery <vernux@us.ibm.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Thomas Gleixner committed with gregkh Jun 16, 2011
  33. xen: partially revert "xen: set max_pfn_mapped to the last pfn mapped"

    commit a91d928 upstream.
    
    We only need to set max_pfn_mapped to the last pfn mapped on x86_64 to
    make sure that cleanup_highmap doesn't remove important mappings at
    _end.
    
    We don't need to do this on x86_32 because cleanup_highmap is not called
    on x86_32. Besides, lowering max_pfn_mapped on x86_32 has the unwanted
    side effect of limiting the amount of memory available for the 1:1
    kernel pagetable allocation.
    
    This patch reverts the x86_32 part of the original patch.
    
    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Stefano Stabellini committed with gregkh Jun 3, 2011
  34. migrate: don't account swapcache as shmem

    commit 99a15e2 upstream.
    
    swapcache will reach the below code path in migrate_page_move_mapping,
    and swapcache is accounted as NR_FILE_PAGES but it's not accounted as
    NR_SHMEM.
    
    Hugh pointed out we must use PageSwapCache instead of comparing
    mapping to &swapper_space, to avoid build failure with CONFIG_SWAP=n.
    
    Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
    Acked-by: Hugh Dickins <hughd@google.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Andrea Arcangeli committed with gregkh Jun 16, 2011
  35. ksm: fix NULL pointer dereference in scan_get_next_rmap_item()

    commit 2b47261 upstream.
    
    Andrea Righi reported a case where an exiting task can race against
    ksmd::scan_get_next_rmap_item (http://lkml.org/lkml/2011/6/1/742) easily
    triggering a NULL pointer dereference in ksmd.
    
    ksm_scan.mm_slot == &ksm_mm_head with only one registered mm
    
    CPU 1 (__ksm_exit)		CPU 2 (scan_get_next_rmap_item)
     				list_empty() is false
    lock				slot == &ksm_mm_head
    list_del(slot->mm_list)
    (list now empty)
    unlock
    				lock
    				slot = list_entry(slot->mm_list.next)
    				(list is empty, so slot is still ksm_mm_head)
    				unlock
    				slot->mm == NULL ... Oops
    
    Close this race by revalidating that the new slot is not simply the list
    head again.
    
    Andrea's test case:
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/mman.h>
    
    #define BUFSIZE getpagesize()
    
    int main(int argc, char **argv)
    {
    	void *ptr;
    
    	if (posix_memalign(&ptr, getpagesize(), BUFSIZE) < 0) {
    		perror("posix_memalign");
    		exit(1);
    	}
    	if (madvise(ptr, BUFSIZE, MADV_MERGEABLE) < 0) {
    		perror("madvise");
    		exit(1);
    	}
    	*(char *)NULL = 0;
    
    	return 0;
    }
    
    Reported-by: Andrea Righi <andrea@betterlinux.com>
    Tested-by: Andrea Righi <andrea@betterlinux.com>
    Cc: Andrea Arcangeli <aarcange@redhat.com>
    Signed-off-by: Hugh Dickins <hughd@google.com>
    Signed-off-by: Chris Wright <chrisw@sous-sol.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
    Hugh Dickins committed with gregkh Jun 15, 2011