Commits on Jul 19, 2016
  1. netfilter: x_tables: make sure e->next_offset covers remaining blob size

    Florian Westphal authored and mdmower committed Mar 22, 2016
    Otherwise this function may read data beyond the ruleset blob.
    Change-Id: I22f514057d3e0403d1af61f4d9555403ab9f72ea
    Signed-off-by: Florian Westphal <>
    Signed-off-by: Pablo Neira Ayuso <>
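The check named in the title above can be illustrated in userspace. This is a minimal sketch, not the kernel's code: `struct entry` and `entry_fits()` are hypothetical stand-ins, and the only idea taken from the commit message is that `next_offset` must not point past the end of the ruleset blob.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a ruleset entry header. */
struct entry {
	uint16_t target_offset;
	uint16_t next_offset;	/* distance in bytes to the next entry */
};

/*
 * Return true if the entry starting at `off` lies inside a blob of
 * `blob_size` bytes: the header must fit, next_offset must cover at least
 * the header, and the entry as sized by next_offset must end in the blob.
 */
static bool entry_fits(const unsigned char *blob, size_t blob_size, size_t off)
{
	struct entry e;

	if (off > blob_size || blob_size - off < sizeof(e))
		return false;
	memcpy(&e, blob + off, sizeof(e));	/* avoids unaligned access */
	if (e.next_offset < sizeof(e))
		return false;
	return e.next_offset <= blob_size - off;
}
```

The comparisons are written as subtractions from `blob_size` so that none of them can wrap around.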
  2. HID: hiddev: validate num_values for HIDIOCGUSAGES, HIDIOCSUSAGES com…

    ScottyBauer authored and mdmower committed Jun 23, 2016
    This patch validates the num_values parameter from userland during the
    HIDIOCGUSAGES and HIDIOCSUSAGES commands. Previously, if the report id was set
    to HID_REPORT_ID_UNKNOWN, we would fail to validate the num_values parameter
    leading to a heap overflow.
    Change-Id: I10866ee01c7ba430eab2b5cc3356c9519c7f9730
    Signed-off-by: Scott Bauer <>
    Signed-off-by: Jiri Kosina <>
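A rough sketch of the kind of validation described in the hiddev entry above, assuming hypothetical names (`usages_request_ok()`, `MAX_MULTI_USAGES`) and a simplified view of a field as just a usage count. The point it illustrates is that the user-supplied count is checked unconditionally, whatever the report id was.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit for this sketch; the real constant lives in the kernel headers. */
#define MAX_MULTI_USAGES 1024u

/*
 * Validate a user-supplied (usage_index, num_values) pair against the
 * number of usages the field actually has.
 */
static bool usages_request_ok(uint32_t usage_index, uint32_t num_values,
			      uint32_t field_maxusage)
{
	if (num_values > MAX_MULTI_USAGES)
		return false;
	if (usage_index >= field_maxusage)
		return false;
	/* Written as a subtraction so the sum cannot wrap around. */
	return num_values <= field_maxusage - usage_index;
}
```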
Commits on Jun 24, 2016
  1. net: validate the range we feed to iov_iter_init() in sys_sendto/sys_…

    Al Viro authored and mdmower committed Mar 20, 2015
    Change-Id: Ida19e5102b7faca17c685a261c20fbbf5c9614f9
    Cc: # v3.19
    Signed-off-by: Al Viro <>
    Signed-off-by: David S. Miller <>
  2. mnt: Fail collect_mounts when applied to unmounted mounts

    ebiederm authored and mdmower committed Jan 7, 2015
    The only users of collect_mounts are in audit_tree.c
    In audit_trim_trees and audit_add_tree_rule the path passed into
    collect_mounts is generated from kern_path passed an audit_tree
    pathname which is guaranteed to be an absolute path.   In those cases
    collect_mounts is obviously intended to work on mounted paths and
    if a race results in paths that are unmounted when collect_mounts
    is called it is reasonable to fail early.
    The paths passed into audit_tag_tree don't have the absolute path
    check, but are used to play with fsnotify and otherwise interact with
    the audit_trees, so again operating only on mounted paths appears
    reasonable.
    Avoid having to worry about what happens when we try and audit
    unmounted filesystems by restricting collect_mounts to mounts
    that appear in the mount tree.
    Change-Id: I2edfee6d6951a2179ce8f53785b65ddb1eb95629
    Signed-off-by: "Eric W. Biederman" <>
  3. KEYS: potential uninitialized variable

    Dan Carpenter authored and mdmower committed May 26, 2016
    If __key_link_begin() failed then "edit" would be uninitialized.  I've
    added a check to fix that.
    Change-Id: I0e28bdba07f645437db2b08daf67ca27f16c6f5c
    Fixes: f70e2e0 ('KEYS: Do preallocation for __key_link()')
    Signed-off-by: Dan Carpenter <>
  4. USB: usbfs: fix potential infoleak in devio

    kengiter authored and mdmower committed May 3, 2016
    The stack object “ci” has a total size of 8 bytes. Its last 3 bytes
    are padding bytes which are not initialized and leaked to userland
    via “copy_to_user”.
    Change-Id: Icd49231ee1862682739a871ae78a5602ee104731
    Signed-off-by: Kangjie Lu <>
    Signed-off-by: Greg Kroah-Hartman <>
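The padding leak described in the usbfs entry above can be reproduced with an ordinary struct. The sketch below is an illustration under assumptions, not the patch itself: `struct connectinfo` and `fill_connectinfo()` are hypothetical names, and zeroing the whole object first is one common remedy for this class of bug.

```c
#include <stddef.h>
#include <string.h>

/*
 * Same shape as the object described above: 4 + 1 bytes of data followed
 * by 3 bytes of padding on typical ABIs, 8 bytes in total.
 */
struct connectinfo {
	unsigned int devnum;
	unsigned char slow;
};

/*
 * Fill `out` so that every byte, padding included, has a defined value
 * before the object is copied anywhere as raw bytes.
 */
static void fill_connectinfo(struct connectinfo *out, unsigned int devnum,
			     unsigned char slow)
{
	memset(out, 0, sizeof(*out));	/* clears padding as well as members */
	out->devnum = devnum;
	out->slow = slow;
}
```

Assigning the members one by one without the `memset()` would leave the trailing padding holding whatever was previously on the stack.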
Commits on Jun 8, 2016
  1. ASoC: msm: audio-effects: fix stack overread and heap overwrite

    Ravi Kumar Alamanda authored and u-ra committed Apr 13, 2016
    Fix overwrite of updt_params allocated in heap, and stack overread
    where param pointer is passed from user space.
    Bug: 27555224
    Change-Id: Ida8bdb7da2fcb97023dce3b6eafe4b899a51cb66
    Signed-off-by: Ravi Kumar Alamanda <>
  2. msm: camera: ispif: Validate VFE num input during reset

    Suman Mukherjee authored and u-ra committed Mar 4, 2016
    Userspace supplies the actual number of used VFEs in session to ISPIF.
    Validate the userspace input value and if found to be invalid, return
    an error.
    CRs-Fixed: 898074
    Signed-off-by: Venu Yeshala <>
    Signed-off-by: Suman Mukherjee <>
    Change-Id: I3288ddb6404e817a705a92281b4c54666f372c56
  3. msm: kgsl: Add missing checks for alloc size and sglen

    Rajesh Kemisetti authored and u-ra committed Apr 13, 2016
    In _kgsl_sharedmem_page_alloc():
    - Make len of type size_t to be in line with size.
    - Check for boundary limits of requested alloc size before honoring.
    - Make sure sglen is greater than zero before marking it as end
      of sg list.
    Change-Id: I8e18aad2118f58ce677050ff4c4a4b0823c4b4b3
  4. msm: mdss: fix possible out-of-bounds and overflow issue in mdp debugfs

    adriansm authored and u-ra committed Apr 15, 2016
    There are a few cases where the count argument passed by the user
    space is not validated, which can potentially lead to out of bounds
    or overflow issues. In some cases, kernel might copy more data than
    what is requested. Add necessary checks to avoid such cases.
    Change-Id: I32ccccce3179346fd261ffc5b3a379230e7b413f
Commits on Jun 6, 2016
  1. staging: prima: Squashed update from qcom-opensource/wlan

    mdmower committed Jun 6, 2016
    Branch: LA.BF.1.1.3_rb1.13
    Up to and including:
      wlan: Fix buffer overwrite problem in CCXBEACONREQ
    Change-Id: I5e53b9ef3990aab58a851fbb872be3abec1971a0
  2. Revert "staging: prima: Squashed revert of LA.BF.1.1.3 update"

    mdmower committed Jun 6, 2016
    This reverts commit f5491ca.
    Change-Id: Ibc1932b1511061827f0ec339f4d6c4c82db4e285
Commits on Jun 5, 2016
  1. Linux 3.4.112

    lizf-os authored and u-ra committed Apr 27, 2016
  2. splice: sendfile() at once fails for big files

    chleroy authored and u-ra committed May 6, 2015
    commit 0ff28d9f4674d781e492bcff6f32f0fe48cf0fed upstream.
    Using sendfile with the small program below to get MD5 sums of some files,
    it appears that big files (over 64kbytes on a system with 4k pages) get a
    wrong MD5 sum while small files get the correct sum.
    This program uses sendfile() to send a file to an AF_ALG socket
    for hashing.
    /* md5sum2.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <fcntl.h>
    #include <sys/socket.h>
    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <linux/if_alg.h>
    int main(int argc, char **argv)
    {
    	int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
    	struct stat st;
    	struct sockaddr_alg sa = {
    		.salg_family = AF_ALG,
    		.salg_type = "hash",
    		.salg_name = "md5",
    	};
    	int n;
    	bind(sk, (struct sockaddr*)&sa, sizeof(sa));
    	for (n = 1; n < argc; n++) {
    		int size;
    		off_t offset = 0;		/* sendfile() takes an off_t * */
    		unsigned char buf[4096];	/* unsigned so %x prints one byte */
    		int fd;
    		int sko;
    		int i;
    		fd = open(argv[n], O_RDONLY);
    		sko = accept(sk, NULL, 0);
    		fstat(fd, &st);
    		size = st.st_size;
    		sendfile(sko, fd, &offset, size);
    		size = read(sko, buf, sizeof(buf));
    		for (i = 0; i < size; i++)
    			printf("%2.2x", buf[i]);
    		printf("  %s\n", argv[n]);
    		close(fd);
    		close(sko);
    	}
    	return 0;
    }
    Test below is done using official linux patch files. First result is
    with a software based md5sum. Second result is with the program above.
    root@vgoip:~# ls -l patch-3.6.*
    -rw-r--r--    1 root     root         64011 Aug 24 12:01 patch-3.6.2.gz
    -rw-r--r--    1 root     root         94131 Aug 24 12:01 patch-3.6.3.gz
    root@vgoip:~# md5sum patch-3.6.*
    b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
    c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
    root@vgoip:~# ./md5sum2 patch-3.6.*
    b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
    5fd77b24e68bb24dcc72d6e57c64790e  patch-3.6.3.gz
    After investigation, it appears that sendfile() sends the files by blocks
    of 64kbytes (16 times PAGE_SIZE). The problem is that at the end of each
    block, the SPLICE_F_MORE flag is missing, therefore the hashing operation
    is reset as if it was the end of the file.
    This patch adds SPLICE_F_MORE to the flags when more data is pending.
    With the patch applied, we get the correct sums:
    root@vgoip:~# md5sum patch-3.6.*
    b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
    c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
    root@vgoip:~# ./md5sum2 patch-3.6.*
    b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
    c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
    Signed-off-by: Christophe Leroy <>
    Signed-off-by: Jens Axboe <>
    Cc: Ben Hutchings <>
    Signed-off-by: Zefan Li <>
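The chunking behaviour described in the splice entry above can be sketched without any kernel code. This is a simplified model under assumptions: `plan_chunks()`, `CHUNK_MAX` and `FLAG_MORE` are hypothetical names (`FLAG_MORE` stands in for SPLICE_F_MORE), and the function only records which flag each 64-kbyte block would carry.

```c
#include <stddef.h>

#define CHUNK_MAX (16 * 4096u)	/* 64 kbytes: 16 pages of 4k, as described above */
#define FLAG_MORE 0x04u		/* hypothetical stand-in for SPLICE_F_MORE */

/*
 * Plan the chunks for a transfer of `total` bytes. The flags for each
 * chunk are written to flags_out (at most max_chunks entries) and the
 * number of chunks planned is returned.
 */
static size_t plan_chunks(size_t total, unsigned int *flags_out,
			  size_t max_chunks)
{
	size_t n = 0;

	while (total > 0 && n < max_chunks) {
		size_t len = total < CHUNK_MAX ? total : CHUNK_MAX;

		/* More data follows this chunk, so the consumer must not finalize. */
		flags_out[n++] = (len < total) ? FLAG_MORE : 0;
		total -= len;
	}
	return n;
}
```

With the file sizes from the listing above, the 64011-byte file fits in one chunk with no flag set, while the 94131-byte file needs two chunks and only the first carries the flag. A hash consumer that treats every unflagged chunk as end of data therefore sees one message rather than two.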
  3. usb: Use the USB_SS_MULT() macro to decode burst multiplier for log m…

    bwhacks authored and u-ra committed Nov 18, 2015
    commit 5377adb092664d336ac212499961cac5e8728794 upstream.
    usb_parse_ss_endpoint_companion() now decodes the burst multiplier
    correctly in order to check that it's <= 3, but still uses the wrong
    expression when warning that it's > 3.
    Fixes: ff30cbc8da42 ("usb: Use the USB_SS_MULT() macro to get the ...")
    Signed-off-by: Ben Hutchings <>
    Signed-off-by: Greg Kroah-Hartman <>
    Signed-off-by: Zefan Li <>
  4. mm: make sendfile(2) killable

    Jan Kara authored and u-ra committed Oct 22, 2015
    commit 296291cdd1629c308114504b850dc343eabc2782 upstream.
    Currently a simple program below issues a sendfile(2) system call which
    takes about 62 days to complete in my test KVM instance.
            int fd;
            off_t off = 0;
            fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
            ftruncate(fd, 2);
            lseek(fd, 0, SEEK_END);
            sendfile(fd, fd, &off, 0xfffffff);
    Now you should not ask the kernel to do stupid stuff like copying 256MB in
    2-byte chunks and calling fsync(2) after each chunk, but if you do, the
    sysadmin should have a way to stop you.
    We actually do have a check for fatal_signal_pending() in
    generic_perform_write() which triggers in this path. However, because we
    always succeed in writing something before the check is done, we return
    a value > 0 from generic_perform_write() and thus the information about
    the signal gets lost.
    Fix the problem by doing the signal check before writing anything.  That
    way generic_perform_write() returns -EINTR, the error gets propagated up
    and the sendfile loop terminates early.
    Signed-off-by: Jan Kara <>
    Reported-by: Dmitry Vyukov <>
    Cc: Al Viro <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Zefan Li <>
  5. crypto: api - Only abort operations on fatal signal

    herbertx authored and u-ra committed Oct 19, 2015
    commit 3fc89adb9fa4beff31374a4bf50b3d099d88ae83 upstream.
    Currently a number of Crypto API operations may fail when a signal
    occurs.  This causes nasty problems as the callers of those operations
    are often not in a good position to restart the operation.
    In fact there is currently no need for those operations to be
    interrupted by user signals at all.  All we need is for them to
    be killable.
    This patch replaces the relevant calls of signal_pending with
    fatal_signal_pending, and wait_for_completion_interruptible with
    wait_for_completion_killable, respectively.
    Signed-off-by: Herbert Xu <>
    Signed-off-by: Zefan Li <>
  6. crypto: ahash - ensure statesize is non-zero

    Russell King authored and u-ra committed Oct 9, 2015
    commit 8996eafdcbad149ac0f772fb1649fbb75c482a6a upstream.
    Unlike shash algorithms, ahash drivers must implement export
    and import as their descriptors may contain hardware state and
    cannot be exported as is.  Unfortunately some ahash drivers did
    not provide them and end up causing crashes with algif_hash.
    This patch adds a check to prevent these drivers from registering
    ahash algorithms until they are fixed.
    Signed-off-by: Russell King <>
    Signed-off-by: Herbert Xu <>
    Signed-off-by: Zefan Li <>
  7. drivers/tty: require read access for controlling terminal

    thejh authored and u-ra committed Oct 4, 2015
    commit 0c55627167870255158db1cde0d28366f91c8872 upstream.
    This is mostly a hardening fix, given that write-only access to other
    users' ttys is usually only given through setgid tty executables.
    Signed-off-by: Jann Horn <>
    Signed-off-by: Greg Kroah-Hartman <>
    [lizf: Backported to 3.4: adjust context]
    Signed-off-by: Zefan Li <>
  8. tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c

    Kosuke Tatsukawa authored and u-ra committed Oct 2, 2015
    commit e81107d4c6bd098878af9796b24edc8d4a9524fd upstream.
    My colleague ran into a program stall on a x86_64 server, where
    n_tty_read() was waiting for data even if there was data in the buffer
    in the pty.  kernel stack for the stuck process looks like below.
     #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
     #1 [ffff88303d107bd0] schedule at ffffffff815c513e
     #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
     #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
     #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
     #5 [ffff88303d107dd0] tty_read at ffffffff81368013
     #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
     #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
     #8 [ffff88303d107f00] sys_read at ffffffff811a4306
     #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7
    There seem to be two problems causing this issue.
    First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
    updates ldata->commit_head using smp_store_release() and then checks
    the wait queue using waitqueue_active().  However, since there is no
    memory barrier, __receive_buf() could return without calling
    wake_up_interruptible_poll(), and at the same time, n_tty_read() could
    start to wait in wait_woken() as in the following chart.
            __receive_buf()                         n_tty_read()
    if (waitqueue_active(&tty->read_wait))
    /* Memory operations issued after the
       RELEASE may be completed before the
       RELEASE operation has completed */
                                            add_wait_queue(&tty->read_wait, &wait);
                                            if (!input_available_p(tty, 0)) {
                                            timeout = wait_woken(&wait,
                                              TASK_INTERRUPTIBLE, timeout);
    The second problem is that n_tty_read() also lacks a memory barrier
    call and could also cause __receive_buf() to return without calling
    wake_up_interruptible_poll(), and n_tty_read() to wait in wait_woken()
    as in the chart below.
            __receive_buf()                         n_tty_read()
                                            spin_lock_irqsave(&q->lock, flags);
                                            /* from add_wait_queue() */
                                            if (!input_available_p(tty, 0)) {
                                            /* Memory operations issued after the
                                               RELEASE may be completed before the
                                               RELEASE operation has completed */
    if (waitqueue_active(&tty->read_wait))
                                            __add_wait_queue(q, wait);
                                            /* from add_wait_queue() */
                                            timeout = wait_woken(&wait,
                                              TASK_INTERRUPTIBLE, timeout);
    There are also other places in drivers/tty/n_tty.c which have similar
    calls to waitqueue_active(), so instead of adding many memory barrier
    calls, this patch simply removes the call to waitqueue_active(),
    leaving just wake_up*() behind.
    This fixes both problems because, even though the memory access before
    or after the spinlocks in both wake_up*() and add_wait_queue() can
    sneak into the critical section, it cannot go past it and the critical
    section assures that they will be serialized (please see "INTER-CPU
    ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
    better explanation).  Moreover, the resulting code is much simpler.
    Latency measurement using a ping-pong test over a pty doesn't show any
    visible performance drop.
    Signed-off-by: Kosuke Tatsukawa <>
    Signed-off-by: Greg Kroah-Hartman <>
    [lizf: Backported to 3.4:
     - adjust context
     - s/wake_up_interruptible_poll/wake_up_interruptible/
     - drop changes to __receive_buf() and n_tty_set_termios()]
    Signed-off-by: Zefan Li <>
  9. clocksource: Fix abs() usage w/ 64bit values

    johnstultz-work authored and u-ra committed Sep 15, 2015
    commit 67dfae0cd72fec5cd158b6e5fb1647b7dbe0834c upstream.
    This patch fixes one case where abs() was being used with 64-bit
    nanosecond values, where the result may be capped at 32 bits.
    This potentially could cause watchdog false negatives on 32-bit
    systems, so this patch addresses the issue by using abs64().
    Signed-off-by: John Stultz <>
    Cc: Prarit Bhargava <>
    Cc: Richard Cochran <>
    Cc: Ingo Molnar <>
    Signed-off-by: Thomas Gleixner <>
    [lizf: Backported to 3.4: adjust context]
    Signed-off-by: Zefan Li <>
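To see why narrowing matters in the clocksource entry above, here is a small userspace illustration. It does not reproduce the kernel's abs() or abs64() macros; `abs_narrowed()` and `abs_wide()` are hypothetical helpers that mimic taking an absolute value after and before truncation to 32 bits.

```c
#include <stdint.h>

/* Mimics taking the absolute value after the input was narrowed to 32 bits. */
static int64_t abs_narrowed(int64_t v)
{
	/* Out-of-range narrowing is implementation-defined; it wraps on common ABIs. */
	int32_t t = (int32_t)v;

	return t < 0 ? -(int64_t)t : (int64_t)t;
}

/* Absolute value computed at full 64-bit width (undefined for INT64_MIN). */
static int64_t abs_wide(int64_t v)
{
	return v < 0 ? -v : v;
}
```

A 5-second difference expressed in nanoseconds (5000000000) does not fit in 32 bits. After wrapping, the narrowed form sees well under one second, which is how a large skew could go unnoticed.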
  10. mm: hugetlbfs: skip shared VMAs when unmapping private pages to satis…

    gormanm authored and u-ra committed Oct 1, 2015
    …fy a fault
    commit 2f84a8990ebbe235c59716896e017c6b2ca1200f upstream.
    SunDong reported the following on
    	I think I find a linux bug, I have the test cases is constructed. I
    	can stable recurring problems in fedora22(4.0.4) kernel version,
    	arch for x86_64.  I construct transparent huge page, when the parent
    	and child process with MAP_SHARE, MAP_PRIVATE way to access the same
    	huge page area, it has the opportunity to lead to huge page copy on
    	write failure, and then it will munmap the child corresponding mmap
    	area, but then the child mmap area with VM_MAYSHARE attributes, child
    	process munmap this area can trigger VM_BUG_ON in set_vma_resv_flags
    	functions (vma - > vm_flags & VM_MAYSHARE).
    There were a number of problems with the report (e.g.  it's hugetlbfs that
    triggers this, not transparent huge pages) but it was fundamentally
    correct in that a VM_BUG_ON in set_vma_resv_flags() can be triggered that
    looks like this
    	 vma ffff8804651fd0d0 start 00007fc474e00000 end 00007fc475e00000
    	 next ffff8804651fd018 prev ffff8804651fd188 mm ffff88046b1b1800
    	 prot 8000000000000027 anon_vma           (null) vm_ops ffffffff8182a7a0
    	 pgoff 0 file ffff88106bdb9800 private_data           (null)
    	 flags: 0x84400fb(read|write|shared|mayread|maywrite|mayexec|mayshare|dontexpand|hugetlb)
    	 kernel BUG at mm/hugetlb.c:462!
    	 Modules linked in: xt_pkttype xt_LOG xt_limit [..]
    	 CPU: 38 PID: 26839 Comm: map Not tainted 4.0.4-default #1
    	 Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
    The VM_BUG_ON is correct because private and shared mappings have
    different reservation accounting but the warning clearly shows that the
    VMA is shared.
    When a private COW fails to allocate a new page then only the process
    that created the VMA gets the page -- all the children unmap the page.
    If the children access that data in the future then they get killed.
    The problem is that the same file is mapped shared and private.  During
    the COW, the allocation fails, the VMAs are traversed to unmap the other
    private pages but a shared VMA is found and the bug is triggered.  This
    patch identifies such VMAs and skips them.
    Signed-off-by: Mel Gorman <>
    Reported-by: SunDong <>
    Reviewed-by: Michal Hocko <>
    Cc: Andrea Arcangeli <>
    Cc: Hugh Dickins <>
    Cc: Naoya Horiguchi <>
    Cc: David Rientjes <>
    Reviewed-by: Naoya Horiguchi <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    Signed-off-by: Zefan Li <>
  11. genirq: Fix race in register_irq_proc()

    bwhacks authored and u-ra committed Sep 26, 2015
    commit 95c2b17534654829db428f11bcf4297c059a2a7e upstream.
    Per-IRQ directories in procfs are created only when a handler is first
    added to the irqdesc, not when the irqdesc is created.  In the case of
    a shared IRQ, multiple tasks can race to create a directory.  This
    race condition seems to have been present forever, but is easier to
    hit with async probing.
    Signed-off-by: Ben Hutchings <>
    Signed-off-by: Thomas Gleixner <>
    Signed-off-by: Zefan Li <>
  12. usb: Use the USB_SS_MULT() macro to get the burst multiplier.

    matnyman authored and u-ra committed Sep 21, 2015
    commit ff30cbc8da425754e8ab96904db1d295bd034f27 upstream.
    Bits 1:0 of the bmAttributes are used for the burst multiplier.
    The rest of the bits used to be reserved (zero), but USB3.1 takes bit 7
    into use.
    Use the existing USB_SS_MULT() macro instead to make sure the mult value
    and hence max packet calculations are correct for USB3.1 devices.
    Note that burst multiplier in bmAttributes is zero based and that
    the USB_SS_MULT() macro adds one.
    Signed-off-by: Mathias Nyman <>
    Signed-off-by: Greg Kroah-Hartman <>
    Signed-off-by: Zefan Li <>
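The bit layout described in the USB_SS_MULT() entry above, as a userspace sketch. The macro is written here to match that description (bits 1:0, plus one). `ss_isoc_bytes_per_interval()` is a hypothetical helper, and its formula (packet size times burst count times mult) is an assumption made for illustration.

```c
#include <stdint.h>

/* Bits 1:0 of bmAttributes hold the zero-based multiplier; the macro adds one. */
#define USB_SS_MULT(p) (1 + ((p) & 0x3))

/*
 * Hypothetical helper: upper bound on bytes per service interval for a
 * SuperSpeed isochronous endpoint. max_burst is zero-based, like mult.
 */
static uint32_t ss_isoc_bytes_per_interval(uint16_t max_packet,
					   uint8_t max_burst,
					   uint8_t bm_attributes)
{
	return (uint32_t)max_packet * (max_burst + 1u) *
	       (uint32_t)USB_SS_MULT(bm_attributes);
}
```

With bmAttributes = 0x82 (bit 7 set, mult field 2) the macro yields 3, whereas a naive `bmAttributes + 1` would yield 131 and inflate the packet calculation.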
  13. regmap: debugfs: Don't bother actually printing when calculating max …

    broonie authored and u-ra committed Sep 19, 2015
    commit 176fc2d5770a0990eebff903ba680d2edd32e718 upstream.
    The in-kernel snprintf() will conveniently return the actual length of
    the printed string even if not given an output buffer at all so just do
    that rather than relying on the user to pass in a suitable buffer,
    ensuring that we don't need to worry if the buffer was truncated due to
    the size of the buffer passed in.
    Reported-by: Rasmus Villemoes <>
    Signed-off-by: Mark Brown <>
    Signed-off-by: Zefan Li <>
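The length-only use of snprintf() described in the regmap entry above also works with the standard C99 function in userspace. A minimal sketch, where `hex_width()` is a hypothetical helper name:

```c
#include <stdio.h>

/* Number of characters needed to print `value` in hex, excluding the NUL. */
static int hex_width(unsigned int value)
{
	/* With a NULL buffer and size 0 nothing is written; only the length is returned. */
	return snprintf(NULL, 0, "%x", value);
}
```

Because no buffer is involved, the result cannot be affected by truncation.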
  14. regmap: debugfs: Ensure we don't underflow when printing access masks

    broonie authored and u-ra committed Sep 19, 2015
    commit b763ec17ac762470eec5be8ebcc43e4f8b2c2b82 upstream.
    If a read is attempted which is smaller than the line length then we may
    underflow the subtraction we're doing with the unsigned size_t type so
    move some of the calculation to be additions on the right hand side
    instead in order to avoid this.
    Reported-by: Rasmus Villemoes <>
    Signed-off-by: Mark Brown <>
    Signed-off-by: Zefan Li <>
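The underflow described in the entry above is easy to demonstrate with size_t arithmetic. The two helpers below are hypothetical and only contrast the shapes of the comparison: subtraction on the right-hand side versus additions on the left.

```c
#include <stdbool.h>
#include <stddef.h>

/* Unsafe form: count - 1 - line_len wraps to a huge value when count is small. */
static bool buffer_full_subtract(size_t pos, size_t count, size_t line_len)
{
	return pos >= count - 1 - line_len;
}

/* Safer form: the same inequality with the terms moved over as additions. */
static bool buffer_full_add(size_t pos, size_t count, size_t line_len)
{
	return pos + line_len + 1 >= count;
}
```

For a 4-byte read and a 10-character line, the subtracting form reports that there is still room while the adding form correctly reports the buffer as full. The two agree whenever `count` is larger than the line length.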
  15. spi: Fix documentation of spi_alloc_master()

    groeck authored and u-ra committed Sep 5, 2015
    commit a394d635193b641f2c86ead5ada5b115d57c51f8 upstream.
    Actually, spi_master_put() after spi_alloc_master() must _not_ be followed
    by kfree(). The memory is already freed with the call to spi_master_put()
    through spi_master_class, which registers a release function. Calling both
    spi_master_put() and kfree() results in often nasty (and delayed) crashes
    elsewhere in the kernel, often in the networking stack.
    This reverts commit eb4af0f.
    Link to patch and concerns:
    Alexey Klimov: This revert becomes valid after
    94c69f765f1b4a658d96905ec59928e3e3e07e6a when spi-imx.c
    has been fixed and there is no need to call kfree() so comment
    for spi_alloc_master() should be fixed.
    Signed-off-by: Guenter Roeck <>
    Signed-off-by: Alexey Klimov <>
    Signed-off-by: Mark Brown <>
    Signed-off-by: Zefan Li <>
  16. sched/core: Fix TASK_DEAD race in finish_task_switch()

    Peter Zijlstra authored and u-ra committed Sep 29, 2015
    commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 upstream.
    So the problem this patch is trying to address is as follows:
            CPU0                            CPU1
            context_switch(A, B)
                                              LOCK A->pi_lock
                                              A->on_cpu == 0
              prev_state = A->state  <-.
              WMB                      |
              A->on_cpu = 0;           |
              UNLOCK rq0->lock         |
                                       |    context_switch(C, A)
                                       `--  A->state = TASK_DEAD
              prev_state == TASK_DEAD
                                            context_switch(A, C)
                                              A->state == TASK_DEAD
    The argument being that the WMB will allow the load of A->state on CPU0
    to cross over and observe CPU1's store of A->state, which will then
    result in a double-drop and use-after-free.
    Now the comment states (and this was true once upon a long time ago)
    that we need to observe A->state while holding rq->lock because that
    will order us against the wakeup; however the wakeup will not in fact
    acquire (that) rq->lock; it takes A->pi_lock these days.
    We can obviously fix this by upgrading the WMB to an MB, but that is
    expensive, so we'd rather avoid that.
    The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
    which avoids the MB on some archs, but not important ones like ARM.
    Reported-by: Oleg Nesterov <>
    Signed-off-by: Peter Zijlstra (Intel) <>
    Acked-by: Linus Torvalds <>
    Cc: Peter Zijlstra <>
    Cc: Thomas Gleixner <>
    Fixes: e4a52bc ("sched: Remove rq->lock from the first half of ttwu()")
    Signed-off-by: Ingo Molnar <>
    [lizf: Backported to 3.4: use smp_mb() instead of smp_store_release(), which
     is not defined in 3.4.y]
    Signed-off-by: Zefan Li <>
  17. ipv6: Fix IPsec pre-encap fragmentation check

    herbertx authored and u-ra committed Sep 4, 2015
    commit 93efac3f2e03321129de67a3c0ba53048bb53e31 upstream.
    The IPv6 IPsec pre-encap path performs fragmentation for tunnel-mode
    packets.  That is, we perform fragmentation pre-encap rather than
    post-encap.
    A check was added later to ensure that proper MTU information is
    passed back for locally generated traffic.  Unfortunately this
    check was performed on all IPsec packets, including transport-mode
    packets.
    What's more, the check failed to take GSO into account.
    The end result is that transport-mode GSO packets get dropped at
    the check.
    This patch fixes it by moving the tunnel mode check forward as well
    as adding the GSO check.
    Fixes: dd76785 ("xfrm6: Don't call icmpv6_send on local error")
    Signed-off-by: Herbert Xu <>
    Signed-off-by: Steffen Klassert <>
    [lizf: Backported to 3.4:
     - adjust context
     - s/ignore_df/local_df]
    Signed-off-by: Zefan Li <>
  18. module: Fix locking in symbol_put_addr()

    Peter Zijlstra authored and u-ra committed Aug 20, 2015
    commit 275d7d44d802ef271a42dc87ac091a495ba72fc5 upstream.
    Poma (on the way to another bug) reported an assertion triggering:
      [<ffffffff81150529>] module_assert_mutex_or_preempt+0x49/0x90
      [<ffffffff81150822>] __module_address+0x32/0x150
      [<ffffffff81150956>] __module_text_address+0x16/0x70
      [<ffffffff81150f19>] symbol_put_addr+0x29/0x40
      [<ffffffffa04b77ad>] dvb_frontend_detach+0x7d/0x90 [dvb_core]
    Laura Abbott <> produced a patch which led us to
    inspect symbol_put_addr(). This function has a comment claiming it
    doesn't need to disable preemption around the module lookup
    because it holds a reference to the module it wants to find, which
    therefore cannot go away.
    This is wrong (and a false optimization too, preempt_disable() is really
    rather cheap, and I doubt any of this is on uber critical paths,
    otherwise it would've retained a pointer to the actual module anyway and
    avoided the second lookup).
    While it's true that the module cannot go away while we hold a reference
    on it, the data structure we do the lookup in very much _CAN_ change
    while we do the lookup. Therefore fix the comment and add the
    required preempt_disable().
    Reported-by: poma <>
    Signed-off-by: Peter Zijlstra (Intel) <>
    Signed-off-by: Rusty Russell <>
    Fixes: a6e6abd ("module: remove module_text_address()")
    Signed-off-by: Zefan Li <>
  19. ARM: fix Thumb2 signal handling when ARMv6 is enabled

    Russell King authored and u-ra committed Sep 11, 2015
    commit 9b55613f42e8d40d5c9ccb8970bde6af4764b2ab upstream.
    When a kernel is built covering ARMv6 to ARMv7, we omit to clear the
    IT state when entering a signal handler.  This can cause the first
    few instructions to be conditionally executed depending on the parent
    context.
    In any case, the original test for >= ARMv7 is broken - ARMv6 can have
    Thumb-2 support as well, and an ARMv6T2 specific build would omit this
    code too.
    Relax the test back to ARMv6 or greater.  This results in us always
    clearing the IT state bits in the PSR, even on CPUs where these bits
    are reserved.  However, they're reserved for the IT state, so this
    should cause no harm.
    Fixes: d71e135 ("Clear the IT state when invoking a Thumb-2 signal handler")
    Acked-by: Tony Lindgren <>
    Tested-by: H. Nikolaus Schaller <>
    Tested-by: Grazvydas Ignotas <>
    Signed-off-by: Russell King <>
    Signed-off-by: Zefan Li <>
  20. perf header: Fixup reading of HEADER_NRCPUS feature

    Arnaldo Carvalho de Melo authored and u-ra committed Sep 11, 2015
    commit caa470475d9b59eeff093ae650800d34612c4379 upstream.
    The original patch introducing this header wrote the number of CPUs available
    and online in one order and then swapped those values when reading, fix it.
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 4
      # echo 0 > /sys/devices/system/cpu/cpu2/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 3
      # echo 0 > /sys/devices/system/cpu/cpu1/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 2
    After the fix, bringing back the CPUs online:
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 2
      # nrcpus avail : 4
      # echo 1 > /sys/devices/system/cpu/cpu2/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 3
      # nrcpus avail : 4
      # echo 1 > /sys/devices/system/cpu/cpu1/online
      # perf record usleep 1
      # perf report --header-only | grep 'nrcpus \(online\|avail\)'
      # nrcpus online : 4
      # nrcpus avail : 4
    Acked-by: Namhyung Kim <>
    Cc: Adrian Hunter <>
    Cc: Borislav Petkov <>
    Cc: David Ahern <>
    Cc: Frederic Weisbecker <>
    Cc: Jiri Olsa <>
    Cc: Kan Liang <>
    Cc: Stephane Eranian <>
    Cc: Wang Nan <>
    Fixes: fbe96f2 ("perf tools: Make more self-descriptive (v8)")
    Signed-off-by: Arnaldo Carvalho de Melo <>
    [lizf: Backported to 3.4: fix it by saving values in an array and then print
     it in reverse order]
    Signed-off-by: Zefan Li <>
  21. ARM: 8429/1: disable GCC SRA optimization

    ardbiesheuvel authored and u-ra committed Sep 3, 2015
    commit a077224fd35b2f7fbc93f14cf67074fc792fbac2 upstream.
    While working on the 32-bit ARM port of UEFI, I noticed a strange
    corruption in the kernel log. The following snprintf() statement
    (in drivers/firmware/efi/efi.c:efi_md_typeattr_format())
    	snprintf(pos, size, "|%3s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",
    was producing the following output in the log:
    	|    |   |   |   |    |WB|WT|WC|UC]
    	|    |   |   |   |    |WB|WT|WC|UC]
    	|    |   |   |   |    |WB|WT|WC|UC]
    	|RUN|   |   |   |    |WB|WT|WC|UC]*
    	|RUN|   |   |   |    |WB|WT|WC|UC]*
    	|    |   |   |   |    |WB|WT|WC|UC]
    	|RUN|   |   |   |    |WB|WT|WC|UC]*
    	|    |   |   |   |    |WB|WT|WC|UC]
    	|RUN|   |   |   |    |   |   |   |UC]
    	|RUN|   |   |   |    |   |   |   |UC]
    As it turns out, this is caused by incorrect code being emitted for
    the string() function in lib/vsprintf.c. The following code
    	if (!(spec.flags & LEFT)) {
    		while (len < spec.field_width--) {
    			if (buf < end)
    				*buf = ' ';
    			++buf;
    		}
    	}
    	for (i = 0; i < len; ++i) {
    		if (buf < end)
    			*buf = *s;
    		++buf; ++s;
    	}
    	while (len < spec.field_width--) {
    		if (buf < end)
    			*buf = ' ';
    		++buf;
    	}
    when called with len == 0, triggers an issue in the GCC SRA optimization
    pass (Scalar Replacement of Aggregates), which handles promotion of signed
    struct members incorrectly. This is a known but as yet unresolved issue.
    In this particular
    case, it is causing the second while loop to be executed erroneously a
    single time, causing the additional space characters to be printed.
    So disable the optimization by passing -fno-ipa-sra.
    Acked-by: Nicolas Pitre <>
    Signed-off-by: Ard Biesheuvel <>
    Signed-off-by: Russell King <>
    Signed-off-by: Zefan Li <>
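    The padding logic described in the commit above can be exercised in
    userspace with a minimal sketch like the following. It is not the kernel's
    string() function: struct pad_spec and pad_string() are hypothetical
    stand-ins, and an index-based bounds check replaces the buf/end pointer
    pair. The loop structure is the same, so a correctly compiled build should
    never run the trailing loop after right-justified padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's printf_spec; not the real layout. */
struct pad_spec {
	int field_width;	/* signed struct member, as in the commit text */
	int left;		/* nonzero: left-justify, i.e. pad on the right */
};

/*
 * Copy s into buf with space padding, writing at most size bytes.
 * Returns the number of characters the full output needs.
 * The output is not NUL-terminated; the caller handles that.
 */
static size_t pad_string(char *buf, size_t size, const char *s,
			 struct pad_spec spec)
{
	size_t pos = 0;
	int len = (int)strlen(s);
	int i;

	assert(buf != NULL && s != NULL);

	if (!spec.left) {
		/* Right-justify: emit leading spaces first. */
		while (len < spec.field_width--) {
			if (pos < size)
				buf[pos] = ' ';
			++pos;
		}
	}
	for (i = 0; i < len; ++i) {
		if (pos < size)
			buf[pos] = s[i];
		++pos;
	}
	/*
	 * Trailing spaces for left-justified output. After the first loop
	 * has run, field_width is already below len, so this must not execute.
	 */
	while (len < spec.field_width--) {
		if (pos < size)
			buf[pos] = ' ';
		++pos;
	}
	return pos;
}
```

    With an empty string, field_width 3 and left 0, this sketch needs exactly
    three characters. The miscompiled kernel code described above ran the
    trailing loop one extra time and so emitted a fourth space.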
  22. fs: create and use seq_show_option for escaping

    kees authored and u-ra committed Sep 4, 2015
    commit a068acf2ee77693e0bf39d6e07139ba704f461c3 upstream.
    Many file systems that implement the show_options hook fail to correctly
    escape their output which could lead to unescaped characters (e.g.  new
    lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files.  This
    could lead to confusion, spoofed entries (resulting in things like
    systemd issuing false d-bus "mount" notifications), and who knows what
    else.  This looks like it would only be the root user stepping on
    themselves, but it's possible weird things could happen in containers or
    in other situations with delegated mount privileges.
    Here's an example using overlay with setuid fusermount trusting the
    contents of /proc/mounts (via the /etc/mtab symlink).  Imagine the use
    of "sudo" is something more sneaky:
      $ BASE="ovl"
      $ MNT="$BASE/mnt"
      $ LOW="$BASE/lower"
      $ UP="$BASE/upper"
      $ WORK="$BASE/work/ 0 0
      none /proc fuse.pwn user_id=1000"
      $ mkdir -p "$LOW" "$UP" "$WORK"
      $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
      $ cat /proc/mounts
      none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
      none /proc fuse.pwn user_id=1000 0 0
      $ fusermount -u /proc
      $ cat /proc/mounts
      cat: /proc/mounts: No such file or directory
    This fixes the problem by adding new seq_show_option and
    seq_show_option_n helpers, and updating the vulnerable show_option
    handlers to use them as needed.  Some, like SELinux, need to be open
    coded due to unusual existing escape mechanisms.
    [ add lost chunk, per Kees]
    [ seq_show_option should be using const parameters]
    Signed-off-by: Kees Cook <>
    Acked-by: Serge Hallyn <>
    Acked-by: Jan Kara <>
    Acked-by: Paul Moore <>
    Cc: J. R. Okajima <>
    Signed-off-by: Kees Cook <>
    Signed-off-by: Andrew Morton <>
    Signed-off-by: Linus Torvalds <>
    [lizf: Backported to 3.4:
     - adjust context
     - one more place in ceph needs to be changed
     - drop changes to overlayfs
     - drop showing vers in cifs]
    Signed-off-by: Zefan Li <>
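    The escaping idea behind the commit above can be sketched in userspace as
    follows. This is an illustration only, not the kernel helper:
    append_escaped() and show_option() are hypothetical names, they write into
    a plain character buffer instead of a seq_file, and the escape sets and
    the three-digit octal format are assumptions about how the output is
    encoded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Append src to the NUL-terminated string in dst (buffer of `size` bytes),
 * replacing each byte listed in `esc` with a backslash and three octal digits.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int append_escaped(char *dst, size_t size, const char *src,
			  const char *esc)
{
	size_t pos = strlen(dst);

	assert(pos < size);

	for (; *src; ++src) {
		unsigned char c = (unsigned char)*src;

		if (strchr(esc, c)) {
			if (pos + 4 >= size)	/* 4 bytes plus the NUL */
				return -1;
			dst[pos++] = '\\';
			dst[pos++] = (char)('0' + ((c >> 6) & 7));
			dst[pos++] = (char)('0' + ((c >> 3) & 7));
			dst[pos++] = (char)('0' + (c & 7));
		} else {
			if (pos + 1 >= size)	/* 1 byte plus the NUL */
				return -1;
			dst[pos++] = (char)c;
		}
		dst[pos] = '\0';
	}
	return 0;
}

/* Append ",name" or ",name=value" with separators in the data escaped. */
static int show_option(char *dst, size_t size, const char *name,
		       const char *value)
{
	if (append_escaped(dst, size, ",", "") < 0 ||
	    append_escaped(dst, size, name, ",= \t\n\\") < 0)
		return -1;
	if (value) {
		if (append_escaped(dst, size, "=", "") < 0 ||
		    append_escaped(dst, size, value, ", \t\n\\") < 0)
			return -1;
	}
	return 0;
}
```

    Passing the multi-line workdir value from the example above through
    show_option() leaves no raw space or newline in the output, so the value
    can no longer start a second, spoofed line in a mounts-style listing.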
  23. drivercore: Fix unregistration path of platform devices

    glikely authored and u-ra committed Jun 7, 2015
    commit 7f5dcaf1fdf289767a126a0a5cc3ef39b5254b06 upstream.
    The unregister path of platform_device is broken. On registration, it
    will register all resources with either a parent already set, or
    type==IORESOURCE_{IO,MEM}. However, on unregister it will release
    everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There
    are also cases where resources don't get registered in the first place,
    like with devices created by of_platform_populate()*.
    Fix the unregister path to be symmetrical with the register path by
    checking the parent pointer instead of the type field to decide which
    resources to unregister. This is safe because the upshot of the
    registration path algorithm is that registered resources have a parent
    pointer, and non-registered resources do not.
    * It can be argued that of_platform_populate() should be registering
      its resources, and the argument has some merit. However, there are
      quite a few platforms that end up broken if we try to do that due to
      overlapping resources in the device tree. Until that is fixed, we need
      to solve the immediate problem.
    Cc: Pantelis Antoniou <>
    Cc: Wolfram Sang <>
    Cc: Rob Herring <>
    Cc: Greg Kroah-Hartman <>
    Cc: Ricardo Ribalda Delgado <>
    Signed-off-by: Grant Likely <>
    Tested-by: Ricardo Ribalda Delgado <>
    Tested-by: Wolfram Sang <>
    Signed-off-by: Rob Herring <>
    Signed-off-by: Zefan Li <>
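    The asymmetry described in the commit above can be modelled in userspace
    with a small sketch. Everything here is a hypothetical stand-in: struct
    res, the RES_* types and the three functions only imitate the decision
    logic (who gets inserted, who gets released) and do not call any kernel
    resource API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the IORESOURCE_* types. */
enum res_type { RES_IO, RES_MEM, RES_IRQ };

struct res {
	enum res_type type;
	struct res *parent;	/* non-NULL once inserted into a tree */
};

static struct res io_root = { RES_IO, NULL };
static struct res mem_root = { RES_MEM, NULL };

/* Register path: insert anything with a parent, or of type IO/MEM. */
static int register_resources(struct res *r, int n)
{
	int i, inserted = 0;

	for (i = 0; i < n; ++i) {
		struct res *p = r[i].parent;

		if (!p) {
			if (r[i].type == RES_MEM)
				p = &mem_root;
			else if (r[i].type == RES_IO)
				p = &io_root;
		}
		if (p) {
			r[i].parent = p;	/* models inserting the resource */
			++inserted;
		}
	}
	return inserted;
}

/* Unregister path keyed on type (before the fix): counts IO/MEM only. */
static int unregister_by_type(const struct res *r, int n)
{
	int i, released = 0;

	for (i = 0; i < n; ++i)
		if (r[i].type == RES_IO || r[i].type == RES_MEM)
			++released;
	return released;
}

/* Unregister path keyed on the parent pointer (after the fix). */
static int unregister_resources(struct res *r, int n)
{
	int i, released = 0;

	for (i = 0; i < n; ++i) {
		if (r[i].parent) {
			r[i].parent = NULL;	/* models releasing the resource */
			++released;
		}
	}
	return released;
}
```

    For a device with one MEM resource, one IRQ resource whose parent was set
    in advance, and one IRQ resource without a parent, the register path
    inserts two resources. The type-based path would release only one of
    them, while the parent-based path releases both.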