Commits on Jul 25, 2012
  1. Linux 3.2.24

    Ben Hutchings authored
  2. @BlueDragonX

    HID: add support for 2012 MacBook Pro Retina

    BlueDragonX authored Ben Hutchings committed
    commit b2e6ad7 upstream.
    
    Add support for the 15'' MacBook Pro Retina. The keyboard is
    the same as recent models.
    
    The patch needs to be synchronized with the bcm5974 patch for
    the trackpad - as usual.
    
    Patch originally written by clipcarl (forums.opensuse.org).
    
    [rydberg@euromail.se: Amended mouse ignore lines]
    Signed-off-by: Ryan Bourgeois <bluedragonx@gmail.com>
    Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
    Acked-by: Jiri Kosina <jkosina@suse.cz>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  3. @yurikhan

    Input: xpad - add Andamiro Pump It Up pad

    yurikhan authored Ben Hutchings committed
    commit e76b8ee upstream.
    
    I couldn't find the vendor ID in any of the online databases, but this
    mat has a Pump It Up logo on the top side of the controller compartment,
    and a disclaimer stating that Andamiro will not be liable on the bottom.
    
    Signed-off-by: Yuri Khan <yurivkhan@gmail.com>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  4. @K900

    Input: xpad - add signature for Razer Onza Tournament Edition

    K900 authored Ben Hutchings committed
    commit cc71a7e upstream.
    
    Signed-off-by: Ilia Katsnelson <k0009000@gmail.com>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  5. @yurikhan

    Input: xpad - handle all variations of Mad Catz Beat Pad

    yurikhan authored Ben Hutchings committed
    commit 3ffb62c upstream.
    
    The device should be handled by xpad driver instead of generic HID driver.
    
    Signed-off-by: Yuri Khan <yurivkhan@gmail.com>
    Acked-by: Jiri Kosina <jkosina@suse.cz>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  6. @rydberg

    Input: bcm5974 - Add support for 2012 MacBook Pro Retina

    rydberg authored Ben Hutchings committed
    commit 3dde22a upstream.
    
    Add support for the 15'' MacBook Pro Retina model (MacBookPro10,1).
    
    Patch originally written by clipcarl (forums.opensuse.org).
    
    Signed-off-by: Henrik Rydberg <rydberg@euromail.se>
    Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  7. @ebiederm

    bonding: Manage /proc/net/bonding/ entries from the netdev events

    ebiederm authored Ben Hutchings committed
    commit a64d49c upstream.
    
    It was recently reported that moving a bonding device between network
    namespaces causes warnings from /proc.  It turns out after the move we
    were trying to add and to remove the /proc/net/bonding entries from the
    wrong network namespace.
    
    Move the bonding /proc registration code into the NETDEV_REGISTER and
    NETDEV_UNREGISTER events where the proc registration and unregistration
    will always happen at the right time.
    
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  8. @ebiederm

    bonding: debugfs and network namespaces are incompatible

    ebiederm authored Ben Hutchings committed
    commit 96ca7ff upstream.
    
    The bonding debugfs support has been broken in the presence of network
    namespaces since it was added.  The debugfs support does not handle
    multiple bonding devices with the same name in different network
    namespaces.
    
    I haven't had any bug reports, and I'm not interested in getting any.
    Disable the debugfs support when network namespaces are enabled.
    
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  9. stmmac: Fix for nfs hang on multiple reboot

    Deepak Sikri authored Ben Hutchings committed
    commit 8e83989 upstream.
    
    It was observed that during multiple reboots nfs hangs. The status of
    receive descriptors shows that all the descriptors were in control of
    CPU, and none were assigned to DMA.
    Also the DMA status register confirmed that the Rx buffer is
    unavailable.
    
    This patch fixes the problem by adding memory barriers, which ensure
    that all instructions before enabling the Rx or Tx DMA have completed,
    including the proper setting of the ownership bit in the DMA
    descriptors.
    
    Signed-off-by: Deepak Sikri <deepak.sikri@st.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
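The ordering requirement described in this commit can be illustrated outside the kernel. The following is a minimal user-space sketch using C11 atomics, not the stmmac code: `struct rx_desc`, `DESC_OWN`, and `give_desc_to_dma()` are hypothetical names, and a real driver would rely on kernel barriers such as `wmb()` together with MMIO accessors rather than `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_OWN (UINT32_C(1) << 31)   /* hypothetical "owned by DMA" bit */

/* Hypothetical receive descriptor shared between the CPU and a DMA engine. */
struct rx_desc {
    uint32_t buf_addr;
    uint32_t buf_len;
    _Atomic uint32_t status;
};

/*
 * Fill in the descriptor, then hand it over. The release store orders the
 * plain stores to buf_addr and buf_len before the ownership bit becomes
 * visible, so a consumer never observes a descriptor that is marked as
 * owned but only half initialised.
 */
static void give_desc_to_dma(struct rx_desc *d, uint32_t addr, uint32_t len)
{
    assert(d != NULL);
    d->buf_addr = addr;
    d->buf_len = len;
    atomic_store_explicit(&d->status, DESC_OWN, memory_order_release);
}
```

The point of the sketch is only the ordering: the ownership bit is written last, and with release semantics, so everything written before it is published first.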
  10. @ra1nb0w

    ipheth: add support for iPad

    ra1nb0w authored Ben Hutchings committed
    commit 6de0298 upstream.
    
    This adds support for the iPad to the ipheth driver.
    (product id = 0x129a)
    
    Signed-off-by: Davide Gerhard <rainbow@irh.it>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  11. @rjwysocki

    ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification

    rjwysocki authored Ben Hutchings committed
    commit dbe9a2e upstream.
    
    The comparison between the system sleep state being entered
    and the lowest system sleep state the given device may wake up
    from in acpi_pm_device_sleep_state() is reversed, because the
    specification (ACPI 5.0) says that for wakeup to work:
    
    "The sleeping state being entered must be less than or equal to the
     power state declared in element 1 of the _PRW object."
    
    In other words, the state returned by _PRW is the deepest
    (lowest-power) system sleep state the device is capable of waking up
    the system from.
    
    Moreover, acpi_pm_device_sleep_state() also should check if the
    wakeup capability is supported through ACPI, because in principle it
    may be done via native PCIe PME, for example, in which case _SxW
    should not be evaluated.
    
    Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
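The rule quoted from the specification reduces to a single comparison. A minimal sketch, assuming sleep states are numbered S0 to S5 with higher numbers being deeper; `wakeup_possible()` and `prw_state` are hypothetical names and this is not the code of acpi_pm_device_sleep_state():

```c
#include <assert.h>
#include <stdbool.h>

/* System sleep states S0..S5; higher numbers are deeper (lower power). */
enum sleep_state { S0, S1, S2, S3, S4, S5 };

/*
 * Wakeup can work only if the state being entered is less than or equal
 * to the deepest state the device can wake the system from, which is the
 * value declared by _PRW (passed here as prw_state).
 */
static bool wakeup_possible(enum sleep_state target, enum sleep_state prw_state)
{
    assert(target <= S5 && prw_state <= S5);
    return target <= prw_state;
}
```

With a device whose _PRW declares S3, entering S3 passes the check and entering S4 does not. A reversed comparison would give the opposite answers.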
  12. eCryptfs: Properly check for O_RDONLY flag before doing privileged open

    Tyler Hicks authored Ben Hutchings committed
    commit 9fe79d7 upstream.
    
    If the first attempt at opening the lower file read/write fails,
    eCryptfs will retry using a privileged kthread. However, the privileged
    retry should not happen if the lower file's inode is read-only because a
    read/write open will still be unsuccessful.
    
    The check for determining if the open should be retried was intended to
    be based on the access mode of the lower file's open flags being
    O_RDONLY, but the check was incorrectly performed. This would cause the
    open to be retried by the privileged kthread, resulting in a second
    failed open of the lower file. This patch corrects the check to
    determine if the open request should be handled by the privileged
    kthread.
    
    Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
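The commit message does not show the faulty check. One common way to get an O_RDONLY test wrong, sketched below in user-space C, is to treat O_RDONLY as a bit flag: it is 0 on Linux, so a bitwise AND with it is never true. The helper names are hypothetical and the sketch assumes the usual Linux flag values.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

static_assert(O_RDONLY == 0, "sketch assumes O_RDONLY is 0, as on Linux");

/* Broken form: O_RDONLY is 0, so the AND is always 0 and this helper
 * never reports a read-only open. */
static bool is_rdonly_broken(int flags)
{
    return (flags & O_RDONLY) != 0;
}

/* Correct form: mask out the access mode first, then compare. */
static bool is_rdonly(int flags)
{
    return (flags & O_ACCMODE) == O_RDONLY;
}
```

Masking with O_ACCMODE matters because the access mode is a small enumerated field inside the flags word, not a set of independent bits.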
  13. eCryptfs: Fix lockdep warning in miscdev operations

    Tyler Hicks authored Ben Hutchings committed
    commit 60d65f1 upstream.
    
    Don't grab the daemon mutex while holding the message context mutex.
    Addresses this lockdep warning:
    
     ecryptfsd/2141 is trying to acquire lock:
      (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}, at: [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]
    
     but task is already holding lock:
      (&(*daemon)->mux){+.+...}, at: [<ffffffffa029c2ec>] ecryptfs_miscdev_read+0x21c/0x470 [ecryptfs]
    
     which lock already depends on the new lock.
    
     the existing dependency chain (in reverse order) is:
    
     -> #1 (&(*daemon)->mux){+.+...}:
            [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
            [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
            [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
            [<ffffffffa029c5d7>] ecryptfs_send_miscdev+0x97/0x120 [ecryptfs]
            [<ffffffffa029b744>] ecryptfs_send_message+0x134/0x1e0 [ecryptfs]
            [<ffffffffa029a24e>] ecryptfs_generate_key_packet_set+0x2fe/0xa80 [ecryptfs]
            [<ffffffffa02960f8>] ecryptfs_write_metadata+0x108/0x250 [ecryptfs]
            [<ffffffffa0290f80>] ecryptfs_create+0x130/0x250 [ecryptfs]
            [<ffffffff811963a4>] vfs_create+0xb4/0x120
            [<ffffffff81197865>] do_last+0x8c5/0xa10
            [<ffffffff811998f9>] path_openat+0xd9/0x460
            [<ffffffff81199da2>] do_filp_open+0x42/0xa0
            [<ffffffff81187998>] do_sys_open+0xf8/0x1d0
            [<ffffffff81187a91>] sys_open+0x21/0x30
            [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b
    
     -> #0 (&ecryptfs_msg_ctx_arr[i].mux){+.+.+.}:
            [<ffffffff810a3418>] __lock_acquire+0x1bf8/0x1c50
            [<ffffffff810a3b8d>] lock_acquire+0x9d/0x220
            [<ffffffff8151c6da>] __mutex_lock_common+0x5a/0x4b0
            [<ffffffff8151cc64>] mutex_lock_nested+0x44/0x50
            [<ffffffffa029c213>] ecryptfs_miscdev_read+0x143/0x470 [ecryptfs]
            [<ffffffff811887d3>] vfs_read+0xb3/0x180
            [<ffffffff811888ed>] sys_read+0x4d/0x90
            [<ffffffff81527d69>] system_call_fastpath+0x16/0x1b
    
    Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  14. eCryptfs: Gracefully refuse miscdev file ops on inherited/passed files

    Tyler Hicks authored Ben Hutchings committed
    commit 8dc6780 upstream.
    
    File operations on /dev/ecryptfs would BUG() when the operations were
    performed by processes other than the process that originally opened the
    file. This could happen with open files inherited after fork() or file
    descriptors passed through IPC mechanisms. Rather than calling BUG(), an
    error code can be safely returned in most situations.
    
    In ecryptfs_miscdev_release(), eCryptfs still needs to handle the
    release even if the last file reference is being held by a process that
    didn't originally open the file. ecryptfs_find_daemon_by_euid() will not
    be successful, so a pointer to the daemon is stored in the file's
    private_data. The private_data pointer is initialized when the miscdev
    file is opened and only used when the file is released.
    
    https://launchpad.net/bugs/994247
    
    Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
    Reported-by: Sasha Levin <levinsasha928@gmail.com>
    Tested-by: Sasha Levin <levinsasha928@gmail.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  15. @pavlinux

    ACPI sysfs.c strlen fix

    pavlinux authored Ben Hutchings committed
    commit 9f13265 upstream.
    
    Current code is ignoring the last character of "enable" and "disable"
    in comparisons.
    
    https://bugzilla.kernel.org/show_bug.cgi?id=33732
    
    Signed-off-by: Len Brown <len.brown@intel.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
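A length that is one too short is enough to produce the symptom described here. The following sketch shows the general pattern in user-space C; it is an illustration under that assumption, not the sysfs.c code, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Broken form: strlen("enable") - 1 is 5, so only "enabl" is compared
 * and the last character is ignored. */
static bool matches_enable_broken(const char *val)
{
    assert(val != NULL);
    return strncmp(val, "enable", strlen("enable") - 1) == 0;
}

/* sizeof("enable") counts the terminating NUL, so subtracting 1 gives
 * the full six-character length. */
static bool matches_enable(const char *val)
{
    assert(val != NULL);
    return strncmp(val, "enable", sizeof("enable") - 1) == 0;
}
```

With the broken form, an input such as "enablX" is accepted. The corrected form rejects it while still accepting "enable" followed by a newline, which is what a write to a sysfs file typically delivers.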
  16. @zhang-rui

    ACPI, x86: fix Dell M6600 ACPI reboot regression via DMI

    zhang-rui authored Ben Hutchings committed
    commit 76eb9a3 upstream.
    
    Dell Precision M6600 is known to require PCI reboot, so add it to
    the reboot blacklist in pci_reboot_dmi_table[].
    
    https://bugzilla.kernel.org/show_bug.cgi?id=42749
    
    cc: x86@kernel.org
    Signed-off-by: Zhang Rui <rui.zhang@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  17. @ftang1

    ACPI: Add a quirk for "AMILO PRO V2030" to ignore the timer overriding

    ftang1 authored Ben Hutchings committed
    commit b939c2a upstream.
    
    commit f6b54f0 upstream.
    
    This is the 2nd part of the fix for kernel bugzilla 40002:
        "IRQ 0 assigned to VGA"
    https://bugzilla.kernel.org/show_bug.cgi?id=40002
    
    The root cause is buggy firmware, whose ACPI tables assign GSI 16
    to two IRQs, 0 and 16 (VGA), while VGA is the rightful owner of GSI 16.
    Adding a quirk that ignores the IRQ 0 override of GSI 16 on the
    FUJITSU SIEMENS AMILO PRO V2030 platform solves this issue.
    
    Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl>
    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  18. @ftang1

    ACPI: Remove one board specific WARN when ignoring timer overriding

    ftang1 authored Ben Hutchings committed
    commit 5752cdb upstream.
    
    commit 7f68b4c upstream.
    
    The current WARN message is specific to the ati_ixp4x0 board, while this
    function is used by multiple platforms. So this board-specific warning
    is no longer appropriate.
    
    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  19. @ftang1

    ACPI: Make acpi_skip_timer_override cover all source_irq==0 cases

    ftang1 authored Ben Hutchings committed
    commit ae10ccd upstream.
    
    Currently, when acpi_skip_timer_override is set, it only covers the
    (source_irq == 0 && global_irq == 2) case. However, there are also
    platforms that need this option and whose global_irq is not 2.
    This patch extends acpi_skip_timer_override to cover all
    timer override cases as long as the source IRQ is 0.
    
    This is the first part of a fix for kernel bugzilla 40002:
    	"IRQ 0 assigned to VGA"
    https://bugzilla.kernel.org/show_bug.cgi?id=40002
    
    Reported-and-tested-by: Szymon Kowalczyk <fazerxlo@o2.pl>
    Signed-off-by: Feng Tang <feng.tang@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
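The change in scope can be written as two predicates. This is a sketch of the condition only, using hypothetical function names. The parameter names follow the commit message, and `skip` stands in for the acpi_skip_timer_override setting.

```c
#include <assert.h>   /* only needed for the usage checks shown below */
#include <stdbool.h>

/* Before: the override is skipped only for the IRQ 0 to GSI 2 case. */
static bool skip_timer_override_old(bool skip, unsigned int source_irq,
                                    unsigned int global_irq)
{
    return skip && source_irq == 0 && global_irq == 2;
}

/* After: any override whose source IRQ is 0 is skipped, regardless of
 * which GSI it points at. */
static bool skip_timer_override_new(bool skip, unsigned int source_irq)
{
    return skip && source_irq == 0;
}
```

For the platform discussed in bugzilla 40002, where IRQ 0 is overridden to GSI 16, `assert(!skip_timer_override_old(true, 0, 16))` and `assert(skip_timer_override_new(true, 0))` both hold, which is the gap the patch closes.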
  20. net: remove skb_orphan_try()

    Eric Dumazet authored Ben Hutchings committed
    commit 62b1a8a upstream.
    
    Orphaning skb in dev_hard_start_xmit() makes bonding behavior
    unfriendly for applications sending big UDP bursts : Once packets
    pass the bonding device and come to real device, they might hit a full
    qdisc and be dropped. Without orphaning, the sender is automatically
    throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming
    sk_sndbuf is not too big)
    
    We could try to defer the orphaning adding another test in
    dev_hard_start_xmit(), but all this seems of little gain,
    now that BQL tends to make packets more likely to be parked
    in Qdisc queues instead of NIC TX ring, in cases where performance
    matters.
    
    Reverts commits :
    fc6055a net: Introduce skb_orphan_try()
    87fd308 net: skb_tx_hash() fix relative to skb_orphan_try()
    and removes SKBTX_DRV_NEEDS_SK_REF flag
    
    Reported-and-bisected-by: Jean-Michel Hautbois <jhautbois@gmail.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
    Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    [bwh: Backported to 3.2:
     - Adjust context
     - SKBTX_WIFI_STATUS is not defined]
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  21. bnx2x: fix panic when TX ring is full

    Eric Dumazet authored Ben Hutchings committed
    commit bc14786 upstream.
    
    There is an off-by-one error in the minimum number of BDs checked in
    bnx2x_start_xmit() and bnx2x_tx_int() before stopping/resuming the tx queue.
    
    A full size GSO packet, with data included in skb->head really needs
    (MAX_SKB_FRAGS + 4) BDs, because of bnx2x_tx_split()
    
    This error triggers if BQL is disabled and heavy TCP transmit traffic
    occurs.
    
    bnx2x_tx_split() definitely can be called, remove a wrong comment.
    
    Reported-by: Tomas Hruby <thruby@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Eilon Greenstein <eilong@broadcom.com>
    Cc: Yaniv Rosner <yanivr@broadcom.com>
    Cc: Merav Sicron <meravs@broadcom.com>
    Cc: Tom Herbert <therbert@google.com>
    Cc: Robert Evans <evansr@google.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
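The effect of a threshold that is one too low can be seen in a small sketch. It assumes the earlier check used MAX_SKB_FRAGS + 3, which the commit message implies but does not state. The value given to MAX_SKB_FRAGS is a placeholder, and both function names are hypothetical.

```c
#include <assert.h>   /* only needed for the usage checks shown below */
#include <stdbool.h>

#define MAX_SKB_FRAGS 18   /* placeholder; the real value depends on the kernel */

/* Worst case from the commit message: a full size GSO packet with data
 * in skb->head needs MAX_SKB_FRAGS + 4 buffer descriptors. */
#define TX_BDS_WORST_CASE (MAX_SKB_FRAGS + 4)

/* Stop the queue unless a worst-case packet still fits. */
static bool tx_queue_must_stop(unsigned int avail_bds)
{
    return avail_bds < TX_BDS_WORST_CASE;
}

/* Assumed earlier form: one descriptor short of the worst case. */
static bool tx_queue_must_stop_off_by_one(unsigned int avail_bds)
{
    return avail_bds < MAX_SKB_FRAGS + 3;
}
```

With exactly MAX_SKB_FRAGS + 3 descriptors free, `assert(tx_queue_must_stop(MAX_SKB_FRAGS + 3))` holds, while the off-by-one variant keeps the queue running and the next worst-case packet overruns the ring.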
  22. bnx2x: fix checksum validation

    Eric Dumazet authored Ben Hutchings committed
    commit d6cb3e4 upstream.
    
    bnx2x driver incorrectly sets ip_summed to CHECKSUM_UNNECESSARY on
    encapsulated segments. TCP stack happily accepts frames with bad
    checksums, if they are inside a GRE or IPIP encapsulation.
    
    Our understanding is that if no IP or L4 csum validation was done by the
    hardware, we should leave ip_summed as is (CHECKSUM_NONE), since
    hardware doesn't provide CHECKSUM_COMPLETE support in its cqe.
    
    Then, if IP/L4 checksumming was done by the hardware, set
    CHECKSUM_UNNECESSARY if no error was flagged.
    
    Patch based on findings and analysis from Robert Evans
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Eilon Greenstein <eilong@broadcom.com>
    Cc: Yaniv Rosner <yanivr@broadcom.com>
    Cc: Merav Sicron <meravs@broadcom.com>
    Cc: Tom Herbert <therbert@google.com>
    Cc: Robert Evans <evansr@google.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Acked-by: Eilon Greenstein <eilong@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    [bwh: Backported to 3.2: adjust context, indentation]
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
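The decision described above has three outcomes, which a sketch makes explicit. The enum and the two boolean inputs are hypothetical stand-ins for the kernel's ip_summed values and for whatever the hardware reports in its completion entry; this is not the bnx2x code.

```c
#include <assert.h>   /* only needed for the usage checks shown below */
#include <stdbool.h>

/* Stand-ins for CHECKSUM_NONE and CHECKSUM_UNNECESSARY. */
enum csum_state { CSUM_NONE, CSUM_UNNECESSARY };

/*
 * hw_validated: the hardware performed IP/L4 checksum validation.
 * hw_error:     the hardware flagged a checksum error.
 */
static enum csum_state rx_csum_state(bool hw_validated, bool hw_error)
{
    if (!hw_validated)
        return CSUM_NONE;          /* nothing was checked: let the stack verify */
    if (hw_error)
        return CSUM_NONE;          /* checked and bad: do not vouch for it */
    return CSUM_UNNECESSARY;       /* checked and good */
}
```

The encapsulated case from the commit message corresponds to `hw_validated == false`: `assert(rx_csum_state(false, false) == CSUM_NONE)` holds, so the stack checksums the inner segment itself.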
  23. @Devmiller

    r8169: call netif_napi_del at errpaths and at driver unload

    Devmiller authored Ben Hutchings committed
    commit ad1be8d upstream.
    
    When register_netdev fails, the NAPIs initialized by netif_napi_add must
    be deleted with netif_napi_del. Likewise, when the driver unloads, it
    should delete the NAPI before unregistering the netdevice with
    unregister_netdev.
    
    Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
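The cleanup rule follows the usual goto-based unwinding pattern: every step that succeeded is undone in reverse order on failure. The sketch below models it with stand-in functions. Everything prefixed `fake_` is hypothetical and only tracks state, so the example can run in user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state, standing in for the real net_device. */
struct fake_dev {
    bool napi_added;
    bool registered;
};

static void fake_napi_add(struct fake_dev *d)   { d->napi_added = true; }
static void fake_napi_del(struct fake_dev *d)   { d->napi_added = false; }
static void fake_unregister(struct fake_dev *d) { d->registered = false; }

static int fake_register(struct fake_dev *d, bool fail)
{
    if (fail)
        return -1;
    d->registered = true;
    return 0;
}

/* Probe path: if registration fails, the NAPI context added earlier is
 * deleted before the error is returned. */
static int fake_probe(struct fake_dev *d, bool fail_register)
{
    int rc;

    assert(d != NULL);
    fake_napi_add(d);
    rc = fake_register(d, fail_register);
    if (rc < 0)
        goto err_napi_del;
    return 0;

err_napi_del:
    fake_napi_del(d);
    return rc;
}

/* Unload path: delete the NAPI context, then unregister the device. */
static void fake_remove(struct fake_dev *d)
{
    assert(d != NULL);
    fake_napi_del(d);
    fake_unregister(d);
}
```

After a failed `fake_probe()`, no NAPI context is left behind, and after `fake_remove()` both flags are clear.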
  24. vhost: don't forget to schedule()

    Nadav Har'El authored Ben Hutchings committed
    commit d550dda upstream.
    
    This is a tiny, but important, patch to vhost.
    
    Vhost's worker thread only called schedule() when it had no work to do, and
    it wanted to go to sleep. But if there's always work to do, e.g., the guest
    is running a network-intensive program like netperf with small message sizes,
    schedule() was *never* called. This had several negative implications (on
    non-preemptive kernels):
    
     1. Passing time was not properly accounted to the "vhost" process (ps and
        top would wrongly show it using zero CPU time).
    
     2. Sometimes error messages about RCU timeouts would be printed, if the
        core running the vhost thread didn't schedule() for a very long time.
    
     3. Worst of all, a vhost thread would "hog" the core. If several vhost
        threads need to share the same core, typically one would get most of the
        CPU time (and its associated guest most of the performance), while the
        others hardly get any work done.
    
    The trivial solution is to add
    
    	if (need_resched())
    		schedule();
    
    After doing every piece of work. This will not do the heavy schedule() all
    the time, just when the timer interrupt decided a reschedule is warranted
    (so need_resched returns true).
    
    Thanks to Abel Gordon for this patch.
    
    Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  25. @andreas-schwab

    powerpc: Fix wrong divisor in usecs_to_cputime

    andreas-schwab authored Ben Hutchings committed
    commit 9f5072d upstream.
    
    Commit d57af9b (taskstats: use real microsecond granularity for CPU times)
    renamed msecs_to_cputime to usecs_to_cputime, but failed to update all
    numbers on the way.  This causes nonsensical cpu idle/iowait values to be
    displayed in /proc/stat (the only user of usecs_to_cputime so far).
    
    This also renames __cputime_msec_factor to __cputime_usec_factor, adapting
    its value and using it directly in cputime_to_usecs instead of doing two
    multiplications.
    
    Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
    Acked-by: Anton Blanchard <anton@samba.org>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
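The size of the error caused by a leftover millisecond divisor is easy to see in isolation. This sketch is not the powerpc implementation, which the commit message says works from a precomputed factor. It is a plain conversion with hypothetical function names and a made-up tick rate.

```c
#include <assert.h>
#include <stdint.h>

/* Convert microseconds to ticks: one second is 1,000,000 microseconds. */
static uint64_t usecs_to_ticks(uint64_t usecs, uint64_t ticks_per_sec)
{
    /* Guard against overflow in the multiplication below. */
    assert(ticks_per_sec == 0 || usecs <= UINT64_MAX / ticks_per_sec);
    return usecs * ticks_per_sec / UINT64_C(1000000);
}

/* Same conversion with the divisor left over from a millisecond version:
 * every result is 1000 times too large. */
static uint64_t usecs_to_ticks_wrong(uint64_t usecs, uint64_t ticks_per_sec)
{
    assert(ticks_per_sec == 0 || usecs <= UINT64_MAX / ticks_per_sec);
    return usecs * ticks_per_sec / UINT64_C(1000);
}
```

For one second of input at an assumed 512,000,000 ticks per second, the correct function returns 512,000,000 and the wrong one returns 512,000,000,000, which is the kind of nonsensical value the commit describes in /proc/stat.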
  26. timekeeping: Add missing update call in timekeeping_resume()

    Thomas Gleixner authored Ben Hutchings committed
    This is a backport of 3e99713
    
    The leap second rework unearthed another issue of inconsistent data.
    
    On timekeeping_resume() the timekeeper data is updated, but nothing
    calls timekeeping_update(), so now the update code in the timer
    interrupt sees stale values.
    
    This has been the case before those changes, but then the timer
    interrupt was using stale data as well so this went unnoticed for quite
    some time.
    
    Add the missing update call, so all the data is consistent everywhere.
    
    Reported-by: Andreas Schwab <schwab@linux-m68k.org>
    Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl>
    Reported-and-tested-by: Martin Steigerwald <Martin@lichtvoll.de>
    Cc: LKML <linux-kernel@vger.kernel.org>
    Cc: Linux PM list <linux-pm@vger.kernel.org>
    Cc: John Stultz <johnstul@us.ibm.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    [John Stultz: Backported to 3.2]
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linux Kernel <linux-kernel@vger.kernel.org>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  27. hrtimer: Update hrtimer base offsets each hrtimer_interrupt

    John Stultz authored Ben Hutchings committed
    commit 5baefd6 upstream.
    
    The update of the hrtimer base offsets on all cpus cannot be made
    atomically from the timekeeper.lock held and interrupt disabled region
    as smp function calls are not allowed there.
    
    clock_was_set(), which enforces the update on all cpus, is called
    either from preemptible process context in case of do_settimeofday()
    or from the softirq context when the offset modification happened in
    the timer interrupt itself due to a leap second.
    
    In both cases there is a race window for an hrtimer interrupt between
    dropping timekeeper lock, enabling interrupts and clock_was_set()
    issuing the updates. Any interrupt which arrives in that window will
    see the new time but operate on stale offsets.
    
    So we need to make sure that an hrtimer interrupt always sees a
    consistent state of time and offsets.
    
    ktime_get_update_offsets() allows us to get the current monotonic time
    and update the per cpu hrtimer base offsets from hrtimer_interrupt()
    to capture a consistent state of monotonic time and the offsets. The
    function replaces the existing ktime_get() calls in hrtimer_interrupt().
    
    The overhead of the new function vs. ktime_get() is minimal as it just
    adds two store operations.
    
    This ensures that any changes to realtime or boottime offsets are
    noticed and stored into the per-cpu hrtimer base structures, prior to
    any hrtimer expiration and guarantees that timers are not expired early.
    
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Prarit Bhargava <prarit@redhat.com>
    Link: http://lkml.kernel.org/r/1341960205-56738-8-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  28. timekeeping: Provide hrtimer update function

    Thomas Gleixner authored Ben Hutchings committed
    This is a backport of f6c06ab
    
    To finally fix the infamous leap second issue and other race windows
    caused by functions which change the offsets between the various time
    bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
    function which atomically gets the current monotonic time and updates
    the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
    overhead. The previous patch which provides ktime_t offsets allows us
    to make this function almost as cheap as ktime_get() which is going to
    be replaced in hrtimer_interrupt().
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Prarit Bhargava <prarit@redhat.com>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    [John Stultz: Backported to 3.2]
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linux Kernel <linux-kernel@vger.kernel.org>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  29. hrtimers: Move lock held region in hrtimer_interrupt()

    Thomas Gleixner authored Ben Hutchings committed
    commit 196951e upstream.
    
    We need to update the base offsets from this code and we need to do
    that under base->lock. Move the lock held region around the
    ktime_get() calls. The ktime_get() calls are going to be replaced with
    a function which gets the time and the offsets atomically.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Prarit Bhargava <prarit@redhat.com>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Link: http://lkml.kernel.org/r/1341960205-56738-6-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  30. timekeeping: Maintain ktime_t based offsets for hrtimers

    Thomas Gleixner authored Ben Hutchings committed
    This is a backport of 5b9fe75
    
    We need to update the hrtimer clock offsets from the hrtimer interrupt
    context. To avoid conversions from timespec to ktime_t maintain a
    ktime_t based representation of those offsets in the timekeeper. This
    puts the conversion overhead into the code which updates the
    underlying offsets and provides fast accessible values in the hrtimer
    interrupt.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Prarit Bhargava <prarit@redhat.com>
    Link: http://lkml.kernel.org/r/1341960205-56738-4-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    [John Stultz: Backported to 3.2]
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linux Kernel <linux-kernel@vger.kernel.org>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  31. timekeeping: Fix leapsecond triggered load spike issue

    John Stultz authored Ben Hutchings committed
    This is a backport of 4873fa0
    
    The timekeeping code misses an update of the hrtimer subsystem after a
    leap second happened. Due to that timers based on CLOCK_REALTIME are
    either expiring a second early or late depending on whether a leap
    second has been inserted or deleted until an operation is initiated
    which causes that update. Unless the update happens by some other
    means this discrepancy between the timekeeping and the hrtimer data
    stays forever and timers are expired either early or late.
    
    The reported immediate workaround - $ date -s "`date`" - is causing a
    call to clock_was_set() which updates the hrtimer data structures.
    See: http://www.sheeri.com/content/mysql-and-leap-second-high-cpu-and-fix
    
    Add the missing clock_was_set() call to update_wall_time() in case of
    a leap second event. The actual update is deferred to softirq context
    as the necessary smp function call cannot be invoked from hard
    interrupt context.
    
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Reported-by: Jan Engelhardt <jengelh@inai.de>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Prarit Bhargava <prarit@redhat.com>
    Link: http://lkml.kernel.org/r/1341960205-56738-3-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linux Kernel <linux-kernel@vger.kernel.org>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  32. hrtimer: Provide clock_was_set_delayed()

    John Stultz authored Ben Hutchings committed
    commit f55a6fa upstream.
    
    clock_was_set() cannot be called from hard interrupt context because
    it calls on_each_cpu().
    
    For fixing the widely reported leap seconds issue it is necessary to
    call it from hard interrupt context, i.e. the timer tick code, which
    does the timekeeping updates.
    
    Provide a new function which records the pending update in the hrtimer
    cpu base structure of the cpu on which it is called and raises the
    hrtimer softirq. We then execute the clock_was_set() notification from
    softirq context in run_hrtimer_softirq(). The hrtimer softirq is
    rarely used, so polling the flag there is not a performance issue.
    
    [ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
      rid of all this ifdeffery ASAP ]
    
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Reported-by: Jan Engelhardt <jengelh@inai.de>
    Reviewed-by: Ingo Molnar <mingo@kernel.org>
    Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Acked-by: Prarit Bhargava <prarit@redhat.com>
    Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.com
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
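The deferral pattern described above is "set a flag where the work is not allowed, do the work later where it is". A single-threaded sketch with hypothetical names follows. It models neither the softirq machinery nor per-cpu data, and `updates_run` merely counts how often the deferred work would have run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu hrtimer base. */
struct cpu_base_sketch {
    bool clock_was_set_pending;
    unsigned int updates_run;
};

/* Called from the context that must not do the heavy update itself.
 * The real function also raises the hrtimer softirq at this point. */
static void clock_was_set_delayed_sketch(struct cpu_base_sketch *b)
{
    assert(b != NULL);
    b->clock_was_set_pending = true;
}

/* Called later from the context where the update is allowed. It polls
 * the flag and performs the deferred work at most once per request. */
static void run_softirq_sketch(struct cpu_base_sketch *b)
{
    assert(b != NULL);
    if (b->clock_was_set_pending) {
        b->clock_was_set_pending = false;
        b->updates_run++;   /* the real code calls clock_was_set() here */
    }
}
```

Polling the flag costs one branch when nothing is pending, which matches the commit's remark that checking it in the softirq is not a performance concern.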
  33. time: Move common updates to a function

    Thomas Gleixner authored Ben Hutchings committed
    This is a backport of cc06268
    
    [John Stultz: While not a bugfix itself, it allows following fixes
     to backport in a more straightforward manner.]
    
    CC: Thomas Gleixner <tglx@linutronix.de>
    CC: Eric Dumazet <eric.dumazet@gmail.com>
    CC: Richard Cochran <richardcochran@gmail.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linux Kernel <linux-kernel@vger.kernel.org>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  34. @johnstultz-work

    timekeeping: Fix CLOCK_MONOTONIC inconsistency during leapsecond

    johnstultz-work authored Ben Hutchings committed
    This is a backport of fad0c66
    which resolves a bug in the previous commit.
    
    Commit 6b43ae8 (ntp: Fix leap-second hrtimer livelock) broke the
    leapsecond update of CLOCK_MONOTONIC. The missing leapsecond update to
    wall_to_monotonic causes discontinuities in CLOCK_MONOTONIC.
    
    Adjust wall_to_monotonic when NTP inserted a leapsecond.
    
    Reported-by: Richard Cochran <richardcochran@gmail.com>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Tested-by: Richard Cochran <richardcochran@gmail.com>
    Link: http://lkml.kernel.org/r/1338400497-12420-1-git-send-email-john.stultz@linaro.org
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Linux Kernel <linux-kernel@vger.kernel.org>
    Signed-off-by: John Stultz <johnstul@us.ibm.com>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
  35. @richardcochran

    ntp: Correct TAI offset during leap second

    richardcochran authored Ben Hutchings committed
    commit dd48d70 upstream.
    
    When repeating a UTC time value during a leap second (when the UTC
    time should be 23:59:60), the TAI timescale should not stop. The kernel
    NTP code increments the TAI offset one second too late. This patch fixes
    the issue by incrementing the offset during the leap second itself.
    
    Signed-off-by: Richard Cochran <richardcochran@gmail.com>
    Signed-off-by: John Stultz <john.stultz@linaro.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
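The timing issue can be modelled with a toy clock in which TAI is UTC plus an offset. The sketch below is not the kernel NTP code, and all names are hypothetical. It shows that if the offset is incremented in the same step in which the UTC second repeats, TAI keeps advancing by exactly one second per tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy clock: whole seconds only. */
struct toy_clock {
    int64_t utc_sec;
    int tai_offset;      /* TAI minus UTC, in seconds */
    bool leap_pending;   /* a leap second is to be inserted before midnight */
};

static int64_t toy_tai(const struct toy_clock *c)
{
    assert(c != NULL);
    return c->utc_sec + c->tai_offset;
}

/* Advance the clock by one second. midnight is the UTC second at which
 * the new day starts. */
static void toy_tick(struct toy_clock *c, int64_t midnight)
{
    assert(c != NULL);
    if (c->leap_pending && c->utc_sec + 1 == midnight) {
        /* Leap second: UTC repeats its last value. Incrementing the
         * offset in this same step keeps TAI moving. */
        c->tai_offset++;
        c->leap_pending = false;
        return;
    }
    c->utc_sec++;
}
```

If the increment happened one tick later instead, `toy_tai()` would return the same value twice in a row during the leap second, which is the stall the commit removes.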