Tag: #116
Commits on Apr 9, 2013
  1. Merge branch 'linux-3.0.y' of git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable into test

    abev66 committed Apr 9, 2013
    Conflicts:
    	fs/ext4/ialloc.c
    	fs/ext4/mballoc.c
  2. Revert merge mistake.

    abev66 committed Apr 9, 2013
Commits on Apr 5, 2013
  1. Linux 3.0.72

    gregkh committed Apr 5, 2013
  2. bonding: get netdev_rx_handler_unregister out of locks

    Veaceslav Falico authored and gregkh committed Apr 2, 2013
    [ Upstream commit fcd99434fb5c137274d2e15dd2a6a7455f0f29ff ]
    
    Now that netdev_rx_handler_unregister contains synchronize_net(), we need
    to call it outside of bond->lock, because it might sleep. Also, remove the
    now-unneeded synchronize_net().
    
    Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. iommu/amd: Make sure dma_ops are set for hotplug devices

    joergroedel authored and gregkh committed Mar 26, 2013
    commit c2a2876e863356b092967ea62bebdb4dd663af80 upstream.
    
    There is a bug introduced with commit 27c2127 that causes
    devices which are hot unplugged and then hot-replugged to
    not have per-device dma_ops set. This causes these devices
    to not function correctly. Fixed with this patch.
    
    Reported-by: Andreas Degert <andreas.degert@googlemail.com>
    Signed-off-by: Joerg Roedel <joro@8bytes.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. smsc75xx: fix jumbo frame support

    steveglen authored and gregkh committed Mar 28, 2013
    [ Upstream commit 4c51e53689569398d656e631c17308d9b8e84650 ]
    
    This patch enables RX of jumbo frames for LAN7500.
    
    Previously the driver would transmit jumbo frames successfully but
    would drop received jumbo frames (incrementing the interface errors
    count).
    
    With this patch applied the device can successfully receive jumbo
    frames up to MTU 9000 (9014 bytes on the wire including ethernet
    header).
    
    Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  5. pch_gbe: fix ip_summed checksum reporting on rx

    Veaceslav Falico authored and gregkh committed Mar 25, 2013
    [ Upstream commit 76a0e68129d7d24eb995a6871ab47081bbfa0acc ]
    
    skb->ip_summed should be CHECKSUM_UNNECESSARY when the driver reports that
    checksums were correct and CHECKSUM_NONE in any other case. They're
    currently placed vice versa, which breaks the forwarding scenario. Fix it
    by placing them as described above.
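
The intended mapping can be sketched in user-space C. This is only an illustration: the enum below is a local stand-in for the kernel's CHECKSUM_* values, and `rx_ip_summed` and `hw_csum_ok` are hypothetical names, not taken from the pch_gbe driver.

```c
#include <stdbool.h>

/* Local stand-ins for the kernel's CHECKSUM_* values (assumption:
 * defined here only so the sketch compiles outside the kernel). */
enum ip_summed {
    CHECKSUM_NONE = 0,        /* the stack must verify the checksum itself */
    CHECKSUM_UNNECESSARY = 1  /* the hardware already verified it */
};

/* hw_csum_ok is a hypothetical flag: true when the NIC reported that
 * the receive checksums were correct. */
static enum ip_summed rx_ip_summed(bool hw_csum_ok)
{
    return hw_csum_ok ? CHECKSUM_UNNECESSARY : CHECKSUM_NONE;
}
```

With the two values swapped, frames with verified checksums would be re-checked and unverified frames would be trusted, which is the inversion the message describes.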
    
    Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  6. net: add a synchronize_net() in netdev_rx_handler_unregister()

    Eric Dumazet authored and gregkh committed Mar 29, 2013
    [ Upstream commit 00cfec37484761a44a3b6f4675a54caa618210ae ]
    
    commit 35d4890 (bonding: fix rx_handler locking) added a race
    in bonding driver, reported by Steven Rostedt who did a very good
    diagnosis :
    
    <quoting Steven>
    
    I'm currently debugging a crash in an old 3.0-rt kernel that one of our
    customers is seeing. The bug happens with a stress test that loads and
    unloads the bonding module in a loop (I don't know all the details as
    I'm not the one that is directly interacting with the customer). But the
    bug looks to be something that may still be present and possibly present
    in mainline too. It will just be much harder to trigger it in mainline.
    
    In -rt, interrupts are threads, and can schedule in and out just like
    any other thread. Note, mainline now supports interrupt threads so this
    may be easily reproducible in mainline as well. I don't have the ability
    to tell the customer to try mainline or other kernels, so my hands are
    somewhat tied to what I can do.
    
    But according to a core dump, I tracked down that the eth irq thread
    crashed in bond_handle_frame() here:
    
            slave = bond_slave_get_rcu(skb->dev);
            bond = slave->bond; <--- BUG
    
    the slave returned was NULL and accessing slave->bond caused a NULL
    pointer dereference.
    
    Looking at the code that unregisters the handler:
    
    void netdev_rx_handler_unregister(struct net_device *dev)
    {
    
            ASSERT_RTNL();
            RCU_INIT_POINTER(dev->rx_handler, NULL);
            RCU_INIT_POINTER(dev->rx_handler_data, NULL);
    }
    
    Which is basically:
            dev->rx_handler = NULL;
            dev->rx_handler_data = NULL;
    
    And looking at __netif_receive_skb() we have:
    
            rx_handler = rcu_dereference(skb->dev->rx_handler);
            if (rx_handler) {
                    if (pt_prev) {
                            ret = deliver_skb(skb, pt_prev, orig_dev);
                            pt_prev = NULL;
                    }
                    switch (rx_handler(&skb)) {
    
    My question to all of you is, what stops this interrupt from happening
    while the bonding module is unloading?  What happens if the interrupt
    triggers and we have this:
    
            CPU0                    CPU1
            ----                    ----
      rx_handler = skb->dev->rx_handler
    
                            netdev_rx_handler_unregister() {
                               dev->rx_handler = NULL;
                               dev->rx_handler_data = NULL;
    
      rx_handler()
       bond_handle_frame() {
        slave = skb->dev->rx_handler_data;
        bond = slave->bond; <-- NULL pointer dereference!!!
    
    What protection am I missing in the bond release handler that would
    prevent the above from happening?
    
    </quoting Steven>
    
    We can fix this bug in two ways. The first is adding a test in
    bond_handle_frame() and others to check if rx_handler_data is NULL.
    
    The second way is adding a synchronize_net() in
    netdev_rx_handler_unregister() to guarantee that an RCU-protected reader
    sees a non-NULL rx_handler_data.
    
    The second way is better as it avoids an extra test in the fast path.
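
For comparison, the first option (the one not taken) might look roughly like the following user-space sketch. All types here are simplified, hypothetical stand-ins for the kernel structures, and the code is single-threaded: it only shows the extra branch every received frame would pay for, not the race itself.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct bonding { int id; };
struct slave { struct bonding *bond; };
struct net_device { void *rx_handler_data; };

enum rx_result { RX_HANDLER_PASS, RX_HANDLER_CONSUMED };

/* First option from the message: test for NULL on every frame. */
static enum rx_result handle_frame_checked(const struct net_device *dev)
{
    const struct slave *slave = dev->rx_handler_data;

    if (slave == NULL)          /* handler is being unregistered */
        return RX_HANDLER_PASS; /* leave the frame to the normal stack */

    (void)slave->bond;          /* safe: slave is known to be non-NULL */
    return RX_HANDLER_CONSUMED;
}
```

The synchronize_net() approach keeps this branch out of the receive path by making the unregister side wait for in-flight readers.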
    
    Reported-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Jiri Pirko <jpirko@redhat.com>
    Cc: Paul E. McKenney <paulmck@us.ibm.com>
    Acked-by: Steven Rostedt <rostedt@goodmis.org>
    Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. ks8851: Fix interpretation of rxlen field.

    Max.Nekludov@us.elster.com authored and gregkh committed Mar 29, 2013
    [ Upstream commit 14bc435ea54cb888409efb54fc6b76c13ef530e9 ]
    
    According to the Datasheet (page 52):
    15-12 Reserved
    11-0 RXBC Receive Byte Count
    This field indicates the present received frame byte size.
    
    The code has a bug:
                     rxh = ks8851_rdreg32(ks, KS_RXFHSR);
                     rxstat = rxh & 0xffff;
                     rxlen = rxh >> 16; // BUG!!! 0xFFF mask should be applied
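
A minimal sketch of the corrected extraction, written as stand-alone C. The helper name `ks_split_rxh` is hypothetical; the masks follow the datasheet excerpt quoted above.

```c
#include <stdint.h>

/* Split the 32-bit value read from KS_RXFHSR into status and length.
 * Only bits 11-0 of the upper half hold the byte count; bits 15-12
 * are reserved and must be masked off. */
static void ks_split_rxh(uint32_t rxh, uint16_t *rxstat, uint16_t *rxlen)
{
    *rxstat = rxh & 0xffff;        /* lower 16 bits: frame status */
    *rxlen = (rxh >> 16) & 0xfff;  /* upper half with reserved bits cleared */
}
```

If the reserved bits happen to be set, the unmasked shift would report a length far larger than the frame actually received.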
    
    Signed-off-by: Max Nekludov <Max.Nekludov@us.elster.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  8. ipv6: fix bad free of addrconf_init_net

    honkiko authored and gregkh committed Mar 25, 2013
    [ Upstream commit a79ca223e029aa4f09abb337accf1812c900a800 ]
    
    Signed-off-by: Hong Zhiguo <honkiko@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  9. atl1e: drop pci-msi support because of packet corruption

    mugunthanvnm authored and gregkh committed Mar 28, 2013
    [ Upstream commit 188ab1b105c96656f6bcfb49d0d8bb1b1936b632 ]
    
    Usage of pci-msi results in corrupted dma packet transfers to the host.
    
    Reported-by: rebelyouth <rebelyouth.hacklab@gmail.com>
    Cc: Huang, Xiong <xiong@qca.qualcomm.com>
    Tested-by: Christian Sünkenberg <christian.suenkenberg@student.kit.edu>
    Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  10. drivers: net: ethernet: davinci_emac: use netif_wake_queue() while restarting tx queue

    mugunthanvnm authored and gregkh committed Mar 27, 2013
    To restart the tx queue, use netif_wake_queue() instead of
    netif_start_queue() so that the net scheduler restarts transmission
    immediately, which improves network performance during large data transfers.
    
    Reported-by: Dan Franke <dan.franke@schneider-electric.com>
    Suggested-by: Sriramakrishnan A G <srk@ti.com>
    Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  11. aoe: reserve enough headroom on skbs

    Eric Dumazet authored and gregkh committed Mar 27, 2013
    [ Upstream commit 91c5746425aed8f7188a351f1224a26aa232e4b3 ]
    
    Some network drivers use a non-default hard_header_len.
    
    Transmitted skb should take into account dev->hard_header_len, or risk
    crashes or expensive reallocations.
    
    In the case of aoe, let's reserve MAX_HEADER bytes.
    
    David reported a crash in defxx driver, solved by this patch.
    
    Reported-by: David Oostdyk <daveo@ll.mit.edu>
    Tested-by: David Oostdyk <daveo@ll.mit.edu>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Ed Cashin <ecashin@coraid.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. unix: fix a race condition in unix_release()

    pcmoore authored and gregkh committed Mar 25, 2013
    [ Upstream commit ded34e0fe8fe8c2d595bfa30626654e4b87621e0 ]
    
    As reported by Jan, and others over the past few years, there is a
    race condition caused by unix_release setting the sock->sk pointer
    to NULL before properly marking the socket as dead/orphaned.  This
    can cause a problem with the LSM hook security_unix_may_send() if
    there is another socket attempting to write to this partially
    released socket in between when sock->sk is set to NULL and it is
    marked as dead/orphaned.  This patch fixes this by only setting
    sock->sk to NULL after the socket has been marked as dead; I also
    take the opportunity to make unix_release_sock() a void function
    as it only ever returned 0/success.
    
    Dave, I think this one should go on the -stable pile.
    
    Special thanks to Jan for coming up with a reproducer for this
    problem.
    
    Reported-by: Jan Stancek <jan.stancek@gmail.com>
    Signed-off-by: Paul Moore <pmoore@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  13. thermal: shorten too long mcast group name

    masatake authored and gregkh committed Apr 1, 2013
    [ Upstream commits 73214f5d9f33b79918b1f7babddd5c8af28dd23d
      and f1e79e208076ffe7bad97158275f1c572c04f5c7, the latter
      adds an assertion to genetlink to prevent this from happening
      again in the future. ]
    
    The original name is too long.
    
    Signed-off-by: Masatake YAMATO <yamato@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  14. 8021q: fix a potential use-after-free

    Cong Wang authored and gregkh committed Mar 22, 2013
    [ Upstream commit 4a7df340ed1bac190c124c1601bfc10cde9fb4fb ]
    
    vlan_vid_del() could possibly free ->vlan_info after a RCU grace
    period, however, we may still refer to the freed memory area
    by 'grp' pointer. Found by code inspection.
    
    This patch moves the vlan_vid_del() call as late as possible.
    
    Signed-off-by: Cong Wang <amwang@redhat.com>
    Cc: Patrick McHardy <kaber@trash.net>
    Cc: "David S. Miller" <davem@davemloft.net>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  15. tcp: undo spurious timeout after SACK reneging

    Yuchung Cheng authored and gregkh committed Mar 24, 2013
    [ Upstream commit 7ebe183c6d444ef5587d803b64a1f4734b18c564 ]
    
    On SACK reneging the sender immediately retransmits and forces a
    timeout but disables Eifel (undo). If the (buggy) receiver does not
    drop any packet this can trigger a false slow-start retransmit storm
    driven by the ACKs of the original packets. This can be detected with
    undo and TCP timestamps.
    
    Signed-off-by: Yuchung Cheng <ycheng@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  16. tcp: preserve ACK clocking in TSO

    Eric Dumazet authored and gregkh committed Mar 21, 2013
    [ Upstream commit f4541d60a449afd40448b06496dcd510f505928e ]
    
    A long standing problem with TSO is the fact that tcp_tso_should_defer()
    rearms the deferred timer, while it should not.
    
    The current code leads to the following bad bursty behavior:
    
    20:11:24.484333 IP A > B: . 297161:316921(19760) ack 1 win 119
    20:11:24.484337 IP B > A: . ack 263721 win 1117
    20:11:24.485086 IP B > A: . ack 265241 win 1117
    20:11:24.485925 IP B > A: . ack 266761 win 1117
    20:11:24.486759 IP B > A: . ack 268281 win 1117
    20:11:24.487594 IP B > A: . ack 269801 win 1117
    20:11:24.488430 IP B > A: . ack 271321 win 1117
    20:11:24.489267 IP B > A: . ack 272841 win 1117
    20:11:24.490104 IP B > A: . ack 274361 win 1117
    20:11:24.490939 IP B > A: . ack 275881 win 1117
    20:11:24.491775 IP B > A: . ack 277401 win 1117
    20:11:24.491784 IP A > B: . 316921:332881(15960) ack 1 win 119
    20:11:24.492620 IP B > A: . ack 278921 win 1117
    20:11:24.493448 IP B > A: . ack 280441 win 1117
    20:11:24.494286 IP B > A: . ack 281961 win 1117
    20:11:24.495122 IP B > A: . ack 283481 win 1117
    20:11:24.495958 IP B > A: . ack 285001 win 1117
    20:11:24.496791 IP B > A: . ack 286521 win 1117
    20:11:24.497628 IP B > A: . ack 288041 win 1117
    20:11:24.498459 IP B > A: . ack 289561 win 1117
    20:11:24.499296 IP B > A: . ack 291081 win 1117
    20:11:24.500133 IP B > A: . ack 292601 win 1117
    20:11:24.500970 IP B > A: . ack 294121 win 1117
    20:11:24.501388 IP B > A: . ack 295641 win 1117
    20:11:24.501398 IP A > B: . 332881:351881(19000) ack 1 win 119
    
    While the expected behavior is more like :
    
    20:19:49.259620 IP A > B: . 197601:202161(4560) ack 1 win 119
    20:19:49.260446 IP B > A: . ack 154281 win 1212
    20:19:49.261282 IP B > A: . ack 155801 win 1212
    20:19:49.262125 IP B > A: . ack 157321 win 1212
    20:19:49.262136 IP A > B: . 202161:206721(4560) ack 1 win 119
    20:19:49.262958 IP B > A: . ack 158841 win 1212
    20:19:49.263795 IP B > A: . ack 160361 win 1212
    20:19:49.264628 IP B > A: . ack 161881 win 1212
    20:19:49.264637 IP A > B: . 206721:211281(4560) ack 1 win 119
    20:19:49.265465 IP B > A: . ack 163401 win 1212
    20:19:49.265886 IP B > A: . ack 164921 win 1212
    20:19:49.266722 IP B > A: . ack 166441 win 1212
    20:19:49.266732 IP A > B: . 211281:215841(4560) ack 1 win 119
    20:19:49.267559 IP B > A: . ack 167961 win 1212
    20:19:49.268394 IP B > A: . ack 169481 win 1212
    20:19:49.269232 IP B > A: . ack 171001 win 1212
    20:19:49.269241 IP A > B: . 215841:221161(5320) ack 1 win 119
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Yuchung Cheng <ycheng@google.com>
    Cc: Van Jacobson <vanj@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Cc: Nandita Dukkipati <nanditad@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  17. sky2: Threshold for Pause Packet is set wrong

    pldemone authored and gregkh committed Mar 26, 2013
    [ Upstream commit 74f9f42c1c1650e74fb464f76644c9041f996851 ]
    
    The sky2 driver sets the Rx Upper Threshold for Pause Packet generation to a
    wrong value which leads to only 2kB of RAM remaining space. This can lead to
    Rx overflow errors even with activated flow-control.
    
    Fix: We should increase the value to 8192/8
    
    Signed-off-by: Mirko Lindner <mlindner@marvell.com>
    Acked-by: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  18. sky2: Receive Overflows not counted

    pldemone authored and gregkh committed Mar 26, 2013
    [ Upstream commit 9cfe8b156c21cf340b3a10ecb3022fbbc1c39185 ]
    
    The sky2 driver doesn't count the Receive Overflows because the MAC
    interrupt for this event is not set in the MAC's interrupt mask.
    The MAC's interrupt mask is set only for Transmit FIFO Underruns.
    
    Fix: The correct setting should be (GM_IS_TX_FF_UR | GM_IS_RX_FF_OR)
    Otherwise the Receive Overflow event will not generate any interrupt.
    The Receive Overflow interrupt is handled correctly.
    
    Signed-off-by: Mirko Lindner <mlindner@marvell.com>
    Acked-by: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  19. tracing: Prevent buffer overwrite disabled for latency tracers

    rostedt authored and gregkh committed Mar 14, 2013
    commit 613f04a0f51e6e68ac6fe571ab79da3c0a5eb4da upstream.
    
    The latency tracers require the buffers to be in overwrite mode,
    otherwise they get screwed up. Force the buffers to stay in overwrite
    mode when latency tracers are enabled.
    
    Added a flag_changed() method to the tracer structure to allow
    the tracers to see what flags are being changed, and also be able
    to prevent the change from happening.
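
The veto mechanism can be sketched in plain C as follows. This is an assumption-laden illustration rather than the kernel code: the bit value of `TRACE_ITER_OVERWRITE`, the struct layout, and the helper names `latency_flag_changed` and `set_tracer_flag` are chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define TRACE_ITER_OVERWRITE (1u << 0)   /* illustrative bit value */

struct tracer {
    const char *name;
    /* Returns nonzero to veto the change; may be NULL. */
    int (*flag_changed)(struct tracer *t, unsigned int mask, bool set);
};

/* A latency tracer refuses to have overwrite mode cleared. */
static int latency_flag_changed(struct tracer *t, unsigned int mask, bool set)
{
    (void)t;
    if ((mask & TRACE_ITER_OVERWRITE) && !set)
        return -1;
    return 0;
}

/* Apply a flag change only if the current tracer allows it. */
static int set_tracer_flag(struct tracer *cur, unsigned int *flags,
                           unsigned int mask, bool set)
{
    if (cur && cur->flag_changed && cur->flag_changed(cur, mask, set))
        return -1;               /* change rejected, flags untouched */
    if (set)
        *flags |= mask;
    else
        *flags &= ~mask;
    return 0;
}
```

Tracers that leave the callback NULL keep the old behavior, so only the latency tracers need to opt in.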
    
    [Backported for 3.4-stable. Re-added current_trace NULL checks; removed
    allocated_snapshot field; adapted to tracing_trace_options_write without
    trace_set_options.]
    
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Lingzhu Xiang <lxiang@redhat.com>
    Reviewed-by: CAI Qian <caiqian@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  20. tracing: Protect tracer flags with trace_types_lock

    rostedt authored and gregkh committed Mar 14, 2013
    commit 69d34da2984c95b33ea21518227e1f9470f11d95 upstream.
    
    Seems that the tracer flags have never been protected from
    synchronous writes. Luckily, admins don't usually modify the
    tracing flags via two different tasks. But if scripts were to
    be used to modify them, then they could get corrupted.
    
    Move the trace_types_lock that protects against tracers changing
    to also protect the flags being set.
    
    [Backported for 3.4, 3.0-stable. Moved return to after unlock.]
    
    Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: Lingzhu Xiang <lxiang@redhat.com>
    Reviewed-by: CAI Qian <caiqian@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  21. ext4: use atomic64_t for the per-flexbg free_clusters count

    tytso authored and gregkh committed Mar 12, 2013
    commit 90ba983f6889e65a3b506b30dc606aa9d1d46cd2 upstream.
    
    A user who was using an 8TB+ file system with a very large flexbg
    size (> 65536) could cause the atomic_t used in the struct flex_groups
    to overflow.  This was detected by the PaX security patchset:
    
    http://forums.grsecurity.net/viewtopic.php?f=3&t=3289&p=12551#p12551
    
    This bug was introduced in commit 9f24e42, so it's been around
    since 2.6.30.  :-(
    
    Fix this by using an atomic64_t for struct orlov_stats's
    free_clusters.
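
The arithmetic behind the overflow can be checked with a short sketch. It assumes 32768 blocks per block group (the usual figure for 4 KiB blocks, not stated in the message); the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Total blocks covered by one flex group. */
static int64_t flex_group_blocks(int64_t groups_per_flex,
                                 int64_t blocks_per_group)
{
    return groups_per_flex * blocks_per_group;
}

/* True if count can be stored in a signed 32-bit counter, which is
 * what the kernel's atomic_t wraps. */
static bool fits_in_32bit_counter(int64_t count)
{
    return count <= INT32_MAX;
}
```

Under that assumption, 65536 groups of 32768 blocks give 2^31 blocks, one more than a signed 32-bit counter can hold, while a 64-bit counter has ample room.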
    
    [Backported for 3.0-stable. Renamed free_clusters back to free_blocks;
    fixed a few more atomic_read's of free_blocks left in 3.0.]
    
    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Reviewed-by: Lukas Czerner <lczerner@redhat.com>
    Signed-off-by: Lingzhu Xiang <lxiang@redhat.com>
    Reviewed-by: CAI Qian <caiqian@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  22. efivars: Handle duplicate names from get_next_variable()

    mfleming authored and gregkh committed Mar 7, 2013
    commit e971318bbed610e28bb3fde9d548e6aaf0a6b02e upstream.
    
    Some firmware exhibits a bug where the same VariableName and
    VendorGuid values are returned on multiple invocations of
    GetNextVariableName(). See,
    
        https://bugzilla.kernel.org/show_bug.cgi?id=47631
    
    As a consequence of such a bug, Andre reports hitting the following
    WARN_ON() in the sysfs code after updating the BIOS on his, "Gigabyte
    Technology Co., Ltd. To be filled by O.E.M./Z77X-UD3H, BIOS F19e
    11/21/2012)" machine,
    
    [    0.581554] EFI Variables Facility v0.08 2004-May-17
    [    0.584914] ------------[ cut here ]------------
    [    0.585639] WARNING: at /home/andre/linux/fs/sysfs/dir.c:536 sysfs_add_one+0xd4/0x100()
    [    0.586381] Hardware name: To be filled by O.E.M.
    [    0.587123] sysfs: cannot create duplicate filename '/firmware/efi/vars/SbAslBufferPtrVar-01f33c25-764d-43ea-aeea-6b5a41f3f3e8'
    [    0.588694] Modules linked in:
    [    0.589484] Pid: 1, comm: swapper/0 Not tainted 3.8.0+ #7
    [    0.590280] Call Trace:
    [    0.591066]  [<ffffffff81208954>] ? sysfs_add_one+0xd4/0x100
    [    0.591861]  [<ffffffff810587bf>] warn_slowpath_common+0x7f/0xc0
    [    0.592650]  [<ffffffff810588bc>] warn_slowpath_fmt+0x4c/0x50
    [    0.593429]  [<ffffffff8134dd85>] ? strlcat+0x65/0x80
    [    0.594203]  [<ffffffff81208954>] sysfs_add_one+0xd4/0x100
    [    0.594979]  [<ffffffff81208b78>] create_dir+0x78/0xd0
    [    0.595753]  [<ffffffff81208ec6>] sysfs_create_dir+0x86/0xe0
    [    0.596532]  [<ffffffff81347e4c>] kobject_add_internal+0x9c/0x220
    [    0.597310]  [<ffffffff81348307>] kobject_init_and_add+0x67/0x90
    [    0.598083]  [<ffffffff81584a71>] ? efivar_create_sysfs_entry+0x61/0x1c0
    [    0.598859]  [<ffffffff81584b2b>] efivar_create_sysfs_entry+0x11b/0x1c0
    [    0.599631]  [<ffffffff8158517e>] register_efivars+0xde/0x420
    [    0.600395]  [<ffffffff81d430a7>] ? edd_init+0x2f5/0x2f5
    [    0.601150]  [<ffffffff81d4315f>] efivars_init+0xb8/0x104
    [    0.601903]  [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
    [    0.602659]  [<ffffffff81d05d80>] kernel_init_freeable+0x13e/0x1c6
    [    0.603418]  [<ffffffff81d05586>] ? loglevel+0x31/0x31
    [    0.604183]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
    [    0.604936]  [<ffffffff816a653e>] kernel_init+0xe/0xf0
    [    0.605681]  [<ffffffff816ce7ec>] ret_from_fork+0x7c/0xb0
    [    0.606414]  [<ffffffff816a6530>] ? rest_init+0x80/0x80
    [    0.607143] ---[ end trace 1609741ab737eb29 ]---
    
    There's not much we can do to work around and keep traversing the
    variable list once we hit this firmware bug. Our only solution is to
    terminate the loop because, as Lingzhu reports, some machines get
    stuck when they encounter duplicate names,
    
      > I had an IBM System x3100 M4 and x3850 X5 on which kernel would
      > get stuck in infinite loop creating duplicate sysfs files because,
      > for some reason, there are several duplicate boot entries in nvram
      > getting GetNextVariableName into a circle of iteration (with
      > period > 2).
    
    Also disable the workqueue, as efivar_update_sysfs_entries() uses
    GetNextVariableName() to figure out which variables have been created
    since the last iteration. That algorithm isn't going to work if
    GetNextVariableName() returns duplicates. Note that we don't disable
    EFI variable creation completely on the affected machines, it's just
    that any pstore dump-* files won't appear in sysfs until the next
    boot.
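
The "terminate the loop on the first duplicate" idea can be sketched in user-space C. This is a simplification under stated assumptions: names are plain C strings and the vendor GUID is ignored, and `name_is_present` and `register_until_duplicate` are hypothetical helpers, not the kernel's functions.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if name already appears among the first n entries of seen. */
static int name_is_present(const char *const *seen, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(seen[i], name) == 0)
            return 1;
    return 0;
}

/* Walk the names "returned by firmware" and register each one, stopping
 * at the first duplicate instead of iterating forever. Returns the number
 * of names stored in seen, which must have room for count entries. */
static size_t register_until_duplicate(const char *const *fw_names,
                                       size_t count, const char **seen)
{
    size_t n = 0;

    for (size_t i = 0; i < count; i++) {
        if (name_is_present(seen, n, fw_names[i]))
            break;               /* firmware bug: give up on the rest */
        seen[n++] = fw_names[i];
    }
    return n;
}
```

Entries after the first duplicate are not registered, which mirrors the trade-off described above: some variables may be missed, but the loop always ends.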
    
    [Backported for 3.0-stable. Removed code related to pstore
    workqueue but pulled in helper function variable_is_present
    from a93bc0c; Moved the definition of __efivars to the top
    for being referenced in variable_is_present.]
    
    Reported-by: Andre Heider <a.heider@gmail.com>
    Reported-by: Lingzhu Xiang <lxiang@redhat.com>
    Tested-by: Lingzhu Xiang <lxiang@redhat.com>
    Cc: Seiji Aguchi <seiji.aguchi@hds.com>
    Signed-off-by: Matt Fleming <matt.fleming@intel.com>
    Signed-off-by: Lingzhu Xiang <lxiang@redhat.com>
    Reviewed-by: CAI Qian <caiqian@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  23. efivars: explicitly calculate length of VariableName

    mfleming authored and gregkh committed Mar 1, 2013
    commit ec50bd32f1672d38ddce10fb1841cbfda89cfe9a upstream.
    
    It's not wise to assume VariableNameSize represents the length of
    VariableName, as not all firmware updates VariableNameSize in the same
    way (some don't update it at all if EFI_SUCCESS is returned). There
    are even implementations out there that update VariableNameSize with
    values that are both larger than the string returned in VariableName
    and smaller than the buffer passed to GetNextVariableName(), which
    resulted in the following bug report from Michael Schroeder,
    
      > On HP z220 system (firmware version 1.54), some EFI variables are
      > incorrectly named :
      >
      > ls -d /sys/firmware/efi/vars/*8be4d* | grep -v -- -8be returns
      > /sys/firmware/efi/vars/dbxDefault-pport8be4df61-93ca-11d2-aa0d-00e098032b8c
      > /sys/firmware/efi/vars/KEKDefault-pport8be4df61-93ca-11d2-aa0d-00e098032b8c
      > /sys/firmware/efi/vars/SecureBoot-pport8be4df61-93ca-11d2-aa0d-00e098032b8c
      > /sys/firmware/efi/vars/SetupMode-Information8be4df61-93ca-11d2-aa0d-00e098032b8c
    
    The issue here is that because we blindly use VariableNameSize without
    verifying its value, we can potentially read garbage values from the
    buffer containing VariableName if VariableNameSize is larger than the
    length of VariableName.
    
    Since VariableName is a string, we can calculate its size by searching
    for the terminating NUL character.
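
A bounded scan of this kind might look like the sketch below. It assumes the name is a buffer of 16-bit characters and that the size should include the terminator; `var_name_size` is a hypothetical name for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Size in bytes of a NUL-terminated 16-bit-character name, including
 * the terminator, never reading past buf_bytes. If no terminator is
 * found, the whole buffer size is returned. */
static size_t var_name_size(const uint16_t *name, size_t buf_bytes)
{
    size_t max_chars = buf_bytes / sizeof(uint16_t);
    size_t len = 0;

    while (len < max_chars) {
        if (name[len++] == 0)   /* count the terminator, then stop */
            break;
    }
    return len * sizeof(uint16_t);
}
```

Because the result depends only on the buffer contents, stale bytes after the terminator no longer end up in the sysfs file name.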
    
    [Backported for 3.8-stable. Removed workqueue code added in
    a93bc0c 3.9-rc1.]
    
    Reported-by: Frederic Crozat <fcrozat@suse.com>
    Cc: Matthew Garrett <mjg59@srcf.ucam.org>
    Cc: Josh Boyer <jwboyer@redhat.com>
    Cc: Michael Schroeder <mls@suse.com>
    Cc: Lee, Chun-Yi <jlee@suse.com>
    Cc: Lingzhu Xiang <lxiang@redhat.com>
    Cc: Seiji Aguchi <seiji.aguchi@hds.com>
    Signed-off-by: Matt Fleming <matt.fleming@intel.com>
    Signed-off-by: Lingzhu Xiang <lxiang@redhat.com>
    Reviewed-by: CAI Qian <caiqian@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  24. drm/i915: Don't clobber crtc->fb when queue_flip fails

    vsyrjala authored and gregkh committed Feb 22, 2013
    commit 4a35f83b2b7c6aae3fc0d1c4554fdc99dc33ad07 upstream.
    
    Restore crtc->fb to the old framebuffer if queue_flip fails.
    
    While at it, kill the pointless intel_fb temp variable.
    
    v2: Update crtc->fb before queue_flip and restore it back
        after a failure.
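
The update-then-restore pattern from the v2 note can be sketched as follows. The structures and the `queue_flip_fn` callback type are simplified, hypothetical stand-ins for the DRM ones; `flip_ok` and `flip_fail` exist only to exercise both paths.

```c
#include <stddef.h>

struct framebuffer { int id; };
struct crtc { struct framebuffer *fb; };

/* Hypothetical queue callback: returns 0 on success, nonzero on failure. */
typedef int (*queue_flip_fn)(struct crtc *crtc, struct framebuffer *fb);

static int page_flip(struct crtc *crtc, struct framebuffer *new_fb,
                     queue_flip_fn queue_flip)
{
    struct framebuffer *old_fb = crtc->fb;
    int ret;

    crtc->fb = new_fb;              /* update before queueing the flip */
    ret = queue_flip(crtc, new_fb);
    if (ret)
        crtc->fb = old_fb;          /* failure: restore the old framebuffer */
    return ret;
}

static int flip_ok(struct crtc *c, struct framebuffer *f)
{
    (void)c; (void)f;
    return 0;
}

static int flip_fail(struct crtc *c, struct framebuffer *f)
{
    (void)c; (void)f;
    return -1;
}
```

After a failed flip the crtc still points at the framebuffer that is actually being scanned out.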
    
    [Backported for 3.0-stable. Adjusted context. Please
    cherry-pick commit 7317c75e66fce0c9f82fbe6f72f7e5256b315422
    upstream before this patch as it provides necessary context
    and fixes a panic.]
    
    Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Reported-and-Tested-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Lingzhu Xiang <lxiang@redhat.com>
    Reviewed-by: CAI Qian <caiqian@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  25. drm/i915: don't set unpin_work if vblank_get fails

    jbarnes993 authored and gregkh committed Aug 29, 2011
    commit 7317c75e66fce0c9f82fbe6f72f7e5256b315422 upstream.
    
    This fixes a race where we may try to finish a page flip and decrement
    the refcount even if our vblank_get failed and we ended up with a
    spurious flip pending interrupt.
    
    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=34211.
    
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Keith Packard <keithp@keithp.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  26. nfsd4: reject "negative" acl lengths

    J. Bruce Fields authored and gregkh committed Mar 26, 2013
    commit 64a817cfbded8674f345d1117b117f942a351a69 upstream.
    
    Since we only enforce an upper bound, not a lower bound, a "negative"
    length can get through here.
    
    The symptom seen was a warning when we attempt a kmalloc with an
    excessive size.
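
The difference between the two checks can be shown with a small sketch. `ACL_LEN_MAX` is an illustrative limit and both function names are hypothetical; none of them are taken from the nfsd code.

```c
#include <stdbool.h>
#include <stdint.h>

#define ACL_LEN_MAX 1024   /* illustrative limit, not the kernel's value */

/* Upper-bound-only check: a negative length slips through. */
static bool len_ok_upper_only(int32_t len)
{
    return len <= ACL_LEN_MAX;
}

/* Check both bounds so a "negative" length is rejected. */
static bool len_ok(int32_t len)
{
    return len >= 0 && len <= ACL_LEN_MAX;
}
```

A negative signed length that passes the first check becomes a very large value once converted to an unsigned allocation size, which is consistent with the kmalloc warning mentioned above.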
    
    Reported-by: Toralf Förster <toralf.foerster@gmx.de>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  27. loop: prevent bdev freeing while device in use

    anatol authored and gregkh committed Apr 1, 2013
    commit c1681bf8a7b1b98edee8b862a42c19c4e53205fd upstream.
    
    The lifecycle of struct block_device is defined by its inode (see
    fs/block_dev.c): the block_device is allocated the first time we access
    /dev/loopXX and deallocated on bdev_destroy_inode. When we create the device
    with "losetup /dev/loopXX afile", we want that block_device to stay alive
    until we destroy the loop device with "losetup -d".
    
    But because we do not hold a reference on the /dev/loopXX inode, its counter
    goes to 0 and the inode/bdev can be destroyed at any moment. This usually
    happens under memory pressure or when the user drops the inode cache (as in
    the test below). When loop_clr_fd() later tries to use the bdev, we get a
    use-after-free error with the following stack:
    
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000280
      bd_set_size+0x10/0xa0
      loop_clr_fd+0x1f8/0x420 [loop]
      lo_ioctl+0x200/0x7e0 [loop]
      lo_compat_ioctl+0x47/0xe0 [loop]
      compat_blkdev_ioctl+0x341/0x1290
      do_filp_open+0x42/0xa0
      compat_sys_ioctl+0xc1/0xf20
      do_sys_open+0x16e/0x1d0
      sysenter_dispatch+0x7/0x1a
    
    To prevent use-after-free we need to grab the device in loop_set_fd()
    and put it later in loop_clr_fd().
    
    The issue is reproducible on current Linus head and v3.3. Here is the test:
    
      dd if=/dev/zero of=loop.file bs=1M count=1
      while [ true ]; do
        losetup /dev/loop0 loop.file
        echo 2 > /proc/sys/vm/drop_caches
        losetup -d /dev/loop0
      done
    
    [ Doing bdgrab/bdput in loop_set_fd/loop_clr_fd is safe, because every
      time we call loop_set_fd() we check that loop_device->lo_state is
      Lo_unbound and set it to Lo_bound. If somebody tries to set_fd again,
      they will get EBUSY.  And if we try to loop_clr_fd() on an unbound loop
      device, we'll get ENXIO.
    
      loop_set_fd/loop_clr_fd (and any other loop ioctl) is called under
      loop_device->lo_ctl_mutex. ]
    
    Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  28. KVM: x86: invalid opcode oops on SET_SREGS with OSXSAVE bit set (CVE-2012-4461)

    Petr Matousek authored and gregkh committed Mar 19, 2013
    
    commit 6d1068b3a98519247d8ba4ec85cd40ac136dbdf9 upstream.
    
    On hosts without XSAVE support, an unprivileged local user can trigger an
    oops similar to the one below by setting the X86_CR4_OSXSAVE bit in the guest
    cr4 register using the KVM_SET_SREGS ioctl and later issuing the KVM_RUN
    ioctl.
    
    invalid opcode: 0000 [#2] SMP
    Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables
    ...
    Pid: 24935, comm: zoog_kvm_monito Tainted: G      D      3.2.0-3-686-pae
    EIP: 0060:[<f8b9550c>] EFLAGS: 00210246 CPU: 0
    EIP is at kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm]
    EAX: 00000001 EBX: 000f387e ECX: 00000000 EDX: 00000000
    ESI: 00000000 EDI: 00000000 EBP: ef5a0060 ESP: d7c63e70
     DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    Process zoog_kvm_monito (pid: 24935, ti=d7c62000 task=ed84a0c0
    task.ti=d7c62000)
    Stack:
     00000001 f70a1200 f8b940a9 ef5a0060 00000000 00200202 f8769009 00000000
     ef5a0060 000f387e eda5c020 8722f9c8 00015bae 00000000 ed84a0c0 ed84a0c0
     c12bf02d 0000ae80 ef7f8740 fffffffb f359b740 ef5a0060 f8b85dc1 0000ae80
    Call Trace:
     [<f8b940a9>] ? kvm_arch_vcpu_ioctl_set_sregs+0x2fe/0x308 [kvm]
    ...
     [<c12bfb44>] ? syscall_call+0x7/0xb
    Code: 89 e8 e8 14 ee ff ff ba 00 00 04 00 89 e8 e8 98 48 ff ff 85 c0 74
    1e 83 7d 48 00 75 18 8b 85 08 07 00 00 31 c9 8b 95 0c 07 00 00 <0f> 01
    d1 c7 45 48 01 00 00 00 c7 45 1c 01 00 00 00 0f ae f0 89
    EIP: [<f8b9550c>] kvm_arch_vcpu_ioctl_run+0x92a/0xd13 [kvm] SS:ESP
    0068:d7c63e70
    
    QEMU first retrieves the supported features via KVM_GET_SUPPORTED_CPUID
    and then sets them later. So guest's X86_FEATURE_XSAVE should be masked
    out on hosts without X86_FEATURE_XSAVE, making kvm_set_cr4 with
    X86_CR4_OSXSAVE fail. Userspaces that allow specifying guest cpuid with
    X86_FEATURE_XSAVE even on hosts that do not support it, might be
    susceptible to this attack from inside the guest as well.
    
    Allow setting X86_CR4_OSXSAVE bit only if host has XSAVE support.
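
The rule stated above amounts to a single predicate on the requested CR4 value. The following is a minimal sketch under stated assumptions: `model_check_cr4` and the `host_has_xsave` parameter are hypothetical and do not reproduce the actual kernel change; only the bit position of CR4.OSXSAVE (bit 18) is taken from the x86 architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CR4.OSXSAVE is bit 18 of the x86 CR4 register. */
#define X86_CR4_OSXSAVE (UINT32_C(1) << 18)

/*
 * Hypothetical helper: returns 0 when the requested CR4 value may be
 * accepted and -1 when it must be rejected. host_has_xsave stands in for
 * whatever host feature test the real code performs.
 */
int model_check_cr4(uint32_t cr4, bool host_has_xsave)
{
    if ((cr4 & X86_CR4_OSXSAVE) && !host_has_xsave)
        return -1;   /* guest asks for OSXSAVE, host cannot provide it */
    return 0;
}
```

Rejecting the value when the registers are set, rather than when the vcpu runs, keeps the failure in the ioctl that supplied the bad state.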
    
    Signed-off-by: Petr Matousek <pmatouse@redhat.com>
    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  29. mm/hotplug: correctly add new zone to all other nodes' zone lists

    Jiang Liu authored and gregkh committed Mar 19, 2013
    commit 08dff7b7d629807dbb1f398c68dd9cd58dd657a1 upstream.
    
    When online_pages() is called to add new memory to an empty zone, it
    rebuilds all zone lists by calling build_all_zonelists().  But there's a
    bug which prevents the new zone from being added to other nodes' zone lists.
    
    online_pages() {
    	build_all_zonelists()
    	.....
    	node_set_state(zone_to_nid(zone), N_HIGH_MEMORY)
    }
    
    Here the node of the zone is put into N_HIGH_MEMORY state after calling
    build_all_zonelists(), but build_all_zonelists() only adds zones from
    nodes in N_HIGH_MEMORY state to the fallback zone lists.
    
    build_all_zonelists()
        ->__build_all_zonelists()
            ->build_zonelists()
                ->find_next_best_node()
                    ->for_each_node_state(n, N_HIGH_MEMORY)
    
    So memory in the new zone will never be used by other nodes, and it may
    cause strange behavior when the system is under memory pressure.  So put the
    node into N_HIGH_MEMORY state before calling build_all_zonelists().
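
The ordering problem can be reproduced in a toy model. This is a sketch under stated assumptions, not kernel code: `model_system`, the `has_memory` flags (standing in for N_HIGH_MEMORY) and all `model_*` functions are hypothetical, and the fallback lists are reduced to a boolean matrix.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_NODES 4

struct model_system {
    bool has_memory[MODEL_MAX_NODES];   /* stands in for N_HIGH_MEMORY */
    /* in_fallback[a][b]: node b's zones appear in node a's fallback list */
    bool in_fallback[MODEL_MAX_NODES][MODEL_MAX_NODES];
};

/* Rebuild every fallback list from the nodes currently flagged as having memory. */
void model_build_all_zonelists(struct model_system *s)
{
    for (int a = 0; a < MODEL_MAX_NODES; a++)
        for (int b = 0; b < MODEL_MAX_NODES; b++)
            s->in_fallback[a][b] = s->has_memory[b];
}

/* Buggy ordering: lists are rebuilt before the node is flagged. */
void model_online_pages_buggy(struct model_system *s, int nid)
{
    assert(nid >= 0 && nid < MODEL_MAX_NODES);
    model_build_all_zonelists(s);
    s->has_memory[nid] = true;
}

/* Fixed ordering: flag the node first, then rebuild. */
void model_online_pages(struct model_system *s, int nid)
{
    assert(nid >= 0 && nid < MODEL_MAX_NODES);
    s->has_memory[nid] = true;
    model_build_all_zonelists(s);
}
```

With node 0 already populated, onlining node 1 through the buggy variant leaves `in_fallback[0][1]` false, while the fixed variant sets it.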
    
    Signed-off-by: Jianguo Wu <wujianguo@huawei.com>
    Signed-off-by: Jiang Liu <liuj97@gmail.com>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Michal Hocko <mhocko@suse.cz>
    Cc: Minchan Kim <minchan@kernel.org>
    Cc: Rusty Russell <rusty@rustcorp.com.au>
    Cc: Yinghai Lu <yinghai@kernel.org>
    Cc: Tony Luck <tony.luck@intel.com>
    Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Keping Chen <chenkeping@huawei.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  30. KVM: Fix buffer overflow in kvm_set_irq()

    Avi Kivity authored and gregkh committed Mar 19, 2013
    commit f2ebd422f71cda9c791f76f85d2ca102ae34a1ed upstream.
    
    kvm_set_irq() has an internal buffer of three irq routing entries, allowing
    connecting a GSI to three IRQ chips or one MSI.  However setup_routing_entry()
    does not properly enforce this, allowing three irqchip routes followed by
    an MSI route to overflow the buffer.
    
    Fix by ensuring that an MSI entry is added to an empty list.
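
The invariant described above, at most three irqchip routes or exactly one MSI route per GSI, can be sketched as follows. All names (`model_gsi`, `model_add_route`, `MODEL_MAX_ROUTES`) are hypothetical, and the explicit capacity check is an extra guard added for this illustration rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_ROUTES 3   /* size of the internal buffer described above */

enum model_route_type { MODEL_ROUTE_IRQCHIP, MODEL_ROUTE_MSI };

struct model_gsi {
    enum model_route_type types[MODEL_MAX_ROUTES];
    size_t count;
};

/* Returns 0 on success, -1 if the route would break the invariant. */
int model_add_route(struct model_gsi *gsi, enum model_route_type type)
{
    /* An MSI route must be the only route for its GSI. */
    if (gsi->count > 0 &&
        (type == MODEL_ROUTE_MSI || gsi->types[0] == MODEL_ROUTE_MSI))
        return -1;
    /* Assumed extra guard: never write past the fixed-size buffer. */
    if (gsi->count >= MODEL_MAX_ROUTES)
        return -1;
    gsi->types[gsi->count++] = type;
    assert(gsi->count <= MODEL_MAX_ROUTES);
    return 0;
}
```

In this model the sequence from the bug report, three irqchip routes followed by an MSI route, is rejected at the fourth call.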
    
    Signed-off-by: Avi Kivity <avi@redhat.com>
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  31. macvtap: zerocopy: validate vectors before building skb

    jasowang authored and gregkh committed Mar 19, 2013
    commit b92946e2919134ebe2a4083e4302236295ea2a73 upstream.
    
    There are several reasons that the vectors need to be validated:
    
    - Return error when caller provides vectors whose num is greater than UIO_MAXIOV.
    - Linearize part of skb when userspace provides vectors greater than MAX_SKB_FRAGS.
    - Return error when userspace provides vectors whose total length may exceed
      MAX_SKB_FRAGS * PAGE_SIZE.
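
The first and third checks in the list above can be sketched in userspace C. This is an illustration only: `model_iovec`, `model_validate_vectors` and the `MODEL_*` constants are hypothetical, the constant values are assumed for the example, and the linearization step from the second bullet is not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed example values; the kernel's real constants depend on its configuration. */
#define MODEL_UIO_MAXIOV    1024
#define MODEL_MAX_SKB_FRAGS 18
#define MODEL_PAGE_SIZE     4096

struct model_iovec {
    size_t len;   /* length of one user-supplied vector */
};

/* Returns 0 if the vectors are acceptable, -1 otherwise. */
int model_validate_vectors(const struct model_iovec *iov, size_t count)
{
    const size_t limit = (size_t)MODEL_MAX_SKB_FRAGS * MODEL_PAGE_SIZE;
    size_t total = 0;

    if (count > MODEL_UIO_MAXIOV)
        return -1;   /* too many vectors */

    for (size_t i = 0; i < count; i++) {
        /* Compare against the remaining room so the sum cannot overflow. */
        if (iov[i].len > limit - total)
            return -1;   /* total length would exceed the limit */
        total += iov[i].len;
    }
    return 0;
}
```

Checking each length against the remaining room, instead of summing first and comparing afterwards, avoids wrapping `total` when userspace passes very large lengths.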
    
    Signed-off-by: Jason Wang <jasowang@redhat.com>
    Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    Signed-off-by: Benjamin Poirier <bpoirier@suse.de> [patch reduced to
    					the 3rd reason only for 3.0]
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  32. KVM: Ensure all vcpus are consistent with in-kernel irqchip settings

    Avi Kivity authored and gregkh committed Mar 19, 2013
    commit 3e515705a1f46beb1c942bb8043c16f8ac7b1e9e upstream.
    
    If some vcpus are created before KVM_CREATE_IRQCHIP, then
    irqchip_in_kernel() and vcpu->arch.apic will be inconsistent, leading
    to potential NULL pointer dereferences.
    
    Fix by:
    - ensuring that no vcpus are installed when KVM_CREATE_IRQCHIP is called
    - ensuring that a vcpu has an apic if it is installed after KVM_CREATE_IRQCHIP
    
    This is somewhat long-winded because vcpu->arch.apic is created without
    kvm->lock held.
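
The two guarantees listed above can be modeled without any locking to show the intended invariant. This is a sketch under stated assumptions: `model_vm`, `model_vcpu`, `model_create_irqchip` and `model_install_vcpu` are hypothetical names, and the model deliberately ignores the concurrency that the commit message identifies as the hard part.

```c
#include <assert.h>
#include <stdbool.h>

struct model_vm {
    bool irqchip_in_kernel;
    int  installed_vcpus;
};

struct model_vcpu {
    bool has_apic;
};

/* Models KVM_CREATE_IRQCHIP: refuse once any vcpu has been installed. */
int model_create_irqchip(struct model_vm *vm)
{
    if (vm->installed_vcpus > 0)
        return -1;
    vm->irqchip_in_kernel = true;
    return 0;
}

/* Models vcpu installation: with an in-kernel irqchip, the vcpu needs an apic. */
int model_install_vcpu(struct model_vm *vm, const struct model_vcpu *vcpu)
{
    if (vm->irqchip_in_kernel && !vcpu->has_apic)
        return -1;
    vm->installed_vcpus++;
    assert(vm->installed_vcpus > 0);
    return 0;
}
```

With both checks in place, the model never reaches a state where the irqchip is in the kernel and an installed vcpu lacks an apic.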
    
    Based on earlier patch by Michael Ellerman.
    
    Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
    Signed-off-by: Avi Kivity <avi@redhat.com>
    Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>