Commits on Feb 19, 2017
  1. fix merge conflict

    committed Feb 19, 2017
  2. Merge tag 'v4.9.11' into stable-4.9

    This is the 4.9.11 stable release
    committed Feb 19, 2017
Commits on Feb 18, 2017
  1. Linux 4.9.11

    gregkh committed Feb 18, 2017
  2. x86/fpu/xstate: Fix xcomp_bv in XSAVES header

    commit dffba9a upstream.
    
    The compacted-format XSAVES area is determined at boot time and
    never changed after.  The field xsave.header.xcomp_bv indicates
    which components are in the fixed XSAVES format.
    
    In fpstate_init() we did not set xcomp_bv to reflect the XSAVES
    format since at the time there is no valid data.
    
    However, after we do copy_init_fpstate_to_fpregs() in fpu__clear(),
    as in commit:
    
      b22cbe4 x86/fpu: Fix invalid FPU ptrace state after execve()
    
    and when __fpu_restore_sig() does fpu__restore() for a COMPAT-mode
    app, a #GP occurs.  This can be easily triggered by doing valgrind on
    a COMPAT-mode "Hello World," as reported by Joakim Tjernlund and
    others:
    
    	https://bugzilla.kernel.org/show_bug.cgi?id=190061
    
    Fix it by setting xcomp_bv correctly.
    
    This patch also moves the xcomp_bv initialization to the proper
    place, which was in copyin_to_xsaves() as of:
    
      4c83336 x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
    
    which fixed the bug too, but it's more efficient and cleaner to
    initialize things once per boot, not for every signal handling
    operation.
    
    Reported-by: Kevin Hao <haokexin@gmail.com>
    Reported-by: Joakim Tjernlund <Joakim.Tjernlund@infinera.com>
    Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Borislav Petkov <bp@suse.de>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Fenghua Yu <fenghua.yu@intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: haokexin@gmail.com
    Link: http://lkml.kernel.org/r/1485212084-4418-1-git-send-email-yu-cheng.yu@intel.com
    [ Combined it with 4c83336. ]
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    yyu168 committed with gregkh Jan 23, 2017
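
As a rough illustration of the xcomp_bv fix above, the sketch below sets the field once at init time when XSAVES is in use. The struct, the helper name, and the `cpu_has_xsaves` flag are simplified stand-ins assumed for this example, not the kernel's definitions; the only detail taken from the architecture is that bit 63 of xcomp_bv marks the compacted format.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 63 of xcomp_bv marks the compacted XSAVES format. */
#define XCOMP_BV_COMPACTED_FORMAT ((uint64_t)1 << 63)

/* Simplified stand-in for the 64-byte XSAVE header (assumption). */
struct xsave_header_sketch {
    uint64_t xfeatures;
    uint64_t xcomp_bv;
    uint64_t reserved[6];
};

/*
 * Hypothetical init helper: record the compacted format and the enabled
 * components up front, instead of patching the header on every signal.
 */
static void init_xsave_header(struct xsave_header_sketch *hdr,
                              int cpu_has_xsaves, uint64_t xfeatures_mask)
{
    hdr->xfeatures = 0;
    hdr->xcomp_bv = cpu_has_xsaves
        ? (XCOMP_BV_COMPACTED_FORMAT | xfeatures_mask)
        : 0;
    for (int i = 0; i < 6; i++)
        hdr->reserved[i] = 0;
}
```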
  3. tcp: don't annotate mark on control socket from tcp_v6_send_response()

    commit 92e55f4 upstream.
    
    Unlike ipv4, this control socket is shared by all cpus so we cannot use
    it as scratchpad area to annotate the mark that we pass to ip6_xmit().
    
    Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
    family caches the flowi6 structure in the sctp_transport structure, so
    we cannot use it to carry the mark unless we later on reset it back, which
    I discarded since it looks ugly to me.
    
    Fixes: bf99b4d ("tcp: fix mark propagation with fwmark_reflect enabled")
    Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Pablo Neira committed with gregkh Jan 26, 2017
  4. net/mlx5: Don't unlock fte while still using it

    commit 0fd758d upstream.
    
    When adding a new rule to an fte, we need to hold the fte lock
    until we add that rule to the fte and increase the fte ref count.
    
    Fixes: 0c56b97 ("net/mlx5_core: Introduce flow steering API")
    Signed-off-by: Mark Bloch <markb@mellanox.com>
    Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    Signed-off-by: Leon Romanovsky <leon@kernel.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    mark-bloch committed with gregkh Sep 5, 2016
  5. tcp: fix mark propagation with fwmark_reflect enabled

    commit bf99b4d upstream.
    
    Otherwise, RST packets generated by the TCP stack for non-existing
    sockets always have mark 0.
    The mark from the original packet is assigned to the netns_ipv4/6
    socket used to send the response so that it can get copied into the
    response skb when the socket sends it.
    
    Fixes: e110861 ("net: add a sysctl to reflect the fwmark on replies")
    Cc: Lorenzo Colitti <lorenzo@google.com>
    Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    pespin committed with gregkh Jan 6, 2017
  6. igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()

    [ Upstream commit 9c8bb16 ]
    
    In function igmpv3/mld_add_delrec() we allocate pmc and put it in
    idev->mc_tomb, so we should free it when we don't need it in del_delrec().
    But I removed kfree(pmc) incorrectly in the latest two patches. Now fix it.
    
    Fixes: 24803f3 ("igmp: do not remove igmp souce list info when ...")
    Fixes: 1666d49 ("mld: do not remove mld souce list info when ...")
    Reported-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    liuhangbin committed with gregkh Feb 8, 2017
  7. mld: do not remove mld souce list info when set link down

    [ Upstream commit 1666d49 ]
    
    This is an IPv6 version of commit 24803f3 ("igmp: do not remove igmp
    souce list..."). In mld_del_delrec(), we now restore all source filter
    info instead of flushing it.
    
    Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since
    we should not remove source list info when the link is set down. Remove
    igmp6_group_dropped() in ipv6_mc_destroy_dev() since we have called it in
    ipv6_mc_down().
    
    Also clear all source info after igmp6_group_dropped() instead of in it
    because ipv6_mc_down() will call igmp6_group_dropped().
    
    Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    liuhangbin committed with gregkh Jan 12, 2017
  8. l2tp: do not use udp_ioctl()

    [ Upstream commit 72fb96e ]
    
    udp_ioctl(), as its name suggests, is used by UDP protocols,
    but is also used by L2TP :(
    
    L2TP should use its own handler, because it really does not
    look the same.
    
    SIOCINQ for instance should not assume UDP checksum or headers.
    
    Thanks to Andrey and syzkaller team for providing the report
    and a nice reproducer.
    
    While crashes only happen on recent kernels (after commit
    7c13f97 ("udp: do fwd memory scheduling on dequeue")), this
    probably needs to be backported to older kernels.
    
    Fixes: 7c13f97 ("udp: do fwd memory scheduling on dequeue")
    Fixes: 8558467 ("udp: Fix udp_poll() and ioctl()")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Andrey Konovalov <andreyknvl@google.com>
    Acked-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 10, 2017
  9. net: dsa: Do not destroy invalid network devices

    [ Upstream commit 382e1ee ]
    
    dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
    for the network device not being NULL before attempting to destroy it. We were
    not setting the slave network device as NULL if dsa_slave_create() failed, so
    we would later on be calling dsa_slave_destroy() on a now freed and
    uninitialized network device, causing crashes in dsa_slave_destroy().
    
    Fixes: 83c0afa ("net: dsa: Add new binding implementation")
    Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    ffainelli committed with gregkh Feb 8, 2017
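
The pattern behind the DSA fix above can be sketched in user space as follows. All names here (`port_sketch`, `port_create`, `port_unapply`) are hypothetical stand-ins, not the DSA API; the point is only that a NULL check in the teardown path helps solely when the failure path resets the pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a port holding a network device pointer. */
struct port_sketch { int *netdev; };

static int destroy_calls;

/* On failure, reset the pointer so later teardown can detect it. */
static int port_create(struct port_sketch *p, int should_fail)
{
    p->netdev = malloc(sizeof(*p->netdev));
    if (!p->netdev)
        return -1;
    if (should_fail) {
        free(p->netdev);
        p->netdev = NULL;   /* never leave a dangling pointer behind */
        return -1;
    }
    return 0;
}

/* Teardown only destroys a device that was actually created. */
static void port_unapply(struct port_sketch *p)
{
    if (!p->netdev)
        return;
    destroy_calls++;
    free(p->netdev);
    p->netdev = NULL;
}
```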
  10. ping: fix a null pointer dereference

    [ Upstream commit 73d2c66 ]
    
    Andrey reported a kernel crash:
    
      general protection fault: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 2 PID: 3880 Comm: syz-executor1 Not tainted 4.10.0-rc6+ #124
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      task: ffff880060048040 task.stack: ffff880069be8000
      RIP: 0010:ping_v4_push_pending_frames net/ipv4/ping.c:647 [inline]
      RIP: 0010:ping_v4_sendmsg+0x1acd/0x23f0 net/ipv4/ping.c:837
      RSP: 0018:ffff880069bef8b8 EFLAGS: 00010206
      RAX: dffffc0000000000 RBX: ffff880069befb90 RCX: 0000000000000000
      RDX: 0000000000000018 RSI: ffff880069befa30 RDI: 00000000000000c2
      RBP: ffff880069befbb8 R08: 0000000000000008 R09: 0000000000000000
      R10: 0000000000000002 R11: 0000000000000000 R12: ffff880069befab0
      R13: ffff88006c624a80 R14: ffff880069befa70 R15: 0000000000000000
      FS:  00007f6f7c716700(0000) GS:ffff88006de00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000004a6f28 CR3: 000000003a134000 CR4: 00000000000006e0
      Call Trace:
       inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
       sock_sendmsg_nosec net/socket.c:635 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:645
       SYSC_sendto+0x660/0x810 net/socket.c:1687
       SyS_sendto+0x40/0x50 net/socket.c:1655
       entry_SYSCALL_64_fastpath+0x1f/0xc2
    
    This is because we miss a check for NULL pointer for skb_peek() when
    the queue is empty. Other places already have the same check.
    
    Fixes: c319b4d ("net: ipv4: add IPPROTO_ICMP socket kind")
    Reported-by: Andrey Konovalov <andreyknvl@google.com>
    Tested-by: Andrey Konovalov <andreyknvl@google.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    congwang committed with gregkh Feb 7, 2017
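
A minimal sketch of the missing check described in the ping fix above, using a toy queue instead of the kernel's sk_buff queue. The types, function names, and return values are assumptions made for illustration; the real code uses skb_peek() and its own error convention.

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet queue standing in for the socket write queue (assumption). */
struct pkt_sketch { struct pkt_sketch *next; int len; };
struct queue_sketch { struct pkt_sketch *head; };

/* Like skb_peek(): returns the first element, or NULL if the queue is empty. */
static struct pkt_sketch *queue_peek(const struct queue_sketch *q)
{
    return q->head;
}

/* Returns 0 and stores the head length, or -1 if nothing is pending. */
static int push_pending(const struct queue_sketch *q, int *len_out)
{
    struct pkt_sketch *pkt = queue_peek(q);

    if (!pkt)           /* the check that was missing: empty queue */
        return -1;
    *len_out = pkt->len;
    return 0;
}
```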
  11. packet: round up linear to header len

    [ Upstream commit 57031eb ]
    
    Link layer protocols may unconditionally pull headers, as Ethernet
    does in eth_type_trans. Ensure that the entire link layer header
    always lies in the skb linear segment. tpacket_snd has such a check.
    Extend this to packet_snd.
    
    Variable length link layer headers complicate the computation
    somewhat. Here skb->len may be smaller than dev->hard_header_len.
    
    Round up the linear length to be at least as long as the smallest of
    the two.
    
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    wdebruij committed with gregkh Feb 7, 2017
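
The rounding rule in the commit above can be written as a small arithmetic sketch. The helper names are hypothetical; `pkt_len` stands for skb->len and `hard_header_len` for dev->hard_header_len.

```c
#include <assert.h>

static int min_int(int a, int b) { return a < b ? a : b; }
static int max_int(int a, int b) { return a > b ? a : b; }

/*
 * Linear length is raised to at least min(pkt_len, hard_header_len), so the
 * link layer header always sits in the linear segment, even when a
 * variable-length header makes the packet shorter than hard_header_len.
 */
static int round_up_linear(int linear, int pkt_len, int hard_header_len)
{
    return max_int(linear, min_int(pkt_len, hard_header_len));
}
```

For example, with a 14-byte Ethernet header, a request for 0 linear bytes on a 100-byte packet becomes 14, while a 10-byte packet yields 10.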
  12. net: introduce device min_header_len

    [ Upstream commit 217e6fa ]
    
    The stack must not pass packets to device drivers that are shorter
    than the minimum link layer header length.
    
    Previously, packet sockets would drop packets smaller than or equal
    to dev->hard_header_len, but this has false positives. Zero length
    payload is used over Ethernet. Other link layer protocols support
    variable length headers. Support for validation of these protocols
    removed the min length check for all protocols.
    
    Introduce an explicit dev->min_header_len parameter and drop all
    packets below this value. Initially, set it to non-zero only for
    Ethernet and loopback. Other protocols can follow in a patch to
    net-next.
    
    Fixes: 9ed988c ("packet: validate variable length ll headers")
    Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    wdebruij committed with gregkh Feb 7, 2017
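
A sketch of the length check described in the commit above, assuming a simplified device struct. Only the two field names come from the commit message; the struct and helper are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the relevant net_device fields (assumption). */
struct dev_sketch {
    unsigned int hard_header_len;
    unsigned int min_header_len;
};

/*
 * Accept a packet only if it is at least min_header_len bytes long.
 * A packet exactly as long as the header (zero-length payload) passes,
 * and a protocol with min_header_len == 0 is not length-checked at all.
 */
static bool ll_header_len_ok(const struct dev_sketch *dev, unsigned int len)
{
    return len >= dev->min_header_len;
}
```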
  13. sit: fix a double free on error path

    [ Upstream commit d7426c6 ]
    
    Dmitry reported a double free in sit_init_net():
    
      kernel BUG at mm/percpu.c:689!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 0 PID: 15692 Comm: syz-executor1 Not tainted 4.10.0-rc6-next-20170206 #1
      Hardware name: Google Google Compute Engine/Google Compute Engine,
      BIOS Google 01/01/2011
      task: ffff8801c9cc27c0 task.stack: ffff88017d1d8000
      RIP: 0010:pcpu_free_area+0x68b/0x810 mm/percpu.c:689
      RSP: 0018:ffff88017d1df488 EFLAGS: 00010046
      RAX: 0000000000010000 RBX: 00000000000007c0 RCX: ffffc90002829000
      RDX: 0000000000010000 RSI: ffffffff81940efb RDI: ffff8801db841d94
      RBP: ffff88017d1df590 R08: dffffc0000000000 R09: 1ffffffff0bb3bdd
      R10: dffffc0000000000 R11: 00000000000135dd R12: ffff8801db841d80
      R13: 0000000000038e40 R14: 00000000000007c0 R15: 00000000000007c0
      FS:  00007f6ea608f700(0000) GS:ffff8801dbe00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000002000aff8 CR3: 00000001c8d44000 CR4: 00000000001426f0
      DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
      Call Trace:
       free_percpu+0x212/0x520 mm/percpu.c:1264
       ipip6_dev_free+0x43/0x60 net/ipv6/sit.c:1335
       sit_init_net+0x3cb/0xa10 net/ipv6/sit.c:1831
       ops_init+0x10a/0x530 net/core/net_namespace.c:115
       setup_net+0x2ed/0x690 net/core/net_namespace.c:291
       copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
       create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
       unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
       SYSC_unshare kernel/fork.c:2281 [inline]
       SyS_unshare+0x64e/0xfc0 kernel/fork.c:2231
       entry_SYSCALL_64_fastpath+0x1f/0xc2
    
    This is because when tunnel->dst_cache init fails, we free dev->tstats
    once in ipip6_tunnel_init() and twice in sit_init_net(). This looks
    redundant but its ndo_uinit() does not seem enough to clean up everything
    here. So avoid this by setting dev->tstats to NULL after the first free,
    at least for -net.
    
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    congwang committed with gregkh Feb 8, 2017
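
The idea of the sit fix above, clearing the pointer after the first free so a second cleanup pass is harmless, can be sketched in user space like this. The struct and function names are hypothetical, and plain free() stands in for free_percpu().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tunnel device with per-device stats. */
struct tunnel_dev_sketch { long *tstats; };

static int real_frees;

/* Safe to call more than once: the second call sees NULL and does nothing. */
static void dev_free_stats(struct tunnel_dev_sketch *dev)
{
    if (dev->tstats)
        real_frees++;
    free(dev->tstats);    /* free(NULL) is a no-op */
    dev->tstats = NULL;   /* prevents a double free on the next cleanup */
}
```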
  14. lwtunnel: valid encap attr check should return 0 when lwtunnel is disabled

    [ Upstream commit 2bd137d ]
    
    An error was reported upgrading to 4.9.8:
        root@Typhoon:~# ip route add default table 210 nexthop dev eth0 via 10.68.64.1
        weight 1 nexthop dev eth0 via 10.68.64.2 weight 1
        RTNETLINK answers: Operation not supported
    
    The problem occurs when CONFIG_LWTUNNEL is not enabled and a multipath
    route is submitted.
    
    The point of lwtunnel_valid_encap_type_attr is to catch modules that
    need to be loaded before any references are taken with rtnl held. With
    CONFIG_LWTUNNEL disabled, there will be no modules to load so the
    lwtunnel_valid_encap_type_attr stub should just return 0.
    
    Fixes: 9ed5959 ("lwtunnel: fix autoload of lwt modules")
    Reported-by: pupilla@libero.it
    Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    dsahern committed with gregkh Feb 8, 2017
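
The stub behaviour described in the lwtunnel commit above follows a common compile-time pattern, sketched here. `CONFIG_LWTUNNEL_SKETCH`, the function name, and the range check in the enabled branch are placeholders assumed for this example; they are not the kernel's symbols or logic.

```c
#include <assert.h>
#include <errno.h>

/* Uncomment to model a build with the feature enabled (hypothetical macro). */
/* #define CONFIG_LWTUNNEL_SKETCH 1 */

#ifdef CONFIG_LWTUNNEL_SKETCH
/* Feature enabled: reject encap types that have no handler (placeholder check). */
static int valid_encap_type_attr(int encap_type)
{
    return (encap_type > 0 && encap_type < 8) ? 0 : -EOPNOTSUPP;
}
#else
/* Feature disabled: there is nothing to load, so validation always succeeds. */
static int valid_encap_type_attr(int encap_type)
{
    (void)encap_type;
    return 0;
}
#endif
```

Returning an error from the disabled stub is what would make an otherwise valid multipath route fail with "Operation not supported".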
  15. sctp: avoid BUG_ON on sctp_wait_for_sndbuf

    [ Upstream commit 2dcab59 ]
    
    Alexander Popov reported that an application may trigger a BUG_ON in
    sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
    waiting on it to queue more data and meanwhile another thread peels off
    the association being used by the first thread.
    
    This patch replaces the BUG_ON call with proper error handling. It
    will return -EPIPE to the original sendmsg call, similarly to what would
    have been done if the association wasn't found in the first place.
    
    Acked-by: Alexander Popov <alex.popov@linux.com>
    Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    marceloleitner committed with gregkh Feb 6, 2017
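
A minimal sketch of the error handling described in the sctp commit above: after waking up, the waiter rechecks that the association still belongs to its socket and returns -EPIPE if it was peeled off. The structs and the helper name are hypothetical simplifications.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for a socket and an association bound to it. */
struct sock_sketch { int id; };
struct assoc_sketch { struct sock_sketch *sk; };

/*
 * Called after the sender wakes up from waiting for buffer space.
 * If another thread moved the association to a new socket in the
 * meantime, report an error instead of crashing the kernel.
 */
static int recheck_after_wait(const struct assoc_sketch *asoc,
                              const struct sock_sketch *sk)
{
    if (asoc->sk != sk)
        return -EPIPE;
    return 0;
}
```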
  16. mlx4: Invoke softirqs after napi_reschedule

    [ Upstream commit bd4ce94 ]
    
    mlx4 may schedule napi from a workqueue. Afterwards, softirqs are not run
    in a deterministic time frame and the following message may be logged:
    NOHZ: local_softirq_pending 08
    
    The problem is the same as what was described in commit ec13ee8
    ("virtio_net: invoke softirqs after __napi_schedule") and this patch
    applies the same fix to mlx4.
    
    Fixes: 07841f9 ("net/mlx4_en: Schedule napi when RX buffers allocation fails")
    Cc: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Benjamin Poirier committed with gregkh Feb 6, 2017
  17. catc: Use heap buffer for memory size test

    [ Upstream commit 2d6a0e9 ]
    
    Allocating USB buffers on the stack is not portable, and no longer
    works on x86_64 (with VMAP_STACK enabled as per default).
    
    Fixes: 1da177e ("Linux-2.6.12-rc2")
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    bwhacks committed with gregkh Feb 4, 2017
  18. catc: Combine failure cleanup code in catc_probe()

    [ Upstream commit d411491 ]
    
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    bwhacks committed with gregkh Feb 4, 2017
  19. rtl8150: Use heap buffers for all register access

    [ Upstream commit 7926aff ]
    
    Allocating USB buffers on the stack is not portable, and no longer
    works on x86_64 (with VMAP_STACK enabled as per default).
    
    Fixes: 1da177e ("Linux-2.6.12-rc2")
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    bwhacks committed with gregkh Feb 4, 2017
  20. pegasus: Use heap buffers for all register access

    [ Upstream commit 5593523 ]
    
    Allocating USB buffers on the stack is not portable, and no longer
    works on x86_64 (with VMAP_STACK enabled as per default).
    
    Fixes: 1da177e ("Linux-2.6.12-rc2")
    References: https://bugs.debian.org/852556
    Reported-by: Lisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
    Tested-by: Lisandro Damián Nicanor Pérez Meyer <lisandro@debian.org>
    Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    bwhacks committed with gregkh Feb 4, 2017
  21. macvtap: read vnet_hdr_size once

    [ Upstream commit 837585a ]
    
    When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
    Data length is verified to be greater than or equal to expected header
    length tun->vnet_hdr_sz before copying.
    
    Macvtap functions read the value once, but unless READ_ONCE is used,
    the compiler may ignore this and read multiple times. Enforce a single
    read and locally cached value to avoid updates between test and use.
    
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    wdebruij committed with gregkh Feb 3, 2017
  22. tun: read vnet_hdr_sz once

    [ Upstream commit e1edab8 ]
    
    When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
    Data length is verified to be greater than or equal to expected header
    length tun->vnet_hdr_sz before copying.
    
    Read this value once and cache locally, as it can be updated between
    the test and use (TOCTOU).
    
    Signed-off-by: Willem de Bruijn <willemb@google.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    CC: Eric Dumazet <edumazet@google.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    wdebruij committed with gregkh Feb 3, 2017
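
The read-once pattern from the tun commit above can be sketched as follows. The struct and function are hypothetical, and a `volatile` field stands in for the kernel's READ_ONCE(); the point is that the value tested and the value used are the same local copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in; another thread may update vnet_hdr_sz at any time. */
struct tun_sketch { volatile int vnet_hdr_sz; };

/*
 * Read the header size exactly once, then validate and use the cached
 * value, so an update between test and use cannot slip through.
 */
static int check_hdr_len(const struct tun_sketch *tun, size_t data_len,
                         size_t *hdr_len_out)
{
    int vnet_hdr_sz = tun->vnet_hdr_sz;   /* single read, cached locally */

    if (vnet_hdr_sz < 0 || data_len < (size_t)vnet_hdr_sz)
        return -EINVAL;
    *hdr_len_out = (size_t)vnet_hdr_sz;   /* same value that was tested */
    return 0;
}
```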
  23. tcp: avoid infinite loop in tcp_splice_read()

    [ Upstream commit ccf7abb ]
    
    Splicing from TCP socket is vulnerable when a packet with URG flag is
    received and stored into receive queue.
    
    __tcp_splice_read() returns 0, and sk_wait_data() immediately
    returns since there is the problematic skb in queue.
    
    This is a nice way to burn cpu (aka infinite loop) and trigger
    soft lockups.
    
    Again, this gem was found by syzkaller tool.
    
    Fixes: 9c55e01 ("[TCP]: Splice receive support.")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov  <dvyukov@google.com>
    Cc: Willy Tarreau <w@1wt.eu>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 3, 2017
  24. ipv6: tcp: add a missing tcp_v6_restore_cb()

    [ Upstream commit ebf6c9c ]
    
    Dmitry reported use-after-free in ip6_datagram_recv_specific_ctl()
    
    A similar bug was fixed in commit 8ce4862 ("ipv6: tcp: restore
    IP6CB for pktoptions skbs"), but I missed another spot.
    
    tcp_v6_syn_recv_sock() can indeed set np->pktoptions from ireq->pktopts
    
    Fixes: 971f10e ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 6, 2017
  25. ip6_gre: fix ip6gre_err() invalid reads

    [ Upstream commit 7892032 ]
    
    Andrey Konovalov reported out of bound accesses in ip6gre_err()
    
    If GRE flags contains GRE_KEY, the following expression
    *(((__be32 *)p) + (grehlen / 4) - 1)
    
    accesses data ~40 bytes after the expected point, since
    grehlen includes the size of IPv6 headers.
    
    Let's use a "struct gre_base_hdr *greh" pointer to make this
    code more readable.
    
    p[1] becomes greh->protocol.
    grehlen is the GRE header length.
    
    Fixes: c12b395 ("gre: Support GRE over IPv6")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Andrey Konovalov <andreyknvl@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 5, 2017
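
The "~40 bytes" figure in the ip6_gre commit above follows from simple pointer arithmetic, sketched below for a GRE header carrying only a key. The constants and the helper are illustrative assumptions; the real header can include further optional fields.

```c
#include <assert.h>
#include <stddef.h>

enum { IPV6_HDR_LEN = 40, GRE_BASE_LEN = 4, GRE_KEY_LEN = 4 };

/*
 * Byte offset, from the start of the GRE header, reached by the
 * expression *(((__be32 *)p) + (grehlen / 4) - 1).
 */
static size_t last_word_offset(size_t grehlen)
{
    return (grehlen / 4 - 1) * 4;
}
```

With grehlen equal to the GRE header alone (8 bytes) the expression lands on the key at offset 4. If grehlen also counts the 40-byte IPv6 header (48 bytes), it lands at offset 44, which is 40 bytes past the key.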
  26. netlabel: out of bound access in cipso_v4_validate()

    [ Upstream commit d71b789 ]
    
    syzkaller found another out of bound access in ip_options_compile(),
    or more exactly in cipso_v4_validate()
    
    Fixes: 20e2a86 ("cipso: handle CIPSO options correctly when NetLabel is disabled")
    Fixes: 446fda4 ("[NetLabel]: CIPSOv4 engine")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov  <dvyukov@google.com>
    Cc: Paul Moore <paul@paul-moore.com>
    Acked-by: Paul Moore <paul@paul-moore.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 3, 2017
  27. ipv4: keep skb->dst around in presence of IP options

    [ Upstream commit 34b2cef ]
    
    Andrey Konovalov got crashes in __ip_options_echo() when a NULL skb->dst
    is accessed.
    
    ipv4_pktinfo_prepare() should not drop the dst if (evil) IP options
    are present.
    
    We could refine the test to the presence of ts_needtime or srr,
    but IP options are not often used, so let's be conservative.
    
    Thanks to syzkaller team for finding this bug.
    
    Fixes: d826eb1 ("ipv4: PKTINFO doesnt need dst reference")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Andrey Konovalov <andreyknvl@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 4, 2017
  28. net: use a work queue to defer net_disable_timestamp() work

    [ Upstream commit 5fa8bbd ]
    
    Dmitry reported a warning [1] showing that we were calling
    net_disable_timestamp() -> static_key_slow_dec() from a non
    process context.
    
    Grabbing a mutex while holding a spinlock or rcu_read_lock()
    is not allowed.
    
    As Cong suggested, we now use a work queue.
    
    It is possible netstamp_clear() exits while netstamp_needed_deferred
    is not zero, but it is probably not worth trying to do better than that.
    
    netstamp_needed_deferred atomic tracks the exact number of deferred
    decrements.
    
    [1]
    [ INFO: suspicious RCU usage. ]
    4.10.0-rc5+ #192 Not tainted
    -------------------------------
    ./include/linux/rcupdate.h:561 Illegal context switch in RCU read-side
    critical section!
    
    other info that might help us debug this:
    
    rcu_scheduler_active = 2, debug_locks = 0
    2 locks held by syz-executor14/23111:
     #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>] lock_sock
    include/net/sock.h:1454 [inline]
     #0:  (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff83a35c35>]
    rawv6_sendmsg+0x1e65/0x3ec0 net/ipv6/raw.c:919
     #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>] nf_hook
    include/linux/netfilter.h:201 [inline]
     #1:  (rcu_read_lock){......}, at: [<ffffffff83ae2678>]
    __ip6_local_out+0x258/0x840 net/ipv6/output_core.c:160
    
    stack backtrace:
    CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
    01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:15 [inline]
     dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
     lockdep_rcu_suspicious+0x139/0x180 kernel/locking/lockdep.c:4452
     rcu_preempt_sleep_check include/linux/rcupdate.h:560 [inline]
     ___might_sleep+0x560/0x650 kernel/sched/core.c:7748
     __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
     mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
     atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
     __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
     static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
     net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
     sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
     __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
     sk_destruct+0x47/0x80 net/core/sock.c:1460
     __sk_free+0x57/0x230 net/core/sock.c:1468
     sock_wfree+0xae/0x120 net/core/sock.c:1645
     skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
     skb_release_all+0x15/0x60 net/core/skbuff.c:668
     __kfree_skb+0x15/0x20 net/core/skbuff.c:684
     kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
     inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
     inet_frag_put include/net/inet_frag.h:133 [inline]
     nf_ct_frag6_gather+0x1106/0x3840
    net/ipv6/netfilter/nf_conntrack_reasm.c:617
     ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
     nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
     nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
     nf_hook include/linux/netfilter.h:212 [inline]
     __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
     ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
     ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
     ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
     rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
     rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
     inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
     sock_sendmsg_nosec net/socket.c:635 [inline]
     sock_sendmsg+0xca/0x110 net/socket.c:645
     sock_write_iter+0x326/0x600 net/socket.c:848
     do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
     do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
     vfs_writev+0x87/0xc0 fs/read_write.c:911
     do_writev+0x110/0x2c0 fs/read_write.c:944
     SYSC_writev fs/read_write.c:1017 [inline]
     SyS_writev+0x27/0x30 fs/read_write.c:1014
     entry_SYSCALL_64_fastpath+0x1f/0xc2
    RIP: 0033:0x445559
    RSP: 002b:00007f6f46fceb58 EFLAGS: 00000292 ORIG_RAX: 0000000000000014
    RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 0000000000445559
    RDX: 0000000000000001 RSI: 0000000020f1eff0 RDI: 0000000000000005
    RBP: 00000000006e19c0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000292 R12: 0000000000700000
    R13: 0000000020f59000 R14: 0000000000000015 R15: 0000000000020400
    BUG: sleeping function called from invalid context at
    kernel/locking/mutex.c:752
    in_atomic(): 1, irqs_disabled(): 0, pid: 23111, name: syz-executor14
    INFO: lockdep is turned off.
    CPU: 2 PID: 23111 Comm: syz-executor14 Not tainted 4.10.0-rc5+ #192
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
    01/01/2011
    Call Trace:
     __dump_stack lib/dump_stack.c:15 [inline]
     dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
     ___might_sleep+0x47e/0x650 kernel/sched/core.c:7780
     __might_sleep+0x95/0x1a0 kernel/sched/core.c:7739
     mutex_lock_nested+0x24f/0x1730 kernel/locking/mutex.c:752
     atomic_dec_and_mutex_lock+0x119/0x160 kernel/locking/mutex.c:1060
     __static_key_slow_dec+0x7a/0x1e0 kernel/jump_label.c:149
     static_key_slow_dec+0x51/0x90 kernel/jump_label.c:174
     net_disable_timestamp+0x3b/0x50 net/core/dev.c:1728
     sock_disable_timestamp+0x98/0xc0 net/core/sock.c:403
     __sk_destruct+0x27d/0x6b0 net/core/sock.c:1441
     sk_destruct+0x47/0x80 net/core/sock.c:1460
     __sk_free+0x57/0x230 net/core/sock.c:1468
     sock_wfree+0xae/0x120 net/core/sock.c:1645
     skb_release_head_state+0xfc/0x200 net/core/skbuff.c:655
     skb_release_all+0x15/0x60 net/core/skbuff.c:668
     __kfree_skb+0x15/0x20 net/core/skbuff.c:684
     kfree_skb+0x16e/0x4c0 net/core/skbuff.c:705
     inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
     inet_frag_put include/net/inet_frag.h:133 [inline]
     nf_ct_frag6_gather+0x1106/0x3840
    net/ipv6/netfilter/nf_conntrack_reasm.c:617
     ipv6_defrag+0x1be/0x2b0 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
     nf_hook_entry_hookfn include/linux/netfilter.h:102 [inline]
     nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
     nf_hook include/linux/netfilter.h:212 [inline]
     __ip6_local_out+0x489/0x840 net/ipv6/output_core.c:160
     ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
     ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
     ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
     rawv6_push_pending_frames net/ipv6/raw.c:613 [inline]
     rawv6_sendmsg+0x2d1a/0x3ec0 net/ipv6/raw.c:927
     inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
     sock_sendmsg_nosec net/socket.c:635 [inline]
     sock_sendmsg+0xca/0x110 net/socket.c:645
     sock_write_iter+0x326/0x600 net/socket.c:848
     do_iter_readv_writev+0x2e3/0x5b0 fs/read_write.c:695
     do_readv_writev+0x42c/0x9b0 fs/read_write.c:872
     vfs_writev+0x87/0xc0 fs/read_write.c:911
     do_writev+0x110/0x2c0 fs/read_write.c:944
     SYSC_writev fs/read_write.c:1017 [inline]
     SyS_writev+0x27/0x30 fs/read_write.c:1014
     entry_SYSCALL_64_fastpath+0x1f/0xc2
    RIP: 0033:0x445559
    
    Fixes: b90e579 ("net: dont call jump_label_dec from irq context")
    Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
    Reported-by: Dmitry Vyukov <dvyukov@google.com>
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 2, 2017
  29. stmmac: Discard masked flags in interrupt status register

    [ Upstream commit 0a764db ]
    
    DW GMAC databook says the following about bits in "Register 15 (Interrupt
    Mask Register)":
    --------------------------->8-------------------------
    When set, this bit __disables_the_assertion_of_the_interrupt_signal__
    because of the setting of XXX bit in Register 14 (Interrupt
    Status Register).
    --------------------------->8-------------------------
    
    In fact, masking a bit in the mask register does not prevent the
    corresponding bit from appearing in the status register; it only
    disables interrupt generation for the corresponding event.
    
    But currently we expect slightly different behavior: status bits to be
    in sync with their masks, i.e. if the mask for bit A is set in the mask
    register, then bit A won't appear in the interrupt status register.
    
    This was proven to be an incorrect assumption; see the discussion here [1].
    That misunderstanding causes unexpected behaviour of the GMAC; for
    example, we were lucky to see nothing worse than bogus messages about
    link state changes.
    
    So from now on we'll only check bits that can actually trigger an
    interrupt.
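
    The idea can be sketched in a few lines of C. This is only an
    illustration, not the driver's code: the helper name
    pending_unmasked() and its parameters are hypothetical, and it assumes
    (as the databook quote above says) that a set bit in the mask register
    disables the corresponding interrupt.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the stmmac driver): given the raw
 * interrupt status and the interrupt mask, keep only the status bits
 * whose interrupt is not masked. A set mask bit means "interrupt
 * disabled", so those status bits are discarded.
 */
static inline uint32_t pending_unmasked(uint32_t status, uint32_t mask)
{
	return status & ~mask;
}
```

    With status 0x5 and mask 0x1, only bit 2 survives, so a handler
    built on this helper would ignore the masked event on bit 0.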
    
    [1] https://lkml.org/lkml/2016/11/3/413
    
    Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
    Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
    Cc: Fabrice Gasnier <fabrice.gasnier@st.com>
    Cc: Joachim Eastwood <manabian@gmail.com>
    Cc: Phil Reid <preid@electromag.com.au>
    Cc: David Miller <davem@davemloft.net>
    Cc: Alexandre Torgue <alexandre.torgue@gmail.com>
    Cc: Vineet Gupta <vgupta@synopsys.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    abrodkin committed with gregkh Jan 27, 2017
  30. tcp: fix 0 divide in __tcp_select_window()

    [ Upstream commit 06425c3 ]
    
    The syzkaller fuzzer was able to trigger a divide by zero when
    TCP window scaling is not enabled.
    
    SO_RCVBUF can be used not only to increase sk_rcvbuf, but also
    to decrease it below the current receive buffer utilization.
    
    If mss is negative or 0, just return a zero TCP window.
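
    A minimal sketch of such a guard follows. It assumes a window that is
    rounded down to a multiple of mss; the function select_window() and
    its parameters are hypothetical and much simpler than
    __tcp_select_window() itself.

```c
/*
 * Hypothetical, simplified window selection (not the kernel function):
 * round the free space down to a whole number of segments. The early
 * return is the point of the sketch: without it, mss == 0 would divide
 * by zero.
 */
static int select_window(int free_space, int mss)
{
	if (mss <= 0)
		return 0;	/* negative or zero mss: advertise a zero window */
	if (free_space <= 0)
		return 0;	/* no room left in the receive buffer */

	return (free_space / mss) * mss;
}
```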
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov  <dvyukov@google.com>
    Acked-by: Neal Cardwell <ncardwell@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Feb 1, 2017
  31. ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()

    [ Upstream commit 63117f0 ]
    
    Casting is a high precedence operation but "off" and "i" are in terms of
    bytes so we need some parentheses here.
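
    The precedence problem can be shown in isolation. The sketch below is
    not the tunnel code: struct opt_hdr and both helpers are hypothetical,
    chosen only to show how a cast applied before the addition scales a
    byte offset by the size of the target type.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option header, used only to give the cast a target type. */
struct opt_hdr {
	uint8_t type;
	uint8_t len;
	uint8_t pad[2];
};

/* Wrong: the cast binds first, so "off" counts structs, not bytes. */
static size_t byte_offset_wrong(const uint8_t *buf, size_t off)
{
	const struct opt_hdr *p = (const struct opt_hdr *)buf + off;

	return (size_t)((const uint8_t *)p - buf);
}

/* Right: add the byte offset first, then cast the resulting pointer. */
static size_t byte_offset_right(const uint8_t *buf, size_t off)
{
	const struct opt_hdr *p = (const struct opt_hdr *)(buf + off);

	return (size_t)((const uint8_t *)p - buf);
}
```

    For a byte offset of 4, the parenthesized form lands 4 bytes into the
    buffer, while the unparenthesized form lands 4 * sizeof(struct opt_hdr)
    bytes in.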
    
    Fixes: fbfa743 ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Acked-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Dan Carpenter committed with gregkh Feb 1, 2017
  32. ipv6: fix ip6_tnl_parse_tlv_enc_lim()

    [ Upstream commit fbfa743 ]
    
    This function suffers from multiple issues.
    
    The first one is that pskb_may_pull() may reallocate skb->head,
    so the 'raw' pointer needs either to be reloaded or not used at all.
    
    The second issue is that NEXTHDR_DEST handling does not validate
    that the options are present in skb->data, so we might read
    garbage or access non-existent memory.
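
    Both issues can be illustrated with a userspace analogy. Everything
    below is hypothetical: struct pkt and pkt_may_pull() merely stand in
    for an skb and pskb_may_pull(), assuming a "pull" that fails when the
    requested bytes are missing and may move the data when it succeeds.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: one heap buffer and its length. */
struct pkt {
	uint8_t *head;
	size_t len;
};

/*
 * Stand-in for pskb_may_pull(): returns 0 if bytes [off, off + n) are
 * not all present. On success the data is deliberately moved to a new
 * allocation, so any pointer cached before the call is stale.
 */
static int pkt_may_pull(struct pkt *p, size_t off, size_t n)
{
	uint8_t *moved;

	if (off > p->len || n > p->len - off)
		return 0;

	moved = malloc(p->len ? p->len : 1);
	if (!moved)
		return 0;
	memcpy(moved, p->head, p->len);
	free(p->head);
	p->head = moved;
	return 1;
}

/* Validate first, then index from p->head again instead of a saved pointer. */
static int pkt_read_u8(struct pkt *p, size_t off, uint8_t *out)
{
	if (!pkt_may_pull(p, off, 1))
		return 0;

	*out = p->head[off];
	return 1;
}
```

    Keeping an offset and re-reading p->head after each pull avoids the
    stale pointer, and the failed pull is what prevents reads past the
    end of the data.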
    
    With help from Willem de Bruijn.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Reported-by: Dmitry Vyukov  <dvyukov@google.com>
    Cc: Willem de Bruijn <willemb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Eric Dumazet committed with gregkh Jan 24, 2017
  33. net/sched: matchall: Fix configuration race

    [ Upstream commit fd62d9f ]
    
    In the current version, the matchall internal state is split into two
    structs: cls_matchall_head and cls_matchall_filter. This makes little
    sense, as a matchall instance supports only one filter, and there is no
    situation where one exists and the other does not. In addition, that led
    to some races when a filter was deleted while a packet was being processed.
    
    Unify the two structs into one, thus simplifying the process of matchall
    creation and deletion. As a result, the new, delete and get callbacks have
    a dummy implementation where all the work is done in destroy and change
    callbacks, as was done in cls_cgroup.
    
    Fixes: bf3994d ("net/sched: introduce Match-all classifier")
    Reported-by: Daniel Borkmann <daniel@iogearbox.net>
    Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
    Acked-by: Jiri Pirko <jiri@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    yotamgi committed with gregkh Jan 31, 2017