Commits on Apr 12, 2021

  1. netfilter: nftables_offload: special ethertype handling for VLAN

    The nftables offload parser sets FLOW_DISSECTOR_KEY_BASIC .n_proto to the
    ethertype field in the Ethernet frame. However:
    
    - FLOW_DISSECTOR_KEY_BASIC .n_proto field always stores either IPv4 or IPv6
      ethertypes.
    - FLOW_DISSECTOR_KEY_VLAN .vlan_tpid stores either the 802.1q or 802.1ad
      ethertype. Same as for FLOW_DISSECTOR_KEY_CVLAN.
    
    This function adjusts the flow dissector to handle two scenarios:
    
    1) FLOW_DISSECTOR_KEY_VLAN .vlan_tpid is set to 802.1q or 802.1ad.
       Then, transfer:
       - the .n_proto field to FLOW_DISSECTOR_KEY_VLAN .tpid.
       - the original FLOW_DISSECTOR_KEY_VLAN .tpid to the
         FLOW_DISSECTOR_KEY_CVLAN .tpid
       - the original FLOW_DISSECTOR_KEY_CVLAN .tpid to the .n_proto field.
    
    2) .n_proto is set to 802.1q or 802.1ad. Then, transfer:
       - the .n_proto field to FLOW_DISSECTOR_KEY_VLAN .tpid.
       - the original FLOW_DISSECTOR_KEY_VLAN .tpid to the .n_proto field.
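
The two scenarios above can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct match_keys`, `is_vlan_ethertype()` and `fixup_ethertype()` are hypothetical names, the three fields stand in for the flow dissector keys, and values are kept in host byte order for readability (the kernel stores them as big-endian).

```c
#include <stdint.h>

#define ETH_P_IP     0x0800
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Hypothetical, simplified stand-in for the three flow dissector fields. */
struct match_keys {
    uint16_t n_proto;    /* models FLOW_DISSECTOR_KEY_BASIC .n_proto   */
    uint16_t vlan_tpid;  /* models FLOW_DISSECTOR_KEY_VLAN .vlan_tpid  */
    uint16_t cvlan_tpid; /* models FLOW_DISSECTOR_KEY_CVLAN .vlan_tpid */
};

static int is_vlan_ethertype(uint16_t proto)
{
    return proto == ETH_P_8021Q || proto == ETH_P_8021AD;
}

/* Rotate the ethertypes so that n_proto ends up holding the L3 protocol. */
static void fixup_ethertype(struct match_keys *k)
{
    uint16_t ethertype;

    if (is_vlan_ethertype(k->vlan_tpid)) {
        /* Scenario 1: two VLAN tags, three fields rotate. */
        ethertype = k->cvlan_tpid;
        k->cvlan_tpid = k->vlan_tpid;
        k->vlan_tpid = k->n_proto;
        k->n_proto = ethertype;
    } else if (is_vlan_ethertype(k->n_proto)) {
        /* Scenario 2: one VLAN tag, two fields swap. */
        ethertype = k->vlan_tpid;
        k->vlan_tpid = k->n_proto;
        k->n_proto = ethertype;
    }
}
```

With a double-tagged match (outer 802.1ad, inner 802.1q, payload IPv4) the sketch leaves the IPv4 ethertype in `n_proto` and the two VLAN ethertypes in the VLAN and C-VLAN fields.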
    
    Fixes: a82055a ("netfilter: nft_payload: add VLAN offload support")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    ummakynes authored and intel-lab-lkp committed Apr 12, 2021
  2. netfilter: nftables_offload: VLAN id needs host byteorder in flow dissector
    
    The flow dissector representation expects the VLAN id in host byteorder.
    Add the NFT_OFFLOAD_F_NETWORK2HOST flag to swap the bytes from nft_cmp.
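
As a rough userspace analogy of the byte swap described above (not the kernel code path, which is driven by the NFT_OFFLOAD_F_NETWORK2HOST flag), the sketch below converts a 16-bit VLAN TCI from network to host byte order and extracts the 12-bit VLAN id. The helper name `vlan_id_from_wire()` is hypothetical.

```c
#include <stdint.h>
#include <arpa/inet.h>

/*
 * The VLAN TCI is 16 bits on the wire: 3 bits priority, 1 bit DEI,
 * 12 bits VLAN id. Convert to host byte order before masking.
 */
static uint16_t vlan_id_from_wire(uint16_t tci_be)
{
    return ntohs(tci_be) & 0x0fff;
}
```

On a little-endian host, comparing the raw wire value against a host-order id would silently fail to match, which is why the conversion has to happen before the comparison.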
    
    Fixes: a82055a ("netfilter: nft_payload: add VLAN offload support")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    ummakynes authored and intel-lab-lkp committed Apr 12, 2021
  3. netfilter: nft_payload: fix C-VLAN offload support

    - add another struct flow_dissector_key_vlan for C-VLAN
    - update layer 3 dependency to allow matching on IPv4/IPv6
    
    Fixes: 89d8fd4 ("netfilter: nft_payload: add C-VLAN offload support")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    ummakynes authored and intel-lab-lkp committed Apr 12, 2021

Commits on Apr 10, 2021

  1. netfilter: arp_tables: add pre_exit hook for table unregister

    This is the same problem that existed in iptables/ip6tables: when
    arptable_filter is removed, there is no longer a wait period before the
    table/ruleset is freed.
    
    Unregister the hook in pre_exit, then remove the table in the exit
    function.
    This used to work correctly because the old nf_hook_unregister API
    did an unconditional synchronize_net.
    
    The per-net hook unregister function uses call_rcu instead.
    
    Fixes: b9e69e1 ("netfilter: xtables: don't hook tables by default")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Florian Westphal authored and ummakynes committed Apr 10, 2021
  2. netfilter: bridge: add pre_exit hooks for ebtable unregistration

    Just like ip/ip6/arptables, the hooks have to be removed, then
    synchronize_rcu() has to be called to make sure no more packets are being
    processed before the ruleset data is released.
    
    Place the hook unregistration in the pre_exit hook, then call the new
    ebtables pre_exit function from there.
    
    Years ago, when netns support was first added for netfilter+ebtables,
    this used an older (now removed) netfilter hook unregister API that did
    an unconditional synchronize_rcu().
    
    Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit
    handlers may free the ebtable ruleset while packets are still in flight.
    
    This can only happen on module removal, not during netns exit.
    
    The new function expects the table name, not the table struct.
    
    This is because an upcoming patch set (targeting -next) will remove all
    net->xt.{nat,filter,broute}_table instances, which makes it necessary
    to avoid external references to those member variables.
    
    The existing APIs will be converted, so follow the upcoming scheme of
    passing name + hook type instead.
    
    Fixes: aee12a0 ("ebtables: remove nf_hook_register usage")
    Signed-off-by: Florian Westphal <fw@strlen.de>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    Florian Westphal authored and ummakynes committed Apr 10, 2021
  3. netfilter: nft_limit: avoid possible divide error in nft_limit_init

    div_u64() divides u64 by u32.
    
    nft_limit_init() wants to divide u64 by u64, use the appropriate
    math function (div64_u64)
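
A minimal userspace sketch of why the divisor width matters, assuming only what the message states (div_u64() takes a u32 divisor, div64_u64() takes a u64 divisor). The helper names below are hypothetical models, not the kernel functions from include/linux/math64.h.

```c
#include <stdint.h>

/*
 * Models what happens to a u64 divisor when it is passed to a function
 * whose parameter is u32: only the low 32 bits survive. Returns 0 when
 * the truncated divisor would be zero, i.e. when the division would trap.
 */
static int divisor_survives_u32(uint64_t divisor)
{
    return (uint32_t)divisor != 0;
}

/* Models a full 64-by-64 division, the behaviour div64_u64() provides. */
static uint64_t model_div64_u64(uint64_t dividend, uint64_t divisor)
{
    return dividend / divisor;
}
```

For example, a divisor of `1ULL << 32` is non-zero as a u64, but its low 32 bits are all zero, so the truncating variant would divide by zero while the 64-bit variant returns a valid quotient.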
    
    divide error: 0000 [#1] PREEMPT SMP KASAN
    CPU: 1 PID: 8390 Comm: syz-executor188 Not tainted 5.12.0-rc4-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:div_u64_rem include/linux/math64.h:28 [inline]
    RIP: 0010:div_u64 include/linux/math64.h:127 [inline]
    RIP: 0010:nft_limit_init+0x2a2/0x5e0 net/netfilter/nft_limit.c:85
    Code: ef 4c 01 eb 41 0f 92 c7 48 89 de e8 38 a5 22 fa 4d 85 ff 0f 85 97 02 00 00 e8 ea 9e 22 fa 4c 0f af f3 45 89 ed 31 d2 4c 89 f0 <49> f7 f5 49 89 c6 e8 d3 9e 22 fa 48 8d 7d 48 48 b8 00 00 00 00 00
    RSP: 0018:ffffc90009447198 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000200000000000 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: ffffffff875152e6 RDI: 0000000000000003
    RBP: ffff888020f80908 R08: 0000200000000000 R09: 0000000000000000
    R10: ffffffff875152d8 R11: 0000000000000000 R12: ffffc90009447270
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS:  000000000097a300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000200001c4 CR3: 0000000026a52000 CR4: 00000000001506e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     nf_tables_newexpr net/netfilter/nf_tables_api.c:2675 [inline]
     nft_expr_init+0x145/0x2d0 net/netfilter/nf_tables_api.c:2713
     nft_set_elem_expr_alloc+0x27/0x280 net/netfilter/nf_tables_api.c:5160
     nf_tables_newset+0x1997/0x3150 net/netfilter/nf_tables_api.c:4321
     nfnetlink_rcv_batch+0x85a/0x21b0 net/netfilter/nfnetlink.c:456
     nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:580 [inline]
     nfnetlink_rcv+0x3af/0x420 net/netfilter/nfnetlink.c:598
     netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
     netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
     netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
     sock_sendmsg_nosec net/socket.c:654 [inline]
     sock_sendmsg+0xcf/0x120 net/socket.c:674
     ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350
     ___sys_sendmsg+0xf3/0x170 net/socket.c:2404
     __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433
     do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
     entry_SYSCALL_64_after_hwframe+0x44/0xae
    
    Fixes: c26844e ("netfilter: nf_tables: Fix nft limit burst handling")
    Fixes: 3e0f64b ("netfilter: nft_limit: fix packet ratelimiting")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Diagnosed-by: Luigi Rizzo <lrizzo@google.com>
    Reported-by: syzbot <syzkaller@googlegroups.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    neebe000 authored and ummakynes committed Apr 10, 2021

Commits on Mar 30, 2021

  1. netfilter: conntrack: do not print icmpv6 as unknown via /proc

    /proc/net/nf_conntrack shows icmpv6 as unknown.
    
    Fixes: 09ec82f ("netfilter: conntrack: remove protocol name from l4proto struct")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    ummakynes committed Mar 30, 2021
  2. netfilter: flowtable: fix NAT IPv6 offload mangling

    Fix out-of-bound access in the address array.
    
    Fixes: 5c27d8d ("netfilter: nf_flow_table_offload: add IPv6 support")
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    ummakynes committed Mar 30, 2021
  3. net: let skb_orphan_partial wake-up waiters.

    Currently the mentioned helper can end-up freeing the socket wmem
    without waking-up any processes waiting for more write memory.
    
    If the partially orphaned skb is attached to an UDP (or raw) socket,
    the lack of wake-up can hang the user-space.
    
    Even for TCP sockets, not calling the sk destructor could have bad
    effects on TSQ.
    
    Address the issue using skb_orphan to release the sk wmem before
    setting the new sock_efree destructor. Additionally bundle the
    whole ownership update in a new helper, so that later other
    potential users could avoid duplicate code.
    
    v1 -> v2:
     - use skb_orphan() instead of sort of open coding it (Eric)
     - provide a helper for the ownership change (Eric)
    
    Fixes: f6ba8d3 ("netem: fix skb_orphan_partial()")
    Suggested-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Paolo Abeni authored and davem330 committed Mar 30, 2021
  4. sch_htb: fix null pointer dereference on a null new_q

    Currently, if new_q is null, the null new_q pointer will be
    dereferenced when 'q->offload' is true. Fix this by adding
    braces around htb_parent_to_leaf_offload() to avoid it.
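
The general shape of such a fix can be sketched in userspace as below. This is an assumption-laden model, not the sch_htb code: `struct qdisc_model`, `set_lockdep_class()`, `parent_to_leaf()` and `handle_new_leaf()` are hypothetical stand-ins, and the point is only that the braces keep the dereferencing call inside the null check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a qdisc. */
struct qdisc_model {
    int handle;
};

static int leaf_calls; /* counts calls that dereference new_q */

static void set_lockdep_class(struct qdisc_model *q)
{
    (void)q;
}

/* Dereferences its argument, so it must never see NULL. */
static void parent_to_leaf(struct qdisc_model *q)
{
    leaf_calls += 1 + 0 * q->handle;
}

static void handle_new_leaf(bool offload, struct qdisc_model *new_q)
{
    if (offload) {
        if (new_q) {
            /* Both calls are guarded by the null check. */
            set_lockdep_class(new_q);
            parent_to_leaf(new_q);
        }
    }
}
```

Without the inner braces, only the first statement after `if (new_q)` would be conditional, and the second call would run with a null pointer whenever offload is enabled.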
    
    Addresses-Coverity: ("Dereference after null check")
    Fixes: d03b195 ("sch_htb: Hierarchical QoS hardware offload")
    
    Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    wyjwang authored and davem330 committed Mar 30, 2021
  5. net: qrtr: Fix memory leak on qrtr_tx_wait failure

    qrtr_tx_wait does not check for radix_tree_insert failure, causing
    the 'flow' object to be left unreferenced (leaked) after qrtr_tx_wait
    returns. Fix that by releasing flow on radix_tree_insert failure.
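
A minimal userspace sketch of the pattern, assuming a single-slot table in place of the radix tree. `struct flow_model`, `table_insert()` and `flow_get_or_create()` are hypothetical names, and the `simulate_failure` parameter exists only to exercise the error path.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the per-destination flow object. */
struct flow_model {
    int pending;
};

/* Stand-in for an insert that can fail: 0 on success, negative on error. */
static int table_insert(struct flow_model **slot, struct flow_model *f,
                        int simulate_failure)
{
    if (simulate_failure)
        return -1;
    *slot = f;
    return 0;
}

/* Look up the flow, creating it on first use. */
static struct flow_model *flow_get_or_create(struct flow_model **slot,
                                             int simulate_failure)
{
    struct flow_model *f = *slot;

    if (f)
        return f;

    f = calloc(1, sizeof(*f));
    if (!f)
        return NULL;

    if (table_insert(slot, f, simulate_failure)) {
        /* Nothing else references f, so release it here or it leaks. */
        free(f);
        return NULL;
    }
    return f;
}
```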
    
    Fixes: 5fdeb0d ("net: qrtr: Implement outgoing flow control")
    Reported-by: syzbot+739016799a89c530b32a@syzkaller.appspotmail.com
    Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
    Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
    Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Loic Poulain authored and davem330 committed Mar 30, 2021
  6. net: sched: bump refcount for new action in ACT replace mode

    Currently, action creation using ACT API in replace mode is buggy.
    When invoking for non-existent action index 42,
    
    	tc action replace action bpf obj foo.o sec <xyz> index 42
    
    kernel creates the action, fills up the netlink response, and then just
    deletes the action after notifying userspace.
    
    	tc action show action bpf
    
    doesn't list the action.
    
    This happens due to the following sequence when ovr = 1 (replace mode)
    is enabled:
    
    tcf_idr_check_alloc is used to atomically check and either obtain
    reference for existing action at index, or reserve the index slot using
    a dummy entry (ERR_PTR(-EBUSY)).
    
    This is necessary as pointers to these actions will be held after
    dropping the idrinfo lock, so bumping the reference count is necessary
    as we need to insert the actions, and notify userspace by dumping their
    attributes. Finally, we drop the reference we took using the
    tcf_action_put_many call in tcf_action_add. However, for the case where
    a new action is created due to free index, its refcount remains one.
    This when paired with the put_many call leads to the kernel setting up
    the action, notifying userspace of its creation, and then tearing it
    down. For existing actions, the refcount is still held so they remain
    unaffected.
    
    Fortunately due to rtnl_lock serialization requirement, such an action
    with refcount == 1 will not be concurrently deleted by anything else, at
    best CLS API can move its refcount up and down by binding to it after it
    has been published from tcf_idr_insert_many. Since refcount is at least
    one until put_many call, CLS API cannot delete it. Also __tcf_action_put
    release path already ensures deterministic outcome (either new action
    will be created or existing action will be reused in case CLS API tries
    to bind to action concurrently) due to idr lock serialization.
    
    We fix this by setting the refcount of newly created actions to 2 in ACT
    API replace mode. A relaxed store will suffice as visibility is ensured
    only after the tcf_idr_insert_many call.
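
The reference counting argument above can be modelled in a few lines of userspace C. This is a sketch under simplifying assumptions: `struct action_model`, `action_put()` and `new_action_survives_replace()` are hypothetical, there is no locking, and a single put stands in for the tcf_action_put_many call.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of an action object. */
struct action_model {
    int refcnt;
    bool alive;
};

/* Drop one reference; the action is torn down when the count hits zero. */
static void action_put(struct action_model *a)
{
    if (--a->refcnt == 0)
        a->alive = false;
}

/* Models a replace request for an index that did not exist before. */
static bool new_action_survives_replace(bool bump_refcount)
{
    struct action_model a = { .refcnt = 1, .alive = true }; /* newly created */

    if (bump_refcount)
        a.refcnt = 2; /* the fix: one extra reference for the add path */

    /* ... insert the action and notify userspace here ... */

    action_put(&a); /* models the put at the end of the add path */
    return a.alive;
}
```

With the initial count of one, the final put destroys the action that was just announced to userspace; with the bump, the put only drops the temporary reference.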
    
    Note that in case of creation or overwriting using CLS API only (i.e.
    bind = 1), overwriting existing action object is not allowed, and any
    such request is silently ignored (without error).
    
    The refcount bump that occurs in tcf_idr_check_alloc call there for
    existing action will pair with tcf_exts_destroy call made from the
    owner module for the same action. In case of action creation, there
    is no existing action, so no tcf_exts_destroy callback happens.
    
    This means no code changes for CLS API.
    
    Fixes: cae422f ("net: sched: use reference counting action init")
    Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    kkdwivedi authored and davem330 committed Mar 30, 2021
  7. net/ncsi: Avoid channel_monitor hrtimer deadlock

    Calling ncsi_stop_channel_monitor from channel_monitor is a guaranteed
    deadlock on SMP because stop calls del_timer_sync on the timer that
    invoked channel_monitor as its timer function.
    
    Recognise the inherent race of marking the monitor disabled before
    deleting the timer by just returning if enable was cleared.  After
    a timeout (the default case -- reset to START when response received)
    just mark the monitor.enabled false.
    
    If the channel has an entry on the channel_queue list, or if the
    state is not ACTIVE or INACTIVE, then warn and mark the timer stopped
    and don't restart, as the locking is broken somehow.
    
    Fixes: 0795fb2 ("net/ncsi: Stop monitor if channel times out or is inactive")
    Signed-off-by: Milton Miller <miltonm@us.ibm.com>
    Signed-off-by: Eddie James <eajames@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    mdmillerii authored and davem330 committed Mar 30, 2021
  8. ethernet/netronome/nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx

    In nfp_bpf_ctrl_msg_rx, if
    nfp_ccm_get_type(skb) == NFP_CCM_TYPE_BPF_BPF_EVENT is true, the skb
    will be freed. But the skb is still used by nfp_ccm_rx(&bpf->ccm, skb).
    
    My patch adds a return when the skb was freed.
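
A userspace sketch of the control flow after the fix, with hypothetical names throughout (`struct msg_model`, `forward_msg()`, `ctrl_msg_rx()`); `free()` stands in for freeing the skb.

```c
#include <stdlib.h>

enum msg_type { MSG_EVENT, MSG_OTHER };

/* Hypothetical stand-in for a control message buffer. */
struct msg_model {
    enum msg_type type;
};

static int forwarded; /* number of messages handed to the common path */

/* Common path: takes ownership of the message. */
static void forward_msg(struct msg_model *m)
{
    forwarded++;
    free(m);
}

static void ctrl_msg_rx(struct msg_model *m)
{
    if (m->type == MSG_EVENT) {
        /* The event is consumed and the buffer released here ... */
        free(m);
        /* ... so return instead of falling through with a stale pointer. */
        return;
    }
    forward_msg(m);
}
```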
    
    Fixes: bcf0caf ("nfp: split out common control message handling code")
    Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
    Reviewed-by: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Yunlongs authored and davem330 committed Mar 30, 2021

Commits on Mar 29, 2021

  1. Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
    
    Tony Nguyen says:
    
    ====================
    Intel Wired LAN Driver Updates 2021-03-29
    
    This series contains updates to ice driver only.
    
    Ani no longer fails probe on link/PHY errors, as these are not fatal and
    failing would prevent the user from remedying the problem. He also
    corrects the Wake on LAN support check to use the port number, not the
    PF ID.
    
    Fabio increases the AdminQ timeout as some commands can take longer than
    the current value.
    
    Chinh fixes iSCSI to be able to use port 860 by using information
    from DCBx and other QoS configuration info.
    
    Krzysztof fixes a possible race between ice_open() and ice_stop().
    
    Bruce corrects the ordering of arguments in a memory allocation call.
    
    Dave removes DCBNL device reset bit which is blocking changes coming
    from DCBNL interface.
    
    Jacek adds error handling for filter allocation failure.
    
    Robert ensures memory is freed if VSI filter list issues are
    encountered.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Mar 29, 2021
  2. dt-bindings: net: bcm4908-enet: fix Ethernet generic properties

    This binding file uses $ref: ethernet-controller.yaml# so it's required
    to use "unevaluatedProperties" (instead of "additionalProperties") to
    make Ethernet properties validate.
    
    Fixes: f08b5cf ("dt-bindings: net: bcm4908-enet: include ethernet-controller.yaml")
    Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Rafał Miłecki authored and davem330 committed Mar 29, 2021
  3. dt-bindings: net: ethernet-controller: fix typo in NVMEM

    The correct property name is "nvmem-cell-names". This is what:
    1. Was originally documented in the ethernet.txt
    2. Is used in DTS files
    3. Matches standard syntax for phandles
    4. Linux net subsystem checks for
    
    Fixes: 9d3de3c ("dt-bindings: net: Add YAML schemas for the generic Ethernet options")
    Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Rafał Miłecki authored and davem330 committed Mar 29, 2021
  4. net:tipc: Fix a double free in tipc_sk_mcast_rcv

    In the if (skb_peek(arrvq) == skb) branch, __skb_dequeue(arrvq) unlinks the
    skb at the head of arrvq and returns it; this is the same skb that
    skb_peek(arrvq) returned. kfree_skb(__skb_dequeue(arrvq)) then frees that
    skb for the first time.
    
    Unfortunately, the same skb is freed a second time by kfree_skb(skb) after
    the branch completes.
    
    My patch removes kfree_skb() from the if (skb_peek(arrvq) == skb) branch,
    because this skb will be freed by kfree_skb(skb) in the end.
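
The ownership rule can be sketched with a toy singly linked queue. All names here (`struct buf_model`, `queue_peek()`, `queue_dequeue()`, `consume_head()`) are hypothetical; the sketch only shows that the dequeue unlinks without freeing, so the buffer is released exactly once.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for an skb and an skb queue. */
struct buf_model {
    struct buf_model *next;
};

struct queue_model {
    struct buf_model *head;
};

static int buf_frees; /* counts how many times a buffer was released */

static struct buf_model *queue_peek(struct queue_model *q)
{
    return q->head;
}

/* Unlink and return the head; the caller keeps ownership. */
static struct buf_model *queue_dequeue(struct queue_model *q)
{
    struct buf_model *b = q->head;

    if (b) {
        q->head = b->next;
        b->next = NULL;
    }
    return b;
}

static void buf_free(struct buf_model *b)
{
    buf_frees++;
    free(b);
}

static void consume_head(struct queue_model *q, struct buf_model *b)
{
    if (queue_peek(q) == b)
        queue_dequeue(q); /* unlink only, no free in this branch */

    buf_free(b); /* the single, final release */
}
```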
    
    Fixes: cb1b728 ("tipc: eliminate race condition at multicast reception")
    Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Yunlongs authored and davem330 committed Mar 29, 2021
  5. cxgb4: avoid collecting SGE_QBASE regs during traffic

    Accessing SGE_QBASE_MAP[0-3] and SGE_QBASE_INDEX registers can lead
    to SGE missing doorbells under heavy traffic. So, only collect them
    when adapter is idle. Also update the regdump range to skip collecting
    these registers.
    
    Fixes: 80a95a8 ("cxgb4: collect SGE PF/VF queue map")
    Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    chelsiocudbg authored and davem330 committed Mar 29, 2021
  6. net: dsa: Fix type was not set for devlink port

    If a PHY is not available on a DSA port (described in the devicetree but
    absent or not detected), the kernel prints a warning after 3700 secs:
    
    [ 3707.948771] ------------[ cut here ]------------
    [ 3707.948784] Type was not set for devlink port.
    [ 3707.948894] WARNING: CPU: 1 PID: 17 at net/core/devlink.c:8097 0xc083f9d8
    
    We should unregister the devlink port as a user port and
    re-register it as an unused port before executing "continue" in case of
    dsa_port_setup error.
    
    Fixes: 86f8b1c ("net: dsa: Do not make user port errors fatal")
    Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
    Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    fidomax authored and davem330 committed Mar 29, 2021
  7. gianfar: Handle error code at MAC address change

    Handle the error code returned by eth_mac_addr().
    
    Fixes: 3d23a05 ("gianfar: Enable changing mac addr when if up")
    Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    claudiu-m authored and davem330 committed Mar 29, 2021
  8. ethernet: myri10ge: Fix a use after free in myri10ge_sw_tso

    In myri10ge_sw_tso, the skb_list_walk_safe macro will set
    (curr) = (segs) and (next) = (curr)->next. If status != 0 is true,
    the memory pointed to by curr and segs will be freed by
    dev_kfree_skb_any(curr). But later, segs is used by segs = segs->next,
    which causes a use after free.
    
    As (next) = (curr)->next, my patch replaces segs->next with next.
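
A userspace sketch of the "safe walk" idea: the successor is saved before the current element can be freed, and only the saved pointer is used afterwards. `struct seg_model` and `send_all()` are hypothetical, and the `fail_id` parameter merely simulates a failed transmit.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one GSO segment. */
struct seg_model {
    struct seg_model *next;
    int id;
};

/*
 * Walk the list, "sending" each segment. A segment whose id equals
 * fail_id is dropped. Either way the segment is released, so the
 * walk must never read curr->next after that point.
 */
static int send_all(struct seg_model *segs, int fail_id)
{
    struct seg_model *curr, *next;
    int sent = 0;

    for (curr = segs; curr; curr = next) {
        next = curr->next; /* saved while curr is still valid */
        curr->next = NULL;

        if (curr->id != fail_id)
            sent++;

        free(curr);
        /* From here on only 'next' may be used, never curr->next. */
    }
    return sent;
}
```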
    
    Fixes: 536577f ("net: myri10ge: use skb_list_walk_safe helper for gso segments")
    Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Yunlongs authored and davem330 committed Mar 29, 2021
  9. MAINTAINERS: Add entry for Qualcomm IPC Router (QRTR) driver

    Add MAINTAINERS entry for Qualcomm IPC Router (QRTR) driver.
    
    Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Mani-Sadhasivam authored and davem330 committed Mar 29, 2021
  10. Merge tag 'linux-can-fixes-for-5.12-20210329' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
    
    Marc Kleine-Budde says:
    
    ====================
    pull-request: can 2021-03-29
    
    this is a pull request of 3 patches for net/master.
    
    The first two patches are by Oliver Hartkopp. He fixes the length check
    in the proto_ops::getname callback for the CAN RAW, BCM and ISOTP
    protocols, which were broken by the introduction of the J1939 protocol.
    
    The last patch is by me and fixes a BUILD_BUG_ON() check which
    triggers on ARCH=arm with CONFIG_AEABI unset.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Mar 29, 2021
  11. Merge branch 'mlxsw-ecn-marking'

    Ido Schimmel says:
    
    ====================
    mlxsw: spectrum: Fix ECN marking in tunnel decapsulation
    
    Patch #1 fixes a discrepancy between the software and hardware data
    paths with regards to ECN marking after decapsulation. See the changelog
    for a detailed description.
    
    Patch #2 extends the ECN decap test to cover all possible combinations
    of inner and outer ECN markings. The test passes over both data paths.
    
    v2:
    * Only set ECT(1) if inner is ECT(0)
    * Introduce a new helper to determine inner ECN. Share it between NVE
      and IP-in-IP tunnels
    * Extend the selftest
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Mar 29, 2021
  12. selftests: forwarding: vxlan_bridge_1d: Add more ECN decap test cases

    Test that all possible combinations of inner and outer ECN bits result
    in the correct inner ECN marking according to RFC 6040 4.2.
    
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    idosch authored and davem330 committed Mar 29, 2021
  13. mlxsw: spectrum: Fix ECN marking in tunnel decapsulation

    Cited commit changed the behavior of the software data path with regards
    to the ECN marking of decapsulated packets. However, the commit did not
    change other callers of __INET_ECN_decapsulate(), namely mlxsw. The
    driver is using the function in order to ensure that the hardware and
    software data paths act the same with regards to the ECN marking of
    decapsulated packets.
    
    The discrepancy was uncovered by commit 5aa3c33 ("selftests:
    forwarding: vxlan_bridge_1d: Fix vxlan ecn decapsulate value") that
    aligned the selftest to the new behavior. Without this patch the
    selftest passes when used with veth pairs, but fails when used with
    mlxsw netdevs.
    
    Fix this by instructing the device to propagate the ECT(1) mark from the
    outer header to the inner header when the inner header is ECT(0), for
    both NVE and IP-in-IP tunnels.
    
    A helper is added in order not to duplicate the code between both tunnel
    types.
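
The rule described above can be written as a small helper. This is a sketch, not the mlxsw helper: `decap_inner_ecn()` is a hypothetical name, the codepoint values follow the standard two-bit ECN field encoding, and handling of the CE codepoint and of drop cases is deliberately left out.

```c
#include <stdint.h>

/* Two-bit ECN field codepoints. */
enum {
    ECN_NOT_ECT = 0,
    ECN_ECT_1   = 1,
    ECN_ECT_0   = 2,
    ECN_CE      = 3,
};

/*
 * Returns the inner ECN value after decapsulation, applying only the
 * rule from the message: an outer ECT(1) mark is propagated to the
 * inner header when the inner header is ECT(0).
 */
static uint8_t decap_inner_ecn(uint8_t outer, uint8_t inner)
{
    if (inner == ECN_ECT_0 && outer == ECN_ECT_1)
        return ECN_ECT_1;

    return inner;
}
```

Keeping the decision in one function is what lets both tunnel types share it instead of duplicating the comparison.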
    
    Fixes: b723748 ("tunnel: Propagate ECT(1) when decapsulating as recommended by RFC6040")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Petr Machata <petrm@nvidia.com>
    Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    idosch authored and davem330 committed Mar 29, 2021
  14. ice: Cleanup fltr list in case of allocation issues

    When ice_remove_vsi_lkup_fltr is called, a local copy of the vsi
    filter list is created by calling ice_add_to_vsi_fltr_list.
    If any issue occurs during creation of the vsi filter list,
    it is up to the caller to free the already allocated
    memory. This patch ensures proper memory deallocation
    in these cases.
    
    Fixes: 80d144c ("ice: Refactor switch rule management structures and functions")
    Signed-off-by: Robert Malz <robertx.malz@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    rmalzx authored and anguy11 committed Mar 29, 2021
  15. ice: Use port number instead of PF ID for WoL

    As per the spec, the WoL control word read from the NVM should be
    interpreted as port numbers, and not PF numbers. So when checking
    if WoL is supported, use the port number instead of the PF ID.
    
    Also, ice_is_wol_supported doesn't really need a pointer to the pf
    struct, but just needs a pointer to the hw instance.
    
    Fixes: 769c500 ("ice: Add advanced power mgmt for WoL")
    Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    refactorman authored and anguy11 committed Mar 29, 2021
  16. ice: Fix for dereference of NULL pointer

    Add handling of allocation failure for ice_vsi_list_map_info.
    
    Also, *fi should not be a NULL pointer; it is a reference to a raw
    data field, so remove this variable and use the reference
    directly.
    
    Fixes: 9daf820 ("ice: Add support for switch filter programming")
    Signed-off-by: Jacek Bułatek <jacekx.bulatek@intel.com>
    Co-developed-by: Haiyue Wang <haiyue.wang@intel.com>
    Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    jacek-bulatek authored and anguy11 committed Mar 29, 2021
  17. ice: remove DCBNL_DEVRESET bit from PF state

    The original purpose of the ICE_DCBNL_DEVRESET was to protect
    the driver during DCBNL device resets.  But, the flow for
    DCBNL device resets now consists of only calls up the stack
    such as dev_close() and dev_open() that will result in NDO calls
    to the driver.  These will be handled with state changes from the
    stack.  Also, there is a problem of the dev_close and dev_open
    being blocked by checks for reset in progress also using the
    ICE_DCBNL_DEVRESET bit.
    
    Since the ICE_DCBNL_DEVRESET bit is not necessary for protecting
    the driver from DCBNL device resets and it is actually blocking
    changes coming from the DCBNL interface, remove the bit from the
    PF state and don't block driver function based on DCBNL reset in
    progress.
    
    Fixes: b94b013 ("ice: Implement DCBNL support")
    Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    dmertman authored and anguy11 committed Mar 29, 2021
  18. ice: fix memory allocation call

    Fix the order of number of array members and member size parameters in a
    *calloc() call.
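
For reference, the standard C `calloc()` takes the number of elements first and the element size second; the kernel's calloc-style helpers follow the same count-then-size convention. A minimal sketch with a hypothetical element type:

```c
#include <stdlib.h>

/* Hypothetical element type, used only for illustration. */
struct elem_model {
    int id;
    unsigned char data[12];
};

/* Allocate a zeroed array: count first, element size second. */
static struct elem_model *alloc_elems(size_t count)
{
    return calloc(count, sizeof(struct elem_model));
}
```

Swapping the two arguments usually yields the same byte count, but it defeats the intent of the interface and confuses static analysis, so the conventional order is worth keeping.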
    
    Fixes: b3c3890 ("ice: avoid unnecessary single-member variable-length structs")
    Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    bwallan authored and anguy11 committed Mar 29, 2021
  19. ice: prevent ice_open and ice_stop during reset

    There is a possibility of race between ice_open or ice_stop calls
    performed by OS and reset handling routine both trying to modify VSI
    resources. Observed scenarios:
    - reset handler deallocates memory in ice_vsi_free_arrays and ice_open
      tries to access it in ice_vsi_cfg_txq leading to driver crash
    - reset handler deallocates memory in ice_vsi_free_arrays and ice_close
      tries to access it in ice_down leading to driver crash
    - reset handler clears port scheduler topology and sets port state to
      ICE_SCHED_PORT_STATE_INIT, causing ice_ena_vsi_txq to fail in ice_open
    
    To prevent this, additional checks are introduced in ice_open and
    ice_stop to make sure that the OS is not allowed to alter VSI config
    while reset is in progress.
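
The guard can be sketched in userspace as an early return from the open and stop paths. Everything here is a hypothetical model (`struct dev_model`, `model_open()`, `model_stop()`), and the choice of `-EBUSY` as the error code is an assumption rather than something stated in the message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct dev_model {
    bool reset_in_progress;
    bool up;
};

/* Refuse to bring the interface up while a reset owns the resources. */
static int model_open(struct dev_model *d)
{
    if (d->reset_in_progress)
        return -EBUSY; /* assumed error code */

    d->up = true;
    return 0;
}

/* Same guard on the way down. */
static int model_stop(struct dev_model *d)
{
    if (d->reset_in_progress)
        return -EBUSY; /* assumed error code */

    d->up = false;
    return 0;
}
```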
    
    Fixes: cdedef5 ("ice: Configure VSIs for Tx/Rx")
    Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Krzysztof Goreczny authored and anguy11 committed Mar 29, 2021
  20. ice: Recognize 860 as iSCSI port in CEE mode

    iSCSI can use both TCP ports 860 and 3260. However, in our current
    implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command
    doesn't provide a way to communicate the protocol port number to the
    AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the
    AQ's caller layer.
    
    Rely on the dcbx-willing mode, desired QoS and remote QoS configuration to
    determine which port number iSCSI will use.
    
    Fixes: 0ebd3ff ("ice: Add code for DCB initialization part 2/4")
    Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    chinhc authored and anguy11 committed Mar 29, 2021
  21. ice: Increase control queue timeout

    250 msec timeout is insufficient for some AQ commands. Advice from FW
    team was to increase the timeout. Increase to 1 second.
    
    Fixes: 7ec59ee ("ice: Add support for control queues")
    Signed-off-by: Fabio Pricoco <fabio.pricoco@intel.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    fpricoco authored and anguy11 committed Mar 29, 2021