Ziyang-Xuan/ca…

Commits on Jul 22, 2021

  1. can: raw: fix raw_rcv panic for sock UAF

    We hit the following bug during the LTP can_filter test.
    
    ===========================================
    [60919.264984] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    [60919.265223] PGD 8000003dda726067 P4D 8000003dda726067 PUD 3dda727067 PMD 0
    [60919.265443] Oops: 0000 [#1] SMP PTI
    [60919.265550] CPU: 30 PID: 3638365 Comm: can_filter Kdump: loaded Tainted: G        W         4.19.90+ #1
    [60919.266068] RIP: 0010:selinux_socket_sock_rcv_skb+0x3e/0x200
    [60919.293289] RSP: 0018:ffff8d53bfc03cf8 EFLAGS: 00010246
    [60919.307140] RAX: 0000000000000000 RBX: 000000000000001d RCX: 0000000000000007
    [60919.320756] RDX: 0000000000000001 RSI: ffff8d5104a8ed00 RDI: ffff8d53bfc03d30
    [60919.334319] RBP: ffff8d9338056800 R08: ffff8d53bfc29d80 R09: 0000000000000001
    [60919.347969] R10: ffff8d53bfc03ec0 R11: ffffb8526ef47c98 R12: ffff8d53bfc03d30
    [60919.350320] perf: interrupt took too long (3063 > 2500), lowering kernel.perf_event_max_sample_rate to 65000
    [60919.361148] R13: 0000000000000001 R14: ffff8d53bcf90000 R15: 0000000000000000
    [60919.361151] FS:  00007fb78b6b3600(0000) GS:ffff8d53bfc00000(0000) knlGS:0000000000000000
    [60919.400812] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [60919.413730] CR2: 0000000000000010 CR3: 0000003e3f784006 CR4: 00000000007606e0
    [60919.426479] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [60919.439339] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [60919.451608] PKRU: 55555554
    [60919.463622] Call Trace:
    [60919.475617]  <IRQ>
    [60919.487122]  ? update_load_avg+0x89/0x5d0
    [60919.498478]  ? update_load_avg+0x89/0x5d0
    [60919.509822]  ? account_entity_enqueue+0xc5/0xf0
    [60919.520709]  security_sock_rcv_skb+0x2a/0x40
    [60919.531413]  sk_filter_trim_cap+0x47/0x1b0
    [60919.542178]  ? kmem_cache_alloc+0x38/0x1b0
    [60919.552444]  sock_queue_rcv_skb+0x17/0x30
    [60919.562477]  raw_rcv+0x110/0x190 [can_raw]
    [60919.572539]  can_rcv_filter+0xbc/0x1b0 [can]
    [60919.582173]  can_receive+0x6b/0xb0 [can]
    [60919.591595]  can_rcv+0x31/0x70 [can]
    [60919.600783]  __netif_receive_skb_one_core+0x5a/0x80
    [60919.609864]  process_backlog+0x9b/0x150
    [60919.618691]  net_rx_action+0x156/0x400
    [60919.627310]  ? sched_clock_cpu+0xc/0xa0
    [60919.635714]  __do_softirq+0xe8/0x2e9
    [60919.644161]  do_softirq_own_stack+0x2a/0x40
    [60919.652154]  </IRQ>
    [60919.659899]  do_softirq.part.17+0x4f/0x60
    [60919.667475]  __local_bh_enable_ip+0x60/0x70
    [60919.675089]  __dev_queue_xmit+0x539/0x920
    [60919.682267]  ? finish_wait+0x80/0x80
    [60919.689218]  ? finish_wait+0x80/0x80
    [60919.695886]  ? sock_alloc_send_pskb+0x211/0x230
    [60919.702395]  ? can_send+0xe5/0x1f0 [can]
    [60919.708882]  can_send+0xe5/0x1f0 [can]
    [60919.715037]  raw_sendmsg+0x16d/0x268 [can_raw]
    
    This happens because raw_setsockopt() runs concurrently with
    unregister_netdevice_many(). The concurrent scenario is as follows.
    
    	cpu0						cpu1
    raw_bind
    raw_setsockopt					unregister_netdevice_many
    						unlist_netdevice
    dev_get_by_index				raw_notifier
    raw_enable_filters				......
    can_rx_register
    can_rcv_list_find(..., net->can.rx_alldev_list)
    
    ......
    
    sock_close
    raw_release(sock_a)
    
    ......
    
    can_receive
    can_rcv_filter(net->can.rx_alldev_list, ...)
    raw_rcv(skb, sock_a)
    BUG
    
    After unlist_netdevice(), dev_get_by_index() returns NULL in
    raw_setsockopt(). raw_enable_filters() then adds the sock and its
    can_filter to net->can.rx_alldev_list, and the sock is closed.
    Next, we sock_sendmsg() to a new vcan device using the same
    can_filter. In can_rcv_filter(), the protocol stack matches the old
    receiver on net->can.rx_alldev_list, whose sock has already been
    released. raw_rcv() then uses the freed sock and triggers the UAF bug.
    
    The key issue is that net_device is not protected in
    raw_setsockopt(). Use rtnl_lock to protect net_device in
    raw_setsockopt().
    
    Fixes: c18ce10 ("[CAN]: Add raw protocol")
    Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
    Ziyang Xuan authored and intel-lab-lkp committed Jul 22, 2021
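
The race described in the commit above can be pictured with a toy Python model. This is not the kernel code: `RxLists` and its methods are hypothetical names. The only behavior taken from the commit message is that a receiver registered without a device lands on the all-device list and is later matched for frames from any device. How the close path misses the entry is an assumption of this sketch.

```python
class RxLists:
    """Toy model of CAN receiver lists: one list per device plus an all-device list."""

    def __init__(self):
        self.per_dev = {}   # device name -> list of (can_id, sock)
        self.all_dev = []   # receivers matched against frames from any device

    def _pick(self, dev):
        # dev=None models dev_get_by_index() returning NULL.
        return self.all_dev if dev is None else self.per_dev.setdefault(dev, [])

    def register(self, dev, can_id, sock):
        self._pick(dev).append((can_id, sock))

    def unregister(self, dev, can_id, sock):
        entries = self._pick(dev)
        if (can_id, sock) in entries:
            entries.remove((can_id, sock))
            return True
        return False  # the entry lives on a different list and stays behind

    def deliver(self, dev, can_id):
        # A frame is matched against the all-device list and the device's own list.
        candidates = self.all_dev + self.per_dev.get(dev, [])
        return [sock for cid, sock in candidates if cid == can_id]


lists = RxLists()
# The device lookup raced with unregistration and returned nothing.
lists.register(None, 0x123, "sock_a")
# Assumed for this sketch: the close path searches the device's list only.
removed = lists.unregister("vcan0", 0x123, "sock_a")
# A frame sent on a new device still matches the stale receiver.
stale = lists.deliver("vcan1", 0x123)
```

Here `removed` is `False` and `stale` still contains `"sock_a"`, which is the model's analogue of `raw_rcv()` being called with a released sock. Holding one lock across lookup and registration, as the commit does with rtnl_lock, removes the window in which the lookup can return nothing for a device the socket is bound to.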

Commits on Jul 21, 2021

  1. net: ixp46x: fix ptp build failure

    The rework of the ixp46x cpu detection left the network driver in
    a half broken state:
    
    drivers/net/ethernet/xscale/ptp_ixp46x.c: In function 'ptp_ixp_init':
    drivers/net/ethernet/xscale/ptp_ixp46x.c:290:51: error: 'IXP4XX_TIMESYNC_BASE_VIRT' undeclared (first use in this function)
      290 |                 (struct ixp46x_ts_regs __iomem *) IXP4XX_TIMESYNC_BASE_VIRT;
          |                                                   ^~~~~~~~~~~~~~~~~~~~~~~~~
    drivers/net/ethernet/xscale/ptp_ixp46x.c:290:51: note: each undeclared identifier is reported only once for each function it appears in
    drivers/net/ethernet/xscale/ptp_ixp46x.c: At top level:
    drivers/net/ethernet/xscale/ptp_ixp46x.c:323:1: error: data definition has no type or storage class [-Werror]
      323 | module_init(ptp_ixp_init);
    
    I have patches to complete the transition for a future release, but
    for the moment, add the missing include statements to get it to build
    again.
    
    Fixes: 09aa9aa ("soc: ixp4xx: move cpu detection to linux/soc/ixp4xx/cpu.h")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    arndb authored and davem330 committed Jul 21, 2021
  2. ibmvnic: Remove the proper scrq flush

    Commit 65d6470 ("ibmvnic: clean pending indirect buffs during reset")
    intended to remove the call to ibmvnic_tx_scrq_flush() when the
    ->resetting flag is true and was tested that way. But during the final
    rebase to net-next, the hunk got applied to a block a few lines below
    (which happened to have the same diff context) and the wrong call to
    ibmvnic_tx_scrq_flush() got removed.
    
    Fix that by removing the correct ibmvnic_tx_scrq_flush() and restoring
    the one that was incorrectly removed.
    
    Fixes: 65d6470 ("ibmvnic: clean pending indirect buffs during reset")
    Reported-by: Dany Madden <drt@linux.ibm.com>
    Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    sukadev authored and davem330 committed Jul 21, 2021
  3. Merge branch 'pmtu-esp'

    Vadim Fedorenko says:
    
    ====================
    Fix PMTU for ESP-in-UDP encapsulation
    
    Bug 213669 uncovered a regression in PMTU discovery for UDP-encapsulated
    routes and some incorrect usage of udp tunnel fields. This series fixes
    these problems and also adds selftests for this case.
    
    v3:
     - update checking logic to account SCTP use case
    v2:
     - remove refactor code that was in first patch
     - move checking logic to __udp{4,6}_lib_err_encap
     - add more tests, especially routed configuration
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Jul 21, 2021
  4. selftests: net: add ESP-in-UDP PMTU test

    The case of ESP in UDP encapsulation was not covered before. Add
    cases for local MTU changes and for an MTU difference on a routed path.
    
    Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    vvfedorenko authored and davem330 committed Jul 21, 2021
  5. udp: check encap socket in __udp_lib_err

    Commit d26796a ("udp: check udp sock encap_type in __udp_lib_err")
    added checks for encapsulated sockets but it broke cases when there is
    no implementation of encap_err_lookup for encapsulation, i.e. ESP in
    UDP encapsulation. Fix it by calling encap_err_lookup only if the socket
    implements this method; otherwise treat it as a legal socket.
    
    Fixes: d26796a ("udp: check udp sock encap_type in __udp_lib_err")
    Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    vvfedorenko authored and davem330 committed Jul 21, 2021
  6. sctp: update active_key for asoc when old key is being replaced

    syzbot reported a call trace:
    
      BUG: KASAN: use-after-free in sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
      Call Trace:
       sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
       sctp_set_owner_w net/sctp/socket.c:131 [inline]
       sctp_sendmsg_to_asoc+0x152e/0x2180 net/sctp/socket.c:1865
       sctp_sendmsg+0x103b/0x1d30 net/sctp/socket.c:2027
       inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:821
       sock_sendmsg_nosec net/socket.c:703 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:723
    
    This is a use-after-free issue caused by not updating asoc->shkey after
    it was replaced in the key list asoc->endpoint_shared_keys, and the old
    key was freed.
    
    This patch fixes it by also updating active_key for asoc when old key is
    being replaced with a new one. Note that this issue doesn't exist in
    sctp_auth_del_key_id(), as it's not allowed to delete the active_key
    from the asoc.
    
    Fixes: 1b1e0bc ("sctp: add refcnt support for sh_key")
    Reported-by: syzbot+b774577370208727d12b@syzkaller.appspotmail.com
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    lxin authored and davem330 committed Jul 21, 2021
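
The sctp fix above amounts to "when a key in the list is replaced, repoint anything that still refers to the old key". A minimal Python sketch of that idea, with hypothetical names (`SharedKey`, `Assoc`, `set_key`) and a `freed` flag standing in for the kernel's refcounted free:

```python
class SharedKey:
    def __init__(self, key_id, material):
        self.key_id = key_id
        self.material = material
        self.freed = False  # stands in for the memory having been released


class Assoc:
    """Toy association: a key table plus a cached pointer to the active key."""

    def __init__(self):
        self.keys = {}
        self.active_key = None

    def set_key(self, key_id, material, update_active=True):
        new = SharedKey(key_id, material)
        old = self.keys.get(key_id)
        self.keys[key_id] = new
        if old is not None:
            if update_active and self.active_key is old:
                # The fix: refresh the cached pointer before the old key goes away.
                self.active_key = new
            old.freed = True
        return new


# Without the update, the cached pointer is left dangling.
buggy = Assoc()
buggy.active_key = buggy.set_key(1, b"old")
buggy.set_key(1, b"new", update_active=False)

# With the update, the cached pointer follows the replacement.
fixed = Assoc()
fixed.active_key = fixed.set_key(1, b"old")
fixed.set_key(1, b"new")
```

After these calls `buggy.active_key.freed` is `True`, the model's version of the KASAN report, while `fixed.active_key` is the new key.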
  7. r8169: Avoid duplicate sysfs entry creation error

    When registering the MDIO bus for a r8169 device, we use the PCI
    bus/device specifier as a (seemingly) unique device identifier.
    However the very same BDF number can be used on another PCI segment,
    which makes the driver fail probing:
    
    [ 27.544136] r8169 0002:07:00.0: enabling device (0000 -> 0003)
    [ 27.559734] sysfs: cannot create duplicate filename '/class/mdio_bus/r8169-700'
    ....
    [ 27.684858] libphy: mii_bus r8169-700 failed to register
    [ 27.695602] r8169: probe of 0002:07:00.0 failed with error -22
    
    Add the segment number to the device name to make it more unique.
    
    This fixes operation on ARM N1SDP boards, with two boards connected
    together to form an SMP system, and all on-board devices showing up
    twice, just on different PCI segments. A similar issue would occur on
    large systems with many PCI slots and multiple RTL8169 NICs.
    
    Fixes: f1e911d ("r8169: add basic phylib support")
    Signed-off-by: Sayanta Pattanayak <sayanta.pattanayak@arm.com>
    [Andre: expand commit message, use pci_domain_nr()]
    Signed-off-by: Andre Przywara <andre.przywara@arm.com>
    Acked-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    SayantaP-arm authored and davem330 committed Jul 21, 2021
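
To see why the BDF number alone collides, here is a small Python sketch of the naming in the r8169 commit above. Only the old name `r8169-700` appears in the log; the exact format of the new name (`r8169-<segment>-<bdf>`) and the helper names are assumptions made for illustration.

```python
def pci_devid(bus, dev, fn):
    """Pack bus/device/function into a 16-bit BDF number (8/5/3 bits)."""
    return (bus << 8) | (dev << 3) | fn


def mdio_bus_id(segment, bus, dev, fn, with_segment=True):
    """Build an MDIO bus name; the segment-qualified format is hypothetical."""
    devid = pci_devid(bus, dev, fn)
    if with_segment:
        return f"r8169-{segment:x}-{devid:x}"
    return f"r8169-{devid:x}"


# Two NICs at 07:00.0 on different PCI segments (0000 and 0002).
old_a = mdio_bus_id(0x0000, 0x07, 0x00, 0, with_segment=False)
old_b = mdio_bus_id(0x0002, 0x07, 0x00, 0, with_segment=False)
new_a = mdio_bus_id(0x0000, 0x07, 0x00, 0)
new_b = mdio_bus_id(0x0002, 0x07, 0x00, 0)
```

Both old names come out as `r8169-700`, matching the duplicate sysfs filename in the log, while the segment-qualified names differ.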

Commits on Jul 20, 2021

  1. ixgbe: Fix packet corruption due to missing DMA sync

    When receiving a packet with multiple fragments, hardware may still
    touch the first fragment until the entire packet has been received. The
    driver therefore keeps the first fragment mapped for DMA until end of
    packet has been asserted, and delays its dma_sync call until then.
    
    The driver tries to fit multiple receive buffers on one page. When using
    3K receive buffers (e.g. using Jumbo frames and legacy-rx is turned
    off/build_skb is being used) on an architecture with 4K pages, the
    driver allocates an order 1 compound page and uses one page per receive
    buffer. To determine the correct offset for a delayed DMA sync of the
    first fragment of a multi-fragment packet, the driver then cannot just
    use PAGE_MASK on the DMA address but has to construct a mask based on
    the actual size of the backing page.
    
    Using PAGE_MASK in the 3K RX buffer/4K page architecture configuration
    will always sync the first page of a compound page. With the SWIOTLB
    enabled this can lead to corrupted packets (zeroed out first fragment,
    re-used garbage from another packet) and various consequences, such as
    slow/stalling data transfers and connection resets. For example, testing
    on a link with MTU exceeding 3058 bytes on a host with SWIOTLB enabled
    (e.g. "iommu=soft swiotlb=262144,force") TCP transfers quickly fizzle
    out without this patch.
    
    Cc: stable@vger.kernel.org
    Fixes: 0c5661e ("ixgbe: fix crash in build_skb Rx code path")
    Signed-off-by: Markus Boehme <markubo@amazon.com>
    Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    markusboehme authored and davem330 committed Jul 20, 2021
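
The masking problem in the ixgbe commit above can be checked with plain arithmetic. The sketch below is a Python illustration, not driver code: the base address and the 192-byte headroom are made-up values, and `sync_offset` is a hypothetical helper.

```python
PAGE_SIZE = 4096


def sync_offset(buf_addr, backing_page_size):
    """Offset of a buffer inside its backing page (size must be a power of two)."""
    assert backing_page_size & (backing_page_size - 1) == 0
    return buf_addr & (backing_page_size - 1)


# Hypothetical order-1 compound page: 8K, aligned to 8K.
compound_base = 0x10000
# Second receive buffer: starts in the second 4K half, plus some headroom.
second_buf = compound_base + PAGE_SIZE + 192

wrong = sync_offset(second_buf, PAGE_SIZE)      # mask derived from PAGE_SIZE
right = sync_offset(second_buf, 2 * PAGE_SIZE)  # mask derived from the real page size
```

With the 4K mask the offset is 192, which points into the first half of the compound page, so the sync lands on the wrong buffer. With the 8K mask the offset is 4288, inside the second half where the data actually is.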
  2. Revert "qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()"
    
    This reverts commit 6206b79.
    
    That patch added additional spin_{un}lock_bh(), which was harmless
    but pointless. The original code path already guarantees that
    spin_{un}lock_bh() calls are paired.
    
    We'd better revert it before we find the exact root cause of the
    bug_on mentioned in that patch.
    
    Fixes: 6206b79 ("qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()")
    Cc: David S. Miller <davem@davemloft.net>
    Cc: Prabhakar Kushwaha <pkushwaha@marvell.com>
    Signed-off-by: Jia He <justin.he@arm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    justin-he authored and davem330 committed Jul 20, 2021
  3. ipv6: fix another slab-out-of-bounds in fib6_nh_flush_exceptions

    While running the self-tests on a KASAN enabled kernel, I observed a
    slab-out-of-bounds splat very similar to the one reported in
    commit 821bbf7 ("ipv6: Fix KASAN: slab-out-of-bounds Read in
     fib6_nh_flush_exceptions").
    
    We additionally need to take care of fib6_metrics initialization
    failure when the caller provides an nh.
    
    The fix is similar, explicitly free the route instead of calling
    fib6_info_release on a half-initialized object.
    
    Fixes: f88d8ea ("ipv6: Plumb support for nexthop object in a fib6_info")
    Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Paolo Abeni authored and davem330 committed Jul 20, 2021
  4. fsl/fman: Add fibre support

    Set SUPPORTED_FIBRE to mac_dev->if_support. It allows proper usage of
    PHYs with optical/fiber support.
    
    Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
    Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    fidomax authored and davem330 committed Jul 20, 2021
  5. net/sched: act_skbmod: Skip non-Ethernet packets

    Currently tcf_skbmod_act() assumes that packets use Ethernet as their L2
    protocol, which is not always the case.  As an example, for CAN devices:
    
    	$ ip link add dev vcan0 type vcan
    	$ ip link set up vcan0
    	$ tc qdisc add dev vcan0 root handle 1: htb
    	$ tc filter add dev vcan0 parent 1: protocol ip prio 10 \
    		matchall action skbmod swap mac
    
    Doing the above silently corrupts all the packets.  Do not perform skbmod
    actions for non-Ethernet packets.
    
    Fixes: 86da71b ("net_sched: Introduce skbmod action")
    Reviewed-by: Cong Wang <cong.wang@bytedance.com>
    Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    ypl-coffee authored and davem330 committed Jul 20, 2021
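
A short Python sketch of why "swap mac" damages a non-Ethernet frame, and of the kind of guard the skbmod commit above describes. Checking the device hardware type is an assumption about how the skip is decided; `skbmod_swap_mac` is a hypothetical stand-in, and the frames are dummy byte strings.

```python
ARPHRD_ETHER = 1    # Ethernet hardware type
ARPHRD_CAN = 280    # CAN hardware type
ETH_ALEN = 6        # length of an Ethernet MAC address


def skbmod_swap_mac(dev_type, frame):
    """Swap destination and source MAC, but only on Ethernet devices."""
    if dev_type != ARPHRD_ETHER:
        return frame  # leave non-Ethernet frames untouched
    dst = frame[:ETH_ALEN]
    src = frame[ETH_ALEN:2 * ETH_ALEN]
    return src + dst + frame[2 * ETH_ALEN:]


eth_frame = bytes(range(14))  # dst(6) + src(6) + ethertype(2)
can_frame = bytes(range(16))  # a CAN frame has no MAC addresses at all

swapped = skbmod_swap_mac(ARPHRD_ETHER, eth_frame)
untouched = skbmod_swap_mac(ARPHRD_CAN, can_frame)
```

Without the guard, the first 12 bytes of the CAN frame (identifier, length and part of the payload) would be shuffled as if they were two MAC addresses.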
  6. mt7530 mt7530_fdb_write only set ivl bit vid larger than 1

    Fixes my earlier patch which broke vlan unaware bridges.
    
    The IVL bit now only gets set for vid's larger than 1.
    
    Fixes: 11d8d98 ("mt7530 fix mt7530_fdb_write vid missing ivl bit")
    Signed-off-by: Eric Woudstra <ericwouds@gmail.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    ericwoud authored and davem330 committed Jul 20, 2021
  7. Merge branch 'octeon-DMAC'

    Subbaraya Sundeep says:
    
    ====================
    octeontx2-af: Introduce DMAC based switching
    
    With this patch set packets can be switched between
    all CGX mapped PFs and VFs in the system based on
    the DMAC addresses. To implement this:
    AF allocates high priority rules from top entry(0) in MCAM.
    Rules are allocated for all the CGX mapped PFs and VFs though
    they are not active and with no NIXLFs attached.
    Rules for a PF/VF will be enabled only after they are brought up.
    Two rules one for TX and one for RX are allocated for each PF/VF.
    
    A packet sent from a PF/VF with a destination mac of another
    PF/VF will be hit by TX rule and sent to LBK channel 63. The
    same returned packet will be hit by RX rule whose action is
    to forward packet to PF/VF with that destination mac.
    
    Implementation of this for 98xx is tricky since there are
    two NIX blocks and till now a PF/VF can install rule for
    an NIX0/1 interface only if it is mapped to corresponding NIX0/1 block.
    Hence Tx rules are modified such that TX interface in MCAM
    entry can be either NIX0-TX or NIX1-TX.
    
    Testing:
    
    1. Create two VFs over PF1(on NIX0) and assign two VFs to two VMs
    2. Assign ip addresses to two VFs in VMs and PF2(on NIX1) in host.
    3. Assign static arp entries in two VMs and PF2.
    4. Ping between VMs and host PF2.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Jul 20, 2021
  8. Merge branch 'net-hns3-fixes-for-net'

    Guangbin Huang says:
    
    ====================
    net: hns3: fixes for -net
    
    This series includes some bugfixes for the HNS3 ethernet driver.
    ====================
    
    Link: https://lore.kernel.org/r/1626685988-25869-1-git-send-email-huangguangbin2@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Jul 20, 2021
  9. net: hns3: fix rx VLAN offload state inconsistent issue

    Currently, the VF doesn't enable rx VLAN offload when initializing,
    and the PF does it for VFs. If the user disables rx VLAN offload for
    a VF with ethtool -K and then reloads the VF driver, the rx VLAN
    offload state may become inconsistent between hardware and
    software.
    
    Fix it by enabling rx VLAN offload when the VF initializes.
    
    Fixes: e2cb1de ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
    Signed-off-by: Jian Shen <shenjian15@huawei.com>
    Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    IronShen authored and Jakub Kicinski committed Jul 20, 2021
  10. net: hns3: disable port VLAN filter when support function level VLAN filter control
    
    Due to a hardware limitation, the port VLAN filter is port level and
    applies to all the functions of the port. So if port VLAN bypass is
    not supported, it's necessary to disable the port VLAN filter in
    order to support function level VLAN filter control.
    
    Fixes: 2ba3066 ("net: hns3: add support for modify VLAN filter state")
    Signed-off-by: Jian Shen <shenjian15@huawei.com>
    Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    IronShen authored and Jakub Kicinski committed Jul 20, 2021
  11. net: hns3: add match_id to check mailbox response from PF to VF

    When a VF needs a response from the PF, it waits (1us - 1s) to receive
    the response; otherwise the wait times out and the VF action fails.
    If the VF does not receive the response to the 1st action because of
    a timeout, the 2nd action may receive the response to the 1st action
    and get incorrect response data. The VF must receive the right
    response from the PF, or unexpected errors will occur.
    
    This patch adds match_id to check mailbox responses from PF to VF,
    to make sure the VF gets the right response:
    1. The message sent from the VF is labelled with a match_id, which is
    a unique 16-bit non-zero value.
    2. The response sent from the PF is labelled with the match_id taken
    from the request.
    3. The VF uses the match_id to match request and response messages.
    
    This scheme depends on the PF driver supporting match_id; if the PF
    driver doesn't support it, the VF uses the original scheme.
    
    Signed-off-by: Peng Li <lipeng321@huawei.com>
    Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    321lipeng authored and Jakub Kicinski committed Jul 20, 2021
  12. net: hns3: fix possible mismatches resp of mailbox

    Currently, the mailbox synchronous communication between VF and PF uses
    the following fields to maintain communication:
    1. origin_mbx_msg, which combines the message code and subcode and is
    used to match request and response.
    2. received_resp, which indicates whether a response has been received.
    
    A mismatch is possible in the following situation:
    1. VF sends message A with code=1 subcode=1.
    2. PF is blocked for about 500ms when processing message A.
    3. VF detects a timeout for message A because it can't get the response
    within 500ms.
    4. VF sends message B with code=1 subcode=1, the same as message A.
    5. PF processes the first message A and sends the response message to
    VF.
    6. VF identifies the response as matching message B because the
    code/subcode is the same. This leads to a mismatch of request and
    response.
    
    To fix the above bug, we use the following scheme:
    1. The message sent from the VF is labelled with a match_id, which is
    a unique 16-bit non-zero value.
    2. The response sent from the PF is labelled with the match_id taken
    from the request.
    3. The VF uses the match_id to match request and response messages.
    
    As for the PF driver, it only needs to copy the match_id from request
    to response.
    
    Fixes: dde1a86 ("net: hns3: Add mailbox support to PF driver")
    Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
    Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    fengchengwen authored and Jakub Kicinski committed Jul 20, 2021
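
The match_id scheme from the two hns3 mailbox commits above can be sketched as follows. This is a Python toy, not the driver: `VfMailbox` and its methods are hypothetical, and treating a match_id of 0 in a response as "PF does not support match_id" is an assumption that follows from the id being defined as non-zero.

```python
class VfMailbox:
    """Toy VF side of the mailbox: tags each request with a 16-bit non-zero id."""

    def __init__(self):
        self._last_id = 0
        self.pending = None  # (code, subcode, match_id) of the request in flight

    def next_match_id(self):
        # Cycle through 1..0xFFFF so that 0 is never handed out.
        self._last_id = (self._last_id % 0xFFFF) + 1
        return self._last_id

    def send(self, code, subcode):
        match_id = self.next_match_id()
        self.pending = (code, subcode, match_id)
        return match_id

    def accept(self, code, subcode, match_id):
        """Return True if a response belongs to the request in flight."""
        if self.pending is None:
            return False
        p_code, p_subcode, p_id = self.pending
        if match_id != 0:
            return match_id == p_id
        # Assumed legacy path: fall back to the original code/subcode check.
        return (code, subcode) == (p_code, p_subcode)


vf = VfMailbox()
id_a = vf.send(1, 1)   # message A, which then times out on the VF side
id_b = vf.send(1, 1)   # message B with the same code/subcode
late_a = vf.accept(1, 1, id_a)   # PF finally answers A
reply_b = vf.accept(1, 1, id_b)  # PF answers B
legacy = vf.accept(1, 1, 0)      # response from a PF without match_id
```

The late response to A is rejected because its id no longer matches the request in flight, while the response to B is accepted. The legacy path shows the original ambiguity: with only code/subcode to compare, any response with the same pair is accepted.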
  13. net: bridge: do not replay fdb entries pointing towards the bridge twice

    This simple script:
    
    ip link add br0 type bridge
    ip link set swp2 master br0
    ip link set br0 address 00:01:02:03:04:05
    ip link del br0
    
    produces this result on a DSA switch:
    
    [  421.306399] br0: port 1(swp2) entered blocking state
    [  421.311445] br0: port 1(swp2) entered disabled state
    [  421.472553] device swp2 entered promiscuous mode
    [  421.488986] device swp2 left promiscuous mode
    [  421.493508] br0: port 1(swp2) entered disabled state
    [  421.886107] sja1105 spi0.1: port 1 failed to delete 00:01:02:03:04:05 vid 1 from fdb: -ENOENT
    [  421.894374] sja1105 spi0.1: port 1 failed to delete 00:01:02:03:04:05 vid 0 from fdb: -ENOENT
    [  421.943982] br0: port 1(swp2) entered blocking state
    [  421.949030] br0: port 1(swp2) entered disabled state
    [  422.112504] device swp2 entered promiscuous mode
    
    A very simplified view of what happens is:
    
    (1) the bridge port is created, and the bridge device inherits its MAC
        address
    
    (2) when joining, the bridge port (DSA) requests a replay of the
        addition of all FDB entries towards this bridge port and towards the
        bridge device itself. In fact, DSA calls br_fdb_replay() twice:
    
    	br_fdb_replay(br, brport_dev);
    	br_fdb_replay(br, br);
    
        DSA uses reference counting for the FDB entries. So the MAC address
        of the bridge is simply kept with refcount 2. When the bridge port
        leaves under normal circumstances, everything cancels out since the
        replay of the FDB entry deletion is also done twice per VLAN.
    
    (3) when the bridge MAC address changes, switchdev is notified of the
        deletion of the old address and of the insertion of the new one.
        But the old address does not really go away, since it had refcount
        2, and the new address is added "only" with refcount 1.
    
    (4) when the bridge port leaves now, it will replay a deletion of the
        FDB entries pointing towards the bridge twice. Then DSA will
        complain that it can't delete something that no longer exists.
    
    It is clear that the problem is that the FDB entries towards the bridge
    are replayed too many times, so let's fix that problem.
    
    Fixes: 63c5145 ("net: dsa: replay the local bridge FDB entries pointing to the bridge dev too")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://lore.kernel.org/r/20210719093916.4099032-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Jul 20, 2021
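
The refcount bookkeeping in steps (1) to (4) of the bridge commit above can be replayed with a toy Python model. `RefcountedFdb` and `scenario` are hypothetical names, the old MAC address is made up, and VLANs are ignored (the log shows the same failure once per VLAN).

```python
import errno


class RefcountedFdb:
    """Toy switch FDB that reference-counts entries, as the message says DSA does."""

    def __init__(self):
        self.refs = {}

    def add(self, mac):
        self.refs[mac] = self.refs.get(mac, 0) + 1

    def delete(self, mac):
        if mac not in self.refs:
            return -errno.ENOENT
        self.refs[mac] -= 1
        if self.refs[mac] == 0:
            del self.refs[mac]
        return 0


def scenario(replays):
    """Join, change the bridge MAC, leave; 'replays' is how often entries are replayed."""
    fdb = RefcountedFdb()
    old_mac = "02:00:00:00:00:01"   # made-up address inherited from the port
    new_mac = "00:01:02:03:04:05"
    for _ in range(replays):        # (2) port joins, additions are replayed
        fdb.add(old_mac)
    fdb.delete(old_mac)             # (3) MAC change is notified once
    fdb.add(new_mac)
    results = [fdb.delete(new_mac) for _ in range(replays)]  # (4) port leaves
    return results, dict(fdb.refs)


double_results, double_left = scenario(replays=2)
single_results, single_left = scenario(replays=1)
```

With two replays the second deletion returns `-ENOENT` and the old address is left behind with one reference. With a single replay every deletion succeeds and the table ends up empty.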
  14. net: Update MAINTAINERS for MediaTek switch driver

    Update maintainers for MediaTek switch driver with Deng Qingfang who has
    contributed many high-quality patches (interrupt, VLAN, GPIO, etc.)
    and will help with maintenance.
    
    Signed-off-by: Landen Chao <landen.chao@mediatek.com>
    Signed-off-by: DENG Qingfang <dqfext@gmail.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Acked-by: Vladimir Oltean <olteanv@gmail.com>
    Link: https://lore.kernel.org/r/49e1aa8aac58dcbf1b5e036d09b3fa3bbb1d94d0.1626751861.git.landen.chao@mediatek.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Landen Chao authored and Jakub Kicinski committed Jul 20, 2021
  15. net/tcp_fastopen: remove obsolete extern

    After cited commit, sysctl_tcp_fastopen_blackhole_timeout is no longer
    a global variable.
    
    Fixes: 3733be1 ("ipv4: Namespaceify tcp_fastopen_blackhole_timeout knob")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
    Cc: Yuchung Cheng <ycheng@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Acked-by: Wei Wang <weiwan@google.com>
    Link: https://lore.kernel.org/r/20210719092028.3016745-1-eric.dumazet@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    neebe000 authored and Jakub Kicinski committed Jul 20, 2021
  16. ipv6: ip6_finish_output2: set sk into newly allocated nskb

    skb_set_owner_w() should set sk not to old skb but to new nskb.
    
    Fixes: 5796015 ("ipv6: allocate enough headroom in ip6_finish_output2()")
    Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
    Link: https://lore.kernel.org/r/70c0744f-89ae-1869-7e3e-4fa292158f4b@virtuozzo.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vaverin authored and Jakub Kicinski committed Jul 20, 2021

Commits on Jul 19, 2021

  1. octeontx2-af: Introduce internal packet switching

    As of now any communication between CGXs PFs and
    their VFs within the system is possible only by
    external switches sending packets back to the
    system. This patch adds internal switching support.
    Broadcast packet replication is not covered here.
    RVU admin function (AF) maintains MAC addresses
    of all interfaces in the system. When switching is
    enabled, MCAM entries are allocated to install rules
    such that packets with DMAC matching any of the
    internal interface MAC addresses are punted back
    into the system via the loopback channel.
    On the receive side the default unicast rules
    are modified to not check for ingress channel.
    So any packet with matching DMAC irrespective of
    which interface it is coming from will be forwarded
    to the respective PF/VF interface.
    The transmit side rules and default unicast rules
    are updated if user changes MAC address of an interface.
    
    Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Subbaraya Sundeep authored and davem330 committed Jul 19, 2021
  2. octeontx2-af: Prepare for allocating MCAM rules for AF

    AF till now only manages the allocation and freeing of
    MCAM rules for other PF/VFs in system. To implement
    L2 switching between all CGX mapped PF and VFs, AF
    requires MCAM entries for DMAC rules for each PF and VF.
    This patch modifies AF driver such that AF can also
    allocate MCAM rules and install rules for other
    PFs and VFs. All the checks like channel verification
    for RX rules and PF_FUNC verification for TX rules are
    relaxed in case AF is allocating or installing rules.
    Also all the entry and counter to owner mappings are
    set to NPC_MCAM_INVALID_MAP when they are free indicating
    those are not allocated to AF nor PF/VFs.
    This patch also ensures that AF allocated and installed
    entries are displayed in debugfs.
    
    Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Subbaraya Sundeep authored and davem330 committed Jul 19, 2021
  3. octeontx2-af: Enable transmit side LBK link

    For enabling VF-VF switching the packets egressing
    out of CGX mapped VFs needed to be sent to LBK
    so that same packets are received back to the system.
    But the LBK link also needs to be enabled in addition
    to a VF's mapped CGX_LMAC link otherwise hardware
    raises send error interrupt indicating selected LBK
    link is not enabled in NIX_AF_TL3_TL2X_LINKX_CFG register.
    Hence this patch enables all LBK links in
    TL3_TL2_LINKX_CFG registers.
    Also to enable packet flow between PFs/VFs of NIX0
    and PFs/VFs of NIX1 (in 98xx silicon) the NPC TX DMAC
    rules have to be installed such that they are hit
    for any TX interface, i.e., NIX0-TX or NIX1-TX, provided
    the DMAC match criteria are met. Hence this patch changes the
    behavior such that MCAM is programmed to match with any
    NIX0/1-TX interface for TX rules.
    
    Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
    Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Subbaraya Sundeep authored and davem330 committed Jul 19, 2021
  4. net/tcp_fastopen: fix data races around tfo_active_disable_stamp

    tfo_active_disable_stamp is read and written locklessly.
    We need to annotate these accesses appropriately.
    
    Then, we need to perform the atomic_inc(tfo_active_disable_times)
    after the timestamp has been updated, and thus add barriers
    to make sure tcp_fastopen_active_should_disable() won't read
    a stale timestamp.
    
    Fixes: cf1ef3f ("net/tcp_fastopen: Disable active side TFO in certain scenarios")
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Wei Wang <weiwan@google.com>
    Cc: Yuchung Cheng <ycheng@google.com>
    Cc: Neal Cardwell <ncardwell@google.com>
    Acked-by: Wei Wang <weiwan@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    neebe000 authored and davem330 committed Jul 19, 2021
  5. Merge branch 'dt-bindinga-dwmac'

    Joakim Zhang says:
    
    ====================
    dt-bindings: net: dwmac-imx: convert
    
    This patch set converts the imx dwmac binding to schema, and fixes
    issues found by dt_binding_check and dtbs_check.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Jul 19, 2021
  6. arm64: dts: imx8mp: change interrupt order per dt-binding

    This patch changes the interrupt order, an issue found by dtbs_check.
    
    $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- dtbs_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/net/nxp,dwmac-imx.yaml
    arch/arm64/boot/dts/freescale/imx8mp-evk.dt.yaml: ethernet@30bf0000: interrupt-names:0: 'macirq' was expected
    arch/arm64/boot/dts/freescale/imx8mp-evk.dt.yaml: ethernet@30bf0000: interrupt-names:1: 'eth_wake_irq' was expected
    
    According to Documentation/devicetree/bindings/net/snps,dwmac.yaml, we
    should list the interrupts in its order.
    
    Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Joakim Zhang authored and davem330 committed Jul 19, 2021
  7. dt-bindings: net: imx-dwmac: convert imx-dwmac bindings to yaml

    In order to automate the verification of DT nodes, convert imx-dwmac to
    nxp,dwmac-imx.yaml, and pass the checks below.
    
    $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- dt_binding_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/net/nxp,dwmac-imx.yaml
    $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- dtbs_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/net/nxp,dwmac-imx.yaml
    
    Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Joakim Zhang authored and davem330 committed Jul 19, 2021
  8. dt-bindings: net: snps,dwmac: add missing DWMAC IP version

    Add the missing DWMAC IP version in snps,dwmac.yaml, which was found by
    the command below, as NXP i.MX8 families support SNPS DWMAC 5.10a IP.
    
    $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- dt_binding_check DT_SCHEMA_FILES=Documentation/devicetree/bindings/net/nxp,dwmac-imx.yaml
    Documentation/devicetree/bindings/net/nxp,dwmac-imx.example.dt.yaml:
    ethernet@30bf0000: compatible: None of ['nxp,imx8mp-dwmac-eqos', 'snps,dwmac-5.10a'] are valid under the given schema
    
    Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Joakim Zhang authored and davem330 committed Jul 19, 2021
  9. net: hisilicon: rename CACHE_LINE_MASK to avoid redefinition

    Building on ARCH=arc causes a "redefined" warning, so rename this
    driver's CACHE_LINE_MASK to avoid the warning.
    
    ../drivers/net/ethernet/hisilicon/hip04_eth.c:134: warning: "CACHE_LINE_MASK" redefined
      134 | #define CACHE_LINE_MASK   0x3F
    In file included from ../include/linux/cache.h:6,
                     from ../include/linux/printk.h:9,
                     from ../include/linux/kernel.h:19,
                     from ../include/linux/list.h:9,
                     from ../include/linux/module.h:12,
                     from ../drivers/net/ethernet/hisilicon/hip04_eth.c:7:
    ../arch/arc/include/asm/cache.h:17: note: this is the location of the previous definition
       17 | #define CACHE_LINE_MASK  (~(L1_CACHE_BYTES - 1))
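
The collision and the rename can be sketched as follows. The two macro values are taken from the warning above; `L1_CACHE_BYTES` being 64, the prefixed name `DRV_CACHE_LINE_MASK`, and both helper functions are assumptions for illustration and are not necessarily what the patch uses.

```c
#include <assert.h>

/* Stand-in for the architecture header definition quoted in the warning.
 * L1_CACHE_BYTES = 64 is an assumption made for this sketch. */
#define L1_CACHE_BYTES  64
#define CACHE_LINE_MASK (~(L1_CACHE_BYTES - 1))

/* Driver-private mask under a hypothetical prefixed name, so it no longer
 * redefines the architecture macro above. */
#define DRV_CACHE_LINE_MASK 0x3F

/* Offset of an address within its 64-byte cache line (driver-style mask). */
static unsigned int line_offset(unsigned long addr)
{
    return (unsigned int)(addr & DRV_CACHE_LINE_MASK);
}

/* Address rounded down to the start of its cache line (arch-style mask). */
static unsigned long line_base(unsigned long addr)
{
    return addr & (unsigned long)CACHE_LINE_MASK;
}
```

Note that the two definitions are not just duplicates: one selects the low bits and the other clears them, so silently picking up the wrong one would change behavior, not only produce a warning.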
    
    Fixes: d413779 ("net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC")
    Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
    Cc: Vineet Gupta <vgupta@synopsys.com>
    Cc: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Jakub Kicinski <kuba@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    rddunlap authored and davem330 committed Jul 19, 2021
  10. Merge branch 'bnxt_en-fixes'

    Michael Chan says:
    
    ====================
    bnxt_en: Bug fixes
    
    Most of the fixes in this series have to do with error recovery.  They
    include error path handling when the error recovery has to abort, and
    the rediscovery of capabilities (PTP and RoCE) after firmware reset
    that may result in capability changes.
    
    Two other fixes are to reject invalid ETS settings and to validate
    VLAN protocol in the RX path.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Jul 19, 2021
  11. bnxt_en: Fix PTP capability discovery

    The current PTP initialization logic does not account for firmware
    reset that may cause PTP capability to change.  The valid pointer
    bp->ptp_cfg is used to indicate that the device is capable of PTP
    and that it has been initialized.  So we must clean up bp->ptp_cfg
    and free it if the firmware after reset does not support PTP.
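
The pointer-as-capability pattern described above might look like the sketch below. It is a userspace illustration only: `struct dev`, `struct ptp_cfg`, `rediscover_ptp()`, and the `fw_supports_ptp` flag are hypothetical names, and `calloc`/`free` stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct ptp_cfg { int initialized; };
struct dev { struct ptp_cfg *ptp_cfg; }; /* non-NULL means PTP capable and set up */

/* Re-run capability discovery, e.g. after a firmware reset.
 * Returns 0 on success, -1 on allocation failure. */
static int rediscover_ptp(struct dev *bp, bool fw_supports_ptp)
{
    if (!fw_supports_ptp) {
        /* Capability went away: free the state and clear the pointer so
         * later code does not treat the device as PTP capable. */
        free(bp->ptp_cfg);
        bp->ptp_cfg = NULL;
        return 0;
    }
    if (!bp->ptp_cfg) {
        bp->ptp_cfg = calloc(1, sizeof(*bp->ptp_cfg));
        if (!bp->ptp_cfg)
            return -1;
    }
    bp->ptp_cfg->initialized = 1;
    return 0;
}
```

Because the pointer doubles as the capability flag, leaving it set after the firmware drops PTP support would make every later `if (bp->ptp_cfg)` check give the wrong answer.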
    
    Fixes: 93cb62d ("bnxt_en: Enable hardware PTP support")
    Cc: Richard Cochran <richardcochran@gmail.com>
    Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
    Signed-off-by: Michael Chan <michael.chan@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Michael Chan authored and davem330 committed Jul 19, 2021