Jiri-Kosina/ma…

Commits on Feb 15, 2022

  1. mac80211: fix RCU usage in ieee80211_tx_h_select_key()

    ieee80211_tx_h_select_key() performs a series of RCU dereferences,
    but none of its callers seem to take the RCU read-side lock; acquire
    the lock in ieee80211_tx_h_select_key() itself.
    
    Spotted with rtw89 driver.
    
    This fixes the splat below.
    
     =============================
     WARNING: suspicious RCU usage
     5.17.0-rc4-00003-gccad664b7f14 #3 Tainted: G            E
     -----------------------------
     net/mac80211/tx.c:593 suspicious rcu_dereference_check() usage!
    
     other info that might help us debug this:
    
     rcu_scheduler_active = 2, debug_locks = 1
     2 locks held by kworker/u33:0/184:
      #0: ffff9c0b14811d38 ((wq_completion)rtw89_tx_wq){+.+.}-{0:0}, at: process_one_work+0x258/0x660
      #1: ffffb97380cf3e78 ((work_completion)(&rtwdev->txq_work)){+.+.}-{0:0}, at: process_one_work+0x258/0x660
    
     stack backtrace:
     CPU: 8 PID: 184 Comm: kworker/u33:0 Tainted: G            E     5.17.0-rc4-00003-gccad664b7f14 #3 473b49ab0e7c2d6af2900c756bfd04efd7a9de13
     Hardware name: LENOVO 20UJS2B905/20UJS2B905, BIOS R1CET63W(1.32 ) 04/09/2021
     Workqueue: rtw89_tx_wq rtw89_core_txq_work [rtw89_core]
     Call Trace:
      <TASK>
      dump_stack_lvl+0x58/0x71
      ieee80211_tx_h_select_key+0x2c0/0x530 [mac80211 911c23e2351c0ae60b597a67b1204a5ea955e365]
      ieee80211_tx_dequeue+0x1a7/0x1260 [mac80211 911c23e2351c0ae60b597a67b1204a5ea955e365]
      rtw89_core_txq_work+0x1a6/0x420 [rtw89_core b39ba493f2e517ad75e0f8187ecc24edf58bbbea]
      process_one_work+0x2d8/0x660
      worker_thread+0x39/0x3e0
      ? process_one_work+0x660/0x660
      kthread+0xe5/0x110
      ? kthread_complete_and_exit+0x20/0x20
      ret_from_fork+0x22/0x30
      </TASK>
    
     =============================
     WARNING: suspicious RCU usage
     5.17.0-rc4-00003-gccad664b7f14 #3 Tainted: G            E
     -----------------------------
     net/mac80211/tx.c:607 suspicious rcu_dereference_check() usage!
    
     other info that might help us debug this:
    
     rcu_scheduler_active = 2, debug_locks = 1
     2 locks held by kworker/u33:0/184:
      #0: ffff9c0b14811d38 ((wq_completion)rtw89_tx_wq){+.+.}-{0:0}, at: process_one_work+0x258/0x660
      #1: ffffb97380cf3e78 ((work_completion)(&rtwdev->txq_work)){+.+.}-{0:0}, at: process_one_work+0x258/0x660
    
     stack backtrace:
     CPU: 8 PID: 184 Comm: kworker/u33:0 Tainted: G            E     5.17.0-rc4-00003-gccad664b7f14 #3 473b49ab0e7c2d6af2900c756bfd04efd7a9de13
     Hardware name: LENOVO 20UJS2B905/20UJS2B905, BIOS R1CET63W(1.32 ) 04/09/2021
     Workqueue: rtw89_tx_wq rtw89_core_txq_work [rtw89_core]
     Call Trace:
      <TASK>
      dump_stack_lvl+0x58/0x71
      ieee80211_tx_h_select_key+0x464/0x530 [mac80211 911c23e2351c0ae60b597a67b1204a5ea955e365]
      ieee80211_tx_dequeue+0x1a7/0x1260 [mac80211 911c23e2351c0ae60b597a67b1204a5ea955e365]
      rtw89_core_txq_work+0x1a6/0x420 [rtw89_core b39ba493f2e517ad75e0f8187ecc24edf58bbbea]
      process_one_work+0x2d8/0x660
      worker_thread+0x39/0x3e0
      ? process_one_work+0x660/0x660
      kthread+0xe5/0x110
      ? kthread_complete_and_exit+0x20/0x20
      ret_from_fork+0x22/0x30
      </TASK>
    
    Fixes: a0761a3 ("mac80211: drop data frames without key on encrypted links")
    Fixes: 46f6b06 ("mac80211: Encrypt "Group addressed privacy" action frames")
    Fixes: 3cfcf6a ("mac80211: 802.11w - Use BIP (AES-128-CMAC)")
    Fixes: f7e0104 ("mac80211: support separate default keys")
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>
    Jiri Kosina authored and intel-lab-lkp committed Feb 15, 2022
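
The locking rule behind the mac80211 fix above can be illustrated with a toy model. This is a sketch in Python, not kernel code: `rcu_read_section`, `deref_check`, and the two `select_key_*` functions are hypothetical names that only mimic the idea of a dereference check that complains when no read-side section is held.

```python
import threading
import warnings

# Per-thread nesting depth of the (toy) read-side critical section.
_state = threading.local()


def _depth():
    return getattr(_state, "depth", 0)


class rcu_read_section:
    """Toy stand-in for a read-side lock/unlock pair (hypothetical)."""

    def __enter__(self):
        _state.depth = _depth() + 1
        return self

    def __exit__(self, exc_type, exc, tb):
        _state.depth = _depth() - 1
        return False


def deref_check(pointer):
    """Return the protected value, warning if no read-side section is held."""
    if _depth() == 0:
        warnings.warn("suspicious dereference outside read-side section")
    return pointer


def select_key_unfixed(keys):
    # Mirrors the bug: dereference with no read-side section held.
    return deref_check(keys.get("default"))


def select_key_fixed(keys):
    # Mirrors the fix: the function enters the read-side section itself.
    with rcu_read_section():
        return deref_check(keys.get("default"))
```

In this model the unfixed variant triggers a warning on every call, while the fixed variant is silent regardless of what its callers do, which is the reason for taking the lock inside the function rather than in each caller.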

Commits on Feb 14, 2022

  1. rtw89: handle TX/RX 160M bandwidth

    Apply 160M bandwidth to RA (rate adaptive) mechanism, so it can transmit
    packets with this bandwidth. On the other hand, convert 160M bandwidth
    from RX desc to rx_info_bw.
    
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20220211075953.40421-7-pkshih@realtek.com
    Ping-Ke Shih authored and Kalle Valo committed Feb 14, 2022
  2. rtw89: declare if chip support 160M bandwidth

    The new chip can support 160M, so add a chip attribute to indicate
    that the chip supports it.
    
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20220211075953.40421-6-pkshih@realtek.com
    Ping-Ke Shih authored and Kalle Valo committed Feb 14, 2022
  3. rtw89: add 6G support to rate adaptive mechanism

    Construct the rate mask of the 6G band so that the rate adaptive
    mechanism can work well on this band.
    
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20220211075953.40421-5-pkshih@realtek.com
    Ping-Ke Shih authored and Kalle Valo committed Feb 14, 2022
  4. rtw89: extend subband for 6G band

    Split the 6G band into 8 sub-bands with indexes from 0 to 7,
    i.e. RTW89_CH_6G_BAND_IDX[0-7]. Then decide the subband by both
    band and channel instead of just channel, because 5G and 6G
    channel numbers conflict.
    
    Moreover, add default case to the existing use of switch (subband).
    
    Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20220211075953.40421-4-pkshih@realtek.com
    Zong-Zhe Yang authored and Kalle Valo committed Feb 14, 2022
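
Why the subband has to be keyed by both band and channel can be shown with a small sketch. Everything here is an assumption made for illustration: the band tags, the function name `get_subband`, and in particular the channel ranges, which are not given in the commit message and may differ from the driver's real tables.

```python
# Illustrative band tags; not the driver's real constants.
BAND_2G, BAND_5G, BAND_6G = "2G", "5G", "6G"

# Hypothetical channel ranges per sub-band (inclusive). The 6G entry has
# 8 ranges, matching sub-band indexes 0..7 from the commit message.
SUBBANDS = {
    BAND_2G: [(1, 14)],
    BAND_5G: [(36, 64), (100, 144), (149, 177)],
    BAND_6G: [(1, 29), (33, 61), (65, 93), (97, 125),
              (129, 157), (161, 189), (193, 221), (225, 253)],
}


def get_subband(band, channel):
    """Return a (band, index) sub-band tag for the given band and channel."""
    for index, (low, high) in enumerate(SUBBANDS.get(band, [])):
        if low <= channel <= high:
            return (band, index)
    # Default case: fail loudly instead of silently picking a sub-band.
    raise ValueError(f"channel {channel} is not valid for band {band}")
```

Under these assumed ranges, channel 149 maps to a different sub-band depending on the band, so a lookup keyed on the channel number alone could not tell the two apart.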
  5. rtw89: refine naming of rfk helpers with prefix

    Since these macros in the rfk helpers are common now, a common naming
    scheme is better. So, apply RTW89_ as a prefix to them, and modify
    their uses correspondingly. No logic is changed at all.
    
    Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20220211075953.40421-3-pkshih@realtek.com
    Zong-Zhe Yang authored and Kalle Valo committed Feb 14, 2022
  6. rtw89: make rfk helpers common across chips

    These rfk helpers are also useful for the chip which is under planning.
    So, move them to common code to avoid duplicate stuff in the future.
    
    Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
    Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/20220211075953.40421-2-pkshih@realtek.com
    Zong-Zhe Yang authored and Kalle Valo committed Feb 14, 2022
  7. brcmfmac: Add BCM43454/6 support

    BCM43454/6 is a variant of BCM4345 which is exactly identical to
    BCM4345/6, except the chip id is 0xa9be. This patch adds support
    for BCM43454/6 by handling it in the same way as BCM4345.
    
    Note: when loading some specific versions of BCM4345 firmware, the
    chip id may become 0x4345. This is expected behavior, and the id
    reverts to 0xa9be after a power cycle.
    
    Signed-off-by: Jiaqing Zhao <jiaqing.zhao@intel.com>
    Signed-off-by: Kalle Valo <kvalo@kernel.org>
    Link: https://lore.kernel.org/r/CO1PR11MB47859B51BCA88613D1582EB88E2E9@CO1PR11MB4785.namprd11.prod.outlook.com
    jiaqingz-intel authored and Kalle Valo committed Feb 14, 2022

Commits on Feb 11, 2022

  1. Merge tag 'wireless-next-2022-02-11' of git://git.kernel.org/pub/scm/…

    …linux/kernel/git/wireless/wireless-next
    
    wireless-next patches for v5.18
    
    First set of patches for v5.18, with both wireless and stack patches.
    rtw89 now has AP mode support and wcn36xx has survey support. But
    otherwise pretty normal.
    
    Major changes:
    
    ath11k
    
    * add LDPC FEC type in 802.11 radiotap header
    
    * enable RX PPDU stats in monitor co-exist mode
    
    wcn36xx
    
    * implement survey reporting
    
    brcmfmac
    
    * add CYW43570 PCIE device
    
    rtw88
    
    * rtw8821c: enable RFE 6 devices
    
    rtw89
    
    * AP mode support
    
    mt76
    
    * mt7916 support
    
    * background radar detection support
    davem330 committed Feb 11, 2022
  2. Merge branch 'ipv6-loopback'

    Eric Dumazet says:
    
    ====================
    ipv6: remove addrconf reliance on loopback
    
    Second patch in this series removes IPv6 requirement about the netns
    loopback device being the last device being dismantled.
    
    This was needed because rt6_uncached_list_flush_dev()
    and ip6_dst_ifdown() had to switch dst dev to a known
    device (loopback).
    
    Instead of loopback, we can use the (hidden) blackhole_netdev
    which is also always there.
    
    This will allow future simplifications of netdev_run_todo()
    and other parts of the stack like default_device_exit_batch().
    
    Last two patches are optimizations for both IP families.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Feb 11, 2022
  3. ipv4: add (struct uncached_list)->quarantine list

    This is an optimization to keep the per-cpu lists as short as possible:
    
    Whenever rt_flush_dev() changes one rtable dst.dev
    matching the disappearing device, it can transfer the object
    to a quarantine list, waiting for a final rt_del_uncached_list().
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    neebe000 authored and davem330 committed Feb 11, 2022
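
The quarantine idea described above can be sketched as a toy data structure. This is a simplified model, not the kernel implementation: the class and method names are hypothetical, and a plain string stands in for the replacement device.

```python
class Route:
    """Toy route entry that only remembers its device name."""

    def __init__(self, dev):
        self.dev = dev


class UncachedList:
    """Toy per-CPU list with a quarantine list (names are illustrative)."""

    def __init__(self):
        self.head = []        # entries still walked on every device flush
        self.quarantine = []  # entries already re-pointed, awaiting deletion

    def add(self, route):
        self.head.append(route)

    def flush_dev(self, dev, replacement="blackhole"):
        """Re-point routes using `dev` and park them on the quarantine list."""
        kept = []
        for route in self.head:
            if route.dev == dev:
                route.dev = replacement
                self.quarantine.append(route)
            else:
                kept.append(route)
        self.head = kept

    def delete(self, route):
        """Final removal; the route may sit on either list."""
        for entries in (self.head, self.quarantine):
            if route in entries:
                entries.remove(route)
                return
```

After a flush, later flushes for other devices only walk `head`, which no longer contains the entries that were already re-pointed. That is the sense in which the per-CPU lists stay short.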
  4. ipv6: add (struct uncached_list)->quarantine list

    This is an optimization to keep the per-cpu lists as short as possible:
    
    Whenever rt6_uncached_list_flush_dev() changes one rt6_info
    matching the disappearing device, it can transfer the object
    to a quarantine list, waiting for a final rt6_uncached_list_del().
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    neebe000 authored and davem330 committed Feb 11, 2022
  5. ipv6: give an IPv6 dev to blackhole_netdev

    IPv6 addrconf notifiers want the loopback device to
    be the last device being dismantled at netns deletion.
    
    This caused many limitations and workarounds.
    
    Back in linux-5.3, Mahesh added a per host blackhole_netdev
    that can be used whenever we need to make sure objects no longer
    refer to a disappearing device.
    
    If we attach to blackhole_netdev an ip6_ptr (allocate an idev),
    then we can use this special device (which is never freed)
    in place of the loopback_dev (which can be freed).
    
    This will permit improvements in netdev_run_todo() and other parts
    of the stack where we had steps to make sure loopback_dev was
    the last device to disappear.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Mahesh Bandewar <maheshb@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    neebe000 authored and davem330 committed Feb 11, 2022
  6. ipv6: get rid of net->ipv6.rt6_stats->fib_rt_uncache

    This counter has never been visible, there is little point
    trying to maintain it.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    neebe000 authored and davem330 committed Feb 11, 2022
  7. dsa: mv88e6xxx: make serdes SGMII/Fiber tx amplitude configurable

    The mv88e6352, mv88e6240 and mv88e6176 have a serdes interface. This patch
    allows configuring the output swing to a desired value in the
    phy-handle of the port. The value, which is peak to peak, has to be
    specified in microvolts. As the chips only support eight dedicated
    values, we return EINVAL if the value in the DTS does not match one of
    these values.
    
    Signed-off-by: Holger Brunck <holger.brunck@hitachienergy.com>
    Reviewed-by: Andrew Lunn <andrew@lunn.ch>
    Reviewed-by: Marek Behún <kabel@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Holger Brunck authored and davem330 committed Feb 11, 2022
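
The exact-match rule described above can be sketched as a small lookup. The table below is a placeholder: the commit message does not list the eight values the chips accept, so the microvolt figures and the function name are invented purely for illustration.

```python
import errno

# Placeholder table: NOT the real hardware values, which the commit
# message does not list. Eight entries, in microvolts peak to peak.
SUPPORTED_P2P_MICROVOLT = (14000, 112000, 210000, 308000,
                           406000, 504000, 602000, 700000)


def amplitude_to_register(microvolt):
    """Map a peak-to-peak value in microvolts to a register index.

    Returns a negative errno value when the value is not an exact match,
    mirroring the "return EINVAL" behavior the commit describes.
    """
    if microvolt not in SUPPORTED_P2P_MICROVOLT:
        return -errno.EINVAL
    return SUPPORTED_P2P_MICROVOLT.index(microvolt)
```

Rejecting values that are not in the table, instead of rounding to the nearest supported one, keeps a typo in the device tree from silently selecting an unintended amplitude.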
  8. dt-bindings: phy: Add tx-p2p-microvolt property binding

    Common PHYs and network PCSes often have the possibility to specify
    peak-to-peak voltage on the differential pair - the default voltage
    sometimes needs to be changed for a particular board.
    
    Add properties `tx-p2p-microvolt` and `tx-p2p-microvolt-names` for this
    purpose. The second property is needed to specify the mode for the
    corresponding voltage in the `tx-p2p-microvolt` property, if the voltage
    is to be used only for a specific mode. More voltage-mode pairs can be
    specified.
    
    Example usage with only one voltage (it will be used for all supported
    PHY modes, the `tx-p2p-microvolt-names` property is not needed in this
    case):
    
      tx-p2p-microvolt = <915000>;
    
    Example usage with voltages for multiple modes:
    
      tx-p2p-microvolt = <915000>, <1100000>, <1200000>;
      tx-p2p-microvolt-names = "2500base-x", "usb", "pcie";
    
    Add these properties into a separate file phy/transmit-amplitude.yaml,
    which should be referenced by any binding that uses it.
    
    Signed-off-by: Marek Behún <kabel@kernel.org>
    Reviewed-by: Rob Herring <robh@kernel.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    elkablo authored and davem330 committed Feb 11, 2022
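
How a consumer might resolve the two properties for a given PHY mode can be sketched as follows. This is an assumption about consumer-side logic, not code from the binding or from any driver; the function name and the list-based representation of the properties are hypothetical.

```python
def tx_p2p_microvolt_for_mode(microvolts, names, mode):
    """Pick the voltage for `mode` from the two property value lists.

    `microvolts` models tx-p2p-microvolt and `names` models
    tx-p2p-microvolt-names (None or empty when the property is absent).
    Returns None when no voltage applies to the mode.
    """
    if not names:
        # Only one voltage given: it applies to every supported mode.
        return microvolts[0] if microvolts else None
    for name, value in zip(names, microvolts):
        if name == mode:
            return value
    return None
```

With the two examples from the commit message, the single-voltage form yields 915000 for any mode, while the multi-mode form yields 1100000 for "usb" and nothing for a mode that is not named.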
  9. ipv6: Reject routes configurations that specify dsfield (tos)

    The ->rtm_tos option is normally used to route packets based on both
    the destination address and the DS field. However it's ignored for
    IPv6 routes. Setting ->rtm_tos for IPv6 is thus invalid as the route
    is going to work only on the destination address anyway, so it won't
    behave as specified.
    
    Suggested-by: Toke Høiland-Jørgensen <toke@redhat.com>
    Signed-off-by: Guillaume Nault <gnault@redhat.com>
    Reviewed-by: David Ahern <dsahern@kernel.org>
    Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Guillaume Nault authored and davem330 committed Feb 11, 2022
  10. Merge branch 'dsa-cleanup'

    Vladimir Oltean says:
    
    ====================
    More aggressive DSA cleanup
    
    This series deletes some code which is apparently not needed.
    
    I've had these patches in my tree for a while, and testing on my boards
    didn't reveal any issues.
    
    Compared to the RFC v1 series, the only change is the addition of patch 3.
    https://patchwork.kernel.org/project/netdevbpf/cover/20220107184842.550334-1-vladimir.oltean@nxp.com/
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Feb 11, 2022
  11. net: dsa: remove lockdep class for DSA slave address list

    Since commit 2f1e8ea ("net: dsa: link interfaces with the DSA
    master to get rid of lockdep warnings"), suggested by Cong Wang, the
    DSA interfaces and their master have different dev->nested_level, which
    makes netif_addr_lock() stop complaining about potentially recursive
    locking on the same lock class.
    
    So we no longer need DSA slave interfaces to have their own lockdep
    class.
    
    Cc: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    vladimiroltean authored and davem330 committed Feb 11, 2022
  12. net: dsa: remove lockdep class for DSA master address list

    Since commit 2f1e8ea ("net: dsa: link interfaces with the DSA
    master to get rid of lockdep warnings"), suggested by Cong Wang, the
    DSA interfaces and their master have different dev->nested_level, which
    makes netif_addr_lock() stop complaining about potentially recursive
    locking on the same lock class.
    
    So we no longer need DSA masters to have their own lockdep class.
    
    Cc: Cong Wang <xiyou.wangcong@gmail.com>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    vladimiroltean authored and davem330 committed Feb 11, 2022
  13. net: dsa: remove ndo_get_phys_port_name and ndo_get_port_parent_id

    There are no legacy ports, DSA registers a devlink instance with ports
    unconditionally for all switch drivers. Therefore, delete the old-style
    ndo operations used for determining bridge forwarding domains.
    
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Tested-by: Florian Fainelli <f.fainelli@gmail.com>
    Reviewed-by: Ido Schimmel <idosch@nvidia.com>
    Reviewed-by: Jiri Pirko <jiri@nvidia.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    vladimiroltean authored and davem330 committed Feb 11, 2022
  14. Merge branch 'smc-optimizations'

    D. Wythe says:
    
    ====================
    net/smc: Optimizing performance in short-lived scenarios
    
    This patch set aims to optimize the performance of SMC in short-lived
    link scenarios, which is quite unsatisfactory right now.
    
    In our benchmark, we test it with the following script:
    
    ./wrk -c 10000 -t 4 -H 'Connection: Close' -d 20 http://smc-server
    
    Current performance figures look like this:
    
    Running 20s test @ http://11.213.45.6
      4 threads and 10000 connections
      4956 requests in 20.06s, 3.24MB read
      Socket errors: connect 0, read 0, write 672, timeout 0
    Requests/sec:    247.07
    Transfer/sec:    165.28KB
    
    There are many reasons for this. This patch set does not solve all of
    them, but it alleviates the problem considerably.
    
    Patch 1/5  (Make smc_tcp_listen_work() independent) :
    
    Separate smc_tcp_listen_work() from smc_listen_work() and make them
    independent of each other, so that a busy SMC handshake can no longer
    affect incoming TCP connections. This avoids discarding a large number
    of TCP connections once the queue is overstocked, which undoubtedly
    raises the connection establishment time.
    
    Patch 2/5 (Limit SMC backlog connections):
    
    Since patch 1 has separated smc_tcp_listen_work() from
    smc_listen_work(), TCP accept is now unrestricted. This patch puts a
    limit on SMC backlog connections, following the implementation of
    TCP.
    
    Patch 3/5 (Limit SMC visits when handshake workqueue congested):
    
    Given the complexity of the SMC handshake right now, performance in
    short-lived link scenarios is still quite poor, although this may not
    be the main scenario for SMC. This patch tries to constrain the SMC
    handshake when the handshake workqueue is congested, which in our
    opinion is a sign that SMC handshakes are piling up.
    
    Patch 4/5 (Dynamic control handshake limitation by socket options)
    
    This patch allows applications to dynamically control SMC handshake
    limitation. Since SMC did not support setting SMC socket options
    before, this patch also has to add support for SMC's own socket
    options.
    
    Patch 5/5 (Add global configure for handshake limitation by netlink)
    
    This patch provides a way to benefit from handshake limitation
    without modifying any application code, which is quite useful for
    most existing applications.
    
    After this patch set, performance figures look like this:
    
    Running 20s test @ http://11.213.45.6
      4 threads and 10000 connections
      693253 requests in 20.10s, 452.88MB read
    Requests/sec:  34488.13
    Transfer/sec:     22.53MB
    
    That is quite a good performance improvement, about 6 to 7 times in my
    environment.
    ---
    changelog:
    v1 -> v2:
    - fix compile warning
    - fix invalid dependencies in kconfig
    v2 -> v3:
    - correct spelling mistakes
    - fix useless variable declare
    v3 -> v4
    - make smc_tcp_ls_wq be static
    v4 -> v5
    - add dynamic control for SMC auto fallback by socket options
    - add global configure for SMC auto fallback through netlink
    v5 -> v6
    - move auto fallback to net namespace scope
    - remove auto fallback attribute in SMC_GEN_SYS_INFO
    - add independent attributes for auto fallback
    v6 -> v7
    - fix wording and the naming issues, rename 'auto fallback' to handshake
      limitation.
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Feb 11, 2022
  15. net/smc: Add global configure for handshake limitation by netlink

    We can control SMC handshake limitation through socket options, but
    that means applications that need it must modify their code, which is
    quite troublesome for many existing applications. This patch allows
    the global default value of SMC handshake limitation to be modified
    through netlink, providing a way to constrain the handshake without
    modifying any application code.
    
    Suggested-by: Tony Lu <tonylu@linux.alibaba.com>
    Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
    Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    D. Wythe authored and davem330 committed Feb 11, 2022
  16. net/smc: Dynamic control handshake limitation by socket options

    This patch adds dynamic control of SMC handshake limitation for
    each SMC socket. In a production environment, the same application
    may handle different service types, and these may have different
    requirements for SMC handshake limitation.
    
    This patch uses socket options to achieve this. Since there is no
    socket option level for SMC yet, this patch has to implement one at
    the same time.
    
    This patch does the following:
    
    - add new socket option level: SOL_SMC.
    - add new SMC socket option: SMC_LIMIT_HS.
    - provide getter/setter for SMC socket options.
    
    Link: https://lore.kernel.org/all/20f504f961e1a803f85d64229ad84260434203bd.1644323503.git.alibuda@linux.alibaba.com/
    Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    D. Wythe authored and davem330 committed Feb 11, 2022
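
From user space, the new option would be toggled with an ordinary `setsockopt()` call. The sketch below is hedged: the numeric values of `SOL_SMC` and `SMC_LIMIT_HS` are assumptions that should be checked against the kernel headers in use, the helper names are hypothetical, and reading the option back as a native `int` is also an assumption. The `sock` argument is expected to be an already created SMC socket.

```python
import struct

# Assumed values; verify against your kernel headers before relying on them.
SOL_SMC = 286
SMC_LIMIT_HS = 1


def set_handshake_limit(sock, enabled):
    """Enable or disable SMC handshake limitation on one socket."""
    sock.setsockopt(SOL_SMC, SMC_LIMIT_HS, 1 if enabled else 0)


def get_handshake_limit(sock):
    """Read the current setting back, assuming it is returned as an int."""
    raw = sock.getsockopt(SOL_SMC, SMC_LIMIT_HS, struct.calcsize("i"))
    return struct.unpack("i", raw)[0] != 0
```

Because the setting is per socket, one process can enable the limitation on a listener that serves short-lived connections and leave it disabled on another listener.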
  17. net/smc: Limit SMC visits when handshake workqueue congested

    This patch provides a mechanism to constrain incoming SMC
    connections according to the pressure on the SMC handshake process.
    At present, frequent connection attempts cause incoming connections
    to back up in the SMC handshake queue, raising the connection
    establishment time. This is unacceptable for applications that rely
    on short-lived connections.
    
    There are two ways to implement this mechanism:
    
    1. Put limitation after TCP established.
    2. Put limitation before TCP established.
    
    With the first approach, we need to wait for and receive the CLC
    messages that the client may send, and then actively reply with a
    decline message. In a sense this is also a kind of SMC handshake, and
    it still affects the connection establishment time.
    
    With the second approach, the only problem is that we need to inject
    SMC logic into TCP when it is about to reply to the incoming SYN.
    Since we already do that, it is no longer a problem. The advantage is
    obvious: few additional steps are required to apply the constraint.
    
    This patch uses the second approach. After this patch, connections
    beyond the constraint will not be given any SMC indication, and SMC
    will not be involved in any of their subsequent processing.
    
    Link: https://lore.kernel.org/all/1641301961-59331-1-git-send-email-alibuda@linux.alibaba.com/
    Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    D. Wythe authored and davem330 committed Feb 11, 2022
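
The second approach, deciding at SYN time, can be sketched as a toy decision function. This is a simplified model and not the kernel logic: the class, the threshold, and the returned strings are all hypothetical.

```python
class HandshakeQueue:
    """Toy handshake work queue with a congestion threshold (illustrative)."""

    def __init__(self, congestion_threshold):
        self.congestion_threshold = congestion_threshold
        self.pending = []

    def congested(self):
        return len(self.pending) >= self.congestion_threshold


def answer_syn(queue, conn_id, limit_enabled=True):
    """Decide at SYN time whether to advertise SMC for this connection."""
    if limit_enabled and queue.congested():
        # No SMC indication is given: the connection proceeds as plain TCP
        # and never enters the SMC handshake queue.
        return "tcp-only"
    queue.pending.append(conn_id)
    return "smc"
```

Connections that arrive while the queue is congested add no further handshake work, which is why this approach needs so little extra processing compared with declining after the TCP connection is established.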
  18. net/smc: Limit backlog connections

    The current implementation does not handle backlog semantics. One
    potential risk is that the server will be flooded by an unlimited
    number of connections, even if the client is SMC-incapable.
    
    This patch puts a limit on backlog connections. Following the
    TCP implementation, we divide SMC connections into two categories:
    
    1. Half SMC connections, which include all connections where TCP is
    established but SMC is not.
    
    2. Full SMC connections, which include all SMC established connections.
    
    For half SMC connections, since every half SMC connection starts with
    TCP being established, we can achieve our goal by putting a limit in
    place before TCP is established. As in the TCP implementation, this
    limit is based not only on the half SMC connections but also on the
    full connections, so it also constrains full SMC connections.
    
    For full SMC connections, although we know exactly where they start,
    it is quite hard to put a limit before that point. The easiest way is
    to block and wait before receiving the SMC confirm CLC message, but
    that happens under the protection of smc_server_lgr_pending, a global
    lock, which would apply this limit to the entire host instead of a
    single listen socket. Another way is to drop the full connections, but
    considering the cost of SMC connections, we prefer to keep them.
    
    Even so, limits on full SMC connections still exist, see commits
    about half SMC connection below.
    
    After this patch, the limits on backend connections are as follows:
    
    For SMC:
    
    1. A client with SMC capability can make at most 2 * backlog full SMC
       connections, or 1 * backlog half SMC connections and 1 * backlog
       full SMC connections.
    
    2. A client without SMC capability can only make 1 * backlog half TCP
       connections and 1 * backlog full TCP connections.
    
    Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    D. Wythe authored and davem330 committed Feb 11, 2022
  19. net/smc: Make smc_tcp_listen_work() independent

    In a multithreaded benchmark with 10K connections, the backend TCP
    connections are established very slowly, and lots of TCP connections
    stay in the SYN_SENT state.
    
    Client: smc_run wrk -c 10000 -t 4 http://server
    
    The netstat output on the server host shows:
        145042 times the listen queue of a socket overflowed
        145042 SYNs to LISTEN sockets dropped
    
    One reason for this issue is that smc_tcp_listen_work() shares the
    same workqueue (smc_hs_wq) with smc_listen_work(), while
    smc_listen_work() blocks waiting for the SMC connection to be
    established. Once the workqueue becomes congested, it blocks the
    accept() from the TCP listen socket.
    
    This patch creates an independent workqueue (smc_tcp_ls_wq) for
    smc_tcp_listen_work(), separating it from smc_listen_work(), which is
    quite acceptable considering that smc_tcp_listen_work() runs very fast.
    
    Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    D. Wythe authored and davem330 committed Feb 11, 2022
  20. dt-bindings: net: dsa: realtek: convert to YAML schema, add MDIO

    Schema changes:
    
    - support for mdio-connected switches (mdio driver), recognized by
      checking the presence of property "reg"
    - new compatible strings for rtl8367s and rtl8367rb
    - "interrupt-controller" was not added as a required property. It might
      still work polling the ports when missing.
    
    Examples changes:
    
    - renamed "switch_intc" to make it unique between examples
    - removed "dsa-mdio" from mdio compatible property
    - renamed phy@0 to ethernet-phy@0 (not tested with real HW)
      phy@ requires #phy-cells
    
    Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
    Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    luizluca authored and davem330 committed Feb 11, 2022
  21. Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

    No conflicts.
    
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Feb 11, 2022
  22. Merge tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel…

    …/git/netdev/net
    
    Pull networking fixes from Jakub Kicinski:
     "Including fixes from netfilter and can.
    
    Current release - new code bugs:
    
       - sparx5: fix get_stat64 out-of-bound access and crash
    
       - smc: fix netdev ref tracker misuse
    
      Previous releases - regressions:
    
       - eth: ixgbevf: require large buffers for build_skb on 82599VF, avoid
         overflows
    
       - eth: ocelot: fix all IP traffic getting trapped to CPU with PTP
         over IP
    
       - bonding: fix rare link activation misses in 802.3ad mode
    
      Previous releases - always broken:
    
       - tcp: fix tcp sock mem accounting in zero-copy corner cases
    
       - remove the cached dst when uncloning an skb dst and its metadata,
         since we only have one ref it'd lead to an UaF
    
       - netfilter:
          - conntrack: don't refresh sctp entries in closed state
          - conntrack: re-init state for retransmitted syn-ack, avoid
            connection establishment getting stuck with strange stacks
          - ctnetlink: disable helper autoassign, avoid it getting lost
          - nft_payload: don't allow transport header access for fragments
    
       - dsa: fix use of devres for mdio throughout drivers
    
       - eth: amd-xgbe: disable interrupts during pci removal
    
       - eth: dpaa2-eth: unregister netdev before disconnecting the PHY
    
       - eth: ice: fix IPIP and SIT TSO offload"
    
    * tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
      net: dsa: mv88e6xxx: fix use-after-free in mv88e6xxx_mdios_unregister
      net: mscc: ocelot: fix mutex lock error during ethtool stats read
      ice: Avoid RTNL lock when re-creating auxiliary device
      ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler
      ice: fix IPIP and SIT TSO offload
      ice: fix an error code in ice_cfg_phy_fec()
      net: mpls: Fix GCC 12 warning
      dpaa2-eth: unregister the netdev before disconnecting from the PHY
      skbuff: cleanup double word in comment
      net: macb: Align the dma and coherent dma masks
      mptcp: netlink: process IPv6 addrs in creating listening sockets
      selftests: mptcp: add missing join check
      net: usb: qmi_wwan: Add support for Dell DW5829e
      vlan: move dev_put into vlan_dev_uninit
      vlan: introduce vlan_dev_free_egress_priority
      ax25: fix UAF bugs of net_device caused by rebinding operation
      net: dsa: fix panic when DSA master device unbinds on shutdown
      net: amd-xgbe: disable interrupts during pci removal
      tipc: rate limit warning for received illegal binding update
      net: mdio: aspeed: Add missing MODULE_DEVICE_TABLE
      ...
    torvalds committed Feb 11, 2022

Commits on Feb 10, 2022

  1. Merge tag 'linux-kselftest-fixes-5.17-rc4' of git://git.kernel.org/pu…

    …b/scm/linux/kernel/git/shuah/linux-kselftest
    
    Pull Kselftest fixes from Shuah Khan:
     "Build and run-time fixes to pidfd, clone3, and ir tests"
    
    * tag 'linux-kselftest-fixes-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
      selftests/ir: fix build with ancient kernel headers
      selftests: fixup build warnings in pidfd / clone3 tests
      pidfd: fix test failure due to stack overflow on some arches
    torvalds committed Feb 10, 2022
  2. Merge tag 'linux-kselftest-kunit-fixes-5.17-rc4' of git://git.kernel.…

    …org/pub/scm/linux/kernel/git/shuah/linux-kselftest
    
    Pull KUnit fixes from Shuah Khan:
     "Fixes to the test and usage documentation"
    
    * tag 'linux-kselftest-kunit-fixes-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
      Documentation: KUnit: Fix usage bug
      kunit: fix missing f in f-string in run_checks.py
    torvalds committed Feb 10, 2022
  3. net: dsa: mv88e6xxx: fix use-after-free in mv88e6xxx_mdios_unregister

    Since struct mv88e6xxx_mdio_bus *mdio_bus is the bus->priv of something
    allocated with mdiobus_alloc_size(), this means that mdiobus_free(bus)
    will free the memory backing the mdio_bus as well. Therefore, the
    mdio_bus->list element is freed memory, but we continue to iterate
    through the list of MDIO buses using that list element.
    
    To fix this, use the proper list iterator that handles element deletion
    by keeping a copy of the list element next pointer.
    
    Fixes: f53a2ce ("net: dsa: mv88e6xxx: don't use devres for mdiobus")
    Reported-by: Rafael Richter <rafael.richter@gin.de>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://lore.kernel.org/r/20220210174017.3271099-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Feb 10, 2022
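
The bug pattern and its fix can be modelled with a toy singly linked list. This is an illustration in Python rather than the driver code: `Node`, `free`, `make_list`, and both walk functions are hypothetical, and "freeing" is simulated by clearing the node's link.

```python
class Node:
    """Toy list element; `freed` marks memory that must not be touched."""

    def __init__(self, name):
        self.name = name
        self.next = None
        self.freed = False


def make_list(names):
    """Build a singly linked list and return its head."""
    head = None
    for name in reversed(names):
        node = Node(name)
        node.next = head
        head = node
    return head


def free(node):
    node.freed = True
    node.next = None  # simulate the link being lost once the memory is gone


def unregister_all_unsafe(head):
    """Buggy walk: reads node.next after the node has been freed."""
    visited, node = [], head
    while node is not None:
        visited.append(node.name)
        free(node)
        node = node.next  # corresponds to the use-after-free
    return visited


def unregister_all_safe(head):
    """Fixed walk: save the next pointer before freeing the current node."""
    visited, node = [], head
    while node is not None:
        nxt = node.next
        visited.append(node.name)
        free(node)
        node = nxt
    return visited
```

In this model the unsafe walk merely stops early, because the link was cleared. In C the same access reads freed memory, so the outcome is undefined. Saving the next pointer first is the idea behind the deletion-safe list iterators the commit refers to.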
  4. Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/tnguy/net-queue
    
    Tony Nguyen says:
    
    ====================
    Intel Wired LAN Driver Updates 2022-02-10
    
    Dan Carpenter propagates an error in FEC configuration.
    
    Jesse fixes TSO offloads of IPIP and SIT frames.
    
    Dave adds a dedicated LAG unregister function to resolve a KASAN error
    and moves auxiliary device re-creation after LAG removal to the service
    task to avoid issues with RTNL lock.
    
    * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
      ice: Avoid RTNL lock when re-creating auxiliary device
      ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler
      ice: fix IPIP and SIT TSO offload
      ice: fix an error code in ice_cfg_phy_fec()
    ====================
    
    Link: https://lore.kernel.org/r/20220210170515.2609656-1-anthony.l.nguyen@intel.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Feb 10, 2022
  5. net: mscc: ocelot: fix mutex lock error during ethtool stats read

    An ongoing workqueue populates the stats buffer. At the same time, a user
    might query the statistics. While writing to the buffer is mutex-locked,
    reading from the buffer wasn't. This could lead to buggy reads by ethtool.
    
    This patch fixes the former blamed commit, but the bug was introduced in
    the latter.
    
    Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
    Fixes: 1e1caa9 ("ocelot: Clean up stats update deferred work")
    Fixes: a556c76 ("net: mscc: Add initial Ocelot switch support")
    Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://lore.kernel.org/all/20220210150451.416845-2-colin.foster@in-advantage.com/
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    colin-foster-in-advantage authored and Jakub Kicinski committed Feb 10, 2022
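
The reader/writer symmetry the fix restores can be sketched with a plain mutex. This is a toy model with hypothetical names, not the driver's data structures: one periodic updater and any number of on-demand readers share a buffer, and both sides take the same lock.

```python
import threading


class PortStats:
    """Toy stats buffer shared by a periodic updater and on-demand readers."""

    def __init__(self, num_counters):
        self._lock = threading.Lock()
        self._counters = [0] * num_counters

    def update(self, deltas):
        # Writer side: this path was already protected before the fix.
        with self._lock:
            for i, delta in enumerate(deltas):
                self._counters[i] += delta

    def read(self):
        # Reader side: taking the same lock is what the fix adds.
        with self._lock:
            return list(self._counters)
```

Returning a copy made under the lock means a reader never observes a buffer that is only partly updated, and the caller can keep using the snapshot after the lock is released.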