Xin-Long/sctp-…

Commits on Jun 21, 2021

  1. sctp: process sctp over udp icmp err on sctp side

    Previously, sctp over udp was using udp tunnel's icmp err process, which
    only does sk lookup on sctp side. However for sctp's icmp error process,
    there are more things to do, like syncing assoc pmtu/retransmit packets
    for toobig type err, and starting proto_unreach_timer for unreach type
    err etc.
    
    Now that PLPMTUD is being added, toobig type err also has to be
    processed on sctp side. This patch is to process icmp err on sctp side
    by parsing the type/code/info in .encap_err_lookup and calling sctp's
    icmp processing functions. Note as the 'redirect' err process needs to
    know the outer ip(v6) header, we have to leave it to udp(v6)_err to
    handle it.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  2. sctp: extract sctp_v4_err_handle function from sctp_v4_err

    This patch is to extract sctp_v4_err_handle() from sctp_v4_err() to
    only handle the icmp err after the sock lookup, and it also makes
    the code clearer.
    
    sctp_v4_err_handle() will be used in sctp over udp's err handling
    in the following patch.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  3. sctp: extract sctp_v6_err_handle function from sctp_v6_err

    This patch is to extract sctp_v6_err_handle() from sctp_v6_err() to
    only handle the icmp err after the sock lookup, and it also makes
    the code clearer.
    
    sctp_v6_err_handle() will be used in sctp over udp's err handling
    in the following patch.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  4. sctp: remove the unnecessary hold for idev in sctp_v6_err

    Same as in tcp_v6_err() and __udp6_lib_err(), there's no need to
    hold idev in sctp_v6_err(), so just call __in6_dev_get() instead.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  5. sctp: enable PLPMTUD when the transport is ready

    sctp_transport_pl_reset() is called whenever any of these 3 members in
    transport is changed:
    
      - probe_interval
      - param_flags & SPP_PMTUD_ENABLE
      - state == ACTIVE
    
    If all are true, start the PLPMTUD when it's not yet started. If any of
    these is false, stop the PLPMTUD when it's already running.
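
The start/stop rule above can be sketched as a small userspace model. This is only an illustration: `struct transport_model`, `pl_should_run()` and this `pl_reset()` are hypothetical stand-ins, not the kernel's structures or functions, and "starting" PLPMTUD is assumed here to mean entering a BASE state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; this is not the kernel's struct sctp_transport. */
enum pl_state { PL_DISABLED, PL_BASE };

struct transport_model {
	unsigned int probe_interval; /* 0 means probing is switched off */
	bool pmtud_enabled;          /* stands in for param_flags & SPP_PMTUD_ENABLE */
	bool active;                 /* stands in for state == ACTIVE */
	enum pl_state pl_state;
};

/* All three conditions from the commit message must hold. */
static bool pl_should_run(const struct transport_model *t)
{
	return t->probe_interval != 0 && t->pmtud_enabled && t->active;
}

/* Start probing when every condition holds and it is not yet running;
 * stop it when any condition fails. */
static void pl_reset(struct transport_model *t)
{
	assert(t);
	if (pl_should_run(t)) {
		if (t->pl_state == PL_DISABLED)
			t->pl_state = PL_BASE; /* assumption: starting means entering BASE */
	} else {
		t->pl_state = PL_DISABLED;
	}
}
```

Calling `pl_reset()` after changing any one of the three members keeps the model's state consistent with the rule.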
    
    sctp_transport_pl_update() is called when the transport dst has changed.
    It will restart the PLPMTUD probe. Again, the pathmtu won't change but
    use the dst's mtu until the Search phase is done.
    
    Note that after using PLPMTUD, the pathmtu is only initialized with the
    dst mtu when the transport dst changes. At other times it is updated by
    pl.pmtu. So sctp_transport_pmtu_check() will be called only when PLPMTUD
    is disabled in sctp_packet_config().
    
    After this patch, the PLPMTUD feature from RFC8899 will be activated
    and can be used by users.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  6. sctp: do state transition when receiving an icmp TOOBIG packet

    PLPMTUD will short-circuit the old process for icmp TOOBIG packets.
    This part is described in rfc8899#section-4.6.2 (PL_PTB_SIZE =
    PTB_SIZE - other_headers_len). Note that from rfc8899#section-5.2
    State Machine, each case below is for some specific states only:
    
      a) PL_PTB_SIZE < MIN_PLPMTU || PL_PTB_SIZE >= PROBED_SIZE,
         discard it, for any state
    
      b) MIN_PLPMTU < PL_PTB_SIZE < BASE_PLPMTU,
         Base -> Error, for Base state
    
      c) BASE_PLPMTU <= PL_PTB_SIZE < PLPMTU,
         Search -> Base or Complete -> Base, for Search and Complete states.
    
      d) PLPMTU < PL_PTB_SIZE < PROBED_SIZE,
         set pl.probe_size to PL_PTB_SIZE then verify it, for Search state.
    
    The most important one is case d), which helps find the optimal value
    quickly during searching. For example, when pathmtu = 1392 for SCTP over
    IPv4, the search will be (20 is iphdr_len):
    
      1. probe with 1200 - 20
      2. probe with 1232 - 20
      3. probe with 1264 - 20
      ...
      7. probe with 1388 - 20
      8. probe with 1420 - 20
    
    When sending the probe with 1420 - 20, TOOBIG may come with PL_PTB_SIZE =
    1392 - 20. Then it matches case d), and saves some rounds to try with the
    1392 - 20 probe. But of course, PLPMTUD doesn't trust TOOBIG packets, and
    it will go back to the common searching once the probe with the new size
    can't be verified.
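
Cases a) to d) above can be read as a single classification function. The sketch below is a userspace model under stated assumptions: the names (`pl_model`, `pl_toobig`, `ptb_action`) are hypothetical, the value 512 for MIN_PLPMTU is assumed, and 1200 for BASE_PLPMTU is taken from the first probe in the example. Boundary values the commit message does not cover (for instance PL_PTB_SIZE equal to PLPMTU) are simply ignored.

```c
#include <assert.h>
#include <stdint.h>

#define MIN_PLPMTU  512u  /* assumed value */
#define BASE_PLPMTU 1200u /* matches the first probe size in the example */

enum pl_state { PL_DISABLED, PL_BASE, PL_SEARCH, PL_COMPLETE, PL_ERROR };
enum ptb_action { PTB_DISCARD, PTB_TO_ERROR, PTB_TO_BASE, PTB_VERIFY_SIZE };

struct pl_model {
	enum pl_state state;
	uint32_t pmtu;       /* PLPMTU: largest size confirmed so far */
	uint32_t probe_size; /* PROBED_SIZE: size of the probe in flight */
};

static enum ptb_action pl_toobig(struct pl_model *pl, uint32_t ptb_size,
				 uint32_t other_headers_len)
{
	uint32_t pl_ptb_size;

	assert(pl && ptb_size >= other_headers_len);
	pl_ptb_size = ptb_size - other_headers_len;

	/* a) implausible report: discard it in any state */
	if (pl_ptb_size < MIN_PLPMTU || pl_ptb_size >= pl->probe_size)
		return PTB_DISCARD;

	if (pl->state == PL_BASE) {
		/* b) Base -> Error */
		if (pl_ptb_size > MIN_PLPMTU && pl_ptb_size < BASE_PLPMTU) {
			pl->state = PL_ERROR;
			return PTB_TO_ERROR;
		}
		return PTB_DISCARD;
	}

	if (pl->state == PL_SEARCH || pl->state == PL_COMPLETE) {
		/* c) Search -> Base or Complete -> Base; only the state is
		 * changed here, the sizes are left to the caller */
		if (pl_ptb_size >= BASE_PLPMTU && pl_ptb_size < pl->pmtu) {
			pl->state = PL_BASE;
			return PTB_TO_BASE;
		}
		/* d) Search only: try the reported size next */
		if (pl->state == PL_SEARCH && pl_ptb_size > pl->pmtu) {
			pl->probe_size = pl_ptb_size;
			return PTB_VERIFY_SIZE;
		}
	}
	return PTB_DISCARD; /* not applicable in this state */
}
```

With the numbers from the example (confirmed size 1388 - 20, probe in flight 1420 - 20, TOOBIG reporting 1392), the function selects case d) and sets the next probe size to 1392 - 20.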
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  7. sctp: do state transition when a probe succeeds on HB ACK recv path

    As described in rfc8899#section-5.2, when a probe succeeds, there might
    be the following state transitions:
    
      - Base -> Search, occurs when probe succeeds with BASE_PLPMTU,
        pl.pmtu is not changing,
        pl.probe_size increases by SCTP_PL_BIG_STEP,
    
      - Error -> Search, occurs when probe succeeds with BASE_PLPMTU,
        pl.pmtu is changed from SCTP_MIN_PLPMTU to SCTP_BASE_PLPMTU,
        pl.probe_size increases by SCTP_PL_BIG_STEP.
    
      - Search -> Search Complete, occurs when probe succeeds with the probe
        size SCTP_MAX_PLPMTU less than pl.probe_high,
        pl.pmtu is not changing, but update *pathmtu* with it,
        pl.probe_size is set back to pl.pmtu to double check it.
    
      - Search Complete -> Search, occurs when probe succeeds with the probe
        size equal to pl.pmtu,
        pl.pmtu is not changing,
        pl.probe_size increases by SCTP_PL_MIN_STEP.
    
    So search process can be described as:
    
     1. When it just enters 'Search' state, *pathmtu* is not updated with
        pl.pmtu, and probe_size increases by a big step (SCTP_PL_BIG_STEP)
        each round.
    
     2. Until pl.probe_high is set when a probe fails, and probe_size
        decreases back to pl.pmtu, as described in the last patch.
    
     3. When the probe with the new size succeeds, probe_size changes to
        increase by a small step (SCTP_PL_MIN_STEP) because pl.probe_high
        is set.
    
     4. Until probe_size is next to pl.probe_high, the searching finishes and
        it goes to 'Complete' state and updates *pathmtu* with pl.pmtu, and
        then probe_size is set to pl.pmtu to confirm it with one more probe.
    
     5. This probe occurs after "30 * probe_interval", a much longer time than
        that in Search state. Once it is done it goes to 'Search' state again
        with probe_size increased by SCTP_PL_MIN_STEP.
    
    As we can see above, during the searching, pl.pmtu changes while *pathmtu*
    doesn't. *pathmtu* is only updated when the search finishes by which it
    gets an optimal value for it. A big step is used at the beginning until
    it gets close to the optimal value, then it changes to a small step until
    it has this optimal value.
    
    The small step is also used in 'Complete' until it goes to 'Search' state
    again and the probe with 'pmtu + the small step' succeeds, which means a
    higher size could be used. Then probe_size changes to increase by a big
    step again until it gets close to the next optimal value.
    
    Note that any time a black hole is detected, it goes directly to 'Base'
    state with pl.pmtu set to SCTP_BASE_PLPMTU, as described in the last patch.
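
One way to read the transitions on a successful probe is the following userspace sketch. It is a model, not the patch's code: `pl_model` and `pl_probe_ok()` are hypothetical names, the step values 32 and 4 are assumptions, header overhead is ignored when updating `pathmtu`, and the model records every confirmed probe size in `pmtu`.

```c
#include <assert.h>
#include <stdint.h>

#define PL_BIG_STEP 32u /* assumed value */
#define PL_MIN_STEP 4u  /* assumed value */

enum pl_state { PL_BASE, PL_SEARCH, PL_COMPLETE, PL_ERROR };

struct pl_model {
	enum pl_state state;
	uint32_t pmtu;       /* pl.pmtu */
	uint32_t probe_size; /* pl.probe_size */
	uint32_t probe_high; /* pl.probe_high, 0 while unset */
	uint32_t pathmtu;    /* *pathmtu*, header overhead ignored here */
};

/* Model of what happens when the HB ACK for the current probe arrives. */
static void pl_probe_ok(struct pl_model *pl)
{
	assert(pl);
	pl->pmtu = pl->probe_size; /* the probed size is now confirmed */

	switch (pl->state) {
	case PL_BASE:  /* Base -> Search */
	case PL_ERROR: /* Error -> Search */
		pl->state = PL_SEARCH;
		pl->probe_size += PL_BIG_STEP;
		break;
	case PL_SEARCH:
		if (!pl->probe_high) { /* no failure seen yet: big steps */
			pl->probe_size += PL_BIG_STEP;
			break;
		}
		pl->probe_size += PL_MIN_STEP;
		if (pl->probe_size >= pl->probe_high) { /* Search -> Complete */
			pl->probe_high = 0;
			pl->state = PL_COMPLETE;
			pl->pathmtu = pl->pmtu;
			pl->probe_size = pl->pmtu; /* confirm once more */
		}
		break;
	case PL_COMPLETE: /* Complete -> Search */
		pl->state = PL_SEARCH;
		pl->probe_size += PL_MIN_STEP;
		break;
	}
}
```

Note how `pathmtu` is only written on the Search to Complete transition, which mirrors the point that pl.pmtu moves during the search while *pathmtu* does not.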
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  8. sctp: do state transition when PROBE_COUNT == MAX_PROBES on HB send path

    The state transition is described in rfc8899#section-5.2,
    PROBE_COUNT == MAX_PROBES means the probe fails for MAX times, and the
    state transition includes:
    
      - Base -> Error, occurs when BASE_PLPMTU Confirmation Fails,
        pl.pmtu is set to SCTP_MIN_PLPMTU,
        probe_size is still SCTP_BASE_PLPMTU;
    
      - Search -> Base, occurs when Black Hole Detected,
        pl.pmtu is set to SCTP_BASE_PLPMTU,
        probe_size is set back to SCTP_BASE_PLPMTU;
    
      - Search Complete -> Base, occurs when Black Hole Detected
        pl.pmtu is set to SCTP_BASE_PLPMTU,
        probe_size is set back to SCTP_BASE_PLPMTU;
    
    Note a black hole is encountered when a sender is unaware that packets
    are not being delivered to the destination endpoint. So it includes the
    probe failures with probe_size equal to pl.pmtu, and definitely does not
    include those with probe_size greater than pl.pmtu. The latter one is the
    normal probe failure where probe_size should decrease back to pl.pmtu
    and pl.probe_high is set.  pl.probe_high would be used on HB ACK recv
    path in the next patch.
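
The failure handling described above can be sketched as a userspace model of the send path. The names (`pl_model`, `pl_probe_send`) are hypothetical, the values 3, 512 and 1200 are assumptions, and clearing `probe_high` on a black hole is also an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PROBES  3u    /* assumed value */
#define MIN_PLPMTU  512u  /* assumed value */
#define BASE_PLPMTU 1200u /* assumed value */

enum pl_state { PL_BASE, PL_SEARCH, PL_COMPLETE, PL_ERROR };

struct pl_model {
	enum pl_state state;
	uint32_t pmtu;
	uint32_t probe_size;
	uint32_t probe_high; /* 0 while unset */
	uint32_t probe_count;
};

static void pl_black_hole(struct pl_model *pl)
{
	pl->state = PL_BASE;
	pl->pmtu = BASE_PLPMTU;
	pl->probe_size = BASE_PLPMTU;
	pl->probe_high = 0; /* assumption: forget the old upper bound */
}

/* Model of the bookkeeping done each time a probe is about to be sent. */
static void pl_probe_send(struct pl_model *pl)
{
	assert(pl);
	if (pl->probe_count == MAX_PROBES) {
		pl->probe_count = 0;
		if (pl->state == PL_BASE) {
			/* BASE_PLPMTU confirmation failed: Base -> Error */
			if (pl->probe_size == BASE_PLPMTU) {
				pl->state = PL_ERROR;
				pl->pmtu = MIN_PLPMTU;
			}
		} else if (pl->state == PL_SEARCH) {
			if (pl->pmtu == pl->probe_size) {
				pl_black_hole(pl); /* Search -> Base */
			} else { /* normal probe failure */
				pl->probe_high = pl->probe_size;
				pl->probe_size = pl->pmtu;
			}
		} else if (pl->state == PL_COMPLETE) {
			if (pl->pmtu == pl->probe_size)
				pl_black_hole(pl); /* Complete -> Base */
		}
	}
	pl->probe_count++;
}
```

The two Search branches show the distinction drawn above: a failed probe at the already confirmed size is a black hole, while a failed probe at a larger size only records an upper bound in `probe_high`.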
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  9. sctp: do the basic send and recv for PLPMTUD probe

    This patch does exactly what rfc8899#section-6.2.1.2 says:
    
       The SCTP sender needs to be able to determine the total size of a
       probe packet.  The HEARTBEAT chunk could carry a Heartbeat
       Information parameter that includes, besides the information
       suggested in [RFC4960], the probe size to help an implementation
       associate a HEARTBEAT ACK with the size of probe that was sent.  The
       sender could also use other methods, such as sending a nonce and
       verifying the information returned also contains the corresponding
       nonce.  The length of the PAD chunk is computed by reducing the
       probing size by the size of the SCTP common header and the HEARTBEAT
       chunk.
    
    Note that HB ACK chunk will carry back whatever HB chunk carried, including
    the probe_size we put in it. We also check hbinfo->probe_size in the HB ACK
    against link->pl.probe_size to validate this HB ACK chunk.
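
The PAD length rule quoted from the RFC and the probe_size echo check can be illustrated with a short sketch. `hbinfo_model`, `pad_chunk_len()` and `hb_ack_matches_probe()` are hypothetical names; the 12-byte SCTP common header size is from RFC 4960, and the HEARTBEAT chunk length is left as a parameter because it depends on the heartbeat info actually carried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SCTP_COMMON_HDR_LEN 12u /* SCTP common header size (RFC 4960) */

/* Hypothetical heartbeat info; only probe_size is named in the commit message. */
struct hbinfo_model {
	uint32_t probe_size;
};

/* PAD chunk length = probe size - common header - HEARTBEAT chunk. */
static uint32_t pad_chunk_len(uint32_t probe_size, uint32_t hb_chunk_len)
{
	assert(probe_size >= SCTP_COMMON_HDR_LEN + hb_chunk_len);
	return probe_size - SCTP_COMMON_HDR_LEN - hb_chunk_len;
}

/* An HB ACK only confirms a probe if it echoes the size that is pending. */
static bool hb_ack_matches_probe(const struct hbinfo_model *echoed,
				 uint32_t pending_probe_size)
{
	return echoed->probe_size == pending_probe_size;
}
```

For example, a 1200-byte probe carrying a 60-byte HEARTBEAT chunk (an arbitrary length chosen for illustration) would need a 1128-byte PAD chunk.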
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  10. sctp: add the probe timer in transport for PLPMTUD

    There are 3 timers described in rfc8899#section-5.1.1:
    
      PROBE_TIMER, PMTU_RAISE_TIMER, CONFIRMATION_TIMER
    
    This patch adds a 'probe_timer' in transport, and it works as either
    PROBE_TIMER or PMTU_RAISE_TIMER. Most of the time, it works as PROBE_TIMER
    and expires every 'probe_interval' to send the HB probe packet.
    When transport pl enters COMPLETE state, it works as PMTU_RAISE_TIMER
    and expires in 'probe_interval * 30' time to go back to SEARCH state
    and do searching again.
    
    As SCTP HB is an acknowledged packet, CONFIRMATION_TIMER is not needed.
    
    The timer will start when transport pl enters BASE state and stop
    when it enters DISABLED state.
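
The dual role of the single timer can be sketched as a timeout selection function. The function and enum names are hypothetical; the factor 30 and the use of milliseconds follow the commit messages in this series.

```c
#include <assert.h>
#include <stdint.h>

enum pl_state { PL_DISABLED, PL_BASE, PL_SEARCH, PL_COMPLETE, PL_ERROR };

#define PMTU_RAISE_FACTOR 30u /* 'probe_interval * 30' in COMPLETE state */

/* Next expiry of the one probe timer, in milliseconds. */
static uint64_t probe_timer_timeout_ms(enum pl_state state,
				       uint32_t probe_interval_ms)
{
	assert(state != PL_DISABLED); /* the timer is stopped when disabled */

	if (state == PL_COMPLETE) /* acting as PMTU_RAISE_TIMER */
		return (uint64_t)probe_interval_ms * PMTU_RAISE_FACTOR;
	return probe_interval_ms; /* acting as PROBE_TIMER */
}
```

With a 5000 ms interval this gives 5 seconds between probes while searching and 150 seconds before a raise attempt once the search is complete.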
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  11. sctp: add the constants/variables and states and some APIs for transport

    These are 4 constants described in rfc8899#section-5.1.2:
    
      MAX_PROBES, MIN_PLPMTU, MAX_PLPMTU, BASE_PLPMTU;
    
    And 2 variables described in rfc8899#section-5.1.3:
    
      PROBED_SIZE, PROBE_COUNT;
    
    And 5 states described in rfc8899#section-5.2:
    
      DISABLED, BASE, SEARCH, SEARCH_COMPLETE, ERROR;
    
    And these 4 APIs are used to reset/update PLPMTUD, check if PLPMTUD is
    enabled, and calculate the additional headers length for a transport.
    
    Note the member 'probe_high' in transport will be set to the probe
    size when a probe fails with this probe size in the next patches.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  12. sctp: add SCTP_PLPMTUD_PROBE_INTERVAL sockopt for sock/asoc/transport

    With this socket option, users can change probe_interval for
    a transport, asoc or sock after it's created.
    
    Note that if the change is for an asoc, also apply the change
    to each transport in this asoc.
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  13. sctp: add probe_interval in sysctl and sock/asoc/transport

    PLPMTUD can be enabled by doing 'sysctl -w net.sctp.probe_interval=n'.
    'n' is the interval for PLPMTUD probe timer in milliseconds, and it
    can't be less than 5000 if it's not 0.
    
    All asoc/transport's PLPMTUD in a new socket will be enabled by default.
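
The accepted range for the sysctl value can be written down as a small check. This is a sketch with hypothetical function names; it assumes that 0 means PLPMTUD is off, which the commit message implies but does not state outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PROBE_INTERVAL_MIN_MS 5000u /* lower bound from the commit message */

/* 0 is accepted; any other value must be at least 5000 ms. */
static bool probe_interval_valid(uint32_t ms)
{
	return ms == 0 || ms >= PROBE_INTERVAL_MIN_MS;
}

/* Assumption: a zero interval means PLPMTUD stays disabled. */
static bool plpmtud_enabled(uint32_t ms)
{
	assert(probe_interval_valid(ms));
	return ms != 0;
}
```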
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021
  14. sctp: add pad chunk and its make function and event table

    This chunk is defined in rfc4820#section-3, and used to pad an
    SCTP packet. The receiver must discard this chunk and continue
    processing the rest of the chunks in the packet.
    
    Add it now, as it will be bundled with a heartbeat chunk to probe
    pmtu in the following patches.
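
The receiver rule (discard the PAD chunk, keep processing the rest) can be illustrated with a userspace chunk walker. This is a sketch, not kernel code: `count_processed_chunks()` is a hypothetical helper, the buffer is assumed to start right after the SCTP common header, and the PAD chunk type value 0x84 is the one assigned in RFC 4820.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_TYPE_PAD 0x84u /* PAD chunk type (RFC 4820) */
#define CHUNK_HDR_LEN  4u    /* type, flags, 16-bit length */

/* Count the chunks a receiver would act on, skipping PAD chunks.
 * Returns -1 if a chunk header is malformed. */
static int count_processed_chunks(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	int n = 0;

	assert(buf || len == 0);
	while (len - off >= CHUNK_HDR_LEN) {
		uint8_t type = buf[off];
		/* chunk length is big-endian and includes the header */
		size_t clen = ((size_t)buf[off + 2] << 8) | buf[off + 3];
		size_t padded = (clen + 3) & ~(size_t)3; /* 4-byte alignment */

		if (clen < CHUNK_HDR_LEN || clen > len - off)
			return -1;
		if (type != CHUNK_TYPE_PAD)
			n++; /* PAD is discarded, everything else is processed */
		off += padded;
		if (off > len) /* final chunk sent without trailing padding */
			off = len;
	}
	return n;
}
```

A packet holding a HEARTBEAT chunk, a PAD chunk and one more chunk would therefore yield a count of two.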
    
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    lxin authored and intel-lab-lkp committed Jun 21, 2021

Commits on Jun 19, 2021

  1. Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

    Trivial conflicts in net/can/isotp.c and
    tools/testing/selftests/net/mptcp/mptcp_connect.sh
    
    scaled_ppm_to_ppb() was moved from drivers/ptp/ptp_clock.c
    to include/linux/ptp_clock_kernel.h in -next so re-apply
    the fix there.
    
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Jun 19, 2021
  2. Merge tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

    Pull networking fixes from Jakub Kicinski:
     "Networking fixes for 5.13-rc7, including fixes from wireless, bpf,
      bluetooth, netfilter and can.
    
      Current release - regressions:
    
       - mlxsw: spectrum_qdisc: Pass handle, not band number to find_class()
         to fix modifying offloaded qdiscs
    
       - lantiq: net: fix duplicated skb in rx descriptor ring
    
       - rtnetlink: fix regression in bridge VLAN configuration, empty info
         is not an error, bot-generated "fix" was not needed
    
       - libbpf: s/rx/tx/ typo on umem->rx_ring_setup_done to fix umem
         creation
    
      Current release - new code bugs:
    
       - ethtool: fix NULL pointer dereference during module EEPROM dump via
         the new netlink API
    
       - mlx5e: don't update netdev RQs with PTP-RQ, the special purpose
         queue should not be visible to the stack
    
       - mlx5e: select special PTP queue only for SKBTX_HW_TSTAMP skbs
    
       - mlx5e: verify dev is present in get devlink port ndo, avoid a panic
    
      Previous releases - regressions:
    
       - neighbour: allow NUD_NOARP entries to be force GCed
    
       - further fixes for fallout from reorg of WiFi locking (staging:
         rtl8723bs, mac80211, cfg80211)
    
       - skbuff: fix incorrect msg_zerocopy copy notifications
    
       - mac80211: fix NULL ptr deref for injected rate info
    
       - Revert "net/mlx5: Arm only EQs with EQEs" it may cause missed IRQs
    
      Previous releases - always broken:
    
       - bpf: more speculative execution fixes
    
       - netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-local
    
       - udp: fix race between close() and udp_abort() resulting in a panic
    
       - fix out of bounds when parsing TCP options before packets are
         validated (in netfilter: synproxy, tc: sch_cake and mptcp)
    
       - mptcp: improve operation under memory pressure, add missing
         wake-ups
    
       - mptcp: fix double-lock/soft lookup in subflow_error_report()
    
       - bridge: fix races (null pointer deref and UAF) in vlan tunnel
         egress
    
       - ena: fix DMA mapping function issues in XDP
    
       - rds: fix memory leak in rds_recvmsg
    
      Misc:
    
       - vrf: allow larger MTUs
    
       - icmp: don't send out ICMP messages with a source address of 0.0.0.0
    
       - cdc_ncm: switch to eth%d interface naming"
    
    * tag 'net-5.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (139 commits)
      net: ethernet: fix potential use-after-free in ec_bhf_remove
      selftests/net: Add icmp.sh for testing ICMP dummy address responses
      icmp: don't send out ICMP messages with a source address of 0.0.0.0
      net: ll_temac: Avoid ndo_start_xmit returning NETDEV_TX_BUSY
      net: ll_temac: Fix TX BD buffer overwrite
      net: ll_temac: Add memory-barriers for TX BD access
      net: ll_temac: Make sure to free skb when it is completely used
      MAINTAINERS: add Guvenc as SMC maintainer
      bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path
      bnxt_en: Fix TQM fastpath ring backing store computation
      bnxt_en: Rediscover PHY capabilities after firmware reset
      cxgb4: fix wrong shift.
      mac80211: handle various extensible elements correctly
      mac80211: reset profile_periodicity/ema_ap
      cfg80211: avoid double free of PMSR request
      cfg80211: make certificate generation more robust
      mac80211: minstrel_ht: fix sample time check
      net: qed: Fix memcpy() overflow of qed_dcbx_params()
      net: cdc_eem: fix tx fixup skb leak
      net: hamradio: fix memory leak in mkiss_close
      ...
    torvalds committed Jun 19, 2021

Commits on Jun 18, 2021

  1. Merge tag 'for-5.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

    Pull btrfs fix from David Sterba:
     "One more fix, for a space accounting bug in zoned mode. It happens
      when a block group is switched back rw->ro and unusable bytes (due to
      zoned constraints) are subtracted twice.
    
      It has user visible effects so I consider it important enough for late
      -rc inclusion and backport to stable"
    
    * tag 'for-5.13-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
      btrfs: zoned: fix negative space_info->bytes_readonly
    torvalds committed Jun 18, 2021
  2. Merge tag 'pci-v5.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

    Pull PCI fixes from Bjorn Helgaas:
    
     - Clear 64-bit flag for host bridge windows below 4GB to fix a resource
       allocation regression added in -rc1 (Punit Agrawal)
    
     - Fix tegra194 MCFG quirk build regressions added in -rc1 (Jon Hunter)
    
     - Avoid secondary bus resets on TI KeyStone C667X devices (Antti
       Järvinen)
    
     - Avoid secondary bus resets on some NVIDIA GPUs (Shanker Donthineni)
    
     - Work around FLR erratum on Huawei Intelligent NIC VF (Chiqijun)
    
     - Avoid broken ATS on AMD Navi14 GPU (Evan Quan)
    
     - Trust Broadcom BCM57414 NIC to isolate functions even though it
       doesn't advertise ACS support (Sriharsha Basavapatna)
    
     - Work around AMD RS690 BIOSes that don't configure DMA above 4GB
       (Mikel Rychliski)
    
     - Fix panic during PIO transfer on Aardvark controller (Pali Rohár)
    
    * tag 'pci-v5.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
      PCI: aardvark: Fix kernel panic during PIO transfer
      PCI: Add AMD RS690 quirk to enable 64-bit DMA
      PCI: Add ACS quirk for Broadcom BCM57414 NIC
      PCI: Mark AMD Navi14 GPU ATS as broken
      PCI: Work around Huawei Intelligent NIC VF FLR erratum
      PCI: Mark some NVIDIA GPUs to avoid bus reset
      PCI: Mark TI C667X to avoid bus reset
      PCI: tegra194: Fix MCFG quirk build regressions
      PCI: of: Clear 64-bit flag for non-prefetchable memory below 4GB
    torvalds committed Jun 18, 2021
  3. afs: Re-enable freezing once a page fault is interrupted

    If a task is killed during a page fault, it does not currently call
    sb_end_pagefault(), which means that the filesystem cannot be frozen
    at any time thereafter.  This may be reported by lockdep like this:
    
    ====================================
    WARNING: fsstress/10757 still has locks held!
    5.13.0-rc4-build4+ #91 Not tainted
    ------------------------------------
    1 lock held by fsstress/10757:
     #0: ffff888104eac530 (sb_pagefaults
    
    as filesystem freezing is modelled as a lock.
    
    Fix this by removing all the direct returns from within the function,
    and using 'ret' to indicate whether we were interrupted or successful.
    
    Fixes: 1cf7a15 ("afs: Implement shared-writeable mmap")
    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: David Howells <dhowells@redhat.com>
    cc: linux-afs@lists.infradead.org
    Link: https://lore.kernel.org/r/20210616154900.1958373-1-willy@infradead.org/
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Matthew Wilcox (Oracle) authored and torvalds committed Jun 18, 2021
  4. Merge branch 'RPMSG-WWAN-CTRL-driver'

    Stephan Gerhold says:
    
    ====================
    net: wwan: Add RPMSG WWAN CTRL driver
    
    This patch series adds a WWAN "control" driver for the remote processor
    messaging (rpmsg) subsystem. This subsystem allows communicating with
    an integrated modem DSP on many Qualcomm SoCs, e.g. MSM8916 or MSM8974.
    
    The driver is a fairly simple glue layer between WWAN and RPMSG
    and is mostly based on the existing mhi_wwan_ctrl.c and rpmsg_char.c.
    
    For more information, see commit message in PATCH 2/3.
    
    I already posted a RFC for this a while ago:
    https://lore.kernel.org/linux-arm-msm/YLfL9Q+4860uqS8f@gerhold.net/
    and now I'm looking for some feedback for the actual changes. :)
    
    Changes in v3:
      - PATCH 2/3: Clarify commit message
      - PATCH 3/3: Fix build error for cdc-wdm.c, use extra tx_blocking() op instead
    v2: https://lore.kernel.org/netdev/20210618075243.42046-1-stephan@gerhold.net/
    
    Changes in v2: Only in PATCH 3/3
      - Fix EPOLLOUT being always set even if poll op is defined
      - Rename poll() op -> tx_poll() since it should be only used for TX
    v1: https://lore.kernel.org/netdev/20210615133229.213064-1-stephan@gerhold.net/
    ====================
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Jun 18, 2021
  5. net: wwan: Allow WWAN drivers to provide blocking tx and poll function

    At the moment, the WWAN core provides wwan_port_txon/off() to implement
    blocking writes. The tx() port operation should not block, instead
    wwan_port_txon/off() should be called when the TX queue is full or has
    free space again.
    
    However, in some cases it is not straightforward to make use of that
    functionality. For example, the RPMSG API used by rpmsg_wwan_ctrl.c
    does not provide any way to be notified when the TX queue has space
    again. Instead, it only provides the following operations:
    
      - rpmsg_send(): blocking write (wait until there is space)
      - rpmsg_trysend(): non-blocking write (return error if no space)
      - rpmsg_poll(): set poll flags depending on TX queue state
    
    Generally that's totally sufficient for implementing a char device,
    but it does not fit well to the currently provided WWAN port ops.
    
    Most of the time, using the non-blocking rpmsg_trysend() in the
    WWAN tx() port operation works just fine. However, with high-frequency
    writes to the char device it is possible to trigger a situation
    where this causes issues. For example, consider the following
    (somewhat unrealistic) example:
    
     # dd if=/dev/zero bs=1000 of=/dev/wwan0qmi0
     dd: error writing '/dev/wwan0qmi0': Resource temporarily unavailable
     1+0 records out
    
    This fails immediately after writing the first record. It's likely
    only a matter of time until this triggers issues for some real application
    (e.g. ModemManager sending a lot of large QMI packets).
    
    The rpmsg_char device does not have this problem, because it uses
    rpmsg_trysend() and rpmsg_poll() to support non-blocking operations.
    Make it possible to use the same in the RPMSG WWAN driver by adding
    two new optional wwan_port_ops:
    
      - tx_blocking(): send data blocking if allowed
      - tx_poll(): set additional TX poll flags
    
    This integrates nicely with the RPMSG API and does not require
    any change in existing WWAN drivers.
    
    With these changes, the dd example above blocks instead of exiting
    with an error.
    
    Cc: Loic Poulain <loic.poulain@linaro.org>
    Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    stephan-gh authored and davem330 committed Jun 18, 2021
  6. net: wwan: Add RPMSG WWAN CTRL driver

    The remote processor messaging (rpmsg) subsystem provides an interface
    to communicate with other remote processors. On many Qualcomm SoCs this
    is used to communicate with an integrated modem DSP that implements most
    of the modem functionality and provides high-level protocols like
    QMI or AT to allow controlling the modem.
    
    For QMI, most older Qualcomm SoCs (e.g. MSM8916/MSM8974) have
    a standalone "DATA5_CNTL" channel that allows exchanging QMI messages.
    Note that newer SoCs (e.g. SDM845) only allow exchanging QMI messages
    via a shared QRTR channel that is available via a socket API on Linux.
    
    For AT, the "DATA4" channel accepts at least a limited set of AT
    commands, on many older and newer Qualcomm SoCs, although QMI is
    typically the preferred control protocol.
    
    Often there are additional QMI/AT channels (usually named DATA*_CNTL
    for QMI and DATA* for AT), but it is not clear if those are really
    functional on all devices. Also, at the moment there is no use case
    for having multiple QMI/AT ports. If needed more channels could be
    added later after more testing.
    
    Note that the data path (network interface) is entirely separate
    from the control path and varies between Qualcomm SoCs, e.g. "IPA"
    on newer Qualcomm SoCs or "BAM-DMUX" on some older ones.
    
    The RPMSG WWAN CTRL driver exposes the QMI/AT control ports via the
    WWAN subsystem, and therefore allows userspace like ModemManager to
    set up the modem. Until now, ModemManager had to use the RPMSG-specific
    rpmsg-char where the channels must be explicitly exposed as a char
    device first and don't show up directly in sysfs.
    
    The driver is a fairly simple glue layer between WWAN and RPMSG
    and is mostly based on the existing mhi_wwan_ctrl.c and rpmsg_char.c.
    
    Cc: Loic Poulain <loic.poulain@linaro.org>
    Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
    Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    stephan-gh authored and davem330 committed Jun 18, 2021
  7. rpmsg: core: Add driver_data for rpmsg_device_id

    Most device_id structs provide a driver_data field that can be used
    by drivers to associate data more easily for a particular device ID.
    Add the same for the rpmsg_device_id.
    
    Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
    Signed-off-by: Stephan Gerhold <stephan@gerhold.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    stephan-gh authored and davem330 committed Jun 18, 2021
  8. Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

    Jesse Brandeburg says:
    
    ====================
    100GbE Intel Wired LAN Driver Updates 2021-06-18
    
    Update three of the Intel Ethernet drivers with similar (but not the
    same) improvements to simplify the packet type table init, while removing
    an unused structure entry. For the ice driver, the table is extended
    to 10 bits, which is the hardware limit, and for now is initialized
    to zero.
    
    The end result is slightly reduced memory usage, removal of a bunch
    of code, and more specific initialization.
    ====================
    
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    davem330 committed Jun 18, 2021
  9. Revert "net: add pf_family_names[] for protocol family"

    This reverts commit 1f3c98e.
    
    Does not build...
    
    Signed-off-by: David S. Miller <davem@davemloft.net>
    davem330 committed Jun 18, 2021
  10. net: add pf_family_names[] for protocol family

    Change the pr_info content from int to char *, which is more readable.
    
    Signed-off-by: Yejune Deng <yejune.deng@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    YajunDeng authored and davem330 committed Jun 18, 2021
  11. net: ethernet: fix potential use-after-free in ec_bhf_remove

    static void ec_bhf_remove(struct pci_dev *dev)
    {
    ...
    	struct ec_bhf_priv *priv = netdev_priv(net_dev);
    
    	unregister_netdev(net_dev);
    	free_netdev(net_dev);
    
    	pci_iounmap(dev, priv->dma_io);
    	pci_iounmap(dev, priv->io);
    ...
    }
    
    priv is netdev private data, but it is used
    after free_netdev(). It can cause use-after-free when accessing priv
    pointer. So, fix it by moving free_netdev() after pci_iounmap()
    calls.
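
The ordering fix described above can be modeled in userspace. The sketch uses hypothetical stand-ins (`netdev_model`, `priv_model`, `unmap_model`) rather than the driver's real types: the private area lives inside the device allocation, so everything reached through `priv` has to be released before the device is freed. The unregister step is left out of the model.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins: priv is embedded in the device allocation,
 * in the same way netdev private data belongs to the net_device. */
struct priv_model {
	int dma_io_mapped;
	int io_mapped;
};

struct netdev_model {
	struct priv_model priv;
};

static int unmap_calls; /* how many mappings have been released */

static void unmap_model(int *mapped)
{
	assert(*mapped); /* would read freed memory if called after free() */
	*mapped = 0;
	unmap_calls++;
}

static void remove_fixed(struct netdev_model *dev)
{
	struct priv_model *priv = &dev->priv;

	/* Release everything reachable through priv first ... */
	unmap_model(&priv->dma_io_mapped);
	unmap_model(&priv->io_mapped);

	/* ... and free the device, which also frees priv, last. */
	free(dev);
}
```

Swapping the `free()` call above the two `unmap_model()` calls would reproduce the pattern the commit removes.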
    
    Fixes: 6af55ff ("Driver for Beckhoff CX5020 EtherCAT master module.")
    Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    pskrgag authored and davem330 committed Jun 18, 2021
  12. Merge branch 'csock-seqpoacket-small-fixes'

    Stefano Garzarella says:
    
    ====================
    vsock: small fixes for seqpacket support
    
    This series contains a few patches to clean up the seqpacket code
    recently merged in the net-next tree.
    
    No functionality changes.
    ====================
    
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    davem330 committed Jun 18, 2021
  13. vsock/virtio: remove redundant copy_failed variable

    When memcpy_to_msg() fails in virtio_transport_seqpacket_do_dequeue(),
    we already set `dequeued_len` with the negative error value returned
    by memcpy_to_msg().
    
    So we can directly check `dequeued_len` value instead of using a
    dedicated flag variable to skip the copy path for the rest of
    fragments.
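
The idea of letting the length variable double as the error flag can be shown with a small userspace model. All names here are hypothetical and the loop is not the vsock code: it only demonstrates that once a negative error is stored in `dequeued_len`, the copy step is skipped for the remaining fragments without a separate flag.

```c
#include <assert.h>
#include <stddef.h>

/* Copy callback: returns 0 on success or a negative error value. */
typedef int (*copy_fn)(size_t frag_len, void *ctx);

static long dequeue_fragments(const size_t *frags, size_t nfrags,
			      copy_fn copy, void *ctx)
{
	long dequeued_len = 0;

	assert(copy && (frags || nfrags == 0));
	for (size_t i = 0; i < nfrags; i++) {
		/* A negative dequeued_len already records the failure,
		 * so no dedicated flag variable is needed. */
		if (dequeued_len >= 0) {
			int err = copy(frags[i], ctx);

			if (err)
				dequeued_len = err;
			else
				dequeued_len += (long)frags[i];
		}
		/* fragment i would still be dropped from the queue here */
	}
	return dequeued_len;
}

/* Example callback: fails with -14 once the countdown in ctx hits zero. */
static int copy_fail_at(size_t frag_len, void *ctx)
{
	int *countdown = ctx;

	(void)frag_len;
	return (*countdown)-- == 0 ? -14 : 0;
}
```

The return value is either the total number of bytes copied or the first error, which is the same contract the commit message describes for `dequeued_len`.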
    
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    stefano-garzarella authored and davem330 committed Jun 18, 2021
  14. vsock: rename vsock_wait_data()

    vsock_wait_data() is used only by STREAM and SEQPACKET sockets,
    so let's rename it to vsock_connectible_wait_data(), using the same
    nomenclature (connectible) used in other functions after the
    introduction of SEQPACKET.
    
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    stefano-garzarella authored and davem330 committed Jun 18, 2021
  15. vsock: rename vsock_has_data()

    vsock_has_data() is used only by STREAM and SEQPACKET sockets,
    so let's rename it to vsock_connectible_has_data(), using the same
    nomenclature (connectible) used in other functions after the
    introduction of SEQPACKET.
    
    Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    stefano-garzarella authored and davem330 committed Jun 18, 2021
  16. NFC: nxp-nci: remove unnecessary label

    Remove unnecessary label chunk_exit and return directly.
    
    Signed-off-by: wengjianfeng <wengjianfeng@yulong.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    wengjianfeng authored and davem330 committed Jun 18, 2021
  17. net: dsa: sja1105: completely error out in sja1105_static_config_reload if something fails

    If reloading the static config fails for whatever reason, for example if
    sja1105_static_config_check_valid() fails, then we "goto out_unlock_ptp"
    but we still print "Reset switch and programmed static config.",
    which is confusing because we didn't. We also do a bunch of other stuff
    like reprogram the XPCS and reload the credit-based shapers, as if a
    switch reset took place, which it didn't.
    
    So just unlock the PTP lock and goto out, skipping all of that.
    
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    vladimiroltean authored and davem330 committed Jun 18, 2021
  18. net: dsa: sja1105: allow the TTEthernet configuration in the static config for SJA1110

    Currently sja1105_static_config_check_valid() is coded up to detect
    whether TTEthernet is supported based on device ID, and this check was
    not updated to cover SJA1110.
    
    However, it is desirable to have as few checks for the device ID as
    possible, so the driver core is more generic. So what we can do is look
    at the static config table operations implemented by that specific
    switch family (populated by sja1105_static_config_init) to see whether
    the schedule table has a non-zero maximum entry count (meaning that it
    is supported) or not.
    
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    vladimiroltean authored and davem330 committed Jun 18, 2021
  19. net: hns3: fix reuse conflict of the rx page

    In the current rx page reuse handling process, the rx page buffer may
    be subject to a reuse conflict between driver and stack in high-pressure
    scenarios.
    
    To fix this problem, we need to check whether the page is only owned
    by driver at the beginning and at the end of a page to make sure there is
    no reuse conflict between driver and stack when desc_cb->page_offset
    is rolled back to zero or increased.
    
    Fixes: fa7711b ("net: hns3: optimize the rx page reuse handling process")
    Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
    Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
    Yunsheng Lin authored and davem330 committed Jun 18, 2021