Zqiang/fs-inod…

Commits on Oct 15, 2021

  1. fs: inode: use queue_rcu_work() instead of call_rcu()

    Call Trace:
     <IRQ>
     __init_work+0x2d/0x50 kernel/workqueue.c:519
     synchronize_rcu_expedited+0x3af/0x650 kernel/rcu/tree_exp.h:847
     bdi_remove_from_list mm/backing-dev.c:938 [inline]
     bdi_unregister+0x17f/0x5c0 mm/backing-dev.c:946
     release_bdi+0xa1/0xc0 mm/backing-dev.c:968
     kref_put include/linux/kref.h:65 [inline]
     bdi_put+0x72/0xa0 mm/backing-dev.c:976
     bdev_free_inode+0x11e/0x220 block/bdev.c:408
     i_callback+0x3f/0x70 fs/inode.c:226
     rcu_do_batch kernel/rcu/tree.c:2508 [inline]
     rcu_core+0x76d/0x16c0 kernel/rcu/tree.c:2743
     __do_softirq+0x1d7/0x93b kernel/softirq.c:558
     invoke_softirq kernel/softirq.c:432 [inline]
     __irq_exit_rcu kernel/softirq.c:636 [inline]
     irq_exit_rcu+0xf2/0x130 kernel/softirq.c:648
     sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
    
    bdi_put() is called from RCU softirq context. However,
    synchronize_rcu_expedited() and flush_delayed_work(), which are called
    during wb shutdown, may sleep. Use queue_rcu_work() instead of
    call_rcu() so that the release operation is executed in task context.
    
    Reported-by: Hao Sun <sunhao.th@gmail.com>
    Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
    Zqiang authored and intel-lab-lkp committed Oct 15, 2021

Commits on Oct 14, 2021

  1. Merge tag 'net-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
    
    Pull networking fixes from Jakub Kicinski:
     "Quite calm.
    
      The noisy DSA driver (embedded switches) changes, and adjustment to
      IPv6 IOAM behavior add to diffstat's bottom line but are not scary.
    
      Current release - regressions:
    
       - af_unix: rename UNIX-DGRAM to UNIX to maintain backwards
         compatibility
    
       - procfs: revert "add seq_puts() statement for dev_mcast", minor
         format change broke user space
    
      Current release - new code bugs:
    
       - dsa: fix bridge_num not getting cleared after ports leaving the
         bridge, resource leak
    
       - dsa: tag_dsa: send packets with TX fwd offload from VLAN-unaware
         bridges using VID 0, prevent packet drops if pvid is removed
    
       - dsa: mv88e6xxx: keep the pvid at 0 when VLAN-unaware, prevent HW
         getting confused about station to VLAN mapping
    
      Previous releases - regressions:
    
       - virtio-net: fix for skb_over_panic inside big mode
    
       - phy: do not shutdown PHYs in READY state
    
       - dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's, fix link
         LED staying lit after ifdown
    
       - mptcp: fix possible infinite wait on recvmsg(MSG_WAITALL)
    
       - mqprio: Correct stats in mqprio_dump_class_stats()
    
       - ice: fix deadlock for Tx timestamp tracking flush
    
       - stmmac: fix feature detection on old hardware
    
      Previous releases - always broken:
    
       - sctp: account stream padding length for reconf chunk
    
       - icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe()
    
       - isdn: cpai: check ctr->cnr to avoid array index out of bound
    
       - isdn: mISDN: fix sleeping function called from invalid context
    
       - nfc: nci: fix potential UAF of rf_conn_info object
    
       - dsa: microchip: prevent ksz_mib_read_work from kicking back in
         after it's canceled in .remove and crashing
    
       - dsa: mv88e6xxx: isolate the ATU databases of standalone and bridged
         ports
    
       - dsa: sja1105, ocelot: break circular dependency between switch and
         tag drivers
    
       - dsa: felix: improve timestamping in presence of packet loss
    
       - mlxsw: thermal: fix out-of-bounds memory accesses
    
      Misc:
    
       - ipv6: ioam: move the check for undefined bits to improve
         interoperability"
    
    * tag 'net-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (60 commits)
      icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe
      MAINTAINERS: Update the devicetree documentation path of imx fec driver
      sctp: account stream padding length for reconf chunk
      mlxsw: thermal: Fix out-of-bounds memory accesses
      ethernet: s2io: fix setting mac address during resume
      NFC: digital: fix possible memory leak in digital_in_send_sdd_req()
      NFC: digital: fix possible memory leak in digital_tg_listen_mdaa()
      nfc: fix error handling of nfc_proto_register()
      Revert "net: procfs: add seq_puts() statement for dev_mcast"
      net: encx24j600: check error in devm_regmap_init_encx24j600
      net: korina: select CRC32
      net: arc: select CRC32
      net: dsa: felix: break at first CPU port during init and teardown
      net: dsa: tag_ocelot_8021q: fix inability to inject STP BPDUs into BLOCKING ports
      net: dsa: felix: purge skb from TX timestamping queue if it cannot be sent
      net: dsa: tag_ocelot_8021q: break circular dependency with ocelot switch lib
      net: dsa: tag_ocelot: break circular dependency with ocelot switch lib driver
      net: mscc: ocelot: cross-check the sequence id from the timestamp FIFO with the skb PTP header
      net: mscc: ocelot: deny TX timestamping of non-PTP packets
      net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown skb
      ...
    torvalds committed Oct 14, 2021
  2. icmp: fix icmp_ext_echo_iio parsing in icmp_build_probe

    In icmp_build_probe(), the icmp_ext_echo_iio parsing should be done
    step by step and skb_header_pointer() return value should always be
    checked, this patch fixes 3 places in there:
    
      - On case ICMP_EXT_ECHO_CTYPE_NAME, it should only copy ident.name
        from skb by skb_header_pointer(), its len is ident_len. Besides,
        the return value of skb_header_pointer() should always be checked.
    
      - On case ICMP_EXT_ECHO_CTYPE_INDEX, move ident_len check ahead of
        skb_header_pointer(), and also do the return value check for
        skb_header_pointer().
    
      - On case ICMP_EXT_ECHO_CTYPE_ADDR, before accessing iio->ident.addr.
        ctype3_hdr.addrlen, skb_header_pointer() should be called first,
        then check its return value and ident_len.
        On subcases ICMP_AFI_IP and ICMP_AFI_IP6, also do check for ident.
        addr.ctype3_hdr.addrlen and skb_header_pointer()'s return value.
        On subcase ICMP_AFI_IP, the len for skb_header_pointer() should be
        "sizeof(iio->extobj_hdr) + sizeof(iio->ident.addr.ctype3_hdr) +
        sizeof(struct in_addr)" or "ident_len".
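    
    The "fetch, then check the return value" pattern described above can be
    sketched in userspace. This is only a model: header_pointer() and
    read_be16() are hypothetical helpers written for this illustration, and
    unlike the real skb_header_pointer() the model always copies into the
    caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical userspace model of a bounds-checked header fetch.
 * Copies len bytes starting at offset into buf and returns buf,
 * or returns NULL when the requested range does not fit in the packet.
 */
static const void *header_pointer(const unsigned char *pkt, size_t pkt_len,
				  size_t offset, size_t len, void *buf)
{
	assert(pkt != NULL && buf != NULL);
	if (offset > pkt_len || len > pkt_len - offset)
		return NULL;
	memcpy(buf, pkt + offset, len);
	return buf;
}

/* Read a 2-byte big-endian field; return -1 if the packet is truncated. */
static int read_be16(const unsigned char *pkt, size_t pkt_len, size_t offset)
{
	unsigned char tmp[2];
	const unsigned char *p;

	p = header_pointer(pkt, pkt_len, offset, sizeof(tmp), tmp);
	if (!p)		/* the return value is checked before any access */
		return -1;
	return (p[0] << 8) | p[1];
}
```

    The length check happens before any byte is read, so a truncated packet
    yields an error instead of an out-of-bounds access.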
    
    v1->v2:
      - To make it more clear, call skb_header_pointer() once only for
        iio->ident's parsing, as Jakub suggested.
    v2->v3:
      - The extobj_hdr.length check against sizeof(_iio) should be done
        before calling skb_header_pointer(), as Eric noticed.
    
    Fixes: d329ea5 ("icmp: add response to RFC 8335 PROBE messages")
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Reviewed-by: Eric Dumazet <edumazet@google.com>
    Link: https://lore.kernel.org/r/31628dd76657ea62f5cf78bb55da6b35240831f1.1634205050.git.lucien.xin@gmail.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    lxin authored and Jakub Kicinski committed Oct 14, 2021
  3. MAINTAINERS: Update the devicetree documentation path of imx fec driver

    Change the devicetree documentation path
    to "Documentation/devicetree/bindings/net/fsl,fec.yaml"
    since 'fsl-fec.txt' has been converted to 'fsl,fec.yaml' already.
    
    Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
    Link: https://lore.kernel.org/r/20211014110214.3254-1-caihuoqing@baidu.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Cai Huoqing authored and Jakub Kicinski committed Oct 14, 2021
  4. sctp: account stream padding length for reconf chunk

    sctp_make_strreset_req() makes repeated calls to sctp_addto_chunk()
    which will automatically account for padding on each call. inreq and
    outreq are already 4 bytes aligned, but the payload is not and doing
    SCTP_PAD4(a + b) (which _sctp_make_chunk() did implicitly here) is
    different from SCTP_PAD4(a) + SCTP_PAD4(b) and not enough. It led to
    a possible attempt to use more buffer than was allocated and triggered
    a BUG_ON.
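    
    The arithmetic can be checked with a small userspace sketch. pad4() is
    a hypothetical stand-in for SCTP_PAD4() that rounds up to a multiple of
    4; the function names and the sample lengths in the note below are
    illustrative, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of 4 (stand-in for SCTP_PAD4()). */
static size_t pad4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Space reserved when both parts are padded once, as a single sum. */
static size_t reserved_once(size_t a, size_t b)
{
	return pad4(a + b);
}

/* Space consumed when each part is padded on its own append. */
static size_t consumed_per_part(size_t a, size_t b)
{
	size_t total = pad4(a) + pad4(b);

	/* Padding each part separately never needs less than padding the sum. */
	assert(total >= reserved_once(a, b));
	return total;
}
```

    For two 6-byte parts, reserved_once(6, 6) is 12 while
    consumed_per_part(6, 6) is 16, so a buffer sized from the single
    padded sum is 4 bytes short.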
    
    Cc: Vlad Yasevich <vyasevich@gmail.com>
    Cc: Neil Horman <nhorman@tuxdriver.com>
    Cc: Greg KH <gregkh@linuxfoundation.org>
    Fixes: cc16f00 ("sctp: add support for generating stream reconf ssn reset request chunk")
    Reported-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
    Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
    Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
    Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
    Reviewed-by: Xin Long <lucien.xin@gmail.com>
    Link: https://lore.kernel.org/r/b97c1f8b0c7ff79ac4ed206fc2c49d3612e0850c.1634156849.git.mleitner@redhat.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Etsukata authored and Jakub Kicinski committed Oct 14, 2021
  5. mlxsw: thermal: Fix out-of-bounds memory accesses

    Currently, mlxsw allows cooling states to be set above the maximum
    cooling state supported by the driver:
    
     # cat /sys/class/thermal/thermal_zone2/cdev0/type
     mlxsw_fan
     # cat /sys/class/thermal/thermal_zone2/cdev0/max_state
     10
     # echo 18 > /sys/class/thermal/thermal_zone2/cdev0/cur_state
     # echo $?
     0
    
    This results in out-of-bounds memory accesses when thermal state
    transition statistics are enabled (CONFIG_THERMAL_STATISTICS=y), as the
    transition table is accessed with a too large index (state) [1].
    
    According to the thermal maintainer, it is the responsibility of the
    driver to reject such operations [2].
    
    Therefore, return an error when the state to be set exceeds the maximum
    cooling state supported by the driver.
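    
    A minimal userspace sketch of such a bounds check follows. struct
    fan_cdev and fan_set_cur_state() are hypothetical names, and -EINVAL
    is an assumed error code; the text above only says that an error is
    returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's cooling device state. */
struct fan_cdev {
	unsigned long max_state;
	unsigned long cur_state;
};

/* Reject any state above max_state instead of silently accepting it. */
static int fan_set_cur_state(struct fan_cdev *cdev, unsigned long state)
{
	assert(cdev != NULL);
	if (state > cdev->max_state)
		return -EINVAL;	/* assumed error code */
	cdev->cur_state = state;
	return 0;
}
```

    With max_state set to 10, a request for state 18 (as in the shell
    session above) is refused and cur_state is left unchanged.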
    
    To avoid dead code, as suggested by the thermal maintainer [3],
    partially revert commit a421ce0 ("mlxsw: core: Extend cooling
    device with cooling levels") that tried to interpret these invalid
    cooling states (above the maximum) in a special way. The cooling levels
    array is not removed in order to prevent the fans going below 20% PWM,
    which would cause them to get stuck at 0% PWM.
    
    [1]
    BUG: KASAN: slab-out-of-bounds in thermal_cooling_device_stats_update+0x271/0x290
    Read of size 4 at addr ffff8881052f7bf8 by task kworker/0:0/5
    
    CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.15.0-rc3-custom-45935-gce1adf704b14 #122
    Hardware name: Mellanox Technologies Ltd. "MSN2410-CB2FO"/"SA000874", BIOS 4.6.5 03/08/2016
    Workqueue: events_freezable_power_ thermal_zone_device_check
    Call Trace:
     dump_stack_lvl+0x8b/0xb3
     print_address_description.constprop.0+0x1f/0x140
     kasan_report.cold+0x7f/0x11b
     thermal_cooling_device_stats_update+0x271/0x290
     __thermal_cdev_update+0x15e/0x4e0
     thermal_cdev_update+0x9f/0xe0
     step_wise_throttle+0x770/0xee0
     thermal_zone_device_update+0x3f6/0xdf0
     process_one_work+0xa42/0x1770
     worker_thread+0x62f/0x13e0
     kthread+0x3ee/0x4e0
     ret_from_fork+0x1f/0x30
    
    Allocated by task 1:
     kasan_save_stack+0x1b/0x40
     __kasan_kmalloc+0x7c/0x90
     thermal_cooling_device_setup_sysfs+0x153/0x2c0
     __thermal_cooling_device_register.part.0+0x25b/0x9c0
     thermal_cooling_device_register+0xb3/0x100
     mlxsw_thermal_init+0x5c5/0x7e0
     __mlxsw_core_bus_device_register+0xcb3/0x19c0
     mlxsw_core_bus_device_register+0x56/0xb0
     mlxsw_pci_probe+0x54f/0x710
     local_pci_probe+0xc6/0x170
     pci_device_probe+0x2b2/0x4d0
     really_probe+0x293/0xd10
     __driver_probe_device+0x2af/0x440
     driver_probe_device+0x51/0x1e0
     __driver_attach+0x21b/0x530
     bus_for_each_dev+0x14c/0x1d0
     bus_add_driver+0x3ac/0x650
     driver_register+0x241/0x3d0
     mlxsw_sp_module_init+0xa2/0x174
     do_one_initcall+0xee/0x5f0
     kernel_init_freeable+0x45a/0x4de
     kernel_init+0x1f/0x210
     ret_from_fork+0x1f/0x30
    
    The buggy address belongs to the object at ffff8881052f7800
     which belongs to the cache kmalloc-1k of size 1024
    The buggy address is located 1016 bytes inside of
     1024-byte region [ffff8881052f7800, ffff8881052f7c00)
    The buggy address belongs to the page:
    page:0000000052355272 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1052f0
    head:0000000052355272 order:3 compound_mapcount:0 compound_pincount:0
    flags: 0x200000000010200(slab|head|node=0|zone=2)
    raw: 0200000000010200 ffffea0005034800 0000000300000003 ffff888100041dc0
    raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected
    
    Memory state around the buggy address:
     ffff8881052f7a80: 00 00 00 00 00 00 04 fc fc fc fc fc fc fc fc fc
     ffff8881052f7b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    >ffff8881052f7b80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                                                                    ^
     ffff8881052f7c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
     ffff8881052f7c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    
    [2] https://lore.kernel.org/linux-pm/9aca37cb-1629-5c67-1895-1fdc45c0244e@linaro.org/
    [3] https://lore.kernel.org/linux-pm/af9857f2-578e-de3a-e62b-6baff7e69fd4@linaro.org/
    
    CC: Daniel Lezcano <daniel.lezcano@linaro.org>
    Fixes: a50c1e3 ("mlxsw: core: Implement thermal zone")
    Fixes: a421ce0 ("mlxsw: core: Extend cooling device with cooling levels")
    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Tested-by: Vadim Pasternak <vadimp@nvidia.com>
    Link: https://lore.kernel.org/r/20211012174955.472928-1-idosch@idosch.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    idosch authored and Jakub Kicinski committed Oct 14, 2021
  6. ethernet: s2io: fix setting mac address during resume

    After recent cleanups, gcc started warning about a suspicious
    memcpy() call during the s2io_io_resume() function:
    
    In function '__dev_addr_set',
        inlined from 'eth_hw_addr_set' at include/linux/etherdevice.h:318:2,
        inlined from 's2io_set_mac_addr' at drivers/net/ethernet/neterion/s2io.c:5205:2,
        inlined from 's2io_io_resume' at drivers/net/ethernet/neterion/s2io.c:8569:7:
    arch/x86/include/asm/string_32.h:182:25: error: '__builtin_memcpy' accessing 6 bytes at offsets 0 and 2 overlaps 4 bytes at offset 2 [-Werror=restrict]
      182 | #define memcpy(t, f, n) __builtin_memcpy(t, f, n)
          |                         ^~~~~~~~~~~~~~~~~~~~~~~~~
    include/linux/netdevice.h:4648:9: note: in expansion of macro 'memcpy'
     4648 |         memcpy(dev->dev_addr, addr, len);
          |         ^~~~~~
    
    What apparently happened is that an old cleanup changed the calling
    conventions for s2io_set_mac_addr() from taking an ethernet address
    as a character array to taking a struct sockaddr, but one of the
    callers was not changed at the same time.
    
    Change it to instead call the low-level do_s2io_prog_unicast() function
    that still takes the old argument type.
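    
    Why the mismatched argument type matters can be sketched in userspace.
    A callee that expects a struct sockaddr reads the address from sa_data,
    which does not start at offset 0 of the structure; that appears to be
    where the "offsets 0 and 2" in the warning come from. The two helper
    names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Callee in the newer convention: the MAC address lives in sa_data. */
static void set_mac_from_sockaddr(unsigned char dst[6],
				  const struct sockaddr *sa)
{
	assert(dst != NULL && sa != NULL);
	memcpy(dst, sa->sa_data, 6);
}

/* Offset at which a callee reading sa_data actually starts reading. */
static size_t sockaddr_data_offset(void)
{
	return offsetof(struct sockaddr, sa_data);
}
```

    If a caller passes a bare 6-byte address array where the sockaddr is
    expected, the callee reads from 2 bytes into that array, so source
    and destination overlap when both refer to the same buffer.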
    
    Fixes: 2fd3768 ("S2io: Added support set_mac_address driver entry point")
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Link: https://lore.kernel.org/r/20211013143613.2049096-1-arnd@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    arndb authored and Jakub Kicinski committed Oct 14, 2021
  7. MAINTAINERS: Update entry for the Stratix10 firmware

    Richard Gong is no longer at Intel, so update the MAINTAINERS entry for
    the Stratix10 firmware drivers.
    
    Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Dinh Nguyen authored and torvalds committed Oct 14, 2021
  8. Merge tag 'sound-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
    
    Pull sound fixes from Takashi Iwai:
     "This contains quite a few device-specific fixes for usual HD- and
      USB-audio in addition to a couple of ALSA core fixes (a UAF fix in
      sequencer and a fix for a misplaced PCM 32bit compat ioctl).
    
      Nothing really stands out"
    
    * tag 'sound-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
      ALSA: usb-audio: Add quirk for VF0770
      ALSA: hda: avoid write to STATESTS if controller is in reset
      ALSA: hda/realtek: Fix the mic type detection issue for ASUS G551JW
      ALSA: pcm: Workaround for a wrong offset in SYNC_PTR compat ioctl
      ALSA: hda/realtek: Fix for quirk to enable speaker output on the Lenovo 13s Gen2
      ALSA: hda: intel: Allow repeatedly probing on codec configuration errors
      ALSA: hda/realtek: Add quirk for TongFang PHxTxX1
      ALSA: hda/realtek - ALC236 headset MIC recording issue
      ALSA: usb-audio: Enable rate validation for Scarlett devices
      ALSA: hda/realtek: Add quirk for Clevo X170KM-G
      ALSA: hda/realtek: Complete partial device name to avoid ambiguity
      ALSA: hda - Enable headphone mic on Dell Latitude laptops with ALC3254
      ALSA: seq: Fix a potential UAF by wrong private_free call order
      ALSA: hda/realtek: Enable 4-speaker output for Dell Precision 5560 laptop
      ALSA: usb-audio: Fix a missing error check in scarlett gen2 mixer
    torvalds committed Oct 14, 2021
  9. Merge branch 'fix-two-possible-memory-leak-problems-in-nfc-digital-module'
    
    Ziyang Xuan says:
    
    ====================
    Fix two possible memory leak problems in NFC digital module.
    ====================
    
    Link: https://lore.kernel.org/r/cover.1634111083.git.william.xuanziyang@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Oct 14, 2021
  10. NFC: digital: fix possible memory leak in digital_in_send_sdd_req()

    'skb' is allocated in digital_in_send_sdd_req() but is not freed when
    digital_in_send_cmd() fails, which causes a memory leak. Fix it
    by freeing 'skb' if digital_in_send_cmd() returns an error.
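    
    The shape of the fix can be sketched in userspace. send_cmd() and
    send_req() are hypothetical stand-ins, and the sketch assumes that
    ownership of the buffer passes to the callee only on success;
    live_buffers exists purely so the leak is observable.

```c
#include <assert.h>
#include <stdlib.h>

static int live_buffers;	/* allocations that have not been released yet */

/* Hypothetical callee: takes ownership of buf only when it succeeds. */
static int send_cmd(void *buf, int should_fail)
{
	if (should_fail)
		return -1;	/* on failure the caller still owns buf */
	free(buf);		/* the sketch releases the buffer right away */
	live_buffers--;
	return 0;
}

/* Caller: frees the buffer itself when the callee reports failure. */
static int send_req(int should_fail)
{
	void *buf = malloc(16);
	int rc;

	if (!buf)
		return -1;
	live_buffers++;

	rc = send_cmd(buf, should_fail);
	if (rc) {
		free(buf);	/* without this, the failure path leaks buf */
		live_buffers--;
	}
	assert(live_buffers >= 0);
	return rc;
}
```

    After send_req(1) returns an error, live_buffers is back at 0;
    dropping the free() in the error branch would leave it at 1.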
    
    Fixes: 2c66dae ("NFC Digital: Add NFC-A technology support")
    Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Ziyang Xuan authored and Jakub Kicinski committed Oct 14, 2021
  11. NFC: digital: fix possible memory leak in digital_tg_listen_mdaa()

    'params' is allocated in digital_tg_listen_mdaa() but is not freed when
    digital_send_cmd() fails, which causes a memory leak. Fix it by
    freeing 'params' if digital_send_cmd() returns an error.
    
    Fixes: 1c7a4c2 ("NFC Digital: Add target NFC-DEP support")
    Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Ziyang Xuan authored and Jakub Kicinski committed Oct 14, 2021
  12. nfc: fix error handling of nfc_proto_register()

    When the NFC proto id is already in use, nfc_proto_register() returns
    -EBUSY but forgets to unregister the proto. Fix it by calling
    proto_unregister() in the error handling path.
    
    Fixes: c7fe3b5 ("NFC: add NFC socket family")
    Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
    Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
    Link: https://lore.kernel.org/r/20211013034932.2833737-1-william.xuanziyang@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Ziyang Xuan authored and Jakub Kicinski committed Oct 14, 2021
  13. Revert "net: procfs: add seq_puts() statement for dev_mcast"

    This reverts commit ec18e84.
    
    It turns out that there are user space programs which got broken by that
    change. One example is the "ifstat" program shipped by Debian:
    https://packages.debian.org/source/bullseye/ifstat
    which, confusingly enough, seems to not have anything in common with the
    much more familiar (at least to me) ifstat program from iproute2:
    https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/misc/ifstat.c
    
    root@debian:~# ifstat
    ifstat: /proc/net/dev: unsupported format.
    
    This change modified the header (first two lines of text) in
    /proc/net/dev so that it looks like this:
    
    root@debian:~# cat /proc/net/dev
    Interface|                            Receive                                       |                                 Transmit
             |            bytes      packets errs   drop fifo frame compressed multicast|            bytes      packets errs   drop fifo colls carrier compressed
           lo:            97400         1204    0      0    0     0          0         0            97400         1204    0      0    0     0       0          0
        bond0:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
         sit0:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
         eno2:          5002206         6651    0      0    0     0          0         0        105518642      1465023    0      0    0     0       0          0
         swp0:           134531         2448    0      0    0     0          0         0         99599598      1464381    0      0    0     0       0          0
         swp1:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
         swp2:          4867675         4203    0      0    0     0          0         0            58134          631    0      0    0     0       0          0
        sw0p0:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
        sw0p1:           124739         2448    0   1422    0     0          0         0         93741184      1464369    0      0    0     0       0          0
        sw0p2:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
        sw2p0:          4850863         4203    0      0    0     0          0         0            54722          619    0      0    0     0       0          0
        sw2p1:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
        sw2p2:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
        sw2p3:                0            0    0      0    0     0          0         0                0            0    0      0    0     0       0          0
          br0:            10508          212    0    212    0     0          0       212         61369558       958857    0      0    0     0       0          0
    
    whereas before it looked like this:
    
    root@debian:~# cat /proc/net/dev
    Inter-|   Receive                                                |  Transmit
     face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
        lo:   13160     164    0    0    0     0          0         0    13160     164    0    0    0     0       0          0
     bond0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
      sit0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
      eno2:   30824     268    0    0    0     0          0         0     3332      37    0    0    0     0       0          0
      swp0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
      swp1:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
      swp2:   30824     268    0    0    0     0          0         0     2428      27    0    0    0     0       0          0
     sw0p0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
     sw0p1:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
     sw0p2:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
     sw2p0:   29752     268    0    0    0     0          0         0     1564      17    0    0    0     0       0          0
     sw2p1:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
     sw2p2:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
     sw2p3:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
    
    The reason why the ifstat shipped by Debian (v1.1, with a Debian patch
    upgrading it to 1.1-8.1 at the time of writing) is broken is because its
    "proc" driver/backend parses the header very literally:
    
    main/drivers.c#L825
      if (!data->checked && strncmp(buf, "Inter-|", 7))
        goto badproc;
    
    and there's no way in which the header can be changed such that programs
    parsing like that would not get broken.
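    
    The effect of that literal comparison can be reproduced in userspace.
    header_ok() is a hypothetical wrapper around the same strncmp() test;
    it is not code from ifstat.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the line starts with the exact prefix the parser expects. */
static int header_ok(const char *line)
{
	assert(line != NULL);
	return strncmp(line, "Inter-|", 7) == 0;
}
```

    The old first line, which begins with "Inter-|", passes. The changed
    one, which begins with "Interface|", differs at the sixth character
    and is rejected, which matches the "unsupported format" error shown
    earlier.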
    
    Even if we fix this ancient and very "lightly" maintained program to
    parse the text output of /proc/net/dev in a more sensible way, this
    story seems bound to repeat again with other programs, and modifying
    them all could cause more trouble than it's worth. On the other hand,
    the reverted patch had no other reason than an aesthetic one, so
    reverting it is the simplest way out.
    
    I don't know what other distributions would be affected; the fact that
    Debian doesn't ship the iproute2 version of the program (a different
    code base altogether, which uses netlink and not /proc/net/dev) is
    surprising in itself.
    
    Fixes: ec18e84 ("net: procfs: add seq_puts() statement for dev_mcast")
    Link: https://lore.kernel.org/netdev/20211009163511.vayjvtn3rrteglsu@skbuf/
    Cc: Yajun Deng <yajun.deng@linux.dev>
    Cc: Matthieu Baerts <matthieu.baerts@tessares.net>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Link: https://lore.kernel.org/r/20211013001909.3164185-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 14, 2021

Commits on Oct 13, 2021

  1. net: encx24j600: check error in devm_regmap_init_encx24j600

    devm_regmap_init() may return an error, for example when out of memory.
    This results in a null pointer dereference later when reading
    or writing a register:
    
    general protection fault in encx24j600_spi_probe
    KASAN: null-ptr-deref in range [0x0000000000000090-0x0000000000000097]
    CPU: 0 PID: 286 Comm: spi-encx24j600- Not tainted 5.15.0-rc2-00142-g9978db750e31-dirty #11 9c53a778c1306b1b02359f3c2bbedc0222cba652
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
    RIP: 0010:regcache_cache_bypass drivers/base/regmap/regcache.c:540
    Code: 54 41 89 f4 55 53 48 89 fb 48 83 ec 08 e8 26 94 a8 fe 48 8d bb a0 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 4a 03 00 00 4c 8d ab b0 00 00 00 48 8b ab a0 00
    RSP: 0018:ffffc900010476b8 EFLAGS: 00010207
    RAX: dffffc0000000000 RBX: fffffffffffffff4 RCX: 0000000000000000
    RDX: 0000000000000012 RSI: ffff888002de0000 RDI: 0000000000000094
    RBP: ffff888013c9a000 R08: 0000000000000000 R09: fffffbfff3f9cc6a
    R10: ffffc900010476e8 R11: fffffbfff3f9cc69 R12: 0000000000000001
    R13: 000000000000000a R14: ffff888013c9af54 R15: ffff888013c9ad08
    FS:  00007ffa984ab580(0000) GS:ffff88801fe00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000055a6384136c8 CR3: 000000003bbe6003 CR4: 0000000000770ef0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    PKRU: 55555554
    Call Trace:
     encx24j600_spi_probe drivers/net/ethernet/microchip/encx24j600.c:459
     spi_probe drivers/spi/spi.c:397
     really_probe drivers/base/dd.c:517
     __driver_probe_device drivers/base/dd.c:751
     driver_probe_device drivers/base/dd.c:782
     __device_attach_driver drivers/base/dd.c:899
     bus_for_each_drv drivers/base/bus.c:427
     __device_attach drivers/base/dd.c:971
     bus_probe_device drivers/base/bus.c:487
     device_add drivers/base/core.c:3364
     __spi_add_device drivers/spi/spi.c:599
     spi_add_device drivers/spi/spi.c:641
     spi_new_device drivers/spi/spi.c:717
     new_device_store+0x18c/0x1f1 [spi_stub 4e02719357f1ff33f5a43d00630982840568e85e]
     dev_attr_store drivers/base/core.c:2074
     sysfs_kf_write fs/sysfs/file.c:139
     kernfs_fop_write_iter fs/kernfs/file.c:300
     new_sync_write fs/read_write.c:508 (discriminator 4)
     vfs_write fs/read_write.c:594
     ksys_write fs/read_write.c:648
     do_syscall_64 arch/x86/entry/common.c:50
     entry_SYSCALL_64_after_hwframe arch/x86/entry/entry_64.S:113
    
    Add error check in devm_regmap_init_encx24j600 to avoid this situation.
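    
    A userspace model of checking an error-encoded pointer before use is
    sketched below. err_ptr(), is_err() and ptr_err() imitate the kernel's
    ERR_PTR(), IS_ERR() and PTR_ERR(); regmap_init_model() and
    probe_model() are hypothetical. That the failure in the trace was
    -ENOMEM is an assumption, based on RBX holding fffffffffffffff4, which
    is -12 when read as a signed value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, like ERR_PTR(). */
static void *err_ptr(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/* True if the pointer falls in the top MAX_ERRNO addresses, like IS_ERR(). */
static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer, like PTR_ERR(). */
static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

struct regmap { int dummy; };	/* hypothetical placeholder type */
static struct regmap the_map;

/* Hypothetical init routine that can fail with an error pointer. */
static struct regmap *regmap_init_model(int simulate_oom)
{
	return simulate_oom ? err_ptr(-ENOMEM) : &the_map;
}

/* Probe path: check the result before anything dereferences it. */
static int probe_model(int simulate_oom)
{
	struct regmap *map = regmap_init_model(simulate_oom);

	if (is_err(map))
		return (int)ptr_err(map);
	return 0;
}
```

    Without the is_err() check, later code would treat the encoded error
    value as a valid address and fault when accessing a field through it.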
    
    Fixes: 04fbfce ("net: Microchip encx24j600 driver")
    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: Nanyong Sun <sunnanyong@huawei.com>
    Link: https://lore.kernel.org/r/20211012125901.3623144-1-sunnanyong@huawei.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    sunnanyong authored and Jakub Kicinski committed Oct 13, 2021
  2. Merge tag 'mlx5-fixes-2021-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
    
    Saeed Mahameed says:
    
    ====================
    mlx5 fixes 2021-10-12
    
    * tag 'mlx5-fixes-2021-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
      net/mlx5e: Fix division by 0 in mlx5e_select_queue for representors
      net/mlx5e: Mutually exclude RX-FCS and RX-port-timestamp
      net/mlx5e: Switchdev representors are not vlan challenged
      net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path
      net/mlx5e: Allow only complete TXQs partition in MQPRIO channel mode
      net/mlx5: Fix cleanup of bridge delayed work
    ====================
    
    Link: https://lore.kernel.org/r/20211012205323.20123-1-saeed@kernel.org
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Oct 13, 2021
  3. net: korina: select CRC32

    Fix the following build/link error by adding a dependency on the CRC32
    routines:
    
      ld: drivers/net/ethernet/korina.o: in function `korina_multicast_list':
      korina.c:(.text+0x1af): undefined reference to `crc32_le'
    
    Fixes: ef11291 ("Add support the Korina (IDT RC32434) Ethernet MAC")
    Cc: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Acked-by: Florian fainelli <f.fainelli@gmail.com>
    Link: https://lore.kernel.org/r/20211012152509.21771-1-vegard.nossum@oracle.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vegard authored and Jakub Kicinski committed Oct 13, 2021
  4. net: arc: select CRC32

    Fix the following build/link error by adding a dependency on the CRC32
    routines:
    
      ld: drivers/net/ethernet/arc/emac_main.o: in function `arc_emac_set_rx_mode':
      emac_main.c:(.text+0xb11): undefined reference to `crc32_le'
    
    The crc32_le() call comes through the ether_crc_le() call in
    arc_emac_set_rx_mode().
    
    [v2: moved the select to ARC_EMAC_CORE; the Makefile is a bit confusing,
    but the error comes from emac_main.o, which is part of the arc_emac module,
    which in turn is enabled by CONFIG_ARC_EMAC_CORE. Note that arc_emac is
    different from emac_arc...]
    
    Fixes: 775dd68 ("arc_emac: implement promiscuous mode and multicast filtering")
    Cc: Arnd Bergmann <arnd@arndb.de>
    Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
    Link: https://lore.kernel.org/r/20211012093446.1575-1-vegard.nossum@oracle.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vegard authored and Jakub Kicinski committed Oct 13, 2021
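
The link errors above come from the driver hashing multicast addresses with a little-endian CRC-32. As a rough user-space illustration of what that routine computes, here is a minimal bitwise sketch. The names `sketch_crc32_le` and `sketch_ether_crc_le` are hypothetical stand-ins, not the kernel's table-driven implementation, and the sketch assumes the usual reflected Ethernet polynomial with an all-ones seed and no final inversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected form of the Ethernet CRC-32 polynomial 0x04C11DB7. */
#define SKETCH_CRC32_POLY_LE 0xEDB88320u

/*
 * Hypothetical stand-in for a little-endian CRC-32 routine: processes the
 * data LSB-first, one bit at a time, and does not invert the final value.
 */
static uint32_t sketch_crc32_le(uint32_t crc, const unsigned char *p, size_t len)
{
	assert(p != NULL || len == 0);

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1u) ? SKETCH_CRC32_POLY_LE : 0u);
	}
	return crc;
}

/* Assumed shape of the Ethernet helper: seed the CRC with all ones. */
static uint32_t sketch_ether_crc_le(size_t len, const unsigned char *data)
{
	return sketch_crc32_le(0xFFFFFFFFu, data, len);
}
```

A multicast filter would typically take a few bits of this value as a hash-table index. Which bits are used is hardware specific and not stated in the commit message.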
  5. Merge tag 'modules-for-v5.15-rc6' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/jeyu/linux
    
    Pull modules fix from Jessica Yu:
    
     - Build fix for cfi_init() when CONFIG_MODULE_UNLOAD=n
    
    * tag 'modules-for-v5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
      module: fix clang CFI with MODULE_UNLOAD=n
    torvalds committed Oct 13, 2021
  6. Merge branch 'felix-dsa-driver-fixes'

    Vladimir Oltean says:
    
    ====================
    Felix DSA driver fixes
    
    This is an assorted collection of fixes for issues seen on the NXP
    LS1028A switch.
    
    - PTP packet drops due to switch congestion result in catastrophic
      damage to the driver's state
    - loops are not blocked by STP if using the ocelot-8021q tagger
    - driver uses the wrong CPU port when two of them are defined in DT
    - module autoloading is broken with both tagging protocol drivers
      (ocelot and ocelot-8021q)
    
    Changes in v2:
    - Stop printing that we aren't going to take TX timestamps if we don't
      have TX timestamping anyway, and we are just carrying PTP frames for a
      cascaded DSA switch.
    - Shorten the deferred xmit kthread name so that it fits the 16
      character limit (TASK_COMM_LEN)
    ====================
    
    Link: https://lore.kernel.org/r/20211012114044.2526146-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Oct 13, 2021
  7. net: dsa: felix: break at first CPU port during init and teardown

    The NXP LS1028A switch has two Ethernet ports towards the CPU, but only
    one of them is capable of acting as an NPI port at a time (inject and
    extract packets using DSA tags).
    
    However, using the alternative ocelot-8021q tagging protocol, it should
    be possible to use both CPU ports symmetrically, but for that we need to
    mark both ports in the device tree as DSA masters.
    
    In the process of doing that, it can be seen that traffic to/from the
    network stack gets broken, and this is because the Felix driver iterates
    through all DSA CPU ports and configures them as NPI ports. But since
    there can only be a single NPI port, we effectively end up in a
    situation where DSA thinks the default CPU port is the first one, but
    the hardware port configured to be an NPI is the last one.
    
    I would like to treat this as a bug, because if the updated device trees
    are going to start circulating, it would be really good for existing
    kernels to support them, too.
    
    Fixes: adb3dcc ("net: dsa: felix: convert to the new .change_tag_protocol DSA API")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
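
The mismatch described above can be illustrated with a small user-space sketch. It is not the Felix driver code: the port array and the function names are hypothetical. It only contrasts "configure every CPU port as NPI, so the last one wins" with "break at the first CPU port".

```c
#include <assert.h>
#include <stddef.h>

enum sketch_port_type { SKETCH_PORT_USER, SKETCH_PORT_CPU };

/* Old behaviour: each CPU port overwrites the choice, so the last one wins. */
static int sketch_npi_port_last(const enum sketch_port_type *ports, size_t n)
{
	int npi = -1;

	assert(ports != NULL);
	for (size_t i = 0; i < n; i++)
		if (ports[i] == SKETCH_PORT_CPU)
			npi = (int)i;
	return npi;
}

/* Fixed behaviour: stop at the first CPU port, matching DSA's default. */
static int sketch_npi_port_first(const enum sketch_port_type *ports, size_t n)
{
	int npi = -1;

	assert(ports != NULL);
	for (size_t i = 0; i < n; i++) {
		if (ports[i] == SKETCH_PORT_CPU) {
			npi = (int)i;
			break;
		}
	}
	return npi;
}
```

With two CPU ports in the array, the two functions return different indices, which is the disagreement between DSA and the hardware that the commit removes.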
  8. net: dsa: tag_ocelot_8021q: fix inability to inject STP BPDUs into BL…

    …OCKING ports
    
    When setting up a bridge with stp_state 1, topology changes are not
    detected and loops are not blocked. This is because the standard way of
    transmitting a packet, based on VLAN IDs redirected by VCAP IS2 to the
    right egress port, does not override the port STP state (in the case of
    Ocelot switches, that's really the PGID_SRC masks).
    
    To force a packet to be injected into a port that's BLOCKING, we must
    send it as a control packet, which means in the case of this tagger to
    send it using the manual register injection method. We already do this
    for PTP frames; extend the logic to apply to any link-local MAC DA.
    
    Fixes: 7c83a7c ("net: dsa: add a second tagger for Ocelot switches based on tag_8021q")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
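
The decision described above can be sketched in user space as follows. The helper names are hypothetical, and the sketch assumes that "link-local MAC DA" means the IEEE 802.1D reserved range 01:80:C2:00:00:00 through 01:80:C2:00:00:0F, which includes the STP BPDU address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* True for 01:80:C2:00:00:0X, the reserved bridge group address range. */
static bool sketch_is_link_local_mac(const uint8_t addr[6])
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

	assert(addr != NULL);
	return memcmp(addr, prefix, sizeof(prefix)) == 0 &&
	       (addr[5] & 0xf0) == 0;
}

enum sketch_xmit_path {
	SKETCH_XMIT_VLAN,		/* normal path, subject to STP state */
	SKETCH_XMIT_REGISTER_INJECTION,	/* control path, bypasses STP state */
};

/* PTP frames already used register injection; link-local frames join them. */
static enum sketch_xmit_path sketch_pick_xmit_path(const uint8_t dest[6],
						   bool is_ptp)
{
	if (is_ptp || sketch_is_link_local_mac(dest))
		return SKETCH_XMIT_REGISTER_INJECTION;
	return SKETCH_XMIT_VLAN;
}
```

Ordinary unicast traffic keeps using the VLAN-based path, so only control frames pay the cost of register injection.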
  9. net: dsa: felix: purge skb from TX timestamping queue if it cannot be…

    … sent
    
    At present, when a PTP packet which requires TX timestamping gets
    dropped under congestion by the switch, things go downhill very fast.
    The driver keeps a clone of that skb in a queue of packets awaiting TX
    timestamp interrupts, but interrupts will never be raised for the
    dropped packets.
    
    Moreover, matching timestamped packets to timestamps is done by a 2-bit
    timestamp ID, and this can wrap around and we can match on the wrong skb.
    
    Since with the default NPI-based tagging protocol, we get no notification
    about packet drops, the best we can do is eventually recover from the
    drop of a PTP frame: its skb will be dead memory until another skb which
    was assigned the same timestamp ID happens to find it.
    
    However, with the ocelot-8021q tagger which injects packets using the
    manual register interface, it appears that we can check for more
    information, such as:
    
    - whether the input queue has reached the high watermark or not
    - whether the injection group's FIFO can accept additional data or not
    
    so we know that a PTP frame is likely to get dropped before actually
    sending it, and drop it ourselves (because DSA uses NETIF_F_LLTX, so it
    can't return NETDEV_TX_BUSY to ask the qdisc to requeue the packet).
    
    But when we do that, we can also remove the skb from the timestamping
    queue, because there surely won't be any timestamp that matches it.
    
    Fixes: 0a6f17c ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
  10. net: dsa: tag_ocelot_8021q: break circular dependency with ocelot swi…

    …tch lib
    
    Michael reported that when using the "ocelot-8021q" tagging protocol,
    the switch driver module must be manually loaded before the tagging
    protocol can be loaded/is available.
    
    This appears to be the same problem described here:
    https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/
    where due to the fact that DSA tagging protocols make use of symbols
    exported by the switch drivers, circular dependencies appear and this
    breaks module autoloading.
    
    The ocelot_8021q driver needs the ocelot_can_inject() and
    ocelot_port_inject_frame() functions from the switch library. Previously
    the wrong approach was taken to solve that dependency: shims were
    provided for the case where the ocelot switch library was compiled out,
    but that turns out to be insufficient, because the dependency when the
    switch lib _is_ compiled is problematic too.
    
    We cannot declare ocelot_can_inject() and ocelot_port_inject_frame() as
    static inline functions, because these access I/O functions like
    __ocelot_write_ix() which is called by ocelot_write_rix(). Making those
    static inline basically means exposing the whole guts of the ocelot
    switch library, not ideal...
    
    We already have one tagging protocol driver which calls into the switch
    driver during xmit but not using any exported symbol: sja1105_defer_xmit.
    We can do the same thing here: create a kthread worker and one work item
    per skb, and let the switch driver itself do the register accesses to
    send the skb, and then consume it.
    
    Fixes: 0a6f17c ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping")
    Reported-by: Michael Walle <michael@walle.cc>
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
  11. net: dsa: tag_ocelot: break circular dependency with ocelot switch li…

    …b driver
    
    As explained here:
    https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/
    DSA tagging protocol drivers cannot depend on symbols exported by switch
    drivers, because this creates a circular dependency that breaks module
    autoloading.
    
    The tag_ocelot.c file depends on the ocelot_ptp_rew_op() function
    exported by the common ocelot switch lib. This function looks at
    OCELOT_SKB_CB(skb) and computes how to populate the REW_OP field of the
    DSA tag, for PTP timestamping (the command: one-step/two-step, and the
    TX timestamp identifier).
    
    None of that requires deep insight into the driver, it is quite
    stateless, as it only depends upon the skb->cb. So let's make it a
    static inline function and put it in include/linux/dsa/ocelot.h, a
    file that despite its name is used by the ocelot switch driver for
    populating the injection header too - since commit 40d3f29 ("net:
    mscc: ocelot: use common tag parsing code with DSA").
    
    With that function declared as static inline, its body is expanded
    inside each call site, so the dependency is broken and the DSA tagger
    can be built without the switch library, upon which the felix driver
    depends.
    
    Fixes: 39e5308 ("net: mscc: ocelot: support PTP Sync one-step timestamping")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
  12. net: mscc: ocelot: cross-check the sequence id from the timestamp FIF…

    …O with the skb PTP header
    
    The sad reality is that when a PTP frame with a TX timestamping request
    is transmitted, it isn't guaranteed that it will make it all the way to
    the wire (due to congestion inside the switch), and that a timestamp
    will be taken by the hardware and placed in the timestamp FIFO where an
    IRQ will be raised for it.
    
    The implication is that if enough PTP frames are silently dropped by the
    hardware such that the timestamp ID has rolled over, it is possible to
    match a timestamp to an old skb.
    
    Furthermore, nobody will match on the real skb corresponding to this
    timestamp, since we stupidly matched on a previous one that was stale in
    the queue, and stopped there.
    
    So PTP timestamping will be broken and there will be no way to recover.
    
    It looks like the hardware parses the sequenceID from the PTP header,
    and also provides that metadata for each timestamp. The driver currently
    ignores this, but it shouldn't.
    
    As an extra resiliency measure, do the following:
    
    - check whether the PTP sequenceID also matches between the skb and the
      timestamp, treat the skb as stale otherwise and free it
    
    - if we see a stale skb, don't stop there and try to match an skb one
      more time, chances are there's one more skb in the queue with the same
      timestamp ID, otherwise we wouldn't have ever found the stale one (it
      is by timestamp ID that we matched it).
    
    While this does not prevent PTP packet drops, it at least prevents
    the catastrophic consequences of incorrect timestamp matching.
    
    Since we already call ptp_classify_raw in the TX path, save the result
    in the skb->cb of the clone, and just use that result in the interrupt
    code path.
    
    Fixes: 4e3b046 ("net: mscc: PTP Hardware Clock (PHC) support")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
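
The matching rule described above might look roughly like this in a user-space model. The struct and function names are hypothetical, and a fixed array stands in for the skb queue. Where the commit message says to try matching one more time after a stale skb, the sketch simply keeps scanning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One queued clone awaiting a TX timestamp (hypothetical layout). */
struct sketch_pending {
	bool in_use;
	uint8_t ts_id;		/* timestamp ID written to the frame */
	uint16_t seqid;		/* PTP sequenceID saved at transmit time */
};

/*
 * Returns the index of the entry matching both ts_id and seqid, or -1.
 * Entries with the right ts_id but the wrong seqid are treated as stale:
 * they are released and the search continues.
 */
static int sketch_match_tx_timestamp(struct sketch_pending *q, size_t n,
				     uint8_t ts_id, uint16_t seqid)
{
	assert(q != NULL);

	for (size_t i = 0; i < n; i++) {
		if (!q[i].in_use || q[i].ts_id != ts_id)
			continue;

		q[i].in_use = false;	/* dequeued whether fresh or stale */
		if (q[i].seqid == seqid)
			return (int)i;
		/* Stale entry: drop it and keep looking for the real one. */
	}
	return -1;
}
```

Without the sequenceID check, the first entry with the same timestamp ID would be returned and the real one would stay in the queue forever.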
  13. net: mscc: ocelot: deny TX timestamping of non-PTP packets

    It appears that Ocelot switches cannot timestamp non-PTP frames.
    I tested this using the isochron program at:
    https://github.com/vladimiroltean/tsn-scripts
    
    with the result that the driver increments the ocelot_port->ts_id
    counter as expected, puts it in the REW_OP, but the hardware seems to
    not timestamp these packets at all, since no IRQ is emitted.
    
    Therefore check whether we are sending PTP frames, and refuse to
    populate REW_OP otherwise.
    
    Fixes: 4e3b046 ("net: mscc: PTP Hardware Clock (PHC) support")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
  14. net: mscc: ocelot: warn when a PTP IRQ is raised for an unknown skb

    When skb_match is NULL, it means we received a PTP IRQ for a timestamp
    ID that the kernel has no idea about, since there is no skb in the
    timestamping queue with that timestamp ID.
    
    This is a grave error and not something to just "continue" over.
    So print a big warning in case this happens.
    
    Also, move the check above ocelot_get_hwtimestamp(), there is no point
    in reading the full 64-bit current PTP time if we're not going to do
    anything with it anyway for this skb.
    
    Fixes: 4e3b046 ("net: mscc: PTP Hardware Clock (PHC) support")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
  15. net: mscc: ocelot: avoid overflowing the PTP timestamp FIFO

    TX timestamps for PTP packets with 2-step timestamping requests are
    matched to packets based on the egress port number and a 6-bit timestamp
    identifier. All PTP timestamps are held in a common FIFO that is 128
    entries deep.
    
    This patch ensures that back-to-back timestamping requests cannot exceed
    the hardware FIFO capacity. If a request would exceed it, simply send the
    packet without requesting a TX timestamp to be taken (in the case of felix,
    since the DSA API has a void return code in ds->ops->port_txtstamp) or
    drop it (in the case of ocelot).
    
    I've moved the ts_id_lock from a per-port basis to a per-switch basis,
    because we need separate accounting for both numbers of PTP frames in
    flight. And since we need locking to inc/dec the per-switch counter,
    that also offers protection for the per-port counter and hence there is
    no reason to have a per-port lock anymore.
    
    Fixes: 4e3b046 ("net: mscc: PTP Hardware Clock (PHC) support")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
  16. net: mscc: ocelot: make use of all 63 PTP timestamp identifiers

    At present, there is a problem when user space bombards a port with PTP
    event frames which have TX timestamping requests (or when a tc-taprio
    offload is installed on a port, which delays the TX timestamps by a
    significant amount of time). The driver will happily roll over the 2-bit
    timestamp ID and this will cause incorrect matches between an skb and
    the TX timestamp collected from the FIFO.
    
    The Ocelot switches have a 6-bit PTP timestamp identifier, and the value
    63 is reserved, so that leaves identifiers 0-62 to be used.
    
    The timestamp identifiers are selected by the REW_OP packet field, and
    are actually shared between CPU-injected frames and frames which match a
    VCAP IS2 rule that modifies the REW_OP. The hardware supports
    partitioning between the two uses of the REW_OP field through the
    PTP_ID_LOW and PTP_ID_HIGH registers, and by default reserves the PTP
    IDs 0-3 for CPU-injected traffic and the rest for VCAP IS2.
    
    The driver does not use VCAP IS2 to set REW_OP for 2-step timestamping,
    and it also writes 0xffffffff to both PTP_ID_HIGH and PTP_ID_LOW in
    ocelot_init_timestamp() which makes all timestamp identifiers available
    to CPU injection.
    
    Therefore, we can make use of all 63 timestamp identifiers, which should
    allow more timestampable packets to be in flight on each port. This is
    only part of the solution, more issues will be addressed in future changes.
    
    Fixes: 4e3b046 ("net: mscc: PTP Hardware Clock (PHC) support")
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
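
The difference between the old and new identifier ranges comes down to where the counter wraps. The sketch below uses hypothetical names and models only the arithmetic: a 2-bit ID cycles through 4 values, while the full range cycles through identifiers 0 to 62 and never hands out the reserved value 63.

```c
#include <assert.h>
#include <stdint.h>

/* Identifiers 0..62 are usable; 63 is reserved by the hardware. */
#define SKETCH_MAX_PTP_ID 63

/* Old scheme: a 2-bit identifier, so only 4 frames can be told apart. */
static uint8_t sketch_next_ts_id_2bit(uint8_t cur)
{
	return (uint8_t)((cur + 1) % 4);
}

/* New scheme: wrap back to 0 before reaching the reserved value. */
static uint8_t sketch_next_ts_id(uint8_t cur)
{
	assert(cur < SKETCH_MAX_PTP_ID);

	cur++;
	if (cur == SKETCH_MAX_PTP_ID)
		cur = 0;
	return cur;
}
```

A larger range makes an accidental match against a stale skb much less likely, but it does not rule it out, which is why the other changes in this series are still needed.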
  17. Merge branch 'fix-circular-dependency-between-sja1105-and-tag_sja1105'

    Vladimir Oltean says:
    
    ====================
    Fix circular dependency between sja1105 and tag_sja1105
    
    As discussed here:
    https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/
    DSA tagging protocols cannot use symbols exported by switch drivers.
    
    Eliminate the two instances of that from tag_sja1105, and that allows us
    to have a working setup with modules again.
    ====================
    
    Re-applying to net, this was mistakenly applied to net-next,
    see first Link.
    
    Link: https://lore.kernel.org/r/20211012114044.2526146-1-vladimir.oltean@nxp.com/
    Link: https://lore.kernel.org/r/20210922143726.2431036-1-vladimir.oltean@nxp.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Jakub Kicinski committed Oct 13, 2021
  18. net: dsa: sja1105: break dependency between dsa_port_is_sja1105 and s…

    …witch driver
    
    It's nice to be able to test a tagging protocol with dsa_loop, but not
    at the cost of losing the ability of building the tagging protocol and
    switch driver as modules, because as things stand, there is a circular
    dependency between the two. Tagging protocol drivers cannot depend on
    switch drivers, that is a hard fact.
    
    The reasoning behind the blamed patch was that accessing dp->priv should
    first make sure that the structure behind that pointer is what we really
    think it is.
    
    Currently the "sja1105" and "sja1110" tagging protocols only operate
    with the sja1105 switch driver, just like any other tagging protocol and
    switch combination. The only way to mix and match them is by modifying
    the code, and this applies to dsa_loop as well (by default that uses
    DSA_TAG_PROTO_NONE). So while in principle there is an issue, in
    practice there isn't one.
    
    Until we extend dsa_loop to allow user space configuration, treat the
    problem as a non-issue and just say that DSA ports found by tag_sja1105
    are always sja1105 ports, which is in fact true. But keep the
    dsa_port_is_sja1105 function so that it's easy to patch it during
    testing, and rely on dead code elimination.
    
    Fixes: 994d2cb ("net: dsa: tag_sja1105: be dsa_loop-safe")
    Link: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021
  19. net: dsa: move sja1110_process_meta_tstamp inside the tagging protoco…

    …l driver
    
    The problem is that DSA tagging protocols really must not depend on the
    switch driver, because this creates a circular dependency at insmod
    time, and the switch driver will effectively not load when the tagging
    protocol driver is missing.
    
    The code was structured in the way it was for a reason, though. The DSA
    driver-facing API for PTP timestamping relies on the assumption that
    two-step TX timestamps are provided by the hardware in an out-of-band
    manner, typically by raising an interrupt and making that timestamp
    available inside some sort of FIFO which is to be accessed over
    SPI/MDIO/etc.
    
    So the API puts .port_txtstamp into dsa_switch_ops, because it is
    expected that the switch driver needs to save some state (like put the
    skb into a queue until its TX timestamp arrives).
    
    On SJA1110, TX timestamps are provided by the switch as Ethernet
    packets, so this makes them be received and processed by the tagging
    protocol driver. This in itself is great, because the timestamps are
    full 64-bit and do not require reconstruction, and since Ethernet is the
    fastest I/O method available to/from the switch, PTP timestamps arrive
    very quickly, no matter how bottlenecked the SPI connection is, because
    SPI interaction is not needed at all.
    
    DSA's code structure and strict isolation between the tagging protocol
    driver and the switch driver break the natural code organization.
    
    When the tagging protocol driver receives a packet which is classified
    as a metadata packet containing timestamps, it passes those timestamps
    one by one to the switch driver, which then proceeds to compare them
    based on the recorded timestamp ID that was generated in .port_txtstamp.
    
    The communication between the tagging protocol and the switch driver is
    done through a method exported by the switch driver, sja1110_process_meta_tstamp.
    To satisfy build requirements, we force a dependency to build the
    tagging protocol driver as a module when the switch driver is a module.
    However, as explained in the first paragraph, that causes the circular
    dependency.
    
    To solve this, move the skb queue from struct sja1105_private :: struct
    sja1105_ptp_data to struct sja1105_private :: struct sja1105_tagger_data.
    The latter is a data structure for which hacks have already been put
    into place to be able to create persistent storage per switch that is
    accessible from the tagging protocol driver (see sja1105_setup_ports).
    
    With the skb queue directly accessible from the tagging protocol driver,
    we can now move sja1110_process_meta_tstamp into the tagging driver
    itself, and avoid exporting a symbol.
    
    Fixes: 566b18c ("net: dsa: sja1105: implement TX timestamping for SJA1110")
    Link: https://lore.kernel.org/netdev/20210908220834.d7gmtnwrorhharna@skbuf/
    Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    vladimiroltean authored and Jakub Kicinski committed Oct 13, 2021

Commits on Oct 12, 2021

  1. net: dsa: fix spurious error message when unoffloaded port leaves bridge

    Flip the sign of a return value check, thereby suppressing the following
    spurious error:
    
      port 2 failed to notify DSA_NOTIFIER_BRIDGE_LEAVE: -EOPNOTSUPP
    
    ... which is emitted when removing an unoffloaded DSA switch port from a
    bridge.
    
    Fixes: d371b7c ("net: dsa: Unset vlan_filtering when ports leave the bridge")
    Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
    Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
    Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
    Link: https://lore.kernel.org/r/20211012112730.3429157-1-alvin@pqrs.dk
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    sipraga authored and Jakub Kicinski committed Oct 12, 2021
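
The actual diff is not shown here, so the following is only one plausible shape of a sign error in a return value check. The function names are hypothetical. Kernel functions return negative error codes, so comparing against the positive constant never matches and the "not supported" case is reported as a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Buggy form: err is negative, so it never equals positive EOPNOTSUPP. */
static bool sketch_should_report_buggy(int err)
{
	return err && err != EOPNOTSUPP;
}

/* Fixed form: "operation not supported" is treated as benign. */
static bool sketch_should_report_fixed(int err)
{
	return err && err != -EOPNOTSUPP;
}
```

With the fixed check, an unoffloaded port leaving a bridge no longer produces an error message, while real failures such as -EINVAL are still reported.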
  2. nfp: flow_offload: move flow_indr_dev_register from app init to app s…

    …tart
    
    In commit 74fc4f8 ("net: Fix offloading indirect devices dependency
    on qdisc order creation"), a step was added that triggers the callback
    to set up the bo callback when the driver registers a callback.
    
    In our current implementation, we are not ready to run the callback when
    nfp calls flow_indr_dev_register(), which results in the following
    error:
    
    kernel: Oops: 0000 [#1] SMP PTI
    kernel: CPU: 0 PID: 14119 Comm: kworker/0:0 Tainted: G
    kernel: Workqueue: events work_for_cpu_fn
    kernel: RIP: 0010:nfp_flower_indr_setup_tc_cb+0x258/0x410
    kernel: RSP: 0018:ffffbc1e02c57bf8 EFLAGS: 00010286
    kernel: RAX: 0000000000000000 RBX: ffff9c761fabc000 RCX: 0000000000000001
    kernel: RDX: 0000000000000001 RSI: fffffffffffffff0 RDI: ffffffffc0be9ef1
    kernel: RBP: ffffbc1e02c57c58 R08: ffffffffc08f33aa R09: ffff9c6db7478800
    kernel: R10: 0000009c003f6e00 R11: ffffbc1e02800000 R12: ffffbc1e000d9000
    kernel: R13: ffffbc1e000db428 R14: ffff9c6db7478800 R15: ffff9c761e884e80
    kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    kernel: CR2: fffffffffffffff0 CR3: 00000009e260a004 CR4: 00000000007706f0
    kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    kernel: PKRU: 55555554
    kernel: Call Trace:
    kernel: ? flow_indr_dev_register+0xab/0x210
    kernel: ? __cond_resched+0x15/0x30
    kernel: ? kmem_cache_alloc_trace+0x44/0x4b0
    kernel: ? nfp_flower_setup_tc+0x1d0/0x1d0 [nfp]
    kernel: flow_indr_dev_register+0x158/0x210
    kernel: ? tcf_block_unbind+0xe0/0xe0
    kernel: nfp_flower_init+0x40b/0x650 [nfp]
    kernel: nfp_net_pci_probe+0x25f/0x960 [nfp]
    kernel: ? nfp_rtsym_read_le+0x76/0x130 [nfp]
    kernel: nfp_pci_probe+0x6a9/0x820 [nfp]
    kernel: local_pci_probe+0x45/0x80
    
    So we need to call flow_indr_dev_register() in the app start process
    instead of at the init stage.
    
    Fixes: 74fc4f8 ("net: Fix offloading indirect devices dependency on qdisc order creation")
    Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
    Signed-off-by: Simon Horman <simon.horman@corigine.com>
    Signed-off-by: Louis Peens <louis.peens@corigine.com>
    Link: https://lore.kernel.org/r/20211012124850.13025-1-louis.peens@corigine.com
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    zhengbaowen authored and Jakub Kicinski committed Oct 12, 2021