MAC filter failures #14

bzfbd · 2022-06-30T21:01:35Z

dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3

Just got this during bootup. Not seen it before.


bzfbd · 2023-01-31T21:43:20Z

While talking I realized that on the Ten64 I see 7 of them, all on dpaa2_ni0; could it be that we try to do something on the N=0 interface for all of them instead of N=i?

dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
Starting Network: lo0 dpni0 dpni1 dpni2 dpni3 dpni4 dpni5 dpni6 dpni7 dpni8 dpni9.

dsalychev · 2023-01-31T21:53:26Z

hm, I'll have to check on my own first

bzfbd · 2023-01-31T22:30:39Z

That seems nonsensical; we only call the ioctl on the interface that is UP, so these are likely correct calls... but looking quickly at the call path I cannot spot where the error value 3 comes from either...

dsalychev · 2023-02-02T18:43:10Z

@bzfbd Do you see the same errors on CURRENT? I've tried to netboot 17c8213559cd and got only:

# dmesg | grep dpaa2 | grep fail
dpaa2_mcp27: dpaa2_mcp_attach: failed to reset DPMCP: id=1, error=6
dpaa2_mcp27: dpaa2_mcp_attach: failed to allocate resources
dpaa2_mcp27: dpaa2_mcp_attach: failed to allocate resources

bzfbd · 2023-02-04T20:01:42Z

Yes, CURRENT and CURRENT + your latest patch both show the same problem on the Ten64 for me. Should I go and check on the HoneyComb?

dsalychev · 2023-02-07T15:02:52Z

I have never seen those errors on HoneyComb. I down-clocked my Ten64 to 1.2 GHz some time ago:

U-Boot 2020.07-rc1-g2c6f1f4e (Jun 21 2022 - 11:26:02 +0000)

SoC:  LS1088AE Rev1.0 (0x87030010)
Clock Configuration:
       CPU0(A53):1200 MHz  CPU1(A53):1200 MHz  CPU2(A53):1200 MHz  
       CPU3(A53):1200 MHz  CPU4(A53):1200 MHz  CPU5(A53):1200 MHz  
       CPU6(A53):1200 MHz  CPU7(A53):1200 MHz  
       Bus:      500  MHz  DDR:      1600 MT/s
Reset Configuration Word (RCW):
       00000000: 30004014 00000030 00000000 00000000
       00000010: 00000000 000a0000 00300000 00000000
       00000020: 010011a0 00002580 00000000 00000000
       00000030: 013fe60a 00000000 00002c03 00000000
       00000040: 00000000 00000000 00000000 00000000
       00000050: 00000000 00000000 00000000 00000000
       00000060: 00000000 00000000 00000089 000009e7
       00000070: 44110000 0d007755
DRAM:  7.9 GiB
DDR    7.9 GiB (DDR4, 64-bit, CL=11, ECC on)
	7.4 GiB available for userspace

It might be a result of a race condition somewhere in the DPAA2 drivers, I guess.

bzfbd · 2023-02-14T11:25:11Z

Looks like there is a CLOSE happening somewhere? The XXX-BZ line is printed in the error cases at the end of dpaa2_rc_exec_cmd():

dpaa2_ni0: dpaa2_ni_ioctl: cmd ADDMULTI ifp dpni0
dpaa2_ni0: dpaa2_ni_ioctl: cmd DELMULTI ifp dpni0
XXX-BZ dpaa2_rc_exec_cmd:3253 mcp 0xffffa000001187c0 cmd 0xffffa00000118e00 cmdid 0x2281 [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: cmd ADDMULTI ifp dpni0
XXX-BZ dpaa2_rc_exec_cmd:3253 mcp 0xffffa000001187c0 cmd 0xffffa00000118e00 cmdid 0x2281 [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: cmd ADDMULTI ifp dpni0
XXX-BZ dpaa2_rc_exec_cmd:3253 mcp 0xffffa000001187c0 cmd 0xffffa00000118e00 cmdid 0x2281 [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: cmd ADDMULTI ifp dpni0
XXX-BZ dpaa2_rc_exec_cmd:3253 mcp 0xffffa000001187c0 cmd 0xffffa00000118e00 cmdid 0x2281 [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: cmd ADDMULTI ifp dpni0
XXX-BZ dpaa2_rc_exec_cmd:3253 mcp 0xffffa000001187c0 cmd 0xffffa00000118e00 cmdid 0x2281 [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: cmd ADDMULTI ifp dpni0
XXX-BZ dpaa2_rc_exec_cmd:3253 mcp 0xffffa000001187c0 cmd 0xffffa00000118e00 cmdid 0x2281 [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: cmd ADDMULTI ifp dpni0
XXX-BZ dpaa2_rc_exec_cmd:3253 mcp 0xffffa000001187c0 cmd 0xffffa00000118e00 cmdid 0x2281 [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3

bzfbd · 2023-02-14T18:29:25Z

So what I noticed is that we (re-)use sc->cmd a lot but in a lot of places not under lock. Why is it that we keep using a single cmd storage rather than something on the stack?

bzfbd · 2023-02-14T18:48:02Z

Next bit:

c is a copy of the command before send off; cmd is what comes back ... These headers don't match up...

lo0: link state changed to UP
XXX-BZ dpaa2_rc_exec_cmd:3257 mcp 0xffffa000001187c0 cmd 0xffff00004039b798 cmdid 0x2281 c [ header: 0x2281020d000100d0 0: 0x2 1: 0 2: 0 3: 0 4: 0 5: 0 6: 0 ] cmd [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
XXX-BZ dpaa2_rc_exec_cmd:3257 mcp 0xffffa000001187c0 cmd 0xffff000132dd1298 cmdid 0x2281 c [ header: 0x2281020d000100d0 0: 0x2 1: 0 2: 0 3: 0 4: 0 5: 0 6: 0 ] cmd [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
XXX-BZ dpaa2_rc_exec_cmd:3257 mcp 0xffffa000001187c0 cmd 0xffff000132dd0f28 cmdid 0x2281 c [ header: 0x2281020d000100d0 0: 0x2 1: 0 2: 0 3: 0 4: 0 5: 0 6: 0 ] cmd [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
XXX-BZ dpaa2_rc_exec_cmd:3257 mcp 0xffffa000001187c0 cmd 0xffff000132dd0f28 cmdid 0x2281 c [ header: 0x2281020d000100d0 0: 0x2 1: 0 2: 0 3: 0 4: 0 5: 0 6: 0 ] cmd [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
XXX-BZ dpaa2_rc_exec_cmd:3257 mcp 0xffffa000001187c0 cmd 0xffff000132dd0f28 cmdid 0x2281 c [ header: 0x2281020d000100d0 0: 0x2 1: 0 2: 0 3: 0 4: 0 5: 0 6: 0 ] cmd [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
XXX-BZ dpaa2_rc_exec_cmd:3257 mcp 0xffffa000001187c0 cmd 0xffff000132dd0ee8 cmdid 0x2281 c [ header: 0x2281020d000100d0 0: 0x2 1: 0 2: 0 3: 0 4: 0 5: 0 6: 0 ] cmd [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
XXX-BZ dpaa2_rc_exec_cmd:3257 mcp 0xffffa000001187c0 cmd 0xffff000132dd0f28 cmdid 0x2281 c [ header: 0x2281020d000100d0 0: 0x2 1: 0 2: 0 3: 0 4: 0 5: 0 6: 0 ] cmd [ header: 0x8001026d000300d0 0: 0x5 1: 0x3e8 2: 0x3 3: 0 4: 0 5: 0 6: 0 ]
dpaa2_ni0: dpaa2_ni_update_mac_filters: failed to clear multicast MAC filters: error=3
dpaa2_ni0: dpaa2_ni_ioctl: failed to update MAC filters: error=3
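
To make the mismatch concrete, the two headers from the log above can be split into fields. The following is a minimal sketch, not driver code: it assumes the 64-bit header is laid out, from least to most significant bits, as source ID (8), hardware flags (8), status (8), software flags (8), token (16) and command ID (16), which is the ordering used by the Linux fsl-mc `struct mc_cmd_header`; the thread itself does not state the layout. The names `mc_hdr_fields`, `mc_hdr_decode` and `mc_hdr_print` are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Fields of a 64-bit MC command header (ASSUMED layout, see text above). */
struct mc_hdr_fields {
	uint8_t  srcid;    /* bits  0..7  */
	uint8_t  flags_hw; /* bits  8..15 */
	uint8_t  status;   /* bits 16..23 */
	uint8_t  flags_sw; /* bits 24..31 */
	uint16_t token;    /* bits 32..47 */
	uint16_t cmdid;    /* bits 48..63 */
};

/* Split a raw header value into its assumed fields. */
struct mc_hdr_fields
mc_hdr_decode(uint64_t hdr)
{
	struct mc_hdr_fields f;

	f.srcid    = (uint8_t)(hdr & 0xff);
	f.flags_hw = (uint8_t)((hdr >> 8) & 0xff);
	f.status   = (uint8_t)((hdr >> 16) & 0xff);
	f.flags_sw = (uint8_t)((hdr >> 24) & 0xff);
	f.token    = (uint16_t)((hdr >> 32) & 0xffff);
	f.cmdid    = (uint16_t)((hdr >> 48) & 0xffff);
	return (f);
}

/* Print one decoded header on a single line. */
void
mc_hdr_print(const char *label, uint64_t hdr)
{
	struct mc_hdr_fields f = mc_hdr_decode(hdr);

	printf("%s: cmdid=0x%04x token=0x%04x status=0x%02x\n",
	    label, (unsigned)f.cmdid, (unsigned)f.token, (unsigned)f.status);
}
```

Under that assumption, the copy sent off (`0x2281020d000100d0`) decodes to command ID 0x2281 with token 0x020d, while the header that came back (`0x8001026d000300d0`) decodes to command ID 0x8001, token 0x026d and status 3. The status value equals the `error=3` in the log, and both the command ID and the token differ from what was sent, which would be consistent with the reply belonging to a different command.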

dsalychev · 2023-02-15T09:06:17Z

I wanted to avoid allocating and de-allocating struct dpaa2_cmd wherever it was needed and to remove the calls to DPAA2_CMD_xx_OPEN() in dpaa2_ni.c. Hmm, I considered only the simple use cases (like dpaa2_io.c or dpaa2_bp.c), where the pre-allocated command is used only in the attach/detach functions.

Anyway, the best way to solve it would be to have struct dpaa2_cmd allocated on the stack and initialized with dpaa2_mcp_init_command() without malloc, I think.

This is a first patch of a series to fix a bug #14.
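
The difference between a shared `sc->cmd` and a per-call command can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `struct demo_cmd`, `demo_cmd_init()`, `demo_exec()` and `demo_clear_filters()` are hypothetical stand-ins, the real `struct dpaa2_cmd` and the `dpaa2_mcp_init_command()` signature are not shown in this thread, and the header packing reuses an assumed token/command-ID layout.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_CMD_PARAMS	7

/* Hypothetical stand-in for the driver's command structure. */
struct demo_cmd {
	uint64_t header;
	uint64_t params[DEMO_CMD_PARAMS];
};

/* Hypothetical initializer: clears caller-provided storage, no malloc. */
void
demo_cmd_init(struct demo_cmd *cmd)
{
	memset(cmd, 0, sizeof(*cmd));
}

/*
 * Hypothetical "execute" step. It only fills in the header (assumed
 * layout: token in bits 32..47, command ID in bits 48..63) and always
 * reports success; a real driver would hand the command to firmware.
 */
int
demo_exec(struct demo_cmd *cmd, uint16_t token, uint16_t cmdid)
{
	cmd->header = ((uint64_t)cmdid << 48) | ((uint64_t)token << 32);
	return (0);
}

/*
 * Each call builds its command in its own stack frame, so two callers
 * running at the same time never write to the same storage and no lock
 * is needed to protect the command itself.
 */
int
demo_clear_filters(uint16_t token, uint64_t *hdr_out)
{
	struct demo_cmd cmd;	/* lives on this call's stack */
	int error;

	demo_cmd_init(&cmd);
	error = demo_exec(&cmd, token, 0x2281 /* cmdid seen in the log */);
	*hdr_out = cmd.header;
	return (error);
}
```

With a single command stored in the softc, the same sequence would have to run entirely under a lock; otherwise one caller could overwrite the header while another is still waiting for its reply. The stack variant avoids that shared state at the cost of a small amount of stack space per call.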

dsalychev · 2023-02-27T11:43:46Z

@bzfbd 0746e1f doesn't fail to configure MAC filters for me on Ten64 with default firmware (at 1600 MHz).
EDIT: Kernel still panics under network stress test with iperf3 though.


bzfbd · 2023-07-04T00:05:03Z

I think this was sorted with one of the commits to main [1]; if I see it again I'll re-open otherwise.

[1] https://cgit.freebsd.org/src/commit/sys/dev/dpaa2?id=4cd966142822ce24c12751c863a073a8b7cb9c14


dsalychev added the bug Something isn't working label Jul 9, 2022

dsalychev self-assigned this Aug 1, 2022

dsalychev pushed a commit that referenced this issue Feb 15, 2023

dpaa2: Allocate dpaa2_cmd on stack for DPIO

bc093d5

This is a first patch of a series to fix a bug #14.

bzfbd closed this as completed Jul 4, 2023
