forked from freebsd/freebsd-src
DPAA2: properly set link state on startup #11
Closed
Factor out the link-state change tracking from media_change into a new function, dpaa2_ni_miibus_statchg(). There, use the mii information for the link state, because ifp->if_linkstate only gets updated after other mii functions are called and thus lags behind.

Add a new bus function to memac_mdio so we can set the dpni interface, call MIIBUS_STATCHG on the dpni, and pass that through to dpaa2_ni_miibus_statchg().

It is a bit of a weird setup with two parallel device trees off ACPI, as ideally memac_mdio0 would be a child of dpaa2_ni0. We will see how this works with FDT at some point, but at least the current approach is flexible enough. Currently we need to make sure MC0 can discover the memac_mdio via the ACPI reference upon attach and have a way for DPNI to query that. The new bus function now allows us to set a back-pointer in the other direction from dpaa2_ni_setup().

nexus0 acpi0 dpaa2_mc0 dpaa2_rc0 dpaa2_ni0 memac_mdio0 memacphy0 miibus1 atphy0 memac_mdio1

Most importantly, and not to be missed: at the end of dpaa2_ni_init(), make sure we initialize the link state, as otherwise we will not see any interrupts. That can be observed by a hanging NFS root mount, for example, whereas with the previous code ifconfig would trigger a SIOCGIFMEDIA call (from multi-user at some point), which then indirectly, and more by accident, set the MAC link state via DPAA2_CMD_MAC_SET_LINK_STATE() and got the interface started.
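To make the idea concrete, here is a minimal user-space sketch of the pattern described above: derive the link state from the PHY's media status rather than from the (lagging) ifnet field, push it to the MAC only when it changes, and run that logic once at the end of init. This is not the driver code. All `sk_`/`SK_` names, the simplified softc, and the counter standing in for the MAC set-link-state command are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the "status valid" / "link active" media bits. */
#define SK_IFM_AVALID 0x1u
#define SK_IFM_ACTIVE 0x2u

enum sk_link { SK_LINK_UNKNOWN, SK_LINK_DOWN, SK_LINK_UP };

/* Hypothetical, simplified softc; not the real driver structure. */
struct sk_softc {
	unsigned	mii_media_status;	/* fresh status reported by the PHY */
	enum sk_link	link_state;		/* last state pushed to the MAC */
	int		mac_updates;		/* counts simulated MAC commands */
};

/* Derive the link state from the MII status bits, not from the ifnet. */
enum sk_link
sk_link_from_mii(unsigned status)
{
	if ((status & SK_IFM_AVALID) == 0)
		return (SK_LINK_UNKNOWN);
	return ((status & SK_IFM_ACTIVE) != 0 ? SK_LINK_UP : SK_LINK_DOWN);
}

/* Status-change callback: update the MAC only when the state changed. */
bool
sk_miibus_statchg(struct sk_softc *sc)
{
	enum sk_link now;

	assert(sc != NULL);
	now = sk_link_from_mii(sc->mii_media_status);
	if (now == sc->link_state)
		return (false);
	sc->link_state = now;
	sc->mac_updates++;	/* stands in for the MAC set-link-state command */
	return (true);
}

/* End of init: run the callback once so the initial state is not left unset. */
void
sk_init(struct sk_softc *sc)
{
	assert(sc != NULL);
	sc->link_state = SK_LINK_UNKNOWN;
	sc->mac_updates = 0;
	(void)sk_miibus_statchg(sc);
}
```

In this sketch, calling `sk_init()` on a softc whose PHY already reports an active link issues exactly one simulated MAC update, without waiting for a later media ioctl to do it as a side effect.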
It rocks :) I'll spend some time reading it this evening and will merge.
It panicked on Ten64:
I'll look into it later.
Oh, no mii there (yet). if (sc->fixed_link)
Yeah, you're right. It helped.
Applied in 4529394. Thanks for that!
dsalychev pushed a commit that referenced this pull request on Apr 3, 2023
Under certain loads, the following panic is hit:

    panic: page fault
    KDB: stack backtrace:
    #0 0xffffffff805db025 at kdb_backtrace+0x65
    #1 0xffffffff8058e86f at vpanic+0x17f
    #2 0xffffffff8058e6e3 at panic+0x43
    #3 0xffffffff808adc15 at trap_fatal+0x385
    #4 0xffffffff808adc6f at trap_pfault+0x4f
    #5 0xffffffff80886da8 at calltrap+0x8
    #6 0xffffffff80669186 at vgonel+0x186
    #7 0xffffffff80669841 at vgone+0x31
    #8 0xffffffff8065806d at vfs_hash_insert+0x26d
    #9 0xffffffff81a39069 at sfs_vgetx+0x149
    #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #11 0xffffffff8065a28c at lookup+0x45c
    #12 0xffffffff806594b9 at namei+0x259
    #13 0xffffffff80676a33 at kern_statat+0xf3
    #14 0xffffffff8067712f at sys_fstatat+0x2f
    #15 0xffffffff808ae50c at amd64_syscall+0x10c
    #16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following panic:

    panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
    cpuid = 17
    KDB: stack backtrace:
    #0 0xffffffff805e29c5 at kdb_backtrace+0x65
    #1 0xffffffff8059620f at vpanic+0x17f
    #2 0xffffffff81a27f4a at spl_panic+0x3a
    #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
    #4 0xffffffff8066fdee at vinactivef+0xde
    #5 0xffffffff80670b8a at vgonel+0x1ea
    #6 0xffffffff806711e1 at vgone+0x31
    #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
    #8 0xffffffff81a39069 at sfs_vgetx+0x149
    #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #10 0xffffffff80661c2c at lookup+0x45c
    #11 0xffffffff80660e59 at namei+0x259
    #12 0xffffffff8067e3d3 at kern_statat+0xf3
    #13 0xffffffff8067eacf at sys_fstatat+0x2f
    #14 0xffffffff808b5ecc at amd64_syscall+0x10c
    #15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion.

Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
Co-authored-by: Rob Wing <rob.wing@klarasystems.com>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes #14501
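The race described in this commit message can be modelled in a few lines of user-space C. The sketch below is a toy model only, not the kernel or ZFS code: every `sk_` name and the tiny array-backed "hash" are hypothetical. It shows why a recycle attempt on the losing vnode returns 0 while its use count is still positive, and why the inactive hook has to tolerate that result instead of asserting on it.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX 8

/* Hypothetical, heavily simplified vnode; not the kernel's struct vnode. */
struct sk_vnode {
	int id;		/* identity used as the hash key */
	int usecount;	/* active references */
	int doomed;	/* set once the vnode has been torn down */
};

/* Toy "hash": a small array searched linearly. */
struct sk_hash {
	struct sk_vnode *ents[SK_MAX];
	size_t n;
};

/* Recycle only an unused vnode; returns 1 if recycled, 0 otherwise. */
int
sk_vrecycle(struct sk_vnode *vp)
{
	if (vp->usecount > 0)
		return (0);
	vp->doomed = 1;
	return (1);
}

/*
 * Inactive hook after the fix: attempt the recycle but accept failure.
 * The pre-fix behaviour corresponds to assert(sk_vrecycle(vp) == 1).
 */
int
sk_inactive(struct sk_vnode *vp)
{
	return (sk_vrecycle(vp));
}

/* Tear down a vnode; runs the inactive hook first, as the backtrace suggests. */
void
sk_vgone(struct sk_vnode *vp)
{
	(void)sk_inactive(vp);
	vp->doomed = 1;
}

/*
 * Insert vp. If a vnode with the same id is already present, the caller
 * lost the race: vp is torn down and the existing vnode is returned.
 * Returns NULL if the toy table is full.
 */
struct sk_vnode *
sk_hash_insert(struct sk_hash *h, struct sk_vnode *vp)
{
	size_t i;

	assert(h != NULL && vp != NULL);
	for (i = 0; i < h->n; i++) {
		if (h->ents[i]->id == vp->id) {
			sk_vgone(vp);
			return (h->ents[i]);
		}
	}
	if (h->n == SK_MAX)
		return (NULL);
	h->ents[h->n++] = vp;
	return (vp);
}
```

With two vnodes sharing an id and each holding one reference, the second insert returns the first vnode and tears down the loser without tripping an assertion, which is the behaviour the commit describes after the assertion is dropped.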
dsalychev pushed a commit that referenced this pull request on Sep 9, 2023
netlink(4) calls back into the driver during detach and it attempts to start an internal synchronized op recursively, causing an interruptible hang. Fix it by failing the ioctl if the VI has been marked as DOOMED by cxgbe_detach.

Here's the stack for the hang for reference.

    #6 begin_synchronized_op
    #7 cxgbe_media_status
    #8 ifmedia_ioctl
    #9 cxgbe_ioctl
    #10 if_ioctl
    #11 get_operstate_ether
    #12 get_operstate
    #13 dump_iface
    #14 rtnl_handle_ifevent
    #15 rtnl_handle_ifnet_event
    #16 rt_ifmsg
    #17 if_unroute
    #18 if_down
    #19 if_detach_internal
    #20 if_detach
    #21 ether_ifdetach
    #22 cxgbe_vi_detach
    #23 cxgbe_detach
    #24 DEVICE_DETACH

MFC after: 3 days
Sponsored by: Chelsio Communications
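The shape of this fix can be illustrated with a small user-space model. It is a sketch under stated assumptions, not the cxgbe code: the `sk_` names, the flag value, and the choice of ENXIO as the error are all hypothetical, and the "hang" is modelled as an EBUSY return so the example terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_DOOMED 0x1u	/* hypothetical flag: interface is being detached */

/* Hypothetical, simplified per-interface state. */
struct sk_vi {
	unsigned flags;
	bool	 op_in_progress;	/* a synchronized op is currently held */
	int	 last_ioctl_rc;		/* result of the most recent callback */
};

/* Begin a non-recursive synchronized op; the real driver would sleep here. */
int
sk_begin_synchronized_op(struct sk_vi *vi)
{
	if (vi->op_in_progress)
		return (EBUSY);
	vi->op_in_progress = true;
	return (0);
}

void
sk_end_synchronized_op(struct sk_vi *vi)
{
	vi->op_in_progress = false;
}

/* Media ioctl: fail early on a doomed interface instead of taking the op. */
int
sk_media_ioctl(struct sk_vi *vi)
{
	int rc;

	assert(vi != NULL);
	if ((vi->flags & SK_DOOMED) != 0)
		return (ENXIO);	/* assumed error code for this sketch */
	rc = sk_begin_synchronized_op(vi);
	if (rc != 0)
		return (rc);
	/* ... query media state here ... */
	sk_end_synchronized_op(vi);
	return (0);
}

/* Detach: holds the op, marks the interface doomed, then gets called back. */
int
sk_detach(struct sk_vi *vi)
{
	int rc;

	assert(vi != NULL);
	rc = sk_begin_synchronized_op(vi);
	if (rc != 0)
		return (rc);
	vi->flags |= SK_DOOMED;
	/* Models the interface event that re-enters the driver's ioctl path. */
	vi->last_ioctl_rc = sk_media_ioctl(vi);
	sk_end_synchronized_op(vi);
	return (0);
}
```

In this model the re-entrant ioctl returns immediately with an error during detach; without the early flag check it would reach `sk_begin_synchronized_op()` while the op is already held, which is the recursive case the commit message describes.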
This fixes #7